This post walks through my implementation of a doodle classifier built on Google's Quick Draw dataset.
Link to Code.
Abstract
Quick, Draw! is an online game developed by Google that challenges players to draw a picture of an object or idea and then uses a neural network to guess what the drawing represents. The AI learns from each drawing, which improves its ability to guess correctly in the future. The game is similar to Pictionary in that the player has only a limited time to draw (20 seconds). The concepts it guesses can be simple, like 'foot', or more complicated, like 'animal migration'. It is one of many simple AI-based games created by Google as part of a project known as 'A.I. Experiments'.
Requirement
The first requirement for this problem is the dataset itself. The Quick Draw Dataset is a collection of 50 million drawings across 345 categories, contributed by players of the game Quick, Draw!. The drawings were captured as timestamped vectors, tagged with metadata including what the player was asked to draw and in which country the player was located. You can browse the recognized drawings on quickdraw.withgoogle.com/data.
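To make the "vectors plus metadata" description concrete, here is a small sketch that parses one record. The sample line is made up for illustration; the field names (word, countrycode, drawing and so on) follow my reading of the dataset's published format, so treat them as assumptions and check them against a real file.

```python
import json

# A made-up record: one JSON object per line, as in the .ndjson files.
# The values are illustrative only and do not come from the real dataset.
sample_line = (
    '{"key_id": "1", "word": "apple", "countrycode": "US", '
    '"timestamp": "2017-03-01 12:00:00 UTC", "recognized": true, '
    '"drawing": [[[10, 50, 90], [20, 5, 20]], [[50, 50], [5, 0]]]}'
)

def parse_drawing(line):
    """Turn one ndjson line into (label, country, list of strokes)."""
    record = json.loads(line)
    strokes = []
    for stroke in record["drawing"]:
        # Each stroke holds a list of x values and a list of y values.
        # Raw files carry a third list of timestamps, which is ignored here.
        xs, ys = stroke[0], stroke[1]
        strokes.append(list(zip(xs, ys)))
    return record["word"], record["countrycode"], strokes

label, country, strokes = parse_drawing(sample_line)
print(label, country, len(strokes), strokes[0])
```

For this project I skip the vector format entirely and use the pre-rendered bitmaps, but the same metadata is what lets the files be split by category.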
Getting the dataset
The dataset is available on Google Cloud Storage as ndjson files separated by category. See the list of files in Cloud Console, or read more about accessing public datasets using other methods. As an example, to download all simplified drawings, one way is to run the following command, where the trailing dot is the destination (the current directory): gsutil -m cp 'gs://quickdraw_dataset/full/simplified/*.ndjson' .
Full dataset separated by categories
Raw files (.ndjson)
Simplified drawings files (.ndjson)
Binary files (.bin)
Numpy bitmap files (.npy)
Here's what I did:
I logged onto the Google Cloud Platform here. The page looks like this.
I downloaded the .npy file for each of the drawings that I wanted. The labels included:
Apple 🍎
Bowtie 🎀
Candle 🕯️
Door 🚪
Envelope ✉️
Fish 🐟
Guitar 🎸
Ice Cream 🍦
Lightning ⚡
Moon 🌛
Mountain 🗻
Star ⭐️
Tent ⛺️
Toothbrush 🖌️
Wristwatch ⌚️
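If you would rather script the download than click through the console, something along these lines should work. The base URL is an assumption on my part (the public HTTP address of the numpy_bitmap folder in the bucket), and the lowercase category names are my guess at how the labels above map to file names, so verify both before relying on them. download_all is a hypothetical helper and is deliberately not called here.

```python
import os
from urllib.parse import quote
from urllib.request import urlretrieve

# Assumed public HTTP location of the Numpy bitmap files.
BASE_URL = "https://storage.googleapis.com/quickdraw_dataset/full/numpy_bitmap/"

# The 15 labels used in this post, written as assumed category names.
LABELS = [
    "apple", "bowtie", "candle", "door", "envelope", "fish", "guitar",
    "ice cream", "lightning", "moon", "mountain", "star", "tent",
    "toothbrush", "wristwatch",
]

def npy_url(label):
    """Build the download URL for one category (spaces are percent-encoded)."""
    return BASE_URL + quote(label) + ".npy"

def download_all(dest_dir="data"):
    """Fetch every category into dest_dir. The files are large."""
    os.makedirs(dest_dir, exist_ok=True)
    for label in LABELS:
        urlretrieve(npy_url(label), os.path.join(dest_dir, label + ".npy"))

print(npy_url("ice cream"))
```

Calling download_all() would then place one .npy file per label in the data folder.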
Setting the environment
Now we have the .npy files. Let's see how to load them and save them as pickle files. Pickle is Python's module for serializing and de-serializing objects. I am going to serialize the contents of the .npy files for training.
Loading the data
Now that you have the dataset as described above, place the .npy files in the /data folder.
Run the loading script. I am taking 10,000 images per class and storing the features and labels in their respective pickle files.
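Here is a minimal sketch of this loading step, not the exact script from the repository. It assumes each .npy file holds one row per drawing (a flattened 28x28 bitmap, so 784 values) and that the class id is simply the file's position in alphabetical order. The names build_dataset and save_pickle are made up for this example.

```python
import os
import pickle

import numpy as np

def build_dataset(data_dir, per_class=10000):
    """Load up to per_class drawings from every .npy file in data_dir."""
    files = sorted(f for f in os.listdir(data_dir) if f.endswith(".npy"))
    features, labels = [], []
    for class_id, name in enumerate(files):
        # Keep only the first per_class rows of each category.
        bitmaps = np.load(os.path.join(data_dir, name))[:per_class]
        features.append(bitmaps)
        labels.append(np.full(len(bitmaps), class_id, dtype=np.int64))
    return np.concatenate(features), np.concatenate(labels), files

def save_pickle(obj, path):
    """Serialize obj to path with pickle."""
    with open(path, "wb") as handle:
        pickle.dump(obj, handle, protocol=pickle.HIGHEST_PROTOCOL)

# Typical use, once the .npy files are in place:
#   features, labels, files = build_dataset("data")
#   save_pickle(features, "features")
#   save_pickle(labels, "labels")
```

With 15 categories and 10,000 drawings each, features would come out with shape (150000, 784), provided every file has at least 10,000 rows.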
Training
Defining the model
Model : The above gist defines the model that we'll use.
Layers : There are two convolution-max pool layers in the model followed by fully connected layers and dropout.
Dropouts : They are really important since you don't want to overfit on your training set. Dropout randomly switches off a fraction of the neurons during training, which forces the model to learn from all of them and not just a few. This helps it generalize to new data.
Output : The output layer has a softmax activation, which turns the final layer's scores into a probability for each of the 15 labels that we defined previously. The label with the highest probability is the prediction.
Optimizer : I am using the Adam optimizer.
Epochs : 3
Batch size : 64
Callbacks : I am using TensorBoard for visualization of my model.
Save & load : I am saving the model ('QuickDraw.h5') after each batch so that I can make predictions on the trained model later.
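To show what the dropout and softmax pieces from the list above actually compute, here is a small NumPy illustration. It is not the Keras model itself: the hidden size of 128, the dropout rate of 0.5 and the random weights are placeholders I picked for the example. Only the batch size of 64 and the 15 output labels come from the setup above.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1 per row."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def dropout(activations, rate, rng):
    """Inverted dropout: zero a random fraction, rescale the survivors."""
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)

rng = np.random.default_rng(0)
hidden = rng.normal(size=(64, 128))              # one batch, hypothetical hidden layer
dropped = dropout(hidden, rate=0.5, rng=rng)     # applied during training only
weights = rng.normal(scale=0.1, size=(128, 15))  # placeholder output weights
probs = softmax(dropped @ weights)               # one probability per label
predicted = probs.argmax(axis=1)                 # index of the most likely label

print(probs.shape, predicted[:5])
```

At prediction time dropout is turned off, so only the softmax and argmax steps remain.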
Visualization
Metrics
At the end of training, accuracy is around 90% on the training set with a loss of around 0.4, which is a solid result for three epochs. The test set has similar accuracy (around 92%), so the model does not appear to be overfitting, which is always a good sign.
Building the application
Now we have everything ready. Let's build the application, which uses the webcam to let you 'draw' doodles on the screen.
This is the application that I wrote.
Webcam : I stream a live feed from the webcam. Note that Lower_green and Upper_green, despite their names, are two NumPy arrays that store the lower and upper bounds of the color range for blue.
Filters : I keep only the blue regions of each frame using these two arrays together with several filters and masks.
Doodling : After that, I trace the path of a blue object across the screen and send that path for prediction. In short, I hold a blue object in my hand and move it around to doodle on the screen.
Prediction : The doodles are then sent to the model for prediction.
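The steps above can be sketched without a webcam using synthetic frames. This is a NumPy stand-in, not the application's code: OpenCV's cv2.inRange performs the same inclusive range test as color_mask below. The color bounds, the 56x56 frame size and the helper names are all assumptions made for the example.

```python
import numpy as np

# Hypothetical bounds for "blue", in the same channel order as the frame.
lower = np.array([100, 60, 60], dtype=np.uint8)
upper = np.array([140, 255, 255], dtype=np.uint8)

def color_mask(frame, lower, upper):
    """True where every channel lies inside [lower, upper]."""
    return np.all((frame >= lower) & (frame <= upper), axis=-1)

def centroid(mask):
    """Center (x, y) of the masked pixels, or None if nothing matched."""
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None
    return int(xs.mean()), int(ys.mean())

def to_model_input(canvas, size=28):
    """Block-average a square canvas down to size x size, add batch and channel axes."""
    h, w = canvas.shape  # assumes h and w are multiples of size
    small = canvas.reshape(size, h // size, size, w // size).mean(axis=(1, 3))
    return small.reshape(1, size, size, 1).astype("float32")

# Simulate five frames with a small blue object moving to the right.
path = []
canvas = np.zeros((56, 56), dtype=np.float32)
for step in range(5):
    frame = np.zeros((56, 56, 3), dtype=np.uint8)
    x0 = 10 + step * 5
    frame[10:14, x0:x0 + 4] = (120, 200, 200)
    point = centroid(color_mask(frame, lower, upper))
    if point is not None:
        path.append(point)
        canvas[point[1], point[0]] = 255.0  # mark the traced point

model_input = to_model_input(canvas)
print(path, model_input.shape)
```

In the real application the traced points would be joined into lines on the canvas, and model_input would be passed to the trained model's predict call.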
This code is now merged into Google's official GitHub repository. [Pull Request]
Output
Congratulations, you have successfully built the prototype for Quick, Draw.