Do It Yourself Artificial Intelligence

This is a transcript of the talk I gave at Crowd Supply’s Teardown Conference.

The AIY Projects Voice Kit, free on the cover of issue 57 of The MagPi.
The Google AIY Projects Voice Kit.
The AIY Projects Vision Kit.
The updated AIY Projects kits.
The AIY Projects Vision Bonnet, with the Intel Movidius chip.

Voice Kit

Before we talk about how to use it, let’s talk about building it.

Putting together the Google AIY Projects Voice Kit. (Image credit: Google)
The Google Cloud Platform “Getting Started” screen.
Select, or create, a project.
Creating a new project.
The now populated list of projects.
The new AIY Project.
Enabling the Google Assistant API.
The Google Assistant API has been enabled.
Creating an OAuth client ID.
Before creating a client ID the application’s Consent Screen needs to be configured.
Configuring the Consent Screen.
Creating credentials.
The generated OAuth credentials, ready for downloading.
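Once you’ve downloaded the credentials file, it’s worth a quick sanity check that you grabbed the right thing. Here’s a minimal sketch (the helper name is mine, not part of any Google library) that assumes the file follows Google’s standard client-secrets layout, with the details nested under an "installed" key for desktop-style apps:

```python
import json

def load_oauth_client(path):
    """Read a downloaded OAuth client-secrets file and return its client ID.

    Assumes Google's standard layout for installed apps:
    {"installed": {"client_id": "...", "client_secret": "...", ...}}
    """
    with open(path) as f:
        data = json.load(f)
    # Installed (desktop) apps nest under "installed"; web apps use "web".
    section = data.get("installed") or data.get("web")
    if section is None:
        raise ValueError("doesn't look like an OAuth client-secrets file: %s" % path)
    return section["client_id"]
```

If this raises, you’ve probably downloaded a service-account key rather than the OAuth client ID.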
My Google Account’s Activity Controls panel.
Enabling the Google Cloud Speech API.
Google Cloud Speech is not free!
If you’ve never set up billing before you’ll need to do that now.
Picking up $300 of free services for signing up for billing.
Even once your $300 of credit is used up, you won’t be automatically billed.
The Google Cloud Speech API is now enabled.
Creating a Service Account key.
Creating a Service Account key.
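The downloaded service-account key is how the Cloud Speech demos authenticate. A minimal sketch of wiring it up (the function name is mine; `GOOGLE_APPLICATION_CREDENTIALS` is the standard environment variable Google’s client libraries read automatically):

```python
import json
import os

def activate_service_account(key_path):
    """Validate a downloaded service-account key and expose it to Google
    Cloud client libraries via the GOOGLE_APPLICATION_CREDENTIALS variable.
    """
    with open(key_path) as f:
        key = json.load(f)
    # Service-account keys declare themselves with a "type" field.
    if key.get("type") != "service_account":
        raise ValueError("not a service-account key: %s" % key_path)
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = key_path
```

With that set, the Cloud Speech client libraries pick up the credentials without any further configuration.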
The original Voice HAT (Image credit: Google)
The new Voice Bonnet (Image credit: Google)

Vision Kit

Again, let’s start by looking at how to put it together. This one really is just a matter of plugging things together; you don’t even need a screwdriver this time around.

Putting the Vision Kit together. (Image credit: Google)
  • The Dog / Cat / Human model can identify whether there’s a dog, cat, or person in an image and draw a box around the identified objects. It’s based on the MobileNet model.
  • The Dish Classifier model is designed to identify food in an image. Again it’s based on the MobileNet model.
  • The Google Image Classifier is a general-purpose model designed to recognize and identify a number of common objects. It’s based on the MobileNet ImageNet classifier model.
  • The Image Classifier model is also designed to identify objects in an image. However this one is based on the SqueezeNet model.
  • The Nature Explorer models are three machine learning models based on MobileNet, trained on photos contributed by the iNaturalist community. They’re built to recognize 4,080 different species (~960 birds, ~1,020 insects, ~2,100 plants). They’re only included in the most recent SD card image; if you’re using a card image older than the March release, you’ll need to update it.
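Whichever model you run, the output is ultimately a set of labels with confidence scores. A small helper like this (entirely illustrative, not part of the AIY library) shows the usual way of filtering that output down to the most confident predictions:

```python
def top_predictions(results, k=3, threshold=0.1):
    """Return the k highest-scoring (label, score) pairs above threshold.

    `results` is an iterable of (label, score) pairs, the shape of output
    an image classifier typically produces.
    """
    # Drop anything below the confidence threshold, then sort descending.
    kept = [(label, score) for label, score in results if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:k]
```

In practice you’d feed this the per-frame results from whichever of the models above you’ve loaded onto the Vision Bonnet.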
The Vision Bonnet (Image credit: Google)
The ribbon cable (left) and button connector (right). (Image credit: Google)
The electronics of the ‘do-you-think-he-saurs.’

The Bigger Picture

I’m actually quite intrigued by how these kits fit into the bigger picture.

Voice Kit Links

Links to the work I’ve done with the Voice Kit.

Vision Kit Links

Links to the work I’ve done with the Vision Kit.
