A Magic Mirror That Looks Back At You

This is the third post in a series of three posts building a simple voice-controlled Magic Mirror. The first post in the series showed how to put the mirror together, while the second looked at how to use Machine Learning locally on the device to do custom hotword recognition without using the cloud.

At the end of last year I built a simple voice controlled Magic Mirror with the Raspberry Pi and the Google AIY Projects Voice Kit, and then went on to look at how to use Machine Learning locally on the Raspberry Pi to do custom hot word recognition without connecting back to the cloud.

Our Magic Mirror alongside the Google AIY Projects Voice and Vision Kits.

However, I was never entirely happy with how I left the build. Unlike our previous build there wasn’t a button or dial to trigger Voice Assistant, and the default hot word support in the Voice Assistant is not perhaps the most atmospheric. While using a custom hot word to activate the mirror worked, it didn’t work as well as using the default “Okay, Google.”

At the time I was vaguely thinking about going on and adding a camera to the mirror and using it and computer vision to do facial detection. Then, instead of activating the mirror using a hot word, just walking up and looking at the mirror would be enough for it to start listening for your commands.

But that seemed like a lot of work, or at least it did before Google announced their second AIY Projects kit, the Vision Kit.

The completed Google AIY Projects Vision Kit.

I managed to get my hands on the hardware at the start of this year, and have been thinking about going back to my magic mirror ever since. The only real question would be how to integrate the two kits.

In the end I’m opting for what I’m going to refer to as ‘loosely coupled.’ Instead of trying to connect the two Raspberry Pi boards together using a spare GPIO pin or two, and directly signal the Voice Kit when a face is visible, I’m going to use the connection they already share, the wireless network.

Assembling the Vision Kit

I’ve talked about how to put the Vision Kit together in detail elsewhere, so I shan’t walk over that ground again. However this time around we just need the wiring harness, shorn of its cardboard skin. So go ahead and plug the Vision Kit wiring harness together following the instructions.

The wiring harness of the AIY Projects Vision Kit.

Next, go ahead and download the latest SD Card image for the Vision Kit, and set up your Raspberry Pi installation as normal, enabling wireless networking, secure shell, and OTG access, and then install a VNC server.

Face detection with the Vision Kit

If everything is working correctly, when you boot the Vision Kit for the first time, and after some time has passed — at least a minute or two, possibly a bit longer — the green privacy LED will light up as the Joy Detector demo automatically starts.

After you’ve played around with the demo application to make sure everything is working as expected, you should go ahead and stop the service and then disable it so it won’t restart on boot.

$ ssh pi@raspberrypi.local
pi@raspberrypi.local's password:
.
.
.
$ sudo systemctl stop joy_detection_demo.service
$ sudo systemctl disable joy_detection_demo.service

and then set up the development environment,

$ source ~/AIY-projects-python/env/bin/activate
(env) $ cd ~/AIY-projects-python/src/examples/vision

From here we can run the simple face_detection_camera.py example which runs continuous face detection using the VisionBonnet and prints the number of detected faces in the camera image.

Starting it from the command line we can run it over 50 frames,

(env) $ python ./face_detection_camera.py --num_frames 50
Iteration #0: num_faces=0
Iteration #1: num_faces=0
Iteration #2: num_faces=0
Iteration #3: num_faces=0
Iteration #4: num_faces=1
Iteration #5: num_faces=1
Iteration #6: num_faces=1
Iteration #7: num_faces=1
Iteration #8: num_faces=0
Iteration #9: num_faces=0
Iteration #10: num_faces=1
Iteration #11: num_faces=1
.
.
.
Iteration #49: num_faces=0
(env) $

This should give a count of the number of faces in each frame. The script may take some time to initialise before it starts, so patience is needed.

Modifying the Vision Kit code

The plan is to modify the example face detection code to save, and then update, a file with the number of faces it sees rather than print the number to the console. We can then use this file as a really simple way to share the current face count in the camera view.

We can then share this file from the Vision Kit, over the network, to the Voice Kit, which can check the current face count. Essentially, the file takes the place of the hot word, or arcade button. If the Vision Kit face count is greater than zero, then the Voice Kit will start to listen for instructions.
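As a rough illustration of the file-writing half of this plan, here is a minimal sketch of a helper that overwrites a small text file with the latest face count. The function names and the file location are assumptions made for this example, not taken from the modified script itself. The count is written to a temporary file and then renamed into place, so that anything reading the file over the network never sees it half-written.

```python
import os
import tempfile

# Assumed location: any path inside the directory we serve would do.
FACE_COUNT_PATH = "/home/pi/www/face_count.txt"


def write_face_count(count, path=FACE_COUNT_PATH):
    """Atomically replace the contents of `path` with the current face count."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then rename it over
    # the old file, so a reader never sees a partially written count.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        tmp.write("%d\n" % count)
    os.chmod(tmp_path, 0o644)  # mkstemp creates the file readable by owner only
    os.replace(tmp_path, path)


def read_face_count(path=FACE_COUNT_PATH):
    """Read the count back; handy for checking the file by hand."""
    with open(path) as f:
        return int(f.read().strip())
```

In the face detection example, a call to write_face_count() would go wherever the script currently prints num_faces for each frame.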

Sharing the file using SimpleHTTPServer

We can share the file using Python’s SimpleHTTPServer module. Go ahead and create the /home/pi/www directory, and then save the server script, server_notcached.py, into it.

We can then start a simple web server that isn’t caching the file on port 9000 from that directory as follows,

$ cd /home/pi/www
$ python server_notcached.py 9000
Serving HTTP on 0.0.0.0 port 9000 ...

and serve our face_count.txt file to the network.
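A non-caching server along these lines might look something like the sketch below. It is not the original listing. It assumes Python 3, where the SimpleHTTPServer module lives on as http.server, and the class and function names are made up for this example. The only change from the stock handler is a few extra response headers asking clients not to cache the file.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler


class NoCacheHandler(SimpleHTTPRequestHandler):
    """Serve files from the current directory, asking clients not to cache them."""

    def end_headers(self):
        # Add the no-cache headers just before the header block is closed.
        self.send_header("Cache-Control", "no-cache, no-store, must-revalidate")
        self.send_header("Pragma", "no-cache")
        self.send_header("Expires", "0")
        super().end_headers()


def make_server(port):
    """Create (but don't start) a server listening on all interfaces."""
    return HTTPServer(("0.0.0.0", port), NoCacheHandler)


def main(argv):
    """Serve forever on the port given as the first argument (default 8000)."""
    port = int(argv[1]) if len(argv) > 1 else 8000
    server = make_server(port)
    print("Serving HTTP on 0.0.0.0 port %d ..." % port)
    server.serve_forever()
```

Saved with an import of sys and a final line calling main(sys.argv), a script like this could be started from the /home/pi/www directory with the port number as its only argument.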

Modifying the Voice Kit code

One benefit of working in this ‘loosely coupled’ way is that we don’t have to make any structural changes to the magic mirror we built before we test our integration between the Voice and Vision Kits. Instead we can go ahead and get our code running before trying to integrate the actual Vision Kit hardware into the existing build.

The Vision Kit sitting in front of our Voice Kit-based Magic Mirror.

Looking at the code we wrote to control our magic mirror last time, perhaps the easiest to modify is going to be the version using the Google Cloud Speech API.

In that code we used the aiy.cloudspeech.get_recognizer() call to grab a Cloud Speech recogniser, and then we paused waiting for our custom hot phrase. We can just go straight ahead and replace that check with some code that waits to see whether the Vision Kit can see a face.
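The replacement check could be sketched roughly as follows. This is a minimal example rather than the code from the build: the hostname, function names, and polling interval are all assumptions, and it uses Python 3's urllib.request to fetch the file. It simply polls the Vision Kit's web server until the face count goes above zero.

```python
import time
import urllib.request

# Assumed address: substitute your Vision Kit's hostname and the port you chose.
FACE_COUNT_URL = "http://vision-kit.local:9000/face_count.txt"


def fetch_face_count(url=FACE_COUNT_URL, timeout=2.0):
    """Return the face count served at `url`, or 0 if it can't be read."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return int(response.read().decode("ascii").strip())
    except (OSError, ValueError):
        # Network trouble or a malformed file: behave as if nobody is there.
        return 0


def wait_for_face(fetch=fetch_face_count, poll_interval=0.5, sleep=time.sleep):
    """Block until `fetch()` reports at least one face, then return the count."""
    while True:
        count = fetch()
        if count > 0:
            return count
        sleep(poll_interval)
```

A call to wait_for_face() would then sit where the code previously waited for the hot phrase, before handing over to the recogniser as before. Treating errors as a count of zero means the mirror just stays quiet if the Vision Kit is unreachable.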

While this code uses the Cloud Speech API, which is a paid-for service, unlike our previous version it isn’t running all the time. So it’s not going to be as ruinously expensive as before: you’re only going to be paying for calls to the API when a face is in view of the Vision Kit camera.

You can keep tabs on both your API usage, using the project’s API Dashboard, and current billing costs, using the Billing Console. You can see the cost breakdown on a per-project basis by clicking through on each project listed in the “Projects linked to this billing account” tab near the bottom of the console.

The API Dashboard showing the API usage of our build over the last hour.

You can use the online calculator to work out roughly how much you’re going to spend running your project.

Putting it all together

First, log in to the Vision Kit and start the face detection code running,

$ source ~/AIY-projects-python/env/bin/activate
(env) $ cd ~/AIY-projects-python/src/examples/vision
(env) $ python face_detection_camera_with_file.py

and then in another terminal window start the web server.

$ cd /home/pi/www
$ python server_notcached.py 9000
Serving HTTP on 0.0.0.0 port 9000 ...

Then log in to the Voice Kit, and start the Magic Mirror software,

$ cd MagicMirror
$ npm start

and then in another terminal window start our Voice Kit code.

$ source ~/AIY-voice-kit-python/env/bin/activate
(env) $ cd ~/AIY-voice-kit-python/src/
(env) $ python mirror_with_file.py

Now go ahead and stare into the Raspberry Pi Camera Module. If all goes well you should hear a ‘ting’ noise as the Voice Kit figures out that you want to say something.

The Vision and Voice Kits working together.

Meanwhile if you’re keeping track of the script outputs on your laptop you should see something along these lines.

Voice Kit windows (left) and Vision Kit windows (right).

If you watch you can see the number of faces detected changing as I step into view of the Raspberry Pi camera, and the Voice Kit subsequently doing speech recognition. The VNC window (bottom left) shows the Magic Mirror display.

Modifying the magic mirror

At this point we’re effectively done. All we need to do now is quickly mount the camera behind the two-way glass. So remove the back from the mirror, and cut a small hole for the camera lens.

The Vision Kit mounted (left) inside the Magic Mirror beside the existing Voice Kit hardware.

Now go ahead and close up the mirror and restart our services. If everything goes to plan you should see something like this, as the mirror looks back at us.

The final working mirror.

More kits coming soon

The first batch of Vision Kits, a limited run of just 2,000 units, arrived on the shelves at Micro Center at the tail end of last year. The kits sold out fairly quickly; however, if you’re interested in picking one up, more kits are now expected on shelves shortly.

Check back here, or follow me on Twitter, for news on further availability.

If you like this build I’ve also written other posts on building a retro-rotary phone Voice Assistant with the Raspberry Pi and the AIY Projects Voice Kit, and a face-tracking cyborg dinosaur called “Do-you-think-he-saurs” with the Raspberry Pi and the AIY Projects Vision Kit.

This post was sponsored by Google.

Alasdair Allan

Scientist, Author, Hacker, Maker, and Journalist.
