A Magic Mirror That Looks Back At You
Machine Vision imprisoned behind a sheet of glass
This is the third post in a series of three building a simple voice-controlled Magic Mirror. The first post in the series showed how to put the mirror together, while the second looked at how to use Machine Learning locally on the device to do custom hot word recognition without using the cloud.
At the end of last year I built a simple voice-controlled Magic Mirror with the Raspberry Pi and the Google AIY Projects Voice Kit, and then went on to look at how to use Machine Learning locally on the Raspberry Pi to do custom hot word recognition without connecting back to the cloud.
However, I was never entirely happy with how I left the build. Unlike our previous build there wasn’t a button or dial to trigger the Voice Assistant, and the default hot word support in the Voice Assistant isn’t perhaps the most atmospheric. While using a custom hot word to activate the mirror worked, it didn’t work as well as using the default “Okay, Google.”
At the time I was vaguely thinking about adding a camera to the mirror, and using computer vision to do face detection. Then, instead of activating the mirror using a hot word, just walking up and looking at the mirror would be enough for it to start listening for your commands.
But that seemed like a lot of work, or at least it did before Google announced their second AIY Projects kit, the Vision Kit.
I managed to get my hands on the hardware at the start of this year, and have been thinking about going back to my magic mirror ever since. The only real question would be how to integrate the two kits.
In the end I’m opting for what I’m going to refer to as ‘loosely coupled.’ Instead of trying to connect the two Raspberry Pi boards together using a spare GPIO pin or two, and directly signal the Voice Kit when a face is visible, I’m going to use the connection they already share, the wireless network.
Assembling the Vision Kit
I’ve talked about how to put the Vision Kit together in detail elsewhere, so I shan’t walk over that ground again. However, this time around we just need the wiring harness, shorn of its cardboard skin. So go ahead and plug the Vision Kit wiring harness together following the instructions.
Next, go ahead and download the latest SD Card image for the Vision Kit, and set up your Raspberry Pi installation as normal—enabling wireless networking, secure shell, and OTG access—and then install a VNC server.
Face detection with the Vision Kit
If everything is working correctly, when you boot the Vision Kit for the first time, and after some time has passed — at least a minute or two, possibly a bit longer — the green privacy LED will light up as the Joy Detector demo automatically starts.
After you’ve played around with the demo application to make sure everything is working as expected, you should go ahead and stop the service and then disable it so it won’t restart on boot.
$ ssh pi@raspberrypi.local
pi@raspberrypi.local's password:
.
.
.
$ sudo systemctl stop joy_detection_demo.service
$ sudo systemctl disable joy_detection_demo.service
and then set up the development environment,
$ source ~/AIY-projects-python/env/bin/activate
(env) $ cd ~/AIY-projects-python/src/examples/vision
From here we can run the simple face_detection_camera.py example, which runs continuous face detection using the VisionBonnet and prints the number of detected faces in the camera image. Starting it from the command line, we can run it over 50 frames,
(env) $ python ./face_detection_camera.py --num_frames 50
Iteration #0: num_faces=0
Iteration #1: num_faces=0
Iteration #2: num_faces=0
Iteration #3: num_faces=0
Iteration #4: num_faces=1
Iteration #5: num_faces=1
Iteration #6: num_faces=1
Iteration #7: num_faces=1
Iteration #8: num_faces=0
Iteration #9: num_faces=0
Iteration #10: num_faces=1
Iteration #11: num_faces=1
.
.
.
Iteration #49: num_faces=0
(env) $
which should give a count of the number of faces seen in each frame. Note that it may take some time for the script to initialise before it starts, so a little patience is needed.
Modifying the Vision Kit code
The plan is to modify the example face detection code to save, and then update, a file with the number of faces it sees rather than print the number to the console. We can then use this file as a really simple way to share the current face count in the camera view.
We can share this file from the Vision Kit, over the network, to the Voice Kit, which can then check the current face count. Essentially, the file takes the place of the hot word, or the arcade button: if the Vision Kit’s face count is greater than zero, then the Voice Kit will start to listen for instructions.
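The change to the stock example is small. Here’s a minimal sketch of face_detection_camera_with_file.py, based on the face detection API used by the example above; the count file lives in /home/pi/www, the directory we’ll share in the next section, and the write-to-a-temporary-file-then-rename step is my own addition, so a reader never sees a half-written file,

import os

from picamera import PiCamera

from aiy.vision.inference import CameraInference
from aiy.vision.models import face_detection

FACE_COUNT_FILE = '/home/pi/www/face_count.txt'

def write_face_count(num_faces):
    # Write to a temporary file and rename it into place; the rename is
    # atomic, so the web server never serves a partially written count.
    tmp_file = FACE_COUNT_FILE + '.tmp'
    with open(tmp_file, 'w') as f:
        f.write('%d\n' % num_faces)
    os.rename(tmp_file, FACE_COUNT_FILE)

def main():
    with PiCamera() as camera:
        camera.resolution = (1640, 1232)
        camera.start_preview()
        # Run the face detection model on the VisionBonnet, updating the
        # count file once per frame instead of printing to the console.
        with CameraInference(face_detection.model()) as inference:
            for result in inference.run():
                write_face_count(len(face_detection.get_faces(result)))

if __name__ == '__main__':
    main()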
Sharing the file using SimpleHTTPServer
We can share the file using Python’s SimpleHTTPServer module. Go ahead and create the /home/pi/www directory, and then save the following code into it as server_notcached.py,
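The idea is simply to subclass SimpleHTTPRequestHandler and override end_headers() to send no-cache headers, so the Voice Kit always fetches a fresh copy of the count. A minimal sketch, with the exact set of headers being my own choice,

import sys

import SocketServer
import SimpleHTTPServer

class NoCacheHandler(SimpleHTTPServer.SimpleHTTPRequestHandler):
    def end_headers(self):
        # Ask clients never to cache, so the face count is always current.
        self.send_header('Cache-Control', 'no-cache, no-store, must-revalidate')
        self.send_header('Pragma', 'no-cache')
        self.send_header('Expires', '0')
        SimpleHTTPServer.SimpleHTTPRequestHandler.end_headers(self)

if __name__ == '__main__':
    # The port number is taken from the command line, e.g. 9000.
    port = int(sys.argv[1]) if len(sys.argv) > 1 else 8000
    httpd = SocketServer.TCPServer(('', port), NoCacheHandler)
    print('Serving HTTP on 0.0.0.0 port %d ...' % port)
    httpd.serve_forever()

This is written against Python 2’s SimpleHTTPServer module; under Python 3 the equivalent handler lives in http.server.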
We can then start a simple non-caching web server on port 9000 from that directory as follows,
$ cd /home/pi/www
$ python server_notcached.py 9000
Serving HTTP on 0.0.0.0 port 9000 ...
and serve our face_count.txt file to the network.
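You can check the file is being served by grabbing it from your laptop with curl. Here I’m assuming the Vision Kit is still answering on the default raspberrypi.local hostname; with two Pi boards on the same network you’ll want to give each a distinct hostname,

$ curl http://raspberrypi.local:9000/face_count.txt
0

which should return the current face count, here zero as nobody was in view of the camera.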
Modifying the Voice Kit code
One benefit of working in this ‘loosely coupled’ way is that we don’t have to make any structural changes to the magic mirror we built last time in order to test the integration between the Voice and Vision Kits. Instead we can get our code running before trying to fit the actual Vision Kit hardware into the existing build.
Looking at the code we wrote to control our magic mirror last time, perhaps the easiest to modify is going to be the version using the Google Cloud Speech API.
In that code we used the aiy.cloudspeech.get_recognizer() call to grab a Cloud Speech recogniser, and then paused waiting for our custom hot phrase. We can go straight ahead and replace that check with some code that waits to see whether the Vision Kit can see a face.
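As a sketch of what that replacement check might look like, the following polling loop blocks until the Vision Kit reports a face. The hostname, port, and polling interval are all assumptions you’ll want to adjust for your own network,

import time
import urllib.request

# Where the Vision Kit is serving the face count; adjust the hostname
# and port to match your own setup.
FACE_COUNT_URL = 'http://raspberrypi.local:9000/face_count.txt'

def wait_for_face(poll_interval=0.5):
    """Block until the Vision Kit reports at least one face in view."""
    while True:
        try:
            response = urllib.request.urlopen(FACE_COUNT_URL, timeout=1)
            count = int(response.read())
        except (IOError, ValueError):
            # Treat network errors, or an unreadable file, as 'no face'.
            count = 0
        if count > 0:
            return count
        time.sleep(poll_interval)

Calling wait_for_face() where we previously waited for the hot phrase means the Cloud Speech recogniser only runs while someone is actually standing in front of the mirror.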
While this code uses the Cloud Speech API, which is a paid-for service, unlike our previous version it isn’t running all the time. So it’s not going to be as ruinously expensive as before; you’re only going to be paying for calls to the API when a face is in view of the Vision Kit camera.
You can keep tabs on both your API usage, using the project’s API Dashboard, and current billing costs, using the Billing Console. You can see the cost breakdown on a per-project basis by clicking through on each project listed in the “Projects linked to this billing account” tab near the bottom of the console.
You can use the online calculator to work out roughly how much you’re going to spend running your project.
Putting it all together
First, login to the Vision Kit and start the face detection code running,
$ source ~/AIY-projects-python/env/bin/activate
(env) $ cd ~/AIY-projects-python/src/examples/vision
(env) $ python face_detection_camera_with_file.py
and then in another terminal window start the web server.
$ cd /home/pi/www
$ python server_notcached.py 9000
Serving HTTP on 0.0.0.0 port 9000 ...
Then login to the Voice Kit, and start the Magic Mirror software,
$ cd MagicMirror
$ npm start
and then in another terminal window start our Voice Kit code.
$ source ~/AIY-voice-kit-python/env/bin/activate
(env) $ cd ~/AIY-voice-kit-python/src/
(env) $ python mirror_with_file.py
Now go ahead and stare into the Raspberry Pi Camera Module. If all goes well you should hear a ‘ting’ noise as the Voice Kit figures out that you want to say something.
Meanwhile if you’re keeping track of the script outputs on your laptop you should see something along these lines.
If you watch you can see the number of faces detected changing as I step into view of the Raspberry Pi camera, and the Voice Kit subsequently doing speech recognition. The VNC window (bottom left) shows the Magic Mirror display.
Modifying the magic mirror
At this point we’re effectively done. All we need to do now is mount the camera behind the two-way glass. So remove the back from the mirror, and cut a small hole for the camera lens.
Now go ahead and close up the mirror and restart our services. If everything goes to plan you should see something like this, as the mirror looks back at us.
More kits coming soon
The first batch of Vision Kits — a limited run of just 2,000 units — arrived on the shelves at Micro Center at the tail end of last year. The kits sold out fairly quickly; however, if you’re interested in picking one up, more kits are now expected on shelves shortly.
Check back here, or follow me on Twitter, for news on further availability.
If you like this build I’ve also written other posts on building a retro-rotary phone Voice Assistant with the Raspberry Pi and the AIY Projects Voice Kit, and a face-tracking cyborg dinosaur called “Do-you-think-he-saurs” with the Raspberry Pi and the AIY Projects Vision Kit.
This post was sponsored by Google.