Last week I turned in my machine learning project. We had to build and train an algorithm that detects gestures made in front of a Microsoft Kinect to control a simple image slideshow (there’s a video).
For each frame the Kinect records, we get x, y, z coordinates for various body parts (= input features) like hands, elbows, spine etc. In ~2 seconds we’d get data for about 20 frames. Here are the data for two gestures, rotate (with the right arm) and swipe right (with the left arm):
That’s a lot of numbers, which is nice for neural networks 'n stuff, but for the human eye it’s difficult to effectively see any correlations. Which features behave differently, which ones are the same for certain gestures?
To get a better feeling for what’s happening here, I tried to visualize the data. For that I used a technique that translates the x-, y- and z-axis of 3D coordinates into corresponding values for red, green, and blue. 1D data, like angles, are represented by shades of gray (the higher the value, the brighter the pixel).
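As a rough sketch of that mapping in Python (this is not the project's actual Scala or Ruby code; the function names and the value ranges of -1..1 for coordinates and 0..180 for angles are assumptions made up for illustration):

```python
def scale_to_byte(value, lo, hi):
    """Linearly map value from [lo, hi] to an integer in 0..255, clamping outliers."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    ratio = (value - lo) / (hi - lo)
    ratio = min(1.0, max(0.0, ratio))  # clamp values outside the assumed range
    return round(ratio * 255)


def position_to_rgb(x, y, z, lo=-1.0, hi=1.0):
    """Map a 3D coordinate to a color: x -> red, y -> green, z -> blue.

    The default range -1..1 is an assumption, not a Kinect specification.
    """
    return (scale_to_byte(x, lo, hi),
            scale_to_byte(y, lo, hi),
            scale_to_byte(z, lo, hi))


def angle_to_gray(angle, lo=0.0, hi=180.0):
    """Map a 1D value such as a joint angle to a gray pixel (higher = brighter).

    The default range 0..180 degrees is an assumption.
    """
    g = scale_to_byte(angle, lo, hi)
    return (g, g, g)
```

In practice the ranges would probably be taken from the recorded data itself, for example the minimum and maximum of each axis over all frames.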
This way I converted all input features into colors, and the data from the tables above now look like this:
Each row represents one frame, each column one feature (same sequence as described above).
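To make that layout concrete, here is a minimal Python sketch that turns a list of frames into such a pixel grid and serializes it as a plain-text PPM image. It is only an illustration, not the code from the project; the input structure (a list of frames, each a list of (x, y, z) tuples), the -1..1 value range and the function names are all assumptions.

```python
def to_byte(value, lo, hi):
    """Linearly map value from [lo, hi] to 0..255, clamping outliers."""
    ratio = (value - lo) / (hi - lo)
    return round(min(1.0, max(0.0, ratio)) * 255)


def frames_to_pixels(frames, lo=-1.0, hi=1.0):
    """Build the grid: one row per frame, one (r, g, b) pixel per feature.

    frames is assumed to be a list of frames, each frame a list of
    (x, y, z) tuples in the same feature order.
    """
    return [[tuple(to_byte(c, lo, hi) for c in feature) for feature in frame]
            for frame in frames]


def pixels_to_ppm(pixels):
    """Serialize the grid as a plain-text PPM (P3) image, one pixel per cell."""
    height = len(pixels)
    width = len(pixels[0])
    lines = ["P3", f"{width} {height}", "255"]
    for row in pixels:
        lines.append(" ".join(f"{r} {g} {b}" for r, g, b in row))
    return "\n".join(lines) + "\n"
```

Writing the returned string to a file, e.g. `open("gesture.ppm", "w").write(pixels_to_ppm(pixels))`, gives an image that most viewers can open and scale up.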
Now you can see instantly which features change, which stay the same and which ones are different compared to other gestures.
For example, you can see that the color in the left-hand column of the rotate graphic always stays pretty much the same, while the right hand is pretty colorful. For the swipe-right gesture (which we did with the left hand only) it’s the other way round.
Also, the right elbow’s angle keeps changing while rotating (different shades of gray, at least 50), but (obviously) doesn’t change much while swiping with the other arm.
We used this technique to decide which features are relevant for our objectives and which are not, to check whether we had enough features to reliably detect a certain gesture, and to make sure we were dealing with the right data (and hadn’t swapped anything by accident).
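We made those decisions by eye, but the same "which column changes?" question could also be asked in code. The following Python sketch is not something the project did; it assumes a grid of (r, g, b) pixels with one row per frame, and the function names and the threshold of 30 are arbitrary choices for illustration.

```python
def column_spread(pixels):
    """For each column (feature), return the largest per-channel range
    (max - min) over all rows (frames)."""
    spreads = []
    for column in zip(*pixels):            # all pixels of one feature
        channel_ranges = [max(ch) - min(ch) for ch in zip(*column)]
        spreads.append(max(channel_ranges))
    return spreads


def changing_features(pixels, names, threshold=30):
    """Return the names of features whose color varies by at least threshold.

    The threshold of 30 (out of 255) is an arbitrary assumption.
    """
    return [name for name, spread in zip(names, column_spread(pixels))
            if spread >= threshold]
```

A column with a small spread corresponds to a stripe that stays roughly the same color, like the left hand during the rotate gesture.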
The project was coded in Scala, but I rewrote the visualization part in Ruby and put that, plus some sample data, on GitHub.
I’m sure there are more interesting ways to visualize such data. If you know any, I’d be happy to hear from you!