Playing with the Kinect

I’ve been playing with the Microsoft Kinect on Windows.

Ironically, perhaps, it was far faster to use non-MS tools to do it!

I’ve just spent an hour or so playing with the outputs from the OpenKinect tools, which allow you to see the IR output, the depth map (as a false-colour “thermal image”) and the visible-light colour camera. The Kinect uses a simple trigonometry-based range finder, triangulating around a thousand points across the field of view. There are some great pictures at http://www.futurepicture.org/?p=116 showing the actual IR pattern, and here is one the author has released into the Public Domain:

Kinect speckle pattern

(Click the image to open the post on FUTUREPICTURE in a new tab)

You can see there are 9 squares, and a repeated pattern of “speckles” from a Diffractive Optical Element (DOE), which I expect was created specially and is effectively identical in every Kinect in the world. (I did my 4th-year degree project on these.)
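For the curious, the range finding is just similar triangles: the projector and IR camera sit a fixed baseline apart, so the sideways shift (disparity) of each speckle against a stored reference pattern gives its distance. A minimal sketch in Python, using commonly quoted figures for the baseline and focal length rather than anything I have measured myself:

```python
# Structured-light triangulation, sketched. The constants are commonly
# quoted figures for the Kinect, not measured values -- assumptions only.
BASELINE_M = 0.075   # projector-to-IR-camera separation, ~7.5 cm
FOCAL_PX = 580.0     # IR camera focal length, in pixels

def depth_from_disparity(disparity_px: float) -> float:
    """Distance to a speckle from its pixel shift against the reference."""
    return BASELINE_M * FOCAL_PX / disparity_px

print(depth_from_disparity(17.4))  # a 17.4 px shift is roughly 2.5 m away
```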

Another in-depth look at the Kinect can be found at http://blog.mivia.dk/2011/06/my-kinect-datasheet/ which goes into a bit of detail on the range capability.

My first look today was at how robust the sensor suite is. One thing you learn when doing robotics, military work or just basic research is that multiple sources are generally better than just one. (Whether the 3 readings you get from the Kinect are “multiple sources” is probably open to debate – if one stream dies, they all die. But whilst it is working nicely, you get 3+ streams: audio and an accelerometer output as well as the IR camera, the visible camera and the depth map.)
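If you want to poke at the streams yourself, the libfreenect Python wrapper makes grabbing a frame from each camera a one-liner. A minimal sketch, assuming the wrapper’s synchronous helpers (the callback-based API also exists if you want every stream at full rate):

```python
import freenect

# One frame from each stream via the synchronous helpers. The depth frame
# is 11-bit raw values in a 640x480 array; the video frame is 8-bit RGB.
depth, _ = freenect.sync_get_depth()
rgb, _ = freenect.sync_get_video()
ir, _ = freenect.sync_get_video(format=freenect.VIDEO_IR_8BIT)

print(depth.shape, depth.dtype)  # expect (480, 640), uint16
print(rgb.shape)                 # expect (480, 640, 3)
print(ir.shape)                  # IR resolution differs slightly by mode
```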

It is generally accepted that it is impossible to fool IR or visible cameras. That accepted “truth” is wrong, of course. However, fooling multiple sensors is far harder than fooling just one, and tricking IR and visible cameras and a laser-scanned depth map all at once is pretty hard!

To that end, I thought I would have a look at the weaknesses. Another blog (the datasheet post above) tried pointing one Kinect at another and jamming it that way. It didn’t do much. Further, there is now multiple-Kinect support built into the OpenKinect libraries.
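As I understand the wrapper, the synchronous calls take a device index, so reading two Kinects side by side should look something like this (an untested sketch on my part):

```python
import freenect

# Two Kinects on one machine: the sync helpers take a device index.
depth_a, _ = freenect.sync_get_depth(index=0)
depth_b, _ = freenect.sync_get_depth(index=1)
print(depth_a.shape, depth_b.shape)
```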

I did crash the stream by toggling a high-powered IR beam at the cameras, but that wasn’t reliable.

Another idea was to shield or reflect the scanner pattern.

A screenshot of the 3 main outputs of the Kinect

The main window is the depth map (“hotter” is closer; black is either no data or really far away), the top right is the visible-light image, and the bottom right is the IR view with the speckle pattern.
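If you want to reproduce that false-colour view yourself, something like the sketch below gets close. It is not the demo’s exact palette, just the same idea: map nearer raw readings to “hotter” colours, with the 11-bit no-data sentinel (2047, as I understand the convention) forced to black:

```python
import freenect
import matplotlib.pyplot as plt
import numpy as np

depth, _ = freenect.sync_get_depth()               # 11-bit raw values
shown = np.where(depth == 2047, 0, 2047 - depth)   # no data -> black, nearer -> hotter
plt.imshow(shown, cmap="hot")
plt.title("Depth map: hotter is closer, black is no data / far away")
plt.show()
```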

We can see that a “hi-viz” vest shows up incredibly well in the IR picture, though it is almost invisible in the visible-light image, and of course it makes no difference to the depth mapping. The reflective strips are dull in visible light, but return huge levels of IR to the camera.

In the top left of the images is a floating mylar helium balloon. Without knowing what it was, you would mistake it for noise on the sensors – though obviously, such well-correlated noise would alert any well-trained operator or AI to the fact that something was happening! The visible image shows it clearly as a red balloon.

Most interesting, I think, is the mylar sheet at my feet. This is mostly black in the depth map, which means either that it is over ~5 metres away (with these settings), which it isn’t, or that there is no data. Since it is only 2.5m away, it is clearly fooling the sensor. Likewise, it is nearly invisible in the IR. Being quite crumpled, and in a side-lit room at fairly high brightness, the sheet looks quite bright in the visible image, though it is hard to tell what it is.
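A quick way to tell “no return” apart from “really far away”: in 11-bit mode, pixels the sensor could not range at all come back as that sentinel value, so the mylar sheet shows up as a hole you can count. A short sketch, same assumed convention as above:

```python
import freenect

# Count pixels with no valid depth reading -- mylar, matt black, etc.
# land in this mask rather than reading as "far away".
depth, _ = freenect.sync_get_depth()
no_data = depth == 2047
print(f"{no_data.mean():.1%} of the frame returned no depth")
```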

The final observation, aside from the clutter everywhere, is that matt black absorbs the IR well enough that no depth can be estimated or calculated. Looking at my black leather belt: it absorbs the IR (standing out as black against the grey IR image of my black trousers), and is also black in the visible image.

So, it would appear that using a Kinect alone for collision avoidance would be very foolish. Unlikely though it is, anyone wearing a black leather jacket would be effectively invisible, even at short range. So bikers and goths could well be stealthed against a Kinect-based sensor suite.

A final note about distance accuracy: I was amazed to see that the absolute distance reported by the software, with no calibration, is accurate to better than 20cm at 5 metres. Testing against a target placed 3.02m from the sensor, the software returned a value exactly matching my tape measure, and even at 5.02m (the limit of the room) the Kinect reported between 5.05 and 5.20m in broad (indoors) daylight.
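For anyone wanting metres rather than raw 11-bit readings, the conversion I have seen floating around the OpenKinect community is a simple reciprocal fit. Treat the constants as an empirical approximation rather than gospel:

```python
def raw_to_metres(raw: int) -> float:
    """Community-derived fit from 11-bit raw depth to metres (approximate).
    Only sensible while the denominator stays positive (raw below ~1084)."""
    return 1.0 / (raw * -0.0030711016 + 3.3309495161)

print(raw_to_metres(700))   # roughly 0.85 m
print(raw_to_metres(1000))  # roughly 3.85 m
```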

Accuracy testing at night and outdoors will be carried out when it stops raining!
