So I managed to get my stereo camera to see some depth and got the first steps working towards generating a point cloud of the scene.
Top row: left and right images (rectified). Lower left: the disparity image computed with the block matching algorithm.
The new camera setup uses adjustable screws; see the image below:
New camera setup with adjustable screws
I used the code/app from http://blog.martinperis.com/2011/01/opencv-stereo-camera-calibration.html to calibrate the cameras and work out their matrices and rectification values for each feed. Once the feeds are rectified (another way to describe this would be to normalize the images so that they can be interpreted correctly), the block matching algorithm can be used to determine the disparity map. The next step is to calculate the actual distance of each point from the camera, and then we can generate a point cloud. The algorithm only runs at about 5 fps right now on an HD frame on my NVS 4200. I checked with NVIDIA Nsight profiling and the block matching kernel takes 188 ms, which is the reason for the poor performance. See the image below.
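For reference, here is a minimal sketch of the disparity step using OpenCV's GPU block matcher, assuming the OpenCV 2.4 gpu module and already-rectified grayscale frames (the file names, Q matrix and block matching parameters are placeholders, not the exact values from my code). The reprojectImageTo3D call at the end is the point cloud step I still need to wire up, using the Q matrix from the calibration.

#include <opencv2/opencv.hpp>
#include <opencv2/gpu/gpu.hpp>

int main()
{
    // Already-rectified left/right frames, loaded as grayscale.
    cv::Mat leftHost  = cv::imread("left_rectified.png", 0);
    cv::Mat rightHost = cv::imread("right_rectified.png", 0);

    // Upload to the GPU and run block matching: 64 disparity levels, 19x19 window.
    cv::gpu::GpuMat leftGpu(leftHost), rightGpu(rightHost), dispGpu;
    cv::gpu::StereoBM_GPU bm(cv::gpu::StereoBM_GPU::BASIC_PRESET, 64, 19);
    bm(leftGpu, rightGpu, dispGpu);

    cv::Mat disp;
    dispGpu.download(disp);

    // Q is the 4x4 reprojection matrix from the stereo calibration;
    // xyz ends up holding one 3D point per pixel, i.e. the point cloud.
    cv::Mat Q = cv::Mat::eye(4, 4, CV_64F); // placeholder, use the calibrated Q
    cv::Mat xyz;
    cv::reprojectImageTo3D(disp, xyz, Q);

    cv::imshow("disparity", disp);
    cv::waitKey(0);
    return 0;
}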
NVIDIA Nsight profile of the stereo block matching kernel
As always, the code is up at https://github.com/arcanon/raspbot. I know, it won't really compile, which I hope to fix. Next I want to generate the point cloud, and then I will make everything easily compilable. Well, as easy as it can get, because it's using a whole bunch of complicated libraries...
So I made my first stereo camera this weekend with two Raspberry Pis. It actually worked out pretty easily. The stereo angle of the cameras is not exact and is only controlled with pieces of paper and elastic bands. The blue bands in the middle and the paper on the outer edges tilt the cameras a little more inwards. Here is an example anaglyph stereo image (you will need red/blue stereo glasses to view it properly):
It would be best to have exact screws which you could use to adjust the angles and such. There are small holes on the camera board that would allow these screws to be attached, so it's just a matter of finding the right adjustable screws. Thinking about it now, that should not be a big thing. I know the stereo alignment is a bit off, which should be fixed, but my eyes were able to find the right focus and you can see the 3D effect quite nicely. The code is up at https://github.com/arcanon/raspbot. It won't compile out of the box, but have a look at the video reader for the capture loop/anaglyph composition.
The CUDA kernel that composites the anaglyph looks like this:
__global__ void anaglyph_dev(char* imageLeft, char* imageRight, char* imageOut,
                             int pitchInputs, int pitchOutput, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;

    if (x >= width || y >= height)
        return;

    // Inputs are single-channel, the output is 4 bytes per pixel.
    int linearPosInputs = y * pitchInputs + x;
    int linearPosOutput = y * pitchOutput + x * 4;

    // Red channel from the left image
    imageOut[linearPosOutput]     = imageLeft[linearPosInputs];
    // Green left empty
    imageOut[linearPosOutput + 1] = 0;
    // Blue channel from the right image
    imageOut[linearPosOutput + 2] = imageRight[linearPosInputs];
}
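For completeness, a host-side launch looks roughly like the sketch below (the device pointers and pitch variables are placeholders; in the real code they come from the decoded frames and the pitched output allocation):

// Launch one thread per output pixel in 16x16 blocks.
dim3 block(16, 16);
dim3 grid((width + block.x - 1) / block.x,
          (height + block.y - 1) / block.y);
anaglyph_dev<<<grid, block>>>(d_imageLeft, d_imageRight, d_imageOut,
                              inputPitch, outputPitch, width, height);
cudaDeviceSynchronize();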
So I was noticing some issues with ORB on the GPU. The descriptors are obtained at this call (in https://github.com/arcanon/raspbot/blob/master/video_reader.cpp):
orbGpu(finalFrame, maskFrame, gpuKeyPoints, gpuDescriptors);
orbGpu.downloadKeyPoints(gpuKeyPoints, keyPoints);
These were somewhat "unreliable" because there was no Gaussian blur before calculating the descriptor values. You could see this easily if you paused the video, so that the key points were always calculated on the same image. In this case, the GPU points jumped around while the CPU points remained constant. This was fixed with orbGpu.blurForDescriptor = true;
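Put together, the GPU ORB path now looks roughly like this (a sketch assuming the OpenCV 2.4 gpu module, with the frame and mask already uploaded to the GPU; the variable names match the snippet above, the constructor argument is illustrative):

// Construct the GPU ORB detector and enable the Gaussian blur before
// descriptor extraction, which stops the key points from jumping around.
cv::gpu::ORB_GPU orbGpu(500);          // up to 500 key points
orbGpu.blurForDescriptor = true;

cv::gpu::GpuMat gpuKeyPoints, gpuDescriptors;
std::vector<cv::KeyPoint> keyPoints;

orbGpu(finalFrame, maskFrame, gpuKeyPoints, gpuDescriptors);
orbGpu.downloadKeyPoints(gpuKeyPoints, keyPoints);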
The next step will be a stereo camera setup, so that I can filter out the background, which will help a lot with filtering out false positives.
The other issue is that the camera's exposure time is too long, which is a problem for the speed at which the detection happens. You can't move the robot too fast (or the objects can't move past the robot too fast), because then they start to blur and nothing is recognizable.
Ever lost your coffee and needed some help finding it? I have, and it can be a stressful situation, as you don't have the coffee to help calm your nerves. That is where the raspbot jumps in to save the day. Just make sure you have a distinct enough mug, so that it does not get lost in the background. This makes use of OpenCV and object detection. It streams the video from a Raspberry Pi to a server, which is running OpenCV and has a GPU. The GPU is used to decode the H264 stream. Then either the GPU or the CPU is used to analyze a frame for interesting information (in this case, where is the mug). OpenCV has GPU and CPU support, so you can choose.
System overview
Here is the code for the server: https://github.com/arcanon/raspbot/blob/master/video_reader.cpp. The other core code is also in that repository. You obviously need OpenCV. I was running on Windows, where you have to fight a bit to get it compiled; I will give some tips on that at some point, and maybe it's also easier with newer versions.
Capture Thread:
1.) Winsock buffer connects to the netcat client on the PI
2.) Buffers data and passes it to H264StreamSource
3.) Data is then passed to the GPU for decoding
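As an illustration of step 1 above, the connection to netcat is just a plain TCP client. A minimal Winsock sketch might look like this (the address, port and buffer handling are placeholders, not the exact code in video_reader.cpp):

#include <winsock2.h>
#include <ws2tcpip.h>
#pragma comment(lib, "ws2_32.lib")

bool captureLoop()
{
    WSADATA wsa;
    if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0) return false;

    SOCKET s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    sockaddr_in addr = {};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(5000);                          // port netcat streams on
    inet_pton(AF_INET, "192.168.0.10", &addr.sin_addr);   // the PI's address
    if (connect(s, (sockaddr*)&addr, sizeof(addr)) != 0) return false;

    char buf[65536];
    int n;
    while ((n = recv(s, buf, sizeof(buf), 0)) > 0) {
        // hand buf[0..n) to the H264 stream source feeding the GPU decoder
    }
    closesocket(s);
    WSACleanup();
    return true;
}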
Detection Thread:
1.) Frame from the capture thread is received
2.) Converts it to grey scale
3.) Uses ORB/SURF for feature extraction, on the CPU or GPU
4.) Compares against the detection object (see the matching sketch after this overview)
5.) Sets up commands to send to the PI
Command Thread:
1.) Connects to the Python client on the PI
2.) When the Detection thread has commands, these are sent to the PI.
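Steps 3 and 4 of the detection thread boil down to extracting descriptors and matching them against the reference mug image. As a rough sketch, assuming ORB descriptors downloaded to the CPU and a brute-force Hamming matcher (the names and thresholds are illustrative, not the exact ones in video_reader.cpp):

// frameDescriptors: descriptors from the current frame (downloaded from the GPU).
// objectDescriptors: descriptors extracted once from the reference mug image.
cv::Mat frameDescriptors, objectDescriptors;

cv::BFMatcher matcher(cv::NORM_HAMMING);     // Hamming distance for binary ORB descriptors
std::vector<cv::DMatch> matches;
matcher.match(objectDescriptors, frameDescriptors, matches);

// Keep only reasonably close matches; if enough survive, the mug is in view.
std::vector<cv::DMatch> good;
for (size_t i = 0; i < matches.size(); ++i)
    if (matches[i].distance < 50)
        good.push_back(matches[i]);
bool mugFound = good.size() > 10;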
The first command would cancel the second, because the code on the atmega resets the sent command with readBytes[0] = 0 (https://github.com/arcanon/raspbot/blob/master/Blink.ino). The SPI receive processing is asynchronous (well, I guess the SPI interrupt function is serial with respect to the clock), but you don't know when the function is run. So the first command is received, and while the command processing in the main loop is still running, the next command is read into readBytes[0], which is then overwritten by the reset.
The fix is to have a ring buffer which saves all the commands and allows the main loop to run asynchronously to the SPI interrupt function. The new Blink.ino contains the fixed code.
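The idea looks roughly like the sketch below (the names and buffer size are illustrative, not the exact code in Blink.ino): the SPI interrupt only enqueues bytes, and the main loop dequeues them at its own pace, so no command can be wiped out while another one is being processed.

#include <avr/interrupt.h>

const uint8_t BUF_SIZE = 16;        // plenty of room for queued commands
volatile uint8_t cmdBuf[BUF_SIZE];
volatile uint8_t head = 0;          // written by the ISR
volatile uint8_t tail = 0;          // read by loop()

ISR(SPI_STC_vect)
{
    uint8_t next = (head + 1) % BUF_SIZE;
    if (next != tail) {             // drop the byte if the buffer is full
        cmdBuf[head] = SPDR;        // store the byte the master just clocked in
        head = next;
    }
}

void setup()
{
    pinMode(MISO, OUTPUT);                  // the slave drives MISO
    SPCR |= _BV(SPE) | _BV(SPIE);           // enable SPI slave mode with interrupt
}

void loop()
{
    while (tail != head) {                  // consume every queued command
        uint8_t cmd = cmdBuf[tail];
        tail = (tail + 1) % BUF_SIZE;
        handleCommand(cmd);                 // process motor/speed commands as before
    }
}

void handleCommand(uint8_t cmd)
{
    // ... existing command handling ...
}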
So I finally got some time to provide the updated info for my rasp bot. In this post, I give updates on how I have rebuilt the bot to drive more accurately (the last version, with such small wheels, did not fare so well on different surfaces, and the motors were not great) and I talk about the SPI enhancements for speed control and voltage read back.
The new build:
1.) a left and right servo for the wheels
2.) much larger wheels
3.) a wooden chassis to hold bread board and camera
I learnt during the upgrade that wood and elastic bands are really good materials to use when building custom projects. Wood because it's relatively easy to form, and elastic because you can hold the different pieces together easily without having to glue everything permanently. Later you can then reconfigure things with ease.
In this version, I switched back to a mini breadboard, so that I can add things to the circuit more easily. The last version was all soldered together, which was great fun, but really not practical in the long run when you are still trying things out. For the motors, I bought some servos, which are normally intended for controlled angular movement, but I removed those circuits and restrictions and now use them as plain geared DC motors. I liked this solution because the servos were relatively cheap and provide good power. The big wheels I found at a model shop around the corner. They are pretty much directly attached to the servos (the servos came with small plastic discs that I could glue to the wheels, with a CD providing a surface between the wheel and the servo disc attachment). This is also really nice because it was quite difficult to find a simple gear system for connecting a DC motor to wheels. Mostly people buy a remote-controlled toy and modify that, because then all the parts are pretty much there. However, you are then not able to build your own design.
Lastly, I now have the new Raspberry Pi camera. It's great for use with raspivid and streaming. You can get great video and choose your compression (a problem when you only have 440 KB/s DSL upload). That's all stuck or bound together as shown in the picture.
Speed control and voltage read back:
I was having some issues with capturing the video and analyzing it when moving. The issue is that with movement it takes a while for the stream to settle. Otherwise there is just too much noise or too many blurred images (especially in low light) to make reliable decisions. One solution is to make the movement less jerky, as up until now I was just turning the motor on and off (with an input voltage of 4.8 V). I did reduce the input to just 3.6 V, which helped, but still was not good enough. So I changed the code to send a speed value to the atmega with every movement control. The atmega then uses this value to toggle the enable pin of the H-bridge (L293), so that it is only on a certain percentage of the time. This is basically software PWM (pulse width modulation). The atmega does this by counting loop cycles and switching the enable pin on for the number of loop cycles sent from the Python script running on the PI. I could also use the atmega's hardware PWM pins, but those conflict with the SPI pins (the MOSI and SS pins are 2 of the 3 PWM pins).
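The loop-cycle software PWM boils down to something like this (the pin, period and variable names are illustrative, not the exact ones in Blink.ino; dutyCycles is the value sent from the Pi over SPI):

const int ENABLE_PIN = 9;          // L293 enable pin
const uint8_t PWM_PERIOD = 100;    // loop cycles per PWM period

volatile uint8_t dutyCycles = 60;  // on-cycles per period, updated from the SPI command
uint8_t cycleCount = 0;

void setup()
{
    pinMode(ENABLE_PIN, OUTPUT);
}

void loop()
{
    // Enable the H-bridge only for the first dutyCycles cycles of each period.
    digitalWrite(ENABLE_PIN, cycleCount < dutyCycles ? HIGH : LOW);
    cycleCount = (cycleCount + 1) % PWM_PERIOD;

    // ... read SPI commands, update dutyCycles, set the direction pins ...
}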
I have updated the source, https://github.com/arcanon/raspbot/blob/master/Blink.ino, and the Python script, https://github.com/arcanon/raspbot/blob/master/ardspi.py. The SPI read back is done using the quick2wire duplex method. Note that 1 byte from the slave is read back at exactly the same time as 1 byte is sent from the master. This means the slave needs to pre-prepare the data to be sent, or you discard the first byte; I have just done the latter for now. I send back the 10-bit conversion of the voltage measured by the Arduino on a pin which is connected to the positive battery supply of the motors (3.6 V) via a 10K resistor. The same pin is connected via another 10K resistor to ground. The voltage is then measured as 1.8 V, divided as expected. I have not done measurements to see how this varies over time.
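On the atmega side, the read back can be sketched like this (pin and variable names are assumptions, not the exact code in Blink.ino): loop() samples the divided battery voltage, and the SPI interrupt preloads the reply for the next transfer, which is why the master has to discard the first byte of an exchange.

volatile uint16_t batteryRaw = 0;   // latest 10-bit ADC reading of the divided battery voltage
volatile bool sendHighByte = true;

ISR(SPI_STC_vect)
{
    uint8_t incoming = SPDR;        // command byte from the PI; would go into the command ring buffer
    // Preload the reply for the *next* transfer, alternating high and low byte.
    SPDR = sendHighByte ? (batteryRaw >> 8) : (batteryRaw & 0xFF);
    sendHighByte = !sendHighByte;
}

void loop()
{
    batteryRaw = analogRead(A0);    // A0 sits at the midpoint of the 10K/10K divider
    // (a fuller sketch should guard this 16-bit write against the ISR)
    // ... motor control ...
}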
The Python script also contains some networking code, which I will write about in another post. That code allows the PI to talk to a server which is capturing the video stream (from raspivid) and then doing various kinds of video analysis. I can then make use of a desktop GPU to improve the speed of the analysis for object detection and such.
Just to add to my last post, I used the "minicom ama0" method of monitoring data from the atmega. Some more details here. I can't remember if I needed to change some special settings, but if anyone has issues I can check my settings if need be.