Robotics/Components/Video

From Wikibooks, open books for an open world

Video components fall into two categories. The first is a video link: a camera, some form of transmission (wire or radio) and a display. The second is computer vision.

Camera, transmission, display

Very small cameras are available, some even with a built-in wireless connection, that are cheap and need hardly any external components. Mounted on a robot, such a camera lets the user see through the robot's "eyes".

If the robot has an on-board computer (a single-board computer or a laptop), a webcam can be used. This allows the robot to be controlled over a network or even the internet.

Computer Vision

Computer vision is a relatively new area. It tries to give a computer eyes and the ability to "understand" what it sees. The first step is easy: webcams have been around for quite some time. The second step, understanding the image, is the hard one.

Image processing plays a substantial part in robotics wherever computer vision is used. Its many aspects include demosaicing, image registration, differencing, recognition and segmentation.

Demosaicing

Demosaicing is usually the first step in image processing for computer vision, because it takes place immediately after the image is captured. Most color sensors sit behind a color filter array (CFA), such as a Bayer filter, so each pixel records only one of the three color components. A demosaicing algorithm takes this raw CFA output and reconstructs a full color image by interpolating the two missing components at every pixel. The image shown on the left is the output from the CFA and the image on the right shows the image reconstructed by the demosaicing algorithm. Because two thirds of the color values are estimated rather than measured, the reconstruction loses some detail and sharpness, which reduces the effective resolution.
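As a rough illustration, the sketch below reconstructs a color image from a Bayer mosaic by averaging neighboring samples. It assumes an RGGB filter layout and an image of at least 2x2 pixels; the function names bayer_channel and demosaic are made up for this example, and real cameras use more sophisticated interpolation.

```python
def bayer_channel(row, col):
    """Channel (0=R, 1=G, 2=B) sampled at (row, col), assuming an RGGB layout."""
    if row % 2 == 0:
        return 0 if col % 2 == 0 else 1
    return 1 if col % 2 == 0 else 2


def demosaic(mosaic):
    """Rebuild an RGB image from a Bayer mosaic given as a list of rows.

    Each missing color component is estimated as the average of the samples
    of that color in the surrounding 3x3 window (clipped at the borders).
    """
    height, width = len(mosaic), len(mosaic[0])
    image = []
    for r in range(height):
        out_row = []
        for c in range(width):
            sums, counts = [0, 0, 0], [0, 0, 0]
            for rr in range(max(r - 1, 0), min(r + 2, height)):
                for cc in range(max(c - 1, 0), min(c + 2, width)):
                    ch = bayer_channel(rr, cc)
                    sums[ch] += mosaic[rr][cc]
                    counts[ch] += 1
            pixel = [sums[ch] / counts[ch] for ch in range(3)]
            # Keep the component that was actually measured at this pixel.
            pixel[bayer_channel(r, c)] = mosaic[r][c]
            out_row.append(tuple(pixel))
        image.append(out_row)
    return image


# Hypothetical test scene: a uniformly colored surface seen through the filter.
flat_color = (100, 50, 20)
mosaic = [[flat_color[bayer_channel(r, c)] for c in range(4)] for r in range(4)]
image = demosaic(mosaic)
```

On a uniformly colored scene the interpolation is exact. On real images the averaging blurs edges, which is the loss of sharpness described above.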

Image Registration

Image registration is the process of transforming multiple images into a single coordinate system. Once the images share a coordinate system, it is easier to compare and combine data from images taken at different times or from different perspectives. There are two main types of image registration, distinguished by how the images are aligned. Intensity-based methods use correlation metrics to compare intensity patterns in the images. Feature-based methods match points, lines and contours found in the images. Image registration is also used in digital photography, for example in photo stitching.
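A minimal sketch of the intensity-based approach, assuming the two images differ only by a small translation: every candidate shift is tried and the one with the lowest mean squared intensity difference is kept. The function names and the max_shift parameter are illustrative choices, not part of any particular library.

```python
def mean_squared_difference(reference, moving, dx, dy):
    """Compare reference[y][x] with moving[y + dy][x + dx] over the overlap."""
    height, width = len(reference), len(reference[0])
    total, count = 0, 0
    for y in range(height):
        for x in range(width):
            sy, sx = y + dy, x + dx
            if 0 <= sy < height and 0 <= sx < width:
                total += (reference[y][x] - moving[sy][sx]) ** 2
                count += 1
    return total / count if count else float("inf")


def register_translation(reference, moving, max_shift=2):
    """Return the (dx, dy) shift that best maps reference onto moving."""
    candidates = [
        (dx, dy)
        for dy in range(-max_shift, max_shift + 1)
        for dx in range(-max_shift, max_shift + 1)
    ]
    return min(
        candidates,
        key=lambda s: mean_squared_difference(reference, moving, s[0], s[1]),
    )


# Hypothetical 6x6 grayscale images: a small L-shaped blob, then the same
# blob moved one pixel right and one pixel down.
reference = [[0] * 6 for _ in range(6)]
reference[1][1], reference[1][2], reference[2][1] = 200, 150, 100
moving = [[0] * 6 for _ in range(6)]
for y in range(5):
    for x in range(5):
        moving[y + 1][x + 1] = reference[y][x]

shift = register_translation(reference, moving)
```

An exhaustive search like this is only practical for small shifts. Rotation, scaling and larger displacements need a richer transformation model and a smarter search.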

Image Differencing

Image differencing finds the changes between two images by comparing them pixel by pixel. A typical application is a traffic camera that needs to determine where cars are located in an image. Taking the difference between an image and the frame before it gives an image similar to the one shown, in which the outlines of the vehicles stand out. The problem with this method is that only the pixels that changed between the two frames show up: where a car overlaps its own position in the previous frame the difference is small, which leaves a gap in the middle of the car. To solve this problem, the current image can instead be compared with a stock image taken when no cars were present, which gives a much clearer picture of where the cars are located.
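The two variants can be compared with a small sketch. The frames below are made-up grayscale images in which a bright two-pixel-wide "car" moves one pixel to the right; the threshold value and the function name difference_mask are assumptions chosen for illustration.

```python
def difference_mask(frame, reference, threshold=30):
    """Return 1 where two grayscale images differ by more than threshold."""
    return [
        [1 if abs(p - q) > threshold else 0 for p, q in zip(row_f, row_r)]
        for row_f, row_r in zip(frame, reference)
    ]


# Empty road (the stock image), the previous frame and the current frame.
background = [[10] * 6 for _ in range(4)]
previous = [row[:] for row in background]
current = [row[:] for row in background]
for r in (1, 2):
    for c in (1, 2):
        previous[r][c] = 200  # car occupies columns 1-2
    for c in (2, 3):
        current[r][c] = 200   # car has moved to columns 2-3

frame_diff = difference_mask(current, previous)
background_diff = difference_mask(current, background)

print(frame_diff[1])       # [0, 1, 0, 1, 0, 0]
print(background_diff[1])  # [0, 0, 1, 1, 0, 0]
```

The frame-to-frame difference marks only the column the car left and the column it entered, with a gap where it overlaps its old position. The difference against the empty background marks exactly the columns the car occupies now.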

Image Recognition

Image recognition is one of the hardest problems in computer vision. Nothing comes close to what a human can accomplish, but robots can handle simpler tasks, such as recognizing simple geometric shapes, human faces and fingerprints. The problem has several aspects: object recognition, identification and detection. One technique for recognition is geometric hashing. The geometric hashing algorithm uses a model image of a particular object, such as a computer mouse, to determine whether that object appears in a given image.
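The idea behind geometric hashing can be sketched on 2D feature points. This simplified version assumes the feature points have already been extracted and are distinct, represents them as complex numbers, and tolerates translation, rotation and uniform scaling. The names basis_coords, build_table and best_match, the bin size and the sample points are all illustrative.

```python
from collections import defaultdict
from itertools import permutations


def basis_coords(points, i, j, bin_size=0.25):
    """Quantized coordinates of the other points in the frame where
    points[i] is the origin and points[j] is at (1, 0)."""
    origin, span = points[i], points[j] - points[i]
    keys = []
    for k, p in enumerate(points):
        if k not in (i, j):
            z = (p - origin) / span  # removes translation, rotation, scale
            keys.append((round(z.real / bin_size), round(z.imag / bin_size)))
    return keys


def build_table(model, bin_size=0.25):
    """Preprocessing: hash the model points under every ordered basis pair."""
    table = defaultdict(set)
    for i, j in permutations(range(len(model)), 2):
        for key in basis_coords(model, i, j, bin_size):
            table[key].add((i, j))
    return table


def best_match(table, scene, bin_size=0.25):
    """Recognition: try each scene basis and return the highest vote count
    that any single model basis receives."""
    best = 0
    for i, j in permutations(range(len(scene)), 2):
        votes = defaultdict(int)
        for key in basis_coords(scene, i, j, bin_size):
            for basis in table.get(key, ()):
                votes[basis] += 1
        if votes:
            best = max(best, max(votes.values()))
    return best


# Hypothetical model: four feature points of an object.
model = [0 + 0j, 4 + 0j, 4 + 2j, 1 + 3j]
table = build_table(model)

# Scene: the model rotated by 90 degrees, scaled by 2 and shifted,
# plus one unrelated clutter point.
scene = [p * 2j + (10 + 5j) for p in model] + [20 + 20j]
votes = best_match(table, scene)
object_found = votes >= len(model) - 2
```

With a matching basis, every remaining model point casts a vote, so a count of at least len(model) - 2 is taken here as evidence that the object is present. A practical system would also verify the match and handle noisy or missing points.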

Image Segmentation

Image segmentation is used to simplify other image processing tasks. It breaks an image down into smaller regions, which makes object detection and edge detection easier. The K-means algorithm can be used to segment the image into K clusters. The basic algorithm is shown below.

// Basic K-means algorithm

1. Pick K cluster centers, either randomly or based on some heuristic

2. Assign each pixel in the image to the cluster whose center is closest to it (this minimizes the variance within each cluster)

3. Re-compute the cluster centers by averaging all of the pixels in the cluster

4. Repeat steps 2 and 3 until convergence is attained (e.g. no pixels change clusters)
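The four steps above can be sketched in a few lines for grayscale pixel values. In this example the initial centers are passed in by the caller, and the darkest and brightest values are used as a simple heuristic for step 1; the function name kmeans_segment and the max_iter safety limit are assumptions made for illustration.

```python
def kmeans_segment(pixels, centers, max_iter=100):
    """Cluster grayscale values; return (label per pixel, final centers)."""
    centers = list(centers)  # step 1: initial centers supplied by the caller
    labels = [None] * len(pixels)
    for _ in range(max_iter):
        # Step 2: assign each pixel to the nearest cluster center.
        new_labels = [
            min(range(len(centers)), key=lambda k: (p - centers[k]) ** 2)
            for p in pixels
        ]
        # Step 4: stop once no pixel changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: move each center to the mean of its pixels.
        for k in range(len(centers)):
            members = [p for p, label in zip(pixels, labels) if label == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels, centers


# Hypothetical image flattened to a list of intensities: dark road, bright car.
pixels = [12, 15, 10, 200, 210, 205, 14, 198]
labels, centers = kmeans_segment(pixels, centers=[min(pixels), max(pixels)])

print(labels)   # [0, 0, 0, 1, 1, 1, 0, 1]
print(centers)  # [12.75, 203.25]
```

For color images the same loop applies with each pixel treated as a vector (for example its red, green and blue values) and the squared Euclidean distance used in step 2.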