Vision Tutorial
This tutorial will help you get started with computer vision on the Raspberry Pi. First, we will get you up and running by capturing images using OpenCV and Python; then we will go through some image processing methods which you may find useful for your final project.
OpenCV Python Basics
This part of the tutorial is designed to help you get started with capturing images and using OpenCV on the Raspberry Pi. Throughout this tutorial, we will be using Python 3 and a Raspberry Pi camera; however, for your project you may use any programming language or camera you like.
Importing Libraries
The main libraries which you need to be familiar with for the vision component of your task are the OpenCV (open source computer vision) and numpy (numerical python) libraries. Both libraries have been pre-installed on your Raspberry Pi for you (OpenCV can take hours to install); however, if for any reason you do need assistance installing these at any stage, we are happy to guide you through it.
To import these libraries, you should have the following at the top of each script:
import cv2
import numpy as np
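A quick way to confirm both libraries are installed for the interpreter you are running is to print their version strings; if either import fails, the library is missing from that Python installation.
import cv2
import numpy as np
print(cv2.__version__)  # OpenCV version string, e.g. 3.x.x or 4.x.x depending on your install
print(np.__version__)   # numpy version string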
The Camera Object
The next step is to create your camera object. There are two modules that you can use for this: the picamera module, or OpenCV's inbuilt cv2.VideoCapture() module. There are advantages to each: the picamera module offers more control over the image, but cv2.VideoCapture() will work with any camera, so it is up to you to decide which module you would like to use for your project. Below we offer an example of each approach; for more detail, see the Raspberry Pi camera basics and OpenCV basics guides.
import time
import picamera
with picamera.PiCamera() as camera:
    camera.resolution = (1024, 768)
    camera.start_preview()
    # Camera warm-up time
    time.sleep(2)
    camera.capture('foo.jpg')
# See https://picamera.readthedocs.io/en/release-1.10/recipes1.html
cap = cv2.VideoCapture(0) # Connect to camera 0 (or the only camera)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 320) # Set the width to 320 (property ID 3)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 240) # Set the height to 240 (property ID 4)
ret, frame = cap.read() # Get a frame from the camera
# See https://docs.opencv.org/3.0-beta/modules/videoio/doc/reading_and_writing_video.html#videocapture-set
Displaying Frames Using OpenCV and Python
Now that you have called the function to read a frame from the camera and stored the data in your 'frame' variable, you may wish to display your image. The 'ret' variable you just created tells you whether or not the frame was successfully obtained, so first you must check that the read was a success. Then, you can use the OpenCV cv2.imshow() function to display your frame!
if ret: # Check if data was obtained successfully
    cv2.imshow("CameraImage", frame) # Display the obtained frame in a window called "CameraImage"
    cv2.waitKey(0) # Make the program wait until you press a key before continuing.
# Please note: cv2.waitKey() must be called for cv2.imshow to display.
# The input parameter to the cv2.waitKey() function is the wait time delay in milliseconds.
# If you do not want infinite delays (0 = wait forever), simply put a 1 or other millisecond
# value in the brackets instead.
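Putting this together, a common pattern is a continuous preview loop that reads and displays frames until a key is pressed. Below is a minimal sketch of this, assuming the cv2.VideoCapture object 'cap' from earlier; the 1 millisecond waitKey delay keeps the window updating between frames, and the 'q' key is an arbitrary choice of exit key.
while True:
    ret, frame = cap.read()  # Grab the next frame from the camera
    if not ret:              # Stop if a frame could not be read
        break
    cv2.imshow("CameraImage", frame)
    # waitKey(1) waits 1 ms for a keypress; press 'q' to exit the loop
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break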
If we wanted to save this image, we could add a save step after obtaining the image.
cv2.imwrite("frame0001.png", frame) # Save the frame as frame0001.png
The Clean Up Step
To ensure a clean exit of your program, there are a couple of things you should do at the end of your script. Since we have opened an instance of the camera object and some pop-up windows using OpenCV, we need to close these prior to termination of the program. The following functions should be at the end of each vision script you run.
cap.release() # Release the camera object
cv2.destroyAllWindows() # Close all opencv pop-up windows
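One way to guarantee these clean-up calls run even if your script raises an error part-way through is to wrap your capture code in a try/finally block. This is just one possible structure, sketched below for the cv2.VideoCapture approach.
import cv2

cap = cv2.VideoCapture(0)
try:
    ret, frame = cap.read()
    if ret:
        cv2.imshow("CameraImage", frame)
        cv2.waitKey(0)
finally:
    cap.release()            # Always release the camera object
    cv2.destroyAllWindows()  # Always close all OpenCV pop-up windows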
OpenCV Image Processing
Now that you have had the chance to read in an image using Python and OpenCV on your Raspberry Pi, you can start looking at the image processing steps you may want to use in your project. This part of the tutorial will present a range of different image processing methods which you may find helpful.
Changing Colour Spaces
Typically, images are captured using a derivative of the RGB colourspace; however, there are many other colourspaces available for you to test as well. The default frame will be read in as BGR (blue, green, red) using the cv2.VideoCapture or picamera modules. Changing colourspaces after obtaining an image is quite simple using OpenCV, and examples of this are presented below.
hsv_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV) # Convert from BGR to HSV colourspace
lab_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2Lab) # Convert from BGR to Lab colourspace
For other colour spaces, such as YUV etc., refer to the OpenCV cvtColor docs for the flag to pass into the cv2.cvtColor() function.
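If you want to see every conversion flag your OpenCV installation supports, a quick check (taken from the standard OpenCV-Python tutorials) is to list all attributes of the cv2 module that start with COLOR_:
import cv2
# Print every colour conversion flag available in this OpenCV build
flags = [name for name in dir(cv2) if name.startswith('COLOR_')]
print(flags)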
Splitting Colour Channels and Creating Regions of Interest
It may be useful to operate on individual colour channels, or on regions of interest within images. Extracting a single colour channel produces a grayscale image which can be used for many image processing applications. For example, let's analyse the example frame below (resolution set to 320 x 240);
[Image: example input frame, 320 x 240]
b, g, r = cv2.split(frame) # Split the frame into its 3 colour channels
b = b*0 # Set the blue pixels to zero
g = g*0 # Set the green pixels to zero
frame = cv2.merge((b, g, r)) # Merge the channels back into one image
If there was any red in the input image, the modified image should appear red-tinged (since we cancelled out all blue and green pixel values). As you can see from the image below, there is a significant difference between setting colour channels to zero and extracting an individual channel (as produced by the split above). Setting, for example, the blue and green channels to zero will simply change a pixel from, say [180,200,135] to [0,0,135]. The other colour channels are still present, so the image is still a BGR image; we have just cancelled the appearance of all channels except red.
[Image: frame after zeroing the blue and green channels]
You should also be aware that splitting and merging channels with OpenCV is computationally costly, so it should be avoided where possible. An alternative is to index the numpy array directly, as shown below. Note: OpenCV reads images into Python as numpy arrays, indexed as [rows, columns, colour_channels].
frame[:, :, 0] = 0 # Set colour channel 0 (blue) to 0
You may also wish to extract a region of interest from an image at some stage. This can be done by indexing the numpy array. For example, to extract a 100 by 100 pixel region within a frame and double the blue channel values;
sub_frame = frame[200:300, 200:300, 0] # Extract the blue colour channel of a 100 x 100 pixel region
sub_frame = sub_frame * 2 # Double the blue channel values (note: uint8 values above 127 will wrap around)
frame[200:300, 200:300, 0] = sub_frame # Replace the region in the original image with our sub frame
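Be careful with arithmetic on image arrays: numpy uint8 operations wrap around on overflow (250 + 10 becomes 4), whereas OpenCV's arithmetic functions saturate (250 + 10 becomes 255). If wraparound is not what you want, a safer sketch of the doubling step, assuming the 'frame' variable from the example above, is:
sub_frame = frame[200:300, 200:300, 0]
# cv2.add saturates at 255 instead of wrapping around like numpy uint8 arithmetic
sub_frame = cv2.add(sub_frame, sub_frame)  # Double the values, clamped to 255
frame[200:300, 200:300, 0] = sub_frame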
Image Segmentation
Image segmentation is the division of an image into regions or segments which correspond to different objects within the image and is a critical step in most image analyses. In OpenCV there are many approaches to image segmentation.
Simple thresholding can be completed using the cv2.threshold() function, which takes the following format;
ret, dst = cv2.threshold(src, thresh, maxValue, type)
In this function, the parameters have the following meanings:
src → The input image (this should be a grayscale image)
thresh → The threshold value
maxValue → A maximum value, used for binary thresholding
type → The thresholding method being used (see the OpenCV docs linked below for parameter options)
dst → The output (thresholded) image
ret → The threshold value that was actually used (this matters for automatic methods such as Otsu's, shown below)
For more details regarding the threshold function, see the documentation at OpenCV cv2.threshold() docs. An overview of the mathematics behind various thresholding methods is provided in the OpenCV thresholding methods docs.
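One of those methods, Otsu's binarisation, chooses the threshold value automatically from the image histogram. Below is a minimal sketch, assuming a grayscale image stored in a variable named gray_frame:
# Pass a threshold of 0 and add the THRESH_OTSU flag; OpenCV computes the threshold itself
ret, otsu_frame = cv2.threshold(gray_frame, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print(ret)  # The threshold value Otsu's method selected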
For example, we may wish to threshold the blue channel of an image such that
- all pixels > 127 are set to 255
- all pixels ≤ 127 are set to 0
We would first extract the colour channel we wish to analyse (which will display as a grayscale image). We can then threshold this channel using the binary thresholding method with a threshold of 127;
import io
import time
import picamera
import cv2
import numpy as np

# Create the in-memory stream
stream = io.BytesIO()
with picamera.PiCamera() as camera:
    time.sleep(2)  # Camera warm-up time
    camera.capture(stream, format='jpeg')
# Construct a numpy array from the stream
data = np.frombuffer(stream.getvalue(), dtype=np.uint8)
# "Decode" the image from the array, preserving colour
frame = cv2.imdecode(data, 1)
frame_blue = frame[:, :, 0] # Extract blue channel
ret, thresholded_frame = cv2.threshold(frame_blue, 127, 255, cv2.THRESH_BINARY) # Threshold blue channel
cv2.imshow("Binary Thresholded Frame", thresholded_frame) # Display thresholded frame
cv2.waitKey(0) # Exit on keypress
cv2.destroyAllWindows()
The effect of this script on an input frame can be seen below, where the left image is the input image, the middle frame is the blue colour channel, and the right image is the thresholded frame.
[Images: input frame (left), blue colour channel (middle), thresholded frame (right)]
As you can see from the above images, our thresholding is clearly not adequate to segment the image for blue regions (since we highlighted the walls, not the blue goal). To correctly extract the blue goal, we should change this threshold, consider range-based thresholding (keeping only pixels between 127 and 190, for example), or convert to a different colourspace.
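For range-based thresholding, OpenCV provides cv2.inRange(), which keeps pixels whose values fall between a lower and an upper bound. Below is a sketch combining this with an HSV conversion; the blue hue bounds are hypothetical placeholders, and you will need to tune them for your own camera and lighting.
import cv2
import numpy as np

hsv_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)  # Convert to HSV colourspace
# Hypothetical bounds for "blue" hues; tune these for your own scene
lower_blue = np.array([100, 100, 50])
upper_blue = np.array([130, 255, 255])
# Pixels inside the range become 255, all others become 0
mask = cv2.inRange(hsv_frame, lower_blue, upper_blue)
cv2.imshow("Blue Mask", mask)
cv2.waitKey(0)
cv2.destroyAllWindows()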