Continuing from my previous post, where I discussed the basic concepts of computer vision, here I will discuss colors in more detail: color depth, channels and color spaces.
More on Color Depth
As discussed in my previous post, there are three ways to represent an image with respect to its color: binary, grayscale and color. You also know that the depth of an image refers to the number of unique shades of color that constitute it. Let us look at the following once again.
Depth is expressed in “bits”, such as 1-bit, 8-bit, 16-bit, 24-bit, etc. An n-bit image means that there are 2^n unique shades of color in the image. For example, an 8-bit grayscale image has 2^8 = 256 different shades of gray ranging from black (0) to white (1). Similarly, a 16-bit grayscale image has 2^16 = 65,536 different shades of gray ranging from black (0) to white (1).
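To make the arithmetic concrete, here is a tiny Python sketch (the function name shades is my own, purely for illustration) that computes the number of shades for a given depth:

```python
def shades(bits):
    """Number of distinct intensity levels an n-bit image can hold."""
    return 2 ** bits

# Print the shade count for a few common depths.
for bits in (1, 8, 16, 24):
    print(f"{bits}-bit: {shades(bits):,} shades")
```

Running it lists 2, 256, 65,536 and 16,777,216 shades respectively.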
Relationship between Depth and Intensity
Next, an n-bit image also means that each of its pixels stores its intensity as an n-bit number. Please read this carefully, since it will help you out in many places where you need to work with individual pixels. Each pixel of an image has an intensity (as discussed in the previous post). This intensity is directly related to the depth of the image. In the previous point, it is stated that depth represents different shades of a color. Here, I state that shades are nothing but different intensity levels. For example, for an 8-bit grayscale image, there are 2^8 = 256 different shades of gray. This also means that each pixel can have 256 different intensity levels, level 0 to level 255. Here level zero (0) corresponds to black (0) and level 255 corresponds to white (1). Any value in between 0 and 255 represents a gray shade. Since level 255 maps to 1, the shade is the level divided by 255. For instance, a pixel intensity of 127 corresponds to 127 ÷ 255 ≈ 0.498 (roughly 0.5 gray), 56 corresponds to 56 ÷ 255 ≈ 0.220, 198 corresponds to 198 ÷ 255 ≈ 0.776, etc. Similarly, for a 16-bit grayscale image, there are 2^16 = 65,536 different shades of gray. Here intensity level zero (0) corresponds to black (0) and level 65535 corresponds to white (1). Again, any value in between 0 and 65535 represents a gray shade.
Now we know that each pixel has an intensity value between 0 and 2^n − 1. Considering an 8-bit image, each pixel can take one of 256 intensities, which means that each pixel is stored in an 8-bit format. An 8-bit image is also known as a “byte” image (because 8 bits = 1 byte, simple enough). The intensities, which lie between 0 and 255, are represented in 8-bit binary as demonstrated below.
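If you want to check the mapping from intensity level to gray shade yourself, the following Python sketch may help. It assumes the shade is obtained by dividing the level by the maximum level (2^n − 1), so that the top level maps to exactly 1. The function name normalize is just an illustrative choice.

```python
def normalize(level, bits=8):
    """Map an integer intensity level to a shade between 0.0 (black) and 1.0 (white)."""
    max_level = 2 ** bits - 1          # 255 for 8-bit, 65535 for 16-bit
    if not 0 <= level <= max_level:
        raise ValueError(f"level must be between 0 and {max_level}")
    return level / max_level

# A few 8-bit levels and one 16-bit level.
for level in (0, 56, 127, 198, 255):
    print(level, "->", round(normalize(level), 3))
print(65535, "->", normalize(65535, bits=16))
```

Dividing by the maximum level rather than by the number of levels is what keeps pure white at exactly 1.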
If you are familiar with binary, then it should be easy to understand, and you would have spotted an error in it as well. Each pixel intensity (0…255) is represented and stored in its 8 bit binary format. Thus, while working with individual pixels and manipulating their values, this little detail should always be kept in mind.
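As a quick way to see the 8-bit binary form of any intensity, here is a small Python sketch. The helper name to_binary is hypothetical; it simply wraps Python's built-in format().

```python
def to_binary(level, bits=8):
    """Return the fixed-width binary string for an intensity level."""
    if not 0 <= level < 2 ** bits:
        raise ValueError("level does not fit in the given number of bits")
    return format(level, f"0{bits}b")

# Show a few 8-bit intensities next to their binary form.
for level in (0, 1, 127, 128, 255):
    print(f"{level:3d} -> {to_binary(level)}")
```

You can convert back with int(to_binary(level), 2), which returns the original level.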
Still confused? The following pictures from Wikipedia should help. Here you can see a single image in five different depths. As the depth of the image increases, more colors are used in the construction of the image, which makes it look smoother and more realistic. When we look at the real world with our eyes, we see a practically unlimited number of colors and shades. But unfortunately, images, being digital in nature, cannot capture all of them, and hence are quantized to a finite number of colors determined by the depth of the image. We will discuss more about how cameras perform sampling and quantization of images in my upcoming posts.
Now let's discuss more about color images and how their color is actually represented. Every digital color image is represented according to a color space. There are many types of color spaces, some of which are RGB, RGBA, HSV, HSL, CMYK, YIQ, YUV, YCbCr, YPbPr, etc. We are interested only in RGB and HSV, and a little bit of YCbCr. I will give you a brief idea about these three; for the rest, please refer to Wikipedia.
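To get a feel for what reducing the depth does to the pixel values, here is a rough Python sketch. It is only one simple way to quantize (uniform bins, my own assumption), not how any particular camera or the Wikipedia pictures were produced. The function name quantize is illustrative.

```python
def quantize(level, bits):
    """Reduce an 8-bit level (0-255) to `bits` bits, then rescale to 0-255 for display."""
    levels = 2 ** bits
    index = level * levels // 256        # which of the `levels` bins the value falls into
    return index * 255 // (levels - 1)   # stretch the bin index back over 0..255

# Count how many distinct gray values survive at each depth.
for bits in (1, 2, 4, 8):
    unique = sorted({quantize(v, bits) for v in range(256)})
    print(f"{bits}-bit: {len(unique)} shades, first few: {unique[:4]}")
```

At 1 bit only black and white survive, while at 8 bits every level is kept unchanged.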
RGB Color Space
RGB stands for Red-Green-Blue. This color space uses a combination of the three primary colors, viz. Red (R), Green (G) and Blue (B), to represent any color in the image. This makes it the most widely used, intuitive and easy to use color model. It uses the technique of additive color mixing to create new colors. By mixing different intensities of Red, Green and Blue, we can get every color the image can represent. For 8-bit images, each channel (R, G and B) can have an intensity value between 0 and 255. Thus a mix of these three colors can result in 256 × 256 × 256 = 16,777,216 different colors! You can see it for yourself! Open MS Office, and there you can find the following Color Mixer:
There you can choose your color from a palette. You can also choose from two different color models (RGB and HSL). In the RGB model, you can view the individual values of R, G, and B. You can move the marker in the right column up and down to adjust the intensity level of that color. At the bottom, there is a horizontal scroll bar which determines the opacity of the color.
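Additive mixing is easy to try out in a few lines of Python. This is a minimal sketch that treats a color as an (R, G, B) tuple of 8-bit values and clamps each channel at 255. The names mix, RED, GREEN and BLUE are my own.

```python
RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)

def mix(*colors):
    """Additively mix 8-bit (R, G, B) tuples, clamping each channel at 255."""
    return tuple(min(255, sum(c[i] for c in colors)) for i in range(3))

print("red + green        =", mix(RED, GREEN))        # yellow
print("green + blue       =", mix(GREEN, BLUE))       # cyan
print("red + blue         =", mix(RED, BLUE))         # magenta
print("red + green + blue =", mix(RED, GREEN, BLUE))  # white

# Total number of colors available with three 8-bit channels.
total_colors = 256 ** 3
print("total colors:", total_colors)
```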
Let’s delve deeper into this. Let’s take the following image which I created using (surprisingly) MS Paint! The picture contains several bands of colors. The top band represents pure red color, below that it shows how it fades into white color. The same is repeated with pure green and pure blue, followed by yellow, cyan and magenta (which are a combination of two of the primary colors) and then two bands of white and black color.
I fed this image into a program which I wrote using OpenCV 2.4.3 and split it into its red, green and blue channels. This is what I got:
As you can see, in the Red (R) channel, the white areas indicate the presence of red color whereas black areas indicate its absence. As expected, the red, yellow, magenta and white areas of the input image have turned white whereas pure green, pure blue, pure cyan and black areas have turned black. In the other areas, you can see a transition from black to white, which means that the intensity of red color increases gradually in these areas. Similar explanations can be given for the Green (G) and Blue (B) channels.
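You can reproduce this behaviour on a toy image without OpenCV. The sketch below is not the program I used for the pictures. It is just a plain-Python illustration where an image is a list of rows and each pixel is an (R, G, B) tuple. The names image and split_channels are made up for this example.

```python
# A tiny 3x3 "color band" image: each pixel is an 8-bit (R, G, B) tuple.
image = [
    [(255, 0, 0),     (0, 255, 0),   (0, 0, 255)],    # red, green, blue
    [(255, 255, 0),   (0, 255, 255), (255, 0, 255)],  # yellow, cyan, magenta
    [(255, 255, 255), (0, 0, 0),     (255, 128, 128)] # white, black, pale red
]

def split_channels(image):
    """Return three single-channel (grayscale) images: red, green, blue."""
    red = [[px[0] for px in row] for row in image]
    green = [[px[1] for px in row] for row in image]
    blue = [[px[2] for px in row] for row in image]
    return red, green, blue

red, green, blue = split_channels(image)
for row in red:
    print(row)
```

In the red channel, the red, yellow, magenta and white pixels come out as 255 (white), while green, blue, cyan and black come out as 0 (black).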
But this is merely a computer generated image. Upon running the same code on the Lena image, this is what you get –
As you can see, the Red (R) channel is quite bright compared to the other two channels. This indicates that the overall intensity of red is higher than that of the other colors. Well, you can try it out yourself. I have shared a program called usingMouse.cpp in the Code Gallery. You can find the Windows executable here (I will post the Linux executable soon). There will be a file called usingMouse.exe along with the image lena.jpg. Download both to your computer. Then open the command prompt, go to the directory where you have downloaded the files, and run it by typing the following:
For now, let's skip how to create the executable (which we will discuss later). After running the executable, you should see a window with the lena.jpg image displayed in it. If you hover the mouse over the image, it should show the (x,y) coordinate position. If you left click somewhere on the image, it displays the RGB values of that pixel. If you right click anywhere in the image, it displays the HSV values (which we will discuss next).
So, basically, this program helps you to extract the RGB information of each pixel. Feel free to replace the lena.jpg image with any other image of your choice and explore it! Have fun with it!
HSV Color Space
HSV stands for Hue-Saturation-Value. Before we describe it, I would like to show you the following picture.
Suppose I want to extract the yellow region of the ball. In this case, there is a lot of variation in the color intensity due to the ambient lighting. The top portion of the ball is very bright, whereas the bottom portion is darker as compared to the other regions. This is where the RGB color model fails. Due to such a wide range of intensity and color mix, there is no particular range of RGB values which can be used for extraction. This is where the HSV color model comes in. Just like the RGB model, in HSV model also, there are three different parameters.
Hue: In simple terms, this represents the “color”. For example, red is a color. Green is a color. Pink is a color. Light Red and Dark Red both refer to the same color red. Light/Dark Green both refer to the same color green. Thus, in the above image, to extract the yellow ball, we target the yellow color, since light/dark yellow refer to yellow.
Saturation: This represents the “amount” of a particular color. For example we have red (having max value of 255), and we also have pale red (some lesser value, say 106, etc).
Value: Sometimes represented as intensity, it differentiates between the light and dark variations of that color. For example light yellow and dark yellow can be differentiated using this.
This makes the HSV color space far less sensitive to illumination than RGB, since changes in lighting mostly affect the value (and to some extent the saturation) while the hue stays nearly the same. This makes the processing of images easier. However, it is not very intuitive, and some people may find its concepts hard to grasp.
Now let's take the same color band image, convert it to HSV and split each channel. Here is what I got:
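Here is a small sketch using Python's standard colorsys module to show that light and dark variations of yellow share the same hue. The helper rgb8_to_hsv and the sample colors are my own choices. Note that colorsys works with values between 0 and 1, so the hue is scaled to degrees here, which may differ from the ranges other tools report.

```python
import colorsys

def rgb8_to_hsv(r, g, b):
    """Convert 8-bit R, G, B to (hue in degrees, saturation 0-1, value 0-1)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(h * 360), round(s, 2), round(v, 2)

samples = {
    "pure yellow":  (255, 255, 0),
    "light yellow": (255, 255, 128),
    "dark yellow":  (128, 128, 0),
}
for name, rgb in samples.items():
    print(f"{name:12s} RGB={rgb} -> HSV={rgb8_to_hsv(*rgb)}")
```

All three come out with a hue of 60 degrees. The light variant differs only in saturation and the dark variant only in value, which is why a single hue range can pick out the whole ball.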
As you can see, each and every color has a separate hue value. It can be seen clearly that different shades of red have the same hue value. Saturation refers to the amount of color. That’s why you can see such a variation in the saturation channel. The value (or intensity) is the same here since it is a computer generated image. In real images, there will be variation in the intensity channel as well. So, applying the same on the Lena image, we get something like this–
You can get the individual H, S and V values of any image by executing the usingMouse.exe file as discussed above.
But how the heck can a color image be comprised of three grayscale images?
Well, this is a genuine question, which any beginner should ask when learning about color spaces. In the examples above (both the color band and the Lena image), we have seen that the three channels separated from the original image are grayscale images, whether it is the red/green/blue channel of an RGB image or the hue/sat/val channel of an HSV image. Any great ideas?!
Well, you know that grayscale images are represented as a single channel. The intensity of each pixel is represented as a value in the range 0-255 (for 8-bit images). Well, this is exactly what happens here. If you use the usingMouse.exe file to view the RGB/HSV values of each pixel, you will realize that each of the three channels can be represented separately with a value in the range 0-255. Thus, when all the pixel values are taken together, we have three separate channels, each holding one intensity value for each of its pixels. This is exactly how a grayscale image is represented, and hence each channel is stored in that format.
The converse is equally true. When you combine three grayscale images, the result is a color image. And there is no magic here either! The three grayscale images are simply stacked, so that each pixel gets one value from each of them.
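To see the converse in action, here is a plain-Python sketch (again, an illustration rather than the OpenCV code) that stacks three small grayscale images into one color image. The names merge_channels, red, green and blue are hypothetical.

```python
# Three 2x2 grayscale images, one per channel (values 0-255).
red   = [[255, 0], [255, 255]]
green = [[0, 255], [255, 255]]
blue  = [[0, 0],   [0, 255]]

def merge_channels(red, green, blue):
    """Combine three single-channel images into one image of (R, G, B) pixels."""
    return [
        list(zip(r_row, g_row, b_row))
        for r_row, g_row, b_row in zip(red, green, blue)
    ]

color = merge_channels(red, green, blue)
for row in color:
    print(row)
```

The four resulting pixels are red, green, yellow and white, each built from one value per channel.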
Y’CrCb Color Space
We won’t go into its details. In short, this is another sophisticated color space, used in video processing and transmission. This is because one of the components (Y) carries the brightness information and is kept at full detail, whereas the other two components carry the color information and are compressed a lot, thus saving bandwidth for transmission. In short,
Y = Luma or Luminance or Intensity
Cr = RED component minus a reference value (the luma, Y)
Cb = BLUE component minus a reference value (the luma, Y)
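For the curious, the conversion can be sketched in a few lines of Python. The constants below are the commonly documented 8-bit full-range ones (with an offset of 128 for the two color components). Treat them as an assumption on my part, since different standards and devices use slightly different values. The function name rgb_to_ycrcb is illustrative.

```python
def rgb_to_ycrcb(r, g, b):
    """Convert an 8-bit RGB pixel to (Y, Cr, Cb), assuming full-range constants."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luma: weighted sum of R, G, B
    cr = (r - y) * 0.713 + 128              # red minus luma, shifted to 0..255
    cb = (b - y) * 0.564 + 128              # blue minus luma, shifted to 0..255
    # Round and clamp each component to the valid 8-bit range.
    return tuple(max(0, min(255, round(v))) for v in (y, cr, cb))

for name, rgb in [("white", (255, 255, 255)), ("black", (0, 0, 0)), ("red", (255, 0, 0))]:
    print(f"{name:5s} RGB={rgb} -> YCrCb={rgb_to_ycrcb(*rgb)}")
```

Notice that white and black differ only in Y, while their Cr and Cb both sit at the neutral value of 128.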
Several modern frame grabbers return this type of images. There was a time when I was taking in frames from a camera, and I was confused by the image it was returning to me with a blue-ish tint. At that time I was unfamiliar with this color space. After learning about it, I converted it to RGB format to get the actual desired color picture. So before using any camera, do check out the type of image returned by it.
Well, channels are a concept we have been talking about throughout this tutorial. For example, in an RGB image, there are three channels: R, G and B. In an HSV image, there are three channels: H, S and V. So this is not a new concept. As per Wikipedia, a channel is a grayscale image comprising only one of the components (R/G/B or H/S/V) of the image.
Usually software like OpenCV and MATLAB support images up to four channels. Usually we have three channels (like RGB, HSV), but sometimes we also have a fourth channel (like RGBA, CMYK).
Here we will discuss how to implement these things in software like MATLAB and OpenCV.
To convert an image from one color space to another, the function to use is cvCvtColor(). The syntax is:
cvCvtColor ( <source>, <destination>, <conversion_code> );
cvCvtColor( src, dst, CV_BGR2HSV);
Important: OpenCV stores color images in BGR channel order (not RGB) by default. Please keep this in mind before implementing it.
There are lots of conversion codes, some of which are:
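To illustrate what the BGR ordering means for a single pixel, here is a tiny plain-Python sketch. It does not use OpenCV. The helper name bgr_to_rgb is made up for this example.

```python
def bgr_to_rgb(pixel):
    """Reorder a (B, G, R) pixel into (R, G, B)."""
    b, g, r = pixel
    return (r, g, b)

# In BGR order, pure red is stored with its 255 in the LAST position.
pure_red_bgr = (0, 0, 255)
print("BGR", pure_red_bgr, "-> RGB", bgr_to_rgb(pure_red_bgr))
```

If you read a BGR pixel as if it were RGB, red and blue get swapped, which is a common source of strange-looking colors.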
CV_BGR2HSV, CV_HSV2BGR, CV_GRAY2BGR, CV_BGR2GRAY
If you want to split the different channels, then use
cvSplit( <source>, <dest0>, <dest1>, <dest2>, <dest3>);
cvSplit( src, b, g, r, NULL);
If you want to merge different channels, then use
cvMerge( <src0>, <src1>, <src2>, <src3>, <dest>);
cvMerge( h, s, v, NULL, dst);
In MATLAB, to convert an image from one form to another, there are separate functions for each type of conversion:
I = rgb2hsv (img);
I = hsv2rgb (img);
I = rgb2gray (img);
If you want to view the individual channels of the image, then try this out:
redChannel = rgbImage (:, :, 1);
greenChannel = rgbImage (:, :, 2);
blueChannel = rgbImage (:, :, 3);
This works because in MATLAB, everything is represented as arrays and matrices, and thus the third index selects a plane along the third dimension of the matrix.
If you want to combine the different channels, then use the cat() function, which basically concatenates the arrays and matrices:
rgbImage = cat (3, redChannel, greenChannel, blueChannel);
The codes used to generate the RGB/HSV images and channels used in this post can be found in the CV Code Gallery of maxEmbedded.
So folks, this was all about colors and color spaces. Let's summarize it.
- We discussed color depth in detail, including its relationship with intensity.
- We discussed three color spaces: RGB, HSV and Y’CrCb.
- In RGB and HSV color models, we discussed the role of each component along with two demonstrations.
- The RGB and HSV values were studied using the usingMouse.exe program from the code gallery.
- Then we defined the already known concept of image channels.
- Then we learnt about the software implementation of the techniques discussed in this tutorial in both OpenCV and MATLAB.
Thank you for reading this long post till the end! I would be glad if you post your feedback, doubts, queries, etc. as comments below, so that I will be encouraged to write more tutorials for you!
Till then, please subscribe to maxEmbedded via email or RSS Feeds to stay updated.
VIT University, Vellore, India