# Cg Programming/Unity/Computing Color Histograms

A cat picture.
Color histogram of the cat picture above with the horizontal axis representing RGB values and the vertical axis representing the frequency of these values.

This tutorial shows how to compute a color histogram of an image with the help of compute shaders in Unity. In particular, it shows how to use an atomic function such that multiple threads (i.e., multiple calls to a compute shader function) can safely update the same memory location. It also shows how to use compute buffers. If you are not familiar with compute shaders in Unity, you should read Section “Computing Image Effects” first. Note that compute shaders are only available on platforms and graphics APIs that support them; if in doubt, check `SystemInfo.supportsComputeShaders` at run time.

## Computing Color Histograms in General

An RGB color histogram of an image is a bar chart that shows, for each value of the red, green, and blue channels, how many pixels of the image have that value. For example, it shows how many pixels have a red value of 0, how many pixels have a green value of 0, etc. For a color resolution of 8 bits, there are 256 possible values (0 to 255) for each of the red, green, and blue channels; thus, an RGB color histogram specifies 3 × 256 = 768 numbers. If an alpha channel is also included, the RGBA color histogram consists of 4 × 256 = 1024 numbers.

To compute such an RGBA color histogram, a program first initializes the 1024 numbers of the histogram to 0. Then it looks at each pixel of the image and increments (by 1) the four numbers in the histogram that correspond to the red, green, blue, and alpha values of the pixel. Since the same operations are performed for each pixel, this problem is easy to parallelize, except that two different threads for two different pixels might try to increment the same number of the histogram at the same time, which can lead to problems that are called race conditions. These problems can be avoided if the operation to increment one of the numbers of the histogram is an atomic operation, i.e., if it cannot be interrupted by other threads. This is what we use in the compute shader of this tutorial.
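For comparison, the sequential version of this algorithm can be sketched in plain C# on the CPU. This is only an illustrative reference, not part of the tutorial's code: the class name `CpuHistogram`, the input format (a byte array with 4 bytes per pixel in RGBA order), and the flat layout `value * 4 + channel` are assumptions made for this sketch.

```csharp
using System;

public static class CpuHistogram
{
    // Returns 1024 counters stored as [value * 4 + channel],
    // where channel 0 = red, 1 = green, 2 = blue, 3 = alpha.
    // (The layout is an assumption of this sketch.)
    public static uint[] Compute(byte[] rgbaPixels)
    {
        if (rgbaPixels.Length % 4 != 0)
        {
            throw new ArgumentException("Expected 4 bytes per pixel.");
        }

        // Step 1: all counters start at 0 (C# zero-initializes new arrays).
        uint[] histogram = new uint[256 * 4];

        // Step 2: for each pixel, increment one counter per channel.
        for (int i = 0; i < rgbaPixels.Length; i += 4)
        {
            for (int channel = 0; channel < 4; channel++)
            {
                int value = rgbaPixels[i + channel];
                histogram[value * 4 + channel]++;
            }
        }
        return histogram;
    }
}
```

For the two pixels (255, 0, 0, 255) and (255, 128, 0, 255), the counter for red value 255 ends up at 2, while the counters for green values 0 and 128 each end up at 1. Because only one thread runs here, no race conditions can occur; the atomic operation becomes necessary only once the loop over the pixels is executed in parallel.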

## The Big Picture: Calling the Compute Shader

In this tutorial, we start with the C# script that calls the compute shader because it provides the bigger picture. Note that we compute color histograms for any texture image, not only for camera views as in Section “Computing Image Effects”. Thus, you can attach this script to any `GameObject`.

```
using UnityEngine;

public class histogramScript : MonoBehaviour {

    public ComputeShader shader;
    public Texture2D inputTexture;
    public uint[] histogramData;

    ComputeBuffer histogramBuffer;
    int handleMain;
    int handleInitialize;

    void Start ()
    {
        if (null == shader || null == inputTexture)
        {
            Debug.Log("Shader or input texture missing.");
            return;
        }

        // look up the indices of the two compute shader functions
        handleInitialize = shader.FindKernel("HistogramInitialize");
        handleMain = shader.FindKernel("HistogramMain");
        histogramBuffer = new ComputeBuffer(256, sizeof(uint) * 4);
        histogramData = new uint[256 * 4];

        if (handleInitialize < 0 || handleMain < 0 ||
            null == histogramBuffer || null == histogramData)
        {
            Debug.Log("Initialization failed.");
            return;
        }

        // make the texture and the buffer available to the kernels
        shader.SetTexture(handleMain, "InputTexture", inputTexture);
        shader.SetBuffer(handleMain, "HistogramBuffer", histogramBuffer);
        shader.SetBuffer(handleInitialize, "HistogramBuffer", histogramBuffer);
    }

    void OnDestroy()
    {
        if (null != histogramBuffer)
        {
            histogramBuffer.Release();
            histogramBuffer = null;
        }
    }

    void Update()
    {
        if (null == shader || null == inputTexture ||
            0 > handleInitialize || 0 > handleMain ||
            null == histogramBuffer || null == histogramData)
        {
            Debug.Log("Cannot compute histogram");
            return;
        }

        shader.Dispatch(handleInitialize, 256 / 64, 1, 1);
        // divided by 64 in x because of [numthreads(64,1,1)] in the compute shader code
        shader.Dispatch(handleMain, (inputTexture.width + 7) / 8, (inputTexture.height + 7) / 8, 1);
        // divided by 8 in x and y because of [numthreads(8,8,1)] in the compute shader code

        // copy the result from the compute buffer to the C# array
        histogramBuffer.GetData(histogramData);
    }
}
```

The script defines three public variables: `public ComputeShader shader` which has to be set to the compute shader that is shown below; `public Texture2D inputTexture` which has to be set to the texture for which the histogram should be computed; and `public uint[] histogramData` which the script sets to an array of 1024 unsigned ints that holds the computed histogram.

The three private variables are: `ComputeBuffer histogramBuffer` which contains the same data as `histogramData` but can be accessed by the compute shader; and `int handleMain` and `int handleInitialize` which are the indices of the two compute shader functions for the main processing of all pixels and for the initialization of the 1024 numbers of the histogram.

The `Start()` function sets the two handles with `ComputeShader.FindKernel()` and creates the `histogramBuffer` compute buffer and the `histogramData` array. While the compute buffer is created as an array of 256 elements that each contain 4 unsigned ints, the `histogramData` is created as an array of 1024 unsigned ints. This difference does not matter since the memory layout is the same for both. Of course, the `histogramData` could also be defined as an array of 256 structs that each contain 4 unsigned ints. The rest of the `Start()` function does error checking and sets the texture and compute buffer to the corresponding uniform variables for each compute shader function such that they have access to them.
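To illustrate the shared memory layout, here is a small sketch in plain C#. The names `HistogramLayout`, `Count()`, and `HistogramEntry` are hypothetical helpers that do not appear in the tutorial's script. The sketch assumes that element `value` of the compute buffer occupies the four consecutive uints starting at index `4 * value` of the flat array, in the order red, green, blue, alpha.

```csharp
using System.Runtime.InteropServices;

// Hypothetical struct alternative: 4 uints stored one after the other,
// i.e., 16 bytes per element, just like one element of the compute buffer.
[StructLayout(LayoutKind.Sequential)]
public struct HistogramEntry
{
    public uint r, g, b, a;
}

public static class HistogramLayout
{
    public const int Red = 0, Green = 1, Blue = 2, Alpha = 3;

    // Number of pixels whose given channel has the given value (0 to 255),
    // read from the flat array of 1024 uints.
    public static uint Count(uint[] histogramData, int value, int channel)
    {
        return histogramData[value * 4 + channel];
    }

    // Size in bytes of one struct element; expected to equal sizeof(uint) * 4.
    public static int EntrySize()
    {
        return Marshal.SizeOf<HistogramEntry>();
    }
}
```

With such a helper, `HistogramLayout.Count(histogramData, 255, HistogramLayout.Green)` would return the number of pixels with a green value of 255. Whether you prefer the flat array or an array of structs is mostly a matter of taste, as long as the element size matches the stride that was passed to the `ComputeBuffer` constructor.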

The `OnDestroy()` function simply releases the compute buffer since the hardware resources attached to it are not automatically released by the garbage collector.

The `Update()` function does some error checking and then calls the compute shader function for the initialization of the `histogramBuffer` and the compute shader function for processing all the pixels. For the initialization, we use 4 (= 256 / 64) thread groups of 64 × 1 × 1 threads to initialize the 256 elements of the compute buffer. For the main processing of the pixels we use thread groups of 8 × 8 × 1 threads and compute the number of thread groups by dividing the dimensions of the texture image by 8. The addition of 7 is necessary to make sure that we are not short by one thread group if the dimensions are not divisible by 8. Lastly, the `Update()` function calls `histogramBuffer.GetData(histogramData);` to copy the data from the compute buffer to the Unity array in `histogramData`; note that the two data structures have to have the same memory layout for this call to work.
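The computation of the number of thread groups is an integer division that rounds up. A minimal sketch of this arithmetic in plain C# (the helper name `GroupCount()` is hypothetical; the script above writes the expression inline):

```csharp
public static class DispatchMath
{
    // Smallest number of groups such that
    // groups * threadsPerGroup >= size (integer division rounded up).
    public static int GroupCount(int size, int threadsPerGroup)
    {
        return (size + threadsPerGroup - 1) / threadsPerGroup;
    }

    // Number of threads in one dimension that lie beyond the last element.
    public static int SurplusThreads(int size, int threadsPerGroup)
    {
        return GroupCount(size, threadsPerGroup) * threadsPerGroup - size;
    }
}
```

For a texture width of 100 texels and 8 threads per group, `GroupCount(100, 8)` is 13, whereas the plain integer division 100 / 8 yields only 12 groups, which would cover just 96 texels. The price of rounding up is that `SurplusThreads(100, 8)` = 4 threads per row have coordinates outside the texture. On Direct3D, reading a texel outside the texture is defined to return zero, so these threads would presumably add to the counters for the value 0 unless the kernel compares `id.xy` with the texture size first; this is worth verifying on your target platform if the dimensions of your textures are not divisible by 8.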

At the end of each frame, the computed color histogram is available in the public variable `histogramData`; thus, you can look at it in the Inspector Window while running the program.

## The Nitty-Gritty Details of the Compute Shader

In this case, the compute shader contains two compute shader functions, one for the initialization and the other one for the main processing of the texels of the texture. Therefore, it also includes two `#pragma kernel` instructions and two `[numthreads()]` instructions:

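```
#pragma kernel HistogramInitialize
#pragma kernel HistogramMain

Texture2D<float4> InputTexture; // input texture

struct histStruct {
    uint4 color;
};
RWStructuredBuffer<histStruct> HistogramBuffer;

[numthreads(64,1,1)]
void HistogramInitialize (uint3 id : SV_DispatchThreadID)
{
    HistogramBuffer[id.x].color = uint4(0, 0, 0, 0);
}

[numthreads(8,8,1)]
void HistogramMain (uint3 id : SV_DispatchThreadID)
{
    uint4 col = uint4(255.0 * InputTexture[id.xy]);

    // atomically increment one counter per channel
    InterlockedAdd(HistogramBuffer[col.r].color.r, 1);
    InterlockedAdd(HistogramBuffer[col.g].color.g, 1);
    InterlockedAdd(HistogramBuffer[col.b].color.b, 1);
    InterlockedAdd(HistogramBuffer[col.a].color.a, 1);
}
```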

As always, you create a compute shader by clicking on Create in the Project Window and choosing Shader > Compute Shader. You should then copy and paste the code into the new file.

The first two lines `#pragma kernel HistogramInitialize` and `#pragma kernel HistogramMain` specify the two compute shader functions (“kernels”) that can be called from a script with the `ComputeShader.Dispatch()` function.

`Texture2D<float4> InputTexture;` specifies a uniform variable for a read-only 2D RGBA texture with name `InputTexture`.

`struct histStruct { uint4 color; };` defines a small structure with only one member: a 4D unsigned int vector called `color`. `color.r` is used to count the pixels whose red channel has a certain value (according to the position in the array); and analogously `color.g`, `color.b`, and `color.a` for the green, blue, and alpha channel.

The structure `histStruct` is then used in `RWStructuredBuffer<histStruct> HistogramBuffer;` to define a read/write structured buffer that represents the compute buffer `histogramBuffer` in the C# script. The memory layout matches because the elements of the `RWStructuredBuffer` are of type `histStruct`, which consists of 4 uints.

The function `HistogramInitialize()` uses thread groups of dimensions 64 × 1 × 1, which means that the argument `uint3 id : SV_DispatchThreadID` runs from `uint3(0, 0, 0)` to `uint3(255, 0, 0)` since we use 4 thread groups. Therefore, the function can use `id.x` to index the 256 elements of the `HistogramBuffer` when initializing all elements to 0.
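The range of `id.x` follows from how the dispatch thread ID is composed: in each dimension, it is the index of the thread group times the number of threads per group plus the index of the thread within its group. The following plain C# sketch enumerates these IDs for one dimension; the class and method names are hypothetical, and the loop order is only for illustration since the GPU does not guarantee any particular execution order of threads.

```csharp
using System.Collections.Generic;

public static class ThreadIds
{
    // Lists the x components of all dispatch thread IDs for a dispatch
    // of groupCount thread groups with threadsPerGroup threads each.
    public static List<int> DispatchThreadIdsX(int groupCount, int threadsPerGroup)
    {
        var ids = new List<int>();
        for (int group = 0; group < groupCount; group++)
        {
            for (int thread = 0; thread < threadsPerGroup; thread++)
            {
                // dispatch thread ID = group ID * group size + ID within group
                ids.Add(group * threadsPerGroup + thread);
            }
        }
        return ids;
    }
}
```

For 4 groups of 64 threads, this produces the 256 values 0 to 255, each exactly once, which is why every element of the buffer is initialized by exactly one thread.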

The function `HistogramMain()` uses thread groups of dimensions 8 × 8 × 1. Since we base the number of thread groups on the texture size, the function can use the argument `uint3 id : SV_DispatchThreadID` to access the texels of the texture with `InputTexture[id.xy]`. Since the RGBA values are read as floating-point values between 0.0 and 1.0, they are multiplied by 255.0 and rounded down by converting them to unsigned ints in the `uint4 col` variable. The RGBA values in `col` are then used to index the `HistogramBuffer` to increment the counter variables in the buffer, i.e., `HistogramBuffer[col.r].color.r` for the red value, `HistogramBuffer[col.g].color.g` for the green value, etc.
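The conversion from a floating-point channel value to a histogram index can be mimicked in plain C#. This is a sketch of the same arithmetic under the assumption that the input lies between 0.0 and 1.0; the helper name `ToBin()` is hypothetical.

```csharp
public static class Quantize
{
    // Maps a channel value in [0, 1] to a histogram index in 0..255
    // by scaling with 255 and truncating the fractional part.
    // Values outside [0, 1] are not handled by this sketch.
    public static uint ToBin(float channel)
    {
        return (uint)(255.0f * channel);
    }
}
```

For example, `ToBin(0.0f)` is 0, `ToBin(1.0f)` is 255, and `ToBin(0.5f)` is 127 because 127.5 is rounded down. Only an input of exactly 1.0 reaches index 255, so the result never leaves the 256 elements of the buffer for inputs in the expected range.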

To increment the counter variables, the code uses the function `InterlockedAdd()` which takes a variable as first argument and an integer as second argument. In our case, the latter is 1 because we increment by 1. `InterlockedAdd()` is one of the atomic functions of HLSL compute shaders; i.e., the GPU makes sure that any race conditions due to multiple threads trying to increment the same variable at the same time are avoided. There are a couple of atomic functions in HLSL; note that all of them work only with integers or unsigned integers.

If you want to observe the effect of the race conditions, you can replace the calls to the atomic function `InterlockedAdd()` by code like this:

```
HistogramBuffer[col.r].color.r += 1;
// WARNING: THIS CREATES RACE CONDITIONS!
```

On most GPUs, this will not be an atomic operation and, therefore, there will usually be race conditions when you run this code, which lead to undefined results. You might be able to observe in the Inspector Window that the values in the `histogramData` array change somewhat randomly due to these race conditions.
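The same effect can be reproduced on the CPU. The following plain C# sketch is an analogy, not part of the tutorial's code: it uses .NET's `Interlocked.Add()` in place of HLSL's `InterlockedAdd()` and `Parallel.For()` in place of GPU threads, and the class and method names are hypothetical.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class AtomicCounterDemo
{
    // Every increment is atomic, so none of them can be lost.
    public static int CountAtomically(int increments)
    {
        int counter = 0;
        Parallel.For(0, increments, i =>
        {
            Interlocked.Add(ref counter, 1);
        });
        return counter;
    }

    // WARNING: THIS CREATES RACE CONDITIONS!
    // Two threads may read the same old value, and one update is lost.
    public static int CountWithRace(int increments)
    {
        int counter = 0;
        Parallel.For(0, increments, i =>
        {
            counter += 1;
        });
        return counter;
    }
}
```

`CountAtomically(100000)` always returns 100000. `CountWithRace(100000)` may return a smaller number that differs from run to run, depending on how the threads happen to interleave; on a machine with a single core, it might even return the correct result most of the time.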

## Summary

You have reached the end of this tutorial! A few of the things that you have learned are:

• What color histograms are and how to compute them.
• How to create and use Unity's compute buffers in a C# script and how to define a corresponding read/write structured buffer in a compute shader.
• How to define and use multiple compute shader functions in one compute shader.
• How to use an atomic function in a compute shader.