Sensory Systems/Computer Models/Descriptive Simulations of Visual Information Processing

Vision
Hearing
Balance
Feeling
Smell
Taste

Technological Aspects

In Animals

Birds
Fish
Marine Animals
Arthropods
Other Animals

Computer Simulation of the Visual System

In this section an overview in the simulation of processing done by the early levels of the visual system will be given. The implementation to reproduce the action of the visual system will thereby be done with MATLAB and its toolboxes. The processing done by the early visual system was discussed in the section before and can be put together with some of the functions they perform in the following schematic overview. A good description of the image processing can be found in (Cormack 2000).

**Schematic overview of the processing done by the early visual system**
Structure	Operations	2D Fourier Plane
World	$I(x,y,t,\lambda )$
Optics	Low-pass spatial filtering
Photoreceptor Array	Sampling, more low-pass filtering, temporal lowhandpass filtering, $\lambda$ filtering, gain control, response compression
LGN Cells	Spatiotemporal bandpass filtering, $\lambda$ filtering, multiple parallel representations
Primary Visual Cortical Neurons: Simple & Complex	Simple cells: orientation, phase, motion, binocular disparity, & $\lambda$ filtering
Primary Visual Cortical Neurons: Simple & Complex	Complex cells: no phase filtering (contrast energy detection)
On the left, are some of the major structures to be discussed; in the middle, are some of the major operations done at the associated structure; in the right, are the 2-D Fourier representations of the world, retinal image, and sensitivities typical of a ganglion and cortical cell. (From Handbook of Image and Video Processing, A. Bovik)

As we can see in the above overview different stages of the image processing have to be considered to simulate the response of the visual system to a stimulus. The next section will therefore give a brief discussion in Image Processing. But first of all we will be concerned with the Simulation of Sensory Organ Components.

Simulating Sensory Organ Components

Anatomical Parameters of the Eye

The average eye has an anterior corneal radius of curvature of $r_{C}$ = 7.8 mm , and an aqueous refractive index of 1.336. The length of the eye is $L_{E}$ = 24.2 mm. The iris is approximately flat, and the edge of the iris (also called limbus) has a radius $r_{L}$ = 5.86 mm.

Optics of the Eyeball

The optics of the eyeball are characterized by its 2-D spatial impulse response function, the Point Spread Function (PSF)

$h(r)=0.95\cdot \exp \left(-2.6\cdot |r|^{1.36}\right)+0.05\cdot \exp \left(-2.4\cdot |r|^{1.74}\right),$

in which $r$ is the radial distance in minutes of arc from the center of the image.

Practical implementation

Obviously, the effect on a given digital image depends on the distance of that image from your eyes. As a simple place-holder, substitute this filter with a Gaussian filter with height 30, and with a standard deviation of 1.5.

In one dimension, a Gaussian is described by

$g(x)=a\cdot \exp \left(-{\frac {x^{2}}{2\sigma ^{2}}}\right).$

Activity of Ganglion Cells

Ignoring the

temporal response
effect of wavelength (especially for the cones)
opening of the iris
sampling and distribution of photo receptors
bleaching of the photo-pigment

we can approximate the response of ganglion cells with a Difference of Gaussians (DOG, Wikipedia [1])

$f(x;\sigma )={\frac {1}{\sigma _{1}{\sqrt {2\pi }}}}\,\exp \left(-{\frac {x^{2}}{2\sigma _{1}^{2}}}\right)-{\frac {1}{\sigma _{2}{\sqrt {2\pi }}}}\,\exp \left(-{\frac {x^{2}}{2\sigma _{2}^{2}}}\right).$

The source code for a Python implementation is available under ^[1].

The values of $\sigma _{1}$ and $\sigma _{2}$ have a ratio of approximately 1:1.6, but vary as a function of eccentricity. For midget cells (or P-cells), the Receptive Field Size (RFS) is approximately

$RFS_{P}\approx 2\cdot {\text{Eccentricity}},$

and for M-cells approximately

$RFS_{M}\approx 6\cdot {\text{Eccentricity}},$

where the RFS is given in arcmin, and the Eccentricity in mm distance from the center of the fovea (Cormack 2000).

Activity of simple cells in the primary visual cortex (V1)

Again ignoring temporal properties, the activity of simple cells in the primary visual cortex (V1) can be modeled with the use of Gabor filters (Wikipedia [2]). A Gabor filter is a linear filter whose impulse response is defined by a harmonic function (sinusoid) multiplied by a Gaussian function. The Gaussian function causes the amplitude of the harmonic function to diminish away from the origin, but near the origin, the properties of the harmonic function dominate

g(x,y;\lambda ,\theta ,\psi ,\sigma ,\gamma )=\exp \left(-{\frac {x'^{2}+\gamma ^{2}y'^{2}}{2\sigma ^{2}}}\right)\cos \left(2\pi {\frac {x'}{\lambda }}+\psi \right),

where

x'=x\cos \theta +y\sin \theta \,,

and

y'=-x\sin \theta +y\cos \theta \,.

In this equation, $\lambda$ represents the wavelength of the cosine factor, $\theta$ represents the orientation of the normal to the parallel stripes of a Gabor function (Wikipedia [3]), $\psi$ is the phase offset, $\sigma$ is the sigma of the Gaussian envelope and $\gamma$ is the spatial aspect ratio, and specifies the ellipticity of the support of the Gabor function.

The size of simple-cell receptive fields depends on its position relative to the fovea, but less strictly so than for retinal ganglion cells. The smallest fields, in and near the fovea, are about one-quarter degree by one-quarter degree, with the center region as small as a few minutes of arc (the same as the diameter of the smallest receptive-field centers in retinal ganglion cells). In the retinal periphery, simple-cell receptive fields can be about 1 degree by 1 degree. ^[2].

Gabor-like functions arise naturally, simply from the statistics of everyday scenes ^[3]. An example how even the statistics of a simple image can lead to the emergence of Gabor-like receptive fields, written in Python, is presented in ^[4]; and a (Python-)demonstration of the effects of filtering an image with Gabor-functions can be found at ^[5].

This is an example implementation in MATLAB:

function gb = gabor_fn(sigma,theta,lambda,psi,gamma)
 
  sigma_x = sigma;
  sigma_y = sigma/gamma;
 
  % Bounding box
  nstds = 3;
  xmax = max(abs(nstds*sigma_x*cos(theta)),abs(nstds*sigma_y*sin(theta)));
  xmax = ceil(max(1,xmax));
  ymax = max(abs(nstds*sigma_x*sin(theta)),abs(nstds*sigma_y*cos(theta)));
  ymax = ceil(max(1,ymax));
  xmin = -xmax;
  ymin = -ymax;
  [x,y] = meshgrid(xmin:0.05:xmax,ymin:0.05:ymax);
 
  % Rotation
  x_theta = x*cos(theta) + y*sin(theta);
  y_theta = -x*sin(theta) + y*cos(theta);
 
  gb = exp(-.5*(x_theta.^2/sigma_x^2+y_theta.^2/sigma_y^2)).* cos(2*pi/lambda*x_theta+psi);
 
end

And an equivalent Pyhon implementation would be:

import numpy as np
import matplotlib.pyplot as mp

def gabor_fn(sigma = 1, theta = 1, g_lambda = 4, psi = 2, gamma = 1):
    # Calculates the Gabor function with the given parameters
        
    sigma_x = sigma
    sigma_y = sigma/gamma
    
    # Boundingbox:
    nstds = 3
    xmax = max( abs(nstds*sigma_x * np.cos(theta)), abs(nstds*sigma_y * np.sin(theta)) )
    ymax = max( abs(nstds*sigma_x * np.sin(theta)), abs(nstds*sigma_y * np.cos(theta)) )
    
    xmax = np.ceil(max(1,xmax))
    ymax = np.ceil(max(1,ymax))
    
    xmin = -xmax
    ymin = -ymax
    
    numPts = 201    
    (x,y) = np.meshgrid(np.linspace(xmin, xmax, numPts), np.linspace(ymin, ymax, numPts) ) 
    
    # Rotation
    x_theta =  x * np.cos(theta) + y * np.sin(theta)
    y_theta = -x * np.sin(theta) + y * np.cos(theta)
    gb = np.exp( -0.5* (x_theta**2/sigma_x**2 + y_theta**2/sigma_y**2) ) * \
         np.cos( 2*np.pi/g_lambda*x_theta + psi )
    
    return gb

if __name__ == '__main__':
    # Main function: calculate Gabor function for default parameters and show it
    gaborValues = gabor_fn()
    mp.imshow(gaborValues)
    mp.colorbar()
    mp.show()

Image Processing

One major technical tool to understand is the way a computer handles images. We have to know how we can edit images and what techniques we have to rearrange images.

Image Representation

Grayscale

For a computer an image is nothing more than a huge amount of little squares. These squares are called "pixel". In a grayscale image, each of this pixel carries a number n, often it holds $0\leq n\leq 255$ . This number n, represents the exactly color of this square in the image. This means, in a grayscale image we can use 256 different grayscales, where 255 means a white spot, and 0 means the square is black. To be honest, we could even use more than 256 different levels of gray. In the mentioned way, every pixels uses exactly 1 byte (or 8 bit) of memory to be saved. (Due to the binary system of a computer it holds: 2⁸=256) If you think it is necessary to have more different gray scales in your image, this is not a problem. You just can use more memory to save the picture. But just remember, this could be a hard task for huge images. Further quite often you have the problem that your sensing device (e.g. your monitor) can not show more than this 256 different gray colors.

Colour

File:ImageRepresentation Color.png

Image represented with RGB-notation

Representing a colourful image is only slightly more complicated than the grayscale picture. All you have to know is that the computer works with a additive colour mixture of the three main colors Red, Green and Blue. This are the so called RGB colours.

Also these images are saved by pixels. But now every pixel has to know 3 values between 0 and 256, for every Color 1 value. So know we have 256³= 16,777,216 different colours which can be represented. Similar to the grayscale images also here holds, that no color means black, and having all color means white. That means, the colour (0,0,0) is black, whereas (0,0,255) means blue and (255,255,255) is white.

Orientation

WARNING - There are two common, but different ways to describe the location of a point in 2 dimensions: 1) The x/y notation, with x typically pointing to the left 2) The row/column orientation Carefully watch out which coordinates you are using to describe your data, as the two descriptions are not consistent!

Image Filtering

1D Filter

In many technical applications, we find some primitive basis in which we easily can describe features. In 1 dimensional cases filters are not a big deal, therefore we can use this filters for changing images. The so called "Savitzky- Golay Filter" allows to smooth incoming signals. The filter was described in 1964 by Abraham Savitzky and Marcel J. E. Golay. It is a impulse-respond filter (IR).

For better understanding, lets look at a example. In 1d we usually deal with vectors. One such given vector, we call x and it holds: $\mathbf {x} =(x_{1},x_{2},\dots ,x_{n})\;with\;n\;\in \mathbb {N}$ . Our purpose is to smooth that vector x. To do so all we need is another vector $\mathbf {(} w)=(w_{1},w_{2},\dots ,w_{m})\;with\;n>m\;\in \mathbb {N}$ , this vector we call a weight vector.

With $y(k)=\displaystyle \sum _{i=1}^{m}w(i)x(k-m+i)$ we now have a smoothed vector y. This vector is smoother than the vector before, because we only save the average over a few entries in the vector. These means the newly found vectorentries, depends on some entries right left and right of the entry to smooth. One major drawback of this approach is, the newly found vector y only has n-m entries instead of n as the original vector x.

Drawing this new vector would lead to the same function as before, just with less amplitude. So no data is lost, but we have less fluctuation.

2D Filter

Going from the 1d case to the 2d case is done by simply make out of vectors matrices. As already mentioned, a gray-level image is for a computer or for a softwaretool as MATLAB nothing more, than a huge matrix filled with natural numbers, often between 0 and 255.

The weight vector is now a weight-matrix. But still we use the filter by adding up different matrix-element-multiplications. $y(n,m)=\displaystyle \sum _{i=1}^{k}\sum _{j=1}^{l}w_{ij}\times x(n-1+i,m-1+j)$

Dilation and Erosion

For linear filters as seen before, it holds that they are commutative. Cite from wikipedia: One says that x commutes with y under ∗ if:

x*y=y*x\,

In other words, it does not matter how many and in which sequence different linear filters you use. E.g. if a Savitzky-Golay filter is applied to some date, and then a second Savitzky-Golay filter for calculationg the first derivative, the result is the same if the sequence of filters is reversed. It even holds, that there would have been one filter, which does the same as the two applied.

In contrast morphological operations on an image are non-linear operations and the final result depends on the sequence. If we think of any image, it is defined by pixels with values x_ij. Further this image is assumed to be a black-and-white image, so we have

x_{ij}=0\;or\;1,\forall i,j

To define a morphological operation we have to set a structural element SE. As example, a 3x3-Matrix as a part of the image.

The definition of erosion E says:

E(M)=\left\{{\begin{aligned}&0,\ \,if\sum \limits _{i,j=0}^{3}{{(se)}_{ij}}<9\\&1,\ \,else\\\end{aligned}}\right.\ \ ,with\ {{(se)}_{ij}},M\in SE

So in words, if any of the pixels in the structural element M has value 0, the erosion sets the value of M, a specific pixel in M, to zero. Otherwise E(M)=1

And for the dilation D it holds, if any value in SE is 1, the dilation of M, D(M), is set to 1.

D(M)=\left\{{\begin{aligned}&1,\;if\sum _{i,j=0}^{3}(se)_{ij}>=1\\&0,\;else\end{aligned}}\right.\;\;,with\;(se)_{ij},M\in SE

.

Morphological Operations on this image ... ... lead to the following results:

Compositions of Dilation and Erosion: Opening and Closing of Images

There are two compositions of dilation and erosion. One called opening the other called closing. It holds:

{\begin{aligned}opening&=&dilation\circ erosion\\closing&=&erosion\circ dilation\end{aligned}}

References

↑ T. Haslwanter (2012). "Mexican Hat Function [Python]". private communications.
↑ David, Hubel (1988). Eye, Brain, and Vision. Henry Holt and Company. Retrieved 2014-08-08.
↑ Olshausen,B.A. and Field,D.J. (1996). "Emergence of simple-cell receptive field properties by learning a sparse code for natural images". Nature. 381 (June 13): 607–609.{{cite journal}}: CS1 maint: multiple names: authors list (link)
↑ scikits-image development team (2012). "Emergence of Gabor-like functions from a SimpleIimage [Python]".
↑ Thomas Haslwanter (2012). "Demo-application of Gabor filters to an image [Python]".

← NeuralSimulation · Auditory_System_Simulation →

[1] T. Haslwanter (2012). "Mexican Hat Function [Python]". private communications.

[2] David, Hubel (1988). Eye, Brain, and Vision. Henry Holt and Company. Retrieved 2014-08-08.

[3] Olshausen,B.A. and Field,D.J. (1996). "Emergence of simple-cell receptive field properties by learning a sparse code for natural images". Nature. 381 (June 13): 607–609.{{cite journal}}: CS1 maint: multiple names: authors list (link)

[4] scikits-image development team (2012). "Emergence of Gabor-like functions from a SimpleIimage [Python]".

[5] Thomas Haslwanter (2012). "Demo-application of Gabor filters to an image [Python]".

[1]

[2]

[3]

[4]

[5]