Sensory Systems/Visual System Simulation


Computer Simulation of the Visual System[edit]

This section gives an overview of how the processing done by the early stages of the visual system can be simulated. The implementations reproducing the action of the visual system are done with MATLAB and its toolboxes. The processing done by the early visual system was discussed in the previous section, and is put together with some of the functions it performs in the following schematic overview. A good description of the image processing in the visual system can be found in (Cormack 2000).

Schematic overview of the processing done by the early visual system

Structure | Operations | 2-D Fourier Plane
World | I(x,y,t,\lambda) | 2D Fourier Plane 01.jpg
Optics | Low-pass spatial filtering | 2D Fourier Plane 02.jpg
Photoreceptor Array | Sampling, more low-pass filtering, temporal low-pass filtering, \lambda filtering, gain control, response compression |
LGN Cells | Spatiotemporal band-pass filtering, \lambda filtering, multiple parallel representations | 2D Fourier Plane 03.jpg
Primary Visual Cortical Neurons (Simple & Complex) | Simple cells: orientation, phase, motion, binocular disparity, & \lambda filtering. Complex cells: no phase filtering (contrast energy detection) | 2D Fourier Plane 04.jpg

On the left are some of the major structures to be discussed; in the middle are some of the major operations done at the associated structure; on the right are the 2-D Fourier representations of the world, the retinal image, and the sensitivities typical of a ganglion and a cortical cell. (From Handbook of Image and Video Processing, A. Bovik)

As we can see in the above overview, different stages of image processing have to be considered to simulate the response of the visual system to a stimulus. The next section will therefore give a brief discussion of image processing. But first of all we will be concerned with the Simulation of Sensory Organ Components.

Simulating Sensory Organ Components[edit]

Anatomical Parameters of the Eye[edit]

The average eye has an anterior corneal radius of curvature of r_C = 7.8 mm , and an aqueous refractive index of 1.336. The length of the eye is L_E = 24.2 mm. The iris is approximately flat, and the edge of the iris (also called limbus) has a radius r_L = 5.86 mm.

Optics of the Eyeball[edit]

The optics of the eyeball are characterized by the 2-D spatial impulse response function, the Point Spread Function (PSF):


h(r) = 0.95\cdot \exp\left( -2.6\cdot |r|^{1.36} \right) + 0.05\cdot\exp\left( -2.4\cdot |r|^{1.74} \right) ,

in which r is the radial distance in minutes of arc from the center of the image.
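The PSF above is straightforward to evaluate numerically. A minimal Python sketch (the function name and the sample points are my own choice):

```python
import numpy as np

def psf(r):
    """Point Spread Function of the eye's optics; r in minutes of arc.

    Evaluates the two-exponential PSF given above; vectorized over r.
    """
    r = np.abs(np.asarray(r, dtype=float))
    return 0.95 * np.exp(-2.6 * r**1.36) + 0.05 * np.exp(-2.4 * r**1.74)

# The PSF peaks at the image center (r = 0) and falls off rapidly:
print(psf(0.0))               # 1.0, since both exponentials equal 1 at r = 0
print(psf([0.5, 1.0, 2.0]))   # monotonically decreasing values
```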

Practical implementation[edit]

Obviously, the effect of this PSF on a given digital image depends on the distance of that image from your eyes. As a simple place-holder, substitute this filter with a Gaussian filter of size 30 pixels, with a standard deviation of 1.5.

In one dimension, a Gaussian is described by


g(x) = a \cdot \exp \left( -\frac{x^2}{2\sigma^2} \right) .
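As an illustration of the place-holder just described, the sketch below builds a 30×30 Gaussian kernel with sigma = 1.5 and applies it to a random stand-in image (a real application would use your digitized image instead); it assumes SciPy is available:

```python
import numpy as np
from scipy import ndimage

# Build a 2-D Gaussian kernel of size 30x30 with sigma = 1.5,
# as the place-holder filter suggested above.
size, sigma = 30, 1.5
x = np.arange(size) - (size - 1) / 2.0
xx, yy = np.meshgrid(x, x)
kernel = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
kernel /= kernel.sum()          # normalize, so overall image brightness is preserved

img = np.random.rand(100, 100)  # stand-in for a real grayscale image
blurred = ndimage.convolve(img, kernel, mode='reflect')
```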

Activity of Ganglion Cells[edit]

Mexican Hat function, with sigma1:sigma2 = 1:1.6

Ignoring the

  • temporal response
  • effect of wavelength (especially for the cones)
  • opening of the iris
  • sampling and distribution of photo receptors
  • bleaching of the photo-pigment

we can approximate the response of ganglion cells with a Difference of Gaussians (DOG, Wikipedia [1])


f(x;\sigma) = \frac{1}{\sigma_1\sqrt{2\pi}} \, \exp \left( -\frac{x^2}{2\sigma_1^2} \right)-\frac{1}{\sigma_2\sqrt{2\pi}} \, \exp \left( -\frac{x^2}{2\sigma_2^2} \right).

The source code for a Python implementation is available under [1].

The values of \sigma_1 and \sigma_2 have a ratio of approximately 1:1.6, but vary as a function of eccentricity. For midget cells (or P-cells), the Receptive Field Size (RFS) is approximately

RFS \approx 2 \cdot \text{Eccentricity} ,

where the RFS is given in arcmin, and the Eccentricity in mm distance from the center of the fovea (Cormack 2000).
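The DOG above can be written down directly; a short Python sketch, with the default sigma_1 chosen arbitrarily and the 1:1.6 ratio taken from the text:

```python
import numpy as np

def dog(x, sigma1=1.0, ratio=1.6):
    """Difference of Gaussians; sigma2 = ratio * sigma1 (ratio ~ 1:1.6, see text)."""
    sigma2 = ratio * sigma1
    g1 = np.exp(-x**2 / (2 * sigma1**2)) / (sigma1 * np.sqrt(2 * np.pi))
    g2 = np.exp(-x**2 / (2 * sigma2**2)) / (sigma2 * np.sqrt(2 * np.pi))
    return g1 - g2

# Center-surround behavior: excitatory (positive) at the center,
# inhibitory (negative) flanks further out
print(dog(0.0))   # positive
print(dog(2.0))   # negative
```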

Activity of simple cells in the primary visual cortex (V1)[edit]

Again ignoring temporal properties, the activity of simple cells in the primary visual cortex (V1) can be modeled with Gabor filters (Wikipedia [2]). A Gabor filter is a linear filter whose impulse response is defined by a harmonic function (sinusoid) multiplied by a Gaussian function. The Gaussian function causes the amplitude of the harmonic function to diminish away from the origin, while near the origin the properties of the harmonic function dominate:

g(x,y;\lambda,\theta,\psi,\sigma,\gamma)=\exp\left(-\frac{x'^2+\gamma^2y'^2}{2\sigma^2}\right)\cos\left(2\pi\frac{x'}{\lambda}+\psi\right) ,

where

x' = x \cos\theta + y \sin\theta\, ,

and

y' = -x \sin\theta + y \cos\theta\, .

In this equation, \lambda represents the wavelength of the cosine factor, \theta represents the orientation of the normal to the parallel stripes of the Gabor function (Wikipedia [3]), \psi is the phase offset, \sigma is the standard deviation of the Gaussian envelope, and \gamma is the spatial aspect ratio, which specifies the ellipticity of the support of the Gabor function.

Gabor-like functions arise naturally, simply from the statistics of everyday scenes [2]. An example of how even the statistics of a simple image can lead to the emergence of Gabor-like receptive fields, written in Python, is presented in [3]; and a (Python) demonstration of the effects of filtering an image with Gabor functions can be found at [4].

Gabor function, with sigma = 1, theta = 1, lambda = 4, psi = 2, gamma = 1

This is an example implementation in MATLAB:

function gb = gabor_fn(sigma,theta,lambda,psi,gamma)
 
  sigma_x = sigma;
  sigma_y = sigma/gamma;
 
  % Bounding box
  nstds = 3;
  xmax = max(abs(nstds*sigma_x*cos(theta)),abs(nstds*sigma_y*sin(theta)));
  xmax = ceil(max(1,xmax));
  ymax = max(abs(nstds*sigma_x*sin(theta)),abs(nstds*sigma_y*cos(theta)));
  ymax = ceil(max(1,ymax));
  xmin = -xmax;
  ymin = -ymax;
  [x,y] = meshgrid(xmin:0.05:xmax,ymin:0.05:ymax);
 
  % Rotation
  x_theta = x*cos(theta) + y*sin(theta);
  y_theta = -x*sin(theta) + y*cos(theta);
 
  gb = exp(-.5*(x_theta.^2/sigma_x^2+y_theta.^2/sigma_y^2)).* cos(2*pi/lambda*x_theta+psi);
 
end

And an equivalent Python implementation would be:

import numpy as np
import matplotlib.pyplot as mp
 
def gabor_fn(sigma = 1, theta = 1, g_lambda = 4, psi = 2, gamma = 1):
    # Calculates the Gabor function with the given parameters
 
    sigma_x = sigma
    sigma_y = sigma/gamma
 
    # Boundingbox:
    nstds = 3
    xmax = max( abs(nstds*sigma_x * np.cos(theta)), abs(nstds*sigma_y * np.sin(theta)) )
    ymax = max( abs(nstds*sigma_x * np.sin(theta)), abs(nstds*sigma_y * np.cos(theta)) )
 
    xmax = np.ceil(max(1,xmax))
    ymax = np.ceil(max(1,ymax))
 
    xmin = -xmax
    ymin = -ymax
 
    numPts = 201    
    (x,y) = np.meshgrid(np.linspace(xmin, xmax, numPts), np.linspace(ymin, ymax, numPts) ) 
 
    # Rotation
    x_theta =  x * np.cos(theta) + y * np.sin(theta)
    y_theta = -x * np.sin(theta) + y * np.cos(theta)
    gb = np.exp( -0.5* (x_theta**2/sigma_x**2 + y_theta**2/sigma_y**2) ) * \
         np.cos( 2*np.pi/g_lambda*x_theta + psi )
 
    return gb
 
if __name__ == '__main__':
    # Main function: calculate Gabor function for default parameters and show it
    gaborValues = gabor_fn()
    mp.imshow(gaborValues)
    mp.colorbar()
    mp.show()

Image Processing[edit]

One major technical tool to understand is the way a computer handles images: how images are represented, and what techniques are available to modify and rearrange them.

Image Representation[edit]

Grayscale[edit]
Representation of graylevel images.

For a computer, an image is nothing more than a large collection of little squares, called "pixels". In a grayscale image, each of these pixels carries a number n, typically with 0\leq n \leq 255. This number n represents the exact gray level of that square in the image. This means that in a grayscale image we can use 256 different gray levels, where 255 means a white spot and 0 means the square is black. With this convention, every pixel uses exactly 1 byte (or 8 bits) of memory to be saved (due to the binary system of a computer, 2^8 = 256). If you think it is necessary to have more gray levels in your image, this is not a problem: you can simply use more memory per pixel to save the picture. But remember that this can become costly for huge images, and quite often your display device (e.g. your monitor) cannot show more than these 256 different gray levels anyway.

Colour[edit]
Image represented with RGB-notation

Representing a colour image is only slightly more complicated than the grayscale picture. All you have to know is that the computer works with an additive colour mixture of the three primary colours Red, Green and Blue. These are the so-called RGB colours.

These images are also saved pixel by pixel. But now every pixel holds 3 values between 0 and 255, one value for each colour. So now we have 256^3 = 16,777,216 different colours that can be represented. As with grayscale images, no colour means black and full intensity of all three colours means white: the colour (0,0,0) is black, whereas (0,0,255) is blue and (255,255,255) is white.
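The pixel representation described above can be seen directly in NumPy, here with a hypothetical 2×2 RGB image:

```python
import numpy as np

# A tiny 2x2 RGB image: each pixel holds three bytes (R, G, B)
img = np.zeros((2, 2, 3), dtype=np.uint8)
img[0, 0] = (0, 0, 0)        # black
img[0, 1] = (255, 255, 255)  # white
img[1, 0] = (0, 0, 255)      # blue
img[1, 1] = (255, 0, 0)      # red

print(img.nbytes)  # 12 bytes: 4 pixels x 3 bytes each
```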

Orientation[edit]

WhichWayUp.png

WARNING - There are two common, but different ways to describe the location of a point in 2 dimensions: 1) The x/y notation, with x typically pointing to the right and y pointing up; 2) The row/column orientation, with the row index increasing downwards. Carefully watch which coordinates you are using to describe your data, as the two descriptions are not consistent!

Image Filtering[edit]

1D Filter[edit]

In many technical applications, we find some primitive basis in which features can easily be described. In the 1-dimensional case, filters are straightforward, and we can use such filters for changing images as well. The so-called "Savitzky-Golay filter" allows smoothing of incoming signals. The filter was described in 1964 by Abraham Savitzky and Marcel J. E. Golay. It is a finite impulse response (FIR) filter.

For better understanding, let us look at an example. In 1-D we usually deal with vectors. One such given vector we call \mathbf{x} = (x_1,x_2,\dots,x_n) \; with \; n \in \mathbb{N}. Our purpose is to smooth that vector \mathbf{x}. To do so, all we need is another vector \mathbf{w} = (w_1,w_2,\dots,w_m) \; with \; m<n, which we call a weight vector.

Filter 1D Principle.png

With y(k)=\displaystyle \sum_{i=1}^m w(i)x(k-m+i) we now have a smoothed vector \mathbf{y}. This vector is smoother than the vector before, because each new entry is a weighted average over a few entries of the original vector; that is, it depends on some entries to the left and right of the entry to be smoothed. One major drawback of this approach is that the new vector \mathbf{y} has only n-m+1 entries, instead of the n entries of the original vector \mathbf{x}.

Drawing this new vector would lead to roughly the same function as before, just with less high-frequency fluctuation: the overall shape of the data is preserved, while the fine detail is averaged out.
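The weighted sum above is exactly a "valid" convolution; a quick NumPy sketch, with a made-up signal and a uniform weight vector (m = 3):

```python
import numpy as np

x = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 5.0])   # a noisy 1-D signal, n = 7
w = np.ones(3) / 3.0                                 # uniform weight vector, m = 3

# 'valid' convolution implements the sum above: the filter is only applied
# where the whole window fits, so the result has n - m + 1 entries.
# (np.convolve reverses w; with a symmetric weight vector this does not matter.)
y = np.convolve(x, w, mode='valid')
print(len(x), len(y))   # 7 5
```

For the actual Savitzky-Golay filter, `scipy.signal.savgol_filter` provides a ready-made implementation that fits a polynomial within each window instead of a plain average.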

2D Filter[edit]

Going from the 1-D case to the 2-D case is done by simply turning the vectors into matrices. As already mentioned, a gray-level image is, for a computer or for a software tool such as MATLAB, nothing more than a huge matrix filled with natural numbers, often between 0 and 255.

Filter 2D Principle.png

The weight vector is now a weight matrix. But we still apply the filter by summing up the element-wise multiplications: y(n,m)=\displaystyle \sum_{i=1}^k \sum_{j=1}^l w_{ij}\, x(n-1+i,m-1+j)
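The double sum corresponds to a 2-D correlation of the image with the weight matrix. A sketch assuming SciPy, with a random stand-in image and a 3×3 averaging weight matrix:

```python
import numpy as np
from scipy import ndimage

img = np.random.rand(64, 64)      # stand-in for a real grayscale image
w = np.ones((3, 3)) / 9.0         # 3x3 averaging weight matrix

# ndimage.correlate slides the weight matrix over the image and sums the
# element-wise products, matching the double sum above (no kernel flip);
# 'reflect' handles the image border by mirroring.
smoothed = ndimage.correlate(img, w, mode='reflect')
print(smoothed.shape)             # same shape as the input image
```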

Dilation and Erosion[edit]

Linear filters, as seen before, are commutative. To cite Wikipedia: "One says that x commutes with y under ∗ if:

 x * y = y * x \,"

In other words, it does not matter in which sequence different linear filters are applied. For example, if a Savitzky-Golay smoothing filter is applied to some data, followed by a second Savitzky-Golay filter calculating the first derivative, the result is the same if the sequence of the filters is reversed. It even holds that there is a single filter that does the same as the two applied in sequence.
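This commutativity can be checked numerically; the two small FIR filters below are arbitrary examples:

```python
import numpy as np

x = np.random.rand(50)             # some made-up 1-D data
w1 = np.array([0.25, 0.5, 0.25])   # a smoothing filter
w2 = np.array([1.0, -1.0])         # a difference (derivative-like) filter

# Applying w1 then w2 gives the same result as w2 then w1:
a = np.convolve(np.convolve(x, w1), w2)
b = np.convolve(np.convolve(x, w2), w1)
print(np.allclose(a, b))   # True

# ...and a single combined filter does the same job as the two in sequence:
w12 = np.convolve(w1, w2)
c = np.convolve(x, w12)
print(np.allclose(a, c))   # True
```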

In contrast, morphological operations on an image are non-linear operations, and the final result depends on the sequence. Consider an image defined by pixels with values x_{ij}, and assume it is a black-and-white (binary) image, so that

 x_{ij} \in \{0, 1\}, \quad \forall i,j

To define a morphological operation we have to set a structuring element SE; as an example, a 3x3 matrix of pixels as a part of the image.

The definition of the erosion E is:

E(M)=\begin{cases}
  0, & \text{if } \sum_{i,j=1}^{3} (se)_{ij} < 9 \\
  1, & \text{otherwise}
\end{cases}
\qquad \text{with } (se)_{ij}, M \in SE

So in words: if any of the pixels within the structuring element has value 0, the erosion sets the value of M, the center pixel of the structuring element, to zero; otherwise E(M)=1.

For the dilation D the opposite holds: if any value in the structuring element is 1, the dilation of M, D(M), is set to 1:

D(M)=\begin{cases}
  1, & \text{if } \sum_{i,j=1}^{3} (se)_{ij} \geq 1 \\
  0, & \text{otherwise}
\end{cases}
\qquad \text{with } (se)_{ij}, M \in SE


Square Morphological.jpg

Compositions of Dilation and Erosion: Opening and Closing of Images[edit]

There are two common compositions of dilation and erosion, one called opening, the other called closing. It holds:


   \begin{align}
     opening & = & dilation \circ erosion \\
     closing & = & erosion \circ dilation 
   \end{align}
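These operations are sketched below with SciPy's morphology routines, on a made-up binary test image (a square with a one-pixel hole); note that in the composition notation above the right-hand operation is applied first:

```python
import numpy as np
from scipy import ndimage

# Binary test image: a 5x5 square of ones with a one-pixel hole in the middle
img = np.zeros((11, 11), dtype=bool)
img[3:8, 3:8] = True
img[5, 5] = False

se = np.ones((3, 3), dtype=bool)   # 3x3 structuring element, as in the text

eroded  = ndimage.binary_erosion(img, structure=se)
dilated = ndimage.binary_dilation(img, structure=se)

# opening = dilation after erosion; closing = erosion after dilation
opened = ndimage.binary_dilation(ndimage.binary_erosion(img, se), se)
closed = ndimage.binary_erosion(ndimage.binary_dilation(img, se), se)

print(closed[5, 5])   # True - closing fills the one-pixel hole
```

SciPy also offers `ndimage.binary_opening` and `ndimage.binary_closing`, which combine the two steps directly.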

References[edit]

  1. T. Haslwanter (2012). "Mexican Hat Function [Python]". Private communications. http://work.thaslwanter.at/CSS/Code/mexican_hat.py.
  2. Olshausen, B.A. and Field, D.J. (1996). "Emergence of simple-cell receptive field properties by learning a sparse code for natural images". Nature 381 (June 13): 607-609.
  3. scikits-image development team (2012). "Emergence of Gabor-like functions from a simple image [Python]". http://work.thaslwanter.at/CSS/Code/lena2gabor.py.
  4. Thomas Haslwanter (2012). "Demo-application of Gabor filters to an image [Python]". http://work.thaslwanter.at/CSS/Code/gabor_demo.py.
