# Fractals/Computer graphic techniques/2D/optimisation

```          "premature optimization is the root of all evil" - Donald Knuth in paper "Structured Programming With Go To Statements"
```
```           "good algorithms beat optimized code" Bruce Dawson ( Fractal extreme )
```

# Software optimisation

• Profiling code
• benchmarking 

• "If you've actually profiled the code, have specific snippets so that everyone can run the same code to see its performance, and you have this library published somewhere where it can be seen (GitHub, Bitbucket, etc.), then I don't necessarily see a problem with it going on Code Review. Including the results from your profiler of choice with identified bottlenecks would go a long way towards keeping it on-topic.
• If you're just starting the code but have profiled a small amount of code that exhibits the aberrant performance, then asking it on Stack Overflow would be acceptable. Including the results from your profiler of choice with identified bottlenecks would go a long way towards keeping it on-topic.
• If you're asking for people to just help you optimize the code, don't ask it anywhere. Getting a code dump and being asked to find the performance bottleneck is only something I do professionally, not on my free time." (Makoto)

## Inner loop

• inner loop in wikipedia 

## Symmetry

### Mirror symmetry

Parameter plane and the Mandelbrot set is symmetric with respect to the x-axis in the plane :

```colour(x,y) = colour(x,-y)
```

Here is a C code that shows how to divide symmetric image (with axis in the middle of its height) into 3 parts :

• axis of symmetry
• 2 symmetric parts : above and below axis
```// fill array using mirror symmetry of image
// uses global var :  ...
int FillArray(unsigned char data[] )
{

unsigned char Color; // gray from 0 to 255

// draw axis of symmetry
iy = iyAxisOfSymmetry;
for(ix=ixMin;ix<=ixMax;++ix) PlotPoint(ix, iy, GiveColor(ix, iy));

// draw symmetric parts : above and below axis
for(iyAbove = iyAboveMin; iyAbove<=iyAboveMax; ++iyAbove)
for(ix=ixMin; ix<=ixMax; ++ix)

{ // above axis compute color and save it to the array
iy = iyAxisOfSymmetry + iyAbove;
Color = GiveColor(ix, iy);
PlotPoint(ix, iy, Color );
// below the axis only copy Color the same as above without computing it
PlotPoint(ix, iyAxisOfSymmetry - iyAbove , Color );
}
return 0;
}
```

### Rotational symmetry

Dynamical plane and the Julia set have rotational symmetry. It can be used to speed up computing :

```// fill array using symmetry of image
// uses global var :  ...
int FillArraySymmetric(unsigned char data[] )
{

unsigned char Color; // gray from 0 to 255

printf("axis of symmetry \n");
iy = iyAxisOfSymmetry;

for(ix=ixMin;ix<=ixMax;++ix) PlotPoint(ix, iy, GiveColor(ix, iy));

// above and below axis
for(iyAbove = iyAboveMin; iyAbove<=iyAboveMax; ++iyAbove)
{printf(" %d from %d\n", iyAbove, iyAboveMax); //info
for(ix=ixMin; ix<=ixMax; ++ix)

{ // above axis compute color and save it to the array
iy = iyAxisOfSymmetry + iyAbove;
Color = GiveColor(ix, iy);
PlotPoint(ix, iy, Color );
// below the axis only copy Color the same as above without computing it
PlotPoint(ixMax-ix, iyAxisOfSymmetry - iyAbove , Color );
}
}
return 0;
}
```

## Complex numbers

Computations without explicit definition of complex numbers or without complex type are usually faster.

## Parallel computing

In Mandelbrot and Julia sets each point/pixel is calculated independently, so it can be easly divided into an smaller tasks and carried out simultaneously.

Types of parallel computing : Wikipedia description

### Vectorisation

#### R

Vectorisation 

Here is R code of above image with comments:

```## based on the R code by Jarek Tuszynski
## http://commons.wikimedia.org/wiki/File:Mandelbrot_Creation_Animation_%28800x600%29.gif

iMax=20; # number of frames and iterations

## world coordinate ( float )
cxmin = -2.2;
cxmax = 1.0;
cymax = 1.2;
cymin = -1.2;

## screen coordinate ( integer)

dx=800; dy=600 ;           # define grid size

## sequences of  values = rows and columns
cxseq = seq(cxmin, cxmax, length.out=dx);
cyseq = seq(cymin, cymax, length.out=dy);
## sequences of  values = rows * columns
cxrseq = rep(cxseq, each = dy);
cyrseq = rep(cyseq,  dx);
C = complex( real=cxrseq, imag = cyrseq); # complex vector
C = matrix(C,dy,dx)       # convert from vector to matrix
# Now C is a matrix of c points coordinates (cx,cy)

# allocate memory for all the frames
F = array(0, dim = c(dy,dx,iMax)) # dy*dx*iMax array filled with zeros

Z = 0                     # initialize Z to zero
for (i in 1:iMax)         # perform iMax iterations
{
Z = Z^2+C               # iteration; uses n point to compute n+1 point

F[,,i]  = exp(-abs(Z))              # an omitted index is used to represent an entire column or row
# store magnitude of the complex number, scale it using an exponential function to emphasize fine detail,
}

# color and save F array to the file
library(caTools)          # load library with write.gif function
# mapped to color palette jetColors . Dark red is a very low number, dark blue is a very high number
jetColors = colorRampPalette(c("#00007F", "blue", "#007FFF", "cyan", "#7FFF7F", "yellow", "#FF7F00", "red", "#7F0000"))
write.gif(F, "Mandelbrot.gif", col=jetColors, delay=100, transparent=0)

```

### Multicore

"I always create as many worker threads as I have cores. More than that and your system spends too much time task switching" - Duncan C

Using OpenMP :

```// fill array using symmetry of image
// uses global var :  ...
int FillArraySymmetric(unsigned char data[] )
{

unsigned char Color; // gray from 0 to 255

printf("axis of symmetry \n");
iy = iyAxisOfSymmetry;
#pragma omp parallel for schedule(dynamic) private(ix,Color) shared(ixMin,ixMax, iyAxisOfSymmetry)
for(ix=ixMin;ix<=ixMax;++ix) PlotPoint(ix, iy, GiveColor(ix, iy));

/*
The use of ‘shared(variable, variable2) specifies that these variables should be shared among all the threads.
The use of ‘private(variable, variable2)’ specifies that these variables should have a separate instance in each thread.
*/

#pragma omp parallel for schedule(dynamic) private(iyAbove,ix,iy,Color) shared(iyAboveMin, iyAboveMax,ixMin,ixMax, iyAxisOfSymmetry)

// above and below axis
for(iyAbove = iyAboveMin; iyAbove<=iyAboveMax; ++iyAbove)
{printf(" %d from %d\n", iyAbove, iyAboveMax); //info
for(ix=ixMin; ix<=ixMax; ++ix)

{ // above axis compute color and save it to the array
iy = iyAxisOfSymmetry + iyAbove;
Color = GiveColor(ix, iy);
PlotPoint(ix, iy, Color );
// below the axis only copy Color the same as above without computing it
PlotPoint(ixMax-ix, iyAxisOfSymmetry - iyAbove , Color );
}
}
return 0;
}
```

#### Cell

"The Sony PlayStation 3 is one of the cheapest parallel computers available on the consumer market."

#### GPGPU

Canvas : list of supporting browsers

## Effects

Effects on 2 core CPU ( with 1 thread per core )

Method time [minutes] relative speed
CPU & no optimisation 80 1
CPU & symmetry 39 2
CPU & symmetry & OpenMP 24 3
GPU ? ( to do )

GPU should be:

• 62 time faster than CPU 
• 760 times faster 

Perturbation algorithm with series approximation ( 100 times faster on zoom levels around e100! ) 

Using WinXP Intel 2.9 GHz CPU (1 CPU used) with a GTX 480 GPU I get the following using 1000x1000 plot with 1000 max iterations :

type time description
gpu 0.07s gpu is a pure CUDA solution on the GPU
gpuarray 3.4s gpuarray uses a numpy-like CUDA wrapper in Python on the GPU
numpy 43.4s numpy is a pure Numpy (C-based) solution on the CPU
python (serial) 1605.6s python is a pure Python solution on the CPU with numpy arrays