Parallel Spectral Numerical Methods/Gray Scott

From Wikibooks, open books for an open world
Jump to: navigation, search

Background[edit]

[1]


\frac{\partial u}{\partial t} = D_{u} \nabla ^{2} u-uv^{2}+F (1-u)

\frac{\partial v}{\partial t} = D_{v} \nabla ^{2} u+uv^{2}-(F+k)v
The above reaction-diffusion system describes the density of two chemicals ,(u and v), reacting with another. First term in the equations is for the diffusion, the second describes the reaction of u and v,(u+2v3v) and the last term in the first equation is the input of new u, while in the second the last term removes v from the system.

Matlab Program[edit]

The program solves the linear part with fourier transformation and the nonlinear part with the implicit midpoint rule.

 

 

 

 

( A)

A Matlab program to solve the 3-dimensional GrayScott equation Code download
% A program to solve the Gray-Scott equations using
% splitting method. The nonlinear equations are solved using the implicit
% midpoint rule
clear all; format compact, format short,
set(0,'defaultaxesfontsize',17,'defaultaxeslinewidth',.7,...
    'defaultlinelinewidth',3.5,'defaultpatchlinewidth',3.5)
 
Nx =64; % number of modes
Ny=64;
Nz=64;
% set up grid
Nt=4;
tmax=0.04;
Lx = 1.5;       % period  2*pi * L
Ly =1.5;
Lz=1.5;
dt = tmax/Nt;   % number of time slices
tol=0.1^12;
A = 0.04;
B = 0.1;
Du = 1.;
Dv = 1.;
% initialise variables
x = (2*pi/Nx)*(-Nx/2:Nx/2 -1)'*Lx;      % x coordinate
y = (2*pi/Ny)*(-Nx/2:Ny/2 -1)'*Ly;
z = (2*pi/Nz)*(-Nx/2:Nz/2 -1)'*Lz;
kx = i*[0:Nx/2-1 0 -Nx/2+1:-1]'/Lx;     % wave vector
ky = i*[0:Ny/2-1 0 -Ny/2+1:-1]'/Ly;
kz = i*[0:Nz/2-1 0 -Nz/2+1:-1]'/Lz;
[xx,yy,zz]=meshgrid(x,y,z);
[kxx,kyy,kzz]=meshgrid(kx,ky,kz);
% initial conditions
t=0; tdata(1)=t;
u = 0.2 + exp(-2*(xx.^2+yy.^2+zz.^2));
v = 0.1 + exp(-4*(xx.^2+yy.^2+zz.^2-0.01).^2);
 
gamma=[1];
Ahat=A*fftn(ones(size(xx)));
t=0;
figure(11); clf;
subplot(2,1,1);
H = vol3d('CData',real(u),'texture','3D','XData',x,'YData',y,'ZData',z);
xlabel('x'); ylabel('y'); zlabel('z'); colorbar;
axis equal; axis square; view(3);
 
xlabel('x'); ylabel('y'); zlabel('z');
axis equal; axis square; view(3); drawnow;
title(['Time ',num2str(t)]);
subplot(2,1,2);
H = vol3d('CData',real(v),'texture','3D','XData',x,'YData',y,'ZData',z);
xlabel('x'); ylabel('y'); zlabel('z'); colorbar;
axis equal; axis square; view(3);
 
xlabel('x'); ylabel('y'); zlabel('z');
axis equal; axis square; view(3); drawnow;
filename=['./pictures1/',num2str(10000000),'.jpg'];
saveas(11,filename)
 
 
for n=1:Nt
    chg=1;
    for m=1:1
        % use fixed point iteration to solve nonlinear system in real space
        chg=1;
        uold=u; vold=v;
        while (chg>tol)
            utemp=u; vtemp=v;
            umean=0.5*(u+uold);
            vmean=0.5*(v+vold);
            u=uold+0.5*gamma(m)*dt*(-umean.*vmean.^2);
            v=vold+0.5*gamma(m)*dt*umean.*vmean.^2;
            chg=max(abs(u-utemp))+max(abs(v-vtemp));
        end
        uhat=fftn(u); vhat=fftn(v); % solve linear part exactly in Fourier space
        uhat=exp(gamma(m)*dt*(-A+Du*(kxx.^2+kyy.^2+kzz.^2))).*...
            (uhat-Ahat./(A+Du*(kxx.^2+kyy.^2+kzz)))+Ahat./...
            (A+Du*(kxx.^2+kyy.^2+kzz.^2));
        vhat=exp(gamma(m)*dt*(-B+Dv*(kxx.^2+kyy.^2+kzz.^2))).*vhat;
        u=ifftn(uhat); v=ifftn(vhat);
        % use fixed point iteration to solve nonlinear system in real space
        chg=1;
        uold=u; vold=v;
        while (chg>tol)
            utemp=u; vtemp=v;
            umean=0.5*(u+uold);
            vmean=0.5*(v+vold);
            u=uold+0.5*gamma(m)*dt*(-umean.*vmean.^2);
            v=vold+0.5*gamma(m)*dt*umean.*vmean.^2;
            chg=max(abs(u-utemp))+max(abs(v-vtemp));
        end
    end
    t=n*dt;
    figure(11); clf;
    subplot(2,1,1);
    H = vol3d('CData',real(u),'texture','3D','XData',x,'YData',y,'ZData',z);
    xlabel('x'); ylabel('y'); zlabel('z'); colorbar;
    axis equal; axis square; view(3);
 
    xlabel('x'); ylabel('y'); zlabel('z');
    axis equal; axis square; view(3); drawnow;
    title(['Time ',num2str(t)]);
    subplot(2,1,2);
    H = vol3d('CData',real(v),'texture','3D','XData',x,'YData',y,'ZData',z);
    xlabel('x'); ylabel('y'); zlabel('z'); colorbar;
    axis equal; axis square; view(3);
 
    xlabel('x'); ylabel('y'); zlabel('z');
    axis equal; axis square; view(3); drawnow;
    filename=['./pictures1/',num2str(1000000+n),'.jpg'];
    saveas(11,filename)
 
end

To compute the visualisation download vol3d.m [2].

Fortran Program[edit]

The program uses 2decomp_fft like the programs from previous chapters.

 

 

 

 

( B)

A Fortran program to solve the 3-dimensional GrayScott equation Code download
	!--------------------------------------------------------------------
	!
	!
	! PURPOSE
	!
	! This program solves GrayScott equation in 3 dimensions
	! u_t=D_u*(u_{xx}+u_{yy}+u_{zz})^2-u*v^2-F*(1-u)
	! v_t=D_v*(v_{xx}+v_{yy}+v_{zz})^2+u*v^2-(F+k)*v
	! 
	!
	! .. Parameters ..
	!  Nx				= number of modes in x - power of 2 for FFT
	!  Ny				= number of modes in y - power of 2 for FFT
	!  Nz				= number of modes in z - power of 2 for FFT
	!  Nt				= number of timesteps to take
	!  Tmax				= maximum simulation time
	!  plotgap			= number of timesteps between plots
	!  pi = 3.14159265358979323846264338327950288419716939937510d0
	!  Lx				= width of box in x direction
	!  Ly				= width of box in y direction
	!  Lz				= width of box in z direction
	!  A				=F
	!  B				=F+k
	! .. Scalars ..
	!  Ahat 			= Fourier transform of A
	!  i				= loop counter in x direction
	!  j				= loop counter in y direction
	!  k				= loop counter in z direction
	!  n				= loop counter for timesteps direction	
	!  allocatestatus	= error indicator during allocation
	!  start			= variable to record start time of program
	!  finish			= variable to record end time of program
	!  count_rate		= variable for clock count rate
	!  dt				= timestep
	!  modescalereal	= Number to scale after backward FFT 
	!  myid				= Process id
	!  ierr				= error code
	!  p_row			= number of rows for domain decomposition
	!  p_col			= number of columns for domain decomposition
	!  filesize			= total filesize
	!  disp				= displacement to start writing data from
	!  ind				= index in array to write
	!  plotnum			= number of plot to save
	!  numberfile		= number of the file to be saved to disk
	!  stat				= error indicator when reading inputfile
	!  umean			= temporary for fix point iteration
	!  vmean			= temporary for fix point iteration
	! .. Arrays ..
	!  u 				= approximate solution u
	!  v				= approximate solution v
	!  uold 				= previous timestep of u
	!  vold				= previous timestep of v
	!  utemp 			= temporary for fix point iteration u
	!  vtemp			= temporary for fix point iteration v
	!  uhat 			= Fourier transform of u
	!  vhat				= Fourier transform of v
	! .. Vectors ..
	!  kx				= fourier frequencies in x direction squared
	!  ky				= fourier frequencies in y direction squared
	!  kz				= fourier frequencies in z direction squared
	!  x				= x locations
	!  y				= y locations
	!  z				= z locations
	!  time				= times at which save data
	!  nameconfig		= array to store filename for data to be saved    		
	!  InputFileName	= name of the Input File
	!  dpcomm
	!  intcomm
	! .. Special Structures ..
	!  decomp			= contains information on domain decomposition
	!  sp				see http://www.2decomp.org/ for more information
	!
	! REFERENCES
	!
	! ACKNOWLEDGEMENTS
	!
	! ACCURACY
	!		
	! ERROR INDICATORS AND WARNINGS
	!
	! FURTHER COMMENTS
	! Check that the initial iterate is consistent with the 
	! boundary conditions for the domain specified
	!--------------------------------------------------------------------
	! External routines required
	! 
	! External libraries required
	! 2DECOMP&FFT	 -- Domain decomposition and Fast Fourier Library
	!			(http://www.2decomp.org/index.html)
	! MPI library
 
	PROGRAM main
	USE decomp_2d
	USE decomp_2d_fft
	USE decomp_2d_io
	USE MPI
	! Declare variables
	IMPLICIT NONE					 
	INTEGER(kind=4)		::  Nx=0
	INTEGER(kind=4) 	::  Ny=0
	INTEGER(kind=4) 	::  Nz=0
	INTEGER(kind=4)		::  Nt=0	
	INTEGER(kind=4)		::  plotgap=0 
	REAL(kind=8), PARAMETER	::&
		pi=3.14159265358979323846264338327950288419716939937510d0
	REAL(kind=8)		::  Lx=0.0,Ly=0.0,Lz=0.0
	REAL(kind=8)		::  dt=0.0		
	REAL(kind=8)		::  tol=0.1**12
	REAL(kind=8)		::  A=0.0
	REAL(kind=8)		::  Ahat=0
	REAL(kind=8)		::  B=0.0
	REAL(kind=8)		::  Du=0.0
	REAL(kind=8)		::  Dv=0.0
	REAL(kind=8)		::  chg=1.0
	REAL(kind=8), DIMENSION(:), ALLOCATABLE	::  kx,ky,kz
	REAL(kind=8),  	 DIMENSION(:), ALLOCATABLE	::  x,y,z 
	REAL(kind=8), DIMENSION(:,:,:), ALLOCATABLE	::  u,v
	REAL(kind=8), DIMENSION(:,:,:), ALLOCATABLE	::  uold,vold
	REAL(kind=8) 		:: utemp,vtemp
	COMPLEX(kind=8), DIMENSION(:,:,:), ALLOCATABLE	::  uhat,vhat
	COMPLEX(kind=8) :: uhatmp
	REAL(kind=8) :: umean,vmean
	REAL(kind=8), 	 DIMENSION(:), ALLOCATABLE	::  time
	INTEGER(KIND=4), DIMENSION(1:5)	::  intcomm
	REAL(KIND=8), DIMENSION(1:8)	::  dpcomm
	REAL(kind=8) :: modescalereal
	INTEGER(kind=4)	::  i,j,k,n,AllocateStatus,stat
	INTEGER(kind=4)	:: myid,numprocs,ierr
	TYPE(DECOMP_INFO)	::  decomp,sp
	INTEGER(kind=MPI_OFFSET_KIND) :: filesize, disp
	INTEGER(kind=4)	::  p_row=0, p_col=0	
 	INTEGER(kind=4)	::  start, finish, count_rate, ind, plotnum
	CHARACTER*50	::	nameconfig
	CHARACTER*20	::	numberfile, InputFileName
	    ! initialisation of MPI
	CALL MPI_INIT(ierr)
	CALL MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
	CALL MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr) 
 
 
	CALL MPI_BCAST(stat,1,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)
 
	IF(myid.eq.0) THEN
		InputFileName='INPUTFILE'
		OPEN(unit=11,FILE=trim(InputFileName),status="OLD") 
		REWIND(11)
		READ(11,*) intcomm(1), intcomm(2), intcomm(3), intcomm(4),intcomm(5),&
			 		dpcomm(1), dpcomm(2),  dpcomm(3),  dpcomm(4),  dpcomm(5), &
					dpcomm(6), dpcomm(7),  dpcomm(8)
		CLOSE(11)
		PRINT *,"NX ",intcomm(1)
		PRINT *,"NY ",intcomm(2)
		PRINT *,"NZ ",intcomm(3)
		PRINT *,"NT ",intcomm(4)
		PRINT *,"plotgap ",intcomm(5)
		PRINT *,"Lx",dpcomm(1)
		PRINT *,"Ly",dpcomm(2)
		PRINT *,"Lz",dpcomm(3)
		PRINT *,"dt",dpcomm(4)		
		PRINT *,"Du",dpcomm(5)
		PRINT *,"Dv",dpcomm(6)
		PRINT *,"F",dpcomm(7)
		PRINT *,"k",dpcomm(8)		
		PRINT *,"Read inputfile"
	END IF
	CALL MPI_BCAST(dpcomm,8,MPI_DOUBLE_PRECISION,0,MPI_COMM_WORLD,ierr)
	CALL MPI_BCAST(intcomm,5,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)
 
	Nx=intcomm(1)
	Ny=intcomm(2)
	Nz=intcomm(3)
	Nt=intcomm(4)
	plotgap=intcomm(5)
	Lx=dpcomm(1)
	Ly=dpcomm(2)
	Lz=dpcomm(3)
	dt=dpcomm(4)
	Du=dpcomm(5)
	Dv=dpcomm(6)
	A=dpcomm(7)
	B=dpcomm(8)+dpcomm(7)
	! initialisation of 2decomp
	! do automatic domain decomposition
	CALL decomp_2d_init(Nx,Ny,Nz,p_row,p_col)
	! get information about domain decomposition choosen
	CALL get_decomp_info(decomp) ! physical domain
	CALL decomp_info_init(Nx/2+1,Ny,Nz,sp) ! spectral domain
	! initialise FFT library
	CALL decomp_2d_fft_init
	ALLOCATE(u(decomp%xst(1):decomp%xen(1),&
   				decomp%xst(2):decomp%xen(2),&
   				decomp%xst(3):decomp%xen(3)),&
            v(decomp%xst(1):decomp%xen(1),&
            	decomp%xst(2):decomp%xen(2),&
            	decomp%xst(3):decomp%xen(3)),&
			uold(decomp%xst(1):decomp%xen(1),&
   				decomp%xst(2):decomp%xen(2),&
   				decomp%xst(3):decomp%xen(3)),&
            vold(decomp%xst(1):decomp%xen(1),&
            	decomp%xst(2):decomp%xen(2),&
            	decomp%xst(3):decomp%xen(3)),&
			uhat(sp%zst(1):sp%zen(1),&
   				sp%zst(2):sp%zen(2),&
   				sp%zst(3):sp%zen(3)),&
            vhat(sp%zst(1):sp%zen(1),&
            	sp%zst(2):sp%zen(2),&
            	sp%zst(3):sp%zen(3)),&
		kx(sp%zst(1):sp%zen(1)), &
		ky(sp%zst(2):sp%zen(2)), &
		kz(sp%zst(3):sp%zen(3)), &
		x(decomp%xst(1):decomp%xen(1)),&
		y(decomp%xst(2):decomp%xen(2)),&
		z(decomp%xst(3):decomp%xen(3)),&
   		time(1:1+Nt/plotgap),&
		stat=AllocateStatus)	
	IF (AllocateStatus .ne. 0) STOP 
 
	IF (myid.eq.0) THEN
		PRINT *,'allocated space'
	END IF	
	CALL MPI_BARRIER(MPI_COMM_WORLD,ierr)
	modescalereal=1.0d0/REAL(Nx,KIND(0d0))
	modescalereal=modescalereal/REAL(Ny,KIND(0d0))
	modescalereal=modescalereal/REAL(Nz,KIND(0d0))
 
	! setup fourier frequencies and grid points
DO i = 1,1+ Nx/2
		IF ((i.GE.sp%zst(1)).AND.(i.LE.sp%zen(1))) THEN
			kx(i)=-(REAL(i-1,kind(0d0))/Lx)**2
		END IF
	END DO
	IF ((Nx/2 + 1 .GE.sp%zst(1)).AND.(Nx/2 + 1 .LE.sp%zen(1))) THEN
		kx( Nx/2 + 1 ) = 0.0d0 
	END IF
	DO i = Nx/2+2, Nx  
		IF ((i.GE.sp%zst(1)).AND.(i.LE.sp%zen(1))) THEN
			Kx( i) = -(REAL(1-i+Nx,KIND(0d0))/Lx)**2  
		END IF
	END DO      
	DO i=decomp%xst(1),decomp%xen(1)
		x(i)=(-1.0d0 + 2.0d0*REAL(i-1,kind(0d0))/REAL(Nx,kind(0d0)))*pi*Lx
	END DO
 
	DO j = 1,1+ Ny/2
		IF ((j.GE.sp%zst(2)).AND.(j.LE.sp%zen(2))) THEN
			ky(j)= -(REAL(j-1,kind(0d0))/Ly)**2
		END IF
	END DO
	IF ((Ny/2 + 1 .GE.sp%zst(2)).AND.(Ny/2 + 1 .LE.sp%zen(2))) THEN
		ky( Ny/2 + 1 ) = 0.0d0 
	ENDIF
	DO j = Ny/2+2, Ny  
		IF ((j.GE.sp%zst(2)).AND.(j.LE.sp%zen(2))) THEN
			ky(j) = -(REAL(1-j+Ny,KIND(0d0))/Ly)**2  
		END IF
	END DO      
	DO j=decomp%xst(2),decomp%xen(2)
		y(j)=(-1.0d0 + 2.0d0*REAL(j-1,kind(0d0))/REAL(Ny,kind(0d0)))*pi*Ly
	END DO
 
	DO k = 1,1+ Nz/2
		IF ((k.GE.sp%zst(3)).AND.(k.LE.sp%zen(3))) THEN
			kz(k)= -(REAL(k-1,kind(0d0))/Lz)**2
		END IF
	END DO
	IF ((Nz/2 + 1 .GE.sp%zst(3)).AND.(Nz/2 + 1 .LE.sp%zen(3))) THEN
		kz( Nz/2 + 1 ) = 0.0d0 
	ENDIF
	DO k = Nz/2+2, Nz  
		IF ((k.GE.sp%zst(3)).AND.(k.LE.sp%zen(3))) THEN
			kz(k) = -(REAL(1-k+Nz,KIND(0d0))/Lz)**2  
		END IF
	END DO      
	DO k=decomp%xst(3),decomp%xen(3)
		z(k)=(-1.0d0 + 2.0d0*REAL(k-1,kind(0d0))/REAL(Nz,kind(0d0)))*pi*Lz
	END DO
 
	IF (myid.eq.0) THEN
		PRINT *,'Setup grid and fourier frequencies'
	END IF
	!compute Ahat
	Ahat=A*Nx*Ny*Nz
	!initial condition
	CALL MPI_BARRIER(MPI_COMM_WORLD,ierr)
 
	DO k=decomp%xst(3),decomp%xen(3)
		DO j=decomp%xst(2),decomp%xen(2)
			DO i=decomp%xst(1),decomp%xen(1)
				u(i,j,k)=1+(exp(-1*(x(i)**2+y(j)**2+z(k)**2)-1))			
				v(i,j,k)=0+(exp(-1*(x(i)**2+y(j)**2+z(k)**2)-1))
			END DO
		END DO
	END DO
	! write out using 2DECOMP&FFT MPI-IO routines
	nameconfig='./data/u'
	plotnum=0
	WRITE(numberfile,'(i0)') 10000000+plotnum
	ind=index(nameconfig,' ') -1
	nameconfig=nameconfig(1:ind)//numberfile
	ind=index(nameconfig,' ') -1
	nameconfig=nameconfig(1:ind)//'.datbin'
	CALL decomp_2d_write_one(1,u,nameconfig)
	!same for v
	nameconfig='./data/v'
	plotnum=0
	WRITE(numberfile,'(i0)') 10000000+plotnum
	ind=index(nameconfig,' ') -1
	nameconfig=nameconfig(1:ind)//numberfile
	ind=index(nameconfig,' ') -1
	nameconfig=nameconfig(1:ind)//'.datbin'
	CALL decomp_2d_write_one(1,v,nameconfig)
 
	IF (myid.eq.0) THEN
		PRINT *,'Got initial data, starting timestepping'
	END IF
	CALL system_clock(start,count_rate)	
	time(1)=0
	DO n=1,Nt
	!use fixed point iteration to solve nonlinear system in real space
		DO k=decomp%xst(3),decomp%xen(3)
			DO j=decomp%xst(2),decomp%xen(2)
				DO i=decomp%xst(1),decomp%xen(1)
					uold(i,j,k)=u(i,j,k)
					vold(i,j,k)=v(i,j,k)
				END DO
			END DO
		END DO
		DO k=decomp%xst(3),decomp%xen(3)
			DO j=decomp%xst(2),decomp%xen(2)
				DO i=decomp%xst(1),decomp%xen(1)
					chg=1
					DO WHILE (chg>tol)
						utemp=u(i,j,k)
						vtemp=v(i,j,k)
						umean=0.5*(u(i,j,k)+uold(i,j,k))
						vmean=0.5*(v(i,j,k)+vold(i,j,k))
						u(i,j,k)=uold(i,j,k)+0.5*dt*(-umean*vmean**2)
						v(i,j,k)=vold(i,j,k)+0.5*dt*(umean*vmean**2)
						utemp=u(i,j,k)-utemp
						vtemp=v(i,j,k)-vtemp
						chg=abs(utemp)+abs(vtemp)
				END DO
			END DO
		END DO			
	END DO
! solve linear part exactly in Fourier space
		CALL decomp_2d_fft_3d(u,uhat)
		CALL decomp_2d_fft_3d(v,vhat)
		IF (myid.eq.0) THEN
			uhatmp=uhat(1,1,1)
		END IF
		DO k=sp%zst(3),sp%zen(3)
			DO j=sp%zst(2),sp%zen(2)
				DO i=sp%zst(1),sp%zen(1)
					uhat(i,j,k)=exp(dt*(-A+Du*(kx(i)+ky(j)+kz(k))))*&
								uhat(i,j,k)
					vhat(i,j,k)=exp(dt*(-B+Dv*(kx(i)+ky(j)+kz(k))))*&
								vhat(i,j,k)
				END DO
			END DO
		END DO
		IF (myid.eq.0) THEN
			uhat(1,1,1)=exp(dt*(-A+Du*(kx(1)+ky(1)+kz(1))))*&
						(uhatmp-Ahat/(A+Du*(kx(1)+ky(1)+kz(1))))+&
						(Ahat/(A+Du*(kx(1)+ky(1)+kz(1))))
		END IF
		!dont forget to scale
		CALL decomp_2d_fft_3d(uhat,u)
		CALL decomp_2d_fft_3d(vhat,v)
		DO k=decomp%xst(3),decomp%xen(3)
			DO j=decomp%xst(2),decomp%xen(2)
				DO i=decomp%xst(1),decomp%xen(1)
					u(i,j,k)=u(i,j,k)*modescalereal
					v(i,j,k)=v(i,j,k)*modescalereal
				END DO
			END DO
		END DO
		!use fixed point iteration to solve nonlinear system in real space
		DO k=decomp%xst(3),decomp%xen(3)
			DO j=decomp%xst(2),decomp%xen(2)
				DO i=decomp%xst(1),decomp%xen(1)
					uold(i,j,k)=u(i,j,k)
					vold(i,j,k)=v(i,j,k)
				END DO
			END DO
		END DO
		DO k=decomp%xst(3),decomp%xen(3)
			DO j=decomp%xst(2),decomp%xen(2)
				DO i=decomp%xst(1),decomp%xen(1)
					chg=1
					DO WHILE (chg>tol)
						utemp=u(i,j,k)
						vtemp=v(i,j,k)
						umean=0.5*(u(i,j,k)+uold(i,j,k))
						vmean=0.5*(v(i,j,k)+vold(i,j,k))
						u(i,j,k)=uold(i,j,k)+0.5*dt*(-umean*vmean**2)
						v(i,j,k)=vold(i,j,k)+0.5*dt*(umean*vmean**2)
						utemp=u(i,j,k)-utemp
						vtemp=v(i,j,k)-vtemp
						chg=abs(utemp)+abs(vtemp)
					END DO
				END DO
			END DO
		END DO			
!write data		
		IF (mod(n,plotgap)==0) THEN
			time(1+n/plotgap)=n*dt
			IF (myid.eq.0) THEN
				PRINT *,'time',n*dt
			END IF
			nameconfig='./data/u'
			plotnum=plotnum+1
			WRITE(numberfile,'(i0)') 10000000+plotnum
			ind=index(nameconfig,' ') -1
			nameconfig=nameconfig(1:ind)//numberfile
			ind=index(nameconfig,' ') -1
			nameconfig=nameconfig(1:ind)//'.datbin'
			!write out using 2DECOMP&FFT MPI-IO routines
			CALL decomp_2d_write_one(1,u,nameconfig)
			nameconfig='./data/v'
			WRITE(numberfile,'(i0)') 10000000+plotnum
			ind=index(nameconfig,' ') -1
			nameconfig=nameconfig(1:ind)//numberfile
			ind=index(nameconfig,' ') -1
			nameconfig=nameconfig(1:ind)//'.datbin'
			!write out using 2DECOMP&FFT MPI-IO routines
			CALL decomp_2d_write_one(1,v,nameconfig)
		END IF
	END DO		
	IF (myid.eq.0) THEN
		PRINT *,'Finished time stepping'
	END IF
 
	CALL system_clock(finish,count_rate)
 
	IF (myid.eq.0) THEN
		PRINT*,'Program took ',REAL(finish-start)/REAL(count_rate),'for execution'
	END IF		
 
	IF (myid.eq.0) THEN	
		! Save times at which output was made in text format
		nameconfig = './data/tdata.dat' 
		OPEN(unit=11,FILE=nameconfig,status="UNKNOWN") 	
		REWIND(11)
		DO j=1,1+Nt/plotgap
			WRITE(11,*) time(j)
		END DO
		CLOSE(11)
		! Save x grid points in text format
		nameconfig = './data/xcoord.dat' 
		OPEN(unit=11,FILE=nameconfig,status="UNKNOWN") 	
		REWIND(11)
		DO i=1,Nx
			WRITE(11,*) x(i)
		END DO
		CLOSE(11)
		! Save y grid points in text format
		nameconfig = './data/ycoord.dat' 
		OPEN(unit=11,FILE=nameconfig,status="UNKNOWN") 	
		REWIND(11)
		DO j=1,Ny
			WRITE(11,*) y(j)
		END DO
		CLOSE(11)
		! Save z grid points in text format	
		nameconfig = './data/zcoord.dat' 
		OPEN(unit=11,FILE=nameconfig,status="UNKNOWN") 	
		REWIND(11)
		DO k=1,Nz
			WRITE(11,*) z(k)
		END DO
		CLOSE(11)
		PRINT *,'Saved data'
	END IF
 
	! clean up 
  	CALL decomp_2d_fft_finalize
  	CALL decomp_2d_finalize
 	DEALLOCATE(u,v,uold,vold,uhat,vhat,&
   		kx,ky,kz,x,y,z,&
   		time,stat=AllocateStatus)	
	IF (AllocateStatus .ne. 0) STOP 
	IF (myid.eq.0) THEN
	   	PRINT *,'Program execution complete'
	END IF
	CALL MPI_FINALIZE(ierr)		
	END PROGRAM main

OpenCL[edit]

OpenCL you write 'Kernel' to do the intense work in parallel. The kernel is compiled during runtime, to enable the use of different accalerators. To run the code you need an valid OpenCL 1.2 framework for example AMD APP SDK [3] which runs also on non AMD CPU's.

The program uses clfft[4].

 

 

 

 

( C)

A OpenCL C program to solve the 3-dimensional GrayScott equation Code download
#include "clFFT.h"
//#include <CL/cl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <sys/time.h>
 
#if defined(cl_amd_fp64) || defined(cl_khr_fp64)
    #if defined(cl_amd_fp64)
    #pragma OPENCL EXTENSION cl_amd_fp64 : enable
    #elif defined(cl_khr_fp64)
    #pragma OPENCL EXTENSION cl_khr_fp64 : enable
    #endif
    // function declarations/definitions using double precision floating-point arithmetic
#endif
 
#define MAX_SOURCE_SIZE (0x100000)
 
int main(void) {
//time meassuring
  	struct timeval tvs;
  	struct timeval tve;
    	double elapsedTime;
//variables
	int	  Nx;
	int 	  Ny;
	int 	  Nz;
	int	  N;
	int 	  plotnum=0;
	int	  Tmax=0;
	int 	  plottime=0;
	int	  plotgap=0;
	double	  Lx,Ly,Lz;
	double	  dt=0.0;	
	double	  A=0.0;
	double	  B=0.0;
	double	  Du=0.0;
	double	  Dv=0.0;
	double	  a[2]={1.0,0.0};	
	double 	  b[2]={0.5,0.0};
	double*	  x,*y,*z ;
	double*	  u[2],*v[2];
//openCL variables
    cl_platform_id platform_id = NULL;
    cl_device_id device_id = NULL;
    cl_context context = NULL;
    cl_command_queue command_queue = NULL;
    cl_mem cl_u[2] = {NULL,NULL};
    cl_mem cl_v[2] = {NULL,NULL};
    cl_mem cl_uhat[2] = {NULL,NULL};
    cl_mem cl_vhat[2] = {NULL,NULL};
    cl_mem cl_x = NULL;
    cl_mem cl_y = NULL;
    cl_mem cl_z = NULL;
    cl_mem cl_kx = NULL;
    cl_mem cl_ky = NULL;
    cl_mem cl_kz = NULL;
    cl_program p_grid = NULL,p_frequencies = NULL,p_initialdata = NULL,p_linearpart=NULL,p_nonlinearpart=NULL;
    cl_kernel grid = NULL,frequencies = NULL,initialdata = NULL,linearpart=NULL,nonlinearpart=NULL;
    cl_uint ret_num_devices;
    cl_uint ret_num_platforms;
    cl_int ret;
	ret = clGetPlatformIDs(1, &platform_id, &ret_num_platforms);
    	ret = clGetDeviceIDs(platform_id, CL_DEVICE_TYPE_DEFAULT, 1, &device_id, &ret_num_devices);
	context = clCreateContext( NULL, 1, &device_id, NULL, NULL, &ret);
	command_queue = clCreateCommandQueue(context, device_id, 0, &ret);
    	size_t source_size;
    	char *source_str;
//end opencl
	int  i,n;
int status=0;	
 	//int  start, finish, count_rate, ind, numthreads
	char	nameconfig[100]="";
//Read infutfile
	char	InputFileName[]="./INPUTFILE";
	FILE*fp;
	fp=fopen(InputFileName,"r");
   	 if(!fp) {fprintf(stderr, "Failed to load IPUTFILE.\n");exit(1);}	 
	int ierr=fscanf(fp, "%d %d %d %d %d %lf %lf %lf %lf %lf %lf %lf %lf", &Nx,&Ny,&Nz,&Tmax,&plotgap,&Lx,&Ly,&Lz,&dt,&Du,&Dv,&A,&B);
	if(ierr!=13){fprintf(stderr, "INPUTFILE corrupted.\n");exit(1);}	
	fclose(fp);
	printf("NX %d\n",Nx); 
	printf("NY %d\n",Ny); 
	printf("NZ %d\n",Nz); 
	printf("Tmax %d\n",Tmax);
	printf("plotgap %d\n",plotgap);
	printf("Lx %lf\n",Lx);
	printf("Ly %lf\n",Ly);
	printf("Lz %lf\n",Lz);
	printf("dt %lf\n",dt);		
	printf("Du %lf\n",Du);
	printf("Dv %lf\n",Dv);
	printf("F %lf\n",A);
	printf("k %lf\n",B);
	printf("Read inputfile\n");
	N=Nx*Ny*Nz;
	plottime=plotgap;
	B=A+B;
//ALLocate the memory
	u[0]=(double*) malloc(N*sizeof(double));
	v[0]=(double*) malloc(N*sizeof(double));
	x=(double*) malloc(Nx*sizeof(double));
	y=(double*) malloc(Ny*sizeof(double));
	z=(double*) malloc(Nz*sizeof(double));
 
//allocate gpu mem
	cl_u[0] = clCreateBuffer(context, CL_MEM_READ_WRITE, N * sizeof(double), NULL, &ret);
	cl_v[0] = clCreateBuffer(context, CL_MEM_READ_WRITE, N * sizeof(double), NULL, &ret);
	cl_u[1] = clCreateBuffer(context, CL_MEM_READ_WRITE, N * sizeof(double), NULL, &ret);
	cl_v[1] = clCreateBuffer(context, CL_MEM_READ_WRITE, N * sizeof(double), NULL, &ret);
	cl_uhat[0] = clCreateBuffer(context, CL_MEM_READ_WRITE, N * sizeof(double), NULL, &ret);
	cl_vhat[0] = clCreateBuffer(context, CL_MEM_READ_WRITE, N * sizeof(double), NULL, &ret);
	cl_uhat[1] = clCreateBuffer(context, CL_MEM_READ_WRITE, N * sizeof(double), NULL, &ret);
	cl_vhat[1] = clCreateBuffer(context, CL_MEM_READ_WRITE, N * sizeof(double), NULL, &ret);
	printf("allocated space\n");
 
	// FFT library realted declarations. 
	clfftPlanHandle planHandle;
	clfftDim dim = CLFFT_3D;
	size_t clLengths[3] = {Nx, Ny, Nz};
	// Setup clFFT. 
	clfftSetupData fftSetup;
	ret = clfftInitSetupData(&fftSetup);
	ret = clfftSetup(&fftSetup);
	// Create a default plan for a complex FFT. 
	ret = clfftCreateDefaultPlan(&planHandle, context, dim, clLengths);
	// Set plan parameters. 
	ret = clfftSetPlanPrecision(planHandle, CLFFT_DOUBLE);
	ret = clfftSetLayout(planHandle, CLFFT_COMPLEX_PLANAR, CLFFT_COMPLEX_PLANAR);
	ret = clfftSetResultLocation(planHandle, CLFFT_OUTOFPLACE);
	// Bake the plan. 
	ret = clfftBakePlan(planHandle, 1, &command_queue, NULL, NULL);
	// Create temporary buffer. 
	cl_mem tmpBufferu = 0;
	cl_mem tmpBufferv = 0;
	// Size of temp buffer. 
	size_t tmpBufferSize = 0;
	status = clfftGetTmpBufSize(planHandle, &tmpBufferSize);
	if ((status == 0) && (tmpBufferSize > 0)) {
		tmpBufferu = clCreateBuffer(context, CL_MEM_READ_WRITE, tmpBufferSize, NULL, &ret);
		tmpBufferv = clCreateBuffer(context, CL_MEM_READ_WRITE, tmpBufferSize, NULL, &ret);
		if (ret != CL_SUCCESS)
			printf("Error with tmpBuffer clCreateBuffer\n");
	}
//kernel grid
    	fp = fopen("./grid.c", "r");
    	if (!fp) {fprintf(stderr, "Failed to load grid.\n"); exit(1); }
    	source_str = (char *)malloc(MAX_SOURCE_SIZE);
   	source_size = fread( source_str, 1, MAX_SOURCE_SIZE, fp );
    	fclose( fp );
 
	p_grid = clCreateProgramWithSource(context, 1, (const char **)&source_str, (const size_t *)&source_size, &ret);
        ret = clBuildProgram(p_grid, 1, &device_id, NULL, NULL, NULL);
        grid = clCreateKernel(p_grid, "grid", &ret);
//first x
	cl_x = clCreateBuffer(context, CL_MEM_READ_WRITE, Nx * sizeof(double), NULL, &ret);
        ret = clSetKernelArg(grid, 0, sizeof(cl_mem), (void *)&cl_x);
	ret = clSetKernelArg(grid, 1, sizeof(double),(void*)&Lx);
	ret = clSetKernelArg(grid, 2, sizeof(int),(void*)&Nx);
	size_t global_work_size_x[3] = {Nx, 0, 0};
        ret = clEnqueueNDRangeKernel(command_queue, grid, 1, NULL, global_work_size_x, NULL, 0, NULL, NULL);
	ret = clFinish(command_queue);
        ret = clEnqueueReadBuffer(command_queue, cl_x, CL_TRUE, 0, Nx * sizeof(double), x, 0, NULL, NULL);
	ret = clFinish(command_queue);
//then y
	cl_y = clCreateBuffer(context, CL_MEM_READ_WRITE, Ny * sizeof(double), NULL, &ret);	
	ret = clSetKernelArg(grid, 0, sizeof(cl_mem), (void *)&cl_y);
	ret = clSetKernelArg(grid, 1, sizeof(double),(void*)&Ly);
	ret = clSetKernelArg(grid, 2, sizeof(int),(void*)&Ny);
	size_t global_work_size_y[3] = {Ny, 0, 0};
 
	ret = clEnqueueNDRangeKernel(command_queue, grid, 1, NULL, global_work_size_y, NULL, 0, NULL, NULL);
	ret = clFinish(command_queue);
        ret = clEnqueueReadBuffer(command_queue, cl_y, CL_TRUE, 0, Ny * sizeof(double), y, 0, NULL, NULL);
	ret = clFinish(command_queue);
 
//last z
	cl_z = clCreateBuffer(context, CL_MEM_READ_WRITE, Nz * sizeof(double), NULL, &ret);
	ret = clSetKernelArg(grid, 0, sizeof(cl_mem), (void *)&cl_z);
	ret = clSetKernelArg(grid, 1, sizeof(double),(void*)&Lz);
	ret = clSetKernelArg(grid, 2, sizeof(int),(void*)&Nz);
	size_t global_work_size_z[3] = {Nz, 0, 0};
	ret = clEnqueueNDRangeKernel(command_queue, grid, 1, NULL, global_work_size_z, NULL, 0, NULL, NULL);
	ret = clFinish(command_queue);
	ret = clEnqueueReadBuffer(command_queue, cl_z, CL_TRUE, 0, Nz * sizeof(double), z, 0, NULL, NULL);
	ret = clFinish(command_queue);
    	ret = clReleaseKernel(grid); ret = clReleaseProgram(p_grid);
 
//kernel initial data
    	fp = fopen("./initialdata.c", "r");
    	if (!fp) {fprintf(stderr, "Failed to load initialdata.\n"); exit(1); }
	free(source_str);    	
	source_str = (char *)malloc(MAX_SOURCE_SIZE);
   	source_size = fread( source_str, 1, MAX_SOURCE_SIZE, fp );
    	fclose( fp );
 
	p_initialdata = clCreateProgramWithSource(context, 1, (const char **)&source_str, (const size_t *)&source_size, &ret);
        ret = clBuildProgram(p_initialdata, 1, &device_id, NULL, NULL, NULL);
        initialdata = clCreateKernel(p_initialdata, "initialdata", &ret);
 
 
        ret = clSetKernelArg(initialdata, 0, sizeof(cl_mem),(void *)&cl_u[0]);
	ret = clSetKernelArg(initialdata, 1, sizeof(cl_mem),(void* )&cl_v[0]);
        ret = clSetKernelArg(initialdata, 2, sizeof(cl_mem),(void *)&cl_u[1]);
	ret = clSetKernelArg(initialdata, 3, sizeof(cl_mem),(void* )&cl_v[1]);
	ret = clSetKernelArg(initialdata, 4, sizeof(cl_mem),(void* )&cl_x);
	ret = clSetKernelArg(initialdata, 5, sizeof(cl_mem),(void* )&cl_y);
	ret = clSetKernelArg(initialdata, 6, sizeof(cl_mem),(void* )&cl_z);
	ret = clSetKernelArg(initialdata, 7, sizeof(int),(void* )&Nx);
	ret = clSetKernelArg(initialdata, 8, sizeof(int),(void* )&Ny);
	ret = clSetKernelArg(initialdata, 9, sizeof(int),(void* )&Nz);
	size_t global_work_size[3] = {N, 0, 0};
        ret = clEnqueueNDRangeKernel(command_queue, initialdata, 1, NULL, global_work_size, NULL, 0, NULL, NULL);
	ret = clFinish(command_queue);
	ret = clReleaseKernel(initialdata); ret = clReleaseProgram(p_initialdata);
        ret = clEnqueueReadBuffer(command_queue, cl_u[0], CL_TRUE, 0, N * sizeof(double), u[0], 0, NULL, NULL);
	ret = clFinish(command_queue);
        ret = clEnqueueReadBuffer(command_queue, cl_v[0], CL_TRUE, 0, N * sizeof(double), v[0], 0, NULL, NULL);
	ret = clFinish(command_queue);
	ret = clReleaseMemObject(cl_x);
	ret = clReleaseMemObject(cl_y);
	ret = clReleaseMemObject(cl_z);
//write to disk
	fp=fopen("./data/xcoord.dat","w");
    	if (!fp) {fprintf(stderr, "Failed to write xcoord.dat.\n"); exit(1); }
	for(i=0;i<Nx;i++){fprintf(fp,"%lf\n",x[i]);}
    	fclose( fp );
	fp=fopen("./data/ycoord.dat","w");
    	if (!fp) {fprintf(stderr, "Failed to write ycoord.dat.\n"); exit(1); }
	for(i=0;i<Ny;i++){fprintf(fp,"%lf\n",y[i]);}
    	fclose( fp );
	fp=fopen("./data/zcoord.dat","w");
    	if (!fp) {fprintf(stderr, "Failed to write zcoord.dat.\n"); exit(1); }
	for(i=0;i<Nz;i++){fprintf(fp,"%lf\n",z[i]);}
    	fclose( fp );
	free(x); free(y); free(z);
	n=0;
	plotnum=0;
//output of initial data U
	char tmp_str[10];
	strcpy(nameconfig,"./data/u");
	sprintf(tmp_str,"%d",10000000+plotnum);
	strcat(nameconfig,tmp_str);
	strcat(nameconfig,".datbin");
	fp=fopen(nameconfig,"wb");
    	if (!fp) {fprintf(stderr, "Failed to write initialdata.\n"); exit(1); }
	for(i=0;i<N;i++){fwrite(&u[0][i], sizeof(double), 1, fp);}
    	fclose( fp );	
//V
	strcpy(nameconfig,"./data/v");
	sprintf(tmp_str,"%d",10000000+plotnum);
	strcat(nameconfig,tmp_str);
	strcat(nameconfig,".datbin");
	fp=fopen(nameconfig,"wb");
    	if (!fp) {fprintf(stderr, "Failed to write initialdata.\n"); exit(1); }
	for(i=0;i<N;i++){fwrite(&v[0][i], sizeof(double), 1, fp);}
    	fclose( fp );
 
 
//frequencies kernel
 
    	fp = fopen("./frequencies.c", "r");
    	if (!fp) {fprintf(stderr, "Failed to load frequencies.\n"); exit(1); }
	free(source_str);
    	source_str = (char *)malloc(MAX_SOURCE_SIZE);
   	source_size = fread( source_str, 1, MAX_SOURCE_SIZE, fp );
    	fclose( fp );
 
	p_frequencies = clCreateProgramWithSource(context, 1, (const char **)&source_str, (const size_t *)&source_size, &ret);
        ret = clBuildProgram(p_frequencies, 1, &device_id, NULL, NULL, NULL);
        frequencies = clCreateKernel(p_frequencies, "frequencies", &ret);
//get frequencies first x
	cl_kx = clCreateBuffer(context, CL_MEM_READ_WRITE, Nx * sizeof(double), NULL, &ret);
        ret = clSetKernelArg(frequencies, 0, sizeof(cl_mem), (void *)&cl_kx);
	ret = clSetKernelArg(frequencies, 1, sizeof(double),(void*)&Lx);
	ret = clSetKernelArg(frequencies, 2, sizeof(int),(void*)&Nx);
        ret = clEnqueueNDRangeKernel(command_queue, frequencies, 1, NULL, global_work_size_x, NULL, 0, NULL, NULL);
	ret = clFinish(command_queue);
//then y
	cl_ky = clCreateBuffer(context, CL_MEM_READ_WRITE, Ny * sizeof(double), NULL, &ret);	
	ret = clSetKernelArg(frequencies, 0, sizeof(cl_mem), (void *)&cl_ky);
	ret = clSetKernelArg(frequencies, 1, sizeof(double),(void*)&Ly);
	ret = clSetKernelArg(frequencies, 2, sizeof(int),(void*)&Ny);
	ret = clEnqueueNDRangeKernel(command_queue, frequencies, 1, NULL, global_work_size_y, NULL, 0, NULL, NULL);
	ret = clFinish(command_queue);
//last z
	cl_kz = clCreateBuffer(context, CL_MEM_READ_WRITE, Nz * sizeof(double), NULL, &ret);
	ret = clSetKernelArg(frequencies, 0, sizeof(cl_mem), (void *)&cl_kz);
	ret = clSetKernelArg(frequencies, 1, sizeof(double),(void*)&Lz);
	ret = clSetKernelArg(frequencies, 2, sizeof(int),(void*)&Nz);
	ret = clEnqueueNDRangeKernel(command_queue, frequencies, 1, NULL, global_work_size_z, NULL, 0, NULL, NULL);
	ret = clFinish(command_queue);
 
	printf("Setup grid, fourier frequencies and initialcondition\n");
//load the rest of the kernels
//linearpart kernel
    	fp = fopen("./linearpart.c", "r");
    	if (!fp) {fprintf(stderr, "Failed to load linearpart.\n"); exit(1); }
	free(source_str);    	
	source_str = (char *)malloc(MAX_SOURCE_SIZE);
   	source_size = fread( source_str, 1, MAX_SOURCE_SIZE, fp );
    	fclose( fp );
 
	p_linearpart = clCreateProgramWithSource(context, 1, (const char **)&source_str, (const size_t *)&source_size, &ret);
        ret = clBuildProgram(p_linearpart, 1, &device_id, NULL, NULL, NULL);
        linearpart = clCreateKernel(p_linearpart, "linearpart", &ret);
 
//kernel nonlinear
    	fp = fopen("./nonlinearpart.c", "r");
    	if (!fp) {fprintf(stderr, "Failed to load nonlinearpart.\n"); exit(1); }
	free(source_str);    	
	source_str = (char *)malloc(MAX_SOURCE_SIZE);
   	source_size = fread( source_str, 1, MAX_SOURCE_SIZE, fp );
    	fclose( fp );
 
	p_nonlinearpart = clCreateProgramWithSource(context, 1, (const char **)&source_str, (const size_t *)&source_size, &ret);
        ret = clBuildProgram(p_nonlinearpart, 1, &device_id, NULL, NULL, NULL);
        nonlinearpart = clCreateKernel(p_nonlinearpart, "nonlinearpart", &ret);
 
	printf("Got initial data, starting timestepping\n");
  gettimeofday(&tvs, NULL); 
	for(n=0;n<=Tmax;n++){
//linear
	ret = clfftEnqueueTransform(planHandle, CLFFT_FORWARD, 1, &command_queue, 0, NULL, NULL,cl_u, cl_uhat, tmpBufferu);
	ret = clfftEnqueueTransform(planHandle, CLFFT_FORWARD, 1, &command_queue, 0, NULL, NULL,cl_v, cl_vhat, tmpBufferv);
	ret = clFinish(command_queue);
 
        ret = clSetKernelArg(linearpart, 0, sizeof(cl_mem),(void *)&cl_uhat[0]);
        ret = clSetKernelArg(linearpart, 1, sizeof(cl_mem),(void *)&cl_uhat[1]);
        ret = clSetKernelArg(linearpart, 2, sizeof(cl_mem),(void *)&cl_vhat[0]);
        ret = clSetKernelArg(linearpart, 3, sizeof(cl_mem),(void *)&cl_vhat[1]);
	ret = clSetKernelArg(linearpart, 4, sizeof(cl_mem),(void* )&cl_kx);
	ret = clSetKernelArg(linearpart, 5, sizeof(cl_mem),(void* )&cl_ky);
	ret = clSetKernelArg(linearpart, 6, sizeof(cl_mem),(void* )&cl_kz);
	ret = clSetKernelArg(linearpart, 7, sizeof(double),(void* )&dt);
	ret = clSetKernelArg(linearpart, 8, sizeof(double),(void* )&Du);
	ret = clSetKernelArg(linearpart, 9, sizeof(double),(void* )&Dv);
	ret = clSetKernelArg(linearpart, 10, sizeof(double),(void* )&A);
	ret = clSetKernelArg(linearpart, 11, sizeof(double),(void* )&B);
	ret = clSetKernelArg(linearpart, 12, sizeof(double),(void* )&b[0]);
	ret = clSetKernelArg(linearpart, 13, sizeof(double),(void* )&b[1]);
	ret = clSetKernelArg(linearpart, 14, sizeof(int),(void* )&Nx);
	ret = clSetKernelArg(linearpart, 15, sizeof(int),(void* )&Ny);
	ret = clSetKernelArg(linearpart, 16, sizeof(int),(void* )&Nz);
        ret = clEnqueueNDRangeKernel(command_queue, linearpart, 1, NULL, global_work_size, NULL, 0, NULL, NULL);
	ret = clFinish(command_queue);
 
	ret = clfftEnqueueTransform(planHandle, CLFFT_BACKWARD, 1, &command_queue, 0, NULL, NULL,cl_uhat, cl_u, tmpBufferu);
	ret = clfftEnqueueTransform(planHandle, CLFFT_BACKWARD, 1, &command_queue, 0, NULL, NULL,cl_vhat, cl_v, tmpBufferv);
	ret = clFinish(command_queue);    
//nonlinearpart
        ret = clSetKernelArg(nonlinearpart, 0, sizeof(cl_mem),(void *)&cl_u[0]);
        ret = clSetKernelArg(nonlinearpart, 1, sizeof(cl_mem),(void *)&cl_u[1]);
	ret = clSetKernelArg(nonlinearpart, 2, sizeof(cl_mem),(void* )&cl_v[0]);
	ret = clSetKernelArg(nonlinearpart, 3, sizeof(cl_mem),(void* )&cl_v[1]);
	ret = clSetKernelArg(nonlinearpart, 4, sizeof(double),(void* )&dt);
	ret = clSetKernelArg(nonlinearpart, 5, sizeof(double),(void* )&a[0]);
	ret = clSetKernelArg(nonlinearpart, 6, sizeof(double),(void* )&a[1]);
        ret = clEnqueueNDRangeKernel(command_queue, nonlinearpart, 1, NULL, global_work_size, NULL, 0, NULL, NULL);
	ret = clFinish(command_queue);		
// linear part
	ret = clfftEnqueueTransform(planHandle, CLFFT_FORWARD, 1, &command_queue, 0, NULL, NULL,cl_u, cl_uhat, tmpBufferu);
	ret = clfftEnqueueTransform(planHandle, CLFFT_FORWARD, 1, &command_queue, 0, NULL, NULL,cl_v, cl_vhat, tmpBufferv);	
	ret = clFinish(command_queue);
 
        ret = clSetKernelArg(linearpart, 0, sizeof(cl_mem),(void *)&cl_uhat[0]);
        ret = clSetKernelArg(linearpart, 1, sizeof(cl_mem),(void *)&cl_uhat[1]);
        ret = clSetKernelArg(linearpart, 2, sizeof(cl_mem),(void *)&cl_vhat[0]);
        ret = clSetKernelArg(linearpart, 3, sizeof(cl_mem),(void *)&cl_vhat[1]);
	ret = clSetKernelArg(linearpart, 4, sizeof(cl_mem),(void* )&cl_kx);
	ret = clSetKernelArg(linearpart, 5, sizeof(cl_mem),(void* )&cl_ky);
	ret = clSetKernelArg(linearpart, 6, sizeof(cl_mem),(void* )&cl_kz);
	ret = clSetKernelArg(linearpart, 7, sizeof(double),(void* )&dt);
	ret = clSetKernelArg(linearpart, 8, sizeof(double),(void* )&Du);
	ret = clSetKernelArg(linearpart, 9, sizeof(double),(void* )&Dv);
	ret = clSetKernelArg(linearpart, 10, sizeof(double),(void* )&A);
	ret = clSetKernelArg(linearpart, 11, sizeof(double),(void* )&B);
	ret = clSetKernelArg(linearpart, 12, sizeof(double),(void* )&b[0]);
	ret = clSetKernelArg(linearpart, 13, sizeof(double),(void* )&b[1]);
	ret = clSetKernelArg(linearpart, 14, sizeof(int),(void* )&Nx);
	ret = clSetKernelArg(linearpart, 15, sizeof(int),(void* )&Ny);
	ret = clSetKernelArg(linearpart, 16, sizeof(int),(void* )&Nz);
        ret = clEnqueueNDRangeKernel(command_queue, linearpart, 1, NULL, global_work_size, NULL, 0, NULL, NULL);
	ret = clFinish(command_queue);
 
	ret = clfftEnqueueTransform(planHandle, CLFFT_BACKWARD, 1, &command_queue, 0, NULL, NULL,cl_uhat, cl_u, tmpBufferu);
	ret = clfftEnqueueTransform(planHandle, CLFFT_BACKWARD, 1, &command_queue, 0, NULL, NULL,cl_vhat, cl_v, tmpBufferv);
	ret = clFinish(command_queue);
// done
	if(n==plottime){
		printf("time:%lf, step:%d,%d\n",n*dt,n,plotnum);
		plottime=plottime+plotgap;
		plotnum=plotnum+1;
        ret = clEnqueueReadBuffer(command_queue, cl_u[0], CL_TRUE, 0, N * sizeof(double), u[0], 0, NULL, NULL);
        ret = clEnqueueReadBuffer(command_queue, cl_v[0], CL_TRUE, 0, N * sizeof(double), v[0], 0, NULL, NULL);
	ret = clFinish(command_queue);
//output of data U
	char tmp_str[10];
	strcpy(nameconfig,"./data/u");
	sprintf(tmp_str,"%d",10000000+plotnum);
	strcat(nameconfig,tmp_str);
	strcat(nameconfig,".datbin");
	fp=fopen(nameconfig,"wb");
    	if (!fp) {fprintf(stderr, "Failed to write u-data.\n"); exit(1); }
	for(i=0;i<N;i++){fwrite(&u[0][i], sizeof(double), 1, fp);}
    	fclose( fp );	
//V
	strcpy(nameconfig,"./data/v");
	sprintf(tmp_str,"%d",10000000+plotnum);
	strcat(nameconfig,tmp_str);
	strcat(nameconfig,".datbin");
	fp=fopen(nameconfig,"wb");
    	if (!fp) {fprintf(stderr, "Failed to write v-data.\n"); exit(1); }
	for(i=0;i<N;i++){fwrite(&v[0][i], sizeof(double), 1, fp);}
    	fclose( fp );
}
	}
 	gettimeofday(&tve, NULL); 
	printf("Finished time stepping\n");
 	elapsedTime = (tve.tv_sec - tvs.tv_sec) * 1000.0;      // sec to ms
    	elapsedTime += (tve.tv_usec - tvs.tv_usec) / 1000.0;   // us to ms
   	printf("Program took %lfms for execution\n",elapsedTime);
 
 
 
	clReleaseMemObject(cl_u[0]);
	clReleaseMemObject(cl_u[1]);
	clReleaseMemObject(cl_v[0]);
	clReleaseMemObject(cl_v[1]);
	clReleaseMemObject(cl_uhat[0]);
	clReleaseMemObject(cl_uhat[1]);
	clReleaseMemObject(cl_vhat[0]);
	clReleaseMemObject(cl_vhat[1]);
	clReleaseMemObject(cl_kx);
	clReleaseMemObject(cl_ky);
	clReleaseMemObject(cl_kz);
    	ret = clReleaseKernel(frequencies); ret = clReleaseProgram(p_frequencies);
    	ret = clReleaseKernel(linearpart); ret = clReleaseProgram(p_linearpart);
    	ret = clReleaseKernel(nonlinearpart); ret = clReleaseProgram(p_nonlinearpart);
	free(u[0]);
	free(v[0]);
	clReleaseMemObject(tmpBufferu);
	clReleaseMemObject(tmpBufferv);
	/* Release the plan. */
	ret = clfftDestroyPlan(&planHandle);
	/* Release clFFT library. */
	clfftTeardown();
 
	ret = clReleaseCommandQueue(command_queue);
     	ret = clReleaseContext(context);	
	printf("Program execution complete\n");
 
	return 0;
}

Now all the OpenCL-Kernel's, which sould be saved in the same folder. Some of them look very hard to read, but that is due to the lack of a native implementation for complexnumbers.

 

 

 

 

( D)

linearpart.c Code download
#if defined(cl_amd_fp64) || defined(cl_khr_fp64)
    #if defined(cl_amd_fp64)
    #pragma OPENCL EXTENSION cl_amd_fp64 : enable
    #elif defined(cl_khr_fp64)
    #pragma OPENCL EXTENSION cl_khr_fp64 : enable
    #endif
    // function declarations/definitions using double precision doubleing-point arithmetic
#endif
__kernel void linearpart ( __global double* uhatre,__global double* uhatim,__global double* vhatre,__global double* vhatim,__global const double* Kx,__global const double* Ky,__global const double* Kz, const double dt, const double Du,const double Dv, const double A,const double B,const double bhighre,const double bhighim , const int Nx, const int Ny, const int Nz)
{
   const int ind = get_global_id(0);
 
int i,j,k;
k=floor((double)ind/(double)(Ny*Nx));
j=floor((double)(ind-k*(Ny*Nx))/(double)Nx);
i=ind-k*(Ny*Nx)-j*Nx;
double uexp[2];
uexp[0]=dt*bhighre*(-1.0*A+Du*(Kx[i]+Ky[j]+Kz[k]));
uexp[1]=dt*bhighim*(-1.0*A+Du*(Kx[i]+Ky[j]+Kz[k]));
double vexp[2];
vexp[0]=dt*bhighre*(-1.0*B+Dv*(Kx[i]+Ky[j]+Kz[k]));
vexp[1]=dt*bhighim*(-1.0*B+Dv*(Kx[i]+Ky[j]+Kz[k]));
if(ind==0){
double N=(double)Nx*Ny*Nz;
 
uhatre[ind]=exp(uexp[0])*(((uhatre[ind]-N)*cos(uexp[1]))-(uhatim[ind]*sin(uexp[1])))+N;
uhatim[ind]=exp(uexp[0])*(((uhatre[ind]-N)*sin(uexp[1]))+(uhatim[ind]*cos(uexp[1])));
}
 
else{
uhatre[ind]=exp(uexp[0])*((uhatre[ind]*cos(uexp[1]))-(uhatim[ind]*sin(uexp[1])));
uhatim[ind]=exp(uexp[0])*((uhatre[ind]*sin(uexp[1]))+(uhatim[ind]*cos(uexp[1])));
}
vhatre[ind]=exp(vexp[0])*((vhatre[ind]*cos(vexp[1]))-(vhatim[ind]*sin(vexp[1])));
vhatim[ind]=exp(vexp[0])*((vhatre[ind]*sin(vexp[1]))+(vhatim[ind]*cos(vexp[1])));
}


 

 

 

 

( E)

nonlinearpart.c Code download
#if defined(cl_amd_fp64) || defined(cl_khr_fp64)
    #if defined(cl_amd_fp64)
    #pragma OPENCL EXTENSION cl_amd_fp64 : enable
    #elif defined(cl_khr_fp64)
    #pragma OPENCL EXTENSION cl_khr_fp64 : enable
    #endif
    // function declarations/definitions using double precision doubleing-point arithmetic
#endif
__kernel void nonlinearpart ( __global double* ure,__global double* uim,__global double* vre,__global double* vim, const double dt, const double are,const double aim)
{
   const int ind = get_global_id(0);
	const double tol=pown(0.1,12);//0.000000000001;
double chg=1;
const double uoldre=ure[ind];
const double uoldim=uim[ind];
const double voldre=vre[ind];
const double voldim=vim[ind];
double utempre,utempim,vtempre,vtempim;
double uMre,uMim,vMre,vMim;
while(chg>tol){
utempre=ure[ind];
utempim=uim[ind];
vtempre=vre[ind];
vtempim=vim[ind];	
 
uMre=0.5*(ure[ind]+uoldre);
uMim=0.5*(uim[ind]+uoldim);
vMre=0.5*(vre[ind]+voldre);
vMim=0.5*(vim[ind]+voldim);								
 
ure[ind]=uoldre-dt*are*uMre*vMre*vMre+dt*are*uMre*vMim*vMim+2.0*dt*are*uMim*vMre*vMim+2.0*dt*aim*uMre*vMre*vMim+dt*aim*uMim*vMre*vMre-dt*aim*uMim*vMim*vMim;
 
uim[ind]=uoldim-dt*aim*uMre*vMre*vMre+dt*are*uMim*vMim*vMim+2.0*dt*aim*uMim*vMre*vMim-2.0*dt*are*uMre*vMre*vMim-dt*are*uMim*vMre*vMre+dt*aim*uMre*vMim*vMim;
 
vre[ind]=voldre+dt*are*uMre*vMre*vMre-dt*are*uMre*vMim*vMim-2.0*dt*are*uMim*vMre*vMim-2.0*dt*aim*uMre*vMre*vMim-dt*aim*uMim*vMre*vMre+dt*aim*uMim*vMim*vMim;
 
vim[ind]=voldim+dt*aim*uMre*vMre*vMre-dt*are*uMim*vMim*vMim-2.0*dt*aim*uMim*vMre*vMim+2.0*dt*are*uMre*vMre*vMim+dt*are*uMim*vMre*vMre-dt*aim*uMre*vMim*vMim;
 
chg=sqrt((ure[ind]-utempre)*(ure[ind]-utempre)+(uim[ind]-utempim)*(uim[ind]-utempim))+
    sqrt((vre[ind]-vtempre)*(vre[ind]-vtempre)+(vim[ind]-vtempim)*(vim[ind]-vtempim));
}   	
}


 

 

 

 

( F)

grid.c Code download
#if defined(cl_amd_fp64) || defined(cl_khr_fp64)
    #if defined(cl_amd_fp64)
    #pragma OPENCL EXTENSION cl_amd_fp64 : enable
    #elif defined(cl_khr_fp64)
    #pragma OPENCL EXTENSION cl_khr_fp64 : enable
    #endif
    // function declarations/definitions using double precision doubleing-point arithmetic
#endif
__kernel void grid ( __global double* x, const double Lx, const int Nx)
{
  const int ind = get_global_id(0);
	x[ind]=(-1.0 + ((double) 2.0*ind/(double)Nx))*M_PI*Lx;   	
 
}


 

 

 

 

( G)

initialdata.c Code download
#if defined(cl_amd_fp64) || defined(cl_khr_fp64)
    #if defined(cl_amd_fp64)
    #pragma OPENCL EXTENSION cl_amd_fp64 : enable
    #elif defined(cl_khr_fp64)
    #pragma OPENCL EXTENSION cl_khr_fp64 : enable
    #endif
    // function declarations/definitions using double precision doubleing-point arithmetic
#endif
__kernel void initialdata ( __global double* ure, __global double* vre, __global double* uim, __global double* vim, __global const double* x ,__global const double* y ,__global const double* z , const int Nx, const int Ny, const int Nz)
{
   const int ind = get_global_id(0);
 
int i,j,k;
k=floor((double)ind/(double)(Ny*Nx));
j=floor((double)(ind-k*(Ny*Nx))/(double)Nx);
i=ind-k*(Ny*Nx)-j*Nx;
ure[ind]=0.5+exp(-1.0*(x[i]*x[i]+y[j]*y[j]+z[k]*z[k] )-1.0);// 	
vre[ind]=0.1+exp(-1.0*(x[i]*x[i]+y[j]*y[j]+z[k]*z[k] )-1.0);
uim[ind]=0.0;
vim[ind]=0.0;
}


 

 

 

 

( H)

frequencies.c Code download
#if defined(cl_amd_fp64) || defined(cl_khr_fp64)
    #if defined(cl_amd_fp64)
    #pragma OPENCL EXTENSION cl_amd_fp64 : enable
    #elif defined(cl_khr_fp64)
    #pragma OPENCL EXTENSION cl_khr_fp64 : enable
    #endif
    // function declarations/definitions using double precision doubleing-point arithmetic
#endif
__kernel void frequencies ( __global double* kx, const double Lx, const int Nx)
{
   	const int ind = get_global_id(0);
	if ( ind < Nx/2) kx[ind] = -1.0*((double)((ind))/Lx)*((double)( ind)/Lx);
	if ( ind ==Nx/2) kx[ ind]=0.0;
	if ( ind > Nx/2) kx[ ind]=-1.0*(double)(Nx-ind)/Lx*(double)(Nx-ind)/Lx;
}

Here is your makefile! Change the clFFT to your path.

 

 

 

 

( I)

makefile Code download
####### Compiler, tools and options
 
CC            = gcc
CFLAGS        = -m64 -pipe -O2 -Wno-unused-but-set-parameter -Wall
#OPENCL        = /opt/AMDAPP
#CLFFT         = /opt/clFFT
#INC_OPENCL    = -I$(OPENCL)/include 
#INC_CLFFT     = -I$(CLFFT)/include
#INCPATH       = $(INC_OPENCL) $(INC_CLFFT) 
#LIBS_CLFFT    = -L$(CLFFT)/lib64 -lclFFT -L$(OPENCL)/lib/x86_64 -lOpenCL 
DEL_FILE      = rm -f
 
LIBS = -lOpenCL -I../../clFFT/src/package/include/ -L../../clFFT/src/package/lib64 -lclFFT -lm
 
#
 
####### Build rules 
 
CL: grayscottOpenCLs.c
	$(DEL_FILE) grayscottOpenCLs
	$(DEL_FILE) *.o
	$(CC) $(CFLAGS) -o grayscottOpenCLs grayscottOpenCLs.c $(LIBS)
	export LD_LIBRARY_PATH=../../clFFT/src/package/lib64:$$LD_LIBRARY_PATH
	export DISPLAY=:0
 
CL43: grayscottOpenCL43.c
	$(DEL_FILE) grayscottOpenCL43
	$(DEL_FILE) *.o
	$(CC) $(CFLAGS) -o grayscottOpenCL43 grayscottOpenCL43.c $(LIBS)
	export LD_LIBRARY_PATH=../../clFFT/src/package/lib64:$$LD_LIBRARY_PATH
 
CL4k: grayscottOpenCL4k.c
	$(DEL_FILE) grayscottOpenCL4k
	$(DEL_FILE) *.o
	$(CC) $(CFLAGS) -o grayscottOpenCL4k grayscottOpenCL4k.c $(LIBS)
	export LD_LIBRARY_PATH=../../clFFT/src/package/lib64:$$LD_LIBRARY_PATH
 
info: infocl.c
	$(CC) $(CFLAGS) -o infocl infocl.c $(LIBS)
 
out: xdmfcreate.f90
	gfortran -o out xdmfcreate.f90
 
outa: xdmfcreatea.f90
	gfortran -o outa xdmfcreatea.f90
clean: 
	$(DEL_FILE) grayscottOpenCL43
	$(DEL_FILE) grayscottOpenCLs
	$(DEL_FILE) infocl
	$(DEL_FILE) out
	$(DEL_FILE) outa
	$(DEL_FILE) *.xmf
	$(DEL_FILE) *.o

To run you need the INPUTFILE with the parameters.

 

 

 

 

( J)

INPUTFILE Code download
2048
2048
1
4
2
4.0
4.0
4.0
0.010
0.90
0.005
0.038
0.060
 
Nx
Ny
Nt
Plotgap
Lx
Ly
DT
Du
Dv
F
k
To view the data, compile and run xdmfcreate.f90. Open udata.xdmf and vdata.xdmf in ParaView.

 

 

 

 

( K)

xdmfcreate.f90 Code download
	PROGRAM XdmfCreate
        !--------------------------------------------------------------------------------
        ! .. Purpose ..
        ! XdmfCreate is a postprocessing program which creates header files for
        ! Paraview and Visit
        ! It uses the INPUTFILE and assumes that the filenames in the program
        ! are
        ! consistent with those in the current file.
        !
        ! .. PARAMETERS .. INITIALIZED IN INPUTFILE
        ! Nx = power of two, number of modes in the x direction
        ! Ny = power of two, number of modes in the y direction
        ! Nz = power of two, number of modes in the z direction
        ! Nt = the number of timesteps
        ! plotgap = the number of timesteps to take before plotting
        ! Lx = definition of the periodic domain of computation in x direction
        ! Ly = definition of the periodic domain of computation in y direction
        ! Lz = definition of the periodic domain of computation in z direction
        ! Dt = the time step
        !
        ! REFERENCES
        !
        ! www.xdmf.org
        ! Paraview users mailing list
        ! VisIt users mailing list
        !
        ! ACCURACY
        !
        ! ERROR INDICATORS AND WARNINGS
        !
        ! FURTHER COMMENTS
        !------------------------------------------------------------------------------------
        ! EXTERNAL ROUTINES REQUIRED
        IMPLICIT NONE
        ! .. Scalar Arguments ..
        INTEGER(KIND=4) :: Nx, Ny, Nz, Nt, plotgap
        REAL(KIND=8)    :: Lx, Ly, Lz, DT
        ! .. Local scalars ..
        INTEGER(KIND=4) :: stat,plotnum,ind,n,numplots
        REAL(KIND=8), PARAMETER :: pi=3.14159265358979323846264338327950288419716939937510d0
        ! .. Local Arrays ..
        CHARACTER*50    :: InputFileName, OutputFileName, OutputFileName2
        CHARACTER*10    :: number_file
        InputFileName='INPUTFILE'
        OPEN(unit=11,FILE=trim(InputFileName),status="OLD")
        REWIND(11)
        READ(11,*) Nx, Ny, Nz, Nt, plotgap, Lx, Ly, Lz, DT
        CLOSE(11)       
        plotnum=0
        numplots=1+Nt/plotgap
        OutputFileName='udata.xmf'
        OPEN(unit=11,FILE=trim(OutputFileName),status="UNKNOWN")
        REWIND(11)
        WRITE(11,'(a)') "<?xml version=""1.0"" ?>"
        WRITE(11,'(a)') "<!DOCTYPE Xdmf SYSTEM ""Xdmf.dtd"" []>"
        WRITE(11,'(a)') "<Xdmf xmlns:xi=""http://www.w3.org/2001/XInclude"" Version=""2.0"">"
        WRITE(11,'(a)') "<Domain>"
        WRITE(11,'(a)') " <Topology name=""topo"" TopologyType=""3DCoRectMesh"""
        WRITE(11,*) "  Dimensions=""",Nx,Ny,Nz,""">"
        WRITE(11,'(a)') " </Topology>"
        WRITE(11,'(a)') " <Geometry name=""geo"" Type=""ORIGIN_DXDYDZ"">"
        WRITE(11,'(a)') "  <!-- Origin -->"
        WRITE(11,'(a)') "  <DataItem Format=""XML"" Dimensions=""3"">"
        WRITE(11,*) -pi*Lx, -pi*Ly, -pi*Lz 
        WRITE(11,'(a)') "  </DataItem>"
        WRITE(11,'(a)') "  <!-- DxDyDz -->"
        WRITE(11,'(a)') "  <DataItem Format=""XML"" Dimensions=""3"">"
        WRITE(11,*) REAL(2.0*pi*Lx/Nx,kind(0d0)), REAL(2.0*pi*Ly/Ny,kind(0d0)), REAL(2.0*pi*Lz/Nz,kind(0d0))  
        WRITE(11,'(a)') "   </DataItem>"
        WRITE(11,'(a)') "  </Geometry>"
        WRITE(11,'(a)')
        WRITE(11,'(a)') "       <Grid Name=""TimeSeries"" GridType=""Collection"" CollectionType=""Temporal"">"
        WRITE(11,'(a)') "               <Time TimeType=""HyperSlab"">"
        WRITE(11,'(a)') "                       <DataItem Format=""XML"" NumberType=""Float"" Dimensions=""3"">"
        WRITE(11,*) "                   0.0", dt," 2" 
        WRITE(11,'(a)') "                       </DataItem>"
        WRITE(11,'(a)') "               </Time>"
 
        DO n=1,numplots
 
                OutputFileName = 'data/u'
                ind = index(OutputFileName,' ') - 1
                WRITE(number_file,'(i0)') 10000000+plotnum
                OutputFileName = OutputFileName(1:ind)//number_file
                ind = index(OutputFileName,' ') - 1
                OutputFileName = OutputFileName(1:ind)//'.datbin'
 
                OutputFileName2 = "<Grid Name=""T"
                WRITE(number_file,'(i0)') n 
                OutputFileName2 = OutputFileName2(1:13)//number_file
                ind =  index(number_file,' ') - 1
                OutputFileName2 = OutputFileName2(1:13+ind)//'" GridType="Uniform">'
                plotnum=plotnum+1
 
                WRITE(11,'(a)')
                WRITE(11,'(3X,a)') "   ",OutputFileName2
                WRITE(11,'(a)') "       <Topology Reference=""/Xdmf/Domain/Topology[1]""/>"
                WRITE(11,'(a)') "       <Geometry Reference=""/Xdmf/Domain/Geometry[1]""/>"
                WRITE(11,'(a)') "        <Attribute Name=""Displacement"" Center=""Node"">"
                WRITE(11,'(a)') "               <DataItem Format=""Binary"" "
                WRITE(11,'(a)') "                DataType=""Float"" Precision=""8"" Endian=""Little"" "
                WRITE(11,*) "            Dimensions=""",Nx, Ny, Nz,""">"
                WRITE(11,'(16X,a)') "           ",OutputFileName
                WRITE(11,'(a)') "               </DataItem>"
                WRITE(11,'(a)') "       </Attribute>"
                WRITE(11,'(3X,a)') "</Grid>"
        END DO
        WRITE(11,'(a)')"        </Grid>"
        WRITE(11,'(a)') "</Domain>"
        WRITE(11,'(a)') "</Xdmf>"
        CLOSE(11)
		!same for v
        plotnum=0
        numplots=1+Nt/plotgap
        OutputFileName='vdata.xmf'
        OPEN(unit=11,FILE=trim(OutputFileName),status="UNKNOWN")
        REWIND(11)
        WRITE(11,'(a)') "<?xml version=""1.0"" ?>"
        WRITE(11,'(a)') "<!DOCTYPE Xdmf SYSTEM ""Xdmf.dtd"" []>"
        WRITE(11,'(a)') "<Xdmf xmlns:xi=""http://www.w3.org/2001/XInclude"" Version=""2.0"">"
        WRITE(11,'(a)') "<Domain>"
        WRITE(11,'(a)') " <Topology name=""topo"" TopologyType=""3DCoRectMesh"""
        WRITE(11,*) "  Dimensions=""",Nx,Ny,Nz,""">"
        WRITE(11,'(a)') " </Topology>"
        WRITE(11,'(a)') " <Geometry name=""geo"" Type=""ORIGIN_DXDYDZ"">"
        WRITE(11,'(a)') "  <!-- Origin -->"
        WRITE(11,'(a)') "  <DataItem Format=""XML"" Dimensions=""3"">"
        WRITE(11,*) -pi*Lx, -pi*Ly, -pi*Lz 
        WRITE(11,'(a)') "  </DataItem>"
        WRITE(11,'(a)') "  <!-- DxDyDz -->"
        WRITE(11,'(a)') "  <DataItem Format=""XML"" Dimensions=""3"">"
        WRITE(11,*) REAL(2.0*pi*Lx/Nx,kind(0d0)), REAL(2.0*pi*Ly/Ny,kind(0d0)), REAL(2.0*pi*Lz/Nz,kind(0d0))  
        WRITE(11,'(a)') "   </DataItem>"
        WRITE(11,'(a)') "  </Geometry>"
        WRITE(11,'(a)')
        WRITE(11,'(a)') "       <Grid Name=""TimeSeries"" GridType=""Collection"" CollectionType=""Temporal"">"
        WRITE(11,'(a)') "               <Time TimeType=""HyperSlab"">"
        WRITE(11,'(a)') "                       <DataItem Format=""XML"" NumberType=""Float"" Dimensions=""3"">"
        WRITE(11,*) "                   0.0", dt," 2" 
        WRITE(11,'(a)') "                       </DataItem>"
        WRITE(11,'(a)') "               </Time>"
 
        DO n=1,numplots
 
                OutputFileName = 'data/v'
                ind = index(OutputFileName,' ') - 1
                WRITE(number_file,'(i0)') 10000000+plotnum
                OutputFileName = OutputFileName(1:ind)//number_file
                ind = index(OutputFileName,' ') - 1
                OutputFileName = OutputFileName(1:ind)//'.datbin'
 
                OutputFileName2 = "<Grid Name=""T"
                WRITE(number_file,'(i0)') n 
                OutputFileName2 = OutputFileName2(1:13)//number_file
                ind =  index(number_file,' ') - 1
                OutputFileName2 = OutputFileName2(1:13+ind)//'" GridType="Uniform">'
                plotnum=plotnum+1
 
                WRITE(11,'(a)')
                WRITE(11,'(3X,a)') "   ",OutputFileName2
                WRITE(11,'(a)') "       <Topology Reference=""/Xdmf/Domain/Topology[1]""/>"
                WRITE(11,'(a)') "       <Geometry Reference=""/Xdmf/Domain/Geometry[1]""/>"
                WRITE(11,'(a)') "        <Attribute Name=""Displacement"" Center=""Node"">"
                WRITE(11,'(a)') "               <DataItem Format=""Binary"" "
                WRITE(11,'(a)') "                DataType=""Float"" Precision=""8"" Endian=""Little"" "
                WRITE(11,*) "            Dimensions=""",Nx, Ny, Nz,""">"
                WRITE(11,'(16X,a)') "           ",OutputFileName
                WRITE(11,'(a)') "               </DataItem>"
                WRITE(11,'(a)') "       </Attribute>"
                WRITE(11,'(3X,a)') "</Grid>"
        END DO
        WRITE(11,'(a)')"        </Grid>"
        WRITE(11,'(a)') "</Domain>"
        WRITE(11,'(a)') "</Xdmf>"
        CLOSE(11)
        END PROGRAM XdmfCreate
  1. Information about the equations can be found at http://mrob.com/pub/comp/xmorphia/#formula.
  2. http://www.mathworks.com/matlabcentral/fileexchange/22940-vol3d-v2
  3. http://developer.amd.com/tools-and-sdks/opencl-zone/amd-accelerated-parallel-processing-app-sdk/
  4. https://github.com/clMathLibraries/clFFT