SPM/Faster SPM

From Wikibooks, the open-content textbooks collection

[edit] Optimizations

SPM Benchmarks

[edit] MatLab Optimizations

Disable the Java VM by starting MATLAB with the -nojvm option:

matlab -nojvm


[edit] Enabling multithreading on Matlab >=7.0.1

MATLAB R14 Service Pack 1 (also known as 7.0.1) and later versions use the Intel Math Kernel Library (MKL). Threading is disabled by default. You can enable it if your processor supports hyperthreading or you have more than one physical processor core.

http://www.mathworks.com/access/helpdesk/help/techdoc/rn/math_new.html#1001367
http://developer.intel.com/software/products/mkl/docs/mklusel.htm#Using%20MKL%20Parallelism
http://www.mathworks.com/support/solutions/data/1-V63VS.html?solution=1-V63VS
http://www.mathworks.com/support/solutions/data/1-ZGD1M.html?solution=1-ZGD1M

However, you should be aware that some users have experienced fatal problems with this setting. On the system described below, the following steps cause MATLAB R14SP2 to crash:

System specs

Hyperthreaded Pentium 4
Gentoo Linux 2.6.10-r4
gcc 3.3.5
MATLAB 7.0.4.352 (R14) Service Pack 2

in bash

export OMP_NUM_THREADS=2
matlab -nojvm

in MATLAB

n = 32;
A=zeros(n,n,n);
B=zeros(n,n,n);
C=zeros(n,n,n);
A(:,1,:) = B(:,:,1)*squeeze(C(:,1,:));

Error message

OMP abort: Unable to set worker thread stack size to 4195328 bytes
Try reducing KMP_STACKSIZE or increasing the shell stack limit.


We contacted the MathWorks, who told us that the environment variable OMP_NUM_THREADS is not officially supported on the Linux platform. It should be set to 1 (or not set at all) in order to use MATLAB 7.0.4 (R14SP2) without any thread-related errors.
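
One way to follow this advice without editing your shell profile is to launch MATLAB through a small wrapper that forces the variable to 1 for that one command. This is only a sketch under our own assumptions: run_single_threaded is a hypothetical helper name, and it assumes "matlab" is on your PATH.

```shell
# Run a command with OMP_NUM_THREADS forced to 1 for that command only.
# run_single_threaded is a hypothetical helper name, not part of MATLAB.
run_single_threaded() {
    env OMP_NUM_THREADS=1 "$@"
}

# Example (assumes "matlab" is on your PATH):
#   run_single_threaded matlab -nojvm
```

Because the variable is set through env, the value in the calling shell is left untouched.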

[edit] Install new BLAS

Install the latest Basic Linear Algebra Subprograms (BLAS) for your system. In some cases (e.g. processing NaNs on a Pentium 4) you can expect a 100-fold speed increase just by updating your BLAS! You have several options for where to get your BLAS from; ATLAS and the Intel MKL are covered below.

Links: MATLAB 7.0.1 maths features, configuring MKL, MKL discussion forum, MathWorks page on BLAS

[edit] Notes on compiling ATLAS

  • Make sure you have write permissions to the ATLAS directory after you've untarred it (chown root *)
  • On my hyperthreaded P4, I couldn't get ATLAS to compile. This was because the script from the CBU site had entered my CPU clock speed twice: the system appears as dual processor, so the command "grep "MHz" /proc/cpuinfo | gawk 'BEGIN { FS = " " } {print $4 }'" returns two numbers. Here's one way to get the compile to work on a hyperthreaded or SMP system:
    1. Follow the guide on the CBU site. Stop before running "make"
    2. Edit your Make.Linux_P4_SSE2* file. Find "PentiumCPS". You should see your clock speed entered twice on that line. Delete one of the entries.
  • When compiling the mex files, make sure "mex" is in your path (i.e. make sure /usr/local/bin is in your path - it should be, but it might not be if you're running inside su)
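
The duplicated clock speed comes from the command printing one line per logical CPU. As a rough sketch (not the CBU script itself), the variant below stops after the first match so that only one number comes back. first_cpu_mhz is a hypothetical helper name, and the "cpu MHz" line format is assumed to match the Linux /proc/cpuinfo layout.

```shell
# Print the clock speed of the first CPU listed in a cpuinfo-style file.
# first_cpu_mhz is a hypothetical helper; the default path is Linux-specific.
first_cpu_mhz() {
    # Lines look like "cpu MHz : 2992.522", so the value is field 4.
    awk '/^cpu MHz/ { print $4; exit }' "${1:-/proc/cpuinfo}"
}

# Example: first_cpu_mhz            (reads /proc/cpuinfo)
#          first_cpu_mhz some_file  (reads a saved copy)
```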

[edit] Installing the Intel Math Kernel Library

  • Download the free non-commercial version of the MKL
  • Extract the files ("tar -xvf filename" on Linux)
  • Linux: make sure the drive is mounted with exec permissions (type "mount" if you're not sure - if the drive is listed as "noexec" then you'll get a "permission denied" error when you try to run the install program)
  • ./install
  • Now copy your MKL files to the matlab directory. I do this by typing something like:
cp /opt/intel/mkl72/lib/32/* /usr/local/matlab/bin/glnx86/

(yes, you do need to copy everything)

  • Next, tell MATLAB to use the relevant .so file by editing $MATLAB/bin/glnx86/blas.spec (read the MKL documentation to find out which .so file you need)
  • Finally, read the MKL documentation on configuring for maximum speed (e.g. turning on threading)
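
The "noexec" check in the steps above can be scripted rather than read by eye. The sketch below assumes the Linux "device on mountpoint type fs (options)" layout of mount output and mount points without spaces; is_noexec is a hypothetical helper name and /mnt/cdrom is only an example path.

```shell
# Read `mount`-style lines on stdin and succeed if the given mount point
# carries the noexec option. is_noexec is a hypothetical helper name.
is_noexec() {
    awk -v mp="$1" '
        $2 == "on" && $3 == mp {
            opts = $NF                 # e.g. "(ro,noexec,nosuid)"
            gsub(/[()]/, "", opts)     # strip the surrounding parentheses
            if (index("," opts ",", ",noexec,")) found = 1
        }
        END { exit !found }'
}

# Example (the mount point is an assumption):
#   mount | is_noexec /mnt/cdrom && echo "remount with exec before ./install"
```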
[edit] Installing MKL on Gentoo Linux

Gentoo doesn't use RPM, and Portage only has an ancient version of MKL.

  • emerge rpm
  • Download the free non-commercial version of the MKL
  • Extract the files ("tar -xvf filename" on Linux)
  • Make sure the drive is mounted with exec permissions (type "mount" if you're not sure - if the drive is listed as "noexec" then you'll get a "permission denied" error when you try to run the install program)
  • ./install
  • The install will fail but should tell you where all the install files are. Change to this directory
  • Find the .rpm file (let's call it mkl????.rpm for now)
  • rpm -i --nodeps mkl????.rpm
  • Now copy your MKL files to the matlab directory. I do this by typing:
cp /opt/intel/mkl72/lib/32/* /usr/local/matlab/bin/glnx86/

(yes, you do need to copy everything)

  • Next, tell MATLAB to use the relevant .so file by editing $MATLAB/bin/glnx86/blas.spec
  • Finally, read the MKL documentation on configuring for maximum speed (e.g. turning on threading)
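
Copying everything into the MATLAB directory may overwrite libraries that shipped with MATLAB. As a precaution that is our own suggestion rather than part of the original steps, the sketch below saves each file it is about to replace as name.orig before copying. copy_with_backup is a hypothetical helper name.

```shell
# Copy every regular file from src into dst, first saving any file it
# would overwrite as <name>.orig. copy_with_backup is a hypothetical helper.
copy_with_backup() {
    src=$1
    dst=$2
    for f in "$src"/*; do
        [ -f "$f" ] || continue
        name=$(basename "$f")
        # Keep only the first backup, so re-running does not clobber it.
        if [ -e "$dst/$name" ] && [ ! -e "$dst/$name.orig" ]; then
            cp -p "$dst/$name" "$dst/$name.orig" || return 1
        fi
        cp -p "$f" "$dst/$name" || return 1
    done
}

# Example, using the paths from the steps above:
#   copy_with_backup /opt/intel/mkl72/lib/32 /usr/local/matlab/bin/glnx86
```

To undo the change, copy each .orig file back over its namesake.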

[edit] SPM Optimizations

[edit] MAXMEM

Set your memory limit in spm_defaults.m:

defaults.stats.maxmem   = 2^30;
  • 2^30 = 1 GByte
  • 2^29 = 512 MBytes
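
If you want a value other than the two listed, the byte count for any power of two can be checked from the shell before you put it in spm_defaults.m. This is a small sketch; pow2_bytes is a hypothetical helper name.

```shell
# Print 2^n as a byte count. pow2_bytes is a hypothetical helper name.
pow2_bytes() {
    echo $((1 << $1))
}

pow2_bytes 30   # 1073741824 bytes, i.e. 1 GByte
pow2_bytes 29   # 536870912 bytes, i.e. 512 MBytes
```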

[edit] Misc

[edit] Operating system optimizations

[edit] Linux

  • Configure your kernel for your system

[edit] Windows

  • Matlab works best if Windows XP is running in "classic mode"

[edit] Mac OS X

[edit] Making use of your Graphics Processing Unit (GPU)

Modern graphics cards have an enormous amount of processing power which could be harnessed for doing scientific calculations, especially with the arrival of PCI Express which allows very fast full-duplex communication between the GPU and CPU. Efficient use of the GPU could give speed-ups on the order of 15 times. It's something that will happen - the question is when. For example, a 3GHz P4 has a theoretical performance of 6 GFLOPS whilst 40 GFLOPS has been observed for the GeForce 6800 Ultra.
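
For reference, the two GFLOPS figures quoted above give a ratio of about 6.7. Note that this compares an observed GPU figure with a theoretical CPU peak, so it is only a rough indication. The sketch below does the arithmetic; speedup is a hypothetical helper name.

```shell
# Ratio of two throughput figures, to one decimal place.
# speedup is a hypothetical helper name.
speedup() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.1f\n", fast / slow }'
}

speedup 40 6   # prints 6.7
```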

Matlab users can use the GPU via the Jacket Software created by AccelerEyes. AccelerEyes is currently offering the beta version of their Jacket Software free of charge to interested developers. They may be contacted via their website at http://www.accelereyes.com .

Links:

[edit] Clusters and parallel processing

[edit] SPM-specific tools

[edit] pSPM

Parallel SPM can be downloaded from http://prdownloads.sourceforge.net/parallelspm/

It implements realignment, slice-timing correction, normalization, smoothing, and statistics via the PSPM interface. Running SPM in parallel significantly reduces processing time on systems with multiple processors or workstation clusters (as MATLAB by default can only use one processor).

Differences between v1 and v2:

  • added fully parallelized statistics estimation
  • added a -nodisplay option to suppress graphics output
  • the coregistration option outputs a plot (spm2.ps) just like SPM normally does
  • added Windows support: you should be able to compile in Cygwin or MSVC (or whatever your favorite Windows compiler is). Refer to README.windows within the source distribution
  • fixed a bug in slice timing correction
  • a few minor user interface changes
  • better error handling
  • a few utility files to test the PSPM package

So for those keeping a running tally, here's what is currently parallelized:

  1. coregistration and reslicing
  2. slice timing correction
  3. applying normalization parameters to files (NOT estimation)
  4. smoothing
  5. full stats estimation

Once you have installed the package, use the PSPM_test_dir script to compare the output from parallel processing to regular uni-process SPM processing. When you run PSPM_test_dir it will ask you to select two directories. It will then proceed to compare all the image files in the two directories with the same name, and provide a report regarding discrepancies. There is also a PSPM_compare_struct script which will compare (element by element) two structures in MATLAB to see if they are identical. This might also be useful.

At present, stats estimation produces a slight discrepancy of ~10^-12 per image. This seems to be due to floating-point precision issues. I have not seen this affect the results in any way, but I'm still looking into it.
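
For a quick first pass outside MATLAB, same-named files in two output directories can also be compared byte for byte from the shell. The sketch below is not the PSPM_test_dir script and does not read image values; compare_dirs is a hypothetical helper name. Because it tests exact identity, it will flag the tiny numerical discrepancies in the stats output mentioned above as differences.

```shell
# For every regular file in dir_a that has a same-named file in dir_b,
# report whether the two are byte-identical. compare_dirs is a
# hypothetical helper name, not part of the PSPM package.
compare_dirs() {
    dir_a=$1
    dir_b=$2
    status=0
    for f in "$dir_a"/*; do
        [ -f "$f" ] || continue
        name=$(basename "$f")
        [ -f "$dir_b/$name" ] || continue   # skip files with no counterpart
        if cmp -s "$f" "$dir_b/$name"; then
            echo "same: $name"
        else
            echo "DIFF: $name"
            status=1
        fi
    done
    return $status
}

# Example (directory names are assumptions):
#   compare_dirs results_serial results_parallel
```

The function returns non-zero if any pair differs, so it can be used in a script to decide whether the MATLAB comparison is worth running.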

[edit] General tools for making MatLab parallel

[edit] Intel's cluster Math Kernel Library

[edit] Sun Grid Engine

The SGE is an open-source project sponsored by Sun Microsystems.

What you need:

[edit] FAQs

[edit] With hyper-threading enabled, I only get 50% CPU utilisation

I wouldn't worry about it.

With hyperthreading enabled, XP believes that you've got two processors. By "50% utilisation" XP actually means that one processor is at full utilisation whilst the other is idle. This occurs because, as others have said, Matlab is only single threaded and so can only use one processor. That sounds less than optimal, doesn't it?

It's not a problem because HyperThreading *isn't* the same as having two processors. HyperThreading is a clever way to keep the P4's massive pipeline full by allowing a single physical core to run two threads. But even when the system is running optimally (i.e. you're running more than one thread), a single HT processor is nowhere near the speed of a true SMP setup. And there's little evidence that HT slows down single-threaded applications. In short: XP is lying to you - your CPU *is* at 100% utilisation for a single-threaded app (which begs the question: if 50% in XP = 100% in reality then does 100% in XP = 200% in reality? To which the answer is no!)
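
The 50% figure is simply one busy thread divided across the number of logical processors the operating system reports. A one-line sketch of that arithmetic (single_thread_share is a hypothetical helper name):

```shell
# Percentage reported when one thread is busy on n logical processors.
# single_thread_share is a hypothetical helper name.
single_thread_share() {
    echo $((100 / $1))
}

single_thread_share 2   # 50: one logical CPU busy, the other idle
```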

To be honest, I doubt you'll see any improvement by turning off hyperthreading (and, in fact, you might find that XP refuses to boot if you disable HT in the BIOS). And leaving HT on allows XP to run some other processes more efficiently whilst Matlab is running.

Here's what I suggest you do:

If you've got some time on your hands, benchmark your existing system, then turn off hyperthreading in the BIOS and benchmark it again. My gut feeling is that you won't see much difference, but I could be wrong.

If you've got less time then just don't worry about it. Your expensive CPU is being pushed as hard as it can be pushed. XP is fibbing to you when it says "50% utilisation".

For some benchmarks and some more theory on hyperthreading, have a look at these two links:

http://www6.tomshardware.com/cpu/20021114/index.html

http://www6.tomshardware.com/cpu/20021202/index.html