Neuroimaging Data Processing/Data Quality
Discarding first volumes
The signal in the first volumes of an EPI sequence can be off-scale because longitudinal magnetization has not yet reached steady state. These volumes are therefore often discarded and not used for analysis. A visual inspection of voxel time series gives a good idea of which volumes are affected.
Keep in mind that discarding the first volumes will have an effect on the number of volumes in your scan (sounds trivial) as well as on the timing information in stimulus onset files and physiological parameter files.
3dTcat is AFNI's program for discarding the first volumes. It copies the original data, leaving out the specified volumes. A simple command could look like this:
3dTcat [options] -prefix OUTPUTFILE 'INPUTFILE[2..$]'
which in this case writes all but the first two volumes (0 and 1) of the original data (INPUTFILE) into the output dataset. This step is typically used to copy the part of the raw data that is of interest into the results directory for further preprocessing, so that the original data remains untouched. Note that volume numbers and timing information might have to be adjusted accordingly.
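As a minimal sketch of the timing adjustment: if the first n volumes are dropped, every stimulus onset measured from the original scan start moves earlier by n × TR. The helper below is hypothetical (shift_onsets is not an AFNI program) and assumes a plain text input with one onset in seconds per line; the number of discarded volumes and the TR are example values.

```shell
#!/bin/sh
# shift_onsets N_DISCARD TR
# Reads onsets (seconds, one per line) on stdin and subtracts N_DISCARD * TR.
# Hypothetical helper for illustration, not part of AFNI.
shift_onsets() {
    awk -v n="$1" -v tr="$2" 'NF { print $1 - n * tr }'
}

# Example: 2 volumes discarded at TR = 2 s, so every onset moves 4 s earlier.
printf '0\n12\n30.5\n' | shift_onsets 2 2
```

A negative result marks an event that happened before the first retained volume; whether such events are kept or dropped depends on the design.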
3dTcat can also be used for linear detrending using the -rlt option. Check the manual page for more details and options.
In afni_proc.py this is a default step, but with the number of TRs to remove set to 0. The respective option to remove the first n TRs is
Signal outliers, e.g. due to excessive head movement, can seriously affect the analysis. Motion correction algorithms are not built to deal with these. It therefore makes sense to inspect your data before starting the actual preprocessing and to get rid of signal outliers right away. You can plot voxel time series and check in different parts of the brain whether they show huge peaks, but there are also automated methods that cover all voxels (see below). Another good start is to watch your volumes in a fast movie-like sequence, because big movements between two images will become apparent.
When outliers are detected, they should be dealt with. Removing whole volumes that show outliers would break the time series, complicating the analysis of signal time series and impairing the Fourier transformation used in later steps. Therefore, the affected data points are instead interpolated from neighbouring data.
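To illustrate why interpolation keeps the series intact, here is a minimal sketch that replaces one flagged time point with the mean of its two neighbours, so the number of samples stays the same. This is plain linear interpolation for illustration only and not the method used by any AFNI program; interpolate_point is a hypothetical helper and indices are 0-based.

```shell
#!/bin/sh
# interpolate_point INDEX
# Reads one value per line on stdin and replaces the value at 0-based INDEX
# with the mean of its neighbours. An edge point is copied from its single
# neighbour. Hypothetical helper, for illustration only.
interpolate_point() {
    awk -v idx="$1" '
        { v[NR - 1] = $1 }
        END {
            n = NR
            if (idx + 0 >= 0 && idx + 0 < n && n > 1) {
                if (idx + 0 == 0)          v[0] = v[1]
                else if (idx + 0 == n - 1) v[n - 1] = v[n - 2]
                else                       v[idx + 0] = (v[idx - 1] + v[idx + 1]) / 2
            }
            for (i = 0; i < n; i++) print v[i]
        }'
}

# Example: the spike at index 2 (value 250) becomes (101 + 103) / 2 = 102.
printf '100\n101\n250\n103\n' | interpolate_point 2
```

The output still has four values, so later steps that rely on evenly spaced samples are not disturbed.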
3dToutcount calculates the number of outliers in each volume and writes the results into a file. The outlier limit is set automatically as a number of MADs (median absolute deviations) from the trend, adjusted for the number of TRs in the dataset. A typical limit is about 3.5*MAD distance to the trend. It makes sense to detrend the time series before looking for outliers. This can be done using the -polort nn option to detrend with a polynomial of order nn (the order is based on the duration of the first run: 1 + floor(duration/150 sec)) and the -legendre option to use Legendre polynomials (allowing for polort > 3). For example:
3dToutcount -automask -fraction -polort 3 -legendre INPUTFILE > OUTCOUNTFILE.1D
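The polynomial order rule quoted above, 1 + floor(duration/150 sec), can be worked out with a short helper. This is only a sketch of the arithmetic; polort_for_run is a hypothetical name, and the run length and TR are example values.

```shell
#!/bin/sh
# polort_for_run N_TR TR_SECONDS
# Prints 1 + floor(duration / 150 s), with duration = N_TR * TR_SECONDS.
# Hypothetical helper, for illustration only.
polort_for_run() {
    awk -v ntr="$1" -v tr="$2" 'BEGIN { print 1 + int(ntr * tr / 150) }'
}

# Example: 200 volumes at TR = 2 s last 400 s, so 1 + floor(400 / 150) = 3.
polort_for_run 200 2
```

With these example values the result matches the -polort 3 used in the command above.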
OUTCOUNTFILE.1D will thus contain the fraction of voxels per volume (within the automask) that exceed the outlier limits after 3rd-degree Legendre polynomial detrending. To check for excessive outliers you can use:
1deval -a OUTCOUNTFILE.1D -expr 't*step(a-0.03)' | grep -v '^ *0$'
returning all timepoints with more than 3% outliers, or:
for visual inspection. In the example plot you see a considerable fraction of outliers in the 222nd volume, which is also the one found by the 1deval command. After using 3dDespike (see below), this outlier will be much reduced.
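As a cross-check on the thresholding step, the same kind of list can be produced with standard tools. The sketch below assumes the outlier file holds one fraction per line (as written with -fraction) and prints the 0-based index of every volume above a threshold. The name flag_outlier_trs and the example values are hypothetical.

```shell
#!/bin/sh
# flag_outlier_trs FILE [THRESHOLD]
# Prints the 0-based index of each line in FILE whose value exceeds THRESHOLD
# (default 0.03). Blank lines and lines starting with '#' are skipped.
# Hypothetical helper, for illustration only.
flag_outlier_trs() {
    awk -v thr="${2:-0.03}" '
        NF == 0 || $1 ~ /^#/ { next }
        { if ($1 + 0 > thr + 0) print idx + 0; idx++ }
    ' "$1"
}

# Example with made-up fractions: volumes 2 and 4 exceed 3%.
tmpfile=$(mktemp)
printf '0.001\n0.002\n0.12\n0.004\n0.031\n' > "$tmpfile"
flag_outlier_trs "$tmpfile" 0.03
rm -f "$tmpfile"
```

Unlike the 't*step(a-0.03)' expression, which evaluates to 0 at t = 0 whatever the outlier fraction is (assuming the time axis starts at 0), this version can also report volume 0.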
3dDespike actually removes spikes from the 3D+time input dataset and writes a new dataset in which the spike values are replaced to fit a smooth curve. The spike cut values can be set via the option -cut c1 c2, where c1 is the threshold value of s for a 'spike' [default c1=2.5] and c2 is the upper range of the allowed deviation from the curve (s=[c1..infinity) is mapped to s'=[c1..c2) [default c2=4]). The order of the fit curve can be adjusted via -corder. Though 3dDespike can be run without visually checking for outliers, it is advisable to do so before and after despiking, to keep track of your data and detect possible oddities at the stage where they first occur.
3dDespike [options] INPUTFILE
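The effect of -cut c1 c2 can be sketched numerically. According to the 3dDespike help text, values with s > c1 are replaced by s' = c1 + (c2 - c1) * tanh((s - c1) / (c2 - c1)), which compresses [c1..infinity) into [c1..c2). The helper below is hypothetical and only evaluates this formula for a given s; it does not perform the curve fit from which 3dDespike derives s.

```shell
#!/bin/sh
# despike_map S [C1] [C2]
# Prints the compressed deviation s' for a normalized deviation S, using
# s' = c1 + (c2 - c1) * tanh((s - c1) / (c2 - c1)) for s > c1.
# Defaults follow the documented 3dDespike defaults (c1 = 2.5, c2 = 4).
# Hypothetical helper, for illustration only.
despike_map() {
    awk -v s="$1" -v c1="${2:-2.5}" -v c2="${3:-4}" 'BEGIN {
        s += 0; c1 += 0; c2 += 0
        if (s <= c1) { printf "%.4f\n", s; exit }
        x = (s - c1) / (c2 - c1)
        # tanh written with exp; for large x it equals 1 to machine precision
        th = (x > 20) ? 1 : (exp(2 * x) - 1) / (exp(2 * x) + 1)
        printf "%.4f\n", c1 + (c2 - c1) * th
    }'
}

despike_map 2.0   # below c1: left unchanged
despike_map 3.0   # just above c1: pulled slightly toward c1
despike_map 10    # far above c1: squeezed to just under c2
```

Because tanh saturates at 1, even an extreme spike ends up at most c2 units from the fitted curve.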
When running the outlier detection again on the despiked data, you can see whether the outliers have been removed. For example, the same plot as above after despiking shows that the outlier has been reduced a lot (notice the different y-range).
In afni_proc.py a despiking block can be included (but is not by default)
It is also possible to remove outliers in the regression; however, as far as I understand, this will actually remove the respective volume and thus lead to the problems mentioned above. Censoring TRs with more than n% outliers can be achieved by