\chapter{Descriptive statistics}

Before going into explicit hypothesis tests, it is often useful to get a general feel for what the data looks like; this is where descriptive statistics come in. The most common operations are computing the mean and the variance or standard deviation at every voxel. If the data is inherently divided into groups, such as patients and controls or, in our example dataset, males versus females, then the descriptive statistics can also be grouped by those variables.

To start, we can look at the mean Jacobian determinant at every voxel of all the data combined:

<<>>=
library(RMINC)
overall.mean <- mincMean(gf$Filename)
overall.mean
@

The \texttt{mincMean} function computes the mean at every voxel of a set of filenames specified as an argument. In this case the output is assigned to the \texttt{overall.mean} variable. Repeating the variable name in the R session, as done above, causes a summary to be printed.

One thing to note about the R syntax above: the dollar symbol is used to access a specific column inside a data frame. Recall that the \texttt{gf} variable was read in from the comma-separated values file which describes the dataset. Each column in \texttt{gf} has a name, and that name can be used after the dollar symbol to access the column. Here are some examples, first showing the entire contents of \texttt{gf} and then two separate columns alone:

<<>>=
gf
gf$Filename
gf$Gender
@

So if, in the text file that describes the dataset, the column containing all the filenames was called ``jacobians'', then the \texttt{mincMean} command would have been \texttt{mincMean(gf\$jacobians)}. If an incorrect column is specified, i.e. something which does not contain filenames, then you should receive an error.

\section{Writing results to file}

Once the means at every voxel have been computed, they can be written to file.
This is done with the command below:

<<>>=
mincWriteVolume(overall.mean, "overall-mean.mnc")
@

The \texttt{mincWriteVolume} command takes two arguments in the above example: the variable containing the data, and a string giving the filename to which the data should be written. This MINC file can then be read and viewed with the standard MINC tools such as mincinfo, register, Display, etc.

\section{Creating summaries by group}

Most often we are more interested in how the means break down by the groups in this dataset. This can be done by adding another argument to the \texttt{mincMean} call:

<<>>=
group.means <- mincMean(gf$Filename, gf$Gender)
group.means
@

The {\em Gender} variable has two levels: {\em Male} and {\em Female}. The call above therefore computes the mean over all subjects in each group, giving one column of results per level. Each column can then be written to file by specifying its name:

<<>>=
mincWriteVolume(group.means, "male-mean.mnc", "Male")
mincWriteVolume(group.means, "female-mean.mnc", "Female")
@

If the difference between the two groups is of interest, one can simply subtract the two data columns:

<<>>=
difference <- group.means[,"Male"] - group.means[,"Female"]
mean(difference)
mincWriteVolume(difference, "diff.mnc", gf$Filename[1])
@

Notice that the third argument to \texttt{mincWriteVolume} is now the name of a MINC file which has the same dimensions as the data, rather than a column name. By default, commands such as \texttt{mincMean} store that information in their output; after the subtraction above, however, the result is just a series of numbers with all metadata removed, so the dimensions have to be specified when writing the data to file.

Of course, means are not the only items of interest. Standard deviations, variances, and sums can also be computed, as illustrated below. Just as with \texttt{mincMean}, a column of filenames is required and a grouping variable is optional.

<<>>=
# variance at every voxel, separately for each gender
v <- mincVar(gf$Filename, gf$Gender)
# standard deviation at every voxel, all subjects combined
s <- mincSd(gf$Filename)
# sum at every voxel, separately for each gender
s2 <- mincSum(gf$Filename, gf$Gender)
@
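
To make concrete what the grouped summaries above compute, the following is a minimal sketch in base R on simulated data. It does not use RMINC and is not meant to show how RMINC is implemented. The names \texttt{toy.voxels} and \texttt{toy.gender} are hypothetical; each row of the toy matrix stands in for one voxel and each column for one subject.

<<>>=
# Simulated data (an assumption for illustration): 5 voxels by 6 subjects
set.seed(1)
toy.voxels <- matrix(rnorm(5 * 6), nrow = 5, ncol = 6)
toy.gender <- factor(c("Male", "Female", "Male", "Female", "Male", "Female"))

# Mean at every voxel within each group:
# one row per voxel, one column per level of the grouping variable
toy.group.means <- sapply(levels(toy.gender), function(g) {
  rowMeans(toy.voxels[, toy.gender == g, drop = FALSE])
})

# Standard deviation at every voxel over all subjects combined
toy.sd <- apply(toy.voxels, 1, sd)

# Difference between the two group means at every voxel
toy.difference <- toy.group.means[, "Male"] - toy.group.means[, "Female"]
toy.difference
@

In this sketch, too, the difference is a bare numeric vector that carries no record of the volume it came from, which mirrors why a file with matching dimensions has to be supplied when such a result is written out.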