Lesion analysis tutorial
From VoxBoWiki
Preliminaries
Voxel-based lesion-symptom mapping (VLSM) is a term coined by Bates et al (2003) to describe what's essentially the analog of SPM for lesion analysis – a mass univariate approach to mapping lesion-behavior relationships on a voxel-by-voxel basis. The term VLSM is a mild misnomer, because most of the people using VLSM are interested in behaviors that are not exactly symptoms. In any case, I use the term VLSM extremely loosely.
The purpose of this page is to provide an overview of how to go about VLSM using (mostly) VoxBo tools. In the remainder of this preliminary section, I run through some of the things you'll need to go about VLSM. In the subsequent sections, I consider:
- data handling
- lesion segmentation and registration
- statistics
- miscellaneous
Software
It's always good to have more than one way to get at a problem, all the more so when using a relatively new method like VLSM. I recommend that you familiarize yourself with the following packages:
- MRIcro – Chris Rorden's older image viewer/editor, which many labs continue to use for lesion drawing
- MRIcron – Rorden's newer package, which is under active development and includes NPM, a tool for lesion analysis
- VoxBo (you're on the VoxBo web site right now)
- R – powerful, extensible, free stats package
- SPM – the popular, multi-faceted, MATLAB-based image analysis package
MRIcro(n) and NPM run on Windows, OSX, and Linux, although the Windows versions are generally the most reliable and up to date. VoxBo runs best on Linux and OSX, and every now and then I update the Windows version. Most of the VoxBo tools are command-line tools, although I'm working on more graphical interfaces. SPM requires the commercial package MATLAB to run.
Computers
Computer issues are complicated, so I will just note the following. VLSM analyses tend to be a lot less demanding than fMRI analyses, so even if all you have is a single aging desktop, you can still get by. It's usually good to have a decent amount of RAM, say 2GB. You can probably get by with less, but certain things will be slow.
Dealing with image data
File formats
The following file formats for 3D image volume data are of interest:
- Analyze – images come in two files with the img and hdr suffixes (foo.img and foo.hdr)
- NIfTI – images typically come in a single file with the nii suffix
- roi – MRIcro's old format for ROIs
- voi – MRIcron's new format, it's really just a compressed NIfTI file
For VLSM, we're mostly concerned with 3D volume data (e.g., brain images, lesion maps, and stat maps). VoxBo can read all of the above formats and writes Analyze and NIfTI. In addition, VoxBo reads/writes its own format, cub format, for 3D data. VoxBo also likes to use 4D data, e.g., for collections of lesion maps or for associating related 3D volumes. NIfTI files can be 4D, and VoxBo has its own 4D format, tes format. NIfTI and VoxBo formats can both be gzip-compressed – e.g., instead of foo.nii or foo.tes, you'll see foo.nii.gz or foo.tes.gz (the .voi files produced by MRIcron are actually .nii.gz files, just with a different suffix).
What's in the files
Here are a few simple ways to help you get close to the data, figure out what's in the files, etc.
VoxBo's vbi tool can be used to inspect most image files:
> vbi ch2.nii.gz [I] ch2.nii.gz: (NIfTI 3D), 181x217x181, 1x1x1mm, (byte)
or if you want more detailed information
> vbi -l ch2.nii +- 3D Image file ch2.nii (NIfTI 3D) (byte) | 181x217x181 voxels | 1.0000 x 1.0000 x 1.0000 mm | 0.0MB on disk (lsbfirst) | origin: (0,0,0)
VoxBo's vbim tool can be used to perform a variety of steps on both 3D and 4D image files, including arithmetic manipulation, getting region information, getting maximum and minimum values, smoothing, etc. It can also perform these operations in series, so for example if foo.nii is a 4D file that contains lesion maps, you can use the following to create a mask of all the voxels with more than four lesions.
> vbim foo.nii -count -thresh 4 -quantize 1 -write count.nii
Segmentation and registration
Electronic templates, coordinate systems and conversions
The commonly used standard template for brain image analysis (including fMRI) is an average of 27 images of Colin Holmes's brain (each registered to MNI space), variously known as the Colin brain, ch2, colin27, or simply the MNI template. There are actually multiple MNI templates, but the ch2 average is the one people usually use. It's the template to which most electronic atlases are aligned, and the one people usually use to do lesion drawings.
You can often recognize an MNI-space image from its dimensions (in general, if you come across an image that doesn't seem to match things, often the dimensions will tell you what space it's in). 181x217x181 usually means a 1mm MNI brain, and 91x109x91 usually means 2mm.
Note that two images are said to be in the same "space" if their contents are aligned when you take their origins and voxel sizes into account. The image dimensions are clues that the image is in MNI space, because those are popular dimensions for MNI-space templates, but there's no reason MNI space images need to have those dimensions.
The Talairach Atlas was created using as reference the brain of a patient for whom no electronic brain image is available. Unfortunately, it has become the standard for reporting coordinates, which has made things a real mess. MNI space images are nearly in Talairach space, but not quite in close enough register to be good enough without transformation. There have been at least three transformations used for this, and I think the latest and greatest is described in Lacadie et al. – you can perform the transformation either using the bioimagesuite software (which I haven't tried extensively, but it's free and available at [www.bioimagesuite.org]) or via the online java applet (the link is labeled mni2tal). (I hope to incorporate their transformation into my software soon.)
[talairach.org] has some useful stuff relevant to talairach space, including a java program that you can download to tell you what regions are at your talairach coordinates, and an online java applet with the actual atlas embeded.
Segmenting lesions on patient anatomy
Ultimately we need lesion maps in template space (i.e., as overlays on the MNI template). Segmenting lesions directly on the patient anatomy obviously requires the image in electronic format, but is probably the least stressful viable procedure. This then requires warping the patient's brain into template space, which tends to be one of the more time-consuming (for your computer, not you) steps, depending on which program you use.
Right now, the two programs I know can be made to do a decent job of warping lesioned brains to templates are SPM and ANTS. The former requires MATLAB, but should run on any platform, and usually runs in just a few minutes. The latter really requires Linux or OSX, although it supposedly can run on Windows, and typically takes 5+ hours.
The best procedure is simply to calculate a mapping from the patient's structural image to the template image, and then to apply the same mapping to the lesion map. Here is a brief and somewhat concrete run-down on how to go about it.
Let's suppose your image is called myimage.nii, and you're going to use SPM8 to do the warping. You first need to bring up the fMRI interface. Then click the “Batch” button. In the menu, first select SPM/Spatial/Segment and then SPM/Spatial/Normalise/Normalise:Write. In the segment section, click on “Data” in the list of parameters and select the files you want to warp. In the normalise section, click on Data, select Parameter file, click “Dependency,” and select “Segment: Norm Params Sub->MNI.” For “Images to Write,” you can select the same file as in the segment step.
Find the parameter file and your image. Change the voxel sizes to
1 1 1
Change the bounding box to:
-90 -126 -72 90 90 108
The bounding box describes how much brain to include around the origin, where the origin is usually at or near the middle of the anterior commissure. This is the bounding box for the most common 1mm MNI template.
Now click on the green play button, and your warp will begin. It takes a few minutes even on a fast machine, but you'll get a file named wmyimage.nii, with dimensions 181x217x181. If you look at it alongside the MNI template, it should look fairly similar – the structure of your patient's brain in the shape of the MNI brain. It's important to take a close look at this image, to make sure nothing went grossly awry.
Among the files created by this process is just one you actually need, called something like myimage_seg_sn.mat. This is the mapping from your subject brain to MNI space. You can use the normalise:write step to apply this mapping to other files, including (especially) your lesion map. There's also an inverse mapping that you could use if you need for some reason to map an atlas region onto your subject's unwarped brain (e.g., if you wanted to calculate region sizes).
You may want to add a cleanup step here, in which an expert rater cleans up things like obvious border issues. You will probably also want to convert any floating point values in the normalized lesion mask to 0/1 values. You can do this using vbim:
vbim normalizedlesion.nii -thresh 0.5 -quantize 1 -write lesionmask.nii
This procedure works well with patients for whom you have high-quality MRIs. Clinical scans aren't usually good enough, although it's fine to try the warp and see if you trust it. You can also of course use this procedure for just a subset of your patients, if some only have CTs.
This procedure is a lot less stressful than the side-by-side method described below, although you do want to inspect the results for sanity at various stages. It offers an additional advantage that once your lesion drawing is done, you can take advantages of algorithmic improvements to the warp without much manual labor. The drawback is only that it can be computationally intensive, and that you may or may not trust what the warping algorithm is doing.
Side-by-side drawing (aka "Wimbledon")
Though I don't recommend it, here is a quick description of this common procedure:
- bring up both your patient image and the ch2 template in separate MRIcron windows
- use a pitch rotation on ch2 to find an angle at which the slices more closely resemble your patient's brain
- with roughly corresponding slices side-by-side, draw the lesion on the rotated template by visually estimating where it should be
- pitch rotate ch2 (and with it, the lesion that you just drew) back into its original orientation
- apply a threshold to the lesion map to get it back to 0/1 (continuous values are introduced in rotation)
I call this the "Wimbledon" technique because it requires swiveling your head in the same way you might at a tennis match (although truthfully, extended baseline play is more common at the French Open).
Many variants of this procedure have been used. It's not necessary to use just pitch rotation, and in principle you could use a complete affine transformation.
Regression-based segmentation
Coming soon: notes on how to go about the Stamatakis/Tyler procedure for segmenting lesions, using VoxBo tools
More stuff that's coming soon
- using FSL's susan tool to denoise MRIs
- VoxBo's special version of the AAL atlas
- other atlases
- using vbregion
Inter-rater reliability and quality control
vbmaskcompare is a VoxBo program to compare two masks (e.g., lesion maps). Currently it reports some comparisons of two masks, including the percentage of non-overlapping voxels and the Dice overlap metric. It's easy to use:
> vbmaskcompare mask1.nii mask2.nii [I] vbmaskcompare: checking volumes [I] vbmaskcompare: total voxels: 7109137 [I] vbmaskcompare: mask1.nii: 224410 total, 166144 unique [I] vbmaskcompare: mask2.nii: 78522 total, 20256 unique [I] vbmaskcompare: common: 58266 [I] vbmaskcompare: dice overlap: 0.3847 [I] vbmaskcompare: percent non-overlapping: 76.19%
If you have some spare time, add the -d flag anywhere on the command line, and you'll get these additional lines:
[I] vbmaskcompare: discrepant voxels in MR0042/wplus/warpedmask_MR0042.nii: 148226 [I] vbmaskcompare: discrepant voxels in MR0044/wplus/warpedmask_MR0044.nii: 13297 [I] vbmaskcompare: total discrepant voxels: 161523
"Discrepant voxels" is defined in Fiez et al., but is, roughly, voxels that aren't even close. I'm working on getting the Fiez et al. surface distance metric into the program, but right now it runs too slowly to be practical.
Electronic atlases
There are many electronic atlases, all of them have issues:
- The Brodann's map included with MRIcro(n) is a real mess.
- The AAL atlas provides anatomical labels, and is only a minor mess.
- The Juelich atlas is terrific, but incomplete.
- Various electronic versions of the Talairach atlas exist, and are okay, but there's the issue of alignment to MNI space.
- I recently heard about the SRI24 atlas, I haven't explored it deeply
Depending on what you need (with luck, not Brodmann areas), you can usually find what you need, but not always.
Statistical analysis
Simple VLSM with the t-test (regular and welch's, also BM test)
The simplest case for VLSM is carrying out a traditional two-sample t-test in each voxel, where two groups are patients with and without lesions in the voxel. Other options include Welch's t-test for unequal variances and the Brunner-Munzel test, which is complicated (but available in NPM). Note that with the Welch's t-test, you have different df in each voxel, so it's usually important to convert to z-scores (otherwise the significance threshold would be different at each voxel).
VoxBo's vbtmap is a simple command line tool to carry out a t-test:
> vbtmap lesions.nii behavior.ref statmap.nii -w -z
Where lesions.nii is a 4D file that contains all your lesion maps, and behavior.ref is a text file with one behavioral score per line (in the same order!). You can use vbi to make sure you know what's in the files. statmap.nii is the name for your output file, and it will be a 3D volume containing, in this case, z scores derived from a Welch's t-test.
In NPM, you click a few buttons to do the same thing.
Although NPM (and VoxBo) will sometimes calculate the significance thresholds you need, it's sometimes helpful to calculate them on the side.
Quick and dirty calculation of thresholds in R
The following calculations might be helpful, and are easy to do in R:
- pt(t,df) will give you the probability of observing a t value of t or higher given df
- qt(alpha,df) will give you the t threshold for your desired alpha (probability of observing a smaller t is alpha)
- pnorm(z) will give you the probability of observing a z score of z or less
- qnorm(alpha) will give you the z threshold for your desired alpha (prob. of observing smaller z is alpha)
How many patients do you need in a voxel?
It depends on how you recruit and include/exclude patients, but typically if you have 100 stroke patients, the maximum number of lesions in any voxel will be around 50-60. Most of your voxels will be somewhere around 25, but towards the fringes of your coverage, you'll have voxels with only a single lesion (or 2, or 3, etc.). You can get the breakdown on this using VoxBo's vbmaskinfo.
You could carry out a one-sample t-test in voxels with even just a single patient, but it probably wouldn't answer the right question.
Most researchers exclude voxels with fewer than n patients, where n is set according to the personal convictions of the researcher. In principle, if your statistics are well-behaved, you shouldn't have to worry. In practice, you shouldn't do anything with VLSM you wouldn't do with a simple group study. Beyond that obvious advice, I don't know that anyone has a very good answer to this question.
Note that the Brunner-Munzel test is poorly behaved if you have fewer than 10-15 patients in a voxel. For small studies, this could be most/all of your voxels. The most recent versions of NPM use a different strategy for voxels in which the tabular BM test would be inappropriate. If these constitute most of your voxels, it's probably best to do with a different test.
Regression
Single and multiple regression are supported in VoxBo's vbvolregress tool. Multiple regression models complicate the permutation test, so if you're planning to do a permutation test, best to reduce the model to a simple regression first. If you have binary (0/1) lesion scores, you can reduce it to a t-test. See the next section.
If you do want to do a multiple regression, vbvolregress is fairly flexible and offers a lot of options.
Nuisance variables in non-regression models
Factoring nuisance variables, including lesion scores from other voxels or regions, out of your dependent measure is most easily accomplished in an external stats program. You just need to regress your dependent measure against the nuisance variable, and then use the residuals as your new dependent measure. In R, this boils down to:
scoredata=read.csv(file("scores.csv"),comment.char="#")
attach(scoredata)
...
dimnames(scoredata)2 # show the column names
# scorenames=scorenames[2:length(scorenames)]
writeref(residuals(lm(myscore~nuisancevar)),"newscore.ref")
If you need to extract the vector of lesion scores from a single voxel (or the average score from a region), you can use vbxts. To use it, you must first smush your lesion maps into a single 4D file, which you can do using vbim:
> vbim *.voi -write4d lesions.nii.gz > vbxts -t lesions.nii.gz -p 41 37 22 ; this line extracts from a single voxel > vbxts -t lesions.nii.gz -m mymask.nii ; this line averages a mask-defined region
Bonferroni correction
Bonferroni correction is often overconservative, because it doesn't take into account the non-independence of your comparisons. But this need not be the case for VLSM. When Bonferroni correcting in VLSM, if you have discrete lesion maps (0/1), you usually want to count the number of distinct voxels, not just the number of voxels, and Bonferroni correct for that number. vbmaskinfo will calculate that number for you easily:
> vbmaskinfo *.voi ...
At the moment it's a little slower than it should be, but still shouldn't take more than a minute or so.
Permutation testing
Advantages of permutation testing in VLSM:
- automatically corrects for the independence of the multiple comparisons
- sidesteps the small region issues
- is sometimes more liberal (but no less appropriate) than counting distinct voxels
Disadvantages:
- can be slow, especially if you draw your lesion maps at high resolution
- occasionally, for reasons I don't completely understand, can be worse than the distinct voxel correction
In NPM, carrying out a permutation test is as simple as selecting the number of permutations you'd like in the menu. In VoxBo, you just run the makevlsm script. Running it without any arguments gives you the following help:
> makevlsm
the following flags are honored: -b submit to voxbo queue -d <dest> set destination dir for permutations (default: permdir) -p <num> set number of permutations (default: 0) -l <file> set name of 4D lesion map file (default: lesions.nii.gz) -s <file> set name of score file (default: scores.ref) -w use welch's instead of regular t-test -z generate z maps, not t maps -q <q> set FDR criterion -n <num> set minimum number of patients per voxel (default: 2) -m <mask> set inclusion mask (nonzero voxels are included) -f flip sign of t map notes: The -q option causes vbtmap to print an FDR threshold at the terminal. Usually you wouldn't use it along with -b.
Note that there is some sampling error associated with the p value if you don't run all possible permutations.
Use and abuse of FDR, data preparation, inference from FDR-thresholded maps
The False Discovery Rate (FDR) is the expected proportion of false positives among all voxels exceeding threshold. There is a simple procedure for controlling the FDR, described in an imaging context by Genovese et al. (2002). You specify the rate you would like (e.g., q=0.01) and the FDR procedure usually produces a threshold at which that is your expected rate of false positives. Sometimes there is no such threshold, in which case some software just tells you so, some packages report NaN (not a number) as the threshold, and at least one package uses the magic number 9.2.
FDR is a different standard for significance, contrasted with family-wise error rate (FWER), which is the more familiar constraint that we'd like to control the probability of observing 1 or more false positives. FDR thresholds are almost invariably more liberal than those controlling the FWER, and they do support certain kinds of inferences, so they are popular whenever power is an issue. FDR is "adaptive" in the sense that it does well when you have a lot of activation, and that's when it's easiest to interpret as well. If a large region exceeds FDR threshold, and you know the expected false positive rate is 0.01, then you can feel pretty confident that some part of that region is implicated. It's important to use a reasonable q value, like 0.01, and to exclude regions that are not of interest (e.g., air, skull, CSF).
Dealing with spatial coherence
Coming soon
Rendering and surfing
Coming soon
