ITK Registration Optimization
Revision as of 03:48, 31 July 2008
Contents
- 1 Slicer3 Module: RegisterImages
- 1.1 Major Features
- 1.2 Head MRI Registration
- 1.3 Pipeline Registration
- 1.4 Intuitive Parameters
- 1.5 Incorporates testing
- 1.6 Instructions for Enabling the RegisterImages module
- 1.7 Class structure
- 2 Background Research
- 3 Publications
- 4 Workplan
- 5 Performance Testing Results
- 6 Events
- 7 Related Pages
- 8 Performance Measurement
Slicer3 Module: RegisterImages
The RegisterImages module is the product of the research discussed below.
Major Features
The major features of the module include:
- Default parameters register many full-head and skull-stripped MRI: rigid, affine, and BSpline
- Offers a complete, pipeline-based registration solution
- Load and apply existing transforms
- Compute rigid, affine, and bspline transforms in sequence with a single command
- Intuitive parameters
- Instead of setting obscure "scales" for parameters, you set global values for "Expected Offset", "Expected Rotation", ... to indicate how much mis-registration is anticipated in the data being registered
- MinimizeMemory option provides a way to compute bspline registrations using a dense set of control points and a large number of samples on "normal" computers (albeit computation time increases)
- SampleFromOverlap option allows images of vastly different sizes to be registered
- Helps to avoid (but does not completely eliminate) the annoying ITK exception, "too many samples falls outside of the image"
- Incorporates testing
- Specify a baseline image, and the module will perform the requested registration, compare its results with the baseline image, and return success/failure
- Based on an extensible and re-usable class structure.
Each of these features is discussed next.
Head MRI Registration
Example 1: Problem Cases
- Registers images which were not well resolved (or produced seg-faults) using other slicer registration modules:
Difficult affine registration, Same subject, Different protocols: T2 and Fractional Anisotropy
Difficult affine registration, Same subject, Different protocols: T1 and Gradient
Example 2: Affine Registration
- Task:
- Affine registration of head MRI from two different subjects
- Data:
- Using cases UNC-Healthy-Normal002 (fixed) and UNC-Healthy-Normal004 (moving)
- Data provided by Dr. Bullitt at UNC.
- Data is available from Kitware's MIDAS archive at http://hdl.handle.net/1926/542
- Data can be automatically downloaded into ${RegisterImages_BINARY_DIR}/Testing/Data directory by enabling the CMake variable "BUILD_REGISTER_IMAGES_REAL_WORLD_TESTING"
- Warning: this also enables additional tests that can take 4+ hours to complete.
- To see the code for automatically downloading from MIDAS (via svn), see Slicer3/Applications/CLI/RegisterImagesModule/Applications/CMakeLists.txt
Affine registration of head MRI from two different subjects
Affine registration of skull-stripped head MRI from two different subjects
Example 3: BSpline Registration
- Task:
- BSpline registration of head MRI from two different subjects
- Data:
- Using cases UNC-Healthy-Normal002 (fixed) and UNC-Healthy-Normal004 (moving)
- Data provided by Dr. Bullitt at UNC.
- Data is available from Kitware's MIDAS archive at http://hdl.handle.net/1926/542
- Data can be automatically downloaded into ${RegisterImages_BINARY_DIR}/Testing/Data directory by enabling the CMake variable "BUILD_REGISTER_IMAGES_REAL_WORLD_TESTING"
- Warning: this also enables additional tests that can take 4+ hours to complete.
- To see the code for automatically downloading from MIDAS (via svn), see Slicer3/Applications/CLI/RegisterImagesModule/Applications/CMakeLists.txt
BSpline, Default Options, Different Subjects, Same Protocol, Head MRI
BSpline, Dense Control-Point Grid, Different Subjects, Same Protocol, Head MRI
BSpline, (Default Options vs. Dense Control-Point Grid), Different Subjects, Same Protocol, Skull-Stripped Head MRI
Pipeline Registration
The module implements a registration pipeline. The steps in that pipeline are as follows:
- Step 1: Loaded transform
- You may load a pre-computed transform to initialize the registration.
- If one is loaded, it is immediately applied (i.e., the moving image is resampled)
- Step 2: Initial registration
- Options are:
- None (sets the center of rotation to the center of the moving image)
- Landmark (uses N pairs of landmarks, passed as vectors, and a least-squares error metric to register the images using a rigid transform)
- Image Centers (shifts the images to align their centers)
- Centers of Mass (shifts the images to align their centers of mass)
- Second Moments (shifts and rotates the images to align the 1st and 2nd moments)
- Step 3: Registration
- Options are:
- None (applies the loaded transforms)
- Initial
- computes and applies the initial transform to the loaded registrations
- Rigid
- computes a rigid transform and then applies it to the loaded registrations
- Affine
- computes an affine transform and then applies it to the loaded registrations
- BSpline
- computes a bspline transform and then applies it to the loaded registrations
- PipelineRigid
- computes a rigid transform (initialized using the results from the initial registration) and then applies it to the loaded registrations
- PipelineAffine
- computes a rigid transform (initialized using the results from the initial registration), uses those results to initialize and compute an affine transform, and then applies it to the loaded registrations
- PipelineBSpline
- computes a rigid transform (initialized using the results from the initial registration), uses those results to initialize and compute an affine transform, applies it to the loaded registrations, and then computes and applies a BSpline transform
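The Step 3 options above can be summarized as a mapping from the selected option to the ordered stages it runs after any loaded transform has been applied. The following C++ sketch is illustrative only: the enum and function names are hypothetical and are not the module's identifiers, and it assumes that the non-pipeline options (Rigid, Affine, BSpline) run only their own stage.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical names that mirror the option labels in the list above.
// They are not taken from the RegisterImages source code.
enum class RegistrationMode {
  None, Initial, Rigid, Affine, BSpline,
  PipelineRigid, PipelineAffine, PipelineBSpline
};

// Return the ordered stages run for a given option. ASSUMPTION: the
// non-pipeline options compute only their own transform, while the
// Pipeline* options chain every earlier stage first.
std::vector<std::string> StagesFor(RegistrationMode mode)
{
  switch (mode) {
    case RegistrationMode::None:            return {};
    case RegistrationMode::Initial:         return {"initial"};
    case RegistrationMode::Rigid:           return {"rigid"};
    case RegistrationMode::Affine:          return {"affine"};
    case RegistrationMode::BSpline:         return {"bspline"};
    case RegistrationMode::PipelineRigid:   return {"initial", "rigid"};
    case RegistrationMode::PipelineAffine:  return {"initial", "rigid", "affine"};
    case RegistrationMode::PipelineBSpline: return {"initial", "rigid", "affine", "bspline"};
  }
  assert(false && "unknown registration mode");
  return {};
}
```

For example, StagesFor(RegistrationMode::PipelineAffine) yields the stages initial, rigid, affine, in that order.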
Intuitive Parameters
- In rare cases (given unusual acquisition conditions and/or highly inconsistent acquisition protocols) you will need to change the default parameters.
- More often you may wish to tweak parameters to achieve your application-specific speed-vs-accuracy tradeoff
IO Tab
- Set the fixed and moving images using images in the scene
- Optionally set the ResampleImage to store the output image
- If not set, registration won't conduct the final resampling, saving computation time
Registration Parameters Tab
- Load Transform
- provide the Loaded Transform for the loaded phase of registration
- Save Transform
- results of the entire registration pipeline will be saved here
- Initialization
- see registration pipeline discussion
- Registration
- see registration pipeline discussion
- For rigid and affine registrations, one-plus-one evolutionary optimization is first applied for N iterations, and then FRPR gradient-line-search optimization is applied.
- For more information, check the code: RegisterImagesModule/itkOptimizedImageToImageRegistrationMethod.h/txx
- For BSpline registration, a hierarchical registration scheme is used. An image pyramid having 3 levels is used to resample the images and the control grids. Heuristics are used to control the various resampling parameters. At each level, registration is conducted using FRPR gradient-line-search optimization.
- For more information, check the code: RegisterImagesModule/itkBSplineImageToImageRegistrationMethod.h/txx
- Metric
- Use the Mutual Information metric. It is a multithreaded and optimized version of the Mattes MI method.
- For more information, check the code: Insight/Code/Review/itkOptMattesMutualInformationImageMetric.h/txx
- "Expected" values
- For rigid, affine, and bspline registration, parameter scales (refer to the Insight Software Guide) are represented as hyper-parameters in the RegisterImages module.
- "Expected Offset" controls the offset scales in rigid and affine registration and the deformation vector scale in bspline registration
- "Expected Rotation" is roughly in terms of radians. It controls the rotation angles in rigid and affine registration
- "Expected Scale" is for scaling during affine registration
- "Expected Skew" is for skew for affine registration
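To make the relationship between the "Expected" values and optimizer parameter scales more concrete, here is a minimal C++ sketch for a 3D rigid transform. It is not the module's implementation: the function name, the parameter ordering (three rotation values followed by three offsets), and the rule that each scale is the reciprocal of the expected magnitude are all assumptions made for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: build a scales vector for a 3D rigid transform whose
// parameters are ordered as 3 rotation values followed by 3 offsets.
// ASSUMPTION: each scale is the reciprocal of the expected magnitude, so
// that parameters with very different units (radians vs. offsets) are
// brought to a comparable range for the optimizer.
std::vector<double> RigidScalesFromExpected(double expectedRotation,
                                            double expectedOffset)
{
  assert(expectedRotation > 0.0 && expectedOffset > 0.0);
  std::vector<double> scales(6);
  for (int i = 0; i < 3; ++i) {
    scales[i] = 1.0 / expectedRotation;  // rotation parameters
  }
  for (int i = 3; i < 6; ++i) {
    scales[i] = 1.0 / expectedOffset;    // offset parameters
  }
  return scales;
}
```

The point of the sketch is only that one global value per parameter type replaces a hand-tuned scale per parameter; the actual mapping used by the module should be checked in its source.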
Advanced Registration Parameters Tab
- Verbosity level
- Controls the level of detail in the reports in the log file
- Sample from fixed/moving overlap
- When the fixed image is much larger than the moving image, it is CRITICAL to set this flag and to pick a good initialization method. In that way, only the portion of the fixed image that is initially covered by the moving image will be used during registration. This prevents ITK from throwing an exception (error) stating that too many fixed-image samples miss (map outside of) the moving image.
- Fixed image intensity percentage threshold
- As a less robust way to overcome the image overlap issue discussed above, you can specify a threshold as a portion (0 to 1) of the fixed image intensity range that should be used to select fixed image samples for computing the metric. For example, by specifying 0.5, only the pixels in the upper half of the fixed image's intensity range will be used during random sample selection.
- Remember, it is important to include pixels inside and outside of the object of interest, otherwise the fixed image histogram may be too homogeneous for mis-registrations to be detected.
- Random number seed
- To ensure consistent performance, you can set a seed - repeated runs should produce identical results.
- Number of threads
- Number of multi-core/multi-processor threads to use during metric value computations.
- MinimizeMemory
- Turns off caching of intermediate values during bspline registration
- Provides a way to compute bspline registrations using a dense set of control points and a large number of samples on "normal" computers (albeit computation time increases)
- Rule of thumb: if the BSpline registration crashes, re-run with this option enabled.
- Use windowed sinc for final interpolation
- If you have time to kill. Extremely slow and only marginally better than bspline resampling (the default).
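The fixed image intensity percentage threshold described in this tab can be illustrated with a short C++ sketch. It converts the fractional threshold into an absolute intensity and lists the pixels that remain eligible for random sampling. The function names are hypothetical, the image is flattened to a plain vector, and the inclusive comparison at the threshold is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Convert a fractional threshold (0 to 1) of the fixed image intensity
// range into an absolute intensity. Illustrative only, not the module's API.
double AbsoluteThreshold(const std::vector<double>& fixedPixels, double fraction)
{
  assert(!fixedPixels.empty() && fraction >= 0.0 && fraction <= 1.0);
  const auto mm = std::minmax_element(fixedPixels.begin(), fixedPixels.end());
  return *mm.first + fraction * (*mm.second - *mm.first);
}

// Indices of fixed image pixels that remain candidates for random sampling.
// ASSUMPTION: pixels exactly at the threshold are kept.
std::vector<std::size_t> EligibleSamples(const std::vector<double>& fixedPixels,
                                         double fraction)
{
  const double threshold = AbsoluteThreshold(fixedPixels, fraction);
  std::vector<std::size_t> keep;
  for (std::size_t i = 0; i < fixedPixels.size(); ++i) {
    if (fixedPixels[i] >= threshold) {
      keep.push_back(i);
    }
  }
  return keep;
}
```

With intensities 0, 25, 50, 75, 100 and a fraction of 0.5, the absolute threshold is 50 and the last three pixels remain eligible. As noted above, a threshold that excludes all background can leave the histogram too homogeneous to detect mis-registration.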
Registration Testing Parameters
- Baseline Image
- Set the image against which the Resampled Image (IO tab) will be compared after registration
- Number of Failed Pixels Tolerance
- Registration returns "failure" if this many pixels are different between the Resampled and Baseline images
- Intensity Tolerance
- Minimum intensity difference between corresponding Resampled and Baseline pixels for those pixels to be counted as failures
- Radius Tolerance
- The program will search this neighborhood size about each Resampled pixel to find the closest matching Baseline pixel. The closest matching pixels are compared using the Intensity Tolerance (above)
- Baseline Difference Image
- Result of subtracting the resampled image from the baseline image
- Baseline Resampled Moving Image
- resampled image, resampled into the space of the baseline image
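The interaction of the three tolerances described above can be sketched in C++ as follows. This is a simplified 2D illustration, not the module's comparison code: the Image2D struct and function names are hypothetical, and the inclusive comparisons follow the wording of the parameter descriptions rather than a verified implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Minimal row-major 2D image, for illustration only.
struct Image2D {
  int width;
  int height;
  std::vector<double> pixels;
  double at(int x, int y) const {
    return pixels[static_cast<std::size_t>(y) * width + x];
  }
};

// Count resampled pixels whose closest-matching baseline pixel, searched
// within `radius` pixels, still differs by at least intensityTolerance.
int CountFailedPixels(const Image2D& resampled, const Image2D& baseline,
                      double intensityTolerance, int radius)
{
  assert(resampled.width == baseline.width && resampled.height == baseline.height);
  int failed = 0;
  for (int y = 0; y < resampled.height; ++y) {
    for (int x = 0; x < resampled.width; ++x) {
      double best = std::abs(resampled.at(x, y) - baseline.at(x, y));
      for (int dy = -radius; dy <= radius; ++dy) {
        for (int dx = -radius; dx <= radius; ++dx) {
          const int nx = x + dx;
          const int ny = y + dy;
          if (nx < 0 || ny < 0 || nx >= baseline.width || ny >= baseline.height) {
            continue;  // neighbor lies outside the baseline image
          }
          best = std::min(best, std::abs(resampled.at(x, y) - baseline.at(nx, ny)));
        }
      }
      if (best >= intensityTolerance) {
        ++failed;
      }
    }
  }
  return failed;
}

// The test fails once the number of failed pixels reaches the tolerance.
bool RegistrationPasses(const Image2D& resampled, const Image2D& baseline,
                        double intensityTolerance, int radius,
                        int numberOfFailedPixelsTolerance)
{
  return CountFailedPixels(resampled, baseline, intensityTolerance, radius)
         < numberOfFailedPixelsTolerance;
}
```

A larger Radius Tolerance forgives small spatial shifts between the resampled and baseline images, at the cost of a slower comparison.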
Advanced Initial Registration Parameters
- Fixed / Moving Landmarks
- A vector string (comma separated base-3 list) of the indexes of corresponding points in the fixed and moving images
- If supplied, then choose "Landmarks" as the initial registration method (see discussion on registration pipeline)
Advanced Rigid and Affine Parameters
- MaxIterations
- Number of iterations for one-plus-one and for FRPR registration
- Sampling Ratio
- Portion of the image pixels to be used when computing the metric
Advanced BSpline Parameters
- MaxIterations
- Number of iterations for one-plus-one and for FRPR registration
- Sampling Ratio
- Portion of the image pixels to be used when computing the metric
- Do the math: if you have 40 pixels between control points, then there will be 40^3 (64,000) pixels relevant to each control point. That is excessive for directing one control point. Keep the sampling small. For 40 pixels between control points, a sampling ratio of 0.1 provides 6,400 pixels for metric computation at each control point, which is more than enough.
- When in doubt, turn on MinimizeMemory
- Control point spacing (pixels)
- Don't think about grid size - instead think about the level of detail that needs to be resolved (see discussion on sampling ratio).
- When in doubt, turn on MinimizeMemory
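The sampling-ratio arithmetic above can be written as a small C++ helper. It follows the same estimate used in the text (the cube of the control point spacing, times the sampling ratio); the function names are hypothetical.

```cpp
#include <cassert>

// Approximate number of sampled pixels available to steer one control
// point in 3D, using the spacing^3 estimate from the discussion above.
double SamplesPerControlPoint(double controlPointSpacing, double samplingRatio)
{
  assert(controlPointSpacing > 0.0);
  assert(samplingRatio > 0.0 && samplingRatio <= 1.0);
  return controlPointSpacing * controlPointSpacing * controlPointSpacing
         * samplingRatio;
}

// Inverse question: which sampling ratio gives a desired number of samples
// per control point? The result is capped at 1.0 (use every pixel).
double SamplingRatioFor(double controlPointSpacing, double targetSamples)
{
  assert(controlPointSpacing > 0.0 && targetSamples > 0.0);
  const double pixels =
      controlPointSpacing * controlPointSpacing * controlPointSpacing;
  const double ratio = targetSamples / pixels;
  return ratio < 1.0 ? ratio : 1.0;
}
```

For a spacing of 40 pixels and a sampling ratio of 0.1, SamplesPerControlPoint returns approximately 6,400, matching the example above. The same helper shows how quickly the count falls when the grid is made denser: at 10 pixels between control points, a ratio of 0.1 leaves only about 100 samples per control point.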
Incorporates testing
- See discussion on the "Registration Testing Parameters" tab.
Instructions for Enabling the RegisterImages module
You need to change the build scripts (and perhaps their parameter files) to perform a cvs checkout of itk. There are two sets of instructions, one for those building using the getbuildtest script and one for those building using the getbuildtest2 script.
Using getbuildtest.tcl
If you are using getbuildtest.tcl script
- In the file Slicer3/slicer_variables.tcl
- Line 82 (or thereabouts): change it to
set ::ITK_TAG "HEAD"
Using getbuildtest2.tcl
If you are using getbuildtest2.tcl
- In the file Slicer3/slicer_variables2.tcl
- Line 83 (or thereabouts): change it to
set ::ITK_TAG "HEAD"
- In the file Slicer3/Scripts/genlib2.tcl
- The svn checkout needs to be replaced by a cvs checkout. You need to have cvs in your system path. Edit the runcmd svn line to instead read:
runcmd cvs -d :pserver:anonymous@www.itk.org:/cvsroot/Insight co Insight
Done
- With the above completed, perform getbuildtest.tcl or getbuildtest2.tcl as normal.
- Please direct questions/comments to the slicer developers' list.
Class structure
- Try it, you'll like it.
- Follows the coding style of itk
- Limited comments, but meaningful variable names
- No documentation is provided or planned - don't even ask.
Background Research
Goals
There are two components to this research
- Identify registration algorithms that are suitable for non-rigid registration problems that are endemic to NA-MIC
- Develop implementations of those algorithms that take advantage of multi-core and multi-processor hardware
Steps involved
- Modify ITK's registration framework to support oriented images
- Modify ITK's registration framework to be thread safe
- Develop multi-threaded versions of select registration modules
- Make everything backward compatible with ITK's existing registration methods and framework
- Deliver in ITK
- Develop helper classes and write IJ article
Target date for these deliverables: Jan 1, 2008
Planned follow-on work
Devise a new metric for MI registration
- If we always use every voxel for the metric, then we can cache the weights by the voxel's position wrt the adjacent control points. For example, for Kilian's situation of a control point every 2 voxels, then there really are only a few unique weight sets that are repeated throughout the volume. Luis had already brought up a variation on this idea.
- This method could also be combined with the rule to not evaluate voxels or control points that fall on background voxels. This too has been discussed, but such a rule makes multi-threading tricky in that we don't want to waste threads by allocating them to image regions that contain only background voxels.
- The metric could be closely tied to a multiresolution registration scheme. In fact, the grid and the image resolutions should perhaps be linearly related. That is, we could tie the metric computation to the resolution of the deformation grid by subsampling the image. There are situations where this is not the right thing to do (just because the grid is coarse doesn't mean that a small movement isn't important); however, as part of a multiresolution registration strategy, it is perhaps a viable option. This would need to be evaluated on the data.
- Have "don't-care" regions in which bspline control points are not processed and don't move, e.g., no need to adjust ones that only contain background
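The weight-caching idea in the first item above can be sketched in C++ for one axis. Cubic B-spline weights depend only on a voxel's fractional position between control points, so when control points fall every N voxels there are only N distinct weight sets per axis (2 per axis, or 8 in 3D, for a control point every 2 voxels). This sketch assumes a cubic B-spline, an integer spacing in voxels, and a grid aligned with voxel 0; the function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Weights = std::array<double, 4>;

// Uniform cubic B-spline basis weights for a fractional offset u in [0, 1)
// measured from the nearest control point at or below the voxel.
Weights CubicBSplineWeights(double u)
{
  assert(u >= 0.0 && u < 1.0);
  const double u2 = u * u;
  const double u3 = u2 * u;
  return {{ (1.0 - 3.0 * u + 3.0 * u2 - u3) / 6.0,
            (4.0 - 6.0 * u2 + 3.0 * u3) / 6.0,
            (1.0 + 3.0 * u + 3.0 * u2 - 3.0 * u3) / 6.0,
            u3 / 6.0 }};
}

// Precompute the distinct weight sets along one axis. ASSUMPTION: control
// points sit exactly every spacingInVoxels voxels, starting at voxel 0.
std::vector<Weights> BuildAxisWeightCache(int spacingInVoxels)
{
  assert(spacingInVoxels > 0);
  std::vector<Weights> cache;
  for (int k = 0; k < spacingInVoxels; ++k) {
    cache.push_back(CubicBSplineWeights(static_cast<double>(k) / spacingInVoxels));
  }
  return cache;
}

// Look up the cached weights for a voxel index along that axis.
const Weights& CachedWeights(const std::vector<Weights>& cache, int voxelIndex)
{
  assert(voxelIndex >= 0 && !cache.empty());
  return cache[static_cast<std::size_t>(voxelIndex) % cache.size()];
}
```

The 3D weights for a voxel are the products of the per-axis weights, so three small per-axis caches are enough to reconstruct every weight set in the volume. This only holds when every voxel is used, as stated above; randomly placed samples at non-lattice positions would not share fractional offsets.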
Status and News
- Thanks (but not your questions/comments) go to
- Luis Ibanez, Matt Turek, Stephen Aylward
Publications
- Aylward, Stephen; Jomier, Julien; Barre, Sebastien; Davis, Brad; Ibanez, Luis, "Optimizing ITK’s Registration Methods for Multi-processor, Shared-Memory Systems." MICCAI Open Source and Open Data Workshop, 2007 (Download PDF)
- BWH Neuroimaging Analysis Center (NAC), 2007-2008: Grid Enabled ITK
- IJ article on oriented images and registration in ITK
- http://www.insight-journal.org/dspace/bitstream/1926/1293/2/Brooks_Arbel_FastOrientedImage_V1.pdf
- Solution presented by the authors is closely related to the changes made in ITK
Algorithmic Requirements and Use Cases
- Requirements
- relatively robust, with few parameters to tweak
- runs on grey scale images
- has already been published
- relatively fast (ideally speaking a few minutes for volume to volume).
- not patented
- can be implemented in ITK and parallelized.
- Use-cases
- Intersubject mapping
- Example data set (Kilian)
- fMRI to hi-res brain morphology mapping
- Example data set (Steve Pieper)
- DTI: components of the diffusion tensor
- Example data (Sylvain)
Hardware Platform Requirements and Use Cases
- Requirements
- Shared memory
- Single and multi-core machines
- Single and multi-processor machines
- AMD and Intel - Windows, Linux, and SunOS
- Use-cases
- Intel Core2Duo
- Intel quad-core Xeon processors, Visual Studio 8, Windows Vista (Kitware: redwall)
- 6 CPU Sun, Solaris 8 (SPL: vision)
- 12 CPU Sun, Solaris 8 (SPL: forest and ocean)
- 16 core Opteron (SPL: john, ringo, paul, george)
- 16 core, Sun Fire, AMD Opteron (UNC: Styner)
Workplan
Establish testing and reporting infrastructure
- Identify timing tools
- Cross platform and multi-threaded
- Timing and profiling
- Develop performance dashboard for collecting results
- Each test will report time and accuracy to a central server
- The performance of a test, over time, for a given platform can be viewed on one page
- The performance of a set of tests, at one point in time, for all platforms can be viewed on one page
Develop tests
- Develop modular tests
- Develop complete registration solutions for use cases
ITK Optimization
- Target bottlenecks
- Multi-thread metric calculation
- Initial target is MattesMutualInformationImageToImageMetric
- Optimize code
- Sacrifice some memory and algorithm initialization speed to gain algorithm operation speed increases
- Call multi-threaded functions when possible
- Integrate metrics with transforms and interpolators for tailored performance
Example Results: MattesMutualInformationImageToImageMetric
Example of Optimizations Employed
- GetValue
- Added multi-threading to GetValue function
- Partitions the samples - thereby distributes the computation of the transforms and interpolations across threads
- Added the pre-computation of the FixedImageMarginalPDF for the sample to reduce the need for the thread mutex lock
- Required the concept of an AdjustedFixedImageMarginalPDF that is updated when a fixed image voxel does not map into the moving image and thereby isn't valid for the current computations. Because it is updated only when samples are missed, the mutex lock that guards this cross-thread data structure is needed less often.
- Each thread now has its own copy of the jointPDF. After threads complete, the jointPDFs from each thread are summed. This eliminates the mutex from the main loop over samples.
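The per-thread jointPDF strategy described above can be illustrated with a standalone C++ sketch that uses std::thread instead of ITK's threading classes. It is not the ITK implementation: the names are hypothetical, the samples are assumed to be already reduced to (fixed bin, moving bin) pairs within range, and raw counts stand in for the PDF values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

using Histogram = std::vector<unsigned long>;  // flattened bins x bins table

// Accumulate a joint histogram from (fixedBin, movingBin) samples.
// Each thread fills its own private histogram over its partition of the
// samples, so the inner loop needs no mutex. The private histograms are
// summed after all threads have joined.
Histogram AccumulateJointHistogram(const std::vector<std::pair<int, int>>& samples,
                                   int bins, unsigned int numThreads)
{
  assert(bins > 0 && numThreads > 0);
  const std::size_t tableSize = static_cast<std::size_t>(bins) * bins;
  std::vector<Histogram> perThread(numThreads, Histogram(tableSize, 0));
  const std::size_t chunk = (samples.size() + numThreads - 1) / numThreads;

  std::vector<std::thread> workers;
  for (unsigned int t = 0; t < numThreads; ++t) {
    workers.emplace_back([&samples, &perThread, bins, chunk, t] {
      const std::size_t begin = std::min(samples.size(), t * chunk);
      const std::size_t end = std::min(samples.size(), begin + chunk);
      for (std::size_t i = begin; i < end; ++i) {
        const std::size_t bin =
            static_cast<std::size_t>(samples[i].first) * bins + samples[i].second;
        ++perThread[t][bin];  // private to thread t, no lock required
      }
    });
  }
  for (std::thread& w : workers) {
    w.join();
  }

  // Single-threaded reduction of the per-thread histograms.
  Histogram joint(tableSize, 0);
  for (const Histogram& h : perThread) {
    for (std::size_t j = 0; j < tableSize; ++j) {
      joint[j] += h[j];
    }
  }
  return joint;
}
```

The trade-off matches the "sacrifice some memory" item in the workplan: memory grows with the number of threads times the histogram size, in exchange for a lock-free loop over samples.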
Results
- Speedup on a dual-core system is about 30% (reduction in computation time) when using linear transform and linear interpolation and about 45% when using bspline transform and bspline interpolation.
Performance Testing Results
GetValue Test at Identity Parameters
 // Print out a line with the test information
 std::cout << "GetValue2,";
 std::cout << metric->GetNameOfClass() << "," << interpolator->GetNameOfClass();
 std::cout << "," << transform->GetNameOfClass();

 // Make a time probe
 itk::TimeProbe timeProbe;

 // Run at the identity transform parameters.
 unsigned int numIters = 100;
 timeProbe.Start();
 for (unsigned int iter = 0; iter < numIters; iter++)
   {
   value = metric->GetValue( identityParameters );
   }
 timeProbe.Stop();

 // Print out the number of samples
 std::cout << "," << metric->GetNumberOfPixelsCounted();

 // Print out the time result.
 std::cout << "," << timeProbe.GetMeanTime()/numIters << std::endl;
GetValueAndDerivative Test at Identity Parameters
 // Print out a line with the test information
 std::cout << "GetValueAndDerivative2,";
 std::cout << metric->GetNameOfClass() << "," << interpolator->GetNameOfClass();
 std::cout << "," << transform->GetNameOfClass();

 // Make a time probe
 itk::TimeProbe timeProbe;

 // Evaluate at the identity transform
 unsigned int numIters = 100;
 timeProbe.Start();
 for (unsigned int iter = 0; iter < numIters; iter++)
   {
   metric->GetValueAndDerivative( identityParameters, value, derivative );
   }
 timeProbe.Stop();

 // Print out the number of samples
 std::cout << "," << metric->GetNumberOfPixelsCounted();

 // Print out the time result.
 std::cout << "," << timeProbe.GetMeanTime()/numIters << std::endl;
Preliminary Results
January 5, 2008 - Note: "Opt" results are not using the OptLinearInterpolateImageFunction.
- MattesMI GetValue Results
- MattesMI, b-spline interpolation and transform, GetValue Results
- MeanSquares GetValue Results
- MattesMI GetValueAndDerivative Results
- MattesMI, b-spline interpolation and transform, GetValueAndDerivative Results
- MeanSquares GetValueAndDerivative Results
Events
- April 6, 2007: TCon
- April 12, 2007: TCon
- April 18, 2007: TCon
- May 1, 2007: TCon
- June 27, 2007: NAMIC Programmers' Week
- January, 2008: NAMIC AHM
Related Pages
- Non Rigid Registration
- Slicer3:Performance_Analysis
- User:Barre/ITK Registration Optimization
- Testing and ITK Backward Forward Compatibility
Performance Measurement
- LTProf - simple profiler for Windows - Shareware
- Intel's VTune for Linux ($)
- TAU
- Threadmon: Thread usage/blockage
- TotalView ($)
- PerfSuite (POSIX Threads)
- GProf work-around for multi-threaded apps
- References on multi-threaded profiling and code optimization