Difference between revisions of "ITK Registration Optimization"
From NAMIC Wiki
Line 31: | Line 31: | ||
# update your Slicer's getbuildtest.tcl and getbuildtest2.tcl files to perform a cvs checkout of itk | # update your Slicer's getbuildtest.tcl and getbuildtest2.tcl files to perform a cvs checkout of itk | ||
# update your ITK build process to enable the use of code in ITK's review directory (which is still in-development) and to specifically enable the use of optimized registration methods within that review directory. | # update your ITK build process to enable the use of code in ITK's review directory (which is still in-development) and to specifically enable the use of optimized registration methods within that review directory. | ||
+ | # enable building the RegisterImages module | ||
=== If you are using getbuildtest.tcl === | === If you are using getbuildtest.tcl === | ||
Line 57: | Line 58: | ||
-DITK_USE_REVIEW:BOOL=ON \ | -DITK_USE_REVIEW:BOOL=ON \ | ||
-DITK_USE_OPTIMIZED_REGISTRATION_METHODS:BOOL=ON \ | -DITK_USE_OPTIMIZED_REGISTRATION_METHODS:BOOL=ON \ | ||
+ | |||
+ | === Enable the RegisterImages module === | ||
+ | * Slicer3/Applications/CLI/CMakeLists.txt | ||
+ | ** Line 447 (or there about): add the line | ||
+ | SUBDIRS( RegisterImagesModule ) | ||
+ | |||
+ | === Done === | ||
+ | * With the above completed, perform getbuildtest.tcl or getbuildtest2.tcl as normal. | ||
+ | * If you have questions, please contact stephen.aylward(at)kitware.com | ||
+ | ** Special thanks to Matt Turek and Luis Ibanez! | ||
+ | |||
+ | === Known issues === | ||
+ | * BSplines registration is still a memory hog. We're working on it. | ||
== Status and News == | == Status and News == |
Revision as of 02:51, 7 February 2008
Home < ITK Registration OptimizationContents
Summary
Goals
There are two components to this research
- Identify registration algorithms that are suitable for non-rigid registration problems that are endemic to NA-MIC
- Develop implementations of those algorithms that take advantage of multi-core and multi-processor hardware
Steps involved
- Modify ITK's registration framework to support oriented images
- Modify ITK's registration framework to be thread safe
- Develop multi-threaded versions of select registration modules
- Make everything backward compatible with ITK's existing registration methods and framework
- Deliver in ITK
- Develop helper classes and write IJ article
Target date for these deliverables: Jan 1, 2008
Planned follow-on work
- Implement helper-classes in ITK that combine common and effective registration components
- e.g., b-spline deformable registration using LBFSGB optimizer and Mattes MI metric
- Deliver b-spline deformable registration using LBFSGB optimizer and Mattes MI metric as a multi-threaded Slicer modules
Target date for the follow-on work: Jan 15, 2008
Instructions for early adopters
These instructions will tell you how to
- update your Slicer's getbuildtest.tcl and getbuildtest2.tcl files to perform a cvs checkout of itk
- update your ITK build process to enable the use of code in ITK's review directory (which is still in-development) and to specifically enable the use of optimized registration methods within that review directory.
- enable building the RegisterImages module
If you are using getbuildtest.tcl
- Slicer3/slicer_variables.tcl
- Line 82 (or there about): change it
SET ::ITK_TAG "HEAD"
- Slicer3/Scripts/genlib.tcl
- Line 636 (add the following to cmake definitions)
-DITK_USE_REVIEW:BOOL=ON \ -DITK_USE_OPTIMIZED_REGISTRATION_METHODS:BOOL=ON \
- Line 646 (add the following to cmake definitions)
-DITK_USE_REVIEW:BOOL=ON \ -DITK_USE_OPTIMIZED_REGISTRATION_METHODS:BOOL=ON \
If you are using getbuildtest2.tcl
- Slicer3/slicer_variables2.tcl
- Line 83 (or there about): change it
SET ::ITK_TAG "HEAD"
- Slicer3/Scripts/genlib2.tcl
- The svn checkout needs to instead do a cvs update. You need to have cvs in your system path. You should edit the runcmd svn line to instead read:
runcmd cvs -d :pserver:anonymous@www.itk.org:/cvsroot/Insight co Insight
- Line 671 (add the following to cmake definitions)
-DITK_USE_REVIEW:BOOL=ON \ -DITK_USE_OPTIMIZED_REGISTRATION_METHODS:BOOL=ON \
- Line 684 (add the following to cmake definitions)
-DITK_USE_REVIEW:BOOL=ON \ -DITK_USE_OPTIMIZED_REGISTRATION_METHODS:BOOL=ON \
Enable the RegisterImages module
- Slicer3/Applications/CLI/CMakeLists.txt
- Line 447 (or there about): add the line
SUBDIRS( RegisterImagesModule )
Done
- With the above completed, perform getbuildtest.tcl or getbuildtest2.tcl as normal.
- If you have questions, please contact stephen.aylward(at)kitware.com
- Special thanks to Matt Turek and Luis Ibanez!
Known issues
- BSplines registration is still a memory hog. We're working on it.
Status and News
- Have developed mult-threaded registration metrics in ITK
- Lead to the discovery that ITK's registration framework was not thread safe.
- Making ITK's registration framework thread safe is conceptually a bug fix for ITK.
- The incomplete implementation of oriented images in ITK has greatly extended the time and effort needed for this project.
- Fixing this must be done in a manner that maintains ITK's backward compatibility.
- This is a major effort involving approximately 50,000 lines of new code and over 400 new tests in ITK.
- We have chosen to spend the time to integrate with ITK because it will serve the broader community, it will benefit from the support of the broader community, it will avoid having to incorporate another SVN checkout into Slicer's build process, and it will keep us from having to maintain and monitor separate dashboards for this effort.
- Weekly tcons, Monday, 10am
- Luis Ibanez, Matt Turek, Stephen Aylward
- Active proposal to the ITK community:
- Project plan
- IJ article on oriented images and registration in ITK
- http://www.insight-journal.org/dspace/bitstream/1926/1293/2/Brooks_Arbel_FastOrientedImage_V1.pdf
- Solution presented by the authors is closely related to the changes being made in ITK
Publications
- Aylward, Stephen; Jomier, Julien; Barre, Sebastien; Davis, Brad; Ibanez, Luis, "Optimizing ITK’s Registration Methods for Multi-processor, Shared-Memory Systems." MICCAI Open Source and Open Data Workshop, 2007 (Download PDF)
Quick Links
- Dashboard for this project
- Dashboard for BatchMake
- Batchboard (nightly experiment results) for this project
- BWH Neuroimaging Analysis Center (NAC), 2007-2008: Grid Enabled ITK
Algorithmic Requirements and Use Cases
- Requirements
- relatively robust, with few parameters to tweak
- runs on grey scale images
- has already been published
- relatively fast (ideally speaking a few minutes for volume to volume).
- not patented
- can be implemented in ITK and parallelized.
- Use-cases
- Intersubject mapping
- Example data set (Kilian)
- fMRI to hi-res brain morphology mapping
- Example data set (Steve Pieper)
- DTI: components of the diffusion tensor
- Example data (Sylvain)
- Intersubject mapping
Hardware Platform Requirements and Use Cases
- Requirements
- Shared memory
- Single and multi-core machines
- Single and multi-processor machines
- AMD and Intel - Windows, Linux, and SunOS
- Use-cases
- Intel Core2Duo
- Intel quad-core Xeon processors, Visual Studio 8, Windows Vista (Kitware: redwall)
- 6 CPU Sun, Solaris 8 (SPL: vision)
- 12 CPU Sun, Solaris 8 (SPL: forest and ocean)
- 16 core Opteron (SPL: john, ringo, paul, george)
- 16 core, Sun Fire, AMDOpteron (UNC: Styner)
Data
- Now distributed with CVS
Workplan
Establish testing and reporting infrastructure
- Identify timing tools
- Cross platform and multi-threaded
- Timing and profiling
- Develop performance dashboard for collecting results
- Each test will report time and accuracy to a central server
- The performance of a test, over time, for a given platform can be viewed on one page
- The performance of a set of tests, at one point in time, for all platforms can be viewed on one page
Develop tests
- Develop modular tests
- Develop complete registration solutions for use cases
ITK Optimization
- Target bottlenecks
- Multi-thread metric calculation
- Initial target is MattesMutualInformationImageToImageMetric
- Optimize code
- Sacrifice some memory and algorithm initialization speed to gain algorithm operation speed increases
- Call multi-threaded functions when possible
- Multi-thread metric calculation
- Integrate metrics with transforms and interpolators for tailored performance
Example Results: MattesMutualInformationImageToImageMetric
Example of Optimizations Employed
- GetValue
- Added multi-threading to GetValue function
- Partitions the samples - thereby distributes the computation of the transforms and interpolations across threads
- Added the pre-computation of the FixedImageMarginalPDF for the sample to reduce the need for the thread mutex lock
- Required the concept of an AdjustedFixedImageMarginalPDF that is updated when a fixed image voxel does not map into the moving image and thereby isn't valid for the current computations. By only updating when samples are missed, mutex lock to update a cross-thread data structure is needed less often.
- Each thread now has its own copy of the joinPDF. After threads complete, jointPDFs from each thread are summed. This eliminates mutex from the main loop over samples.
- Added multi-threading to GetValue function
Results
- Speedup on a dual-core system is about 30% (reduction in computation time) when using linear transform and linear interpolation and about 45% when using bspline transform and bspline interpolation.
Performance Testing Results
GetValue Test at Identity Parameters
// Print out a line with the test information std::cout << "GetValue2,"; std::cout << metric->GetNameOfClass() << "," << interpolator->GetNameOfClass(); std::cout << "," << transform->GetNameOfClass(); // Make a time probe itk::TimeProbe timeProbe; // Run at the identity transform parameters. unsigned int numIters = 100; timeProbe.Start(); for (unsigned int iter = 0; iter < numIters; iter++) { value = metric->GetValue( identityParameters ); } timeProbe.Stop(); // Print out the number of samples std::cout << "," << metric->GetNumberOfPixelsCounted(); // Print out the time result. std::cout << "," << timeProbe.GetMeanTime()/numIters << std::endl;
GetValueAndDerivative Test at Identity Parameters
// Print out a line with the test information std::cout << "GetValueAndDerivative2,"; std::cout << metric->GetNameOfClass() << "," << interpolator->GetNameOfClass(); std::cout << "," << transform->GetNameOfClass(); // Make a time probe itk::TimeProbe timeProbe; // Evaluate at the identity transform; unsigned int numIters = 100; timeProbe.Start(); for (unsigned int iter = 0; iter < numIters; iter++) { metric->GetValueAndDerivative( identityParameters, value, derivative ); } timeProbe.Stop(); // Print out the number of samples std::cout << "," << metric->GetNumberOfPixelsCounted(); // Print out the time result. std::cout << "," << timeProbe.GetMeanTime()/numIters << std::endl;
Preliminary Results
January 5, 2008 - Note: "Opt" results are not using the OptLinearInterpolateImageFunction.
- MattesMI GetValue Results
- MattesMI, b-spline interpolation and transform, GetValue Results
- MeanSquares GetValue Results
- MattesMI GetValueAndDerivative Results
- MattesMI, b-spline interpolation and transform, GetValueAndDerivative Results
- MeanSquares GetValueAndDerivative Results
Events
- April 6, 2007: TCon
- April 12, 2007: TCon
- April 18, 2007: TCon
- May 1, 2007: TCon
- June 27, 2007: NAMIC Programmers' Week
- January, 2008: NAMIC AHM
Related Pages
- Non Rigid Registration
- Slicer3:Performance_Analysis
- User:Barre/ITK Registration Optimization
- Testing and ITK Backward Forward Compatibility
Performance Measurement
- LTProf - simple profilter for Windows - Shareware
- Intel's VTune for Linux ($)
- TAU
- Threadmon: Thread usage/blockage
- TotalView ($)
- PerfSuite (POSIX Threads)
- GProf work-around for multi-threaded apps
- References on multi-threaded profiling and code optimization