Mbirn: Progress Report, May 2005
Aim 2.2: Adapt and apply shape-based morphometric tools to investigate clinical and control populations:
- 2.2.1: continue to develop interoperability between segmentation and shape analysis tools through standardized data representation;
Conversion tools and processing techniques have been identified and documented. We have begun creating scriptable conversion routines between the MGH segmentation output and the LDDMM shape analysis pipeline.
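For illustration, a minimal sketch of such a conversion routine in R appears below. The use of FreeSurfer's mri_convert and of Analyze-format (.img/.hdr) input for LDDMM is an assumption here, and the function and path names are hypothetical.

  # Sketch of a scriptable MGH -> LDDMM conversion step (assumes FreeSurfer's
  # mri_convert is on the PATH and that LDDMM accepts Analyze-format volumes).
  convert_mgh_to_analyze <- function(mgh_file, out_dir = ".") {
    stopifnot(file.exists(mgh_file))
    out_file <- file.path(out_dir, sub("\\.mg[hz]$", ".img", basename(mgh_file)))
    status <- system2("mri_convert", args = c(mgh_file, out_file))
    if (status != 0) stop("mri_convert failed for ", mgh_file)
    out_file
  }

  # Hypothetical batch usage over a directory of segmentations:
  # seg_files <- list.files("segmentations", pattern = "\\.mgz$", full.names = TRUE)
  # vapply(seg_files, convert_mgh_to_analyze, character(1), out_dir = "lddmm_input")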
- 2.2.2: port and optimize shape analysis tools for large-scale computation using compute grid technology;
The 1.0.1 release of LDDMM has been completed. This version incorporates the Intel optimization components and has improved portability. The software now supports Linux on ia32, ia64, and x86_64, MS Windows under Cygwin, and the IBM P690. The current release notes are available at http://www.cis.jhu.edu/~timothy/lddmmreleasenotes.html.
- 2.2.3: develop and apply shape-based metrics to differentiate diseased populations from normal populations statistically;
Statistical analysis has been performed on the unscaled (raw) and scaled (scale-adjusted, i.e. normalized for scale) hippocampus data sets. Three technical reports have been prepared, and four drafts are being prepared for publication in the relevant literature.
R scripts are being documented. An internal Rweb server has been created, and the key R libraries have been identified and installed on it. The scripts used for the statistical analysis will be refined in the coming months and incorporated into an automated statistical analysis process.
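As a rough sketch of what such an automated process could look like (the file layout, function name, and choice of statistical routines below are illustrative assumptions, not the actual scripts), the core steps reduce to reading an interpoint distance matrix and dispatching a few standard R analyses:

  # Illustrative skeleton of an automated analysis driver; names and outputs
  # are placeholders rather than the production scripts.
  run_shape_analysis <- function(dist_file, group_file, out_dir = "results") {
    D      <- as.matrix(read.table(dist_file))   # interpoint distance matrix
    groups <- factor(read.table(group_file)$V1)  # one group label per subject
    dir.create(out_dir, showWarnings = FALSE)

    # Classical multidimensional scaling as a generic first look at the distances
    coords <- cmdscale(D, k = 2)
    write.csv(coords, file.path(out_dir, "mds_coordinates.csv"))

    # Rank-based test for a group difference in mean interpoint distance
    # (assumes exactly two groups)
    mean_dist <- rowMeans(D)
    test <- wilcox.test(mean_dist ~ groups)
    capture.output(test, file = file.path(out_dir, "group_test.txt"))
    invisible(test)
  }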
- 2.2.4: provide visualization tools that allow end-user interpretation of the shape analysis results in the context of subject anatomy and other morphometry information.
An LDDMM visualization plugin exists for 3D Slicer 2.4, the version included in the BIRN 2.0 distribution release.
Additional key items
Increased our local Itanium2 cluster from 8 to 16 nodes. This cluster is used for development work supporting BIRN and for porting morphometric applications to the TeraGrid.
An additional 16TB of storage has been purchased and installed. 12TB will be integrated into the existing BIRN SRB infrastructure, and the remaining 4TB will be used to hold local data supporting the Itanium2 cluster.
The scale-adjusted LDDMM processing was completed in January 2005, yielding a 45x45 interpoint distance matrix. The nearest-neighbor structures based on the unscaled and scale-adjusted distances were analyzed, and the correlation between scale and other size measures, such as volume and surface area, was investigated. The interpoint distances were also used for classification, and the distances to an "optimal" reference hippocampus were used in statistical tests for group differences. The statistical analysis of the scale-adjusted distances was then compared with the corresponding analysis of the unscaled distances.

When scaling was done with respect to the same reference hippocampus for both the left and right hippocampi, the scaling proved (almost) optimal for the right hippocampus data sets but not for the left. A second rescaling of the left hippocampus data was therefore performed using a more suitable reference, and the analysis of the left (scale-adjusted) hippocampi showed that careful selection of the scaling reference produces better results. Taken together, these analyses will allow us to provide a goal-oriented practical road map (e.g., what to run for classification, nearest-neighbor structure, group differences, etc.).
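To illustrate the kind of computation involved (the actual analysis scripts may differ; the distance matrix D, the group labels, and the reference index below are placeholders), the nearest-neighbor structure, a simple leave-one-out classification rate, and a group-difference test can all be read directly off an interpoint distance matrix:

  # Sketch: analyses driven by an n x n interpoint (LDDMM metric) distance
  # matrix D with one group label per subject; all inputs are placeholders.
  nearest_neighbors <- function(D) {
    diag(D) <- Inf                 # exclude self-matches
    apply(D, 1, which.min)         # index of each subject's nearest neighbor
  }

  loo_nn_classify <- function(D, groups) {
    nn <- nearest_neighbors(D)
    mean(groups[nn] == groups)     # leave-one-out 1-NN classification rate
  }

  # Compare distances to a chosen reference ("optimal") hippocampus across two groups
  group_difference_test <- function(D, groups, ref_index) {
    d_ref <- D[ref_index, -ref_index]
    g     <- factor(groups[-ref_index])
    wilcox.test(d_ref[g == levels(g)[1]], d_ref[g == levels(g)[2]])
  }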
We have become much more successful with LDDMM processing on the TeraGrid. During the second (scale-adjusted) LDDMM run, we collected only the data needed for the statistical analysis, which greatly decreased the amount of output that had to be transferred to the SRB and allowed greater throughput for LDDMM processing. We completed 2025 LDDMM jobs in less than 5 days; in our previous TeraGrid runs, the same 2025 jobs would have taken about 3 weeks. The 2025 jobs used roughly 20,000 CPU-hours of processing. The improvement came largely from running 250 simultaneous jobs, as opposed to only 75 simultaneous jobs in our first runs. These TeraGrid processing improvements are necessary for our future processing of larger data sets.
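For context, a back-of-the-envelope throughput estimate (assuming jobs of roughly uniform length and ignoring queueing and data-transfer overhead) is consistent with these figures:

  # Rough TeraGrid throughput estimate from the numbers in this report
  total_jobs      <- 2025
  total_cpu_hours <- 20000
  hours_per_job   <- total_cpu_hours / total_jobs          # ~9.9 CPU-hours per job

  wall_clock_days <- function(concurrent_jobs)
    ceiling(total_jobs / concurrent_jobs) * hours_per_job / 24

  wall_clock_days(75)    # ~11 days of pure compute at the earlier concurrency
  wall_clock_days(250)   # ~3.7 days at 250 simultaneous jobs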
Another important note is that the 4TB of data from the scale-adjusted processing was written to the SDSC BIRN site, because JHU did not have enough storage to hold these runs. Rather than delay the processing, we were able to share resources with SDSC. When our new storage becomes available, the data will be transferred to the JHU site.