Difference between revisions of "Slicer3:Grid Interface UseCases"

From NAMIC Wiki
=== Use Cases ===
<big>'''Note:''' We are migrating this content to the slicer.org domain - <font color="orange">The newer page is [https://www.slicer.org/wiki/Slicer3:Grid_Interface_UseCases here]</font></big>
 
 
Here are two base use cases that we can consider as an initial step in moving the Grid Interface (Grid Wizard) into NA-MIC DBP analyses. These two use cases are "deliverables" for NAMIC, but on different time frames: the first use case should be completed, documented, demonstrated, and in the hands of users by early August; the second should be completed, documented, demonstrated, and in the hands of users by mid-to-late September.
 
 
 
==== EM Segmentation ====
 
 
 
The EM (Expectation Maximization) Segmenter is an algorithm that performs a two-step iterative optimization procedure to separate "stuff" from "non-stuff" (where usually "stuff" is "white matter" and "non-stuff" is everything else).  The algorithm itself is slow, and it frequently needs to be run on many data sets as an initial step in some larger scientific analysis problem.  One of the main problems with EM Segmentation is that configuring a set of parameters for a segmentation run is, to put it mildly, more of an art than a specific procedure.  We propose that Slicer's main function here is as an exploratory data analysis platform:
 
 
 
# User loads an image into Slicer
 
# User goes through a several-step process of configuring and initializing the EM algorithm
 
# User performs EM segmentation on a subsection of a single volume
 
# User goes back to step 2, until satisfied that the EM iterations are converging to a useful (though local) optimum.
 
# User saves a "parameter file" [need more info here about what this is] and then performs the EM segmentation algorithm on the whole volume.
 
# User looks wistfully at a directory, and wishes that she could just "run it on all those files", rather than repeating the process from step 1.
 
 
 
There are a couple of problems with these last two steps, including the general warning from Slicer 101: don't run this on a machine with less than 2 GB of RAM, and even that might not be enough sometimes. This is an obvious case where we can benefit from performing an initial analysis on a cluster, and then performing a batch analysis "in the large" on a cluster.
 
 
 
What we propose as a software deliverable comes in four parts:
 
# A configured GridWizard system for the NAMIC cluster that a user can install into his or her own workspace (users can reconfigure for other clusters, but this may not be terribly friendly)
 
# An EM Segmenter Slicer command-line module that allows a user to perform a single segmentation on a cluster, or a batch segmentation on a cluster
 
# Documentation on how to use both of the above
 
# A roll, RPM, or tarball for EMSegmenter for 32-bit and 64-bit Linux clusters so that it can easily be installed on many compute nodes simultaneously
 
 
 
In these use cases, we presume that the user is, virtually speaking, inside the SPL proper and does not need to contend with gatekeeper's S/KEY password system.  We also assume the following "preamble" to each use:
 
# User loads an image into Slicer
 
# User configures EM Segmenter
 
# User performs small segmentation locally
 
 
 
We propose two modes of use for EMSegmenter: simple and batch. "Batch" here simply means "the processing of multiple files simultaneously"; in each case, the job physically runs on a batch-oriented processing system.
 
 
 
* Simple mode
 
# Preamble...
 
# User launches Segmentation on a cluster (specifically, the NAMIC cluster)
 
# Window pops up with the task to be performed
 
# User reviews the "task list" and clicks "run"
 
# Job starts
 
# Job output ends up on an sshfs-mounted directory
 
# User can reload results into Slicer, check for accuracy.
 
* Batch mode ("mega-mode")
 
# Preamble...
 
# User chooses a directory
 
# User enters a file glob (that is, a filter) into a parameter box in Slicer; this file glob applies to files in the chosen data directory but is not recursive, and it may not adhere to strict POSIX guidelines (? or [] might not be implemented)
 
# User chooses "launch"
 
# Window pops up with the list of tasks that need to be performed
 
# User reviews the task list, and clicks "run"
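
The directory-plus-glob step above can be sketched as follows. This is a minimal illustration, not GridWizard or Slicer code: the function names <code>matches_star_only</code> and <code>build_task_list</code> are hypothetical, and the sketch assumes that only <code>*</code> acts as a wildcard (so <code>?</code> and <code>[]</code> are treated as literal characters) and that subdirectories are ignored.

```python
import os
import re


def matches_star_only(pattern, name):
    """Match name against a pattern in which only '*' is a wildcard."""
    # Escape every character, then turn the escaped '*' back into
    # "any run of characters". '?' and '[...]' stay literal (assumption).
    regex = re.escape(pattern).replace(r"\*", ".*")
    return re.fullmatch(regex, name) is not None


def build_task_list(directory, pattern):
    """Return the sorted paths of regular files directly inside directory
    whose names match pattern. Subdirectories are not searched."""
    tasks = []
    with os.scandir(directory) as entries:
        for entry in entries:
            # Directories are skipped entirely: the filter is not recursive.
            if entry.is_file() and matches_star_only(pattern, entry.name):
                tasks.append(entry.path)
    return sorted(tasks)
```

The resulting list is what the user would review in the task window before clicking "run"; one segmentation task would be created per path.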
 
 
 
In either case, a future version should include some additional features in the "job manager" window that pops up when the user wants to run batch jobs:
 
 
 
# Ability to monitor jobs
 
# Ability to inspect job outputs (PBS stderr streams, etc.)
 
# Ability to inspect the job artifacts (parallelization scripts)
 
# Ability to choose the scheduling algorithm (and all that implies): do it all statically, or on a first-come-first-served basis.
 
# Ability to select multiple clusters, connected remotely
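
To illustrate why the choice between static and first-come-first-served scheduling matters, here is a small simulation. It is a sketch under stated assumptions, not GridWizard's scheduler: "static" is interpreted as a round-robin assignment made before any job starts, the function names are hypothetical, and the task durations in the example are made up.

```python
import heapq


def static_makespan(durations, n_workers):
    """Assign task i to worker i % n_workers up front (assumed meaning of
    'static'); return the time at which the last worker finishes."""
    loads = [0] * n_workers
    for i, duration in enumerate(durations):
        loads[i % n_workers] += duration
    return max(loads)


def fcfs_makespan(durations, n_workers):
    """Hand each task, in submission order, to whichever worker becomes
    free first; return the time at which the last worker finishes."""
    free_at = [0] * n_workers  # min-heap of times at which workers are free
    heapq.heapify(free_at)
    for duration in durations:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + duration)
    return max(free_at)


# Hypothetical task durations (arbitrary time units) on two workers.
durations = [8, 1, 8, 1, 1, 1]
print(static_makespan(durations, 2))  # 17
print(fcfs_makespan(durations, 2))    # 10
```

When task durations vary widely, as they may for segmentation runs on volumes of different sizes, handing out work as workers become free tends to balance the load better than a fixed up-front assignment; a static assignment is simpler and needs no coordination while jobs run.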
 
 
 
Many of these features are already present in GridWizard, but validation that they work with the above use case needs to be done ''after'' the software is successfully demonstrated.
 
 
 
==== SPHARM: Spherical Harmonic-based Brain Shape Analysis ====
 
 
 
SPHARM is a numerical algorithm aimed at analyzing shape differences between medical volumes. [this description is probably wrong --- need to read the SPHARM paper, obviously] The overall procedure involves taking a volume and essentially performing a Fourier-like analysis where the volume is decomposed into spherical harmonics (the spherical harmonics here taking the place of sinusoidal functions in a standard Fourier transform).  If you have two volumes and you perform this mapping, you can examine the coefficients of corresponding harmonics to see where bulges or shrinkage occur, etc.
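
For reference, the Fourier analogy can be written out for a generic function on the sphere. This is the standard spherical harmonic expansion, not a statement of what the SPHARM toolkit computes internally; how the toolkit turns a volume into such a function is not described here, and the symbols <code>f</code>, <code>c_l^m</code>, and the labels A and B are introduced only for illustration.

```latex
% Fourier series on the circle: sinusoids e^{ikt} as basis functions
f(t) = \sum_{k=-\infty}^{\infty} c_k \, e^{ikt}

% Analogous expansion on the sphere: spherical harmonics Y_l^m as basis functions
f(\theta, \phi) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} c_l^m \, Y_l^m(\theta, \phi)

% Coefficients by projection onto each (orthonormal) harmonic
c_l^m = \int_0^{2\pi} \int_0^{\pi} f(\theta, \phi) \,
        \overline{Y_l^m(\theta, \phi)} \, \sin\theta \; d\theta \, d\phi

% Comparing two shapes A and B harmonic by harmonic
\Delta_l^m = c_l^m(A) - c_l^m(B)
```

Low degrees <code>l</code> capture coarse shape and higher degrees capture finer detail, which is why comparing corresponding coefficients can localize differences in scale.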
 
 
 
A key step in the SPHARM processing is the initial mapping of an arbitrary volume onto spherical coordinates.  The software that performs this is part of the SPHARM 1.7 toolkit, using the command (tbd).  We would like to be able to take a large list of volumes and process these through the first step in parallel.  The time savings are obvious: 60 volumes acquired throughout a day can be processed in parallel in under half an hour on a moderate cluster; on a single processor, it could take half a week or longer.
 
 
 
SPHARM, and the way that it is used in dtMRI imaging, is different from EMSegmenter.  Here, we can rely on two things:
 
 
 
# The user does not need to start Slicer in order to start a job
 
# The user is comfortable with command-line applications
 
 
 
This is an ideal situation for <code>gwiz-run</code>, a command-line application scheduler.
 
 
 
The software we propose for this use case comes in three parts:
 
 
 
# A configured GridWizard for the NAMIC cluster, which can be installed by a user into his or her own workspace
 
# A software package (RPM, tarball, or rocks roll) for 32- and 64-bit clusters to install the SPHARM software
 
# A set of "template" commands that the user can apply to process large numbers of images
 
 
 
Like the EMSegmenter use case above, we assume that the user is virtually inside the SPL and does not have to contend with the gatekeeper login system, or is using the BIRN cluster. Similarly, the cluster will need to have an <code>sshfs</code> filesystem mounted to some remote data repository (or the cluster needs to be the repository). The usage we propose of SPHARM is as follows:
 
 
 
# User gathers a large set (>= 25, <= 500) of files to process
 
# User picks a set of configuration parameters
 
# User goes to a NAMIC wiki page listing a "template command"
 
# User runs the <code>gwiz-gui</code> application
 
# User copies and pastes the template command into the GUI and edits it as appropriate
 
# User reviews task list, optionally chooses to change the scheduling algorithm's parameters, clicks "Go"
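
The idea of a "template command" applied to many files can be sketched as below. This is only an illustration of the concept: the actual GridWizard template syntax is not specified here, the SPHARM command itself is still to be determined, and the placeholder names <code>$input</code> and <code>$output</code>, the function <code>expand_template</code>, and the tool name <code>some_tool</code> are all hypothetical.

```python
import posixpath
import shlex
from string import Template


def expand_template(template, input_paths, output_dir):
    """Return one shell command line per input file.

    template uses the hypothetical placeholders $input and $output;
    each output file keeps the input's base name inside output_dir.
    """
    commands = []
    for path in input_paths:
        out = posixpath.join(output_dir, posixpath.basename(path))
        # Quote paths so that names containing spaces survive the shell.
        commands.append(
            Template(template).substitute(
                input=shlex.quote(path), output=shlex.quote(out)
            )
        )
    return commands


# 'some_tool' is a stand-in, not a real SPHARM command.
for line in expand_template(
    "some_tool --in $input --out $output",
    ["/data/a.nrrd", "/data/my scan.nrrd"],
    "/results",
):
    print(line)
```

Each generated line corresponds to one entry in the task list that the user reviews before clicking "Go".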
 
 
 
The "nice-to-haves" listed for EMSegmenter apply here as well.
 

Latest revision as of 18:01, 10 July 2017
