Difference between revisions of "Engineering:UCSD"

From NAMIC Wiki
Line 1: Line 1:
 
Back to [[Engineering:Main|NA-MIC Engineering]]
 
  
= UCSD (PI: Mark Ellisman) =
+
= UCSD (PI: Jeffrey S. Grethe) =
  
 
[[Image:GWE-logo.jpg]]
 
  
 
= Overview =
 
A core activity of UCSD is the development of infrastructure and support for the utilization of distributed computation resources (i.e. Grid Wizard Enterprise).  This infrastructure allows, for example, Slicer3 to execute work in a distributed grid environment and enable NA-MIC algorithms to be tested in such a distributed environment.  This will allow for much quicker validation of algorithms developed in Core 1 and also test effects of parameter settings, through large scale parameter searches. Work described in the previous progress report led to a prototype grid interface (aka Grid Wizard or gwiz). The overall purpose of this work in NA-MIC is to facilitate a "run-everywhere" philosophy for algorithm developers. By adopting a standard for algorithm "self-description" that is followed when command line executables are written, Slicer and distributed computational environments should be able to use the executables directly in their environment.   
+
A core activity of UCSD is the development of infrastructure and support for the utilization of distributed computation resources (i.e. Grid Wizard Enterprise).  This infrastructure allows, for example, Slicer3 to execute work in a distributed grid environment and enable NA-MIC algorithms to be tested in such a distributed environment.  This will allow for much quicker validation of algorithms developed in Core 1 and also test effects of parameter settings, through large scale parameter exploration experiments. Work described in the previous progress report led to a prototype grid interface (aka Grid Wizard or gwiz). The overall purpose of this work in NA-MIC is to facilitate a "run-everywhere" philosophy for algorithm developers. By adopting a standard for algorithm "self-description" that is followed when command line executables are written, Slicer and distributed computational environments should be able to use the executables directly in their environment.   
  
 
A basic requirement for the GWE environment is that a researcher expert in a particular scientific discipline should not need to also become an expert in grid computing in order to produce an application that uses grid technology.  It is important to note that GWE is not meant to be another grid middleware package, rather, it is meant to be a large-scale job launching and management tool that bridges the gulf between these biomedical researchers and current grid middleware by:
 
Line 13: Line 13:
 
* Allowing a researcher to easily specify large parametric computational jobs using the same general syntax as is used in the command line invocation of the analysis algorithms (e.g. through the P2EL language) or through integration with community developed biomedical applications (e.g. Slicer3).
 
 
* Managing the most common house keeping tasks required to ensure end-to-end success of a computation thereby relieving the researcher of this burden.
 
 +
* Providing the researchers with the ability to easily review/manipulate the input and output data associated with the parameter exploration experiments executed.
  
= Grid Wizard Enterprise (GWE) Background =
+
= Systems =
The field of high performance computing (HPC) has provided a wide array of strategies for supplying additional computing power to the goal of reducing the total “clock time” required to complete large scale analyses. These strategies range from the development of higher performance hardware to the assembly of large networks of commodity computers. However, for the non-computational scientist wishing to utilize these services, usable software remains elusive. Here we present a software design and implementation of a tool, Grid Wizard Enterprise (GWE; http://www.gridwizardenterprise.org/), aimed at providing a solution to the particular problem of the adoption of advanced grid technologies by biomedical researchers.  GWE provides an intuitive environment and tools that bridge this gulf between the researcher and current grid technologies allowing them to run inter-independent computational processes faster by brokering their execution across a virtual grid of computational resources with a minimum of user intervention.  The GWE architecture has been designed in close collaboration with NA-MIC researchers and supports the majority of every-day tasks performed by computational scientists in the fields of computational biology and medical image analysis.
 
  
= GWE Information =  
+
== Grid Wizard Enterprise (GWE) ==
 +
The field of high performance computing (HPC) has provided a wide array of strategies for supplying additional computing power to the goal of reducing the total “clock time” required to complete large scale analyses. These strategies range from the development of higher performance hardware to the assembly of large networks of commodity computers. However, for the non-computational scientist wishing to utilize these services, usable software remains elusive. Here we present a software design and implementation of a system, Grid Wizard Enterprise (GWE; http://www.gridwizardenterprise.org/), aimed at providing a solution to the particular problem of the adoption of advanced grid technologies by biomedical researchers.  GWE provides an intuitive environment and tools that bridge this gulf between the researcher and current grid technologies allowing them to run inter-independent computational processes faster by brokering their execution across a virtual grid of computational resources with a minimum of user intervention.  The GWE architecture has been designed in close collaboration with NA-MIC researchers and supports the majority of every-day tasks performed by computational scientists in the fields of computational biology and medical image analysis.
 +
 
 +
[[File:Wcp-queue-1.png]]
 +
 
 +
[[File:Wcp-order-3.png]]
 +
 
 +
 
 +
 
 +
== Record Set Explorer (GWE's RSE) ==
 +
Parameter exploration experiments consist of (besides the abstract description of the experiment - algorithm/workflow) input parameter values and output result values. To review/manipulate these values (considering their typical large amount) a researcher is forced to deal with a great degree of manual work and error. These common tasks, performed before and after running the parameter exploration experiments (using automated systems such as GWE) can be eased by means of using an end user tool with the appropriate features. Here we present a software design and implementation of a system, Record Set Explorer (RSE; http://www.gridwizardenterprise.org/rse), aimed at providing a generic; but powerful and easy to use tool for users to interactively explore and manipulate record sets. RSE provides an easy to use tool that bridges the management of data to review/create/edit the input and output values of parameter exploration experiments. Its generality provides 3rd party adopters to easily extend the tool to include different source types for record sets such as CSV formatted files, XNAT databases (support already included in the core release of RSE), XML+XSL files, Python data structures, JSON formatted files, etc.
 +
 
 +
[[File:Rse-5-definition-gwe.png]]
 +
 
 +
[[File:Rse-7-exploration-gwe-2.png]]
 +
 
 +
[[File:Rse-gwe-3.png]]
 +
 
 +
 
 +
 
 +
= Information =  
  
 
* GWE  
 
Line 40: Line 60:
 
** [http://www.slicer.org/slicerWiki/images/b/b4/Gwe-advanced-tutorial.ppt GWE Advanced Tutorial]
 
 
** [http://www.slicer.org/slicerWiki/images/8/87/Gslicer-tutorial.ppt GSlicer3 Tutorial]
 
 +
 +
  
 
= Monthly Progress (since July 2007) =
 

Revision as of 23:29, 8 February 2010


Back to NA-MIC Engineering

UCSD (PI: Jeffrey S. Grethe)

GWE-logo.jpg

Overview

A core activity of UCSD is the development of infrastructure and support for the use of distributed computation resources (i.e. Grid Wizard Enterprise). This infrastructure allows, for example, Slicer3 to execute work in a distributed grid environment and enables NA-MIC algorithms to be tested in such an environment. This will allow much quicker validation of algorithms developed in Core 1 and make it possible to test the effects of parameter settings through large-scale parameter exploration experiments. Work described in the previous progress report led to a prototype grid interface (aka Grid Wizard or gwiz). The overall purpose of this work in NA-MIC is to facilitate a "run-everywhere" philosophy for algorithm developers. If command line executables follow a standard for algorithm "self-description" when they are written, Slicer and distributed computational environments should be able to use them directly.

A basic requirement for the GWE environment is that a researcher who is expert in a particular scientific discipline should not also need to become an expert in grid computing in order to produce an application that uses grid technology. It is important to note that GWE is not meant to be another grid middleware package; rather, it is meant to be a large-scale job launching and management tool that bridges the gulf between these biomedical researchers and current grid middleware by:

  • Providing the researcher with the ability to easily configure the heterogeneous clustered/grid resources that they have access to.
  • Allowing a researcher to easily specify large parametric computational jobs using the same general syntax as is used in the command line invocation of the analysis algorithms (e.g. through the P2EL language) or through integration with community-developed biomedical applications (e.g. Slicer3).
  • Managing the most common housekeeping tasks required to ensure end-to-end success of a computation, thereby relieving the researcher of this burden.
  • Providing the researcher with the ability to easily review and manipulate the input and output data associated with the parameter exploration experiments executed.
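
As a rough illustration of the job specification point above, a parametric job can be thought of as a command line template plus a list of values for each parameter. The Python sketch below expands such a template into one command per parameter combination. It does not use P2EL syntax: the `${name}` placeholder style, the `expand_commands` function, the `segment` executable and its options are all hypothetical and chosen only for illustration.

```python
import itertools
from string import Template


def expand_commands(template, parameters):
    """Return one command string per combination of parameter values.

    `template` uses ${name} placeholders (an assumption of this sketch,
    not P2EL syntax); `parameters` maps each name to a list of values.
    """
    names = sorted(parameters)
    commands = []
    for values in itertools.product(*(parameters[name] for name in names)):
        bindings = dict(zip(names, (str(value) for value in values)))
        commands.append(Template(template).substitute(bindings))
    return commands


# Hypothetical executable and parameter names, for illustration only.
commands = expand_commands(
    "segment --iterations ${iterations} --sigma ${sigma} ${image}",
    {
        "iterations": [10, 20],
        "sigma": [0.5, 1.0, 1.5],
        "image": ["case01.nrrd", "case02.nrrd"],
    },
)
```

Here two iteration counts, three sigma values and two images yield twelve independent commands, each of which could be handed to a job launcher.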

Systems

Grid Wizard Enterprise (GWE)

The field of high performance computing (HPC) has provided a wide array of strategies for supplying additional computing power, with the goal of reducing the total “clock time” required to complete large-scale analyses. These strategies range from the development of higher performance hardware to the assembly of large networks of commodity computers. However, for the non-computational scientist wishing to use these services, usable software remains elusive. Here we present the software design and implementation of a system, Grid Wizard Enterprise (GWE; http://www.gridwizardenterprise.org/), aimed at solving the particular problem of the adoption of advanced grid technologies by biomedical researchers. GWE provides an intuitive environment and tools that bridge the gulf between researchers and current grid technologies, allowing them to run mutually independent computational processes faster by brokering their execution across a virtual grid of computational resources with a minimum of user intervention. The GWE architecture has been designed in close collaboration with NA-MIC researchers and supports the majority of everyday tasks performed by computational scientists in the fields of computational biology and medical image analysis.
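
The idea of running independent computational processes concurrently while tracking each outcome can be sketched in a few lines of Python. This is not GWE's implementation: a local thread pool stands in for the grid, and `run_independent_jobs` and `fake_job` are hypothetical names introduced only for this example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_independent_jobs(jobs, run_job, max_workers=4):
    """Run independent jobs concurrently and collect every outcome.

    A local thread pool stands in for the grid (an assumption of this
    sketch). Returns a dict mapping each job to ("ok", result) or
    ("failed", error message).
    """
    outcomes = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(run_job, job): job for job in jobs}
        for future in as_completed(futures):
            job = futures[future]
            try:
                outcomes[job] = ("ok", future.result())
            except Exception as error:  # a failed job must not stop the others
                outcomes[job] = ("failed", str(error))
    return outcomes


def fake_job(sigma):
    """Hypothetical stand-in for a single analysis run."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return sigma * 2


outcomes = run_independent_jobs([0.5, 1.0, -1.0], fake_job)
```

Because the jobs share no state, a failure in one of them is recorded without affecting the rest, which is the kind of bookkeeping a job management tool takes over from the researcher.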

Wcp-queue-1.png

Wcp-order-3.png


Record Set Explorer (GWE's RSE)

Besides the abstract description of the experiment (its algorithm or workflow), a parameter exploration experiment consists of input parameter values and output result values. Because these values are typically numerous, reviewing and manipulating them by hand is laborious and error-prone. These common tasks, performed before and after running parameter exploration experiments with automated systems such as GWE, can be eased by an end-user tool with the appropriate features. Here we present the software design and implementation of a system, Record Set Explorer (RSE; http://www.gridwizardenterprise.org/rse), aimed at providing a generic but powerful and easy-to-use tool for interactively exploring and manipulating record sets. RSE lets users review, create and edit the input and output values of parameter exploration experiments. Its generality allows third-party adopters to easily extend the tool with additional source types for record sets, such as CSV-formatted files, XNAT databases (support already included in the core release of RSE), XML+XSL files, Python data structures, JSON-formatted files, etc.
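
To make the notion of a source-independent record set concrete, the Python sketch below reads records from CSV and JSON text into the same list-of-dictionaries form, joins inputs with outputs, and filters the result. It is only an illustration and does not reflect RSE's actual API; the function names, field names and sample values are hypothetical.

```python
import csv
import io
import json


def records_from_csv(text):
    """Parse CSV text with a header row into a list of dict records."""
    return list(csv.DictReader(io.StringIO(text)))


def records_from_json(text):
    """Parse a JSON array of objects into a list of dict records."""
    return [dict(item) for item in json.loads(text)]


def select(records, predicate):
    """Return the records for which `predicate` is true."""
    return [record for record in records if predicate(record)]


# Hypothetical input parameters (CSV) and output results (JSON).
inputs = records_from_csv("run,sigma\n1,0.5\n2,1.0\n")
outputs = records_from_json(
    '[{"run": "1", "score": 0.71}, {"run": "2", "score": 0.93}]'
)

# Join inputs and outputs on the shared "run" field, then filter.
by_run = {record["run"]: record for record in outputs}
merged = [{**record, **by_run[record["run"]]} for record in inputs]
best = select(merged, lambda record: record["score"] > 0.9)
```

Once every source is reduced to the same record form, the same exploration and filtering steps apply regardless of where the data came from.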

Rse-5-definition-gwe.png

Rse-7-exploration-gwe-2.png

Rse-gwe-3.png


Information

  • Tutorials

Tutorials are provided via the Slicer Training Site. Links to these tutorials are also provided below.


Monthly Progress (since July 2007)

July - September 2007

  • Inception, analysis, architecture, design and implementation of Grid Wizard Enterprise (GWE) based on prior Grid Wizard prototype developed as part of NA-MIC.

October 2007

  • Presentation and live demo of first GWE prototype at BIRN AHM. (PowerPoint / PDF)

December 2007

January 2008

  • Unit tests of GWE's first release candidate components.
  • Internal pre-release of first version of GWE.
  • Release of GWE guides, technical details and collaboration tools on the GWE project site.

February 2008

  • First release of GWE (version 0.6.alpha). See its 'features' page for details.

March 2008

April 2008

  • GWE version 0.6.2.alpha released. See its release notes for details.
  • Wrote paper "Simplifying the Utilization of Grid Computation using Grid Wizard Enterprise" to be submitted to the 'MICCAI Grid Workshop'.

May 2008

June 2008

July 2008

August 2008

  • GWE version 0.6.4.alpha released. See its release notes for details.

September 2008

October 2008

  • Collected user requests at the BIRN AHM.

November 2008

December 2008

January 2009

February 2009

March 2009

April-June 2009

  • Work towards GWE version 0.8.1.beta:
    • Transparent multi-cluster support (true grid support).
    • Rearchitecture of the transport layer to natively include sessions.
    • P2EL redesign to include behavioral parameters and multiple formats (XML, CSV, etc.).

July-October 2009

  • NAMIC summer week presentations and projects:
  • Assistance with GWE's case study for the COPD research effort: a 30 thousand parameter exploration experiment.
  • Work towards results browser improvements.

November 2009-February 2010

  • Inception, analysis, architecture, design and implementation of a new system: GWE's RSE (Record Set Explorer). It builds on work done for GWE's results browser module.
  • Unit tests of GWE's RSE first release candidate components.

March 2010

  • GWE's RSE version 0.6.1.alpha released.

Miscellaneous

  • Weekly NAMIC engineering teleconferences.
  • GWE testbed and user support.

Dissemination Activities Prior to Monthly Progress Outlined Above

  • Introductory meeting and demonstration with Tina Kapur & Birn-CC.
  • Hosted NAMIC dissemination event (February 17-18).
  • Taught Data Grid course at UCSD dissemination event.
  • Attended NAMIC dissemination event.
  • Attended SLC AHM.
  • Instructions available for deploying a "tunneled" SRB server.

Infrastructure Prior to Monthly Progress Outlined Above

  • Researched, tested and deployed a newly configured SRB server for NAMIC that allows for the tunneling of all SRB commands via SSH. This tunneling has been tested with the command line (SCommands), Java (JARGON) and Windows (InQ) clients. The current staging server is running at UCSD and is available for testing.
  • A server co-located at BWH will be discussed at the AHM.
  • Leveraged BWH BIRN Rack to provide gigabit connection for na-mic.org.
  • Researching and developing a system for backend parallel processing of Slicer3 algorithms.
  • Discussed the use of BatchMake for submitting grid-like jobs to a Condor pool.

Data Sharing Prior to Monthly Progress Outlined Above

  • Hosting NAMIC data on data grid accessible to all NAMIC participants.
  • Providing data grid and Portal support to all NAMIC participants.
  • Provided custom project space for NAMIC in BIRN Portal.
  • Provided account generation for first batch of NAMIC users. New users can now utilize the new account request feature in the Portal.
  • Working with Isomics to assist core 3 sites in their data uploads.
  • Provided template data hierarchy constructs and integration of hierarchy in data grid.
  • Provided statistics to Tina Kapur on account creation, number of uploaded data sets, and audit information.