Difference between revisions of "CTSC IGT, BWH"

From NAMIC Wiki
 
(8 intermediate revisions by one other user not shown)
Line 7: Line 7:
 
** Key Investigators:
 
** Brief Description:
 +
** Use: What kinds of queries will be important?
 
* NCIGT_Tumor_Resection (HK/AG)
 
** Key Investigators:
 
** Brief Description:
 +
** Use: What kinds of queries will be important?
 +
* NCIGT_Glioma_Resection (HK)
 +
** Key Investigators:
 +
** Brief Description: Intraoperative MRI, brain only
 +
** Use: What kinds of queries will be important? sex, age, tumor size, tumor grade, tumor location (lobe)
 
* NCIGT_Prostate (HE/CT)
 
** Key Investigators:
 
** Brief Description:
 +
** Use: What kinds of queries will be important?
 
* NCIGT_Prostate_Fully_Segmented (HE/CT)
 
** Key Investigators:
 
** Brief Description:
 +
** Use: What kinds of queries will be important?
 
* NCIGT_Brain_Biopsy (FT)
 
** Key Investigators:
 
** Brief Description:
 +
** Use: What kinds of queries will be important?
  
 
=Use-Case Goals=
Line 38: Line 47:
 
'''Step 3. Disseminating & Sharing'''
 
* In addition to NCIGT mandate to share data, each effort listed above will have different requirements for being able to make data available to collaborating and other interested groups.
 +
 +
'''Step 4. Moving data from central.xnat.org to BWH instance of XNAT'''
  
 
=Outcome Metrics=
Line 59: Line 70:
  
 
==Current Data Management Process==
 +
Data on local disk.
  
==Target Data Management Process (Step 1.) Option A.==
+
==Target Data Management Process (Step 1.) Option A. interactive upload using various tools and web gui ==
* Create new project using web GUI
+
See [[ CTSC_DataManagementWorkflow | here ]]
* Manage project using web GUI: Configure settings to automatically place data into the archive (no pre-archive)
 
* Create a subject template (download from web GUI)
 
* Create a spreadsheet conforming to subject template
 
* Upload spreadsheet using web GUI to create subjects
 
 
 
=== Content for Batch anonymize & upload script(s) ===
 
* Run CLI Tool for batch anonymization (See here for HowTo:  http://nrg.wustl.edu/projects/DICOM/DicomBrowser/batch-anon.html)
 
* Need pointer for script to do batch upload & apply DICOM metadata.
 
* Confirm data is uploaded & represented properly with web GUI
 
 
 
==Target Data Management Process (Step 1.) Option B. (web services API) '''CURRENTLY BEING TESTED!'''==
 
 
 
'''1. Create new project on XNAT instance using web GUI'''
 
 
 
* Create a new project by selecting the New button at the top of the GUI
 
* Select the project from the project list.
 
* From within the Project view, click the "Access" tab and set appropriate permissions
 
* From within the Project view, Select the "Manage" tab and configure settings to automatically place data into the archive (no pre-archive)
 
 
 
<br>
 
<br>
 
=== Content for Batch anonymize & upload script(s) ===
 
'''2. Batch Anonymize your local data'''
 
* The approach to writing anonymization scripts is here: http://nrg.wustl.edu/projects/DICOM/AnonScript.jsp
 
* See description for batch anonymization here: http://nrg.wustl.edu/projects/DICOM/DicomBrowser/batch-anon.html
 
* Download and install commandline tools: http://nrg.wustl.edu/projects/DICOM/DicomBrowser-cli.html
 
 
 
'''2.a''' Create a remapping config xml file to describe the spreadsheet to be built from the DICOM data. Root element is "Columns" and each subelement describes a column in the spreadsheet:
 
 
 
* tag = DICOM tag
 
* level = (global, patient, study, series) describes the level at which the remapping is applied
 
 
 
An example is:
 
 
 
<Columns>
 
  <Global remap="Fixed Institution Name">(0008,0080)</Global>
 
  <Global remap="Anon Requesting Physician">(0032,1032)</Global>
 
  <Patient remap="Anon Patient Name">(0010,0010)</Patient>
 
  <Patient remap="Anon PatientID">(0010,0020)</Patient>
 
  <Patient remap="Anon Patient Address">(0010,1040)</Patient>
 
  <Study>(0020,0010)</Study>
 
  <Study>(0008,0020)</Study>
 
  <Series>(0020,0011)</Series>
 
  <Series>(0008,0031)</Series>
 
</Columns>
 
 
 
'''2.b''' Generate a spreadsheet from the data that includes the remapped DICOM tags:
 
  DicomSummarize -c remap-config-file.xml -v remap.csv [directory-1 ...]
 
 
 
The arguments in brackets are a '''list''' of directories containing the source DICOM data, separated by spaces.
 
 
 
'''2.c''' Write an anonymization script for any simple changes, such as deleting an attribute, or setting an attribute value to either a fixed value or a simple function of other attribute values in the same file. Here, make sure to remove patient address and requesting physician as noted, plus whatever else you'd like (recommendations?)
 
 
 
See http://nrg.wustl.edu/projects/DICOM/AnonScript.jsp for detailed information about writing anonymization scripts. Here's a script written by Mark during his testing.
 
// removes all attributes specified in the
 
//  DICOM Basic Application Level Confidentiality Profile
 
// mark@bwh.harvard.edu added the following tags:
 
// (0010,1040) PatientsAddress
 
// (0032,1032) RequestingPhysician
 
// it seems the study and series InstanceUID tags are needed  (0020,000D) (0020,000E)
 
//- (0020,000D)
 
//- (0020,000E)
 
// - (0010,1010) preserve pt age
 
// - (0010,0040) preserve pt sex
 
- (0008,0014)
 
- (0008,0050)
 
- (0008,0080)
 
- (0008,0081)
 
- (0008,0090)
 
- (0008,0092)
 
- (0008,0094)
 
- (0008,1010)
 
  (0008,1030) := "SPL_IGT"
 
- (0008,1040)
 
- (0008,1048)
 
- (0008,1050)
 
- (0008,1060)
 
- (0008,1070)
 
- (0008,1080)
 
- (0008,2111)
 
  (0010,0010) := "case143"
 
  (0010,0020) := "case143"
 
- (0010,0030)
 
- (0010,0032)
 
- (0010,1040)
 
- (0010,0040)
 
- (0010,1000)
 
- (0010,1001)
 
- (0010,1020)
 
- (0010,1030)
 
- (0010,1090)
 
- (0010,2160)
 
- (0010,2180)
 
- (0010,21B0)
 
- (0010,4000)
 
- (0018,1000)
 
- (0018,1030)
 
  (0020,0010) := "MR1"
 
- (0020,0052)
 
- (0020,0200)
 
- (0020,4000)
 
- (0032,1032)
 
- (0040,0275)
 
- (0040,A124)
 
- (0040,A730)
 
- (0088,0140)
 
- (3006,0024)
 
- (3006,00C2)
 
 
 
'''2.d''' Edit the spreadsheet (remap.csv file) that is generated as output.
 
 
 
This spreadsheet will contain all the columns you defined, plus some additional columns needed to uniquely identify each patient, study, and series.
 
 
 
Each new (remap) column should be filled with values. In some cases, some cells in the spreadsheet can be left blank: for a Patient-level remap, one value must be specified for each patient; if the spreadsheet contains multiple rows for each patient, the column needs only be filled in one row for each patient. Similarly, for a Study-level remap, the value need only be filled once. If you don't fill in a required cell, the remapper will complain. If you give, for example, a Patient-level remap column multiple values for a single patient, the remapper will complain.
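
The two rules above (at least one value per patient, and no conflicting values for the same patient) can be checked before running the remapper. The following is a minimal sketch only: the function name `check_patient_remap` and the column names passed to it are hypothetical, the real column headers in remap.csv may differ, and the parsing assumes a simple CSV with a header row and no quoted commas.

```shell
#!/bin/sh
# Sketch: check a Patient-level remap column in remap.csv.
# Usage: check_patient_remap FILE KEY_COLUMN REMAP_COLUMN
#   KEY_COLUMN   header of the column identifying a patient (assumed name)
#   REMAP_COLUMN header of the remap column to check (assumed name)
# Prints one message per problem and returns non-zero if any was found.
check_patient_remap() {
    awk -F, -v key="$2" -v col="$3" '
        NR == 1 {
            # locate the two columns by their header text
            for (i = 1; i <= NF; i++) {
                if ($i == key) k = i
                if ($i == col) c = i
            }
            if (!k || !c) { print "missing column"; bad = 1; exit }
            next
        }
        {
            seen[$k] = 1
            if ($c == "") next            # blank cells are allowed
            if (($k in val) && (val[$k] "") != ($c "")) {
                print "conflict for " $k; bad = 1
            }
            val[$k] = $c
        }
        END {
            # every patient needs a value in at least one row
            for (p in seen) if (!(p in val)) {
                print "missing value for " p; bad = 1
            }
            exit bad
        }
    ' "$1"
}
```

For example, `check_patient_remap remap.csv PatientID "Anon Patient Name"` would report patients whose remap value is missing or given inconsistently, assuming those are the header names in the generated file.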
 
 
 
'''2.e''' Run the remapper:
 
 
 
DicomRemapper -c remap-config-file.xml -o <path-to-output-directory> -v remap.csv [directory-1 ...]
 
 
 
* the remap config XML should be the same file created in 2.a,
 
* remap.csv is the spreadsheet generated in 2.b and edited in 2.d, and
 
* list of directories is the same list of source directories from 2.b.
 
* add an anonymization script to be applied at this stage by using the -d option.
 
* first time you use a script to generate new UIDs, you'll need a new UID root;
 
** do this by adding -s http://nrg.wustl.edu/UIDGen to the DicomRemapper command line.
 
 
 
 
 
'''NOTE''' DicomBrowser doesn't write directly into the database -- it can send to a DICOM server. Below we use webservices to write directly to the database. Does this violate best practices?
 
 
 
 
 
<br>
 
<br>
 
For the next steps using web services, use curl or XNATRestClient ('''download XNATRestClient''' as part of xnat_tools.zip from http://nrg.wikispaces.com/XNAT+REST+API+Usage)
 
 
 
<br>
 
<br>
 
'''3. Authenticate''' with server and create new session; use the response as a sessionID ($JSessionID) to use in subsequent queries
 
curl -X POST -u "$XNE_UserName:$XNE_Password" "$XNE_Svr/REST/JSESSION"
 
or, use the XNATRestClient
 
XNATRestClient -host $XNE_Svr -u $XNE_UserName -p $XNE_Password -m POST -remote /REST/JSESSION
 
 
 
<br>
 
<br>
 
'''4. Create subjects on XNAT'''
 
  XNATRestClient -host $XNE_Svr -user_session $JSessionID -m PUT -remote /REST/projects/$ProjectID/subjects/s0001
 
(This will create a subject called 's0001' within the project $ProjectID)
 
 
 
A script can be written to automatically create all subjects for the project.
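
Such a script could look like the sketch below. It is an illustration only: the `sNNNN` naming pattern, the function name `create_subjects`, and the `RUN` dry-run switch are assumptions made here, not part of XNAT or XNATRestClient. `XNE_Svr` and `JSessionID` are expected to be set as in step 3. By default the commands are only printed.

```shell
#!/bin/sh
# Sketch: create subjects s0001..sNNNN in one project.
# Usage: create_subjects PROJECT_ID COUNT
# RUN is a hypothetical switch: when unset, each command is echoed
# (dry run); set RUN="" to actually execute XNATRestClient.
create_subjects() {
    project=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        subject=$(printf 's%04d' "$i")      # s0001, s0002, ...
        ${RUN-echo} XNATRestClient -host "$XNE_Svr" \
            -user_session "$JSessionID" -m PUT \
            -remote "/REST/projects/$project/subjects/$subject"
        i=$((i + 1))
    done
}
```

Calling `create_subjects "$ProjectID" 25` prints the 25 PUT commands for review; running it again with `RUN=""` exported would issue them.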
 
 
 
'''4a. Specify the demographics of a subject already created, or create with demographic specification'''
 
 
 
'''4.a.1''' No demographics are applied to a subject by default. To edit the demographics (such as gender or handedness) of a subject '''already created''', use XML Path shortcuts:
 
 
 
xnat:subjectData/demographics[@xsi:type=xnat:demographicData]/gender = male
 
xnat:subjectData/demographics[@xsi:type=xnat:demographicData]/handedness = left
 
 
 
The entire command looks like this (Append XML path shortcuts and separate each by an ''&''. Note that querystring parameters must be separated from the actual URI by a ?):
 
 
 
XNATRestClient -host $XNE_Svr -user_session $JSessionID -m PUT -remote "/REST/projects/$ProjectID/subjects/s0001?xnat:subjectData/demographics[@xsi:type=xnat:demographicData]/gender=male&xnat:subjectData/demographics[@xsi:type=xnat:demographicData]/handedness=left"
 
 
 
All XML Path shortcuts that can be specified on commandline for projects, subject, experiments are listed here: http://nrg.wikispaces.com/XNAT+REST+XML+Path+Shortcuts
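
The rule for assembling the URI (a single ? after the resource path, then shortcut assignments joined by &) can be captured in a small helper. This is a sketch under assumptions: `build_uri` is a hypothetical name, and no URL-encoding is performed, so values containing spaces or reserved characters would need extra handling.

```shell
#!/bin/sh
# Sketch: append XML Path shortcut assignments to a REST resource path.
# Usage: build_uri RESOURCE_PATH [shortcut=value ...]
# The first assignment is introduced by "?", later ones by "&".
build_uri() {
    uri=$1
    shift
    sep='?'
    for pair in "$@"; do
        uri="$uri$sep$pair"
        sep='&'
    done
    printf '%s\n' "$uri"
}
```

The result can be passed, quoted, to `-remote`, for example `-remote "$(build_uri "/REST/projects/$ProjectID/subjects/s0001" "$gender_shortcut=male")"`, where `gender_shortcut` is a variable holding the shortcut path shown above.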
 
 
 
'''4.a.2''' Alternatively, specify the demographics '''during subject creation''' by generating and uploading an xml file with the subject:
 
 
XNATRestClient -host $XNE_Svr -user_session $JSessionID -m PUT -remote "/REST/projects/$ProjectID/subjects/s0002" -local ./${ProjectID}_s0002.xml
 
 
 
The XML file you create and post looks like this:
 
 
 
<xnat:Subject ID="s0002" project="$ProjectID" group="control" label="1" src="12"  xmlns:xnat="http://nrg.wustl.edu/xnat" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 
    <xnat:demographics xsi:type="xnat:demographicData">
 
        <xnat:dob>1990-09-08</xnat:dob>
 
        <xnat:gender>female</xnat:gender>
 
        <xnat:handedness>right</xnat:handedness>
 
        <xnat:education>12</xnat:education>
 
        <xnat:race>12</xnat:race>
 
        <xnat:ethnicity>12</xnat:ethnicity>
 
        <xnat:weight>12.0</xnat:weight>
 
        <xnat:height>12.0</xnat:height>
 
    </xnat:demographics>
 
</xnat:Subject>
 
 
 
'''4.b (optional check) Query the server to see what subjects have been created:'''
 
XNATRestClient -host $XNE_Svr -user_session $JSessionID -m GET -remote /REST/projects/$ProjectID/subjects
 
 
 
'''4.c Create experiments (collections of image data) you'd like to have for each subject'''
 
 
 
XNATRestClient -host $XNE_Svr -user_session $JSessionID -m PUT -remote "/REST/projects/$ProjectID/subjects/$SubjectID/experiments/MRExperiment?xnat:mrSessionData/date=01/02/09"
 
XNATRestClient -host $XNE_Svr -user_session $JSessionID -m PUT -remote "/REST/projects/$ProjectID/subjects/$SubjectID/experiments/CTExperiment1?xnat:ctSessionData/date=01/02/09"
 
XNATRestClient -host $XNE_Svr -user_session $JSessionID -m PUT -remote "/REST/projects/$ProjectID/subjects/$SubjectID/experiments/PETExperiment1?xnat:petSessionData/date=01/02/09"
 
 
 
'''4.d (optional check) Query the server to see what experiments have been created:'''
 
XNATRestClient -host $XNE_Svr -user_session $JSessionID -m GET -remote "/REST/projects/$ProjectID/subjects/s0001/experiments?format=xml"
 
 
 
 
 
<br>
 
<br>
 
 
 
+
==Target Data Management Process (Step 1.) Option B. batch scripted upload via DICOM Server ==
+
See [[ CTSC_DataManagementWorkflow | here ]]
+
==Target Data Management Process (Step 1.) Option C. batch scripted upload via web services==
+
See [[ CTSC_DataManagementWorkflow | here ]]
 
'''5. Create URIs for scans and reconstructions, and upload them.'''
 
Note: when uploading images, it is good form to define the format of the images (DICOM, ANALYZE, etc.) and the content type of the data. '''This will not translate any information in the DICOM header into metadata on the scan.'''
 
//create SCAN1
 
XNATRestClient -host $XNE_Svr -user_session $JSessionID -m PUT -remote "/REST/projects/$ProjectID/subjects/$SubjectID/experiments/$ExperimentID/scans/SCAN1?xnat:mrScanData/type=T1"
 
//upload SCAN1 files...
 
XNATRestClient -host $XNE_Svr -user_session $JSessionID -m PUT -remote "/REST/projects/$ProjectID/subjects/$SubjectID/experiments/$ExperimentID/scans/SCAN1/files/1232132.dcm?format=DICOM&content=T1_RAW" -local /data/subject1/session1/RAW/SCAN1/1232132.dcm
 
 
 
//create SCAN2
 
XNATRestClient -host $XNE_Svr -user_session $JSessionID -m PUT -remote "/REST/projects/$ProjectID/subjects/$SubjectID/experiments/$ExperimentID/scans/SCAN2?xnat:mrScanData/type=T2"
 
 
//upload SCAN2 files...
 
XNATRestClient -host $XNE_Svr -user_session $JSessionID -m PUT -remote "/REST/projects/$ProjectID/subjects/$SubjectID/experiments/$ExperimentID/scans/SCAN2/files/1232133.dcm?format=DICOM&content=T2_RAW" -local /data/subject1/session1/RAW/SCAN2/1232133.dcm
 
 
 
//create reconstruction 1
 
XNATRestClient -host $XNE_Svr -user_session $JSessionID -m PUT -remote "/REST/projects/$ProjectID/subjects/$SubjectID/experiments/$ExperimentID/reconstructions/session1_recon_0343?xnat:reconstructedImageData/type=T1_RECON"
 
 
//upload reconstruction 1 files...
 
XNATRestClient -host $XNE_Svr -user_session $JSessionID -m PUT -remote "/REST/projects/$ProjectID/subjects/$SubjectID/experiments/$ExperimentID/reconstructions/session1_recon_0343/files/0343.nfti?format=NIFTI" -local /data/subject1/session1/RECON/T1_0343/0343.nfti
 
 
 
//create reconstruction 2
 
XNATRestClient -host $XNE_Svr -user_session $JSessionID -m PUT -remote "/REST/projects/$ProjectID/subjects/$SubjectID/experiments/$ExperimentID/reconstructions/session1_recon_0344?xnat:reconstructedImageData/type=T2_RECON"
 
 
//upload reconstruction 2 files...
 
XNATRestClient -host $XNE_Svr -user_session $JSessionID -m PUT -remote "/REST/projects/$ProjectID/subjects/$SubjectID/experiments/$ExperimentID/reconstructions/session1_recon_0344/files/0344.nfti?format=NIFTI" -local /data/subject1/session1/RECON/T1_0344/0344.nfti
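
The scan uploads above send one file per command. For a scan directory holding many DICOM files, the same command can be repeated in a loop, as in the sketch below. The function name `upload_scan_files`, the `.dcm` file extension, and the `RUN` dry-run switch are assumptions for illustration; the scan itself is expected to have been created already, and `XNE_Svr` and `JSessionID` to be set as in step 3.

```shell
#!/bin/sh
# Sketch: upload every .dcm file in a directory to an existing scan.
# Usage: upload_scan_files PROJECT SUBJECT EXPERIMENT SCAN DIR CONTENT
# RUN is a hypothetical switch: when unset, each command is echoed
# (dry run); set RUN="" to actually execute XNATRestClient.
upload_scan_files() {
    base="/REST/projects/$1/subjects/$2/experiments/$3/scans/$4/files"
    for f in "$5"/*.dcm; do
        [ -e "$f" ] || continue     # directory contained no .dcm files
        name=$(basename "$f")
        ${RUN-echo} XNATRestClient -host "$XNE_Svr" \
            -user_session "$JSessionID" -m PUT \
            -remote "$base/$name?format=DICOM&content=$6" -local "$f"
    done
}
```

For example, `upload_scan_files "$ProjectID" "$SubjectID" "$ExperimentID" SCAN1 /data/subject1/session1/RAW/SCAN1 T1_RAW` prints one upload command per file in that directory.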
 
 
  
<br>
 
<br>
 
'''6. Confirm''' data is uploaded & represented properly with web GUI
 
  
 
==Target Query Formulation (Step 2.)==

Latest revision as of 20:26, 19 August 2009


Back to CTSC Imaging Informatics Initiative


Mission

Mark Anderson at the Surgical Planning and Channing labs currently manages data for many investigators, pulling data from PACS into the research environment. There is interest in setting up a parallel channel by which the data are also enrolled into an XNAT database and accessed from a client, and in comparing its ease of use with the existing infrastructure. To explore XNAT as a possible long-term informatics solution for the NCIGT project, Mark will be uploading retrospective data for a number of NCIGT efforts (and PIs):

  • NCIGT_Brain_Function (SS/AG)
    • Key Investigators:
    • Brief Description:
    • Use: What kinds of queries will be important?
  • NCIGT_Tumor_Resection (HK/AG)
    • Key Investigators:
    • Brief Description:
    • Use: What kinds of queries will be important?
  • NCIGT_Glioma_Resection (HK)
    • Key Investigators:
    • Brief Description: Intraoperative MRI, brain only
    • Use: What kinds of queries will be important? sex, age, tumor size, tumor grade, tumor location (lobe)
  • NCIGT_Prostate (HE/CT)
    • Key Investigators:
    • Brief Description:
    • Use: What kinds of queries will be important?
  • NCIGT_Prostate_Fully_Segmented (HE/CT)
    • Key Investigators:
    • Brief Description:
    • Use: What kinds of queries will be important?
  • NCIGT_Brain_Biopsy (FT)
    • Key Investigators:
    • Brief Description:
    • Use: What kinds of queries will be important?

Use-Case Goals

Step 1. Data Management

  • Anonymize, apply DICOM metadata and upload retrospective datasets; confirm appropriate organization and naming scheme via web GUI.

Step 2. Query & Retrieval

  • Make specific queries using XNAT web services,
  • Download data conforming to specific naming convention and directory structure, using XNAT web services

Each effort listed above will have different requirements for being able to query, retrieve and use data collections. How retrospective data will be used within the NCIGT is briefly described below:

  • NCIGT_Brain_Function:
  • NCIGT_Tumor_Resection:
  • NCIGT_Prostate:
  • NCIGT_Prostate_Fully_Segmented:
  • NCIGT_Brain_Biopsy:

Step 3. Disseminating & Sharing

  • In addition to NCIGT mandate to share data, each effort listed above will have different requirements for being able to make data available to collaborating and other interested groups.

Step 4. Moving data from central.xnat.org to BWH instance of XNAT

Outcome Metrics

Step 1. Data Management

Step 2. Query & Retrieval

Step 3. Dissemination & Sharing

Fundamental Requirements

Participants

  • Mark Anderson
  • Tina Kapur

Data

Workflows

Current Data Management Process

Data on local disk.

Target Data Management Process (Step 1.) Option A. interactive upload using various tools and web gui

See CTSC_DataManagementWorkflow

Target Data Management Process (Step 1.) Option B. batch scripted upload via DICOM Server

See CTSC_DataManagementWorkflow

Target Data Management Process (Step 1.) Option C. batch scripted upload via web services

See CTSC_DataManagementWorkflow


Target Query Formulation (Step 2.)

Target Processing Workflow (Step 3.)

Other Information