AHM 2006:ProjectsSlicerDataModel

Project

Designing a Data Centric Model for Slicer 3.

Open Questions from Programmer's Week Discussions

Syntax of the factory for itk (is the extra layer needed?) - Jim Miller
- Keeping the VTK and ITK factory syntax parallel

How can developers add new data types to mrml? - Lauren O'Donnell
- Current slicer supports the idea of modules having their own data types
- Implementation is difficult and not well documented.

Who will be doing what? (Alex, Xiaodong, Mathieu to allocate time and effort)

Syntax and tie-in to Mike's MRML Path

Goals

Design and Implement a prototype of a Data Model server for Slicer 3

It should represent a scene graph
It should compute and return Transforms between objects in the scene graph.
It should be suitable for Image Guided Surgery. This may require it to be compatible with a real-time operating system.
It should return datasets (image data)
It should return surface models (vtkPolydata?)
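
The transform goal above can be sketched as follows. This is a minimal illustration, assuming rigid transforms stored as 4x4 row-major matrices and a parent pointer per node. `Matrix4`, `SceneNode`, `ToWorld`, and `RelativeTransform` are hypothetical names introduced here; they are not part of Slicer, MRML, VTK, or ITK.

```cpp
#include <array>
#include <cstddef>

// Hypothetical 4x4 row-major homogeneous matrix (not a VTK or ITK type).
using Matrix4 = std::array<std::array<double, 4>, 4>;

inline Matrix4 Identity()
{
  Matrix4 m{};  // value-initialized to all zeros
  for (std::size_t i = 0; i < 4; ++i) m[i][i] = 1.0;
  return m;
}

inline Matrix4 Translation(double x, double y, double z)
{
  Matrix4 m = Identity();
  m[0][3] = x; m[1][3] = y; m[2][3] = z;
  return m;
}

inline Matrix4 Multiply(const Matrix4 &a, const Matrix4 &b)
{
  Matrix4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    for (std::size_t j = 0; j < 4; ++j)
      for (std::size_t k = 0; k < 4; ++k)
        r[i][j] += a[i][k] * b[k][j];
  return r;
}

// Inverse that is valid only for rigid transforms (rotation + translation):
// the rotation block is transposed and the translation becomes -R^T * t.
inline Matrix4 InvertRigid(const Matrix4 &m)
{
  Matrix4 r = Identity();
  for (std::size_t i = 0; i < 3; ++i)
    for (std::size_t j = 0; j < 3; ++j)
      r[i][j] = m[j][i];
  for (std::size_t i = 0; i < 3; ++i)
    r[i][3] = -(r[i][0] * m[0][3] + r[i][1] * m[1][3] + r[i][2] * m[2][3]);
  return r;
}

// Hypothetical scene graph node: each node stores the transform to its parent.
struct SceneNode
{
  const SceneNode *parent = nullptr;
  Matrix4 toParent = Identity();
};

// Concatenate transforms from the node up to the root.
inline Matrix4 ToWorld(const SceneNode &node)
{
  Matrix4 m = node.toParent;
  for (const SceneNode *p = node.parent; p != nullptr; p = p->parent)
    m = Multiply(p->toParent, m);
  return m;
}

// Transform that maps coordinates in the frame of `from` to the frame of `to`.
inline Matrix4 RelativeTransform(const SceneNode &from, const SceneNode &to)
{
  return Multiply(InvertRigid(ToWorld(to)), ToWorld(from));
}
```

With two children of a common root, one translated by (1, 0, 0) and the other by (0, 2, 0), `RelativeTransform` yields a translation of (1, -2, 0). A real implementation would also need non-rigid transforms, which this sketch does not cover.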

Requirements

It must work as a service
It must be accessible from Batch programs as well as GUI programs
It must be computationally efficient
It must be multi-platform
It must be memory efficient

Use Cases

Some of these use cases were taken from the Slicer requirements for IGS applications

Slicer 3 and IGSTK integration (Nobuhiko Hata, Luis Ibanez, Patrick Cheng)

The Data Model will act as a service that offers to clients the options of

Storing data along with tags (MetaData Identifiers)
Retrieving data using the Identifiers
Modifying data in place
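
The three service operations listed above can be sketched as a small in-process class. This is only an illustration under stated assumptions: `TaggedStore`, its method names, and the use of a byte vector as the payload are hypothetical, and no client/server transport is shown.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-process stand-in for the Data Model service.
class TaggedStore
{
public:
  using Blob = std::vector<unsigned char>;

  // Store data under a metadata identifier, replacing any previous entry.
  void Store(const std::string &id, Blob data)
  {
    items_[id] = std::move(data);
  }

  // Retrieve the data for an identifier, or nullptr if it is not stored.
  // The returned pointer is mutable, so callers can modify the data in place.
  Blob *Retrieve(const std::string &id)
  {
    std::map<std::string, Blob>::iterator it = items_.find(id);
    return it == items_.end() ? nullptr : &it->second;
  }

  bool Has(const std::string &id) const
  {
    return items_.count(id) != 0;
  }

private:
  std::map<std::string, Blob> items_;
};
```

Returning a pointer into the store is what makes "modifying data in place" possible here. A service that crosses process boundaries would need a different mechanism, such as an explicit commit.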

Current Use Cases

The basic Data Model in Slicer supports the following kinds of instances:

Volumes
- Scalar Types
- Label Maps (segmentation result)
- Reference to Lookup Table
Models
- Named Field Data (scalars, vectors, labels) at points and cells (FreeSurferReaders)
- Color, Clipping State, Visibility, Scalar Visibility, LookupTable
Transforms*
Fiducials, Fiducial Lists
Name, Label#, Diffuse/Ambient/Specular.

The Data Model API in Slicer allows adding, deleting, reading, and modifying medical image data types (Volumes, Models, Transforms, Fiducials, etc).

Use Cases to Add

In addition to the Data Model provided by Slicer, we will develop additional instances required uniquely for RFA.

State information
Transformation matrix for CT-to-patient registration in the tracker’s coordinate system
Predicted error from the CT-to-patient registration
Locations of tracker attached to the RFA applicator and US transducer
Transformation matrix from calibration of tracker to the US image coordinate system.
Magnitude and gain of the US imager in the last state of the imaging.
Location of fiducial markers

Strawman "Hello MRML" programs

No client server, just read an xml file

int main()
{
  vtkMrmlTree *mrml = vtkMrmlTree::New();
  mrml->Connect("file://data.xml");
  // Print() is the vtkObjectBase entry point that calls PrintSelf()
  mrml->Print(std::cout);
  mrml->Delete();
  return 0;
}

Connect to a server, modify, commit

int main()
{
  vtkMrmlTree *mrml = vtkMrmlTree::New();
  mrml->Connect("mrml://mrml.na-mic.org/data");
  vtkMrmlTransformNode *trans = vtkMrmlTransformNode::New();
  mrml->AddNode(trans);
  mrml->Commit();
  trans->Delete();
  mrml->Delete();
}

Open a mrml file, run a vtk filter, save a new file. This example uses separate mrml and vtkmrml libraries.

 #include "mrml.h"
 #include "vtkmrml.h"
 
 int main()
 {
 
   // get mrml tree
   mrml::Tree *mrml = mrml::Tree::New();
   mrml->Connect("file://data.xml");
 
   // get input image in vtk format
   mrml::VolumeNode *volNode = mrml->GetNthVolume(0);
 
   vtkmrml::VolumeData *inData = vtkmrml::VolumeData::New();
 
   inData->SetSourceNode(volNode);
   vtkImageData *imgData = inData->GetImageData(); // converts data from internal format to vtk
 
   // vtk pipeline
   vtkImageGaussianSmooth *igs = vtkImageGaussianSmooth::New();
   igs->SetInput(imgData);
   igs->GetOutput()->Update();
 
   // put output volume in a new mrml volume node
   mrml::VolumeNode *volNodeOut = mrml::VolumeNode::New();
 
   vtkmrml::VolumeData *outData = vtkmrml::VolumeData::New();
 
   outData->SetTargetNode(volNodeOut);
   outData->SetSourceImage(igs->GetOutput());
   outData->Update();   // converts data from vtkImage into internal format
 
   // add the new output node to the mrml tree
   mrml->AddNode(volNodeOut);
 
   // save new file
   mrml->Save("file://data1.xml");
 
   igs->Delete();
 
   mrml->Delete(); // Do we need this? vtk style or smartPointers?
   inData->Delete(); // Do we need this? vtk style or smartPointers?
   outData->Delete(); // Do we need this? vtk style or smartPointers?
   volNodeOut->Delete(); // Do we need this? vtk style or smartPointers?
 }

Connect to a server, run an ITK filter, commit

int main()
{
  vtkMrmlTree *mrml = vtkMrmlTree::New();
  mrml->Connect("mrml://mrml.na-mic.org/data");
  vtkMrmlVolumeNode *vol = mrml->GetNthVolume(0);
  typedef itk::Image<float,3> ImageType;
  typedef itk::NormalizeImageFilter<ImageType,ImageType> ImageFilterType;
  ImageFilterType::Pointer norm = ImageFilterType::New();
  norm->SetInput(vol->GetITKDataF());
  norm->GetOutput()->Update();
  // put the result in a new output volume node
  vtkMrmlVolumeNode *outVol = vtkMrmlVolumeNode::New();
  outVol->SetITKDataF(norm->GetOutput());
  mrml->AddNode(outVol);
  mrml->Commit();
  outVol->Delete();
  mrml->Delete();
  return 0;
}

ITK Style

using namespaces and ITK idiom

int main()
{
  mrml::Tree::Pointer mrml = mrml::Tree::New();
  mrml->Connect("mrml://mrml.na-mic.org/data");

  typedef itk::Image<float,3> ImageType;
  // itkmrml knows about both itk and mrml
  typedef itkmrml::VolumeData<ImageType> VolumeDataFactoryType;

  VolumeDataFactoryType::Pointer factory = VolumeDataFactoryType::New();
  factory->SetSource(mrml->GetNthVolume(0));
  if ( !factory->CanTranslate() ) return 1;

  typedef itk::NormalizeImageFilter<ImageType,ImageType> ImageFilterType;
  ImageFilterType::Pointer norm = ImageFilterType::New();
  norm->SetInput(factory->GetImage());
  norm->GetOutput()->Update(); // this pulls mrml data into itk::Image

  VolumeDataFactoryType::Pointer outfactory = VolumeDataFactoryType::New();
  mrml::VolumeNode::Pointer outvol = mrml::VolumeNode::New();
  outfactory->SetImage(norm->GetOutput());
  outfactory->SetTarget(outvol);
  outfactory->Update(); // this pushes itk::Image data into mrml
  mrml->AddNode(outvol);
  mrml->Commit();
  return 0;
}

DataModel API

This is an initial draft of interactions with the DataModel. Most of the entries were taken from the vtkMrmlTree class.

dm->Connect("filename");
dm->Connect("URL");
dm->Commit();
dm->Close();
dm->InsertNode( node, "parent name", "node name");
dm->GetNode("node name");
dm->HasNode("node name");
dm->GetNextNode(); ?? // shouldn't we rather have iterators ?
dm->GetNthItem(); // useful for blind IO...?
dm->Get...() accessors by class:
- GetVolume()
- GetTransform()
- GetMatrix() ?? are these matrices representing Transforms ?
- GetColor()
dm->ComputeTransforms();
dm->ComputeRelativeTransform("node1 name","node2 name");
dm->DeleteNode( node );
dm->DeleteNode( "node name" );
dm->Delete(); // Let's use vtkSmartPointers and avoid the need for Delete()...

Here "node name" stands for any type of identifier; it may be implemented as an integer ID or as a string.
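
As a rough sketch of how part of this draft could look in C++, the class below implements name-based insert, lookup, and delete, and exposes iterators instead of `GetNextNode()`. Everything here is an assumption made for illustration: the `Node` struct, the choice of `std::shared_ptr` in place of VTK smart pointers, the rule that a parent must exist before a child is inserted, and the rule that a node with children cannot be deleted.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical node type; a real MRML node would carry data and metadata.
struct Node
{
  std::string name;
  std::string parent;  // empty for a top-level node
};

class DataModel
{
public:
  using NodePtr = std::shared_ptr<Node>;
  using NodeMap = std::map<std::string, NodePtr>;

  // Insert a node under an existing parent ("" means top level).
  // Fails if the name is taken or the parent does not exist.
  bool InsertNode(NodePtr node, const std::string &parentName,
                  const std::string &nodeName)
  {
    if (!node || nodeName.empty() || HasNode(nodeName)) return false;
    if (!parentName.empty() && !HasNode(parentName)) return false;
    node->name = nodeName;
    node->parent = parentName;
    nodes_[nodeName] = std::move(node);
    return true;
  }

  // Returns an empty pointer if the node is not found.
  NodePtr GetNode(const std::string &nodeName) const
  {
    NodeMap::const_iterator it = nodes_.find(nodeName);
    return it == nodes_.end() ? NodePtr() : it->second;
  }

  bool HasNode(const std::string &nodeName) const
  {
    return nodes_.count(nodeName) != 0;
  }

  // Deletes a node by name; refuses if the node still has children.
  bool DeleteNode(const std::string &nodeName)
  {
    for (const auto &entry : nodes_)
      if (entry.second->parent == nodeName) return false;
    return nodes_.erase(nodeName) != 0;
  }

  // Iterators replace GetNextNode() and allow range-based for loops.
  NodeMap::const_iterator begin() const { return nodes_.begin(); }
  NodeMap::const_iterator end() const { return nodes_.end(); }

private:
  NodeMap nodes_;
};
```

Because nodes are held by shared pointers, no explicit `Delete()` call is needed, which is in the spirit of the smart-pointer remark in the draft.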

XML versus SQL

Our analysis seems to indicate that SQL and XML are both possible solutions for storing the data on disk. We intend to implement an API that will talk to the storage implementation and hide it from the Slicer applications. In other words, Slicer developers and Slicer users should not need to know whether there is an XML file or an SQL database underneath.
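
One way to picture this separation is an abstract storage interface that the data model talks to. The sketch below is hypothetical: `StorageBackend`, `InMemoryBackend`, and `DataModelStore` are names invented for this example, and the in-memory class merely stands in for a real XML or SQL implementation.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical storage interface; applications never see this class.
class StorageBackend
{
public:
  virtual ~StorageBackend() = default;
  virtual void Write(const std::string &key, const std::string &value) = 0;
  virtual bool Read(const std::string &key, std::string &value) const = 0;
};

// Stand-in backend. An XML-file or SQL-database backend would implement
// the same two methods.
class InMemoryBackend : public StorageBackend
{
public:
  void Write(const std::string &key, const std::string &value) override
  {
    values_[key] = value;
  }

  bool Read(const std::string &key, std::string &value) const override
  {
    std::map<std::string, std::string>::const_iterator it = values_.find(key);
    if (it == values_.end()) return false;
    value = it->second;
    return true;
  }

private:
  std::map<std::string, std::string> values_;
};

// Application-facing class: its API does not reveal the storage kind.
class DataModelStore
{
public:
  explicit DataModelStore(std::unique_ptr<StorageBackend> backend)
    : backend_(std::move(backend)) {}

  void SetAttribute(const std::string &node, const std::string &key,
                    const std::string &value)
  {
    backend_->Write(node + "/" + key, value);
  }

  // Returns `fallback` when the attribute has not been stored.
  std::string GetAttribute(const std::string &node, const std::string &key,
                           const std::string &fallback = "") const
  {
    std::string value;
    return backend_->Read(node + "/" + key, value) ? value : fallback;
  }

private:
  std::unique_ptr<StorageBackend> backend_;
};
```

Swapping the backend passed to the constructor would change the storage technology without touching application code.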

The following table summarizes the advantages and disadvantages of using XML versus SQL. There is also the option of combining both, if we find that each one alone does not provide all the features that we want for Slicer applications.

Feature	XML	SQL
Get element by an identifier	natural, but needs to be hierarchical	natural
Insert element with an identifier	natural	natural
Hierarchy navigation	natural	must be implemented with auxiliary table
Resistant to power-down	No	Yes
Support for large datasets	Yes	Yes
Speed for access	to be measured	to be measured
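
The "auxiliary table" entry for hierarchy navigation in SQL can be illustrated without a database. In this sketch a flat table of rows, each holding a parent id, is walked upward to rebuild a node's path. The row layout, the convention that parent id 0 means "no parent", and the function name `PathTo` are all assumptions made for the example.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Plain structs standing in for rows of a hypothetical auxiliary SQL table
// with columns (id, parent_id, name). parentId == 0 marks a top-level node.
struct ParentRow
{
  int parentId;
  std::string name;
};

using ParentTable = std::map<int, ParentRow>;  // keyed by node id

// Rebuild the path of a node by following parent ids up to the root.
// Returns an empty string if an id is missing or the table has a cycle.
inline std::string PathTo(const ParentTable &table, int id)
{
  std::string path;
  std::size_t hops = 0;
  while (id != 0 && hops++ <= table.size())
  {
    ParentTable::const_iterator it = table.find(id);
    if (it == table.end()) return std::string();
    path = "/" + it->second.name + path;
    id = it->second.parentId;
  }
  return id == 0 ? path : std::string();
}
```

Each step of the loop corresponds to one lookup by identifier, which is the operation the table above lists as natural for SQL. XML gets the same navigation directly from element nesting.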

SQL Options

Use a model similar to CORBA but with a customized minimal implementation
- http://www.corba.org/
- http://www.omg.org/
Use a model similar to Microsoft Windows DataSet Features
- http://msdn.microsoft.com/data/default.aspx?pull=/library/en-us/dnvs05/html/newdtastvs05.asp
Use an SQL Database Server model
- Microsoft: http://msdn.microsoft.com/sql/
- PostgreSQL: http://www.postgresql.org/
- MySQL: http://dev.mysql.com/
- SQLite: http://www.sqlite.org/
- MetaKit: http://www.equi4.com/

The implementation could use a single approach on all platforms, or it could define a common API that wraps different local libraries on different platforms. For example, it could use MS-SQL on Windows and MySQL on Unix, wrapping both of them in a common C++ API customized for the types of objects that Slicer would manage.

Matrix of current options

	MS-Windows	Linux	Cygwin	Macintosh	Sun	SGI	License	Installation Burden
MS-SQL	yes	no	no	no	no	no	EULA?	Medium (only Windows)
PostgreSQL	yes	yes	yes	yes	yes	yes	BSD (see details)	Medium (requires root or home user build)
MySQL	yes	yes	yes	yes	yes	yes	GPL / Commercial (see details)(see some issues)	Medium (requires root or home user build)
CORBA*	yes	yes	yes	yes	yes	yes	?	High (requires root and network setup)
SQLite	yes	yes	yes	yes	yes	yes	Public Domain (see)	Low (built-in into the application)
MetaKit	yes	yes	yes	yes	yes	yes	X/MIT Style (see)	Low (built-in into the application)

* CORBA would actually require a specific package to be tested per platform.

Current Option

SQLite

Features include: (from)

Transactions are atomic, consistent, isolated, and durable (ACID) even after system crashes and power failures.
Zero-configuration - no setup or administration needed.
Implements most of SQL92. (Features not supported)
A complete database is stored in a single disk file.
Database files can be freely shared between machines with different byte orders.
Supports databases up to 2 terabytes (2^41 bytes) in size.
Sizes of strings and BLOBs limited only by available memory.
Small code footprint: less than 250KiB fully configured or less than 150KiB with optional features omitted.
Faster than popular client/server database engines for most common operations.
Simple, easy to use API.
TCL bindings included. Bindings for many other languages available separately.
Well-commented source code with over 95% test coverage.
Self-contained: no external dependencies.
Sources are in the public domain. Use for any purpose.

Second Option

PostgreSQL DataBase

Online Tutorial

Features

Allows connections via unix domain sockets and TCP/IP connections
Has bindings for PHP, C, Python, Perl, Tcl
Size Limitations (taken from)
- Maximum size for a database? unlimited (32 TB databases exist)
- Maximum size for a table? 32 TB
- Maximum size for a row? 1.6TB
- Maximum size for a field? 1 GB (This is what we will map to one Image. If it becomes a limit we could store the image in Slices per field)
- Maximum number of rows in a table? unlimited
- Maximum number of columns in a table? 250-1600 depending on column types
- Maximum number of indexes on a table? unlimited
Object Oriented Database: Fields can be customized object data structures.
- Supports Inheritance: one database can inherit properties from another one (details).
Database server can be a remote machine or the local one
- This will naturally support a client/server approach such as the one in ParaView
- Client applications can be very diverse in nature: a client
  - Could be a text-oriented tool.
  - A graphical application
  - A web server that accesses the database to display web pages
  - or a specialized database maintenance tool.
The PostgreSQL server can handle multiple concurrent connections from clients.
- For that purpose it starts ("forks") a new process for each connection. From that point on, the client and the new server process communicate without intervention by the original postmaster process.
Supported platforms (see)
Native support for using SSL connections to encrypt client/server communications for increased security. This requires that OpenSSL is installed on both client and server systems and that support in PostgreSQL is enabled at build time
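
The fallback mentioned under the field size limit, storing an image as one slice per field, amounts to splitting the voxel buffer along the slowest axis. A minimal sketch follows, assuming a contiguous `float` buffer in which x varies fastest and z slowest. The function name `SplitIntoSlices` is hypothetical and no database calls are shown.

```cpp
#include <cstddef>
#include <vector>

// Split a contiguous volume into one buffer per z slice, so that each
// slice could be stored in its own database field.
// Returns an empty result if the dimensions do not match the buffer size.
inline std::vector<std::vector<float> > SplitIntoSlices(
  const std::vector<float> &voxels,
  std::size_t nx, std::size_t ny, std::size_t nz)
{
  std::vector<std::vector<float> > slices;
  const std::size_t sliceSize = nx * ny;
  if (sliceSize == 0 || voxels.size() != sliceSize * nz) return slices;

  slices.reserve(nz);
  for (std::size_t k = 0; k < nz; ++k)
  {
    const std::ptrdiff_t first = static_cast<std::ptrdiff_t>(k * sliceSize);
    const std::ptrdiff_t last = static_cast<std::ptrdiff_t>((k + 1) * sliceSize);
    slices.emplace_back(voxels.begin() + first, voxels.begin() + last);
  }
  return slices;
}
```

For scale, a 512 x 512 slice of 4-byte voxels is 1 MiB, so a single slice stays far below a 1 GB field limit even when the whole volume does not.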

Third Option

MetaKit DataBase

http://www.equi4.com/mkoverview.html

Features

Use your data on any platform. Both the code and datafiles are portable. All byte-ordering managed by the library.
Complex datastructures in one file. Store multiple nested data structures, to create document-centric applications.
Restructure datafiles, instantly. It restructures files on-the-fly, while open.
Serialize all data for transport. Complementing commit/rollback of changes, data can also be serialized.
Recover from system-failures. The use of Stable Storage ensures that files cannot be corrupted by crashes.
Load on-demand, quick startup. Files are opened without reading data. Memory-mapped files if O/S supports it.
Behaves like containers. The API mimics container classes. Quickly get sizes and iterate over rows.
Wide range of operators built-in. Sorting, relational join / group by, set operations, permutations, hashing.
1-32 bits per int (or 64), variable-sized data. The largest int defines storage format. String/binary data is stored as var-sized.
Create fully self-contained applications. Can be linked shared or statically, for hassle-free deployment of components.
Tiny code (125 Kb as Win32 DLL). The library is extremely small, unused functions are stripped off in static links.
Simple API, just 6 core classes. Only a small interface is exposed. One header file lists all the classes you need.
Also use from Python and Tcl. These language bindings are coded to take advantage of the respective idioms.

API Issues: Strawman Answers

Bold: updates after tcon

MRML Tree:
- Is MrmlTree a true hierarchy or a list of nodes (as currently)?
- Should it be a real scene tree? Right now it is an XML file image in memory, which combines scene hierarchy and data persistence. No.
- If it is an XML file image, do we use DOM or XPath for the internal representation of the XML file? No, use the existing MRML hierarchy.
- Do we use an SQL database to persist the MRML tree and data? Do we use a database to provide remote access to MRML trees and data in client/server mode? No. The implementation of the internals of the data model will be hidden behind the API.

MRML nodes:
- How is the data accessed from Mrml Node, can we make it independent from vtk/itk types like vtkImageData and itk::Image<>?
- Metadata and vtk data should be separated to avoid redundancy. What metadata is stored in the new MrmlVolume, MrmlModel, etc. Can we use delegation from vtkImageData SetSpacing() etc. methods to avoid duplication? Explicit synchronize metadata methods between MRML node and vtk data. The metadata in the MRML nodes is the definitive version -- any platform-specific metadata is filled out by the factory that generates the structure
- What subset of general vtk vtkImageData and vtkPolyData is supported, multicomponent, tensors etc. Do we create special MRML nodes for tensors. Allow full vtk API for creating and manipulating vtkDataSet and vtkFieldData. Define a specific set of functionality to be represented by MRML -- do not rely on vtk
- How are transformations represented? Do we use the new Coordinate System Manager? Yes. Yes. Do we use a MrmlGroup node instead of MrmlTransform, with the new coordinate system manager defining transformations? Yes. Need a way to serialize coordinate systems.
- Support ITK volumes in the API? Only 3D vtk volumes. Define the volume types that MRML needs to support for NA-MIC needs (with ability to extend). ITK and VTK factories will be responsible for mapping them to the specific system.

Coordinate Systems:
- How are Slicer internal coordinate systems (RAS, LPS, ijk, vtk) represented by the new coordinate system manager? Do we store RAStoIJK as part of MRML node metadata? Each RAStoRAS Transform is a MRML node. Coordinate System Manager has pointers to those transforms and internal RAStoIJK transforms of volumes and models.
- How are non-linear transforms represented? Do we support displacement fields, BSplines? From what coordinate system to what they transform? How are vectors, normals and tensors treated? All non-linear transforms are RAStoRAS. Need new vtk Transforms similar to ITK transforms. vtk transforms can be implemented with vtkITK wrappers
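
For reference, the relationship between the coordinate systems named above can be sketched in a few lines. RAS and LPS differ only in the sign of the first two axes, and an IJK-to-RAS mapping is a 4x4 homogeneous matrix applied to the voxel index (RAStoIJK is its inverse). The type and function names below are hypothetical, and the example matrix assumes an axis-aligned volume.

```cpp
#include <array>
#include <cstddef>

using Point3 = std::array<double, 3>;
// Hypothetical 4x4 row-major homogeneous matrix.
using Matrix4 = std::array<std::array<double, 4>, 4>;

// RAS (right, anterior, superior) and LPS (left, posterior, superior)
// differ only in the sign of x and y, so one function converts both ways.
inline Point3 FlipRasLps(const Point3 &p)
{
  Point3 q = {{-p[0], -p[1], p[2]}};
  return q;
}

// Build an IJK-to-RAS matrix for an axis-aligned volume from its voxel
// spacing and the RAS position of voxel (0, 0, 0).
inline Matrix4 AxisAlignedIjkToRas(const Point3 &spacing, const Point3 &origin)
{
  Matrix4 m{};  // all zeros
  for (std::size_t i = 0; i < 3; ++i)
  {
    m[i][i] = spacing[i];
    m[i][3] = origin[i];
  }
  m[3][3] = 1.0;
  return m;
}

// Apply an IJK-to-RAS matrix to a (possibly fractional) voxel index.
inline Point3 IjkToRas(const Matrix4 &ijkToRas, const Point3 &ijk)
{
  Point3 ras = {{0.0, 0.0, 0.0}};
  for (std::size_t r = 0; r < 3; ++r)
    ras[r] = ijkToRas[r][0] * ijk[0] + ijkToRas[r][1] * ijk[1] +
             ijkToRas[r][2] * ijk[2] + ijkToRas[r][3];
  return ras;
}
```

With spacing (2, 2, 2) and origin (10, 20, 30), voxel (1, 2, 3) maps to RAS (12, 24, 36), which corresponds to LPS (-12, -24, 36).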

Execution model:
- Need C++ classes for Application, Modules, Viewers. Move tcl global arrays to vtk collections. Move all visualization and application state and logic into its own classes. Need a C++ API for update loop, observers.

Client/Server:
- Do we support the entire mrml API between client and server? Initially we support only ITK style ImageIO. Later full API?
- Do we use CORBA? Need to have stream based serialization for both Mrml nodes and Mrml data
- Do we use an SQL database? The database has to support client/server mode for simultaneous read-write operations.

Path-based MRML3 proposal

Here's a proposal for a path-based MRML3 implementation, using ideas from the Coordinate Space Manager.

Current Status

First C++ draft is in the sandbox

Current Tools

Python prototype by Mike of the path-based XML layer:
1. parses XML elements and overlays semantics of "path" and "ref" tags.
2. remote resource cache implemented
3. mechanism for handling namespaced elements and attributes implemented
4. reading files implemented
5. writing 70% implemented (remaining issue: renaming resources while maintaining links)
6. renaming resources 50% implemented (what happens when you want to move a remote resource like a URL into a local file?)
C++ prototype by Luis

Test Data

N/A.

Team Members

Mike Halle - BWH
Alex Yarmarkovich - Isomics
Xiaodong Tao - GE
Luis Ibanez - Kitware
Steve Pieper - Isomics

Slides

File:2006 AHM Programming Half Week SlicerDataModel.ppt
