SDIWG:NCBC Software Classification MAGNet Examples

DelPhi

Description: DelPhi provides numerical solutions to the Poisson-Boltzmann equation (in both its linear and nonlinear forms) for molecules of arbitrary shape and charge distribution. The current version is fast (the best relaxation parameter is estimated at run time), accurate (the calculated electrostatic free energy is less dependent on the resolution of the lattice) and can handle extremely high lattice dimensions. It also includes flexible features for assigning different dielectric constants to different regions of space and for treating systems containing mixed salt solutions.
Data Input: DelPhi takes as input a coordinate file of a molecule, or equivalent data for geometrical objects and/or charge distributions.
Data Output: electrostatic potential in and around the system
Implementation Language: Fortran and C
Version, Date, Stage: Stable public release
Authors: E.Alexov, R.Fine, M.K.Gilson, A.Nicholls, W.Rocchia, K.Sharp, and B. Honig.
Platforms Tested: Unix (SGI IRIX), Linux, PC (requires Fortran and C compilers), IBM AIX and Mac.
License: Freely available to academia; pay model for commercial users.
Keywords: Finite Difference Poisson-Boltzmann Solver
URL: http://trantor.bioc.columbia.edu/delphi
Organization: MAGNet
NCBC Ontology Classification: Atomic --> SoftwareFunction --> Protein Modeling and Classification --> Numerical Calculation of Electrostatic Potential
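
As a rough illustration of the finite-difference relaxation idea mentioned in the description, the sketch below solves a one-dimensional Poisson problem on a lattice using successive over-relaxation. It is not DelPhi code: the function name `solve_poisson_1d`, the 1D grid, the uniform dielectric, the absence of salt terms, and the fixed relaxation parameter `omega` are all assumptions made for brevity (DelPhi itself works in three dimensions and estimates the relaxation parameter at run time).

```python
def solve_poisson_1d(rho, h, omega=1.5, tol=1e-10, max_iter=100_000):
    """Solve -phi'' = rho on a 1D lattice with phi = 0 at both ends.

    Hypothetical helper: rho is the charge density at each lattice point,
    h the lattice spacing, omega the over-relaxation parameter (1 < omega < 2).
    """
    n = len(rho)
    phi = [0.0] * n
    for _ in range(max_iter):
        max_change = 0.0
        for i in range(1, n - 1):
            # Gauss-Seidel estimate from the two lattice neighbours
            gs = 0.5 * (phi[i - 1] + phi[i + 1] + h * h * rho[i])
            # Over-relax: move past the Gauss-Seidel value by a factor omega
            new = phi[i] + omega * (gs - phi[i])
            max_change = max(max_change, abs(new - phi[i]))
            phi[i] = new
        if max_change < tol:
            break
    return phi


# Uniform unit charge density on [0, 1]; the exact solution is x(1 - x)/2.
n = 21
h = 1.0 / (n - 1)
phi = solve_poisson_1d([1.0] * n, h)
```

The choice of `omega` controls how many sweeps are needed, which is why estimating a good value at run time matters for speed.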

GRASP

Description: A molecular visualization and analysis program. It is particularly useful for the display and manipulation of the surfaces of molecules and their electrostatic properties.
Data Input: PDB files, potential maps from DelPhi
Data Output: molecular graphics.
Implementation Language: Fortran
Version, Date, Stage: v1.3.6. Stable public release.
Authors: Anthony Nicholls and Barry Honig.
Platforms Tested: SGI machines running IRIX 5.x and 6.x (INDYs, INDIGOs including Impact, Octane and O2).
License: Freely available to academia.
Keywords: molecular visualization
URL: http://trantor.bioc.columbia.edu/grasp
Organization: MAGNet
NCBC Ontology Classification: Atomic --> SoftwareFunction --> Protein Modeling and Classification --> Molecular Visualization Package

Nest

Description: Models protein structure based on a sequence-template alignment. The current server works only for modeling with a single template. Nest is part of JACKAL, which can be downloaded.
Data Input: PIR and PDB files
Data Output:
Implementation Language: C++
Version, Date, Stage: Stable public release.
Authors: Xiang, Z. and Honig, B.
Platforms Tested: platform independent (web based tool)
License: Freely available to academia.
Keywords: modeling, protein structure, sequence-template alignment.
URL: http://honiglab.cpmc.columbia.edu/cgi-bin/jackal/nest.cgi
Organization: MAGNet
NCBC Ontology Classification: Atomic --> SoftwareFunction --> Protein Modeling and Classification --> Homology Modeling

JACKAL

Description: JACKAL is a collection of programs designed for the modeling and analysis of protein structures. Its core program is nest, a versatile homology modeling package. JACKAL has the following capabilities: 1) comparative modeling based on single, composite or multiple templates; 2) side-chain prediction; 3) modeling residue mutation, insertion or deletion; 4) loop prediction; 5) structure refinement; 6) reconstruction of protein missing atoms; 7) reconstruction of protein missing residues; 8) prediction of hydrogen atoms; 9) fast calculation of solvent accessible surface area; 10) structure superimposition.
Data Input:
Data Output:
Implementation Language: C++
Version, Date, Stage: Version 1.5 as of Oct 20, 2002. Stable public release.
Authors: Z. Xiang and B. Honig
Platforms Tested: SGI 6.5, Intel Linux and Sun Solaris
License: Freely available to academia.
Keywords: Protein Structure Modeling
URL: http://trantor.bioc.columbia.edu/programs/jackal
Organization: MAGNet
NCBC Ontology Classification: Atomic --> SoftwareFunction --> Protein Modeling and Classification --> Prediction of Side-chain Conformations

GRASP2

Description: GRASP2 is an updated version of the GRASP program for macromolecular structure and surface visualization. It contains a large number of new features and scientific tools: an enhanced GUI; structure alignment and domain database scanning; a Gaussian surface generator and new surface coloring schemes; sequence visualization and alignment; storage of completed work in "project files" (the many objects that can be stored in a project file include views of the structure, defined subsets and surfaces); and direct printing to printers at full printer resolution.
Data Input: PDB files, potential maps from DelPhi, sequence alignments.
Data Output: molecular graphics, structural alignments.
Implementation Language: C++
Version, Date, Stage: Stable public release
Authors: Donald Petrey and Barry Honig.
Platforms Tested: Windows, Linux
License: Freely available to academia.
Keywords: molecular visualization
URL: http://trantor.bioc.columbia.edu/grasp2
Organization: MAGNet
NCBC Ontology Classification: Atomic --> SoftwareFunction --> Protein Modeling and Classification --> Molecular Visualization Package

PrISM

Description: PrISM is an integrated computational system where computational tools are implemented for protein sequence and structure analysis and modeling.
Data Input:
Data Output:
Implementation Language: Fortran
Version, Date, Stage: Stable public release
Authors: Wang, L, Yang, A. S. & Honig, B.
Platforms Tested: SGI IRIX, Intel Linux
License: Freely available to academia.
Keywords: protein analysis/modeling
URL: http://trantor.bioc.columbia.edu/programs/PrISM/
Organization: MAGNet
NCBC Ontology Classification: Atomic --> SoftwareFunction --> Protein Modeling and Classification --> Homology Modeling

Protein-DNA interface alignment

Description: The protein-DNA alignment software allows one to align the interfacial amino acids from two protein-DNA complexes based on the geometric relationship of each amino acid to its local DNA.
Data Input: two PDB files that both contain protein-DNA complexes
Data Output: The programs will output the aligned residues and their corresponding residue-residue similarity scores, s(i,j).
Implementation Language: C++ and Perl
Version, Date, Stage: Stable public release.
Authors: Siggers, T.W., Silkov, A & Honig, B.
Platforms Tested: Linux
License: Freely available to academia
Keywords: protein-DNA interface
URL: http://trantor.bioc.columbia.edu/programs/intfc_aln
Organization: MAGNet
NCBC Ontology Classification: Atomic --> SoftwareFunction --> Protein Modeling and Classification --> Prediction of Side-chain Conformations

SURFace

Description: The SURFace algorithms are programs that calculate solvent accessible surface area and curvature-corrected solvent accessible surface area.
Data Input:
Data Output:
Implementation Language:
Version, Date, Stage: Stable public release.
Authors: Nicholls, A., Sharp, K., Sridharan, S. and Honig, B.
Platforms Tested: SGI
License: Freely available to academia.
Keywords: solvent accessible surface area
URL: http://trantor.bioc.columbia.edu/surf/
Organization: MAGNet
NCBC Ontology Classification: Atomic --> SoftwareFunction --> Protein Modeling and Classification --> Calculation of Solvent Accessible Area
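
To make the notion of solvent accessible surface area concrete, here is a minimal numerical sketch in the style of the Shrake-Rupley point-sampling method. This is a swapped-in textbook technique, not the SURFace algorithm, and it does not attempt the curvature correction; the function names, the 1.4 Å probe radius, and the number of sample points are assumptions chosen for illustration.

```python
import math


def sphere_points(n):
    """Roughly uniform points on the unit sphere (golden-spiral construction)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for k in range(n):
        z = 1.0 - 2.0 * (k + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        pts.append((r * math.cos(golden * k), r * math.sin(golden * k), z))
    return pts


def sasa(atoms, probe=1.4, n_points=500):
    """Per-atom accessible area for atoms given as (x, y, z, radius) tuples."""
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big_r = ri + probe  # radius of the probe-expanded sphere
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + big_r * ux, yi + big_r * uy, zi + big_r * uz
            # A sample point is buried if it lies inside any other expanded sphere
            buried = any(
                (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < (rj + probe) ** 2
                for j, (xj, yj, zj, rj) in enumerate(atoms)
                if j != i
            )
            exposed += not buried
        areas.append(4.0 * math.pi * big_r * big_r * exposed / n_points)
    return areas
```

For a single isolated atom every sample point is exposed, so the result reduces to the area of the probe-expanded sphere; overlapping neighbours reduce it.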

Target Explorer

Description: Automated prediction of complex regulatory elements for a specified set of transcription factors (TFs) in the Drosophila melanogaster genome. Target Explorer is a complex tool with a user-friendly, self-explanatory Web interface that allows the user to: 1) create a customized library of TF binding site matrices based on user-defined sets of training sequences; 2) search for new clusters of binding sites for a specified set of TFs; 3) extract annotation for potential target genes.
Data Input: genomic sequences
Data Output: clusters of known binding sites
Implementation Language: Perl, CGI
Version, Date, Stage: Stable public release.
Authors: Sosinsky A, Bonin CP, Mann RS, Honig B.
Platforms Tested: platform independent (web based tool)
License: Freely available to academia.
Keywords: prediction of binding sites for transcription factors
URL: http://trantor.bioc.columbia.edu/Target_Explorer/
Organization: MAGNet
NCBC Ontology Classification: Atomic --> SoftwareFunction --> Genomic & Phenotypic Analysis --> Sequence Annotation

MEDUSA and Gorgon

Description: MEDUSA is an algorithm for learning predictive models of transcriptional gene regulation from gene expression and promoter sequence data. By using a statistical learning approach based on boosting, MEDUSA learns cis regulatory motifs, condition-specific regulators, and regulatory programs that predict the differential expression of target genes. The regulatory program is specified as an alternating decision tree (ADT). The Java implementation of MEDUSA allows a number of visualizations of the regulatory program and other inferred regulatory information, implemented in the accompanying Gorgon tool, including hits of significant and condition-specific motifs along the promoter sequences of target genes and regulatory network figures viewable in Cytoscape.
Data Input: Discretized (up/down/baseline) gene expression data in plain text format, promoter sequences in FASTA format, list of candidate transcriptional regulators and signal transducers in plain text format.
Data Output: Regulatory program represented as a Java serialized object file readable by Gorgon and as a human readable XML file. Gorgon currently generates views of learned PSSMs, positional hits along promoter sequences, and views of the ADT as HTML files, and generates network figures as Cytoscape format files.
Implementation Language: Java (prototyped in MATLAB)
Version, Date, Stage: Version 2.0, July 2006, pre-release beta version; Version 1.0 (MATLAB), April 2005, stable public release
Authors: David Quigley, Manuel Middendorf, Steve Lianoglou, Anshul Kundaje, Yoav Freund, Chris Wiggins, Christina Leslie
Platforms Tested: Windows, Linux, Mac OS X
License: Open source
Keywords:
URL: http://www.cs.columbia.edu/compbio/medusa (MATLAB), http://compbio.sytes.net:8090/medusa (Java beta version)
Organization: MAGNet
NCBC Ontology Classification: Atomic --> SoftwareFunction --> Genomic & Phenotypic Analysis --> Regulatory/Signaling network reconstruction

String kernel package

Description: The string kernel package contains implementations of the mismatch and profile string kernels for use with support vector machine (SVM) classifiers for protein sequence classification. Both kernels compute similarity between protein sequences based on common occurrences of k-length subsequences ("k-mers") counted with substitutions. Kernel functions for protein sequence data enable the training of SVMs for a range of prediction problems, in particular protein structural class prediction and remote homology detection. A version of the Spider MATLAB machine learning package is also bundled with the code, which allows users to train SVMs and evaluate performance on test sets with the packaged software.
Data Input: The mismatch kernel requires sequence data in FASTA format. The profile string kernel uses probabilistic profiles, such as those produced by PSI-BLAST, in place of the original sequences. The Spider SVM implementation requires both the kernel matrix and a label file of binary or multi-class labels for the training data; this data must be loaded into MATLAB variables before using Spider routines.
Data Output: The kernel code produces a kernel matrix for the input data in tab-delimited text format. The Spider package trains SVMs and stores the learned classifier and the results of applying the classifier to test data as MATLAB objects.
Implementation Language: String kernel code is implemented in C. Spider is a set of object-oriented MATLAB routines.
Version, Date, Stage: Version 1.2, September 2004, stable public release
Authors: Eleazar Eskin, Rui Kuang, Eugene Ie, Ke Wang, Jason Weston, Bill Noble, Christina Leslie
Platforms Tested: Windows, Linux
License: Open source
Keywords:
URL: http://www.cs.columbia.edu/compbio/string-kernels
Organization: MAGNet
NCBC Ontology Classification: Atomic --> SoftwareFunction --> Protein Modeling and Classification
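
The idea of counting shared k-mers "with substitutions" can be sketched as follows. This is a naive, unnormalized illustration of a (k, m)-mismatch kernel, not the packaged C implementation; the function names and the default values of `k` and `m` are assumptions, and the profile kernel is not covered.

```python
from collections import Counter
from itertools import combinations, product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def neighbourhood(kmer, m, alphabet=AMINO_ACIDS):
    """All strings of the same length that differ from kmer in at most m positions."""
    found = {kmer}
    for positions in combinations(range(len(kmer)), m):
        for letters in product(alphabet, repeat=m):
            chars = list(kmer)
            for pos, letter in zip(positions, letters):
                chars[pos] = letter
            found.add("".join(chars))
    return found


def mismatch_features(seq, k, m):
    """Feature map: each k-mer in seq adds one count to every k-mer in its neighbourhood."""
    feats = Counter()
    for i in range(len(seq) - k + 1):
        for neighbour in neighbourhood(seq[i:i + k], m):
            feats[neighbour] += 1
    return feats


def mismatch_kernel(x, y, k=3, m=1):
    """Inner product of the two mismatch feature vectors (no normalization)."""
    fx = mismatch_features(x, k, m)
    fy = mismatch_features(y, k, m)
    return sum(count * fy[kmer] for kmer, count in fx.items())
```

With `m=0` this reduces to a plain k-mer spectrum kernel. Explicit neighbourhood enumeration is only practical for small `k` and `m`.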

MatrixREDUCE

Description: Regulation of gene expression by a transcription factor requires physical interaction between the factor and the DNA, which can be described by a statistical mechanical model. Based on this model, the MatrixREDUCE algorithm uses genome-wide occupancy data for a transcription factor (e.g. ChIP-chip or mRNA expression data) and associated nucleotide sequences to discover the sequence-specific binding affinity of the transcription factor. The sequence specificity of the transcription factor's DNA-binding domain is modeled using a position-specific affinity matrix (PSAM), representing the change in the binding affinity (Kd) whenever a specific position within a reference binding sequence is mutated. The PSAM can be transformed into an affinity logo for visualization using the utility program AffinityLogo, and a MatrixREDUCE run can be summarized in an easy-to-navigate webpage using HTMLSummary.
Data Input: sequence file in FASTA format; and expression data file in tab-delimited text format.
Data Output: PSAMs in numeric and graphical format, parameters of the fitted model, and an HTML summary page.
Implementation Language: ANSI C, making use of Numerical Recipes routines.
Version, Date, Stage: Version 1.0, July 10, 2006, extensively tested in lab.
Authors: Barrett Foat, Xiang-Jun Lu, Harmen J. Bussemaker
Platforms Tested: Linux, Cygwin (Windows), Mac OS X
License:
Keywords: position-specific affinity matrix, binding affinity, cis-regulatory element, expression data, ChIP-chip, transcription factor
URL: http://www.bussemakerlab.org/software/MatrixREDUCE
Organization: MAGNet
NCBC Ontology Classification: Atomic --> SoftwareFunction --> Genomic & Phenotypic Analysis --> Regulatory/Signaling network reconstruction
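
A PSAM can be read as a table of relative affinities, one per base and position, with the reference sequence scoring 1. The sketch below shows one way such a matrix could be used to score a sequence, assuming that position contributions multiply within a window and that windows add along the sequence. The function names, the forward-strand-only scan, and the example matrix values are hypothetical, and the fitting of the PSAM to occupancy data is not shown.

```python
def window_affinity(psam, window):
    """Relative affinity of one window: product of per-position weights."""
    affinity = 1.0
    for weights, base in zip(psam, window):
        affinity *= weights[base]
    return affinity


def sequence_affinity(psam, seq):
    """Sum of window affinities over every start position (forward strand only)."""
    width = len(psam)
    return sum(
        window_affinity(psam, seq[i:i + width])
        for i in range(len(seq) - width + 1)
    )


# Hypothetical two-position PSAM whose reference sequence is "GA".
example_psam = [
    {"A": 0.1, "C": 0.1, "G": 1.0, "T": 0.1},
    {"A": 1.0, "C": 0.2, "G": 0.2, "T": 0.2},
]
```

Under these assumptions, a single mutation away from the reference sequence lowers the window score by exactly the corresponding matrix entry.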

T-profiler

Description: T-profiler is a web-based tool that uses the t-test to score changes in the average activity of pre-defined groups of genes. The gene groups are defined based on Gene Ontology categorization, ChIP-chip experiments, upstream matches to a consensus transcription factor binding motif, and location on the same chromosome, respectively. If desired, an iterative procedure can be used to select a single, optimal representative from sets of overlapping gene groups. A jack-knife procedure is used to make calculations more robust against outliers. T-profiler makes it possible to interpret microarray data in a way that is both intuitive and statistically rigorous, without the need to combine experiments or choose parameters.
Data Input: Currently, gene expression data from Saccharomyces cerevisiae and Candida albicans are supported.
Data Output:
Implementation Language: T-profiler is written in PHP; data is managed by a MySQL database server.
Version, Date, Stage:
Authors: André Boorsma, Barrett C. Foat, Daniel Vis, Frans Klis, Harmen J. Bussemaker
Platforms Tested: Web-based application
License:
Keywords: gene expression, transcriptome, ChIP-chip, Gene Ontology
URL: http://www.t-profiler.org
Organization: MAGNet
NCBC Ontology Classification: Atomic --> SoftwareFunction --> Genomic & Phenotypic Analysis --> Network characterization
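
The scoring step described above can be illustrated with a pooled-variance two-sample t-statistic that compares the genes in a group against all remaining genes. This is a sketch under that assumption, not T-profiler's code: the function name and the gene identifiers are hypothetical, and the jack-knife and iterative group selection steps are omitted.

```python
import math
from statistics import mean, variance


def group_t_value(expression, group):
    """t-statistic for the mean expression of `group` versus all other genes.

    expression: dict mapping gene name -> expression value (e.g. log-ratio)
    group: set of gene names forming the pre-defined gene group
    """
    inside = [v for g, v in expression.items() if g in group]
    outside = [v for g, v in expression.items() if g not in group]
    n1, n2 = len(inside), len(outside)
    # Pooled variance of the two samples (each needs at least two genes)
    pooled = ((n1 - 1) * variance(inside) + (n2 - 1) * variance(outside)) / (n1 + n2 - 2)
    return (mean(inside) - mean(outside)) / math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))


# Hypothetical toy data: two genes in the group, two outside it.
toy_expression = {"g1": 2.0, "g2": 3.0, "g3": 0.0, "g4": 1.0}
t_value = group_t_value(toy_expression, {"g1", "g2"})
```

A large positive value suggests the group as a whole is up-regulated relative to the rest of the genome; a large negative value suggests the opposite.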

TranscriptionDetector

Description: A tool for finding probes measuring significantly expressed loci in a genomic array experiment. Given expression data from a tiling array experiment, TranscriptionDetector decides the likelihood that a probe is detecting transcription from the locus in which it resides. Probabilities are assigned by making use of a background signal intensity distribution from a set of negative control probes. This tool is useful for the functional annotation of genomes as it allows for the discovery of novel transcriptional units independently of any genomic annotation.
Data Input: Expression data (GEO or other platforms) and designation of which probes represent negative controls and which are data probes.
Data Output: A text file with a list of probes corresponding to significantly expressed loci.
Implementation Language: ANSI C, making use of GSL.
Version, Date, Stage:
Authors: Xiang-Jun Lu, Gabor Halasz, Marinus F. van Batenburg
Platforms Tested: Linux, Cygwin (Windows), Mac OS X
License:
Keywords: tiling arrays, expression, transcriptome
URL: http://www.bussemakerlab.org/software/TranscriptionDetector/
Organization: MAGNet
NCBC Ontology Classification:
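
One simple way to use a background distribution from negative control probes is an empirical p-value: the fraction of control intensities at or above the probe's intensity. The sketch below assumes this empirical approach with an add-one correction; the actual tool may model the background differently, and the function name and probe identifiers are hypothetical.

```python
from bisect import bisect_left


def detection_p_values(probe_intensity, negative_controls):
    """Empirical p-value per data probe against the negative-control background.

    probe_intensity: dict mapping probe id -> signal intensity
    negative_controls: iterable of intensities from negative control probes
    """
    background = sorted(negative_controls)
    n = len(background)
    p_values = {}
    for probe, value in probe_intensity.items():
        # Number of background intensities greater than or equal to this value
        at_or_above = n - bisect_left(background, value)
        # Add-one correction keeps the estimate strictly above zero
        p_values[probe] = (at_or_above + 1) / (n + 1)
    return p_values


p = detection_p_values({"p_high": 5.0, "p_mid": 3.0, "p_low": 0.0}, [1.0, 2.0, 3.0, 4.0])
```

Probes whose p-value falls below a chosen threshold would then be reported as significantly expressed.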

PhenoGO

Description: PhenoGO adds phenotypic contextual information to existing associations between gene products and Gene Ontology (GO) terms as specified in GO Annotations (GOA). PhenoGO utilizes an existing Natural Language Processing (NLP) system called BioMedLEE and an existing knowledge-based phenotype organizer system (PhenOS), in conjunction with MeSH indexing and established biomedical ontologies. The system also encodes the context to identifiers that are associated in different biomedical ontologies, including the UMLS, Cell Ontology, Mouse Anatomy, NCBI taxonomy, GO, and Mammalian Phenotype Ontology. In addition, PhenoGO was evaluated for coding of anatomical and cellular information and assigning the coded phenotypes to the correct GOA; the results show that PhenoGO has a precision of 91% and a recall of 92%, demonstrating that the PhenoGO NLP system can accurately encode a large number of anatomical and cellular ontologies to GO annotations. The PhenoGO Database may be accessed at www.phenogo.org.
Data Input: Gene Ontology Annotations Files and Medline Abstracts
Data Output: XML file and www.phenogo.org Web Portal
Implementation Language: A variety of modules: the web portal is in Java and MySQL; the computational terminology component (PhenOS) is written as Perl scripts that query tables in IBM DB2; the natural language processing component is written in Prolog.
Version, Date, Stage: Version 2, Feb 2006
Authors: Yves Lussier and Carol Friedman are the principal investigators. The programmers are Jianrong Li, Lee Sam, and Tara Borlawsky
Platforms Tested: n/a
License: n/a
Keywords: Phenotypic integration, computational phenotypes
URL: http://www.phenogo.org
Organization: MAGNet
NCBC Ontology Classification: Biotool --> Data Management --> Information retrieval, traversal and querying; Atomic --> SoftwareFunction --> Natural Language Processing

MINDY

Description: Given a transcription factor of interest, MINDY uses a large set of gene expression profile data to identify potential post-transcriptional modulators of the transcription factor's activity. MINDY is based on a three-way statistical interaction model that captures the post-transcriptional regulatory event where the ability of a transcription factor to activate/repress its target genes is monotonically controlled by a potential modulator gene.
Data Input: Gene expression data in the EXP format, and a user-specified transcription factor of interest
Data Output: Lists of the putative modulators and target genes of the transcription factor, and the modulatory interactions involving them
Implementation Language: C++, MATLAB and Java
Version, Date, Stage: Stable release, April 2007
Authors: Kai Wang, Ilya Nemenman, Adam Margolin, Riccardo Dalla-Favera, Andrea Califano
Platforms Tested: Linux, Cygwin
License: n/a
Keywords: gene expression, transcriptional interaction, modulator
URL: n/a
Organization: MAGNet
NCBC Ontology Classification: Atomic --> SoftwareFunction --> Genomic & Phenotypic Analysis --> Regulatory/Signaling network reconstruction

B Cell Interactome

Description: The B cell interactome (BCI) is a network of protein-protein, protein-DNA and modulatory interactions in human B cells. The network contains known interactions (reported in public databases) and interactions predicted by a Bayesian evidence integration framework that integrates a variety of generic and context-specific experimental clues about protein-protein and protein-DNA interactions, such as a large collection of B cell expression profiles, with inferences from different reverse engineering algorithms, such as GeneWays and ARACNE. Modulatory interactions are predicted by MINDY, an algorithm for the prediction of modulators of transcriptional interactions.
Data Input: n/a
Data Output: Text file of binary interactions, each associated with a probability.
Implementation Language: Perl
Version, Date, Stage: Version 2, March 2007
Authors: Lefebvre C, Lim WK, Basso K, Dalla Favera R, and Califano A.
Platforms Tested: n/a
License:
Keywords: Naive Bayes, Mixed-Interaction Network, human B cells.
URL: http://amdec-bioinfo.cu-genome.org/html/BCellInteractome.html
Organization: MAGNet
NCBC Ontology Classification: Atomic --> SoftwareFunction --> Interaction Modeling
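
The keywords mention Naive Bayes; the sketch below shows the generic form of such evidence integration, in which independent clues each contribute a likelihood ratio that multiplies the prior odds of an interaction. It is a textbook illustration only: the function name, the prior, and the likelihood-ratio values are hypothetical and are not taken from the BCI pipeline.

```python
def posterior_probability(prior, likelihood_ratios):
    """Combine independent evidence under the naive Bayes assumption.

    prior: prior probability that a pair of genes/proteins interacts (0 < prior < 1)
    likelihood_ratios: one value per evidence source,
        P(evidence | interaction) / P(evidence | no interaction)
    """
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio  # independence lets the ratios simply multiply
    return odds / (1.0 + odds)


# Hypothetical example: a rare interaction supported by two pieces of evidence.
prob = posterior_probability(0.1, [3.0, 3.0])
```

The resulting probability is the kind of per-interaction score that the output file associates with each binary interaction.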

ARACNE

Description: ARACNE is an algorithm for inferring gene regulatory networks from a set of microarray experiments. The method uses mutual information to identify genes that are co-expressed and then applies the data processing inequality to filter out interactions that are likely to be indirect.
Data Input: Text file containing measurements from a set of microarray experiments.
Data Output: Text file containing predicted interactions.
Implementation Language: C++, Java
Version, Date, Stage: Version 1, June 2006
Authors: Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A.
Platforms Tested: Windows, Linux
License: Open source
Keywords: Reverse engineering, mutual information, genetic networks, microarray
URL: http://amdec-bioinfo.cu-genome.org/html/ARACNE.htm
Organization: MAGNet
NCBC Ontology Classification: Atomic --> SoftwareFunction --> Genomic & Phenotypic Analysis --> Regulatory/Signaling network reconstruction
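
The data processing inequality step can be sketched as follows: in every fully connected triplet of genes, the edge with the lowest mutual information is treated as a likely indirect interaction and removed. The sketch assumes the pairwise mutual information values have already been estimated; the function name, the dictionary representation, and the way the `tolerance` parameter is applied are assumptions, not ARACNE's actual interface.

```python
from itertools import combinations


def apply_dpi(mi, tolerance=0.0):
    """Prune likely indirect edges using the data processing inequality.

    mi: dict mapping frozenset({gene_a, gene_b}) -> mutual information estimate
    Returns a new dict without the edges flagged as indirect.
    """
    genes = sorted({gene for pair in mi for gene in pair})
    removed = set()
    for a, b, c in combinations(genes, 3):
        edges = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if not all(edge in mi for edge in edges):
            continue  # only fully connected triplets are examined
        weakest = min(edges, key=mi.get)
        others = [edge for edge in edges if edge != weakest]
        # Flag the weakest edge if it is clearly below both other edges
        if mi[weakest] < min(mi[edge] for edge in others) * (1.0 - tolerance):
            removed.add(weakest)
    # Decisions use the original values, so the result is order-independent
    return {edge: value for edge, value in mi.items() if edge not in removed}


# Hypothetical triplet: A-C is the weakest edge and is pruned.
network = apply_dpi({
    frozenset(("A", "B")): 0.9,
    frozenset(("B", "C")): 0.8,
    frozenset(("A", "C")): 0.3,
})
```

Edges are flagged against the original mutual information values and removed only at the end, so the outcome does not depend on the order in which triplets are visited.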

geWorkbench

Description: geWorkbench is a Java application that provides users with an integrated suite of genomics tools. It is built on an open-source, extensible architecture that promotes interoperability and simplifies the development of new as well as the incorporation of pre-existing components. The resulting system provides seamless access to a multitude of both local and remote data and computational services through an integrated environment that offers a unified user experience. Over 50 data analysis and visualization components have been developed for the framework, covering a wide range of genomics domains including gene expression, sequence, structure and network data.
Data Input: Gene expression data (Affy, GenePix, RMA), Sequence (FASTA), Structure (PDB).
Data Output: Analysis results (multiple formats).
Implementation Language: Java
Version, Date, Stage: 1.0.5, 3/23/07, stable production release
Authors: A. Califano, A. Floratos, M. Kustagi, K. Smith, J. Watkinson, M. Hall, K. Keshav, X. Zhang, K. Kushal, B. Jagla, E. Daly, M. VanGinhoven, P. Morozov.
Platforms Tested: Windows XP, Linux, Mac OS 10.x.
License: Free.
Keywords: Analysis suite, gene expression analysis, sequence analysis, network reconstruction, structure prediction, visualization.
URL: http://www.geworkbench.org
Organization: MAGNet
NCBC Ontology Classification: Atomic --> SoftwareFunction --> Genomic & Phenotypic Analysis; Atomic --> SoftwareFunction --> Interaction Modeling; Atomic --> SoftwareFunction --> Protein Modeling and Classification; Atomic --> SoftwareFunction --> Software Engineering and Development Tool --> Integration --> Resource Integration Components; Atomic --> SoftwareFunction --> Software Engineering and Development Tool --> Integration --> Grid Computing Resources; Atomic --> SoftwareFunction --> Visualization
