
Here we highlight the software packages and data sets that we make available. Some software can be downloaded freely and anonymously, some requires that you log in and provide your contact information, and some requires a license. Contact Dr. Altman if you have questions.

Our recent projects can be found in the Helix Research Lab GitHub.


COLLAPSE is a representation learning method for protein structural and functional sites. The repository contains all package functionality for embedding new protein datasets as well as scripts for functional site search and annotation, pre-training, and transfer learning. 

License: MIT


FEATURE is a suite of automated tools that examine biological structures and produce useful representations of the key biophysical and biochemical features of these structures that are critical for understanding function. The utility of this system extends from medical/pharmaceutical applications (model-based drug design, comparing pharmacological activities) to industrial applications (understanding structural stability, protein engineering).

License: See SimTK page


PocketFEATURE is an algorithm that seeks similar microenvironments within two binding sites, and assesses overall binding site similarity by the presence of multiple shared microenvironments. The method has relatively weak geometric requirements and uses multiple biophysical and biochemical measures to characterize the microenvironments (to allow for diverse modes of ligand binding). The method is able to recognize several proven distant relationships, and predicts unexpected shared ligand binding.

License: MIT Use Agreement
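The idea of scoring two binding sites by their shared microenvironments can be illustrated with a minimal sketch. This is not the PocketFEATURE implementation: the feature vectors, the cosine similarity measure, the mutual-best-match rule, and the `threshold` value are all assumptions chosen for illustration, and the sketch ignores geometry entirely.

```python
import numpy as np


def shared_microenvironments(site_a, site_b, threshold=0.9):
    """Return mutually best-matching microenvironment pairs between two sites.

    site_a, site_b: 2-D arrays with one row per microenvironment feature vector
                    (hypothetical descriptors, not the real FEATURE vectors).
    threshold: hypothetical cosine-similarity cutoff, not taken from the method.
    """
    # Normalize each row so the dot product equals cosine similarity.
    a = site_a / np.linalg.norm(site_a, axis=1, keepdims=True)
    b = site_b / np.linalg.norm(site_b, axis=1, keepdims=True)
    sim = a @ b.T

    best_for_a = sim.argmax(axis=1)  # best partner in site B for each row of A
    best_for_b = sim.argmax(axis=0)  # best partner in site A for each row of B

    pairs = []
    for i, j in enumerate(best_for_a):
        # Keep a pair only if each side picks the other and the match is strong.
        if best_for_b[j] == i and sim[i, j] >= threshold:
            pairs.append((i, int(j), float(sim[i, j])))
    return pairs


def site_similarity(site_a, site_b, threshold=0.9):
    """Sum the similarities of all shared microenvironment pairs."""
    return sum(score for _, _, score in shared_microenvironments(site_a, site_b, threshold))
```

Because the score accumulates over every matched pair, two sites with several moderately similar microenvironments can outrank two sites with a single near-identical one, which reflects the emphasis on multiple shared microenvironments described above.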


FragFEATURE uses the FEATURE system representation to analyze pockets in proteins and to predict small molecule fragments that are likely to bind in specific areas of the pocket. FragFEATURE uses the statistical association between protein microenvironment and the location of nearby molecular fragments to make these predictions.

License: MIT Use Agreement

Ensemble Biclustering for Classification (EBC)

A Python implementation of the Ensemble Biclustering for Classification (EBC) algorithm. Despite having "biclustering" in its name, EBC is a co-clustering algorithm that lets you co-cluster very large, sparse N-dimensional matrices. For details and examples of using EBC, please see this paper.

License: MIT Use Agreement
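To make the terms "sparse N-dimensional matrix" and "co-clustering" concrete, here is a small sketch. It is not the EBC algorithm or the EBC package's API; `block_sums` and the data layout are hypothetical. It only shows how sparse entries can be stored as coordinate tuples and how a given assignment of indices to clusters along every dimension groups those entries into blocks.

```python
from collections import defaultdict


def block_sums(entries, assignments):
    """Aggregate sparse entries into co-cluster blocks.

    entries: dict mapping coordinate tuples (one index per dimension) to values;
             coordinates that are absent are implicit zeros.
    assignments: list with one dict per dimension, mapping index -> cluster id.
    """
    blocks = defaultdict(float)
    for coords, value in entries.items():
        # Map each index to its cluster along the corresponding dimension.
        block = tuple(assignments[d][idx] for d, idx in enumerate(coords))
        blocks[block] += value
    return dict(blocks)


# A 3 x 3 matrix with four nonzero entries, stored sparsely.
entries = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 1.0, (2, 2): 5.0}
row_clusters = {0: 0, 1: 0, 2: 1}
col_clusters = {0: 0, 1: 0, 2: 1}
print(block_sums(entries, [row_clusters, col_clusters]))
```

A co-clustering method searches for the assignments themselves; this sketch takes them as given and only shows the resulting block structure.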

Gene-Gene Application for DeepDive

Gene-Gene is an application for the DeepDive system. The code is not yet formatted for the current release of DeepDive and is being updated.

Parsed datasets are provided by Chris Re's group at Stanford University: DeepDive Open Data