Software / programs


hapLOH profiles and characterizes tumor genomes using data from SNP microarrays (Vattathil & Scheet, 2013). It is designed to be effective in the presence of high levels of germline contamination. The software (written by Selina Vattathil and Jerry Fowler) is available for academic, non-commercial use. Registration is required to obtain it.


Cancer in silico Drug Discovery (CIDD) is a platform that integrates data from the TCGA, the Connectivity Map (CMap) and the Cancer Cell Line Encyclopedia to facilitate and automate the discovery of candidate drug compounds, with the ultimate goal of treating or chemo-preventing cancer. Our manuscript is currently in the final stages of preparation or under review. To inquire about obtaining the software, please contact Anthony San Lucas (lead author and developer), Eduardo Vilar, MD, PhD, or Paul Scheet, PhD.


vtools is a set of tools for annotating and tracking sequence variation in large-scale exome sequencing projects (San Lucas et al., 2011). It was developed and authored by Anthony San Lucas and Bo Peng, and is available for download.


Haploscope is a tool for visualizing haplotype diversity, based on a cluster-based model for haplotype variation (Scheet & Stephens, 2006). It automates the production of images such as those in Jakobsson et al. (2008). It is written in Java (by Anthony San Lucas).

Haploscope is freely downloadable under the GNU GPL v3 license.


fastPHASE is a program to estimate missing genotypes and unobserved haplotypes. It is an implementation of the model described in Scheet & Stephens (2006). This cluster-based model for haplotype variation gains its utility from implicitly modeling the genealogy of chromosomes in a random sample from a population as a tree, while summarizing all haplotype variation in the "tips" of the tree.
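
To make the cluster idea concrete, here is a minimal Python sketch of how a cluster-based model can fill in a missing allele at a single marker. It is an illustration only, not fastPHASE code. It assumes the probabilities that a haplotype belongs to each cluster at that marker have already been computed (in the Scheet & Stephens model, cluster membership varies along the chromosome according to a hidden Markov model). The function name and all numeric values are hypothetical.

```python
def impute_allele_probability(cluster_posteriors, cluster_allele_freqs):
    """Return P(allele == 1) at one marker for one haplotype.

    cluster_posteriors:   P(haplotype belongs to cluster k at this marker),
                          assumed to be precomputed and to sum to 1.
    cluster_allele_freqs: frequency of allele 1 within cluster k at this marker.
    """
    if len(cluster_posteriors) != len(cluster_allele_freqs):
        raise ValueError("one posterior and one allele frequency per cluster")
    if abs(sum(cluster_posteriors) - 1.0) > 1e-9:
        raise ValueError("cluster posteriors must sum to 1")
    # Average the cluster-specific allele frequencies, weighted by how
    # likely the haplotype is to belong to each cluster.
    return sum(p * f for p, f in zip(cluster_posteriors, cluster_allele_freqs))


# Hypothetical values for three clusters at one marker.
posteriors = [0.7, 0.2, 0.1]
allele_freqs = [0.9, 0.1, 0.5]

prob = impute_allele_probability(posteriors, allele_freqs)
imputed_allele = 1 if prob >= 0.5 else 0  # best-guess call for the missing allele
```

With these made-up numbers the haplotype most likely belongs to the first cluster, where allele 1 is common, so the best-guess call for the missing allele is 1.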

The program also offers additional functionality, including estimation and correction of genotyping errors based on patterns of linkage disequilibrium (Scheet & Stephens, 2008), haplotype-based association mapping of binary phenotypes, and estimation of missing genotypes from low-coverage sequencing data. We are developing a web-based tutorial for fastPHASE and will update this space soon.

Links for registration and download may be obtained from the Stephens Lab Software page.