Introduction
One of the latest additions to BioSmalltalk is a wrapper for running the well-known ShapeIt2 software (that is, ShapeIt v2). ShapeIt is a fast and accurate method for estimating haplotypes (a.k.a. phasing) from a set of SNP genotypes (.ped format or its .bed/.bim/.fam binary version) and a genetic map (.map format). It produces as output either a single set of estimated haplotypes or a haplotype graph that encapsulates the uncertainty about the underlying haplotypes. The software is currently available only for Unix-like operating systems.
Usage
To use the wrapper, the program binary must be in a directory listed in the system PATH environment variable, and all input files, whether binarized PLINK (bed, bim, fam) or textual PLINK (ped, map), must share the same base name. The following expression launches ShapeIt2 from BioSmalltalk, setting several parameters:
- The number of burn-in MCMC iterations
- The input file name (without extension)
- The output file name for the best haplotypes
- The number of threads, to take advantage of the multi-threading capabilities
BioShapeIt2WrapperR727 new
    burn: 10;
    inputBinarized: 'input_brangus';
    outputMax: 'output_brangus';
    thread: 8;
    execute
If you prefer to specify the input files explicitly:
BioShapeIt2WrapperR644 new
    burn: 10;
    inputTextual: 'input_brangus.ped';
    inputMap: 'input_brangus.map';
    outputMax: 'output_brangus';
    thread: 8;
    execute
Features
BioShapeIt2Wrapper is now a superclass for specialized subclasses, each one representing a particular release of ShapeIt2. When I started the wrapper, the binary executable of ShapeIt2 was named "shapeit.v2.r644.linux.x86_64". Later I found that "shapeit.v2.r727.linux.x64" had been released, but it cannot be run on CentOS 6.x. So you may want to keep an older version, and also know which binaries are available (this does not mean they are installed on your system, of course):
BioShapeIt2Wrapper releases "an OrderedCollection('shapeit.v2.r644.linux.x86_64' 'shapeit.v2.r727.linux.x64')"
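As a small sketch of how that list might be used, the snippet below checks whether a given binary name is among the known releases before you pick a wrapper subclass. It relies only on the releases message shown above plus standard collection protocol; the wanted variable is purely illustrative, and being listed still says nothing about whether the binary is installed.

```smalltalk
| wanted |
"Illustrative binary name; replace with the release you intend to run."
wanted := 'shapeit.v2.r644.linux.x86_64'.

"releases answers a collection of binary names, so includes: is enough here."
(BioShapeIt2Wrapper releases includes: wanted)
    ifTrue: [ Transcript show: wanted, ' is a known release'; cr ]
    ifFalse: [ Transcript show: wanted, ' is not a known release'; cr ]
```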