Monday, March 06, 2006

PAL: Phylogenetic Analysis Library

The PAL Project:

"PAL: Phylogenetic Analysis Library

Main Features:

PAL is written entirely in the Java language. This allows for a clean object-oriented design while avoiding the complexities of C++. Moreover, Java class code runs without recompilation on a wide range of platforms. PAL also compiles into native code on Unix systems (just like C++) using the GNU compiler for Java (gcj), part of recent releases of the GNU compiler collection (gcc). Corresponding makefiles are included with this distribution of PAL.

PAL consists of a rich variety of objects to facilitate the construction of special-purpose tools for phylogenetic analysis. PAL contains, e.g., ready-to-use objects for:

* reading and writing sequence alignments, distance matrices, and trees
* a large variety of substitution models for nucleotides and amino acids (REV, TN, HKY, F84, F81, JC; Dayhoff, JTT, MTREV24, BLOSUM, VT, WAG, CPREV) as well as for codons (Yang codon model)
* various models for rate variation over sites (invariable sites, Gamma)
* efficient maximum-likelihood estimation of pairwise distances and of branch lengths in a tree (for unconstrained, clock, and dated-tips clock trees)
* simulating coalescence intervals and estimating demographic parameters
* likelihood ratio and chi-square tests, and tests for comparing phylogenetic hypotheses (e.g., Kishino-Hasegawa and Shimodaira-Hasegawa tests)
* manipulating alignments (e.g., bootstrapping) and trees and simulating data
* optimizing uni- and multivariate functions by various methods, computing numerical derivatives, generating simulation-quality random numbers, sorting, etc.
* creating formatted input and output from/to files, standard I/O streams, and strings, through convenience classes that extend the standard Java IO library
* constructing neighbor-joining, UPGMA, and SUPGMA trees, and estimating least-squares branch lengths on trees (weighted and unweighted LS)
"
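
To make one item on that list concrete: under the JC (Jukes-Cantor) model, the maximum-likelihood estimate of the distance between two aligned nucleotide sequences has a closed form, d = -3/4 ln(1 - 4p/3), where p is the proportion of sites that differ. The sketch below is plain Java and does not use PAL's own classes; the class name `JukesCantorDistance`, its method names, and the choice to skip gaps and ambiguity codes are all assumptions made for illustration. PAL's estimators for the richer models are presumably numerical rather than closed-form.

```java
// Illustrative sketch only: plain Java, NOT PAL's API. All names are hypothetical.
final class JukesCantorDistance {

    private JukesCantorDistance() {}

    /** Proportion of differing sites among positions where both sequences hold A, C, G or T. */
    static double pDistance(String a, String b) {
        if (a.length() != b.length()) {
            throw new IllegalArgumentException("sequences must be aligned to equal length");
        }
        int compared = 0;
        int differing = 0;
        for (int i = 0; i < a.length(); i++) {
            char x = Character.toUpperCase(a.charAt(i));
            char y = Character.toUpperCase(b.charAt(i));
            // Assumption: positions with gaps or ambiguity codes are simply skipped.
            if ("ACGT".indexOf(x) < 0 || "ACGT".indexOf(y) < 0) {
                continue;
            }
            compared++;
            if (x != y) {
                differing++;
            }
        }
        if (compared == 0) {
            throw new IllegalArgumentException("no comparable sites");
        }
        return (double) differing / compared;
    }

    /** Jukes-Cantor distance in expected substitutions per site. */
    static double distance(String a, String b) {
        double p = pDistance(a, b);
        if (p >= 0.75) {
            // The sequences are saturated: the estimate diverges.
            return Double.POSITIVE_INFINITY;
        }
        if (p == 0.0) {
            return 0.0;
        }
        return -0.75 * Math.log(1.0 - 4.0 * p / 3.0);
    }

    public static void main(String[] args) {
        // One difference in ten sites, so the estimate is slightly above p = 0.1.
        System.out.println(distance("ACGTACGTAC", "ACGTACGTAA"));
    }
}
```

The corrected distance is always at least as large as the raw proportion p, because it accounts for multiple substitutions at the same site that a simple count cannot see.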