Recent advances in sequencing technology have resulted in a deluge of genomic data. To make sense of this data, there is an urgent need for algorithms for data processing and quantitative reasoning. An emerging in silico approach, called computational genomic signatures, addresses this need by representing global species-specific features of genomes using simple mathematical models.
This text introduces the general concept of computational genomic signatures and reviews some of the DNA sequence models that can be used as computational genomic signatures. It takes the position that a practical computational genomic signature consists of both a model and a measure for computing the distance or similarity between models. It therefore also discusses sequence similarity and distance measurement in the context of computational genomic signatures. The remainder of the text covers applications of computational genomic signatures in metagenomics, phylogenetics, and the detection of horizontal gene transfer.
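To make the pairing of a model with a distance measure concrete, the following is a minimal sketch rather than a method taken from the text. It assumes, purely for illustration, that the model is a vector of relative k-mer (oligonucleotide) frequencies and that the measure is Euclidean distance; the function names `kmer_signature` and `euclidean_distance` are hypothetical.

```python
from collections import Counter
from itertools import product
import math


def kmer_signature(sequence, k=2):
    """Return the relative frequency of every DNA k-mer, in lexicographic order.

    Assumption: this frequency vector stands in for the 'model' half of a
    signature. Windows containing symbols other than A, C, G, T are ignored.
    """
    sequence = sequence.upper()
    # Count every overlapping window of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Enumerate all 4**k possible k-mers so every signature has the same length.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return [counts[m] / total for m in kmers]


def euclidean_distance(sig_a, sig_b):
    """Assumption: Euclidean distance stands in for the 'measure' half."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(sig_a, sig_b)))


if __name__ == "__main__":
    # Two made-up toy sequences, not real genomic data.
    sig_x = kmer_signature("ACGTACGTACGGTTAACC", k=2)
    sig_y = kmer_signature("GGGCGCGCCGGCGCGGCC", k=2)
    print(euclidean_distance(sig_x, sig_y))
```

Under these assumptions, two sequences with similar k-mer composition yield a small distance, and either half of the pair can be swapped independently, for example a different sequence model or a different distance or similarity measure.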