Covary is a computational framework designed for large-scale biological sequence analysis, powered by TIPs-VF (Translator-Interpreter Pre-seeding for Variable-length Fragments). It leverages alignment-free, translation-aware embeddings to compare, cluster, and analyze genetic sequences at scale.
Circumvents computationally expensive multiple sequence alignments (MSA), enabling scalable analyses for large datasets.
Incorporates codon-bound information for biologically meaningful sequence comparison, and works on both coding and non-coding sequences.
Computes embeddings and distance matrices for downstream clustering and visualization across PCA, t-SNE, and UMAP.
Resolves sequences at species, genus, family, and order levels. Supports multi-FASTA files from a variety of organisms.
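As a rough illustration of how embeddings turn into a distance matrix, the sketch below computes pairwise Euclidean distances between fixed-length vectors using only the Python standard library. The function name `pairwise_distances` and the toy vectors are assumptions made for illustration; this is not Covary's API, and Covary's actual embeddings and distance metric may differ.

```python
import math

def pairwise_distances(embeddings):
    """Return a symmetric matrix of Euclidean distances between embedding vectors."""
    n = len(embeddings)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(embeddings[i], embeddings[j])
            # The matrix is symmetric, so fill both halves at once.
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix

# Hypothetical 2-D embeddings for three sequences (toy values, not Covary output).
toy_embeddings = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
distances = pairwise_distances(toy_embeddings)
```

A matrix of this shape is the usual input to distance-based clustering, while projection methods such as PCA, t-SNE, or UMAP typically operate on the embedding vectors themselves.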
Phylogenomic tools, data processing workflows and simulation pipelines built for Covary.
A k-mer-derived, non-overlapping, and frequency-independent encoding logic. Represents genetic sequences based on relative proximity, directional alignment, and translation awareness.
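To make "non-overlapping" and "frequency-independent" concrete, here is a minimal Python sketch that splits a sequence into consecutive k-mers and encodes each one as an integer while preserving order. It is a generic illustration only and does not reproduce the TIPs-VF encoding, whose details are not given here. The function names, the base-4 mapping, and the choice of `k=3` (mirroring codon length) are all assumptions.

```python
# Hypothetical base-to-digit mapping; ambiguous bases such as N are not handled
# and will raise KeyError.
BASE_TO_DIGIT = {"A": 0, "C": 1, "G": 2, "T": 3}

def non_overlapping_kmers(sequence, k=3):
    """Split a sequence into consecutive, non-overlapping k-mers, dropping any short tail."""
    usable = len(sequence) - len(sequence) % k
    return [sequence[i:i + k] for i in range(0, usable, k)]

def encode_kmer(kmer):
    """Map a k-mer to an integer by reading it as a base-4 number."""
    value = 0
    for base in kmer:
        value = value * 4 + BASE_TO_DIGIT[base]
    return value

def encode_sequence(sequence, k=3):
    """Encode a sequence as an ordered list of k-mer codes (order kept, nothing counted)."""
    return [encode_kmer(kmer) for kmer in non_overlapping_kmers(sequence.upper(), k)]
```

Because each k-mer keeps its position in the output list, the representation depends on where a k-mer occurs and not on how often it occurs, which is the sense of "frequency-independent" assumed in this sketch.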
A lightweight Python toolkit that simulates tumor-specific gene sequence profiles by applying patient mutation data from TCGA cohorts to a reference sequence. Recreates "mutated" FASTA outputs per patient.
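The general idea of applying per-patient mutations to a reference and writing one FASTA record per patient can be sketched as follows. This is not the toolkit's code: the function names, the `(position, ref, alt)` tuple format, and the toy data are hypothetical, and only single-base substitutions are handled (no insertions or deletions).

```python
def apply_substitutions(reference, substitutions):
    """Apply single-base substitutions (1-based position, ref base, alt base) to a reference."""
    bases = list(reference)
    for position, ref_base, alt_base in substitutions:
        if not 1 <= position <= len(bases):
            raise ValueError(f"position {position} is outside the reference")
        index = position - 1
        # Guard against mutation records that do not match the reference.
        if bases[index] != ref_base:
            raise ValueError(f"reference mismatch at position {position}")
        bases[index] = alt_base
    return "".join(bases)

def to_fasta(records, width=60):
    """Format (header, sequence) pairs as FASTA text with fixed-width sequence lines."""
    lines = []
    for header, sequence in records:
        lines.append(f">{header}")
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"

# Hypothetical reference and per-patient substitutions (toy data, not real TCGA records).
reference = "ATGGCCAAGT"
patient_mutations = {
    "patient_A": [(4, "G", "T")],
    "patient_B": [(1, "A", "C"), (10, "T", "G")],
}
mutated = {pid: apply_substitutions(reference, muts)
           for pid, muts in patient_mutations.items()}
fasta_text = to_fasta(sorted(mutated.items()))
```

Checking the reference base before substituting catches mutation records that were called against a different reference or use a different coordinate system.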
A computationally optimized tool that detects a common seed region across genetic sequences and reorders them to start at the same point, standardizing FASTA inputs for Covary without full MSA.
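One simple way to picture seed-based reordering is to find a short subsequence shared by every input and rotate each sequence so that it begins there. The sketch below does this with plain substring search. It is an assumption-laden illustration, not the tool's algorithm: the function names and the seed length `k` are hypothetical, sequences are treated as circular, and seeds that span the wrap-around point are not detected.

```python
def find_common_seed(sequences, k=8):
    """Return the first k-mer of the first sequence that occurs in every sequence, or None."""
    first = sequences[0]
    for i in range(len(first) - k + 1):
        seed = first[i:i + k]
        if all(seed in seq for seq in sequences[1:]):
            return seed
    return None

def rotate_to_seed(sequence, seed):
    """Rotate a sequence (treated as circular) to begin at the first occurrence of the seed."""
    start = sequence.find(seed)
    if start < 0:
        raise ValueError("seed not found in sequence")
    return sequence[start:] + sequence[:start]

# Toy inputs: three rotations of the same circular sequence.
sequences = ["TTACGGATCCA", "GATCCATTACG", "CCATTACGGAT"]
seed = find_common_seed(sequences, k=4)
standardized = [rotate_to_seed(seq, seed) for seq in sequences]
```

After rotation, the three toy sequences become identical, which shows how a shared starting point removes differences that come only from where each record happens to begin.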
Covary leverages alignment-free, translation-aware embeddings to compare, cluster, and analyze genetic sequences.
Perform large-scale phylogenomic analyses
Embed sequences with translation-aware context
Run fast, scalable exploratory workflows
Analyze sequences without alignment or gaps
Covary suits a range of research questions spanning phylogenetics, pathogen surveillance, and more.
Covary provides alignment-free, translation-aware comparisons that help distinguish species, infer relatedness, and support phylogenetic analyses even across highly divergent sequences.
Covary can rapidly screen genetic sequences from clinical or environmental samples to identify viral, bacterial, or fungal pathogens without requiring multiple sequence alignment.
Covary enables fast, large-scale profiling of mixed microbial communities, helping researchers uncover taxonomic composition, detect rare organisms, and analyze functional divergence.
Covary's translation-aware, alignment-free framework extends into new domains, from oncogenomics to forensic genetics and precision medicine.
Map subclonal architecture and mutational trajectories in cancer genomes, tracing how tumor lineages diverge over time and under selective pressure.
Model the evolutionary trajectories of viral genomes in outbreak settings, forecasting emergent variants and resistance evolution for proactive public health response.
Apply Covary's identification engine to forensic DNA profiling, environmental surveillance, and species-of-origin determination from complex biological samples.
Covary: A translation-aware framework for alignment-free phylogenetics using machine learning. doi.org/10.1101/2025.11.13.687960
Rapid Phylogenomic Analysis of Thousands Outbreak-Causing Viral Genomes Using Covary. doi.org/10.20944/preprints202512.1970.v1
Have questions about Covary, licensing, training, or the Research Program? Reach out and we'll get back to you.