Below is a list of commonly used terms and definitions in the field of genomics (source : Genome Reference Consortium).
- Assembly : a set of chromosomes, unlocalized and unplaced sequences and alternate loci used to represent an organism’s genome
- Chromosome Assembly : a relatively complete pseudo-molecule assembled from smaller sequences that represent a biological chromosome
- Diploid Assembly : a genome assembly for which a Chromosome Assembly is available for both sets of an individual’s chromosomes
- Haploid Assembly : the collection of chromosome assemblies, unlocalized and unplaced sequences and alternate loci that represent an organism’s genome
- Primary Assembly : the collection of assembled chromosomes, unlocalized and unplaced sequences that, when combined, should represent a non-redundant haploid genome
- Assembly Units : collections of sequences used to define discrete parts of an assembly
- Genome Patch : a contig sequence that is released outside of the full assembly release cycle
- FIX patch : FIX patches are released to correct an error in the assembly and will be removed when the new full assembly is released
- NOVEL patch : NOVEL patches are sequences that were not in the last full assembly release and will be retained with the next full assembly release
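- Alternate Locus : a sequence that provides an alternate representation of a locus found in a largely haploid assembly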
- Unlocalized Sequence : a sequence found in an assembly that is associated with a specific chromosome but cannot be ordered or oriented on that chromosome
- Unplaced Sequence : a sequence found in an assembly that is not associated with any chromosome
- PAR (Pseudo-autosomal region) : a region found on the X and Y chromosomes of mammals that allows recombination between the sex chromosomes
- AGP File : a file used to describe the instructions for building a contig, scaffold or chromosome sequence
- Contig : a contiguous sequence generated by determining the non-redundant path along an ordered set of component sequences
- Component : a low-level genomic sequence used to construct the genome; typically these are clone sequences, WGS sequences or PCR fragments
- Join : the sequence overlap between two adjacent components in a contig
- Scaffold : an ordered and oriented set of contigs with gaps
- Switch Point : the base at which the contig sequence stops being generated from one component sequence and switches to using the next component sequence
- TPF (Tiling Path File) : provides the order of the component sequences used to build a contig, scaffold or chromosome
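
To make the AGP File, Component, Scaffold and Switch Point entries more concrete, here is a minimal Python sketch of how AGP-style rows can be turned into a scaffold sequence. It assumes the tab-delimited column layout of the NCBI AGP format as commonly documented (object, object start, object end, part number, component type, then either component ID, component start, component end and orientation, or gap length and gap details). The function names, the component sequences and the rows themselves are hypothetical and chosen only for illustration; real AGP files carry more detail and need validation that this sketch omits.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def build_from_agp(agp_lines, components):
    """Build object (contig/scaffold/chromosome) sequences from AGP-style rows.

    agp_lines  -- iterable of tab-delimited rows (assumed AGP column layout)
    components -- dict mapping component ID to its full sequence
    """
    objects = {}
    for line in agp_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.rstrip("\n").split("\t")
        obj_name, comp_type = cols[0], cols[4]
        if comp_type in ("N", "U"):
            # Gap row: column 6 holds the gap length.
            piece = "N" * int(cols[5])
        else:
            # Component row: 1-based, inclusive coordinates on the component.
            comp_id = cols[5]
            beg, end = int(cols[6]), int(cols[7])
            piece = components[comp_id][beg - 1:end]
            if cols[8] == "-":
                piece = reverse_complement(piece)
        objects[obj_name] = objects.get(obj_name, "") + piece
    return objects


# Hypothetical component sequences (illustrative only).
components = {
    "compA": "ACGTACGTAC",
    "compB": "GGGTTTCCCA",
}

# Hypothetical AGP rows: component, gap, component on the minus strand.
agp_rows = [
    "scaf1\t1\t6\t1\tW\tcompA\t1\t6\t+",
    "scaf1\t7\t10\t2\tN\t4\tscaffold\tyes\tpaired-ends",
    "scaf1\t11\t15\t3\tW\tcompB\t3\t7\t-",
]

scaffolds = build_from_agp(agp_rows, components)
print(scaffolds["scaf1"])  # ACGTACNNNNGAAAC
```

In this sketch the component end coordinate on each row plays the role of a switch point: it marks the last base taken from one component before the object sequence continues with the next row. The gap row shows why a scaffold is described as ordered and oriented contigs with gaps.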