Generate a custom curation database from a user-supplied table of translated (amino acid) mitochondrial gene sequences. Requires a CSV file with three columns: "SeqID" = unique name to be used for the sequence, "Gene" = name of the gene, and "FASTA" = name of the FASTA file containing the protein sequence. Your sequences are combined with the Metazoa or Chordata NCBI RefSeq data. Consider carefully what you add to the custom database and use only high-confidence sequences: poor-quality reference data will result in poorly curated gene models.

Usage

custom_curation_db(
  path = ".",
  genes_to_add = NULL,
  gene_fasta_dir = NULL,
  path_to_makeblastdb = NULL,
  base_db = "Metazoa"
)

Arguments

path

Path to the project directory (default = current working directory)

genes_to_add

Full path to CSV file containing three columns: SeqID = unique name to be used for sequence, Gene = name of gene, FASTA = name of fasta file containing the sequence

gene_fasta_dir

Full path to directory containing your gene FASTA files, one file per sequence

path_to_makeblastdb

Full path to makeblastdb, only necessary if not already in your PATH

base_db

Which base NCBI RefSeq database to use, "Metazoa" or "Chordata"? Default = "Metazoa"
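As an illustration of the inputs described above, the following sketch builds a one-row CSV and a matching FASTA file using base R only. The file names, the SeqID, and the amino acid string are hypothetical placeholders, and whether the FASTA header has to match SeqID is not stated here, so treat that as an assumption. The final call is left commented out because it needs the package and BLAST+ (makeblastdb) to be installed.

```r
# Hypothetical example data: all names and the sequence below are placeholders.
fasta_dir <- file.path(tempdir(), "custom_fastas")
dir.create(fasta_dir, showWarnings = FALSE)

# One FASTA file per sequence, holding the translated (amino acid) sequence.
writeLines(
  c(">my_cox1_01", "MFINRWLFSTNHKDIGTLYLLFGAWAGMVGTALS"),
  file.path(fasta_dir, "my_cox1_01.fasta")
)

# Table with the three required columns: SeqID, Gene, FASTA.
genes <- data.frame(
  SeqID = "my_cox1_01",
  Gene  = "cox1",
  FASTA = "my_cox1_01.fasta",
  stringsAsFactors = FALSE
)
genes_csv <- file.path(tempdir(), "genes_to_add.csv")
write.csv(genes, genes_csv, row.names = FALSE)

# Not run here: requires the package and makeblastdb.
# custom_curation_db(
#   path = ".",
#   genes_to_add = genes_csv,
#   gene_fasta_dir = fasta_dir,
#   base_db = "Metazoa"
# )
```

Full paths are used for genes_to_add and gene_fasta_dir, as the argument descriptions request.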

Details

Each value in the "Gene" column of your CSV must be one of the following gene abbreviations:
nad1 = "NADH dehydrogenase subunit 1",
nad2 = "NADH dehydrogenase subunit 2",
cox1 = "cytochrome c oxidase subunit 1",
cox2 = "cytochrome c oxidase subunit 2",
cox3 = "cytochrome c oxidase subunit 3",
atp8 = "ATP synthase F0 subunit 8",
atp6 = "ATP synthase F0 subunit 6",
atp9 = "ATP synthase F0 subunit 9",
nad3 = "NADH dehydrogenase subunit 3",
nad4l = "NADH dehydrogenase subunit 4L",
nad4 = "NADH dehydrogenase subunit 4",
nad5 = "NADH dehydrogenase subunit 5",
nad6 = "NADH dehydrogenase subunit 6",
cob = "cytochrome b",
dpo = "DNA-polymerase",
lagli = "homing endonuclease",
msh1 = "MutS mismatch DNA repair protein",
mttb = "trimethylamine methyltransferase"
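Before building the database, it may help to check your table against the list above. The helper below is a minimal sketch and not part of the package: check_gene_column and allowed_genes are hypothetical names, and the check assumes that abbreviations must match exactly, including case.

```r
# Abbreviations copied from the list above.
allowed_genes <- c(
  "nad1", "nad2", "cox1", "cox2", "cox3", "atp8", "atp6", "atp9",
  "nad3", "nad4l", "nad4", "nad5", "nad6", "cob",
  "dpo", "lagli", "msh1", "mttb"
)

# Hypothetical helper: reports unsupported Gene values and repeated SeqIDs.
check_gene_column <- function(genes) {
  required <- c("SeqID", "Gene", "FASTA")
  missing_cols <- setdiff(required, names(genes))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  list(
    # Gene values not in the supported list (exact match assumed).
    bad_genes = setdiff(unique(genes$Gene), allowed_genes),
    # SeqID values that occur more than once; SeqID must be unique.
    duplicated_ids = unique(genes$SeqID[duplicated(genes$SeqID)])
  )
}

# Example with one unsupported gene name and one repeated SeqID.
example <- data.frame(
  SeqID = c("a", "b", "b"),
  Gene  = c("cox1", "COI", "nad4l"),
  FASTA = c("a.fasta", "b.fasta", "c.fasta"),
  stringsAsFactors = FALSE
)
res <- check_gene_column(example)
print(res)
```

In this example, "COI" is reported as an unsupported gene name and "b" as a duplicated SeqID. A table that passes returns two empty character vectors.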