Generate a custom curation database — custom_curation

Generate a custom curation database from a user-supplied table of translated (amino acid) mitochondrial gene sequences. Requires a CSV file with three columns: "SeqID" = unique name to be used for the sequence, "Gene" = name of the gene, and "FASTA" = name of the FASTA file containing the protein sequence. Your sequences are combined with the Metazoa or Chordata NCBI RefSeq data. Consider carefully what you add to the custom database and use only high-confidence sequences: poor-quality reference data will result in poorly curated gene models.

Usage

custom_curation_db(
  path = ".",
  genes_to_add = NULL,
  gene_fasta_dir = NULL,
  path_to_makeblastdb = NULL,
  base_db = "Metazoa"
)

Arguments

path: Path to the project directory (default = current working directory)
genes_to_add: Full path to a CSV file containing three columns: SeqID = unique name to be used for the sequence, Gene = name of the gene, FASTA = name of the FASTA file containing the protein sequence
gene_fasta_dir: Full path to directory containing your gene FASTA files, one file per sequence
path_to_makeblastdb: Full path to makeblastdb, only necessary if not already in your PATH
base_db: Which base NCBI RefSeq database to use, "Metazoa" or "Chordata"? Default = "Metazoa"
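
As an illustration of the arguments above, the following minimal sketch builds an input table with the three required columns and writes it to CSV. The sequence names, file names, and directory path are hypothetical placeholders, and the call to custom_curation_db() is left commented out because it needs the package, your FASTA files, and makeblastdb to be available.

```r
# Hypothetical input table: SeqID, Gene, and FASTA values are made-up examples.
genes_to_add <- data.frame(
  SeqID = c("Sp1_cox1", "Sp1_cob"),             # unique name per sequence
  Gene  = c("cox1", "cob"),                     # gene abbreviations (see Details)
  FASTA = c("Sp1_cox1.fasta", "Sp1_cob.fasta"), # one FASTA file per sequence
  stringsAsFactors = FALSE
)

# Write the table to a CSV file (a temporary location is used here).
csv_path <- file.path(tempdir(), "genes_to_add.csv")
write.csv(genes_to_add, csv_path, row.names = FALSE)

# Not run: the directory below is a placeholder for your own FASTA directory.
# custom_curation_db(
#   path = ".",
#   genes_to_add = csv_path,
#   gene_fasta_dir = "/full/path/to/gene_fastas",
#   base_db = "Metazoa"
# )
```

Set path_to_makeblastdb only if makeblastdb is not already in your PATH.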

Details

Each value in the "Gene" column of your CSV must be one of the following gene abbreviations:
nad1 = "NADH dehydrogenase subunit 1",
nad2 = "NADH dehydrogenase subunit 2",
cox1 = "cytochrome c oxidase subunit 1",
cox2 = "cytochrome c oxidase subunit 2",
cox3 = "cytochrome c oxidase subunit 3",
atp8 = "ATP synthase F0 subunit 8",
atp6 = "ATP synthase F0 subunit 6",
atp9 = "ATP synthase F0 subunit 9",
nad3 = "NADH dehydrogenase subunit 3",
nad4l = "NADH dehydrogenase subunit 4L",
nad4 = "NADH dehydrogenase subunit 4",
nad5 = "NADH dehydrogenase subunit 5",
nad6 = "NADH dehydrogenase subunit 6",
cob = "cytochrome b",
dpo = "DNA-polymerase",
lagli = "homing endonuclease",
msh1 = "MutS mismatch DNA repair protein",
mttb = "trimethylamine methyltransferase"
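
Before building the database, it may help to check your CSV against the abbreviations listed above. The helper below is a hypothetical sketch and not part of the package: check_genes_csv() is an illustrative name, and it assumes the abbreviations must match exactly, including case.

```r
# Allowed abbreviations, copied from the list above.
allowed_genes <- c(
  "nad1", "nad2", "cox1", "cox2", "cox3", "atp8", "atp6", "atp9",
  "nad3", "nad4l", "nad4", "nad5", "nad6", "cob",
  "dpo", "lagli", "msh1", "mttb"
)

# Hypothetical helper: reports problems in a genes_to_add CSV.
check_genes_csv <- function(csv_path) {
  tab <- read.csv(csv_path, stringsAsFactors = FALSE)

  # The three required columns must all be present.
  missing_cols <- setdiff(c("SeqID", "Gene", "FASTA"), names(tab))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
  }

  list(
    # Gene values that are not in the allowed list (exact match assumed).
    invalid_genes = setdiff(unique(tab$Gene), allowed_genes),
    # SeqID values that occur more than once; SeqID must be unique.
    duplicated_ids = unique(tab$SeqID[duplicated(tab$SeqID)])
  )
}
```

Both elements of the returned list are empty when the table passes these checks. The helper does not verify that the FASTA files exist or that they contain protein sequences.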