Change Log
Version 0.1.0 (20 May, 2022)
- Version 0.2.0 (03 August, 2022)
Find automatically the mkdb.pm, so scripts can be run from anywhere (providing the path to them)
get_subtaxa.pl - new script
download_bold.pl - option to cut up taxa to subtaxa with less than max_record_n specimen records
format_db.pl - new option for VTAM format
add benchmarking of select_taxa
Version 0.2.1 (07 August, 2022)
format_db.pl Correct if there is a taxID 0 in taxonomy file
Version 0.2.2 (09 January, 2023)
select_taxa.pl recognizes merged taxids
Version 0.2.3 (24 January, 2023)
format_rdp.pl added to make sequence.tsv and taxonomy.tsv from RDP training files
Version 0.2.4 (17 April, 2023)
download_bold.pl works if taxon_list has a heading (taxon_name) and also if it has not
Version 0.3.0 (06 May, 2024)
format_bold_package.pl replaces download_bold.pl sinc it is quicker and BOLD API do not work any more for large data download
reduce_metadata.pl to create a tsv file with BOLD metadata of the BOLD sequences kept in COInr
for better traceability of the BOLD sequences, ids are in the following format: BOLD_markercode_processid
Version 0.3.1 (28 Oct, 2024)
format_db.pl add sintax option to produce database for SINTAX
Version 0.4.0 (09 May, 2025)
format_bold_package.pl
Read input TSV line by line, to reduce memory need
Can delete sequneces without BOLD BIN; New argument delete_noBIN [0/1/2]
Sequence IDs are in the following format: BOLD_MARKER_PROCESSID_BIN (BOLD_COI-5P_JRPAA9741-15_BIN:ADQ9721 or BOLD_COI-5P_GBBAC2495-15_BIN:NA)
add_taxids.pl
avoid using 0 as a taxid
if more than one taxid for name
takes the one with highest proportion of taxa matching the upwards lineage (as before)
then smallest difference in taxlevel
then prefer a valid scientific name if choice between synonyms and scientific names
taxon name in BOLD mathcing a scientific name of the taxid (for homotypic synomyms it is possible to have different scientific names)
Version 0.5.0 (23 May, 2025)
download_taxonomy.pl, format_db.pl, add_taxids.pl
adapt script to use domain with tax_rank_index 1, instead of superkingdom