Changelog

This page lists most of the changes made between versions of EnTAP.

EnTAP v1.0.1 (November 13, 2023)

  • Fixed an issue with the formatting of the ‘…gene_ontology_terms.tsv’ file output from EnTAP

EnTAP v1.0.0 (September 26, 2023)

  • Updated RSEM (v1.3.3), TransDecoder (v5.7.1), and DIAMOND (v2.1.8) libraries in EnTAP repository. EnTAP is now compatible with these versions

  • Added new test data (under the ‘test_data’ directory) in the EnTAP repository. This should work with the latest versions of the software used by EnTAP. See the docs for how to run it.

  • Added additional statistics/percentages at the end of the Log File

  • Added support for Tidyverse format (TSVs will now print ‘NA’ for empty data)

  • Added Dockerfile to EnTAP repository

  • Added support for a new Gene Ontology term TSV output format. Similar to other formats, but combined into one file. See the ‘--output-format’ flag for more information

  • Changed DIAMOND runs (during Ontology and during Similarity Searching) to use ‘very-sensitive’ instead of ‘more-sensitive’. This should give more alignments, but may take longer to execute

  • Removed the ‘--level’ flag for Gene Ontology levels. This was not useful to users and caused confusion. Instead, all GO Terms will be printed by default and gene ontology levels are removed from output

  • Renamed and restructured many of the outputs in the ‘final_results’ directory for better clarity

  • Fixed an issue when running EnTAP configuration locally to rebuild the EnTAP database. Users may have seen it fail during the Gene Ontology stage due to a change in the formatting of the Gene Ontology database
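
For downstream scripts, the ‘NA’ placeholder introduced with the Tidyverse-style TSV output can be handled as in the minimal Python sketch below. The column names and sample rows are hypothetical and are not taken from an actual EnTAP output file; only the ‘NA’ convention comes from the note above.

```python
import csv
import io


def read_tsv_with_na(handle):
    """Read a tab-separated file and turn 'NA' cells into None."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        {key: (None if value == "NA" else value) for key, value in row.items()}
        for row in reader
    ]


# Hypothetical two-column sample; real EnTAP headers will differ.
sample = "query\tdescription\nseq1\tkinase\nseq2\tNA\n"
rows = read_tsv_with_na(io.StringIO(sample))
```

Mapping ‘NA’ to None keeps missing values distinct from empty strings when the rows are processed further.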

EnTAP Beta v0.10.9-beta (June 29, 2023)

  • Optimized similarity searching with the EnTAP SQL database, which should improve speed

  • Added ‘api-taxon’ command that will verify whether an input taxon can be found in the taxonomy database. It will return JSON-formatted text

  • Added additional messaging throughout EnTAP execution to stdout

  • Removed the test data from the repository because it is no longer compatible with the latest versions of the software used by EnTAP. An updated dataset will be added back in the next version

  • Changed DIAMOND command to use ‘--max-target-seqs’ instead of the ‘--top’ option

  • Fixed an issue where duplicate sequences were printed to the final_annotations files

  • Fixed an issue where the taxonomic species may not have been found when searching against the SQL EnTAP database

EnTAP Beta v0.10.8-beta (March 21, 2021)

  • This version requires a new version of the EnTAP database to be downloaded

  • Added Gene Enrichment files as an output option (gene ID + effective length and gene ID + GO term). See the output-type flag in the ini file

  • Changed Gene Ontology level printing. 0 will continue to print every term. Other levels will now print that level AND higher, so a level of 1 will print 1, 2, 3, etc. Previously, a level of 1 would only print GO Terms with a level of 1

  • Changed ‘uninformative’ input from a file to a list of terms in the ini file. Much more straightforward this way

  • If no alignments are found against a database during DIAMOND, the pipeline will no longer exit; it will continue to the next database. If no alignments are found against any database, it will stop at that point

  • Fixed a bug where TransDecoder output may not have been parsed correctly for some users. This presented itself as a parsing error and halted EnTAP at that stage of the pipeline

  • Fixed a bug where the InterProScan MobiDBLite database was giving an error for some users (and halting execution)
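
The Gene Ontology level rule described in this release can be illustrated with a minimal Python sketch. The function name, the GO identifiers, and their levels are hypothetical and only demonstrate the selection rule; this is not EnTAP's internal code.

```python
def select_go_terms(terms, level):
    """Pick the (go_id, go_level) pairs that would be printed for a level.

    Illustrative only: level 0 keeps every term; any other level keeps
    terms at that level and higher.
    """
    if level == 0:
        return list(terms)
    return [(go_id, go_level) for go_id, go_level in terms if go_level >= level]


# Hypothetical terms with made-up levels, purely for demonstration.
example_terms = [("GO:0000001", 1), ("GO:0000002", 2), ("GO:0000003", 3)]

all_terms = select_go_terms(example_terms, 0)       # every term
level_two_up = select_go_terms(example_terms, 2)    # levels 2 and 3 only
```

Under the earlier behavior, a level of 2 would have matched only terms whose level is exactly 2.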

EnTAP Beta v0.10.7-beta (October 6, 2020)

  • Fixed an issue where certain sequence headers may not have been parsed properly resulting in unrecognized sequence errors during Similarity Searching

EnTAP Beta v0.10.6-beta (August 26, 2020)

  • Added support for passing the TransDecoder flag ‘--no_refine_starts’ during execution

  • Fixed an issue where error messages during EggNOG searching would not get printed (seg fault)

  • Contaminant information will not be printed to the log if no contaminants are found

EnTAP Beta v0.10.5-beta (August 12, 2020)

  • Added a step to remove the stop codon (‘*’) sometimes printed at the end of the TransDecoder FASTA output. This may have caused an issue when running TransDecoder and InterProScan together
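
The kind of clean-up described above might look like the following minimal Python sketch. It assumes a plain FASTA layout where the ‘*’ appears at the end of a record's last sequence line; the function names and the sample records are hypothetical, and this is not EnTAP's actual implementation.

```python
def _trim_last_sequence_line(lines):
    """Strip trailing '*' from the last line if it is a sequence line."""
    if lines and not lines[-1].startswith(">"):
        lines[-1] = lines[-1].rstrip("*")
        if not lines[-1]:
            # The line held only the stop codon, so drop it entirely.
            lines.pop()


def strip_trailing_stop(fasta_lines):
    """Return FASTA lines with the trailing '*' removed from each record."""
    out = []
    for raw in fasta_lines:
        line = raw.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, so trim it first.
            _trim_last_sequence_line(out)
        out.append(line)
    _trim_last_sequence_line(out)  # the final record has no following header
    return out


# Hypothetical protein records for demonstration.
cleaned = strip_trailing_stop([">seq1 example", "MKV", "LLA*", ">seq2", "MAA"])
```

Only the end of each record is trimmed, so a ‘*’ that appears inside a sequence is left untouched.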

EnTAP Beta v0.10.4-beta (July 29, 2020)

  • Fixed an issue where expression analysis transcriptome generation would fail (error message presented to user as ‘frame selection’)

EnTAP Beta v0.10.3-beta (July 28, 2020)

  • Fixed a parsing issue of user inputs for contaminants and taxon

EnTAP Beta v0.10.2-beta (July 26, 2020)

  • Fixed a pathing issue when EnTAP generated frame selected transcriptomes

EnTAP Beta v0.10.1-beta (July 19, 2020)

Note: Please use v0.10.2-beta or later instead of this version

  • Added support for TransDecoder for Frame Selection

  • Added TPM as an additional output from Expression Filtering

  • Added an .ini file and moved many commands/paths from the command line to it

  • Standardized/finalized output header namings for gFACs support

  • Changed the default Frame Selection software to TransDecoder. GeneMarkS-T can still be selected through the .ini file

  • Changed the default Gene Ontology level to 1. This can be easily changed through the ini file

  • Fixed issue where some EggNOG descriptions were not printed to the final output

  • Fixed a few issues with older GCC versions

  • Fixed an issue where GeneMarkS-T would write to the working directory

EnTAP Beta v0.9.2-beta (June 4, 2020)

  • Updated EggNOG Database links

EnTAP Beta v0.9.1-beta (January 12, 2020)

  • Changed the --trim flag to --no-trim. Trimming sequence headers to the first space is now the default. If you have executions from previous versions, you may need to use the --no-trim flag for backwards compatibility (picking up where you left off)

  • Fixed a bug where the --single-end flag was not properly recognized

EnTAP Beta v0.9.0-beta (May 12, 2019)

  • This release focused on reducing installation complexity and removing dependencies

  • Overhauled the configuration/execution process by removing EggNOG-mapper and replacing it with an internal EnTAP method. This will make installation and both stages much clearer for the user

  • Removed Boost Libraries from dependencies further reducing installation complexity

  • Added printing of error messages to the standard log from any software being used by EnTAP. This will make debugging much easier

  • Added UniProt mapping to the EnTAP database. This will pull any additional mapping information from UniProt Swiss-Prot alignments

  • Updated supported DIAMOND version to 0.9.9

  • The EnTAP database MUST be re-configured for this release

  • Resolved any incompatibility with DIAMOND and EggNOG databases as well as versioning problems

  • Standardized EnTAP log entries and added additional statistics

    • --ontology flag will now use EnTAP’s method of EggNOG accession (0) or InterProScan (1)

  • Bug fixes

EnTAP Beta v0.8.4-beta (August 2, 2018)

  • Fixed an issue when inputting already translated sequences

EnTAP Beta v0.8.3-beta (May 23, 2018)

  • Minor bug fixes

  • Changes to CMake to hopefully resolve issues a couple users had with linking to Boost Libraries

EnTAP Beta v0.8.2-beta (April 29, 2018)

  • Revamped configuration stage of EnTAP (reduced time and hopefully made things clear/more compatible across systems)

  • Removed the --database-out flag (seemed a bit redundant to me). The --outfiles flag will be the default when indexing databases

  • Added the --data-generate flag. This can be specified in the EnTAP config stage (no effect during execution) to generate the EnTAP databases rather than downloading them from the FTP address

  • Added the --data-type flag. This can be used in either configuration or execution. It specifies which database type you’d like to download/generate or use during execution: binary (0, default) or SQL (1). Binary is faster but uses more memory; SQL is slower but more widely compatible.

  • Combined EnTAP databases into one (entap_database.sql/entap_database.bin). WARNING: Re-download or configuration of databases is REQUIRED with this newer version.

  • Removed download_tax.py script (no longer necessary)

EnTAP Beta v0.8.1-beta (April 14, 2018)

  • Added additional error logging to provide more information when something goes wrong

  • Configuration file mandatory (default place to look is current working directory)

  • Changed tax database paths in config file to avoid confusion (separate text and bin). Config file must be re-downloaded/generated!

  • Default output location during configuration is now taken from the config file and, if not found there, from the database-out flag

  • Added deletion of empty files if a certain stage failed (preventing re-reading an empty file)

  • Added errors/warnings for no alignments/hits in each stage

  • entap_out directory renamed to transcriptomes for clarity (holds only transcriptomic data)

  • Final EnTAP output files moved from the root outfiles directory to final_results directory

  • Several filename changes to add consistency in the new transcriptomes directory (final transcriptome is now _final.fasta)

  • Several title changes to the log file to mitigate confusion

  • EggNOG no longer broken down into separate files - those that hit and those that did not hit a database. Now entire transcriptome is pushed with one output file

  • The number of species/contaminants/other entries shown in similarity searching statistics was increased from 10 to 20 to provide more information to the user

  • Best hit selection stage combined with similarity search

  • Added ‘N’ as an accepted nucleotide

  • Several behind the scenes changes

  • Fixed CMake global installation issue

  • Fixed incorrect error codes

  • Fixed InterPro printing bug to no hits/hits files

  • Fixed Frame Selection not printing new lines for certain files

EnTAP Beta v0.8.0-beta (December 16, 2017)

  • Overhaul of the taxonomic/gene ontology databases

    • Faster accession/indexing

    • MUST be re-downloaded and re-indexed (or use the updated versions that come with the EnTAP distribution)

    • Taxonomic database includes thousands more entries with synonyms of many species

    • Perl is no longer a dependency, with Python being used to download the database

  • Added blastx support

    • Blastx now allowed for ALL stages of annotation (similarity search + ontology)

    • --runN flag now specifies blastx (frame selection will not be run)

    • --runP flag now specifies blastp (frame selection will be performed if nucleotide sequences are input)

  • Added InterProScan support

    • Now possible to run EggNOG and/or InterProScan (with both blastx or blastp)

    • EggNOG and/or InterProScan specified with --ontology flag (0 and/or 1)

    • Full output of both will be provided in the final annotations file

  • Added additional statistics to the log file for EggNOG and Expression Analysis

  • Added numerous file/path/software checks to the start of an EnTAP run

    • Test runs/path checks are performed on all software that will be run

    • Additional checks to specific flags

    • These checks can be turned off for an EnTAP run with the --no-check flag (not advised!)

  • --tag flag changed to --out-dir to specify output directory (not just what you’d like it named as)

    • Defaults to current directory with /outfiles folder

  • --paired-end flag for Expression Filtering changed to --single-end (with paired-end being the default)

  • Added contaminant and informative yes/no columns in final annotations file (among other headers)

  • Added ability to input your own list of informative/uninformative terms for EnTAP to flag

  • Added contaminant and non-contaminant final annotation files

  • Fixed a sequence id issue in Expression Filtering not mapping to BAM/SAM file

  • Fixed a bug in the --trim flag for sequence headers

  • Fixed a bug where some systems had issues with graphing

  • Debug and log files are now time stamped and not overwritten

  • Fixed pathing for EnTAP configuration and made more streamlined

  • Fixed several instances of older compilers complaining

  • Added a lot of error messaging to help diagnose any issues easily

  • Changed similarity search to have full database name, not path

  • Fixed a bug in parsing input fasta file (added corrupt file checks)

EnTAP Beta v0.7.4.1-beta (September 5, 2017)

  • Minor changes to taxonomic database download and indexing

EnTAP Beta v0.7.4-beta (August 26, 2017)

  • Initial beta release!