Installation

EnTAP is packaged with all of the software necessary to fully annotate a set of transcripts. It is designed so that every step in the pipeline, including user parameterization, can be run with a single command. EnTAP does not have a graphical user interface, but it generates visual summaries at each stage as well as detailed summary files and logs.

  1. System Requirements
  2. Dependency Check
  3. Pipeline Software
  4. EnTAP

Before installing EnTAP itself, check that the required dependencies are available on your system (many are included by default), and install the accompanying pipeline software unless it is already present.

System Requirements

  • Operating System

    • UNIX-based systems
    • Tested on 64-bit systems: Ubuntu 16.04, Rocks 6.1, CentOS 6.3
  • Storage Minimum

    • EnTAP Database (Gene Ontology References + UniProt Mapping + NCBI Taxonomy): 1.5 GB
    • EggNOG Databases: 24 GB
    • DIAMOND Databases: ~13 GB (with RefSeq Complete Protein + UniProt Swiss-Prot)
    • Additional storage for generated files, depending on transcriptome size: upwards of 15 GB
  • Memory

    • At least 16 GB of RAM (will vary depending on DIAMOND database sizes). More memory is highly recommended to reduce execution times.
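
The figures above can be compared against a machine before any downloads start. The following is a minimal sketch, not part of EnTAP: it assumes a Linux system (it reads /proc/meminfo), and the helper names kb_to_gb and meets_minimum as well as the optional directory argument are hypothetical. The 54 GB threshold is simply the sum of the storage figures listed above, rounded up.

```shell
#!/bin/sh
# Sketch: compare free disk space and total RAM with the minimums on this page.
# Assumes Linux (/proc/meminfo) and a POSIX df; not part of EnTAP itself.

# kb_to_gb KB -> whole gigabytes (integer division, 1 GB = 1024 * 1024 KB)
kb_to_gb() {
    echo $(( $1 / 1024 / 1024 ))
}

# meets_minimum HAVE NEED -> prints "ok" or "LOW"
meets_minimum() {
    if [ "$1" -ge "$2" ]; then echo "ok"; else echo "LOW"; fi
}

need_disk_gb=54   # 1.5 + 24 + 13 + 15, rounded up
need_ram_gb=16

target_dir="${1:-.}"   # directory that will hold the databases (hypothetical argument)

# Column 4 of the second line of `df -Pk` is the available space in KB
free_kb=$(df -Pk "$target_dir" | awk 'NR==2 {print $4}')
free_gb=$(kb_to_gb "$free_kb")
echo "Free disk in $target_dir: ${free_gb} GB ($(meets_minimum "$free_gb" "$need_disk_gb"))"

if [ -r /proc/meminfo ]; then
    ram_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    ram_gb=$(kb_to_gb "$ram_kb")
    echo "Total RAM: ${ram_gb} GB ($(meets_minimum "$ram_gb" "$need_ram_gb"))"
fi
```

A machine sold with 16 GB of RAM usually reports slightly less in /proc/meminfo, so a LOW result close to the threshold is a prompt to look at the actual number rather than a hard failure.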

Dependency Check

Before continuing on in the installation process, ensure that the following dependencies are fully installed on your system:

  • C++11 compiler (GCC 4.8.1 or later)

  • CMake (2.8 or later)

  • Python (2.7.12 or later) with support for the following modules

    • Matplotlib (figures generated by EnTAP)
  • Unix wget (generally included in most distros)

  • Unix gzip/tar (generally included in most distros)
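
The list above can be checked in one pass. This is a rough sketch rather than an official EnTAP tool: it assumes the compiler and interpreter are named g++ and python on your system, that `sort -V` (version sort) is available, and the helper names have_cmd, version_ge and report are hypothetical.

```shell
#!/bin/sh
# Sketch: check the dependencies listed above.
# Assumes `sort -V` exists and that the tools are named g++, cmake and python.

# have_cmd NAME -> succeeds when NAME resolves to a command
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# version_ge A B -> succeeds when dotted version A is at least B
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# report NAME FOUND MINIMUM -> prints one status line
report() {
    if version_ge "$2" "$3"; then
        echo "ok:      $1 $2"
    else
        echo "TOO OLD: $1 $2 (need $3 or later)"
    fi
}

for tool in wget gzip tar; do
    if have_cmd "$tool"; then echo "ok:      $tool"; else echo "MISSING: $tool"; fi
done

if have_cmd g++; then
    report g++ "$(g++ -dumpversion)" 4.8.1
else
    echo "MISSING: g++"
fi

if have_cmd cmake; then
    # First line of `cmake --version` reads "cmake version X.Y.Z"
    report cmake "$(cmake --version | awk 'NR==1 {print $3}')" 2.8
else
    echo "MISSING: cmake"
fi

if have_cmd python; then
    report python "$(python -c 'import sys; print("%d.%d.%d" % sys.version_info[:3])')" 2.7.12
    if python -c 'import matplotlib' 2>/dev/null; then
        echo "ok:      matplotlib"
    else
        echo "MISSING: matplotlib"
    fi
else
    echo "MISSING: python"
fi
```

Recent GCC releases print only the major version for `-dumpversion` (for example `11`), which still compares correctly against 4.8.1 here.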

Pipeline Software

EnTAP leverages several software distributions within the pipeline to provide the best quality annotations. The packages used (and their current/tested versions) are listed below. Newer versions may well be compatible, but they have not yet been tested with EnTAP.

Note

If the software is already installed on your system, this stage can be skipped

Software:
  • RSEM (Expression Filtering with alignment file): version 1.3.0
  • GeneMarkS-T (Frame Selection): version 5.1
  • DIAMOND (Similarity Search): version 0.9.9
    • Version 0.8.31
    • Version 0.9.19
  • InterProScan (Protein Databases): This is not included within the EnTAP repo, but is supported

If you have downloaded the full repository from the GitLab page, each of these (with the exception of GeneMarkS-T) is contained within the /libs directory. GeneMarkS-T must be acquired from the website linked previously due to licensing (free for academic use).

RSEM and DIAMOND both require compilation from source code while GeneMarkS-T does not. To compile these, follow the directions below. These are also found on the respective GitHub pages and are subject to change depending on the version.

DIAMOND Installation

From root EnTAP directory…

cd libs/diamond-0.8.31
mkdir bin
cd bin
cmake ..

To compile and install DIAMOND globally, run:

make install

Alternatively, to compile DIAMOND locally without installing it, run:

make

All set! Ensure that DIAMOND has been properly set up and add the correct path to the entap_config.txt file. If installed globally, add ‘diamond’ (without quotes) to the file. If installed locally, add ‘path/to/EnTAP/libs/diamond-0.8.31/bin/diamond’.
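
Before writing the value into entap_config.txt, it can help to confirm that it really points at an executable. The sketch below is an illustration only: resolve_tool is a hypothetical helper, and the script takes the intended config value as an optional argument. It treats a bare name as a global install (looked up on PATH) and anything containing a slash as a local path.

```shell
#!/bin/sh
# Sketch: confirm that the value intended for entap_config.txt is runnable.

# resolve_tool VALUE -> prints what VALUE resolves to, or fails.
# A bare name is looked up on PATH; a value containing "/" is treated as a file path.
resolve_tool() {
    case "$1" in
        */*) [ -f "$1" ] && [ -x "$1" ] && echo "$1" ;;
        *)   command -v "$1" ;;
    esac
}

diamond_value="${1:-diamond}"   # e.g. diamond, or path/to/EnTAP/libs/diamond-0.8.31/bin/diamond

if exe=$(resolve_tool "$diamond_value"); then
    echo "DIAMOND resolves to: $exe"
else
    echo "No executable found for: $diamond_value"
fi
```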

RSEM Installation

From root EnTAP directory…

cd libs/RSEM-1.3.0
make
make ebseq

Run the following command to install globally:

make install

All set! Ensure that RSEM has been properly set up and add the correct path to the entap_config.txt file. If installed globally, leave the entry blank. If installed locally, add ‘path/to/EnTAP/libs/RSEM-1.3.0/’.
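
Because the RSEM entry is a directory (or blank) rather than a single executable, a slightly different check applies. The following sketch is hedged in two ways: rsem_location_ok is a hypothetical helper, and the two program names tested are standard RSEM executables chosen for illustration, not a confirmed list of what EnTAP calls.

```shell
#!/bin/sh
# Sketch: check that RSEM programs are reachable for a given config value.

# rsem_location_ok DIR -> succeeds when the RSEM programs can be found.
# An empty DIR means a global install, so the programs are looked up on PATH.
rsem_location_ok() {
    for prog in rsem-prepare-reference rsem-calculate-expression; do
        if [ -n "$1" ]; then
            [ -x "${1%/}/$prog" ] || return 1
        else
            command -v "$prog" >/dev/null 2>&1 || return 1
        fi
    done
}

rsem_value="${1:-}"   # blank for a global install, or path/to/EnTAP/libs/RSEM-1.3.0/

if rsem_location_ok "$rsem_value"; then
    echo "RSEM programs found (${rsem_value:-on PATH})"
else
    echo "RSEM programs not found (${rsem_value:-on PATH})"
fi
```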

EnTAP Installation

Once the dependencies and pipeline software have been installed, you can install EnTAP itself.

First, download and extract the latest tagged release from GitLab: https://gitlab.com/EnTAP/EnTAP/tags

Within the main directory, execute the following command:

cmake CMakeLists.txt

This will generate a Makefile. Then execute:

make

Or to install to a destination directory:

cmake CMakeLists.txt -DCMAKE_INSTALL_PREFIX=/destination/dir
make install

This will complete the installation process. You are ready to start using EnTAP!