Utilities

Several scripts that may be useful for converting between different file formats are housed in the funannotate util submenu.

.. code-block:: none

  $ funannotate util

      Usage:       funannotate util <arguments>
      version:     1.8.14

      Commands:
        stats              Generate assembly and annotation stats
        contrast           Compare annotations to reference (GFF3 or GBK annotations)
        tbl2gbk            Convert TBL format to GenBank format
        gbk2parts          Convert GBK file to individual components
        gff2prot           Convert GFF3 + FASTA files to protein FASTA
        gff2tbl            Convert GFF3 format to NCBI annotation table (tbl)
        bam2gff3           Convert BAM coord-sorted transcript alignments to GFF3
        prot2genome        Map proteins to genome generating GFF3 protein alignments
        stringtie2gff3     Convert GTF (stringTIE) to GFF3 format
        quarry2gff3        Convert CodingQuarry output to proper GFF3 format
        gff-rename         Sort GFF3 file and rename gene models

Generate genome assembly stats

To generate genome assembly and annotation stats in a JSON file, run funannotate util stats:

.. code-block:: none

  $ funannotate util stats

      Usage:       funannotate util stats <arguments>
      version:     1.8.14

      Description: Generate JSON file with genome assembly and annotation stats.

      Arguments:
        -f, --fasta              Genome FASTA file (Required)
        -o, --out                Output file (JSON format)
        -g, --gff3               Genome Annotation (GFF3 format)
        -t, --tbl                Genome Annotation (NCBI TBL format)
        --transcript_alignments  Transcript alignments (GFF3 format)
        --protein_alignments     Protein alignments (GFF3 format)
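
As a minimal sketch of how these arguments fit together, the following Python snippet assembles a ``funannotate util stats`` command line from the flags listed in the help text and prints it. The helper name ``build_stats_cmd`` and the file names are placeholders chosen for illustration; they are not part of funannotate.

.. code-block:: python

  import shlex


  def build_stats_cmd(fasta, out, gff3=None, tbl=None):
      """Return the argument list for ``funannotate util stats``.

      Hypothetical helper: it only strings together flags that appear
      in the help text and does not run anything.
      """
      cmd = ["funannotate", "util", "stats", "--fasta", fasta, "--out", out]
      if gff3:
          cmd += ["--gff3", gff3]
      if tbl:
          cmd += ["--tbl", tbl]
      return cmd


  # Placeholder file names; substitute your own paths.
  stats_cmd = build_stats_cmd("genome.fasta", "genome.stats.json", gff3="genome.gff3")
  print(shlex.join(stats_cmd))

Passing ``stats_cmd`` to ``subprocess.run(stats_cmd, check=True)`` would execute the command, provided funannotate is installed and on the ``PATH``.
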

Compare/contrast annotations to a reference

To compare/contrast a query annotation against a reference annotation, each in GFF3 or GBK format, run funannotate util contrast:

.. code-block:: none

  $ funannotate util contrast

      Usage:       funannotate util contrast <arguments>
      version:     1.8.14

      Description: Compare/contrast annotations to reference. Annotations in either GBK or GFF3 format.

      Arguments: -r, --reference            Reference Annotation. GFF3 or GBK format
                 -f, --fasta                Genome FASTA. Required if GFF3 used
                 -q, --query                Annotation query. GFF3 or GBK format
                 -o, --output               Output basename
                 -c, --calculate_pident     Measure protein percent identity between query and reference
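
The help text states that ``--fasta`` is required when a GFF3 annotation is used. The sketch below shows one way a wrapper script might enforce that rule before calling the tool. It assumes GFF3 files can be recognized by a ``.gff3`` or ``.gff`` extension, which is a convention of this example rather than something funannotate documents; ``build_contrast_cmd`` and the file names are likewise hypothetical.

.. code-block:: python

  def build_contrast_cmd(reference, query, output, fasta=None):
      """Return the argument list for ``funannotate util contrast``."""
      # Assumption: a GFF3 annotation is identified by its file extension.
      uses_gff3 = any(
          path.lower().endswith((".gff3", ".gff")) for path in (reference, query)
      )
      if uses_gff3 and fasta is None:
          raise ValueError("--fasta is required when an annotation is in GFF3 format")

      cmd = [
          "funannotate", "util", "contrast",
          "--reference", reference,
          "--query", query,
          "--output", output,
      ]
      if fasta:
          cmd += ["--fasta", fasta]
      return cmd


  # Placeholder file names: a GBK reference compared with a GFF3 query.
  contrast_cmd = build_contrast_cmd(
      "reference.gbk", "query.gff3", "comparison", fasta="genome.fasta"
  )
  print(" ".join(contrast_cmd))

Checking the requirement up front lets a script report the missing genome FASTA before any work starts.
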

Format Conversion

.. code-block:: none

  $ funannotate util tbl2gbk

      Usage:       funannotate util tbl2gbk <arguments>
      version:     1.8.14

      Description: Convert NCBI TBL annotations + Genome FASTA to GenBank format.

      Required:    -i, --tbl          Annotation in NCBI tbl format
                   -f, --fasta        Genome FASTA file.
                   -s, --species      Species name, use quotes for binomial, e.g. "Aspergillus fumigatus"
      Optional:
                   --isolate          Isolate name
                   --strain           Strain name
                   --sbt              NCBI Submission Template file
                   -t, --tbl2asn      Assembly parameters for tbl2asn. Example: "-l paired-ends"
                   -o, --output       Output basename
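
The note about quoting a binomial species name matters because the name contains a space and must reach the program as a single argument. As an illustrative sketch (the file names and output basename are placeholders), a Python argument list keeps the name as one element, and ``shlex.join`` shows the equivalent quoted shell command:

.. code-block:: python

  import shlex

  # Each list element is passed to the program as exactly one argument,
  # so the two-word species name needs no extra quoting here.
  tbl2gbk_cmd = [
      "funannotate", "util", "tbl2gbk",
      "--tbl", "genome.tbl",
      "--fasta", "genome.fasta",
      "--species", "Aspergillus fumigatus",
      "--output", "Afumigatus",
  ]

  # shlex.join adds the quotes a shell would need.
  print(shlex.join(tbl2gbk_cmd))

The printed command wraps the species name in quotes, which is the form to type when running the command directly in a shell.
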

.. code-block:: none

  $ funannotate util gbk2parts

      Usage:       funannotate util gbk2parts <arguments>
      version:     1.8.14

      Description: Convert GenBank file to its individual components (parts) tbl, protein
                   FASTA, transcript FASTA, and contig/scaffold FASTA.

      Arguments:   -g, --gbk          Input Genome in GenBank format
                   -o, --output       Output basename

.. code-block:: none

  $ funannotate util gff2prot

      Usage:       funannotate util gff2prot <arguments>
      version:     1.8.14

      Description: Convert GFF3 file and genome FASTA to protein sequences. FASTA output to stdout.

      Arguments: -g, --gff3           Reference Annotation. GFF3 format
                 -f, --fasta          Genome FASTA file.
                 --no_stop            Don't print stop codons
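
Because ``gff2prot`` writes its FASTA output to stdout, the result has to be redirected to keep it. In a shell this is the usual ``> proteins.fasta`` redirection; the sketch below shows the same idea from Python. ``run_to_file`` is a hypothetical helper, the file names are placeholders, and ``--no_stop`` is treated as a switch that takes no value, as its help text suggests.

.. code-block:: python

  import subprocess


  def run_to_file(cmd, path):
      """Run cmd and write everything it prints on stdout to path."""
      with open(path, "w") as handle:
          subprocess.run(cmd, stdout=handle, check=True)


  gff2prot_cmd = [
      "funannotate", "util", "gff2prot",
      "--gff3", "genome.gff3",
      "--fasta", "genome.fasta",
      "--no_stop",
  ]

  # Uncomment to run; requires funannotate on the PATH and real input files.
  # run_to_file(gff2prot_cmd, "proteins.fasta")

The same helper applies to the other utilities documented as writing to stdout, such as ``gff2tbl``, ``stringtie2gff3`` and ``quarry2gff3``.
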

.. code-block:: none

  $ funannotate util gff2tbl

      Usage:       funannotate util gff2tbl <arguments>
      version:     1.8.14

      Description: Convert GFF3 file into NCBI tbl format. Tbl output to stdout.

      Arguments:
        -g, --gff3           Reference Annotation. GFF3 format
        -f, --fasta          Genome FASTA file.

.. code-block:: none

  $ funannotate util bam2gff3

      Usage:       funannotate util bam2gff3 <arguments>
      version:     1.8.14

      Description: Convert BAM coord-sorted transcript alignments to GFF3 format.

      Arguments: -i, --bam           BAM file (coord-sorted)
                 -o, --output        GFF3 output file

.. code-block:: none

  $ funannotate util prot2genome

      Usage:       funannotate util prot2genome <arguments>
      version:     1.8.14

      Description: Map proteins to genome using exonerate. Output is EVM compatible GFF3 file.

      Arguments:   -g, --genome       Genome FASTA format (Required)
                   -p, --proteins     Proteins FASTA format (Required)
                   -o, --out          GFF3 output file (Required)
                   -f, --filter       Pre-filtering method. Default: diamond [diamond,tblastn]
                   -t, --tblastn_out  Output to save tblastn results. Default: off
                   --tblastn          Use existing tblastn results
                   --ploidy           Ploidy of assembly. Default: 1
                   --maxintron        Max intron length. Default: 3000
                   --cpus             Number of cpus to use. Default: 2
                   --EVM_HOME         Location of Evidence Modeler home directory. Default: $EVM_HOME
                   --tmpdir           Volume/location to write temporary files. Default: /tmp
                   --logfile          Logfile output file
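
To illustrate the required arguments and a few of the documented defaults, here is a small sketch that builds a ``prot2genome`` command line. The helper ``build_prot2genome_cmd`` and the file names are hypothetical; the default values simply mirror those printed in the help text, and the check on ``--filter`` reflects the two choices listed there.

.. code-block:: python

  def build_prot2genome_cmd(genome, proteins, out,
                            filter_method="diamond", cpus=2, maxintron=3000):
      """Return the argument list for ``funannotate util prot2genome``."""
      # The help text lists diamond and tblastn as the pre-filtering methods.
      if filter_method not in ("diamond", "tblastn"):
          raise ValueError("filter must be 'diamond' or 'tblastn'")
      return [
          "funannotate", "util", "prot2genome",
          "--genome", genome,
          "--proteins", proteins,
          "--out", out,
          "--filter", filter_method,
          "--cpus", str(cpus),
          "--maxintron", str(maxintron),
      ]


  # Placeholder file names; only the CPU count differs from the defaults.
  prot2genome_cmd = build_prot2genome_cmd(
      "genome.fasta", "proteins.fasta", "protein_alignments.gff3", cpus=8
  )
  print(" ".join(prot2genome_cmd))

Spelling out the defaults in the command makes a pipeline script self-documenting, at the cost of needing an update if the tool's defaults change.
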

.. code-block:: none

  $ funannotate util stringtie2gff3

      Usage:       funannotate util stringtie2gff3 <arguments>
      version:     1.8.14

      Description: Convert StringTIE GTF format to GFF3 funannotate compatible format. Output
                   to stdout.

      Arguments:   -i, --input        GTF file from stringTIE

.. code-block:: none

  $ funannotate util quarry2gff3

      Usage:       funannotate util quarry2gff3 <arguments>
      version:     1.8.14

      Description: Convert CodingQuarry output GFF to proper GFF3 format. Output to stdout.

      Arguments:   -i, --input        CodingQuarry output GFF file. (PredictedPass.gff3)

.. code-block:: none

  $ funannotate util gff-rename

      Usage:       funannotate util gff-rename <arguments>
      version:     1.8.14

      Description: Sort GFF3 file by contigs and rename gene models.

      Arguments:   -g, --gff3           Reference Annotation. GFF3 format
                   -f, --fasta          Genome FASTA file.
                   -o, --out            Output GFF3 file
                   -l, --locus_tag      Locus tag to use. Default: FUN
                   -n, --numbering      Start number for genes. Default: 1