Utilities¶
There are several scripts that maybe useful to users to convert between different formats, these scripts are housed in the funannotate util
submenu.
$ funannotate util
Usage: funannotate util <arguments>
version: 1.7.0
Commands:
contrast Compare annotations to reference (GFF3 or GBK annotations)
tbl2gbk Convert TBL format to GenBank format
gbk2parts Convert GBK file to individual components
gff2prot Convert GFF3 + FASTA files to protein FASTA
gff2tbl Convert GFF3 format to NCBI annotation table (tbl)
bam2gff3 Convert BAM coord-sorted transcript alignments to GFF3
prot2genome Map proteins to genome generating GFF3 protein alignments
stringtie2gff3 Convert GTF (stringTIE) to GFF3 format
quarry2gff3 Convert CodingQuarry output to proper GFF3 format
Comparing/contrast annotations to a reference¶
To compare/contrast genome annotations between different GFF3 or GBK files.
$ funannotate util contrast
Usage: funannotate util contrast <arguments>
version: 1.7.0
Description: Compare/constrast annotations to reference. Annotations in either GBK or GFF3 format.
Arguments: -r, --reference Reference Annotation. GFF3 or GBK format
-f, --fasta Genome FASTA. Required if GFF3 used
-q, --query Annotation query. GFF3 or GBK format
-o, --output Output basename
-c, --calculate_pident Measure protein percent identity between query and reference
Format Conversion¶
$ funannotate util tbl2gbk
Usage: funannotate util tbl2gbk <arguments>
version: 1.7.0
Description: Convert NCBI TBL annotations + Genome FASTA to GenBank format.
Required: -i, --tbl Annotation in NCBI tbl format
-f, --fasta Genome FASTA file.
-s, --species Species name, use quotes for binomial, e.g. "Aspergillus fumigatus"
Optional:
--isolate Isolate name
--strain Strain name
--sbt NCBI Submission Template file
-t, --tbl2asn Assembly parameters for tbl2asn. Example: "-l paired-ends"
-o, --output Output basename
$ funannotate util gbk2parts
Usage: funannotate util gbk2parts <arguments>
version: 1.7.0
Description: Convert GenBank file to its individual components (parts) tbl, protein
FASTA, transcript FASTA, and contig/scaffold FASTA.
Arguments: -g, --gbk Input Genome in GenBank format
-o, --output Output basename
$ funannotate util gff2prot
Usage: funannotate util gff2prot <arguments>
version: 1.7.0
Description: Convert GFF3 file and genome FASTA to protein sequences. FASTA output to stdout.
Arguments: -g, --gff3 Reference Annotation. GFF3 format
-f, --fasta Genome FASTA file.
--no_stop Dont print stop codons
$ funannotate util gff2tbl
Usage: funannotate util gff2tbl <arguments>
version: 1.7.0
Description: Convert GFF3 file into NCBI tbl format. Tbl output to stdout.
Arguments:
-g, --gff3 Reference Annotation. GFF3 format
-f, --fasta Genome FASTA file.
$ funannotate util bam2gff3
Usage: funannotate util bam2gff3 <arguments>
version: 1.7.0
Description: Convert BAM coordsorted transcript alignments to GFF3 format.
Arguments: -i, --bam BAM file (coord-sorted)
-o, --output GFF3 output file
$ funannotate util protein2genome
Usage: funannotate util prot2genome <arguments>
version: 1.7.0
Description: Map proteins to genome using exonerate. Output is EVM compatible GFF3 file.
Arguments: -g, --genome Genome FASTA format (Required)
-p, --proteins Proteins FASTA format (Required)
-o, --out GFF3 output file (Required)
-f, --filter Pre-filtering method. Default: diamond [diamond,tblastn]
-t, --tblastn_out Output to save tblastn results. Default: off
--tblastn Use existing tblastn results
--ploidy Ploidy of assembly. Default: 1
--maxintron Max intron length. Default: 3000
--cpus Number of cpus to use. Default: 2
--EVM_HOME Location of Evidence Modeler home directory. Default: $EVM_HOME
--logfile Logfile output file
$ funannotate util stringtie2gff3
Usage: funannotate util stringtie2gff3 <arguments>
version: 1.7.0
Description: Convert StringTIE GTF format to GFF3 funannotate compatible format. Output
to stdout.
Arguments: -i, --input GTF file from stringTIE
$ funannotate util quarry2gff3
Usage: funannotate util quarry2gff3 <arguments>
version: 1.7.0
Description: Convert CodingQuarry output GFF to proper GFF3 format. Output to stdout.
Arguments: -i, --input CodingQuarry output GFF file. (PredictedPass.gff3)