gff3 file for featureCounts
1
0
Entering edit mode
@a7dd715a
Last seen 2.3 years ago
Turkey

Hi, I read in this site that gff3 format is also acceptable for featureCounts and I don't have gtf but only gff3 file, but it doesn't work.

featureCounts -T 6 -t exon -g gene_id -Q 30 -F GTF -a medtr.R108_HM340.gnm1.ann1.85YW.gcv_genes.gff3  -o htseqresults.txt  
/w_NF18553root_sorted.bam

and the output is this

Load annotation file medtr.R108_HM340.gnm1.ann1.85YW.gcv_genes.gff3 ...    ||
ERROR: no features were loaded in format GTF. The annotation format can be specified by the '-F' option, and the required feature type can be specified by the '-t' option..
The porgram has to terminate.
gtf/gff3 subread featureCounts • 4.2k views
ADD COMMENT
0
Entering edit mode
@james-w-macdonald-5106
Last seen 1 day ago
United States

You are using the command line featureCounts function, which is not part of Bioconductor. Looking at the help for that version, I don't see any mention of GFF file format being acceptable. The featureCounts function in the Bioconductor Rsubread package does say it will accept a GFF file as input, so you might try that.

However, the GFF file you are using is probably the wrong one. I believe you want medtr.R108_HM340.gnm1.ann1.85YW.gene_models_main.gff3.gz instead.

An alternative to using GFF (if it doesn't work, that is. I have no experience with using a GFF personally), you could always make a SAF file.

> library(GenomicFeatures)
> download.file("https://peanutbase.org/data/v1/Medicago_truncatula/R108_HM340.gnm1.ann1.85YW/medtr.R108_HM340.gnm1.ann1.85YW.gene_models_main.gff3.gz", "peanut.gff3.gz")
trying URL 'https://peanutbase.org/data/v1/Medicago_truncatula/R108_HM340.gnm1.ann1.85YW/medtr.R108_HM340.gnm1.ann1.85YW.gene_models_main.gff3.gz'
Content type 'application/x-gzip' length 12367722 bytes (11.8 MB)
downloaded 11.8 MB

## Make a TxDb object first

> txdb <- makeTxDbFromGFF("peanut.gff3.gz")
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK

## Get the exon data

> ex <- exonsBy(txdb, "gene")

## make an SAF data.frame

> len <- lengths(ex)
> ex2 <- unlist(ex)
> saf <- data.frame(GeneID = rep(names(len), len), Chr = seqnames(ex2), Start = start(ex2), End = end(ex2), Strand = as.character(strand(ex2)))
> head(saf)
            GeneID                          Chr Start   End Strand
1 BZG31_000s000010 medtr.R108_HM340.gnm1.scf000 16103 16197      -
2 BZG31_000s000010 medtr.R108_HM340.gnm1.scf000 16334 16401      -
3 BZG31_000s000010 medtr.R108_HM340.gnm1.scf000 16938 17116      -
4 BZG31_000s000010 medtr.R108_HM340.gnm1.scf000 17188 17196      -
5 BZG31_000s000020 medtr.R108_HM340.gnm1.scf000 18559 19824      -
6 BZG31_000s000030 medtr.R108_HM340.gnm1.scf000 21927 21971      +
>

And then you can use that directly. See ?featureCounts for more information.

ADD COMMENT

Login before adding your answer.

Traffic: 727 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6