Question: easyRNAseq annotation file
3.9 years ago by kylvalda (United States)
kylvalda wrote:

Hi,

Apparently I now need to use a GTF or GFF3 file for annotation with easyRNASeq when processing hg19 BAM files, because the biomaRt method uses GRCh38 and I get an error. So far, so bad.

However, I cannot seem to find a GTF or GFF3 file for hg19 that easyRNASeq will accept:

> DGE<-simpleRNASeq(bamFiles=getBamFileList(bamfiles), param=param, nnodes=4, verbose=T, override=T)


...

==========================
Processing the annotation
==========================
Validating the annotation source
Read 993 records
Error in .validate(obj, verbose = verbose) :
The provided gff3 contains no annotation of type 'mRNA' and/or 'exon' in the first 1000 lines.


I've tried the following annotation files, but none of them seems to work:

ref_GRCh37.p13_top_level.gff3.gz
ref_GRCh37.p13_scaffolds.gff3.gz
gencode.v19.annotation.gtf
gencode.v19.annotation.gff3
Homo_sapiens.GRCh37.70.gtf

I will try the Illumina iGenome next. What am I missing here?
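The validation error refers to the feature types seen in the first 1000 lines of the file, so it can help to look at what that window actually contains. Below is a minimal base-R sketch that tabulates the feature-type column (column 3) of a GTF/GFF3 file. The helper name `feature_types` and the toy records are illustrative assumptions and are not part of easyRNASeq.

```r
# Hypothetical helper (not part of easyRNASeq): tabulate the feature-type
# column (column 3) over the first `n` lines of a GTF/GFF3 file.
feature_types <- function(path, n = 1000) {
  lines <- readLines(path, n = n)
  # Drop comment/header lines, which start with '#'
  lines <- lines[!grepl("^#", lines)]
  fields <- strsplit(lines, "\t", fixed = TRUE)
  # Keep only well-formed records with the nine tab-separated columns
  fields <- fields[lengths(fields) == 9]
  table(vapply(fields, `[`, character(1), 3))
}

# Toy GTF records, made up for illustration only
tmp <- tempfile(fileext = ".gtf")
writeLines(c(
  "#!genome-build GRCh37",
  "chr1\tsrc\tgene\t100\t900\t.\t+\t.\tgene_id \"g1\";",
  "chr1\tsrc\ttranscript\t100\t900\t.\t+\t.\tgene_id \"g1\"; transcript_id \"t1\";",
  "chr1\tsrc\texon\t100\t300\t.\t+\t.\tgene_id \"g1\"; transcript_id \"t1\";",
  "chr1\tsrc\texon\t500\t900\t.\t+\t.\tgene_id \"g1\"; transcript_id \"t1\";"
), tmp)

types <- feature_types(tmp)
print(types)
```

In this toy file the table lists `exon`, `gene` and `transcript` but no `mRNA`, which is the kind of mismatch worth ruling out when a file is parsed under the wrong format.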

Tags: annotation, easyrnaseq
Answer: easyRNAseq annotation file
3.9 years ago by Nicolas Delhomme (Sweden)
Nicolas Delhomme wrote:

Hej Kylvalda!

Do all the annotation files give the same error? And can you tell me 1) where you got the files from and 2) which versions of R/Bioconductor and easyRNASeq you are using, so that I can try to reproduce the error?

Cheers,
Nico

---------------------------------------------------------------
Nicolas Delhomme, PhD
The Street Lab, Umeå Plant Science Center, Swedish University for Agricultural Sciences (SLU) and Umeå University
Tel: +46 90 786 5478
Email: nicolas.delhomme@umu.se
SLU - Umeå universitet, Umeå S-901 87, Sweden
---------------------------------------------------------------
Answer: easyRNAseq annotation file
3.9 years ago by kylvalda (United States)
kylvalda wrote:

Hey Nico,

Sorry for the late reply (I am currently battling kallisto/sleuth). Here is my session info:

> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server release 6.6 (Santiago)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] DESeq2_1.8.1                      RcppArmadillo_0.5.500.2.0
 [3] Rcpp_0.12.1                       easyRNASeq_2.4.7
 [5] BSgenome.Hsapiens.UCSC.hg19_1.4.0 BSgenome_1.36.3
 [7] rtracklayer_1.28.10               Biostrings_2.36.4
 [9] XVector_0.8.0                     GenomicRanges_1.20.6
[11] GenomeInfoDb_1.4.2                IRanges_2.2.7
[13] S4Vectors_0.6.5                   BiocGenerics_0.14.0

loaded via a namespace (and not attached):
 [1] genefilter_1.50.0       locfit_1.5-9.1          reshape2_1.4.1
 [4] splines_3.2.1           lattice_0.20-33         colorspace_1.2-6
 [7] survival_2.38-3         XML_3.98-1.3            foreign_0.8-66
[10] DBI_0.3.1               BiocParallel_1.2.21     RColorBrewer_1.1-2
[13] lambda.r_1.1.7          plyr_1.8.3              stringr_1.0.0
[16] zlibbioc_1.14.0         munsell_0.4.2           gtable_0.1.2
[19] futile.logger_1.4.1     hwriter_1.3.2           latticeExtra_0.6-26
[22] Biobase_2.28.0          geneplotter_1.46.0      biomaRt_2.24.0
[25] AnnotationDbi_1.30.1    proto_0.3-10            acepack_1.3-3.3
[28] xtable_1.7-4            edgeR_3.10.2            scales_0.3.0
[31] limma_3.24.15           Hmisc_3.16-0            annotate_1.46.1
[34] ShortRead_1.26.0        gridExtra_2.0.0         Rsamtools_1.20.4
[37] ggplot2_1.0.1           digest_0.6.8            stringi_0.5-5
[40] DESeq_1.20.0            grid_3.2.1              tools_3.2.1
[43] bitops_1.0-6            magrittr_1.5            RCurl_1.95-4.7
[46] LSD_3.0                 RSQLite_1.0.0           cluster_2.0.3
[49] Formula_1.2-1           futile.options_1.0.0    MASS_7.3-44
[52] rpart_4.1-10            nnet_7.3-11             GenomicAlignments_1.4.1
[55] genomeIntervals_1.24.1  intervals_0.15.1


I've downloaded the mentioned annotations from:

ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/ARCHIVE/ANNOTATION_RELEASE.105/GFF/

http://www.gencodegenes.org/releases/19.html

http://ftp.ensembl.org/pub/release-70/gtf/homo_sapiens/

 

 

Thanks a bunch
Answer: easyRNAseq annotation file
3.9 years ago by kylvalda (United States)
kylvalda wrote:

Same problem with iGenome:

Homo_sapiens/Ensembl/GRCh37/Annotation/Archives/archive-2014-05-23-16-03-55/Genes/genes.gtf

==========================
Processing the annotation
==========================
Validating the annotation source
Read 993 records
Error in .validate(obj, verbose = verbose) :
  The provided gff3 contains no annotation of type 'mRNA' and/or 'exon' in the first 1000 lines.


...

Answer: easyRNAseq annotation file
3.9 years ago by kylvalda (United States)
kylvalda wrote:

Might this be the problem?

> annotParam<-AnnotParam(datasource=system.file('extdata','~/Homo_sapiens/Homo_sapiens/Ensembl/GRCh37/Annotation/Archives/archive-2014-05-23-16-03-55/Genes/genes.gtf'))
> annotParam
AnnotParamCharacter object set to retrieve 'gff3' formatted annotation from:

>

I just realised that my answer has been held up on our mail server. :-\ Here it is:

----

Yes, it likely is. Can you set type="gtf" and try again?

annotParam<-AnnotParam(datasource=system.file('extdata','~/Homo_sapiens/Homo_sapiens/Ensembl/GRCh37/Annotation/Archives/archive-2014-05-23-16-03-55/Genes/genes.gtf'),type="gtf")

I will try to add some more file checking as soon as possible to prevent that from happening.

Cheers,
Nico

---

I will make sure to add a test to ensure that this does not happen.

Nico
Reply written 3.9 years ago by Nicolas Delhomme
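
One more thing that may be worth checking in the call above: `system.file()` only looks inside installed packages and returns an empty string when nothing matches, which would be consistent with the empty datasource printed by `annotParam`. Below is a minimal sketch, under that assumption, of validating a path on disk before handing it to `AnnotParam()`. The helper `resolve_annotation` is hypothetical, the temporary file stands in for a real GTF, and the final `AnnotParam()` call is left commented out because it assumes easyRNASeq is loaded.

```r
# system.file() resolves files shipped inside an installed package; a path
# under the home directory is not found there, so the result is "".
not_found <- system.file("extdata", "~/some/local/genes.gtf")
print(not_found)

# Hypothetical helper: expand '~' and fail early if the file is absent.
resolve_annotation <- function(path) {
  full <- path.expand(path)
  if (!file.exists(full)) {
    stop("annotation file not found: ", full)
  }
  normalizePath(full)
}

# Demonstration with a temporary file standing in for a real GTF
tmp <- tempfile(fileext = ".gtf")
writeLines("chr1\tsrc\texon\t1\t10\t.\t+\t.\tgene_id \"g1\";", tmp)
gtf <- resolve_annotation(tmp)

# With easyRNASeq loaded, the annotation could then be declared as a GTF:
# annotParam <- AnnotParam(datasource = gtf, type = "gtf")
```

Failing early on a missing file keeps the problem visible at the point where the path is built, rather than surfacing later as a format validation error.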