easyRNASeq package: overlapping exons warning unsolved (?) and use of paired-end information
@gabriele-zoppoli-5976
Dear Bioconductor mailing list,

I am trying to load and summarize about 60 paired-end BAM files from human cancer RNA-seq experiments (Illumina 2x50 protocol), for downstream use with edgeR and DESeq.

First question: several posts have already been written on the same topic, namely how to resolve this warning when using easyRNASeq:

    "There are [any number] synthetic exons as determined from your annotation that overlap! This implies that some reads will be counted more than once! Is that really what you want?"

So far, I haven't seen any answer that doesn't go through GenomicRanges, and some decently written posts have received no answer at all. The easyRNASeq vignette is not entirely clear on this point. I'm therefore wondering whether anybody has come up with a solution and posted it in a plain and reproducible fashion.

Second question: does anybody know whether easyRNASeq makes use of the "properly paired" reads for summarization? I really couldn't find anything on that either, even after a month of googling around.

This is what I've done so far:

    countTable <- easyRNASeq(filesDirectory=getwd(),
                             organism="Hsapiens",
                             annotationMethod="rda",
                             annotationFile="gAnnot.rda",
                             gapped=TRUE,
                             count="genes",
                             summarization="geneModels",
                             filenames=BAM_files,
                             outputFormat="RNAseq",
                             nBcores=4)
    Checking arguments...
    Fetching annotations...
    Computing gene models...
    Summarizing counts...
    Processing sample1.bam
    Updating the read length information.
    The alignments are gapped.
    Minimum length of 1 bp.
    Maximum length of 51 bp.
    [...]
    Preparing output
    Warning messages:
    [...]
    2: In easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", annotationMethod = "rda", :
      There are 16816 synthetic exons as determined from your annotation that overlap! This implies that some reads will be counted more than once! Is that really what you want?
    [...]
The rda file is derived from a previous iteration of the same command using annotationMethod="biomaRt", followed by:

    gAnnot <- genomicAnnotation(count.genes)
    gAnnot <- gAnnot[space(gAnnot) %in% paste("chr",c(1:22,"X","Y","M"),sep=""),]
    save(gAnnot,file="gAnnot.rda")

as suggested by Nicolas Delhomme.

Thanks in advance!

    sessionInfo()
    R version 3.0.1 (2013-05-16)
    Platform: x86_64-unknown-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=C                 LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] parallel  stats     graphics  grDevices utils     datasets  methods
    [8] base

    other attached packages:
     [1] easyRNASeq_1.6.0       ShortRead_1.18.0       latticeExtra_0.6-24
     [4] RColorBrewer_1.0-5     Rsamtools_1.12.3       DESeq_1.12.0
     [7] lattice_0.20-15        locfit_1.5-9.1         BSgenome_1.28.0
    [10] GenomicRanges_1.12.4   Biostrings_2.28.0      IRanges_1.18.1
    [13] edgeR_3.2.3            limma_3.16.5           biomaRt_2.16.0
    [16] Biobase_2.20.0         genomeIntervals_1.16.0 BiocGenerics_0.6.0
    [19] intervals_0.14.0

    loaded via a namespace (and not attached):
     [1] annotate_1.38.0      AnnotationDbi_1.22.6 bitops_1.0-5
     [4] DBI_0.2-7            genefilter_1.42.0    geneplotter_1.38.0
     [7] grid_3.0.1           hwriter_1.3          RCurl_1.95-4.1
    [10] RSQLite_0.11.4       splines_3.0.1        stats4_3.0.1
    [13] survival_2.37-4      XML_3.96-1.1         xtable_1.7-1
    [16] zlibbioc_1.6.0

--
Gabriele Zoppoli, MD
Ph.D., Clinical and Experimental Oncology and Hematology
Visiting Researcher, BCTRL J.C. Heuson, Institut J. Bordet, Bruxelles BE
Internal Medicine Resident, DiMI, IRCCS AOU San Martino IST, Genova, IT
Former Guest Researcher, LMP, CCR, NCI, NIH, Bethesda MD
Tel: +39 010 353 7968
Mobile 1: +32 478 337 942
Mobile 2: +39 349 617 0129
Email: gabriele.zoppoli@unige.it
Alt. Email: zoppoli@gmail.com
Alt. Email 2: gzoppoli@libero.it
Alt. Email 3: gabriele.zoppoli@bordet.be

Ζεῦ πάτερ ἀλλὰ σὺ ῥῦσαι ὑπ' ἠέρος υἷας Ἀχαιῶν,
ποίησον δ' αἴθρην, δὸς δ' ὀφθαλμοῖσιν ἰδέσθαι:
ἐν δὲ φάει καὶ ὄλεσσον, ἐπεί νύ τοι εὔαδεν οὕτως.

Father Zeus, at least deliver the sons of the Achaeans from the gloom,
And make clear the air, and give it to our eyes to see.
In the light destroy us, since to do thus pleases you. (Il. 17, 645-7)
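To make the warning quoted in the question concrete, here is a minimal base-R sketch of why overlapping synthetic exons can inflate counts. It is not easyRNASeq's code and does not reproduce its counting rules: the coordinates, gene names and the helper `overlaps` are all made up for illustration, and the only assumption is that a read is credited to every exon it touches.

```r
# Toy annotation on a single chromosome: hypothetical synthetic exons.
exons <- data.frame(
  gene  = c("geneA", "geneA", "geneB", "geneC"),
  start = c(100, 300, 350, 600),
  end   = c(200, 400, 450, 700),
  stringsAsFactors = FALSE
)

# Toy reads (alignment start and end), also made up.
reads <- data.frame(
  start = c(120, 360, 420, 650),
  end   = c(170, 410, 470, 700)
)

# TRUE when closed intervals [s1, e1] and [s2, e2] share at least one base.
overlaps <- function(s1, e1, s2, e2) s1 <= e2 & s2 <= e1

# Find pairs of exons from different genes that overlap each other.
idx <- t(combn(nrow(exons), 2))
hit <- overlaps(exons$start[idx[, 1]], exons$end[idx[, 1]],
                exons$start[idx[, 2]], exons$end[idx[, 2]]) &
  exons$gene[idx[, 1]] != exons$gene[idx[, 2]]
overlapping_pairs <- idx[hit, , drop = FALSE]

# Read-by-exon incidence matrix: entry is TRUE when the read touches the exon.
read_by_exon <- outer(
  seq_len(nrow(reads)), seq_len(nrow(exons)),
  function(r, x) overlaps(reads$start[r], reads$end[r],
                          exons$start[x], exons$end[x])
)

# Naive gene totals: sum the per-exon counts within each gene.
gene_counts <- sapply(split(colSums(read_by_exon), exons$gene), sum)

# Number of reads credited to more than one gene.
genes_per_read <- apply(read_by_exon, 1,
                        function(h) length(unique(exons$gene[h])))
multi <- sum(genes_per_read > 1)

print(overlapping_pairs)  # exon 2 (geneA) overlaps exon 3 (geneB)
print(gene_counts)        # totals add up to 5 although there are only 4 reads
print(multi)              # 1 read is counted for two genes
```

In this toy case the second read lies in the region shared by geneA and geneB, so both gene totals include it. With a real annotation, the overlapping exon pairs found this way would be the ones to inspect before deciding how such reads should be handled.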
Annotation Organism edgeR DESeq easyRNASeq