Problems Counting Reads with summarizeOverlaps

0

Entering edit mode

Sam McInturf ▴ 300

@sam-mcinturf-5291

Last seen 10.3 years ago

United States

Hello Bioconductors, I am working on an Arabidopsis RNA seq set, and after executing my count reads script, I get a count table with no reads counted (ie sum(colSums(mat)) = 0). I mapped my reads using Tophat2 (without supplying a gtf/gff file) and used samtools to sort and index my accepted_hits.bam files. I loaded these bam files into IGV (integrated Genome Browser) and the reads appear to have been mapped (additionally the align_summary.txt says I have good mapping). Script and session info below. Essentially I use Rsamtools to load in my bamfiles, and then makeTranscriptsFromGFF to make a txdb object, transcriptsBy to get transcripts, and then summarizeOverlaps to count the reads. My gff file is TAIR10. I am relatively sure that the genome I mapped to was the TAIR10 release, but I am not 100% on that. I have also used biomaRt to do this (commented out), but I had the same results. I am currently upgrading to R 3.0.2 and re-installing bioconductor (maybe it will change something?) Thanks for any thoughts! Script: gffdir <- "/data/smcintur/Annotation/TAIR10_GFF3_genes.gff"; library(ShortRead) library(GenomicFeatures) library(Rsamtools) #library(biomaRt) setwd("/data/smcintur/RNASeq/NMseq/newTophat/BamFiles/") bf <- list.files(pattern = ".bam$") bi <- list.files(pattern = "bam.bai$") bfl <- BamFileList(bf, bi) #mart <- makeTranscriptDbFromBiomart(biomart = "Public_TAIRV10", dataset= "tairv10") tx <- transcriptsBy(mart) gff <- makeTranscriptDbFromGFF(gffdir, format = "gff") gff cds <- cdsBy(gff) tx <- transcriptsBy(gff) olapTx <- summarizeOverlaps(tx, bfl) olapCds <- summarizeOverlaps(cds, bfl) txMat <- assays(olapTx)$counts cdsMat <- assays(olapCds)$counts write.table(txMat, file = "countMatTxGff.txt", sep = "\t") write.table(cdsMat, file = "countMatCdsGff.txt", sep = "\t") sessionInfo R version 3.0.0 (2013-04-03) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=C LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] BiocInstaller_1.12.0 loaded via a namespace (and not attached): [1] AnnotationDbi_1.24.0 Biobase_2.22.0 BiocGenerics_0.8.0 [4] biomaRt_2.18.0 Biostrings_2.30.1 bitops_1.0-6 [7] BSgenome_1.30.0 DBI_0.2-7 GenomicFeatures_1.14.2 [10] GenomicRanges_1.14.3 IRanges_1.20.6 parallel_3.0.0 [13] RCurl_1.95-4.1 Rsamtools_1.14.2 RSQLite_0.11.4 [16] rtracklayer_1.22.0 stats4_3.0.0 tools_3.0.0 [19] XML_3.98-1.1 XVector_0.2.0 zlibbioc_1.8.0 -- Sam McInturf [[alternative HTML version deleted]]

biomaRt Rsamtools biomaRt Rsamtools • 3.6k views

ADD COMMENT • link updated 12.1 years ago by Valerie Obenchain ★ 6.8k • written 12.1 years ago by Sam McInturf ▴ 300

0

Entering edit mode

Valerie Obenchain ★ 6.8k

@valerie-obenchain-4275

Last seen 4.0 years ago

United States

Hi Sam, Are you getting any error messages related to seqlevels when you count? If you have confirmed that annotation seqlevels match the bam files next look at the maximum possible overlap in a single bam file. bf <- singleBamFile reads <- readGAlignments(bf) ## assuming single-end reads so <- summarizeOverlaps(tx, reads, mode=Union, inter.feature=FALSE) 'inter.feature=FALSE' with 'mode=Union' is countOverlaps with 'type=ANY'. Reads that hit multiple features will be double counted in this case. The idea to to see if there are any common regions at all. If there are still no counts, you could look more closely at a smaller chromosome region in 'reads' where you would expect counts. This visual inspection might give you some insight as to why there is no overlap. Valerie On 12/05/13 18:51, Sam McInturf wrote: > Hello Bioconductors, > I am working on an Arabidopsis RNA seq set, and after executing my count > reads script, I get a count table with no reads counted (ie > sum(colSums(mat)) = 0). > > I mapped my reads using Tophat2 (without supplying a gtf/gff file) and used > samtools to sort and index my accepted_hits.bam files. I loaded these bam > files into IGV (integrated Genome Browser) and the reads appear to have > been mapped (additionally the align_summary.txt says I have good mapping). > Script and session info below. > > Essentially I use Rsamtools to load in my bamfiles, and then > makeTranscriptsFromGFF to make a txdb object, transcriptsBy to get > transcripts, and then summarizeOverlaps to count the reads. My gff file is > TAIR10. I am relatively sure that the genome I mapped to was the TAIR10 > release, but I am not 100% on that. > > I have also used biomaRt to do this (commented out), but I had the same > results. > > I am currently upgrading to R 3.0.2 and re-installing bioconductor (maybe > it will change something?) > > Thanks for any thoughts! > > Script: > > gffdir <- "/data/smcintur/Annotation/TAIR10_GFF3_genes.gff"; > library(ShortRead) > library(GenomicFeatures) > library(Rsamtools) > #library(biomaRt) > setwd("/data/smcintur/RNASeq/NMseq/newTophat/BamFiles/") > bf <- list.files(pattern = ".bam$") > bi <- list.files(pattern = "bam.bai$") > bfl <- BamFileList(bf, bi) > > #mart <- makeTranscriptDbFromBiomart(biomart = "Public_TAIRV10", dataset= > "tairv10") > tx <- transcriptsBy(mart) > gff <- makeTranscriptDbFromGFF(gffdir, format = "gff") > gff > > cds <- cdsBy(gff) > tx <- transcriptsBy(gff) > > olapTx <- summarizeOverlaps(tx, bfl) > olapCds <- summarizeOverlaps(cds, bfl) > > txMat <- assays(olapTx)$counts > cdsMat <- assays(olapCds)$counts > > write.table(txMat, file = "countMatTxGff.txt", sep = "\t") > write.table(cdsMat, file = "countMatCdsGff.txt", sep = "\t") > > sessionInfo > R version 3.0.0 (2013-04-03) > Platform: x86_64-unknown-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=C LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] BiocInstaller_1.12.0 > > loaded via a namespace (and not attached): > [1] AnnotationDbi_1.24.0 Biobase_2.22.0 BiocGenerics_0.8.0 > [4] biomaRt_2.18.0 Biostrings_2.30.1 bitops_1.0-6 > [7] BSgenome_1.30.0 DBI_0.2-7 GenomicFeatures_1.14.2 > [10] GenomicRanges_1.14.3 IRanges_1.20.6 parallel_3.0.0 > [13] RCurl_1.95-4.1 Rsamtools_1.14.2 RSQLite_0.11.4 > [16] rtracklayer_1.22.0 stats4_3.0.0 tools_3.0.0 > [19] XML_3.98-1.1 XVector_0.2.0 zlibbioc_1.8.0 >

ADD COMMENT • link 12.1 years ago Valerie Obenchain ★ 6.8k

0

Entering edit mode

Hi Sam, It depends on whether you are using single read or paired end reads files. As far as I known summarizeOverlaps function only working in case of single read. Anyone please correct me if I am wrong. Kind Regards On Sun, Dec 8, 2013 at 8:25 PM, Valerie Obenchain <vobencha@fhcrc.org>wrote: > Hi Sam, > > Are you getting any error messages related to seqlevels when you count? If > you have confirmed that annotation seqlevels match the bam files next look > at the maximum possible overlap in a single bam file. > > bf <- singleBamFile > reads <- readGAlignments(bf) ## assuming single-end reads > so <- summarizeOverlaps(tx, reads, mode=Union, inter.feature=FALSE) > > 'inter.feature=FALSE' with 'mode=Union' is countOverlaps with 'type=ANY'. > Reads that hit multiple features will be double counted in this case. The > idea to to see if there are any common regions at all. > > If there are still no counts, you could look more closely at a smaller > chromosome region in 'reads' where you would expect counts. This visual > inspection might give you some insight as to why there is no overlap. > > > Valerie > > > On 12/05/13 18:51, Sam McInturf wrote: > >> Hello Bioconductors, >> I am working on an Arabidopsis RNA seq set, and after executing my >> count >> reads script, I get a count table with no reads counted (ie >> sum(colSums(mat)) = 0). >> >> I mapped my reads using Tophat2 (without supplying a gtf/gff file) and >> used >> samtools to sort and index my accepted_hits.bam files. I loaded these bam >> files into IGV (integrated Genome Browser) and the reads appear to have >> been mapped (additionally the align_summary.txt says I have good mapping). >> Script and session info below. >> >> Essentially I use Rsamtools to load in my bamfiles, and then >> makeTranscriptsFromGFF to make a txdb object, transcriptsBy to get >> transcripts, and then summarizeOverlaps to count the reads. My gff file >> is >> TAIR10. I am relatively sure that the genome I mapped to was the TAIR10 >> release, but I am not 100% on that. >> >> I have also used biomaRt to do this (commented out), but I had the same >> results. >> >> I am currently upgrading to R 3.0.2 and re-installing bioconductor (maybe >> it will change something?) >> >> Thanks for any thoughts! >> >> Script: >> >> gffdir <- "/data/smcintur/Annotation/TAIR10_GFF3_genes.gff"; >> library(ShortRead) >> library(GenomicFeatures) >> library(Rsamtools) >> #library(biomaRt) >> setwd("/data/smcintur/RNASeq/NMseq/newTophat/BamFiles/") >> bf <- list.files(pattern = ".bam$") >> bi <- list.files(pattern = "bam.bai$") >> bfl <- BamFileList(bf, bi) >> >> #mart <- makeTranscriptDbFromBiomart(biomart = "Public_TAIRV10", dataset= >> "tairv10") >> tx <- transcriptsBy(mart) >> gff <- makeTranscriptDbFromGFF(gffdir, format = "gff") >> gff >> >> cds <- cdsBy(gff) >> tx <- transcriptsBy(gff) >> >> olapTx <- summarizeOverlaps(tx, bfl) >> olapCds <- summarizeOverlaps(cds, bfl) >> >> txMat <- assays(olapTx)$counts >> cdsMat <- assays(olapCds)$counts >> >> write.table(txMat, file = "countMatTxGff.txt", sep = "\t") >> write.table(cdsMat, file = "countMatCdsGff.txt", sep = "\t") >> >> sessionInfo >> R version 3.0.0 (2013-04-03) >> Platform: x86_64-unknown-linux-gnu (64-bit) >> >> locale: >> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 >> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 >> [7] LC_PAPER=C LC_NAME=C >> [9] LC_ADDRESS=C LC_TELEPHONE=C >> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> other attached packages: >> [1] BiocInstaller_1.12.0 >> >> loaded via a namespace (and not attached): >> [1] AnnotationDbi_1.24.0 Biobase_2.22.0 BiocGenerics_0.8.0 >> [4] biomaRt_2.18.0 Biostrings_2.30.1 bitops_1.0-6 >> [7] BSgenome_1.30.0 DBI_0.2-7 GenomicFeatures_1.14.2 >> [10] GenomicRanges_1.14.3 IRanges_1.20.6 parallel_3.0.0 >> [13] RCurl_1.95-4.1 Rsamtools_1.14.2 RSQLite_0.11.4 >> [16] rtracklayer_1.22.0 stats4_3.0.0 tools_3.0.0 >> [19] XML_3.98-1.1 XVector_0.2.0 zlibbioc_1.8.0 >> >> > _______________________________________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane. > science.biology.informatics.conductor > -- Reema Singh PhD Scholar Computational Biology and Bioinformatics School of Computational and Integrative Sciences Jawaharlal Nehru University New Delhi-110067 INDIA [[alternative HTML version deleted]]

ADD REPLY • link 12.1 years ago Reema Singh ▴ 570

0

Entering edit mode

summarizeOverlaps() does count paired-end reads. Use option singleEnd=FALSE. Details under 'singleEnd' in the 'Arguments' section on the man page. library(GenomicAlignments) ## devel library(GenomicRanges) ## release ?summarizeOverlaps Valerie On 12/08/13 09:42, Reema Singh wrote: > Hi Sam, > > It depends on whether you are using single read or paired end reads > files. As far as I known summarizeOverlaps function only working in case > of single read. Anyone please correct me if I am wrong. > > Kind Regards > > > On Sun, Dec 8, 2013 at 8:25 PM, Valerie Obenchain <vobencha at="" fhcrc.org=""> <mailto:vobencha at="" fhcrc.org="">> wrote: > > Hi Sam, > > Are you getting any error messages related to seqlevels when you > count? If you have confirmed that annotation seqlevels match the bam > files next look at the maximum possible overlap in a single bam file. > > bf <- singleBamFile > reads <- readGAlignments(bf) ## assuming single-end reads > so <- summarizeOverlaps(tx, reads, mode=Union, inter.feature=FALSE) > > 'inter.feature=FALSE' with 'mode=Union' is countOverlaps with > 'type=ANY'. Reads that hit multiple features will be double counted > in this case. The idea to to see if there are any common regions at all. > > If there are still no counts, you could look more closely at a > smaller chromosome region in 'reads' where you would expect counts. > This visual inspection might give you some insight as to why there > is no overlap. > > > Valerie > > > On 12/05/13 18:51, Sam McInturf wrote: > > Hello Bioconductors, > I am working on an Arabidopsis RNA seq set, and after > executing my count > reads script, I get a count table with no reads counted (ie > sum(colSums(mat)) = 0). > > I mapped my reads using Tophat2 (without supplying a gtf/gff > file) and used > samtools to sort and index my accepted_hits.bam files. I loaded > these bam > files into IGV (integrated Genome Browser) and the reads appear > to have > been mapped (additionally the align_summary.txt says I have good > mapping). > Script and session info below. > > Essentially I use Rsamtools to load in my bamfiles, and then > makeTranscriptsFromGFF to make a txdb object, transcriptsBy to get > transcripts, and then summarizeOverlaps to count the reads. My > gff file is > TAIR10. I am relatively sure that the genome I mapped to was > the TAIR10 > release, but I am not 100% on that. > > I have also used biomaRt to do this (commented out), but I had > the same > results. > > I am currently upgrading to R 3.0.2 and re-installing > bioconductor (maybe > it will change something?) > > Thanks for any thoughts! > > Script: > > gffdir <- "/data/smcintur/Annotation/__TAIR10_GFF3_genes.gff"; > library(ShortRead) > library(GenomicFeatures) > library(Rsamtools) > #library(biomaRt) > setwd("/data/smcintur/RNASeq/__NMseq/newTophat/BamFiles/") > bf <- list.files(pattern = ".bam$") > bi <- list.files(pattern = "bam.bai$") > bfl <- BamFileList(bf, bi) > > #mart <- makeTranscriptDbFromBiomart(__biomart = > "Public_TAIRV10", dataset= > "tairv10") > tx <- transcriptsBy(mart) > gff <- makeTranscriptDbFromGFF(__gffdir, format = "gff") > gff > > cds <- cdsBy(gff) > tx <- transcriptsBy(gff) > > olapTx <- summarizeOverlaps(tx, bfl) > olapCds <- summarizeOverlaps(cds, bfl) > > txMat <- assays(olapTx)$counts > cdsMat <- assays(olapCds)$counts > > write.table(txMat, file = "countMatTxGff.txt", sep = "\t") > write.table(cdsMat, file = "countMatCdsGff.txt", sep = "\t") > > sessionInfo > R version 3.0.0 (2013-04-03) > Platform: x86_64-unknown-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=C LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] BiocInstaller_1.12.0 > > loaded via a namespace (and not attached): > [1] AnnotationDbi_1.24.0 Biobase_2.22.0 > BiocGenerics_0.8.0 > [4] biomaRt_2.18.0 Biostrings_2.30.1 bitops_1.0-6 > [7] BSgenome_1.30.0 DBI_0.2-7 > GenomicFeatures_1.14.2 > [10] GenomicRanges_1.14.3 IRanges_1.20.6 parallel_3.0.0 > [13] RCurl_1.95-4.1 Rsamtools_1.14.2 RSQLite_0.11.4 > [16] rtracklayer_1.22.0 stats4_3.0.0 tools_3.0.0 > [19] XML_3.98-1.1 XVector_0.2.0 zlibbioc_1.8.0 > > > _________________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org=""> > https://stat.ethz.ch/mailman/__listinfo/bioconductor > <https: stat.ethz.ch="" mailman="" listinfo="" bioconductor=""> > Search the archives: > http://news.gmane.org/gmane.__science.biology.informatics.__conductor <http: news.gmane.org="" gmane.science.biology.informatics.conductor=""> > > > > > -- > Reema Singh > PhD Scholar > Computational Biology and Bioinformatics > School of Computational and Integrative Sciences > Jawaharlal Nehru University > New Delhi-110067 > INDIA

ADD REPLY • link 12.1 years ago Valerie Obenchain ★ 6.8k

0

Entering edit mode

Hi all, Indeed summarizeOverlaps() works on paired end data. But different counting modes [Union, Intersection and IntersectionNotEmpty] they do not handle paired end reads. Here the reference [ http://www.bioconductor.org/packages/release/bioc/vignettes/GenomicRan ges/inst/doc/summarizeOverlaps.pdf]-Page 2 [under Counting Modes]. Kind Regards On Sun, Dec 8, 2013 at 6:31 PM, Valerie Obenchain <vobencha@fhcrc.org>wrote: > summarizeOverlaps() does count paired-end reads. Use option > singleEnd=FALSE. Details under 'singleEnd' in the 'Arguments' section on > the man page. > > library(GenomicAlignments) ## devel > library(GenomicRanges) ## release > ?summarizeOverlaps > > Valerie > > > On 12/08/13 09:42, Reema Singh wrote: > >> Hi Sam, >> >> It depends on whether you are using single read or paired end reads >> files. As far as I known summarizeOverlaps function only working in case >> of single read. Anyone please correct me if I am wrong. >> >> Kind Regards >> >> >> On Sun, Dec 8, 2013 at 8:25 PM, Valerie Obenchain <vobencha@fhcrc.org>> <mailto:vobencha@fhcrc.org>> wrote: >> >> Hi Sam, >> >> Are you getting any error messages related to seqlevels when you >> count? If you have confirmed that annotation seqlevels match the bam >> files next look at the maximum possible overlap in a single bam file. >> >> bf <- singleBamFile >> reads <- readGAlignments(bf) ## assuming single-end reads >> so <- summarizeOverlaps(tx, reads, mode=Union, inter.feature=FALSE) >> >> 'inter.feature=FALSE' with 'mode=Union' is countOverlaps with >> 'type=ANY'. Reads that hit multiple features will be double counted >> in this case. The idea to to see if there are any common regions at >> all. >> >> If there are still no counts, you could look more closely at a >> smaller chromosome region in 'reads' where you would expect counts. >> This visual inspection might give you some insight as to why there >> is no overlap. >> >> >> Valerie >> >> >> On 12/05/13 18:51, Sam McInturf wrote: >> >> Hello Bioconductors, >> I am working on an Arabidopsis RNA seq set, and after >> executing my count >> reads script, I get a count table with no reads counted (ie >> sum(colSums(mat)) = 0). >> >> I mapped my reads using Tophat2 (without supplying a gtf/gff >> file) and used >> samtools to sort and index my accepted_hits.bam files. I loaded >> these bam >> files into IGV (integrated Genome Browser) and the reads appear >> to have >> been mapped (additionally the align_summary.txt says I have good >> mapping). >> Script and session info below. >> >> Essentially I use Rsamtools to load in my bamfiles, and then >> makeTranscriptsFromGFF to make a txdb object, transcriptsBy to get >> transcripts, and then summarizeOverlaps to count the reads. My >> gff file is >> TAIR10. I am relatively sure that the genome I mapped to was >> the TAIR10 >> release, but I am not 100% on that. >> >> I have also used biomaRt to do this (commented out), but I had >> the same >> results. >> >> I am currently upgrading to R 3.0.2 and re-installing >> bioconductor (maybe >> it will change something?) >> >> Thanks for any thoughts! >> >> Script: >> >> gffdir <- "/data/smcintur/Annotation/__TAIR10_GFF3_genes.gff"; >> library(ShortRead) >> library(GenomicFeatures) >> library(Rsamtools) >> #library(biomaRt) >> setwd("/data/smcintur/RNASeq/__NMseq/newTophat/BamFiles/") >> >> bf <- list.files(pattern = ".bam$") >> bi <- list.files(pattern = "bam.bai$") >> bfl <- BamFileList(bf, bi) >> >> #mart <- makeTranscriptDbFromBiomart(__biomart = >> >> "Public_TAIRV10", dataset= >> "tairv10") >> tx <- transcriptsBy(mart) >> gff <- makeTranscriptDbFromGFF(__gffdir, format = "gff") >> >> gff >> >> cds <- cdsBy(gff) >> tx <- transcriptsBy(gff) >> >> olapTx <- summarizeOverlaps(tx, bfl) >> olapCds <- summarizeOverlaps(cds, bfl) >> >> txMat <- assays(olapTx)$counts >> cdsMat <- assays(olapCds)$counts >> >> write.table(txMat, file = "countMatTxGff.txt", sep = "\t") >> write.table(cdsMat, file = "countMatCdsGff.txt", sep = "\t") >> >> sessionInfo >> R version 3.0.0 (2013-04-03) >> Platform: x86_64-unknown-linux-gnu (64-bit) >> >> locale: >> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 >> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 >> [7] LC_PAPER=C LC_NAME=C >> [9] LC_ADDRESS=C LC_TELEPHONE=C >> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods >> base >> >> other attached packages: >> [1] BiocInstaller_1.12.0 >> >> loaded via a namespace (and not attached): >> [1] AnnotationDbi_1.24.0 Biobase_2.22.0 >> BiocGenerics_0.8.0 >> [4] biomaRt_2.18.0 Biostrings_2.30.1 bitops_1.0-6 >> [7] BSgenome_1.30.0 DBI_0.2-7 >> GenomicFeatures_1.14.2 >> [10] GenomicRanges_1.14.3 IRanges_1.20.6 parallel_3.0.0 >> [13] RCurl_1.95-4.1 Rsamtools_1.14.2 RSQLite_0.11.4 >> [16] rtracklayer_1.22.0 stats4_3.0.0 tools_3.0.0 >> [19] XML_3.98-1.1 XVector_0.2.0 zlibbioc_1.8.0 >> >> >> _________________________________________________ >> Bioconductor mailing list >> Bioconductor@r-project.org <mailto:bioconductor@r-project.org> >> https://stat.ethz.ch/mailman/__listinfo/bioconductor >> >> <https: stat.ethz.ch="" mailman="" listinfo="" bioconductor=""> >> Search the archives: >> http://news.gmane.org/gmane.__science.biology.informatics.__conductor< >> http://news.gmane.org/gmane.science.biology.informatics.conductor> >> >> >> >> >> >> -- >> Reema Singh >> PhD Scholar >> Computational Biology and Bioinformatics >> School of Computational and Integrative Sciences >> Jawaharlal Nehru University >> New Delhi-110067 >> INDIA >> > > -- Reema Singh Postdoctoral Research Assistant College of Life Sciences University of Dundee, Dundee DD1 4HN, Scotland United Kingdom [[alternative HTML version deleted]]

ADD REPLY • link 12.1 years ago Reema Singh ▴ 570

0

Entering edit mode

Reema, Thank you for pointing out the error in the summarizeOverlaps vignette. That sentence was a leftover from a previous version and should have been removed. I have updated the vignette in both release (GenomicRanges 1.14.4) and devel (GenomicAlignments 0.99.8). When the 'reads' argument is a GAlignmentPairs object the stand-alone functions of Union(), IntersectionStrict() and IntersectionNotEmpty() count the data as paired-end. If 'reads' is a GAlignments object they are counted as single-end. summarizeOverlaps() reads the data into a GAlignments or GAlignmentPairs object and calls these functions internally; they are exactly the same. Valerie On 12/09/2013 02:46 AM, Reema Singh wrote: > Hi all, > > Indeed summarizeOverlaps() works on paired end data. But different > counting modes [Union, Intersection and IntersectionNotEmpty] they do > not handle paired end reads. Here the reference > [http://www.bioconductor.org/packages/release/bioc/vignettes/Genomic Ranges/inst/doc/summarizeOverlaps.pdf]- > Page 2 [under Counting Modes]. > > Kind Regards > > > > On Sun, Dec 8, 2013 at 6:31 PM, Valerie Obenchain <vobencha at="" fhcrc.org=""> <mailto:vobencha at="" fhcrc.org="">> wrote: > > summarizeOverlaps() does count paired-end reads. Use option > singleEnd=FALSE. Details under 'singleEnd' in the 'Arguments' > section on the man page. > > library(GenomicAlignments) ## devel > library(GenomicRanges) ## release > ?summarizeOverlaps > > Valerie > > > On 12/08/13 09:42, Reema Singh wrote: > > Hi Sam, > > It depends on whether you are using single read or paired end reads > files. As far as I known summarizeOverlaps function only working > in case > of single read. Anyone please correct me if I am wrong. > > Kind Regards > > > On Sun, Dec 8, 2013 at 8:25 PM, Valerie Obenchain > <vobencha at="" fhcrc.org="" <mailto:vobencha="" at="" fhcrc.org=""> > <mailto:vobencha at="" fhcrc.org="" <mailto:vobencha="" at="" fhcrc.org="">>> wrote: > > Hi Sam, > > Are you getting any error messages related to seqlevels > when you > count? If you have confirmed that annotation seqlevels > match the bam > files next look at the maximum possible overlap in a single > bam file. > > bf <- singleBamFile > reads <- readGAlignments(bf) ## assuming single-end reads > so <- summarizeOverlaps(tx, reads, mode=Union, > inter.feature=FALSE) > > 'inter.feature=FALSE' with 'mode=Union' is countOverlaps with > 'type=ANY'. Reads that hit multiple features will be double > counted > in this case. The idea to to see if there are any common > regions at all. > > If there are still no counts, you could look more closely at a > smaller chromosome region in 'reads' where you would expect > counts. > This visual inspection might give you some insight as to > why there > is no overlap. > > > Valerie > > > On 12/05/13 18:51, Sam McInturf wrote: > > Hello Bioconductors, > I am working on an Arabidopsis RNA seq set, and after > executing my count > reads script, I get a count table with no reads counted (ie > sum(colSums(mat)) = 0). > > I mapped my reads using Tophat2 (without supplying a > gtf/gff > file) and used > samtools to sort and index my accepted_hits.bam files. > I loaded > these bam > files into IGV (integrated Genome Browser) and the > reads appear > to have > been mapped (additionally the align_summary.txt says I > have good > mapping). > Script and session info below. > > Essentially I use Rsamtools to load in my bamfiles, and > then > makeTranscriptsFromGFF to make a txdb object, > transcriptsBy to get > transcripts, and then summarizeOverlaps to count the > reads. My > gff file is > TAIR10. I am relatively sure that the genome I mapped > to was > the TAIR10 > release, but I am not 100% on that. > > I have also used biomaRt to do this (commented out), > but I had > the same > results. > > I am currently upgrading to R 3.0.2 and re- installing > bioconductor (maybe > it will change something?) > > Thanks for any thoughts! > > Script: > > gffdir <- > "/data/smcintur/Annotation/____TAIR10_GFF3_genes.gff"; > library(ShortRead) > library(GenomicFeatures) > library(Rsamtools) > #library(biomaRt) > > setwd("/data/smcintur/RNASeq/____NMseq/newTophat/BamFiles/") > > bf <- list.files(pattern = ".bam$") > bi <- list.files(pattern = "bam.bai$") > bfl <- BamFileList(bf, bi) > > #mart <- makeTranscriptDbFromBiomart(____biomart = > > "Public_TAIRV10", dataset= > "tairv10") > tx <- transcriptsBy(mart) > gff <- makeTranscriptDbFromGFF(____gffdir, format = "gff") > > gff > > cds <- cdsBy(gff) > tx <- transcriptsBy(gff) > > olapTx <- summarizeOverlaps(tx, bfl) > olapCds <- summarizeOverlaps(cds, bfl) > > txMat <- assays(olapTx)$counts > cdsMat <- assays(olapCds)$counts > > write.table(txMat, file = "countMatTxGff.txt", sep = "\t") > write.table(cdsMat, file = "countMatCdsGff.txt", sep = > "\t") > > sessionInfo > R version 3.0.0 (2013-04-03) > Platform: x86_64-unknown-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=C LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets > methods base > > other attached packages: > [1] BiocInstaller_1.12.0 > > loaded via a namespace (and not attached): > [1] AnnotationDbi_1.24.0 Biobase_2.22.0 > BiocGenerics_0.8.0 > [4] biomaRt_2.18.0 Biostrings_2.30.1 > bitops_1.0-6 > [7] BSgenome_1.30.0 DBI_0.2-7 > GenomicFeatures_1.14.2 > [10] GenomicRanges_1.14.3 IRanges_1.20.6 > parallel_3.0.0 > [13] RCurl_1.95-4.1 Rsamtools_1.14.2 > RSQLite_0.11.4 > [16] rtracklayer_1.22.0 stats4_3.0.0 > tools_3.0.0 > [19] XML_3.98-1.1 XVector_0.2.0 > zlibbioc_1.8.0 > > > ___________________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org=""> > <mailto:bioconductor at="" r-__project.org=""> <mailto:bioconductor at="" r-project.org="">> > https://stat.ethz.ch/mailman/____listinfo/bioconductor > <https: stat.ethz.ch="" mailman="" __listinfo="" bioconductor=""> > > <https: stat.ethz.ch="" mailman="" __listinfo="" bioconductor=""> <https: stat.ethz.ch="" mailman="" listinfo="" bioconductor="">> > Search the archives: > http://news.gmane.org/gmane.____science.biology.informatics. ____conductor > <http: news.gmane.org="" gmane.__science.biology.informatics._="" _conductor=""> > <http: news.gmane.org="" gmane.__science.biology.informatics.__conductor=""> <http: news.gmane.org="" gmane.science.biology.informatics.conductor="">> > > > > > > -- > Reema Singh > PhD Scholar > Computational Biology and Bioinformatics > School of Computational and Integrative Sciences > Jawaharlal Nehru University > New Delhi-110067 > INDIA > > > > > > -- > Reema Singh > Postdoctoral Research Assistant > College of Life Sciences > University of Dundee, > Dundee DD1 4HN, Scotland > United Kingdom -- Valerie Obenchain Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M1-B155 P.O. Box 19024 Seattle, WA 98109-1024 E-mail: vobencha at fhcrc.org Phone: (206) 667-3158 Fax: (206) 667-1319

ADD REPLY • link 12.1 years ago Valerie Obenchain ★ 6.8k

0

Entering edit mode

Valerie and Reema, I am using single end reads. I did as Valerie suggested so <- summarizeOverlaps(tx, reads, mode="Union",type = "any") colSums(assays(so)$counts #185,940 which feels silly, allow for double counts and return fewer counts than prior P.S. thank you for your prompt reply, my email "hid" it from me Best Sam On Mon, Dec 9, 2013 at 6:08 PM, Valerie Obenchain <vobencha@fhcrc.org>wrote: > Reema, > > Thank you for pointing out the error in the summarizeOverlaps vignette. > That sentence was a leftover from a previous version and should have been > removed. I have updated the vignette in both release (GenomicRanges 1.14.4) > and devel (GenomicAlignments 0.99.8). > > When the 'reads' argument is a GAlignmentPairs object the stand- alone > functions of Union(), IntersectionStrict() and IntersectionNotEmpty() count > the data as paired-end. If 'reads' is a GAlignments object they are counted > as single-end. summarizeOverlaps() reads the data into a GAlignments or > GAlignmentPairs object and calls these functions internally; they are > exactly the same. > > Valerie > > > > On 12/09/2013 02:46 AM, Reema Singh wrote: > >> Hi all, >> >> Indeed summarizeOverlaps() works on paired end data. But different >> counting modes [Union, Intersection and IntersectionNotEmpty] they do >> not handle paired end reads. Here the reference >> [http://www.bioconductor.org/packages/release/bioc/ >> vignettes/GenomicRanges/inst/doc/summarizeOverlaps.pdf]- >> Page 2 [under Counting Modes]. >> >> Kind Regards >> >> >> >> On Sun, Dec 8, 2013 at 6:31 PM, Valerie Obenchain <vobencha@fhcrc.org>> <mailto:vobencha@fhcrc.org>> wrote: >> >> summarizeOverlaps() does count paired-end reads. Use option >> singleEnd=FALSE. Details under 'singleEnd' in the 'Arguments' >> section on the man page. >> >> library(GenomicAlignments) ## devel >> library(GenomicRanges) ## release >> ?summarizeOverlaps >> >> Valerie >> >> >> On 12/08/13 09:42, Reema Singh wrote: >> >> Hi Sam, >> >> It depends on whether you are using single read or paired end >> reads >> files. As far as I known summarizeOverlaps function only working >> in case >> of single read. Anyone please correct me if I am wrong. >> >> Kind Regards >> >> >> On Sun, Dec 8, 2013 at 8:25 PM, Valerie Obenchain >> <vobencha@fhcrc.org <mailto:vobencha@fhcrc.org=""> >> <mailto:vobencha@fhcrc.org <mailto:vobencha@fhcrc.org="">>> wrote: >> >> Hi Sam, >> >> Are you getting any error messages related to seqlevels >> when you >> count? If you have confirmed that annotation seqlevels >> match the bam >> files next look at the maximum possible overlap in a single >> bam file. >> >> bf <- singleBamFile >> reads <- readGAlignments(bf) ## assuming single- end reads >> so <- summarizeOverlaps(tx, reads, mode=Union, >> inter.feature=FALSE) >> >> 'inter.feature=FALSE' with 'mode=Union' is countOverlaps with >> 'type=ANY'. Reads that hit multiple features will be double >> counted >> in this case. The idea to to see if there are any common >> regions at all. >> >> If there are still no counts, you could look more closely at >> a >> smaller chromosome region in 'reads' where you would expect >> counts. >> This visual inspection might give you some insight as to >> why there >> is no overlap. >> >> >> Valerie >> >> >> On 12/05/13 18:51, Sam McInturf wrote: >> >> Hello Bioconductors, >> I am working on an Arabidopsis RNA seq set, and >> after >> executing my count >> reads script, I get a count table with no reads counted >> (ie >> sum(colSums(mat)) = 0). >> >> I mapped my reads using Tophat2 (without supplying a >> gtf/gff >> file) and used >> samtools to sort and index my accepted_hits.bam files. >> I loaded >> these bam >> files into IGV (integrated Genome Browser) and the >> reads appear >> to have >> been mapped (additionally the align_summary.txt says I >> have good >> mapping). >> Script and session info below. >> >> Essentially I use Rsamtools to load in my bamfiles, and >> then >> makeTranscriptsFromGFF to make a txdb object, >> transcriptsBy to get >> transcripts, and then summarizeOverlaps to count the >> reads. My >> gff file is >> TAIR10. I am relatively sure that the genome I mapped >> to was >> the TAIR10 >> release, but I am not 100% on that. >> >> I have also used biomaRt to do this (commented out), >> but I had >> the same >> results. >> >> I am currently upgrading to R 3.0.2 and re- installing >> bioconductor (maybe >> it will change something?) >> >> Thanks for any thoughts! >> >> Script: >> >> gffdir <- >> "/data/smcintur/Annotation/____TAIR10_GFF3_genes.gff"; >> library(ShortRead) >> library(GenomicFeatures) >> library(Rsamtools) >> #library(biomaRt) >> >> setwd("/data/smcintur/RNASeq/____NMseq/newTophat/BamFiles/") >> >> >> bf <- list.files(pattern = ".bam$") >> bi <- list.files(pattern = "bam.bai$") >> bfl <- BamFileList(bf, bi) >> >> #mart <- makeTranscriptDbFromBiomart(____biomart = >> >> >> "Public_TAIRV10", dataset= >> "tairv10") >> tx <- transcriptsBy(mart) >> gff <- makeTranscriptDbFromGFF(____gffdir, format = >> "gff") >> >> >> gff >> >> cds <- cdsBy(gff) >> tx <- transcriptsBy(gff) >> >> olapTx <- summarizeOverlaps(tx, bfl) >> olapCds <- summarizeOverlaps(cds, bfl) >> >> txMat <- assays(olapTx)$counts >> cdsMat <- assays(olapCds)$counts >> >> write.table(txMat, file = "countMatTxGff.txt", sep = >> "\t") >> write.table(cdsMat, file = "countMatCdsGff.txt", sep = >> "\t") >> >> sessionInfo >> R version 3.0.0 (2013-04-03) >> Platform: x86_64-unknown-linux-gnu (64-bit) >> >> locale: >> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 >> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 >> [7] LC_PAPER=C LC_NAME=C >> [9] LC_ADDRESS=C LC_TELEPHONE=C >> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >> >> attached base packages: >> [1] stats graphics grDevices utils datasets >> methods base >> >> other attached packages: >> [1] BiocInstaller_1.12.0 >> >> loaded via a namespace (and not attached): >> [1] AnnotationDbi_1.24.0 Biobase_2.22.0 >> BiocGenerics_0.8.0 >> [4] biomaRt_2.18.0 Biostrings_2.30.1 >> bitops_1.0-6 >> [7] BSgenome_1.30.0 DBI_0.2-7 >> GenomicFeatures_1.14.2 >> [10] GenomicRanges_1.14.3 IRanges_1.20.6 >> parallel_3.0.0 >> [13] RCurl_1.95-4.1 Rsamtools_1.14.2 >> RSQLite_0.11.4 >> [16] rtracklayer_1.22.0 stats4_3.0.0 >> tools_3.0.0 >> [19] XML_3.98-1.1 XVector_0.2.0 >> zlibbioc_1.8.0 >> >> >> ___________________________________________________ >> >> Bioconductor mailing list >> Bioconductor@r-project.org <mailto:bioconductor@r-project.org> >> <mailto:bioconductor@r-__project.org>> <mailto:bioconductor@r-project.org>> >> https://stat.ethz.ch/mailman/____listinfo/bioconductor >> <https: stat.ethz.ch="" mailman="" __listinfo="" bioconductor=""> >> >> >> <https: stat.ethz.ch="" mailman="" __listinfo="" bioconductor="">> <https: stat.ethz.ch="" mailman="" listinfo="" bioconductor="">> >> Search the archives: >> http://news.gmane.org/gmane.____science.biology.informatics. >> ____conductor >> <http: news.gmane.org="" gmane.__science.biology.informatics._="">> _conductor> >> >> <http: news.gmane.org="" gmane.__science.biology.informatics._="">> _conductor >> <http: news.gmane.org="" gmane.science.biology.informatics.="">> conductor>> >> >> >> >> >> >> -- >> Reema Singh >> PhD Scholar >> Computational Biology and Bioinformatics >> School of Computational and Integrative Sciences >> Jawaharlal Nehru University >> New Delhi-110067 >> INDIA >> >> >> >> >> >> -- >> Reema Singh >> Postdoctoral Research Assistant >> College of Life Sciences >> University of Dundee, >> Dundee DD1 4HN, Scotland >> United Kingdom >> > > > -- > Valerie Obenchain > > Program in Computational Biology > Division of Public Health Sciences > Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N, M1-B155 > P.O. Box 19024 > Seattle, WA 98109-1024 > > E-mail: vobencha@fhcrc.org > Phone: (206) 667-3158 > Fax: (206) 667-1319 > -- Sam McInturf [[alternative HTML version deleted]]

ADD REPLY • link 12.1 years ago Sam McInturf ▴ 300

0

Entering edit mode

On 12/21/13 05:14, Sam McInturf wrote: > Valerie and Reema, > I am using single end reads. > > I did as Valerie suggested > > so <- summarizeOverlaps(tx, reads, mode="Union",type = "any") This is not the example I used below. You did not include 'inter.feature=FALSE'. 'type' is not an argument to summarizeOverlaps(). In other words, do not use 'type', use 'inter.feature'. Please see my previous explanation and the man page for summarizeOverlaps(). Valerie > colSums(assays(so)$counts > #185,940 > > which feels silly, allow for double counts and return fewer counts than > prior > > P.S. thank you for your prompt reply, my email "hid" it from me > Best > > Sam > > > > On Mon, Dec 9, 2013 at 6:08 PM, Valerie Obenchain <vobencha at="" fhcrc.org=""> <mailto:vobencha at="" fhcrc.org="">> wrote: > > Reema, > > Thank you for pointing out the error in the summarizeOverlaps > vignette. That sentence was a leftover from a previous version and > should have been removed. I have updated the vignette in both > release (GenomicRanges 1.14.4) and devel (GenomicAlignments 0.99.8). > > When the 'reads' argument is a GAlignmentPairs object the > stand-alone functions of Union(), IntersectionStrict() and > IntersectionNotEmpty() count the data as paired-end. If 'reads' is a > GAlignments object they are counted as single-end. > summarizeOverlaps() reads the data into a GAlignments or > GAlignmentPairs object and calls these functions internally; they > are exactly the same. > > Valerie > > > > On 12/09/2013 02:46 AM, Reema Singh wrote: > > Hi all, > > Indeed summarizeOverlaps() works on paired end data. But different > counting modes [Union, Intersection and IntersectionNotEmpty] > they do > not handle paired end reads. Here the reference > [http://www.bioconductor.org/__packages/release/bioc/__vigne ttes/GenomicRanges/inst/__doc/summarizeOverlaps.pdf]- > <http: www.bioconductor.org="" packages="" release="" bioc="" vignettes="" genomicranges="" inst="" doc="" summarizeoverlaps.pdf%5d-=""> > Page 2 [under Counting Modes]. > > Kind Regards > > > > On Sun, Dec 8, 2013 at 6:31 PM, Valerie Obenchain > <vobencha at="" fhcrc.org="" <mailto:vobencha="" at="" fhcrc.org=""> > <mailto:vobencha at="" fhcrc.org="" <mailto:vobencha="" at="" fhcrc.org="">>> wrote: > > summarizeOverlaps() does count paired-end reads. Use option > singleEnd=FALSE. Details under 'singleEnd' in the 'Arguments' > section on the man page. > > library(GenomicAlignments) ## devel > library(GenomicRanges) ## release > ?summarizeOverlaps > > Valerie > > > On 12/08/13 09:42, Reema Singh wrote: > > Hi Sam, > > It depends on whether you are using single read or > paired end reads > files. As far as I known summarizeOverlaps function > only working > in case > of single read. Anyone please correct me if I am wrong. > > Kind Regards > > > On Sun, Dec 8, 2013 at 8:25 PM, Valerie Obenchain > <vobencha at="" fhcrc.org="" <mailto:vobencha="" at="" fhcrc.org=""> > <mailto:vobencha at="" fhcrc.org="" <mailto:vobencha="" at="" fhcrc.org="">> > <mailto:vobencha at="" fhcrc.org="" <mailto:vobencha="" at="" fhcrc.org=""> > <mailto:vobencha at="" fhcrc.org="" <mailto:vobencha="" at="" fhcrc.org="">>>> wrote: > > Hi Sam, > > Are you getting any error messages related to > seqlevels > when you > count? If you have confirmed that annotation seqlevels > match the bam > files next look at the maximum possible overlap in > a single > bam file. > > bf <- singleBamFile > reads <- readGAlignments(bf) ## assuming > single-end reads > so <- summarizeOverlaps(tx, reads, mode=Union, > inter.feature=FALSE) > > 'inter.feature=FALSE' with 'mode=Union' is > countOverlaps with > 'type=ANY'. Reads that hit multiple features will > be double > counted > in this case. The idea to to see if there are any > common > regions at all. > > If there are still no counts, you could look more > closely at a > smaller chromosome region in 'reads' where you > would expect > counts. > This visual inspection might give you some insight > as to > why there > is no overlap. > > > Valerie > > > On 12/05/13 18:51, Sam McInturf wrote: > > Hello Bioconductors, > I am working on an Arabidopsis RNA seq > set, and after > executing my count > reads script, I get a count table with no > reads counted (ie > sum(colSums(mat)) = 0). > > I mapped my reads using Tophat2 (without > supplying a > gtf/gff > file) and used > samtools to sort and index my > accepted_hits.bam files. > I loaded > these bam > files into IGV (integrated Genome Browser) and the > reads appear > to have > been mapped (additionally the > align_summary.txt says I > have good > mapping). > Script and session info below. > > Essentially I use Rsamtools to load in my > bamfiles, and > then > makeTranscriptsFromGFF to make a txdb object, > transcriptsBy to get > transcripts, and then summarizeOverlaps to > count the > reads. My > gff file is > TAIR10. I am relatively sure that the genome > I mapped > to was > the TAIR10 > release, but I am not 100% on that. > > I have also used biomaRt to do this (commented > out), > but I had > the same > results. > > I am currently upgrading to R 3.0.2 and > re-installing > bioconductor (maybe > it will change something?) > > Thanks for any thoughts! > > Script: > > gffdir <- > "/data/smcintur/Annotation/______TAIR10_GFF3_genes.gff"; > library(ShortRead) > library(GenomicFeatures) > library(Rsamtools) > #library(biomaRt) > > > setwd("/data/smcintur/RNASeq/______NMseq/newTophat/BamFiles/") > > > bf <- list.files(pattern = ".bam$") > bi <- list.files(pattern = "bam.bai$") > bfl <- BamFileList(bf, bi) > > #mart <- > makeTranscriptDbFromBiomart(______biomart = > > > "Public_TAIRV10", dataset= > "tairv10") > tx <- transcriptsBy(mart) > gff <- makeTranscriptDbFromGFF(______gffdir, > format = "gff") > > > gff > > cds <- cdsBy(gff) > tx <- transcriptsBy(gff) > > olapTx <- summarizeOverlaps(tx, bfl) > olapCds <- summarizeOverlaps(cds, bfl) > > txMat <- assays(olapTx)$counts > cdsMat <- assays(olapCds)$counts > > write.table(txMat, file = "countMatTxGff.txt", > sep = "\t") > write.table(cdsMat, file = > "countMatCdsGff.txt", sep = > "\t") > > sessionInfo > R version 3.0.0 (2013-04-03) > Platform: x86_64-unknown-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 > LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 > LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=C LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 > LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils > datasets > methods base > > other attached packages: > [1] BiocInstaller_1.12.0 > > loaded via a namespace (and not attached): > [1] AnnotationDbi_1.24.0 Biobase_2.22.0 > BiocGenerics_0.8.0 > [4] biomaRt_2.18.0 Biostrings_2.30.1 > bitops_1.0-6 > [7] BSgenome_1.30.0 DBI_0.2-7 > GenomicFeatures_1.14.2 > [10] GenomicRanges_1.14.3 IRanges_1.20.6 > parallel_3.0.0 > [13] RCurl_1.95-4.1 Rsamtools_1.14.2 > RSQLite_0.11.4 > [16] rtracklayer_1.22.0 stats4_3.0.0 > tools_3.0.0 > [19] XML_3.98-1.1 XVector_0.2.0 > zlibbioc_1.8.0 > > > _____________________________________________________ > > Bioconductor mailing list > Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org=""> > <mailto:bioconductor at="" r-__project.org=""> <mailto:bioconductor at="" r-project.org="">> > <mailto:bioconductor at="" r-____project.org=""> <mailto:bioconductor at="" r-__project.org=""> > <mailto:bioconductor at="" r-__project.org=""> <mailto:bioconductor at="" r-project.org="">>> > https://stat.ethz.ch/mailman/______listinfo/bioconductor > <https: stat.ethz.ch="" mailman="" ____listinfo="" bioconductor=""> > <https: stat.ethz.ch="" mailman="" ____listinfo="" bioconductor=""> <https: stat.ethz.ch="" mailman="" __listinfo="" bioconductor="">> > > > > <https: stat.ethz.ch="" mailman="" ____listinfo="" bioconductor=""> <https: stat.ethz.ch="" mailman="" __listinfo="" bioconductor=""> > <https: stat.ethz.ch="" mailman="" __listinfo="" bioconductor=""> <https: stat.ethz.ch="" mailman="" listinfo="" bioconductor="">>> > Search the archives: > http://news.gmane.org/gmane.______science.biology.informatic s.______conductor > <http: news.gmane.org="" gmane.____science.biology.informatics="" .____conductor=""> > > <http: news.gmane.org="" gmane.____science.biology.informatics="" .____conductor=""> <http: news.gmane.org="" gmane.__science.biology.informatics._="" _conductor="">> > > > <http: news.gmane.org="" gmane.____science.biology.informatics="" .____conductor=""> <http: news.gmane.org="" gmane.__science.biology.informatics._="" _conductor=""> > > <http: news.gmane.org="" gmane.__science.biology.informatics.__conductor=""> <http: news.gmane.org="" gmane.science.biology.informatics.conductor="">>> > > > > > > -- > Reema Singh > PhD Scholar > Computational Biology and Bioinformatics > School of Computational and Integrative Sciences > Jawaharlal Nehru University > New Delhi-110067 > INDIA > > > > > > -- > Reema Singh > Postdoctoral Research Assistant > College of Life Sciences > University of Dundee, > Dundee DD1 4HN, Scotland > United Kingdom > > > > -- > Valerie Obenchain > > Program in Computational Biology > Division of Public Health Sciences > Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N, M1-B155 > P.O. Box 19024 > Seattle, WA 98109-1024 > > E-mail: vobencha at fhcrc.org <mailto:vobencha at="" fhcrc.org=""> > Phone: (206) 667-3158 <tel:%28206%29%20667-3158> > Fax: (206) 667-1319 <tel:%28206%29%20667-1319> > > > > > -- > Sam McInturf

ADD REPLY • link 12.1 years ago Valerie Obenchain ★ 6.8k

0

Entering edit mode

Valerie, Sorry for the lame error, was reading your comments about countOverlaps. I found my problem in that the features data had strand information, while my sequencing was not stranded, so by setting the strand information to "*" in the GRanges object and then reduce(split()) I was able to map succesfully Thanks for the help! On Saturday, December 21, 2013, Valerie Obenchain wrote: > On 12/21/13 05:14, Sam McInturf wrote: > >> Valerie and Reema, >> I am using single end reads. >> >> I did as Valerie suggested >> >> so <- summarizeOverlaps(tx, reads, mode="Union",type = "any") >> > > This is not the example I used below. You did not include > 'inter.feature=FALSE'. 'type' is not an argument to summarizeOverlaps(). In > other words, do not use 'type', use 'inter.feature'. Please see my previous > explanation and the man page for summarizeOverlaps(). > > > Valerie > > > colSums(assays(so)$counts >> #185,940 >> >> which feels silly, allow for double counts and return fewer counts than >> prior >> >> P.S. thank you for your prompt reply, my email "hid" it from me >> Best >> >> Sam >> >> >> >> On Mon, Dec 9, 2013 at 6:08 PM, Valerie Obenchain <vobencha@fhcrc.org>> <mailto:vobencha@fhcrc.org>> wrote: >> >> Reema, >> >> Thank you for pointing out the error in the summarizeOverlaps >> vignette. That sentence was a leftover from a previous version and >> should have been removed. I have updated the vignette in both >> release (GenomicRanges 1.14.4) and devel (GenomicAlignments 0.99.8). >> >> When the 'reads' argument is a GAlignmentPairs object the >> stand-alone functions of Union(), IntersectionStrict() and >> IntersectionNotEmpty() count the data as paired-end. If 'reads' is a >> GAlignments object they are counted as single-end. >> summarizeOverlaps() reads the data into a GAlignments or >> GAlignmentPairs object and calls these functions internally; they >> are exactly the same. >> >> Valerie >> >> >> >> On 12/09/2013 02:46 AM, Reema Singh wrote: >> >> Hi all, >> >> Indeed summarizeOverlaps() works on paired end data. But different >> counting modes [Union, Intersection and IntersectionNotEmpty] >> they do >> not handle paired end reads. Here the reference >> [http://www.bioconductor.org/__packages/release/bioc/__ >> vignettes/GenomicRanges/inst/__doc/summarizeOverlaps.pdf]- >> <http: www.bioconductor.org="" packages="" release="" bioc="">> vignettes/GenomicRanges/inst/doc/summarizeOverlaps.pdf%5D-> >> Page 2 [under Counting Modes]. >> >> Kind Regards >> >> >> >> On Sun, Dec 8, 2013 at 6:31 PM, Valerie Obenchain >> <vobencha@fhcrc.org <mailto:vobencha@fhcrc.org=""> >> <mailto:vobencha@fhcrc.org <mailto:vobencha@fhcrc.org="">>> wrote: >> >> summarizeOverlaps() does count paired-end reads. Use option >> singleEnd=FALSE. Details under 'singleEnd' in the 'Arguments' >> section on the man page. >> >> library(GenomicAlignments) ## devel >> library(GenomicRanges) ## release >> ?summarizeOverlaps >> >> Valerie >> >> >> On 12/08/13 09:42, Reema Singh wrote: >> >> Hi Sam, >> >> It depends on whether you are using single read or >> paired end reads >> files. As far as I known summarizeOverlaps function >> only working >> in case >> of single read. Anyone please correct me if I am wrong. >> >> Kind Regards >> >> >> On Sun, Dec 8, 2013 at 8:25 PM, Valerie Obenchain >> <vobencha@fhcrc.org <mailto:vobencha@fhcrc.org=""> >> <mailto:vobencha@fhcrc.org <mailto:vobencha@fhcrc.org="">> >> <mailto:vobencha@fhcrc.org <mailto:vobencha@fhcrc.org=""> >> <mailto:vobencha@fhcrc.org <mailto:vobencha@fhcrc.org="">>>> wrote: >> >> Hi Sam, >> >> Are you getting any error messages related to >> seqlevels >> when you >> count? If you have confirmed that annotation >> seqlevels >> match the bam >> files next look at the maximum possible overlap in >> a single >> bam file. >> >> bf <- singleBamFile >> reads <- readGAlignments(bf) ## assuming >> single-end reads >> so <- summarizeOverlaps(tx, reads, mode=Union, >> inter.feature=FALSE) >> >> 'inter.feature=FALSE' with 'mode=Union' is >> countOverlaps with >> 'type=ANY'. Reads that hit multiple features will >> be double >> counted >> in this case. The idea to to see if there are any >> common >> regions at all. >> >> If there are still no counts, you could look more >> closely at a >> smaller chromosome region in 'reads' where you >> would expect >> counts. >> This visual inspection might give you some insight >> as to >> why there >> is no overlap. >> >> >> Valerie >> >> >> On 12/05/13 18:51, Sam McInturf wrote: >> >> Hello Bioconductors, >> I am working on an Arabidopsis RNA seq >> set, and after >> executing my count >> reads script, I get a count table with no >> reads counted (ie >> sum(colSums(mat)) = 0). >> >> I mapped my reads using Tophat2 (without >> supplying a >> gtf/gff >> file) and used >> samtools to sort and index my >> accepted_hits.bam files. >> I loaded >> these bam >> files into IGV (integrated Genome Browser) and >> the >> reads appear >> to have >> been mapped (additionally the >> align_summary.txt says I >> have good >> mapping). >> Script and session info below. >> >> Essentially I use Rsamtools to load in my >> bamfiles, and >> then >> makeTranscriptsFromGFF to make a txdb object, >> transcriptsBy to get >> transcripts, and then summarizeOverlaps to >> count the >> reads. My >> gff file is >> TAIR10. I am relatively sure that the genome >> I mapped >> to was >> the TAIR10 >> release, but I am not 100% on that. >> >> I have also used biomaRt to do this (commented >> out), >> but I had >> the same >> results. >> >> I am currently upgrading to R 3.0.2 and >> re-installing >> bioconductor ( >> "/data/smcintur/Annotation/______TAIR10_GFF3_genes.gff"; >> library(ShortRead) >> library(GenomicFeatures) >> library(Rsamtools) >> #library(biomaRt) >> >> >> setwd("/data/smcintur/RNASeq/______NMseq/newTophat/BamFiles/") >> >> >> bf <- list.files(pattern = ".bam$") >> bi <- list.files(pattern = "bam.bai$") >> bfl <- BamFileList(bf, bi) >> >> #mart <- >> makeTranscriptDbFromBiomart(______biomart = >> >> >> "Public_TAIRV10", dataset= >> "tairv10") >> tx <- transcriptsBy(mart) >> gff <- makeTranscriptDbFromGFF(______gffdir, >> format = "gff") >> >> >> gff >> >> cds <- cdsBy(gff) >> tx <- transcriptsBy(gff) >> >> olapTx <- summarizeOverlaps(tx, bfl) >> olapCds <- summarizeOverlaps(cds, bfl) >> >> txMat <- assays(olapTx)$counts >> cdsMat <- assays(olapCds)$counts >> >> write.table(txMat, file = "countMatTxGff.txt", >> sep = "\t") >> write.table(cdsMat, file = >> "countMatCdsGff.txt", sep = >> "\t") >> >> sessionInfo >> R version 3.0.0 (2013-04-03) >> Platform: x86_64-unknown-linux-gnu (64-bit) >> >> locale: >> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >> [3] LC_TIME=en_US.UTF-8 >> LC_COLLATE=en_US.UTF-8 >> [5] LC_MONETARY=en_US.UTF-8 >> LC_MESSAGES=en_US.UTF-8 >> [7] LC_PAPER=C LC_NAME=C >> [9] LC_ADDRESS=C LC_TELEPHONE=C >> [11] LC_MEASUREMENT=en_US.UTF-8 >> LC_IDENTIFICATION=C >> >> attached base packages: >> [1] stats graphics grDevices utils >> datasets >> methods base >> >> other attached packages: >> [1] BiocInstaller_1.12.0 >> >> loaded via a namespace (and not attached): >> [1] AnnotationDbi_1.24.0 Biobase_2.22.0 >> BiocGenerics_0.8.0 >> [4] biomaRt_2.18.0 Biostrings_2.30.1 >> bitops_1.0-6 >> [7] BSgenome_1.30.0 DBI_0.2-7 >> GenomicFeatures_1.14.2 >> [10] GenomicRanges_1.14.3 IRanges_1.20.6 >> parallel_3.0.0 >> [13] RCurl_1.95-4.1 Rsamtools_1.14.2 >> RSQLite_0.11.4 >> [16] rtracklayer_1.22.0 stats4_3.0.0 >> tools_3.0.0 >> [19] XML_3.98-1.1 XVector_0.2.0 >> zlibbioc_1.8.0 >> >> >> ______________________________ >> _______________________ >> >> Bioconductor mailing list >> Bioconductor@r-project.org <mailto:bioconductor@r-project.org> >> <mailto:bioconductor@r-__project.org>> <mailto:bioconductor@r-project.org>> >> <mailto:bioconductor@r-____project.org>> <mailto:bioconductor@r-__project.org> >> <mailto:bioconductor@r-__project.org>> <mailto:bioconductor@r-project.org>>> >> https://stat.ethz.ch/mailman/______listinfo/bioconductor >> <https: stat.ethz.ch="" mailman="" ____listinfo="" bioconductor=""> >> <https: stat.ethz.ch="" mailman="" ____listinfo="" bioconductor="">> <https: stat.ethz.ch="" mailman="" __listinfo="" bioconductor="">> >> >> >> >> <https: stat.ethz.ch="" mailman="" ____listinfo="" bioconductor="">> <https: stat.ethz.ch="" mailman="" __listinfo="" bioconductor=""> >> <https: stat.ethz.ch="" mailman="" __listinfo="" bioconductor="">> <https: stat.ethz.ch="" mailman="" listinfo="" bioconductor="">>> >> Search the archives: >> http://news.gmane.org/gmane.______science.biology. >> informatics.______conductor >> <http: news.gmane.org="" gmane.____science.biology.="">> informatics.____conductor> >> >> <http: news.gmane.org="" gmane.____science.biology.="">> informatics.____conductor >> <http: news.gmane.org="" gmane.__science.biology.informatics._="">> _conductor>> >> >> >> <http: news.gmane.org="" gmane.____science.biology.="">> informatics.____conductor >> <http: news.gmane.org="" gmane.__science.biology.informatics._="">> _conductor> >> >> <http: news.gmane.org="" gmane.__science.biology.informatics._="">> _conductor >> <http: news.gmane.org="" gmane.science.biology.informatics.="">> conductor>>> >> >> >> >> >> >> -- >> Reema Singh >> PhD Scholar >> Computational Biology and Bioinformatics >> School of Computational and Integrative Sciences >> Jawaharlal Nehru University >> New Delhi-110067 >> INDIA >> >> >> >> >> >> -- >> Reema Singh >> Postdoctoral Research Assistant >> College of Life Sciences >> University of Dundee, >> Dundee DD1 4HN, Scotland >> United Kingdom >> >> >> >> -- >> Valerie Obenchain >> >> Program in Computational Biology >> Division of Public Health Sciences >> Fred Hutchinson Cancer Research Center >> 1100 Fairview Ave. N, M1-B155 >> P.O. Box 19024 >> Seattle, WA 98109-1024 >> >> E-mail: vobencha@fhcrc.org <mailto:vobencha@fhcrc.org> >> Phone: (206) 667-3158 <tel:%28206%29%20667-3158> >> Fax: (206) 667-1319 <tel:%28206%29%20667-1319> >> >> >> >> >> -- >> Sam McInturf >> > > -- Sam McInturf [[alternative HTML version deleted]]

ADD REPLY • link 12.1 years ago Sam McInturf ▴ 300

0

Entering edit mode

On 12/24/13 04:24, Sam McInturf wrote: > Valerie, > Sorry for the lame error, was reading your comments about countOverlaps. > > I found my problem in that the features data had strand information, > while my sequencing was not stranded, so by setting the strand > information to "*" in the GRanges object and then reduce(split()) I was > able to map succesfully Great! Glad it worked. There is an 'ignore.strand' argument in many of the count functions. Setting 'ignore.strand=TRUE' is equivalent to setting strand to "*" in the GRanges/GAlignments etc. The benefit of using the argument is that strand is ignored during counting but that information is retained in the original GRanges. Valerie > > > Thanks for the help! > > On Saturday, December 21, 2013, Valerie Obenchain wrote: > > On 12/21/13 05:14, Sam McInturf wrote: > > Valerie and Reema, > I am using single end reads. > > I did as Valerie suggested > > so <- summarizeOverlaps(tx, reads, mode="Union",type = "any") > > > This is not the example I used below. You did not include > 'inter.feature=FALSE'. 'type' is not an argument to > summarizeOverlaps(). In other words, do not use 'type', use > 'inter.feature'. Please see my previous explanation and the man page > for summarizeOverlaps(). > > > Valerie > > > colSums(assays(so)$counts > #185,940 > > which feels silly, allow for double counts and return fewer > counts than > prior > > P.S. thank you for your prompt reply, my email "hid" it from me > Best > > Sam > > > > On Mon, Dec 9, 2013 at 6:08 PM, Valerie Obenchain > <vobencha at="" fhcrc.org=""> <mailto:vobencha at="" fhcrc.org="">> wrote: > > Reema, > > Thank you for pointing out the error in the summarizeOverlaps > vignette. That sentence was a leftover from a previous > version and > should have been removed. I have updated the vignette in both > release (GenomicRanges 1.14.4) and devel (GenomicAlignments > 0.99.8). > > When the 'reads' argument is a GAlignmentPairs object the > stand-alone functions of Union(), IntersectionStrict() and > IntersectionNotEmpty() count the data as paired-end. If > 'reads' is a > GAlignments object they are counted as single-end. > summarizeOverlaps() reads the data into a GAlignments or > GAlignmentPairs object and calls these functions > internally; they > are exactly the same. > > Valerie > > > > On 12/09/2013 02:46 AM, Reema Singh wrote: > > Hi all, > > Indeed summarizeOverlaps() works on paired end data. > But different > counting modes [Union, Intersection and > IntersectionNotEmpty] > they do > not handle paired end reads. Here the reference > > [http://www.bioconductor.org/____packages/release/bioc/____v ignettes/GenomicRanges/inst/____doc/summarizeOverlaps.pdf]- > <http: www.bioconductor.org="" __packages="" release="" bioc="" __vigne="" ttes="" genomicranges="" inst="" __doc="" summarizeoverlaps.pdf%5d-=""> > > <http: www.bioconductor.org="" __packages="" release="" bioc="" __vigne="" ttes="" genomicranges="" inst="" __doc="" summarizeoverlaps.pdf%5d-=""> <http: www.bioconductor.org="" packages="" release="" bioc="" vignettes="" genomicranges="" inst="" doc="" summarizeoverlaps.pdf%5d-="">> > Page 2 [under Counting Modes]. > > Kind Regards > > > > On Sun, Dec 8, 2013 at 6:31 PM, Valerie Obenchain > <vobencha at="" fhcrc.org="" <mailto:vobencha="" at="" fhcrc.org=""> > <mailto:vobencha at="" fhcrc.org=""> <mailto:vobencha at="" fhcrc.org="">>> wrote: > > summarizeOverlaps() does count paired-end reads. > Use option > singleEnd=FALSE. Details under 'singleEnd' in the > 'Arguments' > section on the man page. > > library(GenomicAlignments) ## devel > library(GenomicRanges) ## release > ?summarizeOverlaps > > Valerie > > > On 12/08/13 09:42, Reema Singh wrote: > > Hi Sam, > > It depends on whether you are using single read or > paired end reads > files. As far as I known summarizeOverlaps > function > only working > in case > of single read. Anyone please correct me if I > am wrong. > > Kind Regards > > > On Sun, Dec 8, 2013 at 8:25 PM, Valerie Obenchain > <vobencha at="" fhcrc.org="" <mailto:vobencha="" at="" fhcrc.org=""> > <mailto:vobencha at="" fhcrc.org="" <mailto:vobencha="" at="" fhcrc.org="">> > <mailto:vobencha at="" fhcrc.org=""> <mailto:vobencha at="" fhcrc.org=""> > <mailto:vobencha at="" fhcrc.org=""> <mailto:vobencha at="" fhcrc.org="">>>> wrote: > > Hi Sam, > > Are you getting any error messages related to > seqlevels > when you > count? If you have confirmed that > annotation seqlevels > match the bam > files next look at the maximum possible > overlap in > a single > bam file. > > bf <- singleBamFile > reads <- readGAlignments(bf) ## assuming > single-end reads > so <- summarizeOverlaps(tx, reads, > mode=Union, > inter.feature=FALSE) > > 'inter.feature=FALSE' with 'mode=Union' is > countOverlaps with > 'type=ANY'. Reads that hit multiple > features will > be double > counted > in this case. The idea to to see if there > are any > common > regions at all. > > If there are still no counts, you could > look more > closely at a > smaller chromosome region in 'reads' > where you > would expect > counts. > This visual inspection might give you > some insight > as to > why there > is no overlap. > > > Valerie > > > On 12/05/13 18:51, Sam McInturf wrote: > > Hello Bioconductors, > I am working on an Arabidopsis > RNA seq > set, and after > executing my count > reads script, I get a count table with no > reads counted (ie > sum(colSums(mat)) = 0). > > I mapped my reads using Tophat2 (without > supplying a > gtf/gff > file) and used > samtools to sort and index my > accepted_hits.bam files. > I loaded > these bam > files into IGV (integrated Genome > Browser) and the > reads appear > to have > been mapped (additionally the > align_summary.txt says I > have good > mapping). > Script and session info below. > > Essentially I use Rsamtools to load in my > bamfiles, and > then > makeTranscriptsFromGFF to make a txdb > object, > transcriptsBy to get > transcripts, and then > summarizeOverlaps to > count the > reads. My > gff file is > TAIR10. I am relatively sure that > the genome > I mapped > to was > the TAIR10 > release, but I am not 100% on that. > > I have also used biomaRt to do this > (commented > out), > but I had > the same > results. > > I am currently upgrading to R 3.0.2 and > re-installing > bioconductor ( > "/data/smcintur/Annotation/________TAIR10_GFF3_genes.gff"; > library(ShortRead) > library(GenomicFeatures) > library(Rsamtools) > #library(biomaRt) > > > > setwd("/data/smcintur/RNASeq/________NMseq/newTophat/BamFiles/__") > > > bf <- list.files(pattern = ".bam$") > bi <- list.files(pattern = "bam.bai$") > bfl <- BamFileList(bf, bi) > > #mart <- > makeTranscriptDbFromBiomart(________biomart = > > > "Public_TAIRV10", dataset= > "tairv10") > tx <- transcriptsBy(mart) > gff <- > makeTranscriptDbFromGFF(________gffdir, > format = "gff") > > > gff > > cds <- cdsBy(gff) > tx <- transcriptsBy(gff) > > olapTx <- summarizeOverlaps(tx, bfl) > olapCds <- summarizeOverlaps(cds, bfl) > > txMat <- assays(olapTx)$counts > cdsMat <- assays(olapCds)$counts > > write.table(txMat, file = > "countMatTxGff.txt", > sep = "\t") > write.table(cdsMat, file = > "countMatCdsGff.txt", sep = > "\t") > > sessionInfo > R version 3.0.0 (2013-04-03) > Platform: x86_64-unknown-linux- gnu > (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 > LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 > LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 > LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=C > LC_NAME=C > [9] LC_ADDRESS=C > LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 > LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils > datasets > methods base > > other attached packages: > [1] BiocInstaller_1.12.0 > > loaded via a namespace (and not > attached): > [1] AnnotationDbi_1.24.0 > Biobase_2.22.0 > BiocGenerics_0.8.0 > [4] biomaRt_2.18.0 > Biostrings_2.30.1 > bitops_1.0-6 > [7] BSgenome_1.30.0 DBI_0.2-7 > GenomicFeatures_1.14.2 > [10] GenomicRanges_1.14.3 > IRanges_1.20.6 > parallel_3.0.0 > [13] RCurl_1.95-4.1 > Rsamtools_1.14.2 > RSQLite_0.11.4 > [16] rtracklayer_1.22.0 stats4_3.0.0 > tools_3.0.0 > [19] XML_3.98-1.1 XVector_0.2.0 > zlibbioc_1.8.0 > > > > _______________________________________________________ > > Bioconductor mailing list > Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org=""> > <mailto:bioconductor at="" r-__project.org=""> <mailto:bioconductor at="" r-project.org="">> > <mailto:bioconductor at="" r-____project.org=""> <mailto:bioconductor at="" r-__project.org=""> > <mailto:bioconductor at="" r-__project.org=""> <mailto:bioconductor at="" r-project.org="">>> > https://stat.ethz.ch/mailman/________listinfo/bioconductor > <https: stat.ethz.ch="" mailman="" ______listinfo="" bioconductor=""> > > <https: stat.ethz.ch="" mailman="" ______listinfo="" bioconductor=""> <https: stat.ethz.ch="" mailman="" ____listinfo="" bioconductor="">> > > <https: stat.ethz.ch="" mailman="" ______listinfo="" bioconductor=""> <https: stat.ethz.ch="" mailman="" ____listinfo="" bioconductor=""> > <https: stat.ethz.ch="" mailman="" ____listinfo="" bioconductor=""> <https: stat.ethz.ch="" mailman="" __listinfo="" bioconductor="">>> > > > > > <https: stat.ethz.ch="" mailman="" ______listinfo="" bioconductor=""> <https: stat.ethz.ch="" mailman="" ____listinfo="" bioconductor=""> > <https: stat.ethz.ch="" mailman="" ____listinfo="" bioconductor=""> <https: stat.ethz.ch="" mailman="" __listinfo="" bioconductor="">> > > <https: stat.ethz.ch="" mailman="" ____listinfo="" bioconductor=""> <https: stat.ethz.ch="" mailman="" __listinfo="" bioconductor=""> > <https: stat.ethz.ch="" mailman="" __listinfo="" bioconductor=""> <https: stat.ethz.ch="" mailman="" listinfo="" bioconductor="">>>> > Search the archives: > http://news.gmane.org/gmane.________science.biology.__inform atics.______conductor > <http: news.gmane.org="" gmane.______science.biology.informati="" cs.______conductor=""> > > <http: news.gmane.org="" gmane.______science.biology.__informa="" tics.____conductor=""> <http: news.gmane.org="" gmane.____science.biology.informatics="" .____conductor="">> > > > <http: news.gmane.org="" gmane.______science.biology.__informa="" tics.____conductor=""> <http: news.gmane.org="" gmane.____science.biology.informatics="" .____conductor=""> > > <http: news.gmane.org="" gmane.____science.biology.informatics="" .____conductor=""> <http: news.gmane.org="" gmane.__science.biology.informatics._="" _conductor="">>> > > > > <http: news.gmane.org="" gmane.______science.biology.__informa="" tics.____conductor=""> <http: news.gmane.org="" gmane.____science.biology.informatics="" .____conductor=""> > > <http: news.gmane.org="" gmane.____science.biology.informatics="" .____conductor=""> <http: news.gmane.org="" gmane.__science.biology.informatics._="" _conductor="">> > > > <http: news.gmane.org="" gmane.____science.biology.informatics="" .____conductor=""> <http: news.gmane.org="" gmane.__science.biology.informatics._="" _conductor=""> > > <http: news.gmane.org="" gmane.__science.biology.informatics.__conductor=""> <http: news.gmane.org="" gmane.science.biology.informatics.conductor="">>>> > > > > > > -- > Reema Singh > PhD Scholar > Computational Biology and Bioinformatics > School of Computational and Integrative Sciences > Jawaharlal Nehru University > New Delhi-110067 > INDIA > > > > > > -- > Reema Singh > Postdoctoral Research Assistant > College of Life Sciences > University of Dundee, > Dundee DD1 4HN, Scotland > United Kingdom > > > > -- > Valerie Obenchain > > Program in Computational Biology > Division of Public Health Sciences > Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N, M1-B155 > P.O. Box 19024 > Seattle, WA 98109-1024 > > E-mail: vobencha at fhcrc.org <mailto:vobencha at="" fhcrc.org=""> > Phone: (206) 667-3158 <tel:%28206%29%20667-3158> > Fax: (206) 667-1319 <tel:%28206%29%20667-1319> > > > > > -- > Sam McInturf > > > > > -- > Sam McInturf

ADD REPLY • link 12.1 years ago Valerie Obenchain ★ 6.8k

Login before adding your answer.