Search
Question: readGappedAlignments not working in Rsamtools
1
gravatar for Guest User
3.4 years ago by
Guest User12k
Guest User12k wrote:
Hello everybody, I'm a PhD student and I am doing my first analysis on RNAseq data. I was trying to use the script to generate a table of counts from bam files (shown here): library(Rsamtools) setwd("C:/R_data") bam_files <- list.files(pattern="*.bam") gr_list <- lapply(bam_files, function(bam_file) as(readGappedAlignments(bam_file), "GRanges")) names(gr_list) <- bam_files library(GenomicFeatures) txdb=makeTranscriptDbFromUCSC(genome='dm3',tablename='ensGene') tx_by_gene=transcriptsBy(txdb,'gene') ex_by_gene=exonsBy(txdb,'gene') toc=data.frame(rep(NA,length(tx_by_gene))) for(i in 1:length(bam_files)) { toc[,i]=countOverlaps(tx_by_gene,bam_files[[i]]) } rownames(toc)=names(tx_by_gene) colnames(toc)=names(gr_list) dim(toc) head(toc) But it doesn't seem to work, failing in two places: 1) Error in .class1(object) : could not find function "readGappedAlignments" 2) Download the ensGene table ... Error in function (type, msg, asError = TRUE) : couldn't connect to host For the point 1) I found that the function is deprecated but I still can't make things work changing it for the new function that is supposed to replace it "readGAlignmentsFromBam", Can anybody help me fixing this? The error in part 2) is completely out of my reach at the moment, I don't know how to fix this issue either, so any help would be appreciated. Thanks in advance! -- output of sessionInfo(): R version 3.1.0 (2014-04-10) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United Kingdom.1252 [2] LC_CTYPE=English_United Kingdom.1252 [3] LC_MONETARY=English_United Kingdom.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United Kingdom.1252 attached base packages: [1] parallel stats graphics grDevices utils datasets methods [8] base other attached packages: [1] GenomicFeatures_1.16.2 AnnotationDbi_1.26.0 Biobase_2.24.0 [4] Rsamtools_1.16.0 Biostrings_2.32.0 XVector_0.4.0 [7] GenomicRanges_1.16.3 GenomeInfoDb_1.0.2 IRanges_1.22.8 [10] BiocGenerics_0.10.0 loaded via a namespace (and not attached): [1] BatchJobs_1.2 BBmisc_1.6 BiocParallel_0.6.1 [4] biomaRt_2.20.0 bitops_1.0-6 brew_1.0-6 [7] BSgenome_1.32.0 codetools_0.2-8 DBI_0.2-7 [10] digest_0.6.4 fail_1.2 foreach_1.4.2 [13] GenomicAlignments_1.0.1 iterators_1.0.7 plyr_1.8.1 [16] Rcpp_0.11.1 RCurl_1.95-4.1 RSQLite_0.11.4 [19] rtracklayer_1.24.2 sendmailR_1.1-2 stats4_3.1.0 [22] stringr_0.6.2 tools_3.1.0 XML_3.98-1.1 [25] zlibbioc_1.10.0 -- Sent via the guest posting facility at bioconductor.org.
ADD COMMENTlink modified 3.4 years ago by Valerie Obenchain ♦♦ 6.4k • written 3.4 years ago by Guest User12k
1
gravatar for Valerie Obenchain
3.4 years ago by
Valerie Obenchain ♦♦ 6.4k
United States
Valerie Obenchain ♦♦ 6.4k wrote:
Hi, readGappedAlignments() has been replaced with readGAlignments() and is now in the GenomicAlignments package, not Rsamtools. It looks like you want to count reads in a bam file against an annotation. summarizeOverlaps() can do this for you and offers options for counting reads that overlap multiple annotation features. (fyi, summarizeOverlaps() calls readGAlignments() or readGAlignmentPairs() under the hood.) library(GenomicAlignments) ?summarizeOverlaps ## see examples on counting Bam files Create a BamFileList of files: bfl <- BamFileList(bam_files) If the files are too large for available memory use a 'yieldSize': bfl <- BamFileList(bam_files, yieldSize=100000) We have a pre-built package for the dm3 ensembl genes. See this site for a list of available annotation packages: http://www.bioconductor.org/packages/devel/BiocViews.html#___Annotatio nData Install dm3 ensembl: library(TxDb.Dmelanogaster.UCSC.dm3.ensGene) exbygene <- exonsBy(TxDb.Dmelanogaster.UCSC.dm3.ensGene, "gene") Call summarizeOverlaps(): se_single <- summarizeOverlaps(exbygene, bfl) For paired-end reads use 'singleEnd=FALSE': se_paired <- summarizeOverlaps(exbygene, bfl, singleEnd=FALSE) The result is a SummarizedExperiment class with results in the 'assays' slot. assays(se_single) Valerie On 06/13/2014 01:15 AM, Maintainer wrote: > Hello everybody, > > I'm a PhD student and I am doing my first analysis on RNAseq data. > > I was trying to use the script to generate a table of counts from bam files (shown here): > > library(Rsamtools) > > setwd("C:/R_data") > > bam_files <- list.files(pattern="*.bam") > gr_list <- lapply(bam_files, > function(bam_file) > as(readGappedAlignments(bam_file), "GRanges")) > names(gr_list) <- bam_files > > library(GenomicFeatures) > txdb=makeTranscriptDbFromUCSC(genome='dm3',tablename='ensGene') > > tx_by_gene=transcriptsBy(txdb,'gene') > ex_by_gene=exonsBy(txdb,'gene') > toc=data.frame(rep(NA,length(tx_by_gene))) > for(i in 1:length(bam_files)) > { > toc[,i]=countOverlaps(tx_by_gene,bam_files[[i]]) > } > rownames(toc)=names(tx_by_gene) > colnames(toc)=names(gr_list) > dim(toc) > head(toc) > > But it doesn't seem to work, failing in two places: > > 1) Error in .class1(object) : could not find function "readGappedAlignments" > > 2) Download the ensGene table ... Error in function (type, msg, asError = TRUE) : couldn't connect to host > > For the point 1) I found that the function is deprecated but I still can't make things work changing it for the new function that is supposed to replace it "readGAlignmentsFromBam", Can anybody help me fixing this? > > The error in part 2) is completely out of my reach at the moment, I don't know how to fix this issue either, so any help would be appreciated. > > Thanks in advance! > > > > -- output of sessionInfo(): > > > R version 3.1.0 (2014-04-10) > Platform: x86_64-w64-mingw32/x64 (64-bit) > > locale: > [1] LC_COLLATE=English_United Kingdom.1252 > [2] LC_CTYPE=English_United Kingdom.1252 > [3] LC_MONETARY=English_United Kingdom.1252 > [4] LC_NUMERIC=C > [5] LC_TIME=English_United Kingdom.1252 > > attached base packages: > [1] parallel stats graphics grDevices utils datasets methods > [8] base > > other attached packages: > [1] GenomicFeatures_1.16.2 AnnotationDbi_1.26.0 Biobase_2.24.0 > [4] Rsamtools_1.16.0 Biostrings_2.32.0 XVector_0.4.0 > [7] GenomicRanges_1.16.3 GenomeInfoDb_1.0.2 IRanges_1.22.8 > [10] BiocGenerics_0.10.0 > > loaded via a namespace (and not attached): > [1] BatchJobs_1.2 BBmisc_1.6 BiocParallel_0.6.1 > [4] biomaRt_2.20.0 bitops_1.0-6 brew_1.0-6 > [7] BSgenome_1.32.0 codetools_0.2-8 DBI_0.2-7 > [10] digest_0.6.4 fail_1.2 foreach_1.4.2 > [13] GenomicAlignments_1.0.1 iterators_1.0.7 plyr_1.8.1 > [16] Rcpp_0.11.1 RCurl_1.95-4.1 RSQLite_0.11.4 > [19] rtracklayer_1.24.2 sendmailR_1.1-2 stats4_3.1.0 > [22] stringr_0.6.2 tools_3.1.0 XML_3.98-1.1 > [25] zlibbioc_1.10.0 > > -- > Sent via the guest posting facility at bioconductor.org. > > ____________________________________________________________________ ____ > devteam-bioc mailing list > To unsubscribe from this mailing list send a blank email to > devteam-bioc-leave at lists.fhcrc.org > You can also unsubscribe or change your personal options at > https://lists.fhcrc.org/mailman/listinfo/devteam-bioc > -- Valerie Obenchain Program in Computational Biology Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, Seattle, WA 98109 Email: vobencha at fhcrc.org Phone: (206) 667-3158
ADD COMMENTlink written 3.4 years ago by Valerie Obenchain ♦♦ 6.4k
Dear Valerie, Thank you very much for your answer. It worked perfectly and solved both issues. Best Alvaro > Date: Fri, 13 Jun 2014 08:42:42 -0700 > From: vobencha@fhcrc.org > To: guest@bioconductor.org; bioconductor@r-project.org; alfumao@hotmail.com > CC: maintainer@bioconductor.org > Subject: Re: [devteam-bioc] readGappedAlignments not working in Rsamtools > > Hi, > > readGappedAlignments() has been replaced with readGAlignments() and is > now in the GenomicAlignments package, not Rsamtools. > > It looks like you want to count reads in a bam file against an > annotation. summarizeOverlaps() can do this for you and offers options > for counting reads that overlap multiple annotation features. (fyi, > summarizeOverlaps() calls readGAlignments() or readGAlignmentPairs() > under the hood.) > > library(GenomicAlignments) > ?summarizeOverlaps ## see examples on counting Bam files > > > Create a BamFileList of files: > bfl <- BamFileList(bam_files) > > If the files are too large for available memory use a 'yieldSize': > bfl <- BamFileList(bam_files, yieldSize=100000) > > We have a pre-built package for the dm3 ensembl genes. See this site for > a list of available annotation packages: > http://www.bioconductor.org/packages/devel/BiocViews.html#___Annotat ionData > > Install dm3 ensembl: > library(TxDb.Dmelanogaster.UCSC.dm3.ensGene) > exbygene <- exonsBy(TxDb.Dmelanogaster.UCSC.dm3.ensGene, "gene") > > Call summarizeOverlaps(): > se_single <- summarizeOverlaps(exbygene, bfl) > > For paired-end reads use 'singleEnd=FALSE': > se_paired <- summarizeOverlaps(exbygene, bfl, singleEnd=FALSE) > > > The result is a SummarizedExperiment class with results in > the 'assays' slot. > assays(se_single) > > > Valerie > > > > > On 06/13/2014 01:15 AM, Maintainer wrote: > > Hello everybody, > > > > I'm a PhD student and I am doing my first analysis on RNAseq data. > > > > I was trying to use the script to generate a table of counts from bam files (shown here): > > > > library(Rsamtools) > > > > setwd("C:/R_data") > > > > bam_files <- list.files(pattern="*.bam") > > gr_list <- lapply(bam_files, > > function(bam_file) > > as(readGappedAlignments(bam_file), "GRanges")) > > names(gr_list) <- bam_files > > > > library(GenomicFeatures) > > txdb=makeTranscriptDbFromUCSC(genome='dm3',tablename='ensGene') > > > > tx_by_gene=transcriptsBy(txdb,'gene') > > ex_by_gene=exonsBy(txdb,'gene') > > toc=data.frame(rep(NA,length(tx_by_gene))) > > for(i in 1:length(bam_files)) > > { > > toc[,i]=countOverlaps(tx_by_gene,bam_files[[i]]) > > } > > rownames(toc)=names(tx_by_gene) > > colnames(toc)=names(gr_list) > > dim(toc) > > head(toc) > > > > But it doesn't seem to work, failing in two places: > > > > 1) Error in .class1(object) : could not find function "readGappedAlignments" > > > > 2) Download the ensGene table ... Error in function (type, msg, asError = TRUE) : couldn't connect to host > > > > For the point 1) I found that the function is deprecated but I still can't make things work changing it for the new function that is supposed to replace it "readGAlignmentsFromBam", Can anybody help me fixing this? > > > > The error in part 2) is completely out of my reach at the moment, I don't know how to fix this issue either, so any help would be appreciated. > > > > Thanks in advance! > > > > > > > > -- output of sessionInfo(): > > > > > > R version 3.1.0 (2014-04-10) > > Platform: x86_64-w64-mingw32/x64 (64-bit) > > > > locale: > > [1] LC_COLLATE=English_United Kingdom.1252 > > [2] LC_CTYPE=English_United Kingdom.1252 > > [3] LC_MONETARY=English_United Kingdom.1252 > > [4] LC_NUMERIC=C > > [5] LC_TIME=English_United Kingdom.1252 > > > > attached base packages: > > [1] parallel stats graphics grDevices utils datasets methods > > [8] base > > > > other attached packages: > > [1] GenomicFeatures_1.16.2 AnnotationDbi_1.26.0 Biobase_2.24.0 > > [4] Rsamtools_1.16.0 Biostrings_2.32.0 XVector_0.4.0 > > [7] GenomicRanges_1.16.3 GenomeInfoDb_1.0.2 IRanges_1.22.8 > > [10] BiocGenerics_0.10.0 > > > > loaded via a namespace (and not attached): > > [1] BatchJobs_1.2 BBmisc_1.6 BiocParallel_0.6.1 > > [4] biomaRt_2.20.0 bitops_1.0-6 brew_1.0-6 > > [7] BSgenome_1.32.0 codetools_0.2-8 DBI_0.2-7 > > [10] digest_0.6.4 fail_1.2 foreach_1.4.2 > > [13] GenomicAlignments_1.0.1 iterators_1.0.7 plyr_1.8.1 > > [16] Rcpp_0.11.1 RCurl_1.95-4.1 RSQLite_0.11.4 > > [19] rtracklayer_1.24.2 sendmailR_1.1-2 stats4_3.1.0 > > [22] stringr_0.6.2 tools_3.1.0 XML_3.98-1.1 > > [25] zlibbioc_1.10.0 > > > > -- > > Sent via the guest posting facility at bioconductor.org. > > > > __________________________________________________________________ ______ > > devteam-bioc mailing list > > To unsubscribe from this mailing list send a blank email to > > devteam-bioc-leave@lists.fhcrc.org > > You can also unsubscribe or change your personal options at > > https://lists.fhcrc.org/mailman/listinfo/devteam-bioc > > > > > -- > Valerie Obenchain > Program in Computational Biology > Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N, Seattle, WA 98109 > > Email: vobencha@fhcrc.org > Phone: (206) 667-3158 [[alternative HTML version deleted]]
ADD REPLYlink written 3.4 years ago by Alvaro F10
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 136 users visited in the last hour