Re(surrecting): [Bioc-sig-seq] Rsamtools countBam labeling
1
0
Entering edit mode
Malcolm Cook ★ 1.6k
@malcolm-cook-6293
Last seen 19 hours ago
United States
Matt et. al., I wonder if a satisfactory resolution to the issue of "the order changes between the GRanges object and the countBam data.frame http://www.mail-archive.com/bioc-sig-sequencing at r-project.org/msg01144.html I am presented with the same issue and poised to tackle it but wondr if a generic solution emerged from you inquiries & efforts. Thanks, Malcolm Cook Stowers Institute for Medical Research - Bioinformatics Kansas City, Missouri USA
• 973 views
ADD COMMENT
0
Entering edit mode
@martin-morgan-1513
Last seen 16 days ago
United States
On 03/31/2011 08:32 AM, Cook, Malcolm wrote: > Matt et. al., > > I wonder if a satisfactory resolution to the issue of "the order > changes between the GRanges object and the countBam data.frame > > http://www.mail-archive.com/bioc-sig-sequencing at r-project.org/msg01144.html > > I am presented with the same issue and poised to tackle it but wondr > if a generic solution emerged from you inquiries& efforts. Hi Malcolm -- For a reproducible example, library(Rsamtools) example(countBam) which1 <- as(which, "GRanges") ## which2 might be where your data actually starts which2 <- which1[c(2,1,3)] values(which2)[["OriginalOrder"]] <- 1:3 param <- ScanBamParam(which=which2) cnt <- countBam(fl, param=param) What happens is that ScanBamParam converts its argument to an IRangesList, using split(ranges(which2), seqnames(which2)). So do the same for the values and unlist cntVals <- unlist(split(values(which2), seqnames(which2))) then cbind coerced values cbind(cnt, as.data.frame(cntVals)) with > which2 GRanges with 3 ranges and 1 elementMetadata value seqnames ranges strand | OriginalOrder <rle> <iranges> <rle> | <integer> [1] seq2 [ 100, 1000] * | 1 [2] seq1 [1000, 2000] * | 2 [3] seq2 [1000, 2000] * | 3 seqlengths seq1 seq2 NA NA > cbind(cnt, as.data.frame(cntVals)) space start end width file records nucleotides OriginalOrder 1 seq1 1000 2000 1001 ex1.bam 612 21549 2 2 seq2 100 1000 901 ex1.bam 1169 41235 1 3 seq2 1000 2000 1001 ex1.bam 642 22640 3 Martin > > Thanks, > > Malcolm Cook Stowers Institute for Medical Research - > Bioinformatics Kansas City, Missouri USA > > > _______________________________________________ Bioconductor mailing > list Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor Search the > archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor -- Computational Biology Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: M1-B861 Telephone: 206 667-2793
ADD COMMENT
0
Entering edit mode
Martin, Hmmm, Thanks .... I think I'm getting it.... Following your lead, I can directly re-order cnt in the OriginalOrder as: > cnt[sort(unlist(split(values(which2), seqnames(which2)))$OriginalOrder,index.return=TRUE)$ix,] space start end width file records nucleotides 2 seq2 100 1000 901 ex1.bam 1169 41235 1 seq1 1000 2000 1001 ex1.bam 612 21549 3 seq2 1000 2000 1001 ex1.bam 642 22640 ... which can usefully(?) be abstracted as countBamWhich <- function (file,which,index=file,...) { ### wrapper for countBam that reorders its results to aggree with ### 'which', a required ScanBamParam. ... params are additional ### ScanBamParam options. ### ### ASSUMES: internal implementation detail of ScanBamParam. c.f. ### http://permalink.gmane.org/gmane.science.biology.informatics.condu ctor/34208) param <- ScanBamParam(which=which,...) values(which)[['OriginalOrder']] <- 1:length(which) CBW = countBam(file,index,param=ScanBamParam(which=which)) CBW[sort(unlist(split(values(which), seqnames(which)))$OriginalOrder,index.return=TRUE)$ix,] } allowing me to write > countBamWhich(fl, which2) space start end width file records nucleotides 2 seq2 100 1000 901 ex1.bam 1169 41235 1 seq1 1000 2000 1001 ex1.bam 612 21549 3 seq2 1000 2000 1001 ex1.bam 642 22640 All in favor? ~ Malcolm > -----Original Message----- > From: Martin Morgan [mailto:mtmorgan at fhcrc.org] > Sent: Thursday, March 31, 2011 11:41 AM > To: Cook, Malcolm > Cc: 'myoung at wehi.EDU.AU'; 'bioconductor at r-project.org'; > 'Bioc-sig-sequencing at r-project.org' > Subject: Re: [BioC] Re(surrecting): [Bioc-sig-seq] Rsamtools > countBam labeling > > On 03/31/2011 08:32 AM, Cook, Malcolm wrote: > > Matt et. al., > > > > I wonder if a satisfactory resolution to the issue of "the order > > changes between the GRanges object and the countBam data.frame > > > > > http://www.mail-archive.com/bioc-sig-sequencing at r-project.org/msg01144 > > .html > > > > I am presented with the same issue and poised to tackle it > but wondr > > if a generic solution emerged from you inquiries& efforts. > > Hi Malcolm -- > > For a reproducible example, > > library(Rsamtools) > example(countBam) > which1 <- as(which, "GRanges") > ## which2 might be where your data actually starts > which2 <- which1[c(2,1,3)] > values(which2)[["OriginalOrder"]] <- 1:3 > param <- ScanBamParam(which=which2) > cnt <- countBam(fl, param=param) > > What happens is that ScanBamParam converts its argument to an > IRangesList, using split(ranges(which2), seqnames(which2)). > So do the same for the values and unlist > > cntVals <- unlist(split(values(which2), seqnames(which2))) > > then cbind coerced values > > cbind(cnt, as.data.frame(cntVals)) > > with > > > which2 > GRanges with 3 ranges and 1 elementMetadata value > seqnames ranges strand | OriginalOrder > <rle> <iranges> <rle> | <integer> > [1] seq2 [ 100, 1000] * | 1 > [2] seq1 [1000, 2000] * | 2 > [3] seq2 [1000, 2000] * | 3 > > seqlengths > seq1 seq2 > NA NA > > cbind(cnt, as.data.frame(cntVals)) > space start end width file records nucleotides OriginalOrder > 1 seq1 1000 2000 1001 ex1.bam 612 21549 2 > 2 seq2 100 1000 901 ex1.bam 1169 41235 1 > 3 seq2 1000 2000 1001 ex1.bam 642 22640 3 > > Martin > > > > > Thanks, > > > > Malcolm Cook Stowers Institute for Medical Research - > Bioinformatics > > Kansas City, Missouri USA > > > > > > _______________________________________________ > Bioconductor mailing > > list Bioconductor at r-project.org > > https://stat.ethz.ch/mailman/listinfo/bioconductor Search the > > archives: > > http://news.gmane.org/gmane.science.biology.informatics.conductor > > > -- > Computational Biology > Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 > > Location: M1-B861 > Telephone: 206 667-2793 >
ADD REPLY
0
Entering edit mode
!! Please pardon the line-breaks MS Outlook auto-inserted into my code in a previous version of this post. Martin, Hmmm, Thanks .... I think I'm getting it.... Following your lead, I can directly re-order cnt in the OriginalOrder as: > cnt[sort(unlist(split(values(which2), seqnames(which2)))$OriginalOrder,index.return=TRUE)$ix,] space start end width file records nucleotides 2 seq2 100 1000 901 ex1.bam 1169 41235 1 seq1 1000 2000 1001 ex1.bam 612 21549 3 seq2 1000 2000 1001 ex1.bam 642 22640 ... which can usefully(?) be abstracted as countBamWhich <- function (file,which,index=file,...) { ### wrapper for countBam that reorders its results to aggree with ### 'which', a required ScanBamParam. ... params are additional ### ScanBamParam options. ### ### ASSUMES: internal implementation detail of ScanBamParam. c.f. ### http://permalink.gmane.org/gmane.science.biology.informatics.condu ctor/34208) param <- ScanBamParam(which=which,...) values(which)[['OriginalOrder']] <- 1:length(which) CBW = countBam(file,index,param=ScanBamParam(which=which)) CBW[sort(unlist(split(values(which), seqnames(which)))$OriginalOrder,index.return=TRUE)$ix,] } allowing me to write > countBamWhich(fl, which2) space start end width file records nucleotides 2 seq2 100 1000 901 ex1.bam 1169 41235 1 seq1 1000 2000 1001 ex1.bam 612 21549 3 seq2 1000 2000 1001 ex1.bam 642 22640 All in favor? ~ Malcolm Malcolm Cook Stowers Institute for Medical Research - Bioinformatics Kansas City, Missouri USA > -----Original Message----- > From: Martin Morgan [mailto:mtmorgan at fhcrc.org] > Sent: Thursday, March 31, 2011 11:41 AM > To: Cook, Malcolm > Cc: 'myoung at wehi.EDU.AU'; 'bioconductor at r-project.org'; > 'Bioc-sig-sequencing at r-project.org' > Subject: Re: [BioC] Re(surrecting): [Bioc-sig-seq] Rsamtools > countBam labeling > > On 03/31/2011 08:32 AM, Cook, Malcolm wrote: > > Matt et. al., > > > > I wonder if a satisfactory resolution to the issue of "the order > > changes between the GRanges object and the countBam data.frame > > > > > http://www.mail-archive.com/bioc-sig-sequencing at r-project.org/ > msg01144.html > > > > I am presented with the same issue and poised to tackle it > but wondr > > if a generic solution emerged from you inquiries& efforts. > > Hi Malcolm -- > > For a reproducible example, > > library(Rsamtools) > example(countBam) > which1 <- as(which, "GRanges") > ## which2 might be where your data actually starts > which2 <- which1[c(2,1,3)] > values(which2)[["OriginalOrder"]] <- 1:3 > param <- ScanBamParam(which=which2) > cnt <- countBam(fl, param=param) > > What happens is that ScanBamParam converts its argument to an > IRangesList, using split(ranges(which2), seqnames(which2)). So do the > same for the values and unlist > > cntVals <- unlist(split(values(which2), seqnames(which2))) > > then cbind coerced values > > cbind(cnt, as.data.frame(cntVals)) > > with > > > which2 > GRanges with 3 ranges and 1 elementMetadata value > seqnames ranges strand | OriginalOrder > <rle> <iranges> <rle> | <integer> > [1] seq2 [ 100, 1000] * | 1 > [2] seq1 [1000, 2000] * | 2 > [3] seq2 [1000, 2000] * | 3 > > seqlengths > seq1 seq2 > NA NA > > cbind(cnt, as.data.frame(cntVals)) > space start end width file records nucleotides OriginalOrder > 1 seq1 1000 2000 1001 ex1.bam 612 21549 2 > 2 seq2 100 1000 901 ex1.bam 1169 41235 1 > 3 seq2 1000 2000 1001 ex1.bam 642 22640 3 > > Martin > > > > > Thanks, > > > > Malcolm Cook Stowers Institute for Medical Research - > > Bioinformatics Kansas City, Missouri USA > > > > > > _______________________________________________ Bioconductor mailing > > list Bioconductor at r-project.org > > https://stat.ethz.ch/mailman/listinfo/bioconductor Search the > > archives: > > http://news.gmane.org/gmane.science.biology.informatics.conductor > > > -- > Computational Biology > Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 > > Location: M1-B861 > Telephone: 206 667-2793 >
ADD REPLY
0
Entering edit mode
Hello Bioconductor community, We were wondering if it would be possible to perform differential expression analysis of exon expression using DESeq or EdgeR. Would the statistical assumptions be the same, and has anyone attempted this type of analysis? Any feedback or insights would be really appreciated! Cheers, Andrew
ADD REPLY
0
Entering edit mode
You are asking about Affy Exon Expression array or after RNA-seq? Vasu --- On Thu, 3/31/11, adeonari@mrc-lmb.cam.ac.uk <adeonari@mrc- lmb.cam.ac.uk=""> wrote: From: adeonari@mrc-lmb.cam.ac.uk <adeonari@mrc-lmb.cam.ac.uk> Subject: [BioC] Using DESeq or EdgeR for Exon Differential Expression Analysis To: "'bioconductor@r-project.org'" <bioconductor@r-project.org> Date: Thursday, March 31, 2011, 1:18 PM Hello Bioconductor community, We were wondering if it would be possible to perform differential expression analysis of exon expression using DESeq or EdgeR. Would the statistical assumptions be the same, and has anyone attempted this type of analysis? Any feedback or insights would be really appreciated! Cheers, Andrew _______________________________________________ Bioconductor mailing list Bioconductor@r-project.org https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor [[alternative HTML version deleted]]
ADD REPLY

Login before adding your answer.

Traffic: 879 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6