Re(surrecting): [Bioc-sig-seq] Rsamtools countBam labeling


Malcolm Cook ★ 1.6k

@malcolm-cook-6293

Last seen 19 hours ago

United States

Matt et al.,

I wonder if a satisfactory resolution was found to the issue of "the order changes between the GRanges object and the countBam data.frame":

http://www.mail-archive.com/bioc-sig-sequencing at r-project.org/msg01144.html

I am presented with the same issue and poised to tackle it, but wonder if a generic solution emerged from your inquiries & efforts.

Thanks,

Malcolm Cook
Stowers Institute for Medical Research - Bioinformatics
Kansas City, Missouri USA

• 973 views

ADD COMMENT • link updated 13.1 years ago by Martin Morgan 25k • written 13.1 years ago by Malcolm Cook ★ 1.6k


Martin Morgan 25k

@martin-morgan-1513

Last seen 16 days ago

United States

On 03/31/2011 08:32 AM, Cook, Malcolm wrote:

> Matt et al.,
>
> I wonder if a satisfactory resolution was found to the issue of "the order
> changes between the GRanges object and the countBam data.frame":
>
> http://www.mail-archive.com/bioc-sig-sequencing at r-project.org/msg01144.html
>
> I am presented with the same issue and poised to tackle it, but wonder
> if a generic solution emerged from your inquiries & efforts.

Hi Malcolm --

For a reproducible example,

    library(Rsamtools)
    example(countBam)
    which1 <- as(which, "GRanges")
    ## which2 might be where your data actually starts
    which2 <- which1[c(2,1,3)]
    values(which2)[["OriginalOrder"]] <- 1:3
    param <- ScanBamParam(which=which2)
    cnt <- countBam(fl, param=param)

What happens is that ScanBamParam converts its argument to an IRangesList, using split(ranges(which2), seqnames(which2)). So do the same for the values and unlist

    cntVals <- unlist(split(values(which2), seqnames(which2)))

then cbind coerced values

    cbind(cnt, as.data.frame(cntVals))

with

    > which2
    GRanges with 3 ranges and 1 elementMetadata value
        seqnames       ranges strand | OriginalOrder
           <rle>    <iranges>  <rle> |     <integer>
    [1]     seq2 [ 100, 1000]      * |             1
    [2]     seq1 [1000, 2000]      * |             2
    [3]     seq2 [1000, 2000]      * |             3

    seqlengths
     seq1 seq2
       NA   NA
    > cbind(cnt, as.data.frame(cntVals))
      space start  end width    file records nucleotides OriginalOrder
    1  seq1  1000 2000  1001 ex1.bam     612       21549             2
    2  seq2   100 1000   901 ex1.bam    1169       41235             1
    3  seq2  1000 2000  1001 ex1.bam     642       22640             3

Martin

--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

ADD COMMENT • link 13.1 years ago Martin Morgan 25k
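
The reordering described above can be reproduced without a BAM file. What follows is only a minimal base-R sketch, assuming a plain data.frame as a stand-in for the GRanges object; the names `regions` and `grouped` are made up for illustration and are not part of Rsamtools. It shows that splitting by sequence name and gluing the pieces back together returns rows grouped by factor level rather than in the order they were supplied.

```r
# Hypothetical stand-in for which2: three regions in the caller's order.
regions <- data.frame(
    seqnames = factor(c("seq2", "seq1", "seq2"), levels = c("seq1", "seq2")),
    start = c(100, 1000, 1000),
    end = c(1000, 2000, 2000),
    OriginalOrder = 1:3
)

# split() groups the rows by factor level (seq1 first, then seq2);
# rbind-ing the groups back together keeps that grouped order.
grouped <- do.call(rbind, split(regions, regions$seqnames))
rownames(grouped) <- NULL

print(grouped$OriginalOrder)
# [1] 2 1 3
```

The printed permutation matches the OriginalOrder column in the cbind output above, which is why carrying such a column through the same split makes the mapping back to the input explicit.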




Martin,

Hmmm, thanks .... I think I'm getting it....

Following your lead, I can directly re-order cnt in the OriginalOrder as:

    > cnt[sort(unlist(split(values(which2), seqnames(which2)))$OriginalOrder,index.return=TRUE)$ix,]
      space start  end width    file records nucleotides
    2  seq2   100 1000   901 ex1.bam    1169       41235
    1  seq1  1000 2000  1001 ex1.bam     612       21549
    3  seq2  1000 2000  1001 ex1.bam     642       22640

... which can usefully(?) be abstracted as

    countBamWhich <- function (file, which, index=file, ...) {
        ### wrapper for countBam that reorders its results to agree with
        ### 'which', a required ScanBamParam argument. ... params are
        ### additional ScanBamParam options.
        ###
        ### ASSUMES: internal implementation detail of ScanBamParam. c.f.
        ### http://permalink.gmane.org/gmane.science.biology.informatics.conductor/34208
        values(which)[['OriginalOrder']] <- 1:length(which)
        ## build the param once, so that the ... options are actually used
        param <- ScanBamParam(which=which, ...)
        CBW <- countBam(file, index, param=param)
        CBW[sort(unlist(split(values(which), seqnames(which)))$OriginalOrder,index.return=TRUE)$ix,]
    }

allowing me to write

    > countBamWhich(fl, which2)
      space start  end width    file records nucleotides
    2  seq2   100 1000   901 ex1.bam    1169       41235
    1  seq1  1000 2000  1001 ex1.bam     612       21549
    3  seq2  1000 2000  1001 ex1.bam     642       22640

All in favor?

~ Malcolm

Malcolm Cook
Stowers Institute for Medical Research - Bioinformatics
Kansas City, Missouri USA

ADD REPLY • link 13.1 years ago Malcolm Cook ★ 1.6k
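
The index trick in the wrapper can be checked on its own. The following is a hedged base-R sketch with invented data (the data.frame `counted` and its values are hypothetical, standing in for countBam output). It uses four regions so that the permutation is not its own inverse, and it illustrates that `order()` yields the same row index as `sort(..., index.return=TRUE)$ix`.

```r
# Hypothetical counts, already in the grouped (per-seqname) order;
# OriginalOrder records the input position of each row.
counted <- data.frame(
    space = c("seq1", "seq1", "seq2", "seq2"),
    records = c(10, 20, 30, 40),
    OriginalOrder = c(2L, 4L, 1L, 3L)
)

# order() gives, for input positions 1, 2, 3, ... in turn,
# the row of 'counted' that holds that position.
restored <- counted[order(counted$OriginalOrder), ]

# sort(..., index.return = TRUE)$ix is the same permutation.
same <- all(order(counted$OriginalOrder) ==
            sort(counted$OriginalOrder, index.return = TRUE)$ix)

print(restored)
print(same)
```

Under these assumptions `restored` lists the rows in input order (OriginalOrder 1 through 4), so either spelling of the index would serve in a wrapper like the one above.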


Hello Bioconductor community,

We were wondering if it would be possible to perform differential expression analysis of exon expression using DESeq or edgeR. Would the statistical assumptions be the same, and has anyone attempted this type of analysis?

Any feedback or insights would be really appreciated!

Cheers,
Andrew

ADD REPLY • link 13.1 years ago adeonari@mrc-lmb.cam.ac.uk ▴ 30


Are you asking about Affy exon expression arrays, or about exon-level data from RNA-seq?

Vasu

ADD REPLY • link 13.1 years ago vasu punj ▴ 80
