[Bioc-sig-seq] ChIPpeakAnno get sequences from ranged data
0
0
Entering edit mode
Julie Zhu ★ 4.3k
@julie-zhu-3596
Last seen 4 months ago
United States
Eric, You need to put the as.character() in space, what happened is that without as.character, the chromosome is treated as factor. Please use the exact code I sent to you. It should work. Regarding strand, reverseComplement in Biostrings package should help. But I will try to add the strand to the function before the new release. Best regards, Julie On 3/25/10 10:54 AM, "Eric Bremer" <ericgbremer@gmail.com> wrote: Hi Julie, Thanks so much for the help!! Below is the result I get when using chromosome names. I still get a numeric value that does not correspond to the chromosome from the original table (2R). Another question is how do I take into account "strand" when retreiving the sequences? Thanks again, Eric > peaks = RangedData(IRanges(start=c(SeqTest$start), end=c(SeqTest$end), names=c(SeqTest$peakID)), space=c(SeqTest$Chrom)) > head(peaks) RangedData with 6 rows and 0 value columns across 1 space space ranges | <character> <iranges> | 6010 1 [16831502, 16831539] | 7922 1 [20702347, 20702374] | 2598 1 [ 8146910, 8146941] | 7048 1 [18952292, 18952318] | 5159 1 [14120268, 14120294] | 7367 1 [19830055, 19830090] | On Wed, Mar 24, 2010 at 11:56 AM, Zhu, Julie <julie.zhu@umassmed.edu> wrote: Hi Eric, Please try the following code. You had the gene names in space which needs to contain chromosome names. peaks = RangedData(IRanges(start=SeqTest$start, end=SeqTest$end, names=SeqTest$peakID), space=as.character(SeqTest$Chrom)) library(BSgenome.Dmelanogaster.UCSC.dm2) peaksWithSequences = getAllPeakSequence(peaks, upstream=60, downstream=40, genome=Dmelanogaster) Best regards, Julie On 3/21/10 8:20 PM, "Julie Zhu" <julie.zhu@umassmed.edu <http:="" julie.zhu@umassmed.edu=""> > wrote: Hi Eric, Is CAGE the cap analysis of gene expression? Thanks! I think the error has to do with the chromosome naming since there are only chromosome X, 2L, 2R, 3L, 3R and 4 are available in drosophila genome. I am revising the ChIPpeakAnno paper which is due on Tuesday. Could I get back to you after that? Thank so much! Meanwhile, if others could help out, that would be very much appreciated! Best regards, Julie On 3/21/10 12:52 PM, "Eric Bremer" <ericgbremer@gmail.com <http:="" ericgbremer@gmail.com=""> > wrote: HI Julie, I am trying to get sequence data from the CAGE peaks identified to be within 50 bases of a TSS. With your previous help I was able to nicely idenitify these using ChIPeakAnno. I seem to be having trouble getting the gene identifier into ranged data following the example in the Vignette. I have pasted my approach below. Thanks in advance for any suggestions. > peaks = RangedData(IRanges(start = c(100, 500), end = c(300, + 600), names = c("peak1", "peak2")), space = c("NC_008253", + "NC_010468")) > peaks RangedData with 2 rows and 0 value columns across 2 spaces space ranges | <character> <iranges> | peak1 NC_008253 [100, 300] | peak2 NC_010468 [500, 600] | > SeqTest = read.table(file="2Rp_50base.txt", header=TRUE) > head(SeqTest) Chrom start end length peakID strand FlyBaseGene 1 2R 16831502 16831539 38 6010 1 FBgn0000044 2 2R 20702347 20702374 28 7922 1 FBgn0000157 3 2R 8146910 8146941 32 2598 1 FBgn0000253 4 2R 18952292 18952318 27 7048 1 FBgn0000487 5 2R 14120268 14120294 27 5159 1 FBgn0000658 6 2R 19830055 19830090 36 7367 1 FBgn0001123 > peaks = RangedData(IRanges(start=c(SeqTest$start), end=c(SeqTest$end), names=c(SeqTest$peakID)), space=c(SeqTest$FlyBaseGene)) > head(peaks) RangedData with 6 rows and 0 value columns across 247 spaces space ranges | <character> <iranges> | 6010 1 [16831502, 16831539] | 7922 2 [20702347, 20702374] | 2598 3 [ 8146910, 8146941] | 7048 4 [18952292, 18952318] | 5159 5 [14120268, 14120294] | 7367 6 [19830055, 19830090] | > peaks$space[1:6] [1] "1" "2" "3" "4" "5" "6" > library(BSgenome.Dmelanogaster.UCSC.dm2) > peaksWithSequences = getAllPeakSequence(peaks, upstream=60, downstream=40, genome=Dmelanogaster) Error in .getOneSeq(x, name) : sequence chr1 not found Error in subseq(.getOneSeq(x, name), start = start, end = end, width = width) : error in evaluating the argument 'x' in selecting a method for function 'subseq' [[alternative HTML version deleted]] _______________________________________________ Bioc-sig-sequencing mailing list Bioc-sig-sequencing@r-project.org <http: bioc-sig-="" sequencing@r-project.org=""> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing [[alternative HTML version deleted]]
Biostrings ChIPpeakAnno Biostrings ChIPpeakAnno • 919 views
ADD COMMENT

Login before adding your answer.

Traffic: 891 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6