Need assistance with the file preparation for ChIPpeakAnno. (Jennifer Yang from University of California, Santa Barbara)
Jennifer,

Please use the BED2RangedData function. Thanks! Please also take a look at the following post:
http://permalink.gmane.org/gmane.science.biology.informatics.conductor/28497

Best regards,

Julie

On 1/19/11 12:48 PM, "Chu-Ya (Jennifer) Yang" <chu-ya.yang at lifesci.ucsb.edu> wrote:

> Dear Prof. Zhu,
>
> From the example on the 3rd page of "The ChIPpeakAnno user's guide", it
> seems that the user can pass custom annotation data into the function
> annotatePeakInBatch. I tried to pass a custom annotation data file with
> the following commands,
>
>> mm9=read.table("/home/Jennifer/mm9_bed/mm9.bed",header=FALSE)
>> Reference=RangedData(IRanges(start=[ ,2], end=[ ,3], names=[ ,4],
> space=[ ,1], strand=[ ,6])
> Error: unexpected '[' in "Reference=RangedData(IRanges(start=["
>>
>
> I am wondering how I should define the vector so that the bed format can
> be converted into RangedData with this approach.
>
> Thank you,
>
> Jennifer
>
> Zhu, Lihua (Julie) said the following on 1/18/2011 9:19 AM:
>> Dear Jennifer,
>>
>> You may also want to contact the data provider as well.
>>
>> Best regards,
>>
>> Julie
>
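The "unexpected '['" error in the quoted commands arises because `[ ,2]` has no object in front of it: R needs the name of the data frame being indexed. A minimal base-R sketch of the column selection follows. The small table stands in for the real mm9.bed and its values are invented for illustration; the column names (chrom, chromStart, chromEnd, name, strand) are the ones used later in this thread.

```r
# Toy stand-in for read.table("mm9.bed", header = FALSE); values are invented.
mm9 <- data.frame(
  V1 = c("chr1", "chr1", "chr2"),    # chromosome
  V2 = c(100L, 500L, 200L),          # start
  V3 = c(200L, 650L, 300L),          # end
  V4 = c("geneA", "geneB", "geneC"), # feature name
  V5 = c(0, 0, 0),                   # score (unused here)
  V6 = c("+", "-", "+"),             # strand
  stringsAsFactors = FALSE
)

# Index the data frame by name: mm9[, 2], not a bare [ ,2].
test.bed <- data.frame(
  chrom      = mm9[, 1],
  chromStart = mm9[, 2],
  chromEnd   = mm9[, 3],
  name       = mm9[, 4],
  strand     = mm9[, 6],
  stringsAsFactors = FALSE
)
```

Building the data frame directly, rather than through cbind(), keeps the start and end columns numeric instead of coercing every column to character. A table shaped like test.bed is what the thread then passes to BED2RangedData.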
Jennifer,

The name column name=mm9[ ,4] has duplicates (needs to be unique).

Best regards,

Julie

On 1/19/11 4:47 PM, "Chu-Ya (Jennifer) Yang" <chu-ya.yang at lifesci.ucsb.edu> wrote:

> Dear Prof. Zhu,
>
> Thanks for the prompt reply and assistance. I checked the link and
> tried the following commands and here is the error message I obtained.
> Would you please help me check it?
>
> Thank you,
>
> Jennifer
>
>> mm9 <- read.table(file="/home/Jennifer/mm9_bed/mm9.bed",header=FALSE)
>> test.bed=data.frame(cbind(chrom=mm9[ ,1], chromStart=mm9[ ,2],
> chromEnd=mm9[ ,3], name=mm9[ ,4], strand=mm9[ ,6]))
>> test.rangedData=BED2RangedData(test.bed)
> Error in rownames<-(*tmp*, value = c("82082", "82083", "82084",
> "82084", :
> duplicate rownames not allowed
>>
>
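Before conversion, the name column can be checked for the duplicates described above using base R alone. This is a small sketch, not part of ChIPpeakAnno: the first four names are taken from the error message in the quoted post, the fifth is invented, and the variable names are hypothetical.

```r
# Toy name column; in the thread this would be mm9[, 4].
# The first four values come from the quoted error message, the last is invented.
feature_names <- c("82082", "82083", "82084", "82084", "82085")

# TRUE if at least one name occurs more than once.
has_duplicates <- anyDuplicated(feature_names) > 0

# The distinct names that are repeated.
repeated <- unique(feature_names[duplicated(feature_names)])
```

If has_duplicates is TRUE, the names have to be made unique before the table is used as row names, which is what the "duplicate rownames not allowed" error is reporting.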
Dear Jennifer,

I am wondering what the rationale is for having multiple coordinates for the same feature name. We could append a serial number to the feature names with multiple coordinates in a future release.

Best regards,

Julie

On 1/20/11 8:35 PM, "Chu-Ya (Jennifer) Yang" <chu-ya.yang at lifesci.ucsb.edu> wrote:

> Dear Prof. Zhu,
>
> Thank you for the assistance.
>
> We tried to convert the features with unique names and then convert to
> ranged data format and then convert back to the original names of the
> features. And then annotate the data with ChIPpeakAnno. It finally
> worked, and was not easy for us to do.
>
> Since ChIPpeakAnno is a very powerful annotation algorithm and it is
> very helpful to our data analysis, I am wondering whether it is possible
> that you would please consider to modify the BED2RangedData to
> accommodate more flexibility of the bed files, such as allow multiple
> rows with identical name identifier.
>
> Thank you,
>
> Jennifer
>
> Zhu, Lihua (Julie) said the following on 1/19/2011 2:19 PM:
>> Jennifer,
>>
>> The name column name=mm9[ ,4] has duplicates (needs to be unique).
>>
>> Best regards,
>>
>> Julie
>
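The workaround discussed in this exchange (give each feature a unique name, convert, then recover the original names) can be sketched in base R with make.unique(), which appends a serial suffix to repeated values. This is only an illustration of the idea under that assumption; it is not necessarily what Jennifer did or how a later ChIPpeakAnno release handles duplicates, and the variable names are hypothetical.

```r
# Name column with a duplicate, as in the quoted error message.
original_names <- c("82082", "82083", "82084", "82084")

# make.unique() appends ".1", ".2", ... to repeated values.
unique_names <- make.unique(original_names)

# Lookup table from each unique name back to its original name.
name_map <- setNames(original_names, unique_names)

# After conversion and annotation, the original names can be recovered
# from whatever unique names the results carry.
restored <- unname(name_map[unique_names])
```

The unique names would replace the name column before calling BED2RangedData, and name_map would be applied to the annotated output afterwards.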