ChIPpeakAnno


Julie Zhu ★ 4.3k

@julie-zhu-3596

Last seen 6 months ago

United States

Hi Hua,

Yes, I remember meeting you at CAMDA!

insideFeature is set to TRUE if the peak start is inside the gene (between gene start and gene end).

A positive (+) distancetoFeature means the peak is inside or downstream of the gene. A negative (-) distancetoFeature means the peak is upstream of the gene. Please note that distancetoFeature already takes strand information into account.

I suggest using the peak summit as the peak start, so that distancetoFeature is the distance from the peak location to the transcription start site.

Thanks!

Best regards,
Julie

On 2/3/10 3:51 PM, "Li, Hua" <hul@stowers.org> wrote:

> Julie:
>
> This is Hua from Stowers Institute; we met at last year's CAMDA.
>
> I am using your ChIPpeakAnno package and have a question about
> "insideFeature". Could you tell me what a "+" or "-" number
> (distancetoFeature) means? And how do you decide which peaks are TRUE for
> "insideFeature"? I assume "insideFeature" means the peak is close to a gene,
> right?
>
> Thanks,
>
> Hua
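The sign convention described in the reply above can be sketched in base R. This is only an illustration under stated assumptions, not ChIPpeakAnno's own code: the helper names `signed_distance` and `inside_feature` and all coordinates are hypothetical, and the transcription start site is assumed to be the gene start on the "+" strand and the gene end on the "-" strand.

```r
# Illustrative helper, not part of ChIPpeakAnno: signed distance from a peak
# start to the transcription start site (TSS), taking strand into account.
signed_distance <- function(peak_start, gene_start, gene_end, strand) {
  # Assumption: the TSS is the gene start on "+" and the gene end on "-".
  ifelse(strand == "+", peak_start - gene_start, gene_end - peak_start)
}

# Illustrative helper: TRUE when the peak start lies between gene start and end.
inside_feature <- function(peak_start, gene_start, gene_end) {
  peak_start >= gene_start & peak_start <= gene_end
}

# Hypothetical coordinates: the same gene interval on each strand, two peaks each.
peaks <- data.frame(
  peak_start = c(900, 1500, 900, 2100),
  gene_start = c(1000, 1000, 1000, 1000),
  gene_end   = c(2000, 2000, 2000, 2000),
  strand     = c("+", "+", "-", "-"),
  stringsAsFactors = FALSE
)

peaks$distance <- with(peaks,
                       signed_distance(peak_start, gene_start, gene_end, strand))
peaks$inside   <- with(peaks,
                       inside_feature(peak_start, gene_start, gene_end))
print(peaks)
```

In this toy table the peak at 900 is upstream of the "+" strand gene (negative distance) but lies past the 3' end of the "-" strand gene (positive distance), which is the strand effect mentioned in the reply.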

Transcription ChIPpeakAnno • 1.1k views

ADD COMMENT • link 14.3 years ago • updated 14.2 years ago Julie Zhu ★ 4.3k


Julie Zhu ★ 4.3k


Hi Som,

Here is the code to create the RangedData from your input data frame x, assuming that the first column of x holds the start, the second column the end, and the third column the chromosome, as either "chr1" or "1" etc.

    myexp = RangedData(IRanges(start = as.numeric(x[,1]), end = as.numeric(x[,2])), space = as.character(x[,3]))

Best regards,
Julie

On 2/19/10 6:25 PM, "somnath bandyopadhyay" <genome1976@hotmail.com> wrote:

> Hi Julie,
> Thanks so much for your prompt reply. I have 6000 probesets with corresponding
> start, end and chromosome information along with strand information in an
> Excel/data frame format. Could you suggest an easy way to convert it into
> RangedData?
>
> Thanks,
> Som.
>
>> Date: Fri, 19 Feb 2010 09:43:09 -0500
>> Subject: Re: [BioC] ChIPpeakAnno
>> From: julie.zhu@umassmed.edu
>> To: genome1976@hotmail.com; bioconductor@stat.math.ethz.ch
>>
>> Hi Som,
>>
>> myPeakList is RangedData, where "start" is the start (or summit) of the
>> binding site, "end" is the end of the binding site, "names" is the name of
>> the binding site and "space" is the chromosome name.
>>
>> Here is how to create RangedData myexp from a list of binding sites.
>>
>> myexp = RangedData(IRanges(start = c(967654, 2010897, 2496704), end =
>> c(967754, 2010997, 2496804), names = c("Site1", "Site2", "Site3")), space =
>> c("1", "2", "3"))
>>
>> Please see ?annotatePeakInBatch for more examples. Thanks!
>>
>> Best regards,
>>
>> Julie
>>
>> *******************************************
>> Lihua Julie Zhu, Ph.D
>> Research Associate Professor
>> Program Gene Function and Expression
>> University of Massachusetts Medical School
>> 364 Plantation Street, Room 613
>> Worcester, MA 01605
>> 508-856-5256
>> http://www.umassmed.edu/pgfe/faculty/zhu.cfm
>>
>> On 2/19/10 8:52 AM, "somnath bandyopadhyay" <genome1976@hotmail.com> wrote:
>>
>>> Could anybody please tell me what the input list (myPeakList) for the
>>> ChIPpeakAnno program looks like?
>>>
>>> I have a list of probeset ids with genomic coordinates coming from a
>>> Nimblegen 385k ChIP-chip experiment.
>>>
>>> Thanks in advance,
>>>
>>> Som.
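A note for readers on a recent Bioconductor release: as far as I know, RangedData has since been retired from IRanges, and GRanges from the GenomicRanges package is the usual container for peaks now. Below is a minimal sketch of the same data frame conversion under that assumption. The sample values in `x`, the optional harmonisation of chromosome names to the "chrN" form, and the "SiteN" names are illustrative choices of mine, not part of the original reply. The GRanges step only runs if GenomicRanges is installed.

```r
# Hypothetical input mirroring the layout described in the reply:
# column 1 = start, column 2 = end, column 3 = chromosome ("chr1" or "1").
x <- data.frame(
  start = c("967654", "2010897", "2496704"),
  end   = c("967754", "2010997", "2496804"),
  chr   = c("1", "chr2", "3"),
  stringsAsFactors = FALSE
)

# Coerce coordinates to numbers and (optionally) harmonise chromosome names.
peaks <- data.frame(
  chr   = sub("^(chr)?", "chr", as.character(x[, 3])),
  start = as.numeric(x[, 1]),
  end   = as.numeric(x[, 2]),
  stringsAsFactors = FALSE
)
print(peaks)

# Build a GRanges object when GenomicRanges (and its dependency IRanges)
# is available; otherwise the cleaned data frame is all that is produced.
if (requireNamespace("GenomicRanges", quietly = TRUE)) {
  myexp <- GenomicRanges::GRanges(
    seqnames = peaks$chr,
    ranges   = IRanges::IRanges(
      start = peaks$start,
      end   = peaks$end,
      names = paste0("Site", seq_len(nrow(peaks)))
    )
  )
  print(myexp)
}
```

If your data frame also carries a strand column, GRanges accepts it through its `strand` argument, so the strand information Som mentions need not be dropped.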

ADD COMMENT • link 14.2 years ago Julie Zhu ★ 4.3k


Julie Zhu ★ 4.3k


Dear Som,

I think you could use pdInfoBuilder (http://www.bioconductor.org/packages/bioc/vignettes/pdInfoBuilder/inst/doc/BuildingPDInfoPkgs.pdf) to build an annotation package that stores all the probe information, including gene symbol, gene id etc., from the ndf and pos files. Building an annotation dataset is beyond the scope of ChIPpeakAnno, though. I am cc'ing the Bioc list to get experts' pointers on building annotation packages.

Best regards,
Julie

On 2/25/10 4:40 PM, "somnath bandyopadhyay" <genome1976@hotmail.com> wrote:

> Dear Julie,
>
> Thanks so much for the pointers and all the help. It worked! I need your
> help with one more thing and was wondering if using ChIPpeakAnno would be the
> ideal thing to do.
>
> I have been analyzing Nimblegen 385k ChIP-chip data of the mm8 RefSeq genome. I
> have the .ndf and .pos files for the chip. What would be the best way to build an
> annotation file for the entire chip so that at the end of the day I have a
> gene symbol, gene id etc. for each single probeset (~385,000) on the chip?
>
> Thanks a lot in advance!
> Best Regards,
> Som.

ADD COMMENT • link 14.2 years ago Julie Zhu ★ 4.3k


Hi Som,

regarding the analysis of ChIP-chip data from Nimblegen arrays, you can also have a look at the package Ringo, the data package ccTutorial and their vignettes. I would recommend remapping the probe sequences from the NDF to the current assembly of the mouse genome (mm9) using alignment tools like Exonerate or functions of Biostrings, and retrieving the current gene annotation from databases, e.g. using biomaRt.

Regards,
Joern

---
Joern Toedling
Institut Curie -- U900
26 rue d'Ulm, 75005 Paris, FRANCE
Tel. +33 (0)156246927

ADD REPLY • link 14.2 years ago Joern Toedling ▴ 450
