Question: How to do Affy ST array analysis
5.5 years ago, Guest User wrote:
Hi all!

I am feeling a little bit stupid, but I have been searching for two days now (maybe I am searching wrong?!) and could not figure it out. I want to analyze a Human Gene ST array. I know that there is the oligo package, and I found the annotation package pd.hugene.2.0.st, but I do not know how to do the steps. I am used to the affy package and affy pipelines.

All I find when searching for solutions are ways to make your own annotation package. That is not necessary, I think, because I found pd.hugene.2.0.st. Or am I wrong? Somehow I can't use it in the same way as, for example, the hgu133a.db package that provides me the annotations. I'm really lost...

I want to do:
- probe-level analysis (similar to affyPLM)
- RMA normalization (somehow oligo does this, I think)
- filtering of probes that are controls (as one does with affy: AFFX, for hgu133a)
- annotation of probesets (normally, I would use the IQR filter to get unique Entrez IDs, but how do I do this with the ST array?)

I know that there is something about probe and transcript to be aware of, and core? But I cannot connect the workflow.

I would be so happy if someone helped me or pointed me to the right docs. (The oligo user guide is not so helpful for me because I still don't understand what to do with what and when...) Sorry!

Thanks!
Ninni

-- Sent via the guest posting facility at bioconductor.org.
Answer: How to do Affy ST array analysis
5.5 years ago, Bernd Klaus (Germany) wrote:
Hi Ninni,

I guess a very simple workflow would be:

1. Read the CEL files.

    library(oligo)
    rawData = read.celfiles(< character vector of celfiles >)

2. Perform RMA and get "transcript cluster" summarized data back, using only "core" genes ("safely" annotated genes according to Affymetrix). This is the default in oligo.

    Eset = rma(rawData, target = "core")

3. Load the annotation package and annotate the "transcript clusters" with some of the information contained in that package.

    ## load annotation package
    library("hugene20sttranscriptcluster.db")

    ## look up one annotation column ('what') for every feature in Eset;
    ## features without a mapping in 'db' get the value of 'missing'
    annotateGene = function(db, what, missing) {
        tab = toTable(db[intersect(featureNames(Eset), mappedkeys(db))])
        mt = match(featureNames(Eset), tab$probe_id)
        ifelse(is.na(mt), missing, tab[[what]][mt])
    }

    fData(Eset)$symbol = annotateGene(hugene20sttranscriptclusterSYMBOL, "symbol", missing = NA)
    fData(Eset)$genename = annotateGene(hugene20sttranscriptclusterGENENAME, "gene_name", missing = NA)
    fData(Eset)$ensembl = annotateGene(hugene20sttranscriptclusterENSEMBL, "ensembl_id", missing = NA)

4. After that, keep only the "transcript clusters" that have an Ensembl gene ID (for example).

Hope that helps,
Bernd

On Wed, 7 May 2014 05:06:00 -0700 (PDT), "Ninni Nahm [guest]" <guest at bioconductor.org> wrote:
> Hi all!
> [...]
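Step 4 above is described without code, so here is a minimal sketch of what it could look like. It runs on small made-up stand-ins for exprs(Eset) and fData(Eset) so that it works in plain R; the names exprMat, featureTab and keptClusters, the IDs and all values are hypothetical. On a real ExpressionSet the first part would presumably be Eset[!is.na(fData(Eset)$ensembl), ]. The second part is one possible way to get a single transcript cluster per gene (largest IQR across samples), along the lines of the IQR filter mentioned in the question; it is an illustration, not something prescribed in the answer.

```r
# Toy stand-ins for exprs(Eset) and fData(Eset); all values are made up.
exprMat <- matrix(
  c(5.1, 5.3, 5.0, 5.2,
    7.0, 9.5, 6.8, 9.9,
    4.0, 4.1, 4.2, 4.0,
    6.0, 8.0, 6.5, 8.5),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("tc1", "tc2", "tc3", "tc4"), paste0("sample", 1:4))
)
featureTab <- data.frame(
  ensembl = c("ENSG_A", "ENSG_A", NA, "ENSG_B"),  # hypothetical gene IDs
  row.names = rownames(exprMat),
  stringsAsFactors = FALSE
)

# Step 4: keep only transcript clusters that have an Ensembl gene ID.
hasId <- !is.na(featureTab$ensembl)
exprMat <- exprMat[hasId, , drop = FALSE]
featureTab <- featureTab[hasId, , drop = FALSE]

# Optional: where several clusters map to the same gene, keep the one
# with the largest interquartile range across samples.
iqr <- apply(exprMat, 1, IQR)
bestPerGene <- tapply(seq_along(iqr), featureTab$ensembl,
                      function(idx) idx[which.max(iqr[idx])])
keptClusters <- rownames(exprMat)[as.integer(bestPerGene)]
print(keptClusters)
```

Here tc3 is dropped because it has no gene ID, and tc1 loses to tc2 because both map to the same gene and tc2 varies more across the samples.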
Thank you! That was very, very helpful!

I wanted to ask if I can use the hugene20sttranscriptcluster.db package for all hugene arrays? I have one more to analyze, which is an ST 1.0 array.

Best,
Ninni

On Wed, May 7, 2014 at 2:57 PM, Bernd Klaus <bernd.klaus@embl.de> wrote:
> Hi Ninni,
> [...]
Reply written 5.5 years ago by Ninni Nahm
Hi Ninni,

no, you need to switch. There is an annotation database for every single platform (thanks to Jim MacDonald!); search here:

http://bioconductor.org/packages/release/BiocViews.html#___AnnotationData

The 1.0 one is here:

http://bioconductor.org/packages/release/data/annotation/html/hugene10sttranscriptcluster.db.html

Best wishes,
Bernd

On Wed, 7 May 2014 18:17:19 +0200, Ninni Nahm <ninninahm at gmail.com> wrote:
> Thank you! That was very very helpful!
> [...]
Reply written 5.5 years ago by Bernd Klaus
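Because each platform needs its own annotation package, it may be convenient to derive the package name from the platform design name instead of hard-coding it. The sketch below is only a naming heuristic: pdToTranscriptClusterDb is a hypothetical helper, not part of oligo, and the rule has only been checked against the two package names from this thread (pd.hugene.2.0.st, and pd.hugene.1.0.st.v1 as the assumed design package for the 1.0 array). It also assumes that annotation(rawData) returns the pd.* name for data read with read.celfiles().

```r
# Hypothetical helper: guess the transcript-cluster annotation package
# that belongs to a platform design ("pd.*") package name.
pdToTranscriptClusterDb <- function(pdName) {
  core <- sub("^pd\\.", "", pdName)          # "pd.hugene.2.0.st" -> "hugene.2.0.st"
  core <- sub("\\.v[0-9]+$", "", core)       # drop a trailing version tag such as ".v1"
  core <- gsub(".", "", core, fixed = TRUE)  # "hugene.2.0.st" -> "hugene20st"
  paste0(core, "transcriptcluster.db")
}

print(pdToTranscriptClusterDb("pd.hugene.2.0.st"))
print(pdToTranscriptClusterDb("pd.hugene.1.0.st.v1"))
```

With real data this could be used as library(pdToTranscriptClusterDb(annotation(rawData)), character.only = TRUE), after checking on the Bioconductor annotation page that the guessed package actually exists.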
Thank you, Bernd! And of course Jim!! :)

On Wed, May 7, 2014 at 6:22 PM, Bernd Klaus <bernd.klaus@embl.de> wrote:
> Hi Ninni,
> [...]
Reply written 5.5 years ago by Ninni Nahm
Answer: How to do Affy ST array analysis
4.4 years ago, benhrif.oussama wrote:

Hello,

Can you give me an example of a pipeline where I use rma(rawData, target='probeset') for the exon summarization level?

Thanks
