two questions regarding Human Gene 1.0 ST arrays


Javier Pérez Florido ▴ 840

@javier-perez-florido-3121


Dear list,

I have two questions regarding Human Gene 1.0 ST arrays:

* Both NUSE and RLE plots need a fitted object produced by the fitPLM function. This function accepts raw data from a set of Hu Gene 1.0 arrays, but internally it performs an RMA normalization. Which level is used for this normalization? I cannot choose the level (i.e. core, full, extended) for the "internal" normalization.
* Are a splicing analysis using Hu Gene 1.0 arrays (core analysis) and a splicing analysis using Hu Exon 1.0 arrays (core analysis) equivalent in terms of results?

Thanks,
Javier

Normalization • 1.8k views

Written 13.0 years ago by Javier Pérez Florido • updated 13.0 years ago by cstrato


cstrato ★ 3.9k

@cstrato-908


Austria

Dear Javier,

Since you did not supply your sessionInfo(), it is not possible to answer your question.

However, please note that the levels core, extended and full exist only for Exon ST arrays, not for Gene ST arrays.

Best regards
Christian

Christian Stratowa, Vienna, Austria
e-mail: cstrato at aon.at

Written 13.0 years ago by cstrato


Sorry, I always forget sessionInfo(); see below.

You are right: for Human Gene ST arrays at the transcript level, only "core" mode exists. However, when I run

    fit <- fitPLM(OligoRaw)

where OligoRaw is the set of raw data, the size of the "fit" object is 257,430, and when I run

    OligoEset <- rma(OligoRaw, target="probeset")

OligoEset has 257,430 features. So the RMA procedure "inside" the fitPLM function performs a normalization at the probeset level.

On the other hand, summarization using RMA can be performed at the transcript level as follows:

    OligoEset <- rma(OligoRaw, target="core")

which yields around 33,000 transcripts.

I'm still confused about the concepts of "probeset" and "transcript" on Human Gene arrays.

For Exon arrays, probesets consist of four individual probes and usually target a particular exon of a particular gene. Thus exon-level intensity estimates correspond to the probeset-level estimates. Probesets are further grouped into transcript clusters, enabling gene-level estimates to be computed by summarizing data from all probes within the transcript cluster.

However, I don't know whether I can assert that, for Gene arrays, probesets target a particular exon of a particular gene and transcript clusters enable gene-level estimates, as on Exon arrays. The only difference would be that, for Exon arrays, we have two more "annotation levels" with lower confidence scores (extended and full). Otherwise, what is the use of summarizing at the probeset level on Hu Gene arrays?

This is related to my second question: can HuGene arrays detect alternative splicing events reliably? Can HuGene be used as an economical exon array for just the well-annotated content (core)?

Thanks again,
Javier

R version 2.13.0 (2011-04-13)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Spanish_Spain.1252 LC_CTYPE=Spanish_Spain.1252
    LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C
[5] LC_TIME=Spanish_Spain.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
 [1] pd.hugene.1.0.st.v1_3.0.2 hugene10sttranscriptcluster.db_7.0.1
     org.Hs.eg.db_2.5.0 RSQLite_0.9-4
 [5] DBI_0.2-5 AnnotationDbi_1.14.1 oligo_1.16.0 oligoClasses_1.14.0
 [9] affyPLM_1.28.5 preprocessCore_1.14.0 gcrma_2.24.1 affy_1.30.0
[13] Biobase_2.12.1

loaded via a namespace (and not attached):
[1] affxparser_1.24.0 affyio_1.20.0 Biostrings_2.20.0 bit_1.1-6 ff_2.2-1
    IRanges_1.10.0 splines_2.13.0 tools_2.13.0
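To make the difference between the two feature counts concrete, here is a minimal base-R sketch of summarizing the same probes at the probeset level and at the transcript-cluster level. It is only a toy illustration, not what oligo does internally: the intensities are random, the IDs (PS1, TC1, ...) are invented, and the helpers `summarize_block` and `summarize_by` are hypothetical names. Median polish is used because it is the usual RMA summarization step, but background correction and quantile normalization are left out entirely.

```r
# Toy probe annotation: 3 probesets x 2 probes, all in one transcript cluster.
# All IDs and intensities are invented for illustration.
set.seed(1)
probes <- data.frame(
  probe      = paste0("p", 1:6),
  probeset   = rep(c("PS1", "PS2", "PS3"), each = 2),
  transcript = "TC1",
  stringsAsFactors = FALSE
)
log_int <- matrix(rnorm(6 * 4, mean = 8), nrow = 6,
                  dimnames = list(probes$probe, paste0("array", 1:4)))

# Median-polish summary of one probes-by-arrays block:
# overall effect plus the per-array (column) effect.
summarize_block <- function(m) {
  fit <- medpolish(m, trace.iter = FALSE)
  fit$overall + fit$col
}

# Summarize every group of probes (rows) defined by `groups`.
summarize_by <- function(mat, groups) {
  t(sapply(split(seq_len(nrow(mat)), groups),
           function(idx) summarize_block(mat[idx, , drop = FALSE])))
}

probeset_level   <- summarize_by(log_int, probes$probeset)    # one row per probeset
transcript_level <- summarize_by(log_int, probes$transcript)  # one row per transcript cluster

dim(probeset_level)
dim(transcript_level)
```

The same six probes give three features when grouped by probeset and one feature when grouped by transcript cluster, which mirrors the 257,430 versus roughly 33,000 features seen with target="probeset" and target="core".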

Written 13.0 years ago by Javier Pérez Florido


While Exon ST arrays usually have 4 probes per probeset, Gene ST arrays have only 1-2 probes per probeset. Thus my personal opinion is not to use Gene ST arrays to detect alternative splicing events.

However, tools such as FIRMAGene exist for this purpose, see:
http://bioinf.wehi.edu.au/folders/firmagene/

Best regards
Christian

Written 13.0 years ago by cstrato


Dear Christian,

Thanks for your reply. But is it right to assert that, for Gene arrays, probesets target a particular exon of a particular gene and transcript clusters enable gene-level estimates as on Exon arrays, only with fewer probes per exon?

Thanks,
Javier

Written 13.0 years ago by Javier Pérez Florido


Dear Javier,

I would suggest looking at the Affymetrix annotation file "HuGene-1_0-st-v1.na31.hg19.probeset.csv". There you can see the following for e.g. SAMD11:

transcript_cluster_id = 7896761
number of probesets for 7896761 is 17
number of probes per probeset is 2 (with 2 exceptions)

Then you can open "HuEx-1_0-st-v2.na31.hg19.probeset.csv" and compare the data for yourself.

Best regards
Christian
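The comparison suggested above can also be done in R rather than by eye. The sketch below runs on a small stand-in table so that it is self-contained; the column names `probeset_id`, `transcript_cluster_id` and `probe_count` are assumed to match the Affymetrix probeset CSV, the probeset IDs are invented, and the two "exception" probe counts (1 and 4) are placeholders, since the thread does not say what they are.

```r
# Stand-in for the probeset annotation table (invented probeset IDs).
# 17 probesets for transcript cluster 7896761, 2 probes each with 2 exceptions;
# the exception values 1 and 4 are placeholders, not taken from the real file.
annot <- data.frame(
  probeset_id           = 1:17,
  transcript_cluster_id = 7896761,
  probe_count           = c(rep(2, 15), 1, 4),
  stringsAsFactors = FALSE
)

# With the real file, something like the following should load it, assuming
# the header comment lines start with '#':
# annot <- read.csv("HuGene-1_0-st-v1.na31.hg19.probeset.csv", comment.char = "#")

# Number of probesets per transcript cluster.
n_probesets <- table(annot$transcript_cluster_id)

# Distribution of probes per probeset within the SAMD11 cluster.
probe_dist <- table(annot$probe_count[annot$transcript_cluster_id == 7896761])

n_probesets
probe_dist
```

Running the same two `table()` calls on the HuEx-1_0-st-v2 probeset file would give the Exon array side of the comparison.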

Written 13.0 years ago by cstrato
