Manufacturer ids in lumiHumanV1


Benjamin Otto ▴ 830


Dear bioconductors,

I just installed the lumiHumanV1 package via the biocLite() procedure yesterday. The identifiers of the individual probes look very strange to me. Besides not appearing in the annotation file provided by Illumina, the composition of the names itself made me, let me say, somewhat suspicious! Here is a quick extract. Has anyone observed this before?

    > xx <- as.list(lumiHumanV1ACCNUM)
    > xx[1:5]
    $x00WAIBIVaMvS3EVQ0
    [1] NA

    $`9iFKOnl5SKK4SnkAS8`
    [1] "NM_032360"

    $Ke_INOf6KSdNz6Qeek
    [1] "NM_030945"

    $rKBxx.U0SoWkRRVxac
    [1] "NM_025258"

    $ilenrd0XnE.4v_3Zec
    [1] NA

    > sessionInfo()
    R version 2.5.0 (2007-04-23)
    i386-pc-mingw32

    locale:
    LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

    attached base packages:
    [1] "tools"     "stats"     "graphics"  "grDevices" "utils"     "datasets"
    [7] "methods"   "base"

    other attached packages:
    lumiHumanV1        lumi    annotate        mgcv        affy      affyio
        "1.2.0"     "1.2.0"    "1.14.1"    "1.3-23"    "1.14.0"     "1.4.0"
        Biobase
       "1.14.0"

Cheers,
Benjamin

Benjamin Otto
University Hospital Hamburg-Eppendorf
Institute For Clinical Chemistry
Martinistr. 52
D-20246 Hamburg
Tel.: +49 40 42803 1908
Fax.: +49 40 42803 4971

Annotation lumiHumanV1 annotate lumi

Updated 16.8 years ago by Pan Du • written 16.8 years ago by Benjamin Otto


Sean Davis 21k


United States

These IDs are nuIDs. They are used by the lumi package and are a base-64 encoded version of the actual sequence of the probe. You may find that the "illumina...." data packages, rather than the "lumi...." data packages, have the standard keys. However, if you are using the lumi package for dealing with Illumina data, the "lumi...." data packages will work just fine.

Perhaps Pan Du or Simon Lin will fill in some more details about why they use nuID.

Sean
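To make the idea of a base-64 encoded probe sequence concrete, here is a minimal base-R sketch. It is a simplified illustration and not the exact nuID algorithm: as I understand the nuID paper, the real scheme also prepends a check character, which is left out here. The 64-symbol alphabet and the helper names `encode_seq` and `decode_id` are assumptions made for this example. For real data, the lumi package documents its own conversion helpers (`seq2id` and `id2seq`).

```r
# Simplified illustration only: NOT the exact nuID algorithm (no check character).
# Assumed alphabet: A-Z, a-z, 0-9, "_" and "." (64 symbols).
alphabet <- c(LETTERS, letters, 0:9, "_", ".")

# Hypothetical helper: pack a nucleotide string into base-64 symbols.
# Each nucleotide is a base-4 digit (A=0, C=1, G=2, T=3), so three
# nucleotides (4^3 = 64 combinations) fit into one symbol.
encode_seq <- function(seq) {
  nt <- strsplit(toupper(seq), "")[[1]]
  digits <- match(nt, c("A", "C", "G", "T")) - 1L
  stopifnot(!anyNA(digits))                 # only A, C, G, T are accepted
  pad <- (3L - length(digits) %% 3L) %% 3L  # pad with "A" (0) to a multiple of 3
  digits <- c(digits, rep(0L, pad))
  m <- matrix(digits, nrow = 3)             # one column per nucleotide triplet
  idx <- m[1, ] * 16L + m[2, ] * 4L + m[3, ]
  paste(alphabet[idx + 1L], collapse = "")
}

# Hypothetical helper: unpack the symbols again. The original length n must
# be supplied because this simplified scheme does not record the padding.
decode_id <- function(id, n) {
  idx <- match(strsplit(id, "")[[1]], alphabet) - 1L
  digits <- as.vector(rbind(idx %/% 16L, (idx %/% 4L) %% 4L, idx %% 4L))
  paste(c("A", "C", "G", "T")[digits[seq_len(n)] + 1L], collapse = "")
}

id <- encode_seq("ACGTAC")
id                # "Gx"
decode_id(id, 6)  # "ACGTAC"
```

Because the identifier is derived from the sequence itself, two probes with the same sequence always get the same identifier, whatever the manufacturer calls them.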

Posted 16.8 years ago by Sean Davis


Pan Du ★ 1.2k


Hi Benjamin,

Directly using the Illumina Target identifier or probe identifier can cause problems because of their imperfect design. For example, there are duplicated Target IDs on the same chip, and the same probe can have different Target IDs. The nuID is designed to solve these problems. It can be converted directly to the probe sequence, which makes it easy to get the latest annotation. Also, annotation packages like lumiHumanV1 provide the mappings from TargetID to nuID and from probe ID to nuID.

Please check the vignette of the lumi package and the paper published in Biology Direct 2007, 2:16: "nuID: a universal naming scheme of oligonucleotides for Illumina, Affymetrix, and other microarrays".

Tell me if you have any questions.

Pan

Posted 16.8 years ago by Pan Du


Hi everyone,

My question is of a statistical nature. I am trying to finalize the analysis of real-time PCR experiments. I have 4 technical replicates (pipetting errors) and three biological replicates (repeated treatment). I use gapdh as an external reference gene.

1) Can I average all replicates regardless of whether they are technical or biological?
2) Similarly, can I use the standard error from the combined standard deviations, and what would the population size be in that case? (Would it be 4 technical x 3 biological?)

If I cannot do the above, how does one proceed with these two different types of replication?

I can imagine that the technical replication reflects only on the threshold cycles and the biological replication on the actual fold difference [2^-(threshold cycle)], but how can I combine the two differently calculated standard errors?

I apologize that this e-mail is not directly linked to a BioC package, but there are no packages for real-time PCR and I had nowhere else to turn!

Thanks,
Niki

Swann Building
The King's Buildings
University of Edinburgh
Edinburgh, EH9 3JR
Scotland, UK
tel: (0044)0131-6507072
fax: (0044)0131-6505379
R.Athanasiadou at sms.ed.ac.uk

Posted 16.8 years ago by r.athanasiadou


Hello Niki,

You should not treat biological and technical replicates in the same way. The standard deviations provided by the thermocycler software only pertain to the within-sample (purely technical) variability. Moreover, to infer treatment effects in a broad inference context you should refer to the sample-to-sample variability (so-called biological variability). For this last purpose you have 4 reps (not 12).

You can use a hierarchical model to account for the different levels of replication. This can be done using a linear mixed model that includes both the test and the control gene and computing the (corrected) contrasts of interest.

Please follow the link to a short paper with a linear model (model [3]) that we have proposed for this type of analysis. We did not use R for the computations, but perhaps the lme4 package could be used.

http://www.wcgalp8.org.br/wcgalp8/articles/paper/23_528-1915.pdf

Best,
Juan Pedro

Juan Pedro Steibel
Postdoctoral researcher
Statistical Genetics
Department of Animal Science
Michigan State University
1205-I Anthony Hall
East Lansing, MI 48823 USA
Phone: 1-517-353-5102
E-mail: steibelj at msu.edu

Posted 16.8 years ago by Juan Pedro Steibel


Erratum: you have 3 true reps. Sorry about that...

JP
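As a small illustration of why the number of true replicates is 3 and not 12, here is a base-R sketch. The Ct values are made up for the example, and the approach shown (averaging technical replicates within each biological replicate first) is a simple stand-in for the mixed model suggested above, not the model from the linked paper. A mixed model fitted with a package such as lme4 would handle the two levels directly.

```r
# Hypothetical Ct values: 3 biological replicates x 4 technical replicates each.
ct <- data.frame(
  bio = rep(c("b1", "b2", "b3"), each = 4),
  ct  = c(20.1, 20.3, 19.9, 20.1,
          21.0, 21.2, 20.8, 21.0,
          19.5, 19.7, 19.3, 19.5)
)

# Step 1: collapse the technical replicates to one mean per biological replicate.
bio_means <- tapply(ct$ct, ct$bio, mean)

# Step 2: the standard error is based on the biological replicates only (n = 3).
n_bio  <- length(bio_means)
se_bio <- sd(bio_means) / sqrt(n_bio)

# For comparison: pooling all 12 wells as if they were independent samples.
se_naive <- sd(ct$ct) / sqrt(nrow(ct))

n_bio     # 3
se_bio    # larger: reflects sample-to-sample (biological) variability
se_naive  # smaller: treats technical replicates as independent samples
```

With these toy numbers the pooled standard error comes out at less than half of the one based on the biological means, which shows how treating all 12 wells as independent would overstate the precision of the estimate.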

Posted 16.8 years ago by Juan Pedro Steibel