Question: problem illumina annotation with lumi
8.9 years ago, Md. Mamunur Rashid wrote:
Dear List,

I am trying to annotate some Illumina microarray probes (HumanHT-12 v3) from an experiment of 96 samples. Apparently there are some differences between annotating with illuminaHumanv3BeadID.db and lumiHumanAll.db.

Here is what I have done, in brief:

1. I read and processed the raw data file with the lumi package (this also includes detection p-value filtering).
2. I found some differentially expressed genes using a linear model.

My topTable now looks like this:

> top <- topTable(aneu348_fit2, coef=2, adjust="BH")
> top
           ID      logFC  AveExpr         t      P.Value  adj.P.Val        B
19287  730612  0.1968519 6.506182  5.446788 1.729911e-06 0.03507526 4.750244
19897 3520463  0.3286017 7.057259  5.390423 2.103278e-06 0.03507526 4.580566
3028  2650605  0.4613558 7.115252  5.309757 2.780147e-06 0.03507526 4.338214
3956  3310538  0.5527499 8.000359  5.113185 5.466881e-06 0.05172900 3.750403
1626  3390605 -0.2277937 6.930935 -4.890353 1.168046e-05 0.07558592 3.089894
25875 6280470  0.5706626 7.235376  4.841339 1.378711e-05 0.07558592 2.945587
34978 6760546  0.3195073 7.659098  4.783197 1.677400e-05 0.07558592 2.774918
32380 3940692 -0.2995773 8.258397 -4.756620 1.834288e-05 0.07558592 2.697098
35264 1740020 -0.3454641 7.384281 -4.734429 1.976252e-05 0.07558592 2.632216
33126 6040398  0.5112817 7.517186  4.731312 1.997039e-05 0.07558592 2.623109

Then I try to annotate the top IDs with gene name, gene symbol, Entrez ID and others.

** As you can see from the topTable output, my probe IDs are the Array_Address_Id values (according to the Illumina manifest file HumanHT-12_v3_0_R2_11283641_A).

> geneSymbol <- getSYMBOL(, 'illuminaHumanv3BeadID.db')
> geneName <- sapply(lookUp(aneu348_probeList, 'illuminaHumanv3BeadID.db', 'GENENAME'), function(x) x[1])

This gives me the correct gene name and symbol (according to the manifest file).

But when I try to convert these probe IDs using the IlluminaID2nuID() or probeID2nuID() method, they are transformed to a completely different set of gene names and symbols.
I then added "000" before all of my probe IDs and passed them to the IlluminaID2nuID() function:

> top <- paste("000", top, sep="")
> illu <- IlluminaID2nuID(top)
Warning messages:
1: In getChipInfo(IlluminaID, lib.mapping = lib.mapping, species = species, :
  Some input IDs can not be matched!
2: In if (!is.na(chipInfo$IDType)) { :
  the condition has length > 1 and only the first element will be used

> illu[1,]   # Here illu[1,] holds the mapping for "000730612"
      Search_Key        ILMN_Gene        Accession           Symbol
              NA               NA               NA               NA
        Probe_Id Array_Address_Id             nuID
              NA               NA               NA

For some reason it always shows NA for a few of the probes, even though the function returns the correct mapping when I pass them to it individually:

> IlluminaID2nuID(top[1])   # here top[1] = "000730612"
          Search_Key   ILMN_Gene Accession     Symbol  Probe_Id
000730612 "ILMN_10981" "HTRA1"   "NM_002775.3" "HTRA1" "ILMN_1676563"
          Array_Address_Id nuID
000730612 "000730612"      "ZEObIyCCVRqJSjqHrY"

So my questions are:

1. Why can the above functions not find an entry for a few of the probe IDs even though it is present (e.g. 000730612)?

2. Is the workaround I found (adding "000" at the beginning) correct, or are there better options?

3. Even though it is a HumanHT-12 chip, getChipInfo() gives:

   > getChipInfo(aneu348_N)
   $chipVersion
   [1] "HumanWG6_V2_11223189_B"

4. I am trying to develop a workflow that will handle data with both types of probe ID pattern (i.e. "ILMN_1805" or "730612"). What would be the standard pathway to annotate both types of data?

Please accept my apology if the mail seems long; I tried to provide as much detail as I could.

Thanks in advance,

Regards,
Mamun
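Questions 2 and 4 above both depend on telling the two ID styles apart and on how the numeric IDs are padded. The following is a minimal base-R sketch and is not part of lumi: `classify_illumina_id` and `pad_address_id` are hypothetical helper names, and the padding width is an assumption, because nothing in this thread establishes which width the mapping package expects. The last lines show one property of the fixed "000" prefix: 6-digit and 7-digit IDs end up with different lengths.

```r
# Hypothetical helpers for illustration only (not lumi functions).

# Label each ID as an "ILMN_..." probe ID, a purely numeric
# array address ID, or something else.
classify_illumina_id <- function(ids) {
  ids <- as.character(ids)
  ifelse(grepl("^ILMN_[0-9]+$", ids), "probe_id",
         ifelse(grepl("^[0-9]+$", ids), "array_address_id", "unknown"))
}

# Left-pad purely numeric IDs with zeros to a fixed width.
# ASSUMPTION: the required width must be checked against the mapping
# package; it is not established in this thread.
pad_address_id <- function(ids, width) {
  ids <- as.character(ids)
  num <- grepl("^[0-9]+$", ids)
  n_zeros <- pmax(0, width - nchar(ids[num]))
  ids[num] <- paste0(strrep("0", n_zeros), ids[num])
  ids
}

ids <- c("730612", "3520463", "ILMN_1805")
classify_illumina_id(ids)

# A fixed "000" prefix gives strings of unequal length:
nchar(paste("000", c("730612", "3520463"), sep = ""))   # 9 10
```

Padding to a fixed width gives every numeric ID the same length, which a constant prefix does not. Whether unequal lengths are the reason some IDs came back as NA is not confirmed here.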
8.9 years ago, Md. Mamunur Rashid wrote:
Dear Gilbert,

Thanks for the prompt reply. I am extremely sorry: I forgot to remove the following line:

"Apparently there are some differences between annotating with illuminaHumanv3BeadID.db and lumiHumanAll.db."

It would be great if you could kindly have a look at the questions at the bottom of the mail.

These are the session details:

R version 2.9.0 (2009-04-17)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] limma_3.2.1                    illuminaHumanv3BeadID.db_1.2.0
 [3] lumiHumanAll.db_1.6.1          lumiHumanIDMapping_1.2.1
 [5] lumi_1.10.2                    RSQLite_0.7-3
 [7] DBI_0.2-4                      preprocessCore_1.6.0
 [9] mgcv_1.5-6                     affy_1.22.1
[11] annotate_1.22.0                AnnotationDbi_1.6.1
[13] Biobase_2.4.1                  session_1.0.2

loaded via a namespace (and not attached):
[1] affyio_1.12.0   grid_2.9.0      lattice_0.17-26 nlme_3.1-96
[5] xtable_1.5-5

On 12/10/2009 06:12 PM, Gilbert Feng wrote:
> Hi, Mamun
>
> Thanks for your information.
>
> probeID2nuID calls "lumiHumanIDMapping", not "lumiHumanAll.db". The latter
> one only provides the mapping between nuIDs (based on probe sequences) and
> annotations in org.Hs.eg.db. The mapping between Illumina IDs and nuIDs
> requires the "lumiHumanIDMapping" package.
>
> Could you show us your sessionInfo()?
>
> Best
>
> Gilbert
>
> -----------------------------------------------
> Gang (Gilbert) Feng, PhD
> Biomedical Informatics Center
> Robert H. Lurie Comprehensive Cancer Center
> Northwestern University
> 750 N. Lake Shore Drive, 11th Floor (11-175e)
> Chicago, IL 60611
> Phone: 312-503-2358
> Email: g-feng (at) northwestern.edu
> -----------------------------------------------
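Gilbert's description implies a two-step lookup: Illumina ID to nuID through lumiHumanIDMapping, then nuID to annotation through lumiHumanAll.db. The sketch below illustrates that path under stated assumptions and is not a confirmed recipe from this thread. `annotate_via_nuid` is a hypothetical helper name. It assumes the Bioconductor packages lumi, lumiHumanIDMapping, annotate and lumiHumanAll.db are installed, and that IlluminaID2nuID() returns a matrix with a "nuID" column, as in the output shown in the question. The `species` argument should be checked against ?IlluminaID2nuID for the installed lumi version.

```r
# Sketch only: needs lumi, lumiHumanIDMapping, annotate and lumiHumanAll.db.
# annotate_via_nuid is a hypothetical helper name, not a lumi function.
annotate_via_nuid <- function(illumina_ids) {
  # Step 1: Illumina ID -> nuID (lumi consults lumiHumanIDMapping here).
  mapping <- lumi::IlluminaID2nuID(illumina_ids, species = "Human")
  nuids <- unname(mapping[, "nuID"])

  # Step 2: nuID -> gene symbol via lumiHumanAll.db.
  # IDs that were not matched in step 1 stay NA.
  matched <- !is.na(nuids)
  symbols <- rep(NA_character_, length(nuids))
  if (any(matched)) {
    symbols[matched] <- annotate::getSYMBOL(nuids[matched], "lumiHumanAll.db")
  }

  data.frame(id = illumina_ids, nuID = nuids, symbol = symbols,
             row.names = NULL, stringsAsFactors = FALSE)
}
```

Keeping the two steps separate makes it visible which IDs fail at the ID-mapping stage (NA in the nuID column) and which fail only at the annotation stage (a nuID but no symbol).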