Md.Mamunur Rashid
Dear List,
I am trying to annotate some Illumina microarray probes (HumanHT-12 v3)
from an experiment of 96 samples. Apparently there are some differences
between annotating with illuminaHumanv3BeadID.db and lumiHumanAll.db.
Here is what I have done, in brief:
1. Read and processed the raw data file with the lumi package
(including detection p-value filtering).
2. Found some differentially expressed genes using a linear model.
Now my topTable looks like this:
> top<- topTable(aneu348_fit2,coef=2,adjust="BH")
> top
           ID      logFC  AveExpr         t      P.Value  adj.P.Val        B
19287  730612  0.1968519 6.506182  5.446788 1.729911e-06 0.03507526 4.750244
19897 3520463  0.3286017 7.057259  5.390423 2.103278e-06 0.03507526 4.580566
3028  2650605  0.4613558 7.115252  5.309757 2.780147e-06 0.03507526 4.338214
3956  3310538  0.5527499 8.000359  5.113185 5.466881e-06 0.05172900 3.750403
1626  3390605 -0.2277937 6.930935 -4.890353 1.168046e-05 0.07558592 3.089894
25875 6280470  0.5706626 7.235376  4.841339 1.378711e-05 0.07558592 2.945587
34978 6760546  0.3195073 7.659098  4.783197 1.677400e-05 0.07558592 2.774918
32380 3940692 -0.2995773 8.258397 -4.756620 1.834288e-05 0.07558592 2.697098
35264 1740020 -0.3454641 7.384281 -4.734429 1.976252e-05 0.07558592 2.632216
33126 6040398  0.5112817 7.517186  4.731312 1.997039e-05 0.07558592 2.623109
Then I try to annotate the top IDs with gene name, gene symbol, Entrez ID
and other fields.
** As you can see from the topTable output, my probe IDs are the
Array_Address_Id values (according to the Illumina manifest file
HumanHT-12_v3_0_R2_11283641_A).
> geneSymbol<- getSYMBOL(aneu348_probeList, 'illuminaHumanv3BeadID.db')
> geneName<- sapply(lookUp(aneu348_probeList,
'illuminaHumanv3BeadID.db', 'GENENAME'), function(x) x[1])
This gives me the correct gene names and symbols (according to the
manifest file).
But when I try to convert these probe IDs using the IlluminaID2nuID() or
probeID2nuID() functions, they map to a completely different set of
gene names and symbols.
I then prepended "000" to all of my probe IDs and passed them to the
IlluminaID2nuID() function:
> top<- paste("000",top,sep="")
> illu<- IlluminaID2nuID(top)
Warning messages:
1: In getChipInfo(IlluminaID, lib.mapping = lib.mapping, species = species, :
  Some input IDs can not be matched!
2: In if (!is.na(chipInfo$IDType)) { :
  the condition has length > 1 and only the first element will be used
> illu[1,] # Here illu[1,] holds the mapping for "000730612"
Search_Key ILMN_Gene Accession Symbol
NA NA NA NA
Probe_Id Array_Address_Id nuID
NA NA NA
For some reason it returns NA for a few of the probes, even though
passing them individually to the function returns the correct mapping:
> IlluminaID2nuID(top[1]) # here top[1] = "000730612"
Search_Key ILMN_Gene Accession Symbol Probe_Id
000730612 "ILMN_10981" "HTRA1" "NM_002775.3" "HTRA1" "ILMN_1676563"
Array_Address_Id nuID
000730612 "000730612" "ZEObIyCCVRqJSjqHrY"
So my questions are:
1. Why can the above functions not find an entry for a few probe IDs
even though the entries are present?
2. Is the workaround I found (prepending "000") correct, or are there
better options?
3. Even though it is a HumanHT-12 chip, getChipInfo() reports a
different chip version:
> getChipInfo(aneu348_N)
$chipVersion
[1] "HumanWG6_V2_11223189_B"
4. I am trying to develop a workflow that will handle data with both
types of probe ID pattern (i.e. "ILMN_1805" or "730612"). What would be
the standard way to annotate both types of data?
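To make question 4 concrete, here is a minimal sketch of the ID-type
check such a workflow would need before choosing an annotation route.
It uses base R only. The function name classify_illumina_id and the
category labels are hypothetical, and the regular expressions are an
assumption based on the two patterns quoted above, not on anything the
lumi package defines.

```r
# Hypothetical helper (not part of lumi or any annotation package):
# classify Illumina identifiers purely by their textual pattern.
classify_illumina_id <- function(ids) {
  ids <- trimws(as.character(ids))
  type <- rep("unknown", length(ids))
  # Assumed pattern for IDs such as "ILMN_1805"
  type[grepl("^ILMN_[0-9]+$", ids)] <- "ilmn_prefixed"
  # Assumed pattern for numeric array addresses such as "730612",
  # with or without leading zeros
  type[grepl("^[0-9]+$", ids)] <- "array_address_id"
  type
}

# Group a mixed vector of IDs by detected type
ids <- c("ILMN_1805", "730612", "000730612", "HTRA1")
split(ids, classify_illumina_id(ids))
```

Each group could then be passed to whichever annotation route suits
that ID type. Note that a pattern check alone cannot tell apart the
different ILMN_-prefixed columns (for example Search_Key and Probe_Id).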
Please accept my apology if this mail seems long. I tried to provide as
much detail as I could.
Thanks in advance,
Regards,
Mamun