John Coulthard
@john-coulthard-3077
Dear List
I'm analyzing some (HumanHT12_V3_0_R1_11283641_A) Illumina data using
the lumi package. The raw data has 48803 probes, 36157 of which have
a gene symbol annotation and 12646 of which don't. When I use lumi to
convert the probe IDs to nuIDs and then annotate the nuIDs, I only get 25935
probes annotated with a gene symbol.
That is 10222 fewer than in the raw data. There can't be that many probes
whose gene symbol annotation has been deleted, so what am I doing wrong?
Is there a way to get the probe_ids and gene symbols that came with
the raw data onto my TopTable post analysis?
My working is below (not the full analysis, just an example of how I did
the annotation bit).
Thanks for your time.
John
> lumidata<-lumiR("Sample Probe Profile_rawdata.txt",
lib.mapping='lumiHumanIDMapping')
Perform Quality Control assessment of the LumiBatch object ...
Duplicated IDs found and were merged!
> f <- exprs(lumidata)
> g<-as.matrix(rownames(f))
> f<-as.data.frame(cbind(f,g) )
> head(f)
1 2 3 4 V25
Ku8QhfS0n_hIOABXuE 92 84 75 79 Ku8QhfS0n_hIOABXuE
fqPEquJRRlSVSfL.8A 113 120 111 109 fqPEquJRRlSVSfL.8A
ckiehnugOno9d7vf1Q 107 104 94 94 ckiehnugOno9d7vf1Q
x57Vw5B5Fbt5JUnQkI 93 83 94 94 x57Vw5B5Fbt5JUnQkI
ritxUH.kuHlYqjozpE 93 97 77 89 ritxUH.kuHlYqjozpE
QpE5UiUgmJOJEkPXpc 102 95 97 92 QpE5UiUgmJOJEkPXpc
> f$Symbol<-if (require(lumiHumanAll.db)) getSYMBOL(f$V25,
'lumiHumanAll.db')
> sum(is.na(f$Symbol))
[1] 22868
> data<-read.csv("Sample Probe Profile_rawdata.txt", header = TRUE,
sep="\t")
> names(data)
[1] "PROBE_ID" "SYMBOL" "X1.AVG_Signal"
"X1.Detection.Pval" "X1.NARRAYS" "X1.ARRAY_STDEV"
"X1.BEAD_STDERR"
...
> sum(is.na(data$SYMBOL))
[1] 0
> sum(data$SYMBOL=="")
[1] 12646
> sum(data$SYMBOL!="")
[1] 36157
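To make the second question concrete, here is a minimal sketch of the result I am after, written in base R with made-up data. Everything in it is an assumption for illustration: the probe IDs, nuIDs, symbols and logFC values are invented, and `id_map` is a hypothetical nuID-to-PROBE_ID lookup table that I do not yet know how to build properly from the lumi mapping.

```r
# Toy stand-in for the annotation columns of the raw file (values are made up).
raw_annot <- data.frame(
  PROBE_ID = c("ILMN_A", "ILMN_B", "ILMN_C"),
  SYMBOL   = c("GENE1", "", "GENE3"),
  stringsAsFactors = FALSE
)

# Hypothetical nuID -> PROBE_ID lookup; in practice this would have to come
# from the same ID mapping that was used when the data were read in.
id_map <- data.frame(
  nuID     = c("nuid_1", "nuid_2", "nuid_3"),
  PROBE_ID = c("ILMN_A", "ILMN_B", "ILMN_C"),
  stringsAsFactors = FALSE
)

# Toy top table keyed by nuID, in an arbitrary (ranked) order.
top <- data.frame(
  ID    = c("nuid_3", "nuid_1", "nuid_2"),
  logFC = c(1.8, -0.9, 0.2),
  stringsAsFactors = FALSE
)

# match() preserves the row order of the top table; it returns the first
# matching row only, so merged duplicate IDs would need separate handling.
top$PROBE_ID <- id_map$PROBE_ID[match(top$ID, id_map$nuID)]
top$SYMBOL   <- raw_annot$SYMBOL[match(top$PROBE_ID, raw_annot$PROBE_ID)]

# The raw file stores missing symbols as empty strings; recode them as NA.
top$SYMBOL[top$SYMBOL %in% ""] <- NA_character_

print(top)
```

I used `match()` rather than `merge()` here so that the ranking of the top table is left untouched; whether a one-to-one lookup like this is appropriate once duplicated IDs have been merged is part of what I am unsure about.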
> sessionInfo()
R version 2.10.1 (2009-12-14)
i386-redhat-linux-gnu
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=C
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] beadarray_1.14.0 lumiHumanIDMapping_1.4.0 limma_3.2.3
lumi_1.12.4 MASS_7.3-4 preprocessCore_1.8.0
[7] mgcv_1.6-1 affy_1.24.2
lumiHumanAll.db_1.8.1 org.Hs.eg.db_2.3.6 RSQLite_0.8-4
DBI_0.2-5
[13] annotate_1.24.1 AnnotationDbi_1.8.2 Biobase_2.6.1
loaded via a namespace (and not attached):
[1] affyio_1.14.0 grid_2.10.1 hwriter_1.2
KernSmooth_2.23-3 lattice_0.17-26 Matrix_0.999375-33 nlme_3.1-96
[8] tcltk_2.10.1 tools_2.10.1 xtable_1.5-6
>