Jing Huang
Dear Sean and all members,
I am trying to extract GSE data from GEO and analyze it. I would like
to know whether the GSE data have already been normalized and
log2-transformed. My R script and its output are copied below. Can
somebody help me with this?
> Table(GSMList(gse)[[1]])[1:5, ]
ID_REF VALUE
1 1007_s_at 7.693888187
2 1053_at 8.571408272
3 117_at 5.179812431
4 121_at 7.468027592
5 1255_g_at 3.118550777
> Columns(GSMList(gse)[[1]])[1:5, ]
Column Description
1 ID_REF
2 VALUE log2 signal intensity, RMA <<<<< Does this mean
that the values are already log2-transformed and the data were
normalized by RMA?
NA <na> <na>
NA.1 <na> <na>
NA.2 <na> <na>
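One quick way to cross-check the column description against the numbers is to look at the range of the values themselves. The snippet below is only a minimal sketch: `looks_log2` is a hypothetical helper (not part of GEOquery), and the cutoff of 20 is an assumption based on raw microarray intensities usually reaching the thousands while log2-scale values stay far lower.

```r
# VALUE column for the first five probes, copied from the Table() output above
vals <- c(7.693888187, 8.571408272, 5.179812431, 7.468027592, 3.118550777)

# looks_log2 is a hypothetical helper, not a GEOquery function.
# Assumption: values on a log2 scale rarely exceed about 16 to 20,
# whereas raw intensities commonly run into the thousands.
looks_log2 <- function(x, upper = 20) {
  x <- x[is.finite(x)]              # ignore NA, NaN and Inf
  length(x) > 0 && max(x) <= upper  # TRUE when everything fits the log2 range
}

print(range(vals))       # smallest and largest value in the sample
print(looks_log2(vals))  # heuristic verdict for this sample
```

Five probes are a small sample, so the same check would be more convincing when run on the full VALUE column of each GSM.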
According to the GEOquery package, I should follow the steps below to
build the eset:
> probesets <- Table(GPLList(gse)[[1]])$ID
> data.matrix <- do.call("cbind", lapply(GSMList(gse), function(x) {
+ tab <- Table(x)
+ mymatch <- match(probesets, tab$ID_REF)
+ return(tab$VALUE[mymatch])
+ }))
> data.matrix <- apply(data.matrix, 2, function(x) {
+ as.numeric(as.character(x))
+ })
> data.matrix <- log2(data.matrix)
> data.matrix[1:5, ]
     GSM424759 GSM424760 GSM424761 GSM424762 GSM424763 GSM424764 GSM424765
[1,]  2.943713  2.917086  2.926155  2.983485  2.973219  2.962445  2.926030
[2,]  3.099532  3.136898  3.152696  3.217172  3.206948  3.198448  3.135146
[3,]  2.372900  2.309177  2.354380  2.373350  2.368464  2.381139  2.314555
[4,]  2.900727  2.873853  2.863911  2.879232  2.927384  2.913594  2.852870
[5,]  1.640876  1.645330  1.494274  1.792643  1.719597  1.648126  1.605055
Is the log2 transformation necessary for this dataset?
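To see what the extra `log2()` call did to these numbers, the GSM424759 column above can be back-transformed and compared with the VALUE column of the first sample. This is a minimal sketch using only the figures printed in this post; it assumes the first five rows of data.matrix correspond to the same five probes as the Table() output, which the numbers appear to support.

```r
# GSM424759 column of the log2(data.matrix) output above, rows 1 to 5
logged_again <- c(2.943713, 3.099532, 2.372900, 2.900727, 1.640876)

# VALUE column of the first sample as returned by Table()
original <- c(7.693888187, 8.571408272, 5.179812431, 7.468027592, 3.118550777)

# Undo the extra log2 step
recovered <- 2^logged_again

# Side-by-side comparison; any difference should be display rounding only
print(cbind(original, recovered))
print(max(abs(recovered - original)))
```

If the two columns agree, the matrix printed above is the log2 of values that the column description already labels as log2 RMA signal, in which case the `data.matrix <- log2(data.matrix)` line could simply be left out for this dataset.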
Many thanks
Jing