Question: Opening Illumina HT12 V3.0 Data from GEO
6 weeks ago
FL5120 wrote:

I was essentially doing the same thing as described in a previous post.

Following the note posted at 1, I successfully downloaded the data I am interested in.

library(GEOquery)
data <- getGEO("GSE32894")[[1]]


Unfortunately, I got stuck when I was trying to read "GSE32894" with limma.

idata <- read.ilmn("GSE32894_non-normalized_308UCsamples.txt",probeid = "PROBE_ID",expr="SKBR")


The error is as follows:

Error in readGenericHeader(fname, columns = expr, sep = sep) :


I looked into the documentation (https://www.rdocumentation.org/packages/limma/versions/3.28.14/topics/read.ilmn), but none of the options I tried worked.

It would be great if you could give me any suggestions for fixing this problem.

Thank you.


In the above the quotation marks " are not correct; what is the actual command that you used?

Yes, you are right. I am sorry for any confusion caused. I will change the title from "reading" to "opening" Illumina HT12 V3.0 Data from GEO.

OP has copied Mark Dunning's code (which was for a specific dataset) from https://support.bioconductor.org/p/70064. I reformatted OP's question earlier, and have now removed the extra quote mark as well.

5 weeks ago
United States
James W. MacDonald (51k) wrote:

In your first code chunk you are reading in the wrong GSE (GSE32849), which is a ChIP-chip experiment, rather than the one you want. In the second case, the text file does not contain what you think it does. Here are the first few column names from its header line (line 5):

sed -n '5p' GSE32894_non-normalized_308UCsamples.txt | sed 's/\t/\n/g' | head
ID_REF
UC_0001_1
UC_0001_1.detection.p.value
UC_0002_1
UC_0002_1.detection.p.value
UC_0003_1
UC_0003_1.detection.p.value
UC_0006_2
UC_0006_2.detection.p.value
UC_0007_1
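
A rough R equivalent of that check, plus one way the intensity and detection p-value columns could be pulled apart with base R, might look like the sketch below. It runs on a small made-up file so that it is self-contained: the preamble lines and the numbers are invented, and the assumption that the real file has a tab-separated header on line 5 comes only from the sed output above.

```r
# Toy stand-in for GSE32894_non-normalized_308UCsamples.txt.
# ASSUMPTION: 4 preamble lines, then a tab-separated header on line 5.
# The preamble text and all numeric values are invented for illustration.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "# preamble line 1",
  "# preamble line 2",
  "# preamble line 3",
  "# preamble line 4",
  paste(c("ID_REF", "UC_0001_1", "UC_0001_1.detection.p.value",
          "UC_0002_1", "UC_0002_1.detection.p.value"), collapse = "\t"),
  paste(c("ILMN_1343291", "100.5", "0.001", "98.2", "0.002"), collapse = "\t"),
  paste(c("ILMN_1343295", "55.1", "0.2", "60.3", "0.15"), collapse = "\t")
), tmp)

# Same idea as the sed call: take line 5 and split it on tabs.
hdr <- strsplit(readLines(tmp, n = 5)[5], "\t", fixed = TRUE)[[1]]
print(head(hdr, 10))

# Read the table itself, skipping the 4 lines before the header.
raw <- read.delim(tmp, skip = 4, check.names = FALSE,
                  stringsAsFactors = FALSE)

# Split columns into intensities and detection p-values by name.
is_pval <- grepl("detection", colnames(raw), fixed = TRUE)
is_id   <- colnames(raw) == "ID_REF"

expr_mat <- as.matrix(raw[, !is_pval & !is_id, drop = FALSE])
pval_mat <- as.matrix(raw[, is_pval, drop = FALSE])
rownames(expr_mat) <- raw$ID_REF
rownames(pval_mat) <- raw$ID_REF
```

For the real file you would replace tmp with the path to the downloaded file, and check the number of lines to skip against what the header inspection shows.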


You are probably better off just using the data you get from getGEO:

> z <- getGEO("GSE32894")[[1]]
> z
ExpressionSet (storageMode: lockedEnvironment)
assayData: 24402 features, 308 samples
element names: exprs
protocolData: none
phenoData
sampleNames: GSM814052 GSM814053 ... GSM814359 (308 total)
varLabels: title geo_accession ... tumor_stage:ch1 (55 total)
featureData
featureNames: ILMN_1343291 ILMN_1343295 ... ILMN_2415979 (24402
total)
fvarLabels: ID nuID ... GB_ACC (30 total)
experimentData: use 'experimentData(object)'
pubMedIds: 22553347
Annotation: GPL6947
> pData(z)[1:5,54:55]
          tumor_grade:ch1 tumor_stage:ch1
GSM814052              G3              T2
GSM814053              G2              T2
GSM814054              G2              T1
GSM814055              G2              T1
GSM814056              G3             T3b

> table(pData(z)[,54:55])
                tumor_stage:ch1
tumor_grade:ch1 T1 T2 T2a T2b T3 T3b T4a Ta Tx
             G1  0  0   0   0  0   0   0 48  0
             G2 35  9   1   0  1   0   0 56  1
             G3 61 73   0   2  0   5   1 11  1
             G4  1  0   0   0  0   0   0  0  0
             Gx  0  0   0   0  0   1   0  1  0


You can just use limma directly on that ExpressionSet, based on whatever phenotypic groups you care to compare.
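
As a minimal sketch of what such a comparison could look like, here is a two-group limma fit. To keep it self-contained it uses a simulated matrix in place of the ExpressionSet, and the G3 versus G1 contrast is only an example grouping, not a recommendation. The names expr, grade and G3vsG1 are made up for illustration.

```r
library(limma)

set.seed(1)

# Simulated log2 expression values standing in for exprs(z).
# ASSUMPTION: the real data are on a log2 scale; check before fitting.
expr <- matrix(rnorm(100 * 6, mean = 8), nrow = 100,
               dimnames = list(paste0("probe", 1:100), paste0("s", 1:6)))

# Toy grouping standing in for pData(z)[["tumor_grade:ch1"]].
grade <- factor(c("G1", "G1", "G1", "G3", "G3", "G3"))

# One column per group, no intercept, so contrasts are easy to write.
design <- model.matrix(~ 0 + grade)
colnames(design) <- levels(grade)

# Fit the linear model, then test the G3 - G1 difference.
fit   <- lmFit(expr, design)
contr <- makeContrasts(G3vsG1 = G3 - G1, levels = design)
fit2  <- eBayes(contrasts.fit(fit, contr))

top <- topTable(fit2, coef = "G3vsG1", number = 5)
print(top)
```

With the real object, lmFit() accepts the ExpressionSet directly in place of the matrix, and the grouping factor would be built from the relevant pData(z) column after dropping or recoding levels you do not want to compare.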

I am sorry James, it was a typo; it should be GSE32894, as you mentioned. What should I do if I just want the normalized data? For now I would like to see the big picture rather than compare specific genes.