Question

ExpressionSet annotation with custom annotation file

genesignature • 0

I made an ExpressionSet from Affy U133A spreadsheet data.

> testExp
ExpressionSet (storageMode: lockedEnvironment)
assayData: 22277 features, 43 samples
  element names: exprs
protocolData: none
phenoData
  sampleNames: GSM637758 GSM637759 ... GSM637800 (43 total)
  varLabels: Group
  varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
Annotation:

I did not use the u133a.db annotation package via AnnotationDbi. Instead, I would like to annotate the ExpressionSet with my own annotation file ("anno"):

> head(anno)
      PROBEID         ENSEMBL
1   209108_at ENSG00000000003
2 209109_s_at ENSG00000000003
3   220065_at ENSG00000000005
4   202673_at ENSG00000000419
5 205607_s_at ENSG00000000457
6 220840_s_at ENSG00000000457
> dim(anno)
[1] 37627     2

The number of features in the ExpressionSet (22277) and the number of rows in the annotation file (37627) are different. What I need is a final ExpressionSet containing only the probes with valid ENSEMBL IDs (that is, all Affy IDs with no matching ENSEMBL ID should be deleted from the ExpressionSet).

Is this possible? I would appreciate a detailed protocol.

annotation microarray R • 633 views

ADD COMMENT • link 6.2 years ago genesignature • 0

To remove rows from anno that do not map to an "ENSG", you can do something like:

anno_ensg = anno[grepl('ENSG', anno$ENSEMBL),]

I suspect that is only half of the problem. You will likely still have the problem of a single probe_id mapping to multiple ENSGs. Unfortunately, that is not a simple problem to solve (you will need to make some arbitrary decisions).
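As a rough sketch of how the pieces might fit together, the following uses a small toy ExpressionSet and annotation table in place of the real objects. The toy data, the object names `anno_ensg`, `anno_uniq`, `keep`, `annotated` and `fd`, and the "keep the first ENSEMBL ID per probe" rule are all assumptions made for illustration. That rule is just one of the arbitrary decisions mentioned above, not a recommended method.

```r
library(Biobase)

# Toy stand-ins for testExp and anno (hypothetical data, not the real U133A set).
mat <- matrix(rnorm(12), nrow = 4,
              dimnames = list(c("209108_at", "209109_s_at", "220065_at", "AFFX-dummy_at"),
                              c("S1", "S2", "S3")))
testExp <- ExpressionSet(assayData = mat)
anno <- data.frame(
  PROBEID = c("209108_at", "209109_s_at", "220065_at", "220065_at", "999999_at"),
  ENSEMBL = c("ENSG00000000003", "ENSG00000000003",
              "ENSG00000000005", "ENSG00000000419", NA),
  stringsAsFactors = FALSE
)

# 1. Keep only annotation rows that carry an Ensembl gene ID (NA yields FALSE).
anno_ensg <- anno[grepl("^ENSG", anno$ENSEMBL), ]

# 2. Arbitrary choice (assumption): for a probe mapping to several genes,
#    keep only the first ENSEMBL ID listed for it.
anno_uniq <- anno_ensg[!duplicated(anno_ensg$PROBEID), ]

# 3. Keep only the probes present in both the ExpressionSet and the annotation.
keep <- intersect(featureNames(testExp), anno_uniq$PROBEID)
annotated <- testExp[keep, ]

# 4. Attach the annotation as featureData, in the same order as the features.
fd <- anno_uniq[match(keep, anno_uniq$PROBEID), ]
rownames(fd) <- fd$PROBEID
featureData(annotated) <- AnnotatedDataFrame(fd)

annotated          # now reports 3 features
fData(annotated)   # PROBEID / ENSEMBL per remaining probe
```

Note that this only handles one probe mapping to several genes. Several probes mapping to the same gene (for example 209108_at and 209109_s_at in the annotation shown in the question) are left as separate rows, and whether to collapse them is a further decision.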

ADD REPLY • link 6.2 years ago Sean Davis 21k