I've recently started trying to process data from the GEO datasets that are publicly available.
Unfortunately my R skills are not quite up to par yet.
I've been working with one dataset in particular, GSE3933.
The code simply extracts the expression values for each sample and then matches them with the appropriate gene symbol/annotation:
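    library(GEOquery)
    # getGEO() with GSEMatrix=TRUE returns a list of ExpressionSets; take the first
    gse3933 <- getGEO('GSE3933', GSEMatrix=TRUE)
    gpl <- annotation(gse3933[[1]])
    # download the platform record and pull out its annotation table
    platf <- getGEO(gpl, AnnotGPL=TRUE)
    anot <- data.frame(attr(dataTable(platf), "table"))
    # expression matrix: one row per probe, one column per sample
    Gmicro <- exprs(gse3933[[1]])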
My problem arises in two parts:
I've accessed the user-submitted annotation, which is GPL3289. It has ten variables, but none of them is a "gene symbol" column. My guess is that the GB_List column can be converted to gene symbols or gene IDs, but I have not been able to figure out how to do this.
I then want to match the gene symbols/probes with the appropriate expression values from the matrix 'Gmicro', as per the code. Is this simply a matter of matching the IDs (row names) of the matrix returned by exprs() to the IDs in the table returned by dataTable()?
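To make the second part concrete, here is a minimal sketch of the kind of matching I have in mind, using toy data in place of the real GSE3933 objects so it runs without downloading anything. The probe IDs, accession values and sample names are made up, and the annotation column name `ID` is an assumption about how the platform table is laid out; only `GB_List` is taken from the table described above.

```r
# Toy stand-ins for the real objects (all values are invented for illustration).
# Expression matrix: rows are probes, columns are samples.
Gmicro <- matrix(
  c(1.2, 0.4, -0.3,
    0.8, 1.1,  0.5),
  nrow = 3,
  dimnames = list(c("p3", "p1", "p2"), c("GSM_a", "GSM_b"))
)

# Annotation table: "ID" is an assumed name for the probe identifier column.
anot <- data.frame(
  ID      = c("p1", "p2", "p3", "p4"),
  GB_List = c("AA001", "AA002", "AA003", "AA004"),
  stringsAsFactors = FALSE
)

# For each row of the matrix, find the position of its probe ID in the annotation.
# Probes missing from the annotation would give NA here.
idx <- match(rownames(Gmicro), anot$ID)

# Put the annotation next to the expression values, in the matrix's row order.
annotated <- data.frame(
  ID      = rownames(Gmicro),
  GB_List = anot$GB_List[idx],
  Gmicro,
  check.names = FALSE,
  stringsAsFactors = FALSE
)
```

Using match() rather than assuming the two tables are in the same order means the annotation follows the matrix rows even when the platform table lists extra probes or uses a different ordering.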