Hi, I have some gene expression data for the NCI-60 cell lines from GEO that was generated with the Affymetrix Human Genome U133 Plus 2.0 chip. I want to build a protein interaction network from 3 samples of a specific cell line, and so far I have done this in two steps. In the first step I mapped probe IDs to gene symbols, ignored every probe ID that maps to multiple gene symbols, and collected the genes with the highest values.
In the second step I used iRefR to collect the interactions between these genes and then built a graph from that data.
Because I did not know R, I programmed the first step in Java. I now want to code it in an R script and maybe rectify mistakes I have made, but I'm still not sure how to properly interpret Affymetrix probe sets and was hoping someone could point me to the right R library and maybe tell me how it should be done. Is there a way to gather the replicates, normalize them, and average the 3 samples so that I get a list of unique gene symbols and their values? And is there some sort of threshold value below which a probe ID/gene symbol is considered not present in the sample/cell line?
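
To make the goal concrete, here is a minimal sketch of the collapsing logic described above, written in Python only for illustration (it is neither R nor my actual Java code). The probe IDs, gene symbols, and expression values are invented, the function name `collapse_to_genes` is hypothetical, and keeping the highest averaged probe per gene is just one possible choice, not necessarily the correct way to summarize probe sets. It also assumes the values are already normalized.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: probe ID -> expression value in each of the 3 replicates.
expression = {
    "probe_a": [8.0, 8.5, 9.0],
    "probe_b": [5.0, 5.5, 6.0],
    "probe_c": [7.0, 7.0, 7.0],
    "probe_d": [10.0, 11.0, 12.0],
}

# Hypothetical annotation: probe ID -> gene symbols the probe maps to.
annotation = {
    "probe_a": ["GENE1"],
    "probe_b": ["GENE1"],           # second probe for the same gene
    "probe_c": ["GENE2"],
    "probe_d": ["GENE3", "GENE4"],  # ambiguous probe, will be ignored
}


def collapse_to_genes(expression, annotation):
    """Average replicates per probe, drop ambiguous probes, keep one value per gene."""
    per_gene = defaultdict(list)
    for probe, values in expression.items():
        symbols = annotation.get(probe, [])
        # Skip probes with no mapping or with more than one gene symbol.
        if len(symbols) != 1:
            continue
        per_gene[symbols[0]].append(mean(values))
    # Assumption: when several probes map to one gene, keep the highest average.
    return {gene: max(averages) for gene, averages in per_gene.items()}


gene_values = collapse_to_genes(expression, annotation)
print(gene_values)
```

What I am looking for is the proper R/Bioconductor equivalent of this, including the normalization step that the sketch leaves out.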