Help needed to identify common genes among datasets
vavecilla • 0
Last seen 8.1 years ago
United States

Good Morning,

I need some advice on some gene expression research. I have datasets that were downloaded from GEO and customized in MS Excel. I need to identify the genes common to all of the datasets. I've been reading that I can use R/Bioconductor to simplify this process, but I'm still unsure where to begin. Can anyone shed some light and guide me in the right direction? Thanks so much!

datasets genes gene expression • 1.0k views
Last seen 6 months ago
United States

It's not clear what you mean by "customized into MS excel," but I'm imagining this means that you have gene identifiers (symbols or Entrez IDs) in the first column, and data in the rest?

In any case, you'll need to load your data into R as a data.frame. The readxl package (among others) can do this.
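As a rough sketch of the loading step, something like the following could work. The file names are made up, and I'm assuming each workbook keeps its data on the first sheet with a header row, so adjust to match your files.

```r
# Hypothetical file names: replace these with the paths to your own workbooks.
files <- c("dataset1.xlsx", "dataset2.xlsx", "dataset3.xlsx")

# Read the first sheet of one workbook and return a plain data.frame.
# read_excel() returns a tibble; as.data.frame() converts it.
read_dataset <- function(path) {
  as.data.frame(readxl::read_excel(path))
}

# Only read the files that actually exist, so the sketch runs without error
# even before the real paths are filled in.
existing <- files[file.exists(files)]
datasets <- lapply(existing, read_dataset)
names(datasets) <- basename(existing)
```

This leaves you with a named list holding one data.frame per workbook, which is a convenient shape for the next steps.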

Once you have a data.frame for each dataset, you can combine them using the merge function. By default, merge keeps only the rows whose identifier appears in both inputs, so what remains is the set of common genes.
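Here is a minimal sketch of the merge approach on toy data. The column name `symbol`, the sample columns, and all of the values are invented for illustration; your real data.frames would come from the Excel files.

```r
# Toy datasets standing in for the real ones (all values are made up).
ds1 <- data.frame(symbol = c("TP53", "BRCA1", "EGFR"),
                  gsm1 = c(1.2, 3.4, 5.6), stringsAsFactors = FALSE)
ds2 <- data.frame(symbol = c("BRCA1", "EGFR", "MYC"),
                  gsm2 = c(2.1, 4.3, 6.5), stringsAsFactors = FALSE)
ds3 <- data.frame(symbol = c("EGFR", "BRCA1", "KRAS"),
                  gsm3 = c(0.7, 0.9, 1.1), stringsAsFactors = FALSE)

# merge() joins two data.frames at a time, so Reduce() applies it across the
# whole list. With the default all = FALSE, only identifiers present in every
# dataset survive.
merged <- Reduce(function(x, y) merge(x, y, by = "symbol"),
                 list(ds1, ds2, ds3))

print(merged)
```

The result has one row per gene shared by all three toy datasets, with the data columns from each side by side.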

If you just want to iterate over the files and take the intersection of the identifiers yourself, you can use R's set operations, such as intersect.
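A small sketch of the set-operation route, again with invented identifiers. In practice each vector would be the identifier column of one of your data.frames.

```r
# Identifier vectors standing in for the first column of each dataset
# (made-up examples).
ids <- list(
  ds1 = c("TP53", "BRCA1", "EGFR"),
  ds2 = c("BRCA1", "EGFR", "MYC"),
  ds3 = c("EGFR", "BRCA1", "KRAS")
)

# intersect() takes two vectors at a time; Reduce() chains it over the list,
# leaving only the identifiers found in every dataset.
common <- Reduce(intersect, ids)

print(common)
```

You could then subset each data.frame to the rows whose identifier is in `common`, for example with `%in%`.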

It sounds like you're new to R (and perhaps to programming in general?). I'd recommend going through some R tutorials to get a feel for the language and an overview of some of its basic capabilities.


Thank you for your response. Yes, I have the gene symbols and descriptions in my first two columns, followed by the downloaded raw data.

