Identifying presence of certain genes
Sam ▴ 10


I'm very new to the field and would appreciate any help. I want to find out whether a certain protein is differentially expressed between two populations in the TCGA database. Before conducting any differential expression analysis:

  1. How can I see whether the gene that codes for a certain protein is even present in a sample? What package should I use?

  2. When extracting data from TCGA, there are many different types of data I can extract. Can someone point me to a resource that lists which type of data serves which purpose?

  3. How can I conduct a differential expression analysis for a certain gene using data extracted from TCGA?

Thank you,

DifferentialExpression TCGAWorkflow • 374 views

Hi Sam,

It looks like you've got a long road ahead, but congratulations on taking the first step on your scientific journey. This likely isn't the best place to answer such broad questions, as this support forum is focused on helping people with specific questions about the use of Bioconductor packages.

I'm actually not sure where the best place is to get answers to these types of questions, as it's unclear what level of detail would suit you best.

So some pointers:

  1. You talk about protein and gene expression. I don't recall TCGA having any real proteomics to speak of (maybe some RPPA data? It's been a long time now, so I don't recall). My guess is that you want to focus on gene expression (RNA-seq data).

  2. To see if your gene(s) of interest are expressed, you might consider using some of the online data browsers where you can query for genes of interest. Some of these include:

  3. Even though you may be only interested in one (or some) genes, once you find the indication(s) you care about, it is recommended to perform a differential expression analysis on the full set of expressed genes among the samples of interest, then bring your focus to the individual genes you may care about.

  4. You will find tons of great tutorials and workflows for doing differential gene expression analysis using R/Bioconductor under the workflows section. Also be sure to read the user guides that come with limma, edgeR, and DESeq2, which offer a ton of detail on the mechanics of performing differential expression analysis.
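
To make pointers 2 to 4 a bit more concrete, here is a minimal base-R sketch of the overall shape of the workflow: check whether a gene is expressed, test all expressed genes, then look up the gene you care about. Everything in it is an assumption made for illustration: the count matrix is simulated rather than downloaded from TCGA, the gene names (`gene5`, `gene7`) and group labels are hypothetical, the "CPM > 1 in at least as many samples as one group" filter is only a common rule of thumb, and the per-gene t-test on log-CPM is a simple stand-in for the models in limma, edgeR, or DESeq2, which are what you would want for a real analysis.

```r
# Simulated stand-in for an RNA-seq count matrix (genes x samples).
# Real TCGA counts would replace this block.
set.seed(1)
n_genes <- 200
n_per_group <- 10
group <- factor(rep(c("A", "B"), each = n_per_group))
counts <- matrix(
  rnbinom(n_genes * 2 * n_per_group, mu = 100, size = 5),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("sample", seq_len(2 * n_per_group)))
)

# Hypothetical gene of interest, simulated as higher in group B.
gene_of_interest <- "gene5"
counts[gene_of_interest, group == "B"] <- rnbinom(n_per_group, mu = 800, size = 5)
# Hypothetical gene with no reads at all, to show what "not expressed" looks like.
counts["gene7", ] <- 0L

# Step 1: is the gene expressed? Convert counts to counts per million (CPM)
# and keep genes with CPM > 1 in at least n_per_group samples (rule of thumb).
lib_size <- colSums(counts)
cpm <- t(t(counts) / lib_size) * 1e6
expressed <- rowSums(cpm > 1) >= n_per_group
print(expressed[c(gene_of_interest, "gene7")])

# Step 2: test every expressed gene, not only the one of interest.
# A Welch t-test on log2(CPM + 1) is used here purely as a placeholder method.
log_cpm <- log2(cpm[expressed, , drop = FALSE] + 1)
p_value <- apply(log_cpm, 1, function(x) t.test(x ~ group)$p.value)
log_fc <- rowMeans(log_cpm[, group == "B"]) - rowMeans(log_cpm[, group == "A"])
results <- data.frame(
  log2_fold_change = log_fc,               # group B relative to group A
  p_value = p_value,
  fdr = p.adjust(p_value, method = "BH")   # correct for testing many genes
)

# Step 3: only now narrow the focus to the gene of interest.
print(results[gene_of_interest, ])
```

Testing all expressed genes first matters because the multiple-testing correction, and in the dedicated packages the variance estimates too, depend on the full set of genes rather than on a single row.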

Once you get going and have specific questions on analyses you have underway, feel free to come back here with more questions so we can help you with those.


Thank you so much for your detailed response! I'll be sure to come back with more specific questions once I have them figured out.

