Absence of significant gene in DEG list
nia ▴ 30
@nia-12707
Last seen 4.1 years ago

Dear all,

I am a biologist currently working on identifying DEGs from raw data containing about 50 thousand genes. After applying statistical tests, I obtained about 250 genes. The problem is that the DEGs identified using Bioconductor packages are missing some significant, well-known genes that cause the disease in question. My questions are: first, how do I justify the absence of a significant gene from the DEG list when it is present in the raw data? Second, what if the gene is present in the DEG list but is downregulated?

Is it fine to proceed with the results by simply saying that this is due to the statistical parameters we applied, or is there some other reason?

For your information:

An FDR threshold of < 0.1 was set for the statistical analysis. Genes with a positive fold change were considered upregulated and genes with a negative fold change downregulated. Several enrichment analyses were then performed on the resulting list.

For example:

In a breast cancer study, we extracted a list of DEGs from the raw data computationally, but the BRCA gene is absent from the list even though it is present in the raw data.

I hope I have made my question clear.

Thank you in advance.

limma microarray • 896 views

Not enough information. Read the posting guide; describe your experimental design and show the code you used, for starters.


I have updated the post

Aaron Lun ★ 28k
@alun
Last seen 8 hours ago
The city by the bay

You don't show the command that you used to create your design matrix, but I assume you used ~ 0 + f in the formula. Otherwise, your code looks fine.

Now, just because BRCA "is present in the raw data", it doesn't mean that it is truly differentially expressed. Have you actually looked at the expression values of BRCA with respect to f? Does it look like there is any difference in BRCA expression between the groups? This is a standard diagnostic procedure when you have positive control genes that you expect to be DE between groups.
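
As a rough illustration of that diagnostic, here is a minimal sketch that splits one gene's log2 expression values by group and reports the group means, the log2 fold change and a plain Welch t-statistic. It is written in Python only to keep it self-contained. It is not limma's moderated t-statistic, and the function names, group labels and numbers are all made up for the example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch t-statistic for two independent samples (unequal variances)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def summarize_gene(values, groups, case, control):
    """Split one gene's log2 expression by group and summarize the difference."""
    a = [v for v, g in zip(values, groups) if g == case]
    b = [v for v, g in zip(values, groups) if g == control]
    return {
        "mean_case": mean(a),
        "mean_control": mean(b),
        "log2_fold_change": mean(a) - mean(b),
        "welch_t": welch_t(a, b),
    }


# Hypothetical log2 intensities for one positive-control gene (made-up numbers).
groups = ["tumour"] * 4 + ["normal"] * 4
values = [8.1, 7.9, 8.4, 7.6, 8.0, 7.8, 8.3, 7.7]

summary = summarize_gene(values, groups, case="tumour", control="normal")
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

If the group means are nearly identical, or the spread within each group is much larger than the difference between them, as in these made-up values, then the gene will not be called DE no matter which package is used.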

Other potential causes include:

  • Known batch effects that inflate the variance. These should be included as blocking factors in design. More generally, I would check for these by examining an MDS plot and seeing whether the samples separate by f (ideal) or by something else (e.g., sex, age; not ideal).
  • Unknown batch effects that inflate the variance. See sva and RUVnormalize for more details.
  • Suboptimal variance modelling. Try turning on robust=TRUE and trend=TRUE in eBayes.
  • Insufficient power in your experimental design, at least for the positive control genes. This is quite likely for human patient data that are often highly variable. The only solution here is to collect more arrays.
  • Your understanding of the biology or experimental system is incorrect, and the "known" genes are not actually DE. This should not be ruled out, especially for technical reasons, e.g., if someone purified the wrong cell type for use in RNA sequencing.
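
To make the first cause above concrete, here is a minimal sketch of how an unmodelled batch effect can hide a real group difference. It uses plain Python and ordinary (not moderated) t-statistics rather than limma, and it assumes a balanced design with an additive batch effect. The function name group_t, the labels and the expression values are all hypothetical.

```python
from math import sqrt
from statistics import mean


def group_t(values, groups, batches=None):
    """Ordinary t-statistic for a two-group difference in one gene.

    If `batches` is given, an additive batch term is also fitted (blocking).
    Assumes a balanced design: every batch holds the same number of samples
    from each group, so group and batch effects can be estimated separately.
    """
    levels = sorted(set(groups))
    assert len(levels) == 2, "exactly two groups expected"
    n = len(values)
    grand = mean(values)
    gmean = {g: mean([v for v, x in zip(values, groups) if x == g]) for g in levels}

    fitted = [gmean[g] for g in groups]
    df = n - 2  # intercept + one group coefficient
    if batches is not None:
        blevels = sorted(set(batches))
        bmean = {b: mean([v for v, x in zip(values, batches) if x == b]) for b in blevels}
        fitted = [f + bmean[b] - grand for f, b in zip(fitted, batches)]
        df -= len(blevels) - 1  # one extra coefficient per additional batch

    rss = sum((v - f) ** 2 for v, f in zip(values, fitted))
    s2 = rss / df  # residual variance
    n0, n1 = groups.count(levels[0]), groups.count(levels[1])
    effect = gmean[levels[1]] - gmean[levels[0]]
    return effect, effect / sqrt(s2 * (1 / n0 + 1 / n1))


# Made-up log2 values: group shift of about 1, batch shift of about 3.
groups = ["tumour", "tumour", "tumour", "tumour", "normal", "normal", "normal", "normal"]
batches = ["A", "A", "B", "B", "A", "A", "B", "B"]
values = [6.1, 5.9, 9.0, 9.2, 5.0, 5.2, 8.1, 7.9]

effect, t_unblocked = group_t(values, groups)
_, t_blocked = group_t(values, groups, batches)
print(f"effect={effect:.2f}  t without blocking={t_unblocked:.2f}  t with blocking={t_blocked:.2f}")
```

The estimated group difference is the same in both fits. Only the residual variance changes, which is why adding the batch as a blocking factor can move a known gene from "not significant" to "clearly DE".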