edgeR - Design matrix for comparisons of all and subsets of samples
Geo • 0
@0192d047
Greece

Hello everyone!

I have a large dataset of patients with Systemic Lupus Erythematosus (SLE) and Healthy Controls, and I would like to perform several DE comparisons using edgeR, but I have some doubts about my design matrices.

Some more detail to clarify the situation: SLE patients in my dataset are split into two categories, patients with Lupus Nephritis (LN) and non-LN patients. Patients with LN are further split into Active LN and Inactive LN. So the structure looks like this:

Condition   Active
LN          Yes
LN          No
non_LN      NA
Healthy     NA

The comparisons I want to perform are:

LN vs Healthy

LN vs non-LN

Active LN vs Inactive LN

So, my question is: what should my design formula and contrasts be?

Proposed design formula: ~ Condition + Active + 0

The model matrix would look like:

ConditionLN  Conditionnon_LN  ConditionHealthy  ActiveYes  ActiveNo
1            0                0                 1          0
1            0                0                 1          0
1            0                0                 0          1
0            1                0                 NA         NA
0            1                0                 NA         NA
0            0                1                 NA         NA
0            0                1                 NA         NA

Contrasts:

LN vs Healthy: c(1,0,-1,1,1)

LN vs non_LN: c(1,-1,0,1,1)

Active vs Inactive: c(0,0,0,1,-1)


Apologies for the bad post quality. It is my first post here and I could not figure out how to better lay it out.

Thank you very much in advance for all your help.

edgeR • 423 views
@james-w-macdonald-5106
United States

The easy way is to follow section 3.3.1 in the edgeR User's Guide: combine your two factors into a single grouping factor, then use explicit contrasts.
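A minimal sketch of that approach in base R, assuming a hypothetical sample sheet called `samples` with the `Condition` and `Active` columns described in the question (the seven rows are made up to mirror the model matrix above, and the `HC` label for healthy controls is an illustrative choice). The edgeR calls appear only as comments, since they need a real count object:

```r
# Hypothetical sample sheet mirroring the structure in the question.
samples <- data.frame(
  Condition = c("LN", "LN", "LN", "non_LN", "non_LN", "Healthy", "Healthy"),
  Active    = c("Yes", "Yes", "No", NA, NA, NA, NA)
)

# Collapse the two columns into one group label per sample, so that
# no NA values are left in the factor used to build the design.
group <- with(samples, ifelse(
  Condition == "LN",
  ifelse(Active == "Yes", "LN_active", "LN_inactive"),
  ifelse(Condition == "Healthy", "HC", "non_LN")
))

# Set the level order explicitly; by default factor() sorts the levels,
# which would change which design column each contrast entry refers to.
group <- factor(group, levels = c("LN_active", "LN_inactive", "non_LN", "HC"))

# One column per group, no intercept.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
print(design)

# With edgeR loaded and a DGEList `y` (not constructed here), the design
# would then be passed on, for example:
#   y   <- estimateDisp(y, design)
#   fit <- glmQLFit(y, design)
```

Each row of `design` has a single 1 marking the sample's group, so every coefficient is simply that group's mean and the comparisons can be written as differences between columns.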


Thank you very much for your answer! I hadn't noticed the more complex examples mentioned in the guide. I will look more into them.

Geo • 0
@0192d047
Greece

Hi again! So, update on the analysis.

I ended up combining the columns into a single one and using the levels: LN_active, LN_inactive, non_LN and HC.

To compare LN vs HC, I used the contrast (1,1,0,-1). I got only negative logFC values, ranging from -0.03 to -24 (which is outrageous for a logFC value). Did I misunderstand something?

Thank you very much for all your help!


Please don't ask another question using the ADD ANSWER button. It's not an answer if it's a question.

A contrast is a set of indicator values that sum to zero. Yours sums to 1, so it is not a contrast. Instead you are asking whether the sum of two coefficients is larger than another coefficient, which doesn't make sense. Ideally you would use makeContrasts, which automates that sort of thing, and you would also indicate that you want the mean of the LN groups instead of the sum.

That said, you are getting results that don't make sense, particularly for the contrast you say is resulting in those results. Without any code I can't say for sure, but I would bet that the contrast you really did was something with one 1 and two -1, which would result in all negative logFC values.
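To make the sum-to-zero point concrete, here is a small base R sketch. It assumes the design columns are in the order LN_active, LN_inactive, non_LN, HC (the order given in the update above); the contrast names are hypothetical labels. The `makeContrasts` call is shown only as a comment because it needs limma loaded and a matching design matrix:

```r
# Assumed column order of the design matrix (from the update above).
lv <- c("LN_active", "LN_inactive", "non_LN", "HC")

# The attempted vector: it adds the two LN coefficients instead of
# averaging them, so the entries do not sum to zero.
attempted <- setNames(c(1, 1, 0, -1), lv)
print(sum(attempted))

# Contrasts that average the two LN groups before comparing.
contrasts <- cbind(
  LN_vs_HC           = c(0.5, 0.5,  0, -1),
  LN_vs_nonLN        = c(0.5, 0.5, -1,  0),
  Active_vs_Inactive = c(1,   -1,   0,  0)
)
rownames(contrasts) <- lv

# Every column should sum to zero.
print(colSums(contrasts))

# With limma loaded, the same matrix can be written by name rather than
# by position, which avoids mistakes in column order:
#   makeContrasts(
#     LN_vs_HC           = (LN_active + LN_inactive) / 2 - HC,
#     LN_vs_nonLN        = (LN_active + LN_inactive) / 2 - non_LN,
#     Active_vs_Inactive = LN_active - LN_inactive,
#     levels = design
#   )
```

Note that averaging the two coefficients weights Active and Inactive LN equally regardless of how many samples are in each group.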
