Question

Limma for paired and unpaired samples. Strange number of DEGs in unpaired samples

0

Entering edit mode

Dimitris • 0

@9d129eeb

Last seen 17 months ago

Greece

Hello,

I have RNA-seq data from two different treatment groups (F and NF ) at 2 different time points (T1 and T2). The mapping was done with STAR aligner and the quantification was done with FeatureCounts. When I perform the paired analysis in F: T2 vs T1 and NF: T2 vs T1. I create a design matrix including the variables Individ+medianIsize+GC+RIN+Plate from my metadata file which looks like:

       Individ age sex fasting timePoint medianIsize    GC   BMI BMI_category extraction_batch_ID extraction_date plate RIN
001_T2   ID001  54   F       F        T2         442 59.10 32.67            4                 101        31/07/19     9 8.4
002_T2   ID002  65   F       F        T2         434 57.20 31.80            4                 101        31/07/19     9 8.9
003_T2   ID003  64   F       F        T2         423 58.17 29.47            3                 100        31/07/19     8 7.5
004_T2   ID004  65   F       F        T2         400 54.90 28.00            3                 100        31/07/19     8 8.5
005_T2   ID005  56   F       F        T2         381 54.77 21.71            2                  99        30/07/19     8 9.1
006_T2   ID006  53   F       F        T2         450 53.18 32.85            4                  63        22/04/19     5 9.5

My count matrix looks like

                001_T2 002_T2 003_T2 004_T2 005_T2 006_T2 007_T2 008_T2 009_T2 010_T2 011_T2 013_T2 014_T2 015_T2 016_T2 017_T2 018_T2
ENSG00000186827     68    248    250    269    143    268    253    293    144    117    183    145    263    334    180    211    197
ENSG00000186891     61    130    185    111    164    120    139    341    111    156    147     95    253    349    121    281    153

In paired analysis limma-voom returns 3302 DEGS (padjust < 0,05) out of 17231 genes for F: T2 vs T1 and 3844 for NF: T2 vs T1.

When I am trying to perform unpaired analysis for comparisons of treatment groups within time points ie T1 : F vs NF or T2: F vs NF, I create a design matrix with different combinations of the covariates: medianIsize + GC + age + sex + BMI+ RIN. In each combination of covariates limma returns very few DEGs (7-32). From a biological standpoint, we expect a large number of DEGs for these comparisons. Below is the table of different versions of covariates and results of limma :

Input Covariates    T1: F vs NF T2: F vs NF
medianIsize + GC    14096   14041 
medianIsize + GC + age + sex    5   9 
medianIsize + GC + age + sex + BMI  7   10 
GC + age + sex + BMI+ RIN   30  13
GC + age + sex + BMI    7   12
medianIsize + GC + age + sex + BMI+ RIN 32  10 
medianIsize + GC + age + sex + BMI+plate    7   9
medianIsize + GC + age + sex + BMI+RIN+plate    13  10
age + sex + BMI+RIN+plate   13  6
medianIsize + GC +sex+ RIN+plate    311 156

Since I was expecting more DEGs in unpaired analysis (T1: F vs NF, T2: F vs NF) do you think that this kind of result makes sense based on the combination of covariates that we use?

limma lmfit R DifferentialExpression • 881 views

ADD COMMENT • link updated 23 months ago by ATpoint ★ 4.0k • written 23 months ago by Dimitris • 0

score 1 · Answer 1 · 2022-05-24

1

Entering edit mode

ATpoint ★ 4.0k

@atpoint-13662

Last seen 3 hours ago

Germany

Cross-posted and answered by Gordon Smyth https://www.biostars.org/p/9524346/

ADD COMMENT • link 23 months ago ATpoint ★ 4.0k