Search
Question: limma voom for single factor
0
gravatar for bumbum
12 weeks ago by
bumbum0
bumbum0 wrote:

Dear all,

I'm trying to run limma voom to find DE genes between females and males, but my results don't seem to make sense. I'm sure it's just some lame mistake. How should I make the contrast design to only compare males and females? There should definitely be DE genes. Here's my code:

> library(limma)
> library(edgeR)
>
> counts <- read.csv("...", header=FALSE, sep=",")
>
> genes <- read.csv("...", header=TRUE)
> genes <- data.frame(genes)
>
> pheno <- read.csv("...", header=TRUE, sep=",")
> samplenames <- pheno$sampleid
>
> pheno
                           sampleid    sex
1   GTEX-1192X-1026-SM-5H12P_RNA_XY   male
2   GTEX-11DXY-0526-SM-5EGGQ_RNA_XY   male
...
135  GTEX-ZYY3-0626-SM-5NQ6W_RNA_XX female
136  GTEX-ZZPU-0426-SM-5GZYH_RNA_XX female
>
> sex <- factor(pheno$sex, levels=c("female", "male"))
>
> design <- model.matrix(~sex-1)
> colnames(design) <- levels(sex)
> design
    female male
1        0    1
2        0    1
3        0    1
...
134      0    1
135      1    0
136      1    0
attr(,"assign")
[1] 1 1
attr(,"contrasts")
attr(,"contrasts")$sex
[1] "contr.treatment"

> dge <- DGEList(counts=counts, genes=genes)
> colnames(dge) <- samplenames
> dge$samples$sex <- sex
> dge$samples$samplename <- samplenames
> keep <- rowSums(cpm(dge)>2) >= 4
> dge <- dge[keep,]
> v <- voom(dge, design, normalize="quantile")
> fit <- lmFit(v, design)
> contrast_design <- makeContrasts(female-male, levels=design)
> contrast_design
        Contrasts
Levels   female - male
  female             1
  male              -1
>
> vfit <- contrasts.fit(fit, contrasts=contrast_design)
> efit <- eBayes(vfit)
> topTable(efit)
                    gene logFC AveExpr    t P.Value adj.P.Val     B
47425 ENSG00000141556.20 -0.19     5.8 -3.7 0.00028      0.93 -0.61
12256 ENSG00000123933.15 -0.31     5.3 -3.6 0.00042      0.93 -0.82
34809 ENSG00000175215.10 -0.30     6.7 -3.6 0.00037      0.93 -0.97
34421  ENSG00000123268.8  0.20     3.7  3.6 0.00047      0.93 -1.11
4440  ENSG00000143493.12  0.28     4.3  3.5 0.00073      1.00 -1.23
1815  ENSG00000198160.14  0.21     5.0  3.4 0.00085      1.00 -1.27
37434  ENSG00000139826.5  0.21     3.9  3.5 0.00067      1.00 -1.27
27189 ENSG00000157654.17  0.43     4.0  3.3 0.00132      1.00 -1.65
2441  ENSG00000162775.14  0.27     3.9  3.3 0.00134      1.00 -1.69
50848 ENSG00000126464.13 -0.26     3.7 -3.3 0.00127      1.00 -1.76

> summary(decideTests(efit))
   female - male
-1             0
0          15868
1              1

R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] gplots_3.0.1   pheatmap_1.0.8 edgeR_3.18.1   limma_3.32.5  

loaded via a namespace (and not attached):
 [1] locfit_1.5-9.1     Rcpp_0.12.12       lattice_0.20-35    gtools_3.5.0       bitops_1.0-6       grid_3.4.1         plyr_1.8.4        
 [8] gtable_0.2.0       scales_0.4.1       KernSmooth_2.23-15 gdata_2.18.0       RColorBrewer_1.1-2 tools_3.4.1        munsell_0.4.3     
[15] compiler_3.4.1     colorspace_1.3-2   caTools_1.17.1    
 
 
ADD COMMENTlink modified 12 weeks ago by Gordon Smyth32k • written 12 weeks ago by bumbum0

Can you please tidy up a few inconsistencies in your question? It will be easier to give you an answer then.

  1. You give a topTable for 'fit_contrast', but 'fit_contrast' hasn't been mentioned anywhere in your code. So we don't know what this table is. You need to give code and output that is consistent from start to finish. Try running it again from the start.
  2. You say that "there definitely should be DE" but the topTable shows heaps of DE genes with FDR = 1e-179. So what's the problem?
  3. There's no sense in running decideTests() or run plotMD() before you form a contrast, so the output from decideTests(fit) and plotMD(fit) aren't meaningful. Better to delete these.

Thanks

ADD REPLYlink modified 12 weeks ago • written 12 weeks ago by Gordon Smyth32k

Thanks for tidying up your question (45 min ago), but the results you show are still not consistent. The topTable now shows no DE genes but decideTests() still shows 1 DE gene. It is still not possible that the output you show could have arisen from a single unbroken R session.

ADD REPLYlink modified 12 weeks ago • written 12 weeks ago by Gordon Smyth32k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 107 users visited in the last hour