Entering edit mode
                    Ryan M Ghan
        
    
        ▴
    
    20
        @ryan-m-ghan-6448
        Last seen 11.1 years ago
        
    Hello limma community-
I am attempting to construct a contrast matrix that I can run in R,
using the limma bioconductor package, but I am not sure that I have
coded the contrast matrix correctly. A previous
post<https: stats.stackexchange.com="" questions="" 64249="" creating-="" contrast-matrix-for-linear-regression-="" in-r?newreg="add2674ca9d04b7eb85fad255b45b7f5"> on
stats.stackexchange.com, the bioconductor mailing list, and the limma
guide were helpful, but my two factorial design is more complicated
than what is illustrated there.
The first factor is the treatment, with two levels (control=c and
stress=s), and the second factor is the genotype, with five levels
(g1, g2, g3, g4, g5). Each genotype/treatment consists of 3-biological
replicates (30xsamples total). My dataset has already been normalized
and log2 transformed (Should I even start with log transformed data?).
It consists of 1208 proteins (based upon spectral counting for those
that care) that measures protein abundance differences in the five
genotypes and two treatments. The dataset is complete, meaning each
sample/condition has a datapoint, and appears to be normally
distributed (histogram and qqplots).
## Session information
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
other attached packages:
[1] limma_3.18.13
loaded via a namespace (and not attached):
 [1] colorspace_1.2-4   dichromat_2.0-0    digest_0.6.4
ggplot2_0.9.3.1    grid_3.0.2         gtable_0.1.2
 [7] labeling_0.2       MASS_7.3-30        munsell_0.4.2
plyr_1.8.1         proto_0.3-10       RColorBrewer_1.0-5
[13] Rcpp_0.11.0        reshape2_1.2.2     scales_0.2.3
stringr_0.6.2      tools_3.0.2
## Subset of the data:
    proteinID g1.s1 g1.s2 g1.s3 g1.c1 g1.c2 g1.c3 g2.s1 g2.s2 g2.s3
g2.c1 g2.c2 g2.c3 g3.s1 g3.s2 g3.s3 g3.c1 g3.c2 g3.c3 g4.s1 g4.s2
g4.s3 g4.c1 g4.c2 g4.c3 g5.s1 g5.s2 g5.s3 g5.c1 g5.c2 g5.c3
    prot1 -9.70583694 -9.940059478 -9.764489183 -9.691937821
-9.547306096 -9.668928704 -9.821333234 -10.00376839 -9.843380585
-10.0789111 -9.958506961 -9.791583706 -10.04996359 -10.10279896
-10.0689715 -9.989303332 -10.05414639 -10.00619809 -9.907032795
-10.09700113 -10.00902876 -10.05603575 -10.26218387 -10.15527373
-9.88009858 -9.748974338 -9.730010667 -9.899956956 -9.773955101
-9.957684691
    prot2 -9.810354967 -9.844319231 -9.896748977 -9.777040294
-9.821308434 -9.906798728 -9.832236541 -9.876359355 -9.935535795
-10.05991278 -9.831098077 -9.789738587 -10.08470861 -10.18515166
-10.10371621 -10.01971224 -9.977142493 -10.09055782 -9.739831978
-9.586647999 -9.949407778 -9.800183583 -9.83900565 -9.943521592
-9.99229056 -9.744850134 -9.794814509 -9.98542989 -9.766324886
-9.95430439
    prot3 -11.70842601 -11.72521838 -11.90389475 -11.98273998
-11.915401 -11.88620205 -11.91603643 -11.96029519 -12.14926486
-12.23846499 -12.26650985 -11.84300821 -12.64562082 -12.41471031
-12.66462278 -12.577619 -12.90001898 -12.31577711 -11.66323243
-11.50283992 -11.4844068 -11.60402491 -11.95270942 -11.68245512
-12.32380181 -12.24294758 -12.23990879 -12.21563403 -12.33730369
-12.437377
    prot4 -10.88942769 -11.16906693 -11.13942576 -11.31332257
-11.04718433 -11.11811122 -11.17687812 -11.12503828 -10.9724186
-11.16837945 -11.19642214 -10.96468249 -11.3975887 -11.28808753
-11.32778647 -11.34124725 -11.30972182 -11.29564372 -10.74370929
-10.92223539 -10.97733154 -11.40528844 -11.1238659 -11.15938598
-11.24937805 -10.8691392 -11.12478375 -10.75566728 -10.99485703
-11.09493115
    prot5 -10.0102959 -9.936796529 -9.964629149 -9.842835973
-9.791578592 -9.773380518 -9.72290866 -9.715837804 -9.79028651
-9.951486129 -9.636225505 -9.820715987 -10.41899204 -10.25269382
-10.26949484 -10.02644184 -10.13120897 -10.20756299 -9.752087376
-9.687001368 -10.07111473 -9.815279198 -9.995624174 -9.993526894
-9.722360141 -9.551502595 -9.551929198 -9.724500546 -9.502769792
-9.65324573
    prot6 -10.34051005 -10.27571947 -10.14968761 -10.17419023
-10.47812301 -10.11019796 -10.40447672 -10.15885481 -10.22900798
-10.26612428 -10.21920493 -10.17186677 -10.66125689 -10.95438025
-10.63751536 -10.65825783 -10.60857688 -10.78516027 -10.33890785
-10.49726978 -10.47100414 -10.64742463 -10.78932619 -10.5318634
-10.26494688 -9.975182247 -10.24870036 -10.2356165 -10.26689552
-10.13061368
    prot7 -10.24930429 -10.37307132 -10.03573128 -10.29985129
-9.991216794 -10.05854902 -10.1958704 -10.30549818 -10.2078462
-10.28795766 -10.23314344 -10.23897922 -9.997472306 -10.27461285
-10.20805608 -10.06261332 -10.24876706 -10.12643737 -9.906088449
-10.07316322 -10.23545822 -10.30970717 -10.40745591 -10.36432166
-10.22423532 -10.25703553 -10.44925268 -9.902554721 -9.891163766
-10.0695915
    prot8 -10.98782595 -10.84184533 -10.76496107 -10.68290092
-10.55763113 -10.91736394 -10.87505278 -10.76474268 -10.58319007
-10.87547281 -10.71948079 -10.95011831 -10.99753277 -11.061728
-10.8852958 -10.86371208 -10.96638746 -11.24112703 -10.46809937
-10.78446288 -10.71240489 -10.80931259 -10.6598091 -10.54801115
-10.70612733 -10.7339808 -10.8184854 -10.53370359 -10.47323989
-10.62675183
    prot9 -8.83857166 -8.736344638 -8.743339515 -8.8152675
-8.743086044 -8.719612156 -8.898093257 -8.902781886 -9.071574958
-8.945970659 -8.862394746 -8.825061244 -8.82313363 -9.161452294
-8.905846232 -8.940119002 -9.024995852 -8.943721201 -8.768488159
-8.802155458 -8.721187011 -8.84850416 -8.931513624 -8.86743278
-8.856904592 -8.675257846 -8.900833162 -8.676117406 -8.758661701
-8.925717389
    prot10 -10.65297508 -10.74532307 -10.65940071 -10.36671791
-10.50431649 -10.54915637 -11.07154003 -10.79884265 -10.97164196
-11.1201714 -11.14821342 -10.9254445 -10.92875918 -10.90806369
-10.77581175 -11.2324716 -11.31360896 -11.01070959 -11.04450945
-10.89694291 -10.76865867 -10.92983387 -11.07365287 -11.43888216
-11.14948441 -10.69611194 -10.85827316 -10.64470128 -10.79046792
-10.86048168
## Code that I am attempting to utilize:
    proteins.mat <- as.matrix(proteins.df)
    treat = c("g1.s","g1.c","g2.s","g2.c","g3.s","g3.c","g4.s","g4.c",
"g5.s","g5.c")
    factors = gl(10,3,labels=treat)
    design <- model.matrix(~0+factors)
    colnames(design) <- treat
## Here is the design for my model:
    > design
       g1.s g1.c g2.s g2.c g3.s g3.c g4.s g4.c g5.s g5.c
    1     1    0    0    0    0    0    0    0    0    0
    2     1    0    0    0    0    0    0    0    0    0
    3     1    0    0    0    0    0    0    0    0    0
    4     0    1    0    0    0    0    0    0    0    0
    5     0    1    0    0    0    0    0    0    0    0
    6     0    1    0    0    0    0    0    0    0    0
    7     0    0    1    0    0    0    0    0    0    0
    8     0    0    1    0    0    0    0    0    0    0
    9     0    0    1    0    0    0    0    0    0    0
    10    0    0    0    1    0    0    0    0    0    0
    11    0    0    0    1    0    0    0    0    0    0
    12    0    0    0    1    0    0    0    0    0    0
    13    0    0    0    0    1    0    0    0    0    0
    14    0    0    0    0    1    0    0    0    0    0
    15    0    0    0    0    1    0    0    0    0    0
    16    0    0    0    0    0    1    0    0    0    0
    17    0    0    0    0    0    1    0    0    0    0
    18    0    0    0    0    0    1    0    0    0    0
    19    0    0    0    0    0    0    1    0    0    0
    20    0    0    0    0    0    0    1    0    0    0
    21    0    0    0    0    0    0    1    0    0    0
    22    0    0    0    0    0    0    0    1    0    0
    23    0    0    0    0    0    0    0    1    0    0
    24    0    0    0    0    0    0    0    1    0    0
    25    0    0    0    0    0    0    0    0    1    0
    26    0    0    0    0    0    0    0    0    1    0
    27    0    0    0    0    0    0    0    0    1    0
    28    0    0    0    0    0    0    0    0    0    1
    29    0    0    0    0    0    0    0    0    0    1
    30    0    0    0    0    0    0    0    0    0    1
    attr(,"assign")
    [1] 1 1 1 1 1 1 1 1 1 1
    attr(,"contrasts")
    attr(,"contrasts")$factors
    [1] "contr.treatment"
## My contrast model. I want to test for interaction, differences
between genotypes, and to see if specific genotypes respond
differently to the treatment from one another:
    cmtx <- makeContrasts(
      GenotypevsTreatment=(g1.s-g1.c)-(g2.s-g2.c)-(g3.s-g3.c)-(g4.s-g4
.c)-(g5.s-g5.c),
genotype=(g1.s+g1.c)-(g2.s+g2.c)-(g3.s+g3.c)-(g4.s+g4.c)-(g5.s+g5.c),
      Treatment=(g1.s+g2.s+g3.s+g4.s+g5.s)-(g1.c+g2.c+g3.c+g4.c+g5.c),
      levels=design)
## What my contrast model looks like, but I don't think this is
correct:
    > cmtx
          Contrasts
    Levels GenotypevsTreatment Genotype Treatment
      g1.s                   1        1         1
      g1.c                  -1        1        -1
      g2.s                  -1       -1         1
      g2.c                   1       -1        -1
      g3.s                  -1       -1         1
      g3.c                   1       -1        -1
      g4.s                  -1       -1         1
      g4.c                   1       -1        -1
      g5.s                  -1       -1         1
      g5.c                   1       -1        -1
## Fitting the linear model by empirical bayes statistics for
differential expression:
    fit <- eBayes(contrasts.fit(lmFit(proteins.mat, design), cmtx))
    topTable(fit, adjust.method="BH")
## The below topTable proteins are the same as the subset of data from
above:
    > topTable(fit, adjust.method="BH")
           GenotypevsTreatment Genotype    Treatment    AveExpr
F      P.Value    adj.P.Val
    prot1        -0.40786338 60.30918  0.073054723  -9.918822 17308.55
1.124646e-39 1.232079e-36
    prot2        -0.09255219 59.60864  0.061701713  -9.897968 15801.43
3.304533e-39 1.232079e-36
    prot3        -0.23880357 73.48557  0.536672827 -12.090016 15650.65
3.701463e-39 1.232079e-36
    prot4        -0.11834000 66.76931  0.305471823 -11.122034 15522.46
4.079731e-39 1.232079e-36
    prot5        -0.15210172 59.21509 -0.183849274  -9.876144 14734.51
7.556112e-39 1.423908e-36
    prot6        -0.15761118 62.87467  0.155340561 -10.389362 14565.87
8.658504e-39 1.423908e-36
    prot7        -0.03886438 61.15652 -0.166795475 -10.182834 14551.88
8.757515e-39 1.423908e-36
    prot8        -0.10425341 64.63523 -0.186904167 -10.780359 14461.18
9.429854e-39 1.423908e-36
    prot9        -0.03426380 53.48057  0.007403722  -8.854471 13713.49
1.767090e-38 2.021378e-36
    prot10       -0.75250251 66.62646  0.327497120 -10.894506 13480.51
2.164184e-38 2.021378e-36
I think I am naively constructing the contrast matrix. The result for
Genotype (above) looks incorrect to me. Ideally, I would like to
figure this out because I wish to apply similar modeling techniques to
an RNAseq dataset from the same experimental samples that I used for
my protein work.
Also, in order to make the correct comparisons, do I also need to
employ the duplicateCorrelation function (pg. 50, lima guide 9.7
Multi-level Experiments)? Something like this:
corfit <- duplicateCorrelation(proteins.mat, design)
> corfit$consensus
[1] -0.7888584
fit <- eBayes(contrasts.fit(lmFit(proteins.mat, design,
corrleation=corfit$consensus), cmtx))
Any insight/corrections would be greatly appreciated, and if I have
forgotten some information that would aid in helping me with this
problem I am happy to cooperate.
Cheers,
Ryan
Ryan M. Ghan
Graduate Research Assistant
Department of Biochemistry and Molecular Biology, MS 330
University of Nevada, Reno
Reno, NV 89557
Howard 205
Lab: (775) 784-4225
rghan@unr.edu
        [[alternative HTML version deleted]]
                    
                
                