Search
Question: Another Matrix is not full rank question, multi-factor design issues
0
gravatar for pw349
21 days ago by
pw3490
pw3490 wrote:

Hello,

Thank you in advance for your help with this. I have been trying very hard to work this out and have tried to follow the vignette and looked through Q&As on this site, but to no avail.

My experimental outline: I pretreated cells to either "control" or "RecLG" for 3 days and then on the 4th day treated them to control or low glucose conditions. This gave me four groups "C", "LG", "aRLG", and "RLG".

I am interested in differential gene expression between control and RecLG pretreatments, and then the differences between each of the treatment groups. my coldata looks like this:

as.data.frame(coldata2)
       pretreatment treatment treatment.n
AH_1          RecLG      aRLG           1
AH_2          RecLG      aRLG           1
AH_4          RecLG      aRLG           1
AH_5          RecLG      aRLG           1
AH_6          RecLG      aRLG           1
CONT_1      control         C           2
CONT_3      control         C           2
CONT_4      control         C           2
CONT_5      control         C           2
CONT_6      control         C           2
HYPO_1      control        LG           1
HYPO_2      control        LG           1
HYPO_3      control        LG           1
HYPO_5      control        LG           1
HYPO_6      control        LG           1
RH_1          RecLG       RLG           2
RH_2          RecLG       RLG           2
RH_3          RecLG       RLG           2
RH_5          RecLG       RLG           2
RH_6          RecLG       RLG           2

In each group there are 5 replicates. As you can see I made a treatment.n column to account for the nesting of "aRLG" and "RLG" within the "RecLG" pretreatment group and "C" and "LG" within the "control pretreatment group.

When I run:

dds <- DESeqDataSetFromMatrix(countData = countdata, colData = coldata2, design = ~pretreatment + pretreatment:treatment.n + treatment)

I get the error "Error in checkFullRank(modelMatrix)"

For the life of me, I can not figure this out.

The model matrix looks like this:

 model.matrix(~pretreatment + pretreatment:treatment.n + treatment, coldata2)
       (Intercept) pretreatmentRecLG treatmentC treatmentLG treatmentRLG pretreatmentcontrol:treatment.n2
AH_1             1                 1          0           0            0                                0
AH_2             1                 1          0           0            0                                0
AH_4             1                 1          0           0            0                                0
AH_5             1                 1          0           0            0                                0
AH_6             1                 1          0           0            0                                0
CONT_1           1                 0          1           0            0                                1
CONT_3           1                 0          1           0            0                                1
CONT_4           1                 0          1           0            0                                1
CONT_5           1                 0          1           0            0                                1
CONT_6           1                 0          1           0            0                                1
HYPO_1           1                 0          0           1            0                                0
HYPO_2           1                 0          0           1            0                                0
HYPO_3           1                 0          0           1            0                                0
HYPO_5           1                 0          0           1            0                                0
HYPO_6           1                 0          0           1            0                                0
RH_1             1                 1          0           0            1                                0
RH_2             1                 1          0           0            1                                0
RH_3             1                 1          0           0            1                                0
RH_5             1                 1          0           0            1                                0
RH_6             1                 1          0           0            1                                0
       pretreatmentRecLG:treatment.n2
AH_1                                0
AH_2                                0
AH_4                                0
AH_5                                0
AH_6                                0
CONT_1                              0
CONT_3                              0
CONT_4                              0
CONT_5                              0
CONT_6                              0
HYPO_1                              0
HYPO_2                              0
HYPO_3                              0
HYPO_5                              0
HYPO_6                              0
RH_1                                1
RH_2                                1
RH_3                                1
RH_5                                1
RH_6                                1
attr(,"assign")
[1] 0 1 2 2 2 3 3
attr(,"contrasts")
attr(,"contrasts")$pretreatment
[1] "contr.treatment"

attr(,"contrasts")$treatment.n
[1] "contr.treatment"

attr(,"contrasts")$treatment
[1] "contr.treatment"

I'm just really stuck with this, I don't know where to start to sort this problem out so I've come here for help. Anything is really appreciated.

Many thanks,

Paul

ADD COMMENTlink modified 21 days ago by James W. MacDonald48k • written 21 days ago by pw3490

FYI, session info is below:

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8       
 [4] LC_COLLATE=en_GB.UTF-8     LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] RColorBrewer_1.1-2         pheatmap_1.0.10            ggplot2_3.0.0             
 [4] dplyr_0.7.6                genefilter_1.58.1          RUVSeq_1.10.0             
 [7] edgeR_3.18.1               limma_3.32.10              EDASeq_2.10.0             
[10] ShortRead_1.34.2           GenomicAlignments_1.12.2   Rsamtools_1.28.0          
[13] Biostrings_2.44.2          XVector_0.16.0             BiocParallel_1.10.1       
[16] DESeq2_1.16.1              SummarizedExperiment_1.6.5 DelayedArray_0.2.7        
[19] matrixStats_0.54.0         Biobase_2.38.0             GenomicRanges_1.28.6      
[22] GenomeInfoDb_1.12.3        IRanges_2.10.5             S4Vectors_0.14.7          
[25] BiocGenerics_0.24.0       

loaded via a namespace (and not attached):
 [1] bitops_1.0-6            bit64_0.9-7             tools_3.4.3             backports_1.1.2        
 [5] R6_2.2.2                rpart_4.1-13            Hmisc_4.1-1             DBI_1.0.0              
 [9] lazyeval_0.2.1          colorspace_1.3-2        nnet_7.3-12             withr_2.1.2            
[13] tidyselect_0.2.4        gridExtra_2.3           bit_1.1-14              compiler_3.4.3         
[17] htmlTable_1.12          rtracklayer_1.36.6      scales_1.0.0            checkmate_1.8.5        
[21] DESeq_1.28.0            stringr_1.3.1           digest_0.6.17           foreign_0.8-71         
[25] R.utils_2.7.0           base64enc_0.1-3         pkgconfig_2.0.2         htmltools_0.3.6        
[29] htmlwidgets_1.3         rlang_0.2.2             rstudioapi_0.7          RSQLite_2.1.1          
[33] bindr_0.1.1             hwriter_1.3.2           acepack_1.4.1           R.oo_1.22.0            
[37] RCurl_1.95-4.11         magrittr_1.5            GenomeInfoDbData_0.99.0 Formula_1.2-3          
[41] Matrix_1.2-14           Rcpp_0.12.18            munsell_0.5.0           R.methodsS3_1.7.1      
[45] stringi_1.2.4           yaml_2.2.0              MASS_7.3-50             zlibbioc_1.22.0        
[49] plyr_1.8.4              grid_3.4.3              blob_1.1.1              crayon_1.3.4           
[53] lattice_0.20-35         splines_3.4.3           GenomicFeatures_1.28.5  annotate_1.54.0        
[57] locfit_1.5-9.1          knitr_1.20              pillar_1.3.0            geneplotter_1.56.0     
[61] biomaRt_2.32.1          XML_3.98-1.16           glue_1.3.0              latticeExtra_0.6-28    
[65] data.table_1.11.4       gtable_0.2.0            purrr_0.2.5             assertthat_0.2.0       
[69] aroma.light_3.6.0       xtable_1.8-3            survival_2.42-6         tibble_1.4.2           
[73] AnnotationDbi_1.40.0    memoise_1.1.0           bindrcpp_0.2.2          cluster_2.0.7-1 
ADD REPLYlink written 21 days ago by pw3490
0
gravatar for James W. MacDonald
21 days ago by
United States
James W. MacDonald48k wrote:

There is no nesting here. Nesting means there is something inherently different between the two groups. Like if you had two different cell types. But if these are all the same cell types, you have four groups. You can paste the first two columns of your coldata, separated by a period and use that as your factor, so you have four groups, RecLG.aRLG, control.c, control.LG, RecLG.RLG, and then make whatever comparisons you care about.

 

ADD COMMENTlink written 21 days ago by James W. MacDonald48k

Thanks James for the feedback. Could I just run everything using just the treatment column then? i.e

dds2 <- DESeqDataSetFromMatrix(countData = countdata, colData = coldata, design = ~treatment)
dds2 <- DESeq(dds2)
rld_dds2 <- rlog(dds2, blind = FALSE)
results <-DESeq(dds2)
DESeq2_results=results(results)

And then contrast between groups? i.e

results_CvLG <- results(dds2, contrast=c("treatment","C","LG"))
head(results_CvLG)
log2 fold change (MLE): treatment C vs LG 
Wald test p-value: treatment C vs LG 
DataFrame with 6 rows and 6 columns
                      baseMean log2FoldChange      lfcSE        stat       pvalue       padj
                     <numeric>      <numeric>  <numeric>   <numeric>    <numeric>  <numeric>
ENSG00000227232.5     4.078592     -0.0209465 0.89695220 -0.02335297 0.9813687199         NA
ENSG00000279457.4     3.638559     -0.6458330 0.85428877 -0.75598916 0.4496556937         NA
ENSG00000228463.10    1.594748      2.0704358 1.41400168  1.46423856 0.1431287784         NA
ENSG00000225972.1     1.773500      0.3730648 1.18547461  0.31469661 0.7529920184         NA
ENSG00000225630.1  1314.483271     -0.3273191 0.09114708 -3.59110859 0.0003292744 0.07313782
ENSG00000237973.1   693.793225     -0.3296993 0.11795461 -2.79513748 0.0051877649 0.22739225
ADD REPLYlink modified 20 days ago • written 20 days ago by pw3490

Hi, Can I just bump this again to see if I understood your answer correctly, please?

Thank you James

ADD REPLYlink written 6 days ago by pw3490

Yes that is how you would make comparisons between levels of a single factor.

ADD REPLYlink written 6 days ago by Michael Love20k

Thanks James and Michael!

ADD REPLYlink written 1 day ago by pw3490
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 364 users visited in the last hour