EdgeR Design matrix not of full rank. The following coefficients not estimable errorR
@gordon-smyth
WEHI, Melbourne, Australia
Dear Eugene,

According to your design, Sample 31 is a unique treatment unto itself, and also a unique batch unto itself. Obviously it is impossible to estimate both the batch effect and the treatment effect from one sample. Hence the error message.

Best wishes
Gordon

> Date: Fri, 20 Dec 2013 16:49:43 -0800 (PST)
> From: "Eugene Bolotin [guest]" <guest at bioconductor.org>
> To: bioconductor at r-project.org, elbolotin at gmail.com
> Subject: [BioC] EdgeR Design matrix not of full rank. The following
>     coefficients not estimable erroR
>
> Hi I have the following samples:
>
> batch
>  [1] 1802 1802 1802 1802 1802 1802 1802 1802 1802 1802 1802 2055 1802 1802 2055
> [16] 2055 2055 2055 2055 2055 2055 2055 2055 2055 2055 2055 2055 2055 1802 1802
> [31] 1157 1802 1802 1802 1802 1802 1802 1802 1802 1802 1802 1802 1802 2055 2055
> [46] 2055 2055 2055 2055 2055 2055 2055 2055 2055 2055
> Levels: 1157 1802 2055
>
> treatment
>  [1] TCGA-BR-6452 TCGA-BR-6453 tumor        TCGA-BR-6454 tumor
>  [6] TCGA-BR-6455 TCGA-BR-6456 TCGA-BR-6457 tumor        TCGA-BR-6458
> [11] tumor        TCGA-BR-6563 TCGA-BR-6565 TCGA-BR-6566 TCGA-BR-7196
> [16] TCGA-BR-7703 tumor        TCGA-BR-7704 tumor        TCGA-BR-7707
> [21] TCGA-BR-7715 tumor        TCGA-BR-7716 tumor        TCGA-BR-7717
> [26] tumor        TCGA-BR-7723 TCGA-CD-5804 TCGA-CG-4437 TCGA-CG-4441
> [31] TCGA-CG-4476 TCGA-CG-5716 TCGA-D7-6518 TCGA-D7-6519 TCGA-D7-6520
> [36] TCGA-D7-6521 TCGA-D7-6522 TCGA-D7-6524 TCGA-D7-6525 TCGA-D7-6526
> [41] TCGA-D7-6527 TCGA-D7-6528 TCGA-F1-6177 TCGA-F1-6875 TCGA-FP-7735
> [46] tumor        TCGA-FP-7829 tumor        TCGA-HF-7131 TCGA-HF-7132
> [51] TCGA-HF-7133 TCGA-HF-7134 TCGA-HF-7136 TCGA-IN-7806 tumor
> 44 Levels: TCGA-BR-6452 TCGA-BR-6453 TCGA-BR-6454 TCGA-BR-6455 ... tumor
>
> I want to compare each sample from TCGA_X, to average mutant background, I know it is possible, because I was able to do it using standard commands.
> However, when I try to adjust for batch effects as follows:
>
> design=model.matrix(~batch+treatment)
> names(data.frame(design))
> group=treatment
> y=readDGE(files, path=wd, columns=c(1,2), group=group)
> #names(data.frame(design))
> design=model.matrix(~0+batch+treatment)
>
> names(data.frame(design))
> #rownames(design)=colnames(y)
> design
>
> > y = estimateGLMCommonDisp(y, design, verbose=TRUE)
> Error in glmFit.default(y, design = design, dispersion = dispersion, offset = offset, :
>   Design matrix not of full rank. The following coefficients not estimable:
>   treatmentTCGA-CG-4476
>
> as far as i can tell it is because the batch 1157 contains a normal sample but does not contain any tumor samples.
> Is there a way around that?
> Thanks,
> Eugene
>
> -- output of sessionInfo():
>
> > sessionInfo()
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] edgeR_3.4.2 limma_3.18.6
>
> loaded via a namespace (and not attached):
> [1] tools_3.0.2
>
> --
> Sent via the guest posting facility at bioconductor.org.
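The point about Sample 31 can be reproduced on a toy design in base R. This is only a sketch: the seven samples and the patient labels P1 to P3 are made up for illustration and are not the poster's data. Listing redundant columns from the pivoted QR decomposition is one common way to name non-estimable coefficients; it is shown here as an assumption about how such a check can be done, not as a statement of edgeR's internals.

```r
# Toy data (hypothetical, not the poster's): batches 1802 and 2055 each hold
# "tumor" samples plus one patient sample; batch 1157 holds a single sample
# that is also the only member of its treatment level, like Sample 31.
batch <- factor(c("1802", "1802", "1802", "2055", "2055", "2055", "1157"))
treatment <- factor(
  c("tumor", "tumor", "P1", "tumor", "tumor", "P2", "P3"),
  levels = c("tumor", "P1", "P2", "P3")
)

# Same formula as in the post: no intercept, one column per batch, then
# one column per non-reference treatment level.
design <- model.matrix(~ 0 + batch + treatment)

# Compare the number of coefficients with the rank of the matrix.
n_coef <- ncol(design)
design_rank <- qr(design)$rank

# The lone sample makes these two columns identical, so the model cannot
# tell the batch 1157 effect apart from the P3 effect.
same_column <- all(design[, "batch1157"] == design[, "treatmentP3"])

# Columns that the pivoted QR decomposition places beyond the rank are
# linear combinations of the columns before them.
qr_fit <- qr(design)
not_estimable <- colnames(design)[qr_fit$pivot[-seq_len(qr_fit$rank)]]

print(c(coefficients = n_coef, rank = design_rank))
print(same_column)
print(not_estimable)
```

In this toy case the design has one more coefficient than its rank, which is the situation the error message in the post describes.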
@eugene-bolotin-6300
Dear Gordon,

I apologize if I was a bit unclear. I simplified my problem a little for the post so that it would fit into a Bioconductor post. I actually have 10+ samples in batch 1157, but that batch does not contain any "tumor" samples. I have additional similar batches, some with "tumor" samples and some without. I want to remove batch-specific differences between all samples.

edgeR, however, gives me the same error no matter how many samples I have in the batch, but does not give me this error if I remove all batches that do not contain any "tumor" samples.

Can I just take the residuals of the logged count data after performing a linear regression on the batch factor? Can I then feed the residuals into edgeR linear modeling? I want to compare how much each sample/patient/vector differs from the average "tumor" sample. The batches are quite large, with >10 samples each, and I have ~300 total samples.

Thanks a ton,
Eugene

On Sat, Dec 21, 2013 at 3:31 AM, Gordon K Smyth <smyth@wehi.edu.au> wrote:

> Dear Eugene,
>
> According to your design, Sample 31 is a unique treatment unto itself, and
> also a unique batch unto itself. Obviously it is impossible to estimate
> both the batch effect and the treatment effect from one sample. Hence the
> error message.
>
> Best wishes
> Gordon
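The behaviour described in this follow-up (same error however many samples the batch holds, no error once batches without "tumor" samples are removed) is what the algebra of the design matrix predicts when every non-tumor sample is its own treatment level. The base R sketch below uses made-up data to illustrate that: nine hypothetical samples, with patient labels P1 to P5 invented for the example. It only shows the rank behaviour and does not address whether modeling residuals is appropriate.

```r
# Toy data (hypothetical): batches 1802 and 2055 contain "tumor" samples,
# batch 1157 contains three patient samples and no "tumor" sample.
samples <- data.frame(
  batch = factor(rep(c("1802", "2055", "1157"), each = 3)),
  treatment = factor(
    c("tumor", "tumor", "P1", "tumor", "tumor", "P2", "P3", "P4", "P5"),
    levels = c("tumor", "P1", "P2", "P3", "P4", "P5")
  )
)

# Design with all batches, using the formula from the original post.
design_all <- model.matrix(~ 0 + batch + treatment, data = samples)
rank_all <- qr(design_all)$rank

# The batch 1157 column is the sum of the columns of its own samples,
# so adding more such samples to the batch never removes the redundancy.
batch_is_sum <- all(
  design_all[, "batch1157"] ==
    rowSums(design_all[, c("treatmentP3", "treatmentP4", "treatmentP5")])
)

# Drop the batch without "tumor" samples and rebuild the design.
kept <- droplevels(samples[samples$batch != "1157", ])
design_kept <- model.matrix(~ 0 + batch + treatment, data = kept)
rank_kept <- qr(design_kept)$rank

print(c(columns = ncol(design_all), rank = rank_all))
print(batch_is_sum)
print(c(columns = ncol(design_kept), rank = rank_kept))
```

With the tumor-free batch included, the design has more columns than its rank; after dropping that batch, columns and rank agree in this toy example.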