Error in glmFit.default() : y is not a numeric matrix
1
2
Entering edit mode
borpano ▴ 20
@borpano-17039
Last seen 4.6 years ago

I am trying to do differential gene expression analysis on a transcriptome I generated. I generated a normalized count matrix called Trans_Atlas first and now I am running into an error using glmFit command.

Here is my script so far:

> Trans_Atlas <- read.table("normalized_count_matrix", header=T, row.name=1, com='', check.names=F)
> group <- factor(c(1,1,2,2,3,3,4,4,5,5,6,6,7,7))
> design <- model.matrix(~0+group, data=Trans_Atlas$samples)
> test <- estimateDisp(Trans_Atlas,design)
> fit <- glmFit(test,design)
Error in glmFit.default(test, design) : y is not a numeric matrix
 

I'm confused on this error as test is a list with 6 values:

common.dispersion : num 0.482

trended.dispersion : num [1:30558]

tagwise.dispersion : num [1:30558]

span : num 0.281

prior.df : num 6.21

prior.n : num 0.888

I'm new to R where did I mess up?

Thanks

edger • 2.6k views
ADD COMMENT
0
Entering edit mode
Aaron Lun ★ 28k
@alun
Last seen 6 hours ago
The city by the bay

A list is not a matrix, so that's why it doesn't work.

There are a number of issues with what you are doing. For starters, you should supply the raw counts to edgeR, not normalized values. You should be using normalization factors; you should be filtering; and you should be using the DGEList data structure to coordinate this across the analysis.

I strongly suggest reading the edgeR user's guide, which will save time for both me and you.

ADD COMMENT
0
Entering edit mode

Hi even I had this problem, honestly, I read the manual and I am still having difficulty. It is not just about reading manual it is about knowing how to use these functions. I like the user borpano do not know what to do when I have this issue.

ADD REPLY
0
Entering edit mode

As for normalization factors I would like to know how you suggest doing TMM normalization Aaron Lun as you are supposed to supply the original counts?

ADD REPLY
0
Entering edit mode

Yes, see ?calcNormFactors.

ADD REPLY
0
Entering edit mode

I know this one however, I don't think we can just supply it to  a   in the following command

y <- estimateDisp( a, design) .

I wanted to check as I am not used to using this package. Any help you can give would be much appreciated.

ADD REPLY
0
Entering edit mode

If you read ?estimateDisp, you will see that the output of estimateDisp changes depending on whether a is a DGEList or a matrix. All use cases in the user's guide will supply a DGEList to estimateDisp; this is what you should be doing as well.

ADD REPLY
0
Entering edit mode

I get that we should be supplying a DGEList but can we supply TMM normalized values as it does not seem okay to in the documentation to supply it to estimateDisp? However, TMM normalization in general is encouraged, I believe?

ADD REPLY
0
Entering edit mode

I get that we should be supplying a DGEList

Then do it.

can we supply TMM normalized values

edgeR has no concept of "TMM normalized values", see Section 2.7.6 of the user's guide. There are TMM normalization factors, but the counts should not be modified during the course of the analysis. You should not supply any kind of scaled counts to estimateDisp or the other edgeR functions.

it does not seem okay to in the documentation to supply it to estimateDisp?

I don't understand what you're complaining about. Just create a DGEList object and use it for your analysis steps including estimateDisp - it's that easy. Why are you trying to fight this?

However, TMM normalization in general is encouraged, I believe?

Use calcNormFactors on a DGEList, and you will get a DGEList in return. This can be used for input into the remaining edgeR functions, including estimateDisp.

ADD REPLY
0
Entering edit mode

I was not trying to fight  I just didn't understand so from that I get that I should just use a dgelist of counts fed in estimateDisp. Thank you for all your help! :)

ADD REPLY
0
Entering edit mode

If you have been following the instructions in the user's guide, you cannot possibly have the problems described by the OP, because we never tell users to run the commands in that manner! Specifically, all calls to estimateDisp are performed on DGEList objects, not on the count matrices themselves. In addition, we do not accept "normalized counts" as input - see the explanation in Section 2.7.6. I would suggest the following:

  1. Read the workflow here.
  2. Read the documentation for individual functions, see ?estimateDisp.
  3. Read the user's guide again.
  4. Read the posting guide and describe your specific problem in detail, with code.
ADD REPLY
0
Entering edit mode

Oh in which case, while I see your point, I have used a DGEList (I did not see that the person did not use a DGEList) and yet I am still getting the error that this person got. As for the user guide please do not refer to that I do not understand what I am doing wrong, I have read the manual.

ADD REPLY
0
Entering edit mode

I see you must have ignored point 4 in my comment above. I'm not psychic, I can't figure out what you did wrong if you don't show any code! In the absence of any further information, I can only refer you to the user's guide, or the workflow that I mentioned above, or ?estimateDisp.
 

ADD REPLY
0
Entering edit mode

I did not ignore it intentionally. I did not realize that I needed to in this case. Sorry. Thank you for the help though.

ADD REPLY
0
Entering edit mode

I'm confused as to why the estimateDisp is creating this list object instead of the correct matrix. To generate this count matrix I used some of the perl scripts include in the Trinity de novo transcriptome Assember which output this R script to create both a DGEList and normalized via normalization factors all using the edgeR package. 

> library(edgeR)
Loading required package: limma
>
> rnaseqMatrix = read.table("RSEM.isoform.TPM.not_cross_norm", header=T, row.names=1, com='', check.names=F)
> rnaseqMatrix = as.matrix(rnaseqMatrix)
> rnaseqMatrix = round(rnaseqMatrix)
> exp_study = DGEList(counts=rnaseqMatrix, group=factor(colnames(rnaseqMatrix)))
> exp_study = calcNormFactors(exp_study)
> exp_study$samples$eff.lib.size = exp_study$samples$lib.size * exp_study$samples$norm.factors
> write.table(exp_study$samples, file="RSEM.isoform.TPM.not_cross_norm.TMM_info.txt", quote=F, sep="\t", row.names=F)
>

Do I have to reformat the "Trans_Atlas into a DGEList data structure?

ADD REPLY
1
Entering edit mode

Yes. Read the "Value" section in ?estimateDisp, especially pertaining to estimateDisp.default.

ADD REPLY
0
Entering edit mode

This solved my issue thank you very much.

ADD REPLY

Login before adding your answer.

Traffic: 758 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6