Question

Error in glmFit.default() : y is not a numeric matrix

2

Entering edit mode

borpano ▴ 20

@borpano-17039

Last seen 5.5 years ago

I am trying to do differential gene expression analysis on a transcriptome I generated. I generated a normalized count matrix called Trans_Atlas first and now I am running into an error using glmFit command.

Here is my script so far:

> Trans_Atlas <- read.table("normalized_count_matrix", header=T, row.name=1, com='', check.names=F)
> group <- factor(c(1,1,2,2,3,3,4,4,5,5,6,6,7,7))
> design <- model.matrix(~0+group, data=Trans_Atlas$samples)
> test <- estimateDisp(Trans_Atlas,design)
> fit <- glmFit(test,design)
Error in glmFit.default(test, design) : y is not a numeric matrix

I'm confused on this error as test is a list with 6 values:

common.dispersion : num 0.482

trended.dispersion : num [1:30558]

tagwise.dispersion : num [1:30558]

span : num 0.281

prior.df : num 6.21

prior.n : num 0.888

I'm new to R where did I mess up?

Thanks

edger • 3.5k views

ADD COMMENT • link updated 6.6 years ago by Aaron Lun ★ 28k • written 6.6 years ago by borpano ▴ 20

score 0 · Answer 1 · 2018-08-23

0

Entering edit mode

Aaron Lun ★ 28k

@alun

Last seen 4 hours ago

The city by the bay

A list is not a matrix, so that's why it doesn't work.

There are a number of issues with what you are doing. For starters, you should supply the raw counts to edgeR, not normalized values. You should be using normalization factors; you should be filtering; and you should be using the DGEList data structure to coordinate this across the analysis.

I strongly suggest reading the edgeR user's guide, which will save time for both me and you.

ADD COMMENT • link 6.6 years ago Aaron Lun ★ 28k

0

Entering edit mode

Hi even I had this problem, honestly, I read the manual and I am still having difficulty. It is not just about reading manual it is about knowing how to use these functions. I like the user borpano do not know what to do when I have this issue.

ADD REPLY • link 6.6 years ago yr542 • 0

0

Entering edit mode

As for normalization factors I would like to know how you suggest doing TMM normalization Aaron Lun as you are supposed to supply the original counts?

ADD REPLY • link 6.6 years ago yr542 • 0

0

Entering edit mode

Yes, see ?calcNormFactors.

ADD REPLY • link 6.6 years ago Aaron Lun ★ 28k

0

Entering edit mode

I know this one however, I don't think we can just supply it to a in the following command

y <- estimateDisp( a, design) .

I wanted to check as I am not used to using this package. Any help you can give would be much appreciated.

ADD REPLY • link 6.6 years ago yr542 • 0

0

Entering edit mode

If you read ?estimateDisp, you will see that the output of estimateDisp changes depending on whether a is a DGEList or a matrix. All use cases in the user's guide will supply a DGEList to estimateDisp; this is what you should be doing as well.

ADD REPLY • link 6.6 years ago Aaron Lun ★ 28k

0

Entering edit mode

I get that we should be supplying a DGEList but can we supply TMM normalized values as it does not seem okay to in the documentation to supply it to estimateDisp? However, TMM normalization in general is encouraged, I believe?

ADD REPLY • link 6.6 years ago yr542 • 0

0

Entering edit mode

I get that we should be supplying a DGEList

Then do it.

can we supply TMM normalized values

edgeR has no concept of "TMM normalized values", see Section 2.7.6 of the user's guide. There are TMM normalization factors, but the counts should not be modified during the course of the analysis. You should not supply any kind of scaled counts to estimateDisp or the other edgeR functions.

it does not seem okay to in the documentation to supply it to estimateDisp?

I don't understand what you're complaining about. Just create a DGEList object and use it for your analysis steps including estimateDisp - it's that easy. Why are you trying to fight this?

However, TMM normalization in general is encouraged, I believe?

Use calcNormFactors on a DGEList, and you will get a DGEList in return. This can be used for input into the remaining edgeR functions, including estimateDisp.

ADD REPLY • link 6.6 years ago Aaron Lun ★ 28k

0

Entering edit mode

I was not trying to fight I just didn't understand so from that I get that I should just use a dgelist of counts fed in estimateDisp. Thank you for all your help! :)

ADD REPLY • link 6.6 years ago yr542 • 0

0

Entering edit mode

If you have been following the instructions in the user's guide, you cannot possibly have the problems described by the OP, because we never tell users to run the commands in that manner! Specifically, all calls to estimateDisp are performed on DGEList objects, not on the count matrices themselves. In addition, we do not accept "normalized counts" as input - see the explanation in Section 2.7.6. I would suggest the following:

Read the workflow here.
Read the documentation for individual functions, see ?estimateDisp.
Read the user's guide again.
Read the posting guide and describe your specific problem in detail, with code.

ADD REPLY • link 6.6 years ago Aaron Lun ★ 28k

0

Entering edit mode

Oh in which case, while I see your point, I have used a DGEList (I did not see that the person did not use a DGEList) and yet I am still getting the error that this person got. As for the user guide please do not refer to that I do not understand what I am doing wrong, I have read the manual.

ADD REPLY • link 6.6 years ago yr542 • 0

0

Entering edit mode

I see you must have ignored point 4 in my comment above. I'm not psychic, I can't figure out what you did wrong if you don't show any code! In the absence of any further information, I can only refer you to the user's guide, or the workflow that I mentioned above, or ?estimateDisp.

ADD REPLY • link 6.6 years ago Aaron Lun ★ 28k

0

Entering edit mode

I did not ignore it intentionally. I did not realize that I needed to in this case. Sorry. Thank you for the help though.

ADD REPLY • link 6.6 years ago yr542 • 0

0

Entering edit mode

I'm confused as to why the estimateDisp is creating this list object instead of the correct matrix. To generate this count matrix I used some of the perl scripts include in the Trinity de novo transcriptome Assember which output this R script to create both a DGEList and normalized via normalization factors all using the edgeR package.

> library(edgeR)
Loading required package: limma
>
> rnaseqMatrix = read.table("RSEM.isoform.TPM.not_cross_norm", header=T, row.names=1, com='', check.names=F)
> rnaseqMatrix = as.matrix(rnaseqMatrix)
> rnaseqMatrix = round(rnaseqMatrix)
> exp_study = DGEList(counts=rnaseqMatrix, group=factor(colnames(rnaseqMatrix)))
> exp_study = calcNormFactors(exp_study)
> exp_study$samples$eff.lib.size = exp_study$samples$lib.size * exp_study$samples$norm.factors
> write.table(exp_study$samples, file="RSEM.isoform.TPM.not_cross_norm.TMM_info.txt", quote=F, sep="\t", row.names=F)
>

Do I have to reformat the "Trans_Atlas into a DGEList data structure?

ADD REPLY • link 6.6 years ago borpano ▴ 20

1

Entering edit mode

Yes. Read the "Value" section in ?estimateDisp, especially pertaining to estimateDisp.default.