**0**wrote:

This is a head of my dataset, with expected counts from RSEM, with 3 biological replicates of 2 samples (control and treatment), as can be seen in the group variable below. > head(rsem, 10) X4_7 X7_9 X8_13 X7_10 X4_27 X5_25 comp0_c0_seq1 0 0 0 0 0 0 comp100001_c0_seq1 0 1 3 2 1 1 comp100006_c0_seq1 2 0 0 0 0 2 comp100009_c0_seq1 0 1 0 0 0 0 comp100010_c0_seq1 0 0 0 0 0 0 comp100012_c0_seq1 0 3 0 3 1 0 comp100013_c0_seq1 0 0 0 3 3 6 comp100017_c0_seq1 0 0 0 0 0 0 comp100018_c0_seq1 2 0 2 3 5 4 comp100019_c0_seq1 0 0 0 3 0 0

> group [1] CS CS CS TS TS TS Levels: CS TS

Contrasts:

Contrasts Levels treatment CS -1 TS 1

After filtering and running edgeR I obtain the results table:

> head(glm$glm_lrt$table, 10) logFC logCPM LR PValue comp0_c0_seq1 0.000000 -4.010505 0.000000 1.0000000000 comp100010_c0_seq1 0.000000 -4.010505 0.000000 1.0000000000 comp100013_c0_seq1 4.891693 -3.025308 12.954572 0.0003191411 comp100017_c0_seq1 0.000000 -4.010505 0.000000 1.0000000000 comp100025_c0_seq1 0.000000 -4.010505 0.000000 1.0000000000 comp100035_c0_seq1 0.000000 -4.010505 0.000000 1.0000000000 comp100037_c0_seq1 0.000000 -4.010505 0.000000 1.0000000000 comp100038_c0_seq1 0.000000 -4.010505 0.000000 1.0000000000 comp100039_c0_seq1 0.000000 -4.010505 0.000000 1.0000000000 comp10003_c0_seq1 1.186745 -1.847143 2.686438 0.1012058293

These are the respective coefficients (I guess):

> head(glm$glm_lrt$coefficients, 10) CS TS comp0_c0_seq1 -19.36797 -19.36797 comp100010_c0_seq1 -19.36797 -19.36797 comp100013_c0_seq1 -19.36797 -15.97731 comp100017_c0_seq1 -19.36797 -19.36797 comp100025_c0_seq1 -19.36797 -19.36797 comp100035_c0_seq1 -19.36797 -19.36797 comp100037_c0_seq1 -19.36797 -19.36797 comp100038_c0_seq1 -19.36797 -19.36797 comp100039_c0_seq1 -19.36797 -19.36797 comp10003_c0_seq1 -15.84720 -15.02462

> hist(glm$glm_lrt$coefficients)

**Is it normal to have only negative values?**

I guess these are the b(g) coefficients from:

"Here **x***i* is a vector of covariates that specifies the treatment conditions applied to RNA sample *i*, and β*g* is a vector of regression coefficients by which the covariate effects are mediated for gene *g*." from http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3378882/

In the edgeR code (glmfit.R):

logFC <- drop((glmfit$coefficients %*% contrast)/log(2))

Where I guess the glmfit$coefficients are the beta(g) above and the "contrast" variable is the x(i) vector.

**Is this the same formula? If not, what is the difference? If so, why is log(mu(gi)) = logFC?**

From http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3378882/ again:

"We denote the total number of mapped reads in library *i* by *N**i* and the number that map to the *g*-th gene by *y**gi*."

Therefore, I would understand that

logFC = log(mu(gi) - mu(gj)) ~ x(i)*beta(i)+x(j)*beta(j)

for samples i, j. **Is that correct?**

Thank you in advance

**33k**• written 3.2 years ago by cpcantalapiedra •

**0**

I can probably guess, but how did you parametrize your design matrix? Also, what's your call to

`glmLRT`

?19kdesign <- model.matrix(~0+group)

colnames(design) <- levels(glm$samples$group)

my.constrasts <- makeContrasts(treatment = TS-CS, levels=design)

glm$glm_fit <- function(glm, design){

glm <- estimateGLMCommonDisp(glm, design)

glm <- estimateGLMTrendedDisp(glm, design)

glm <- estimateGLMTagwiseDisp(glm, design)

return(glm)

}

glm$glm_lrt <- glmLRT(glm$glm_fit, contrast=my.constrasts[,"treatment"])

0