DESeq2: weird values for log2FC and p-value
@biomandressa-23774

Hello,

I'm analyzing TCGA data (tumor vs. normal) with DESeq2, but some genes come back with very large log2 fold changes and p-values, like this:

                  | baseMean            | log2FoldChange   | lfcSE             | stat                 | pvalue   | padj
ENSG00000121691.4 | 278.024.793.462.181 | -147192720471788 | 0.170612654907734 | -862.730.379.240.558 | 6,28E-04 | 2,62E-02
ENSG00000250722.4 | 438.211.180.060.727 | -103006615299621 | 0.160160306303726 | -643.146.967.415.759 | 1,26E+04 | 2,05E+05

Is this OK, or did I do something wrong? I used the HTSeq counts data.

Here is how I performed the analysis.

The data were set up like this:

Count data: rows = gene IDs, columns = sample names.

Metadata: rows = the same sample names in the same order; columns = condition (Target or Control) and subject (1/1, 2/2, ...; samples paired by patient).

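library(DESeq2)

dds <- DESeqDataSetFromMatrix(countData = rawCountTable, colData = sampleInfo, design = ~ subject + condition)
# Keep only genes with more than one read in total across all samples
dds <- dds[rowSums(counts(dds)) > 1, ]
# Make "Control" the reference level of the condition factor
dds$condition <- relevel(dds$condition, ref = "Control")
# DESeq() already performs these two steps, so the explicit calls are redundant
dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)
dds <- DESeq(dds)
res <- results(dds, contrast = c("condition", "Target", "Control"), alpha = 0.05)
res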
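
The layout described above (metadata rows in the same order as the count columns, one Target and one Control sample per subject) can be verified before building the dataset. The following is a minimal base-R sketch on made-up data; `toy_counts`, `toy_info`, and the sample IDs are hypothetical stand-ins for `rawCountTable` and `sampleInfo`, not the poster's actual objects.

```r
# Hypothetical toy inputs standing in for rawCountTable and sampleInfo
sample_ids <- c("P1-T", "P1-N", "P2-T", "P2-N")

toy_counts <- matrix(
  c(10L, 200L, 15L, 180L,
     0L,   3L,  1L,   2L),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), sample_ids)
)

toy_info <- data.frame(
  subject   = factor(c("1", "1", "2", "2")),
  condition = factor(c("Target", "Control", "Target", "Control")),
  row.names = sample_ids
)

# Count columns and metadata rows must carry the same names in the same order
names_match <- identical(colnames(toy_counts), rownames(toy_info))

# For a paired design, each subject should have exactly one sample per condition
pairs_ok <- all(table(toy_info$subject, toy_info$condition) == 1)

print(c(names_match = names_match, pairs_ok = pairs_ok))
```

If either check fails on real data, the sample order or the pairing should be fixed before any model is fitted.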

deseq2 rnaseq r • 762 views
@mikelove

Your counts look wrong (I'm looking at baseMean).

What do you get with:

summary(counts(dds))

Hi, I didn't generate the counts myself. I downloaded the files from TCGA, which provides HTSeq counts and HTSeq FPKM for each patient, and I used the HTSeq counts. I contacted GDC portal support and they said that "There is no further transformation after the HTSeq-Counts data are acquired."

Output of summary(counts(dds)) (I'm posting only part of it because of the word limit):

TCGA-BC-A10Q-T TCGA-BC-A10Q-N TCGA-BC-A10R-T
Min. : 0 Min. : 0 Min. : 0
1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
Median : 2 Median : 1 Median : 2
Mean : 1062 Mean : 1074 Mean : 1423
3rd Qu.: 143 3rd Qu.: 71 3rd Qu.: 140
Max. :1755800 Max. :5453481 Max. :5899751

TCGA-BC-A10R-N TCGA-BC-A10T-T TCGA-BC-A10T-N
Min. : 0 Min. : 0 Min. : 0
1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
Median : 1 Median : 2 Median : 1
Mean : 1370 Mean : 1501 Mean : 973
3rd Qu.: 93 3rd Qu.: 184 3rd Qu.: 61
Max. :7372618 Max. :6450987 Max. :6128977

TCGA-BC-A10U-T TCGA-BC-A10U-N TCGA-BC-A10W-T
Min. : 0 Min. : 0 Min. : 0
1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
Median : 3 Median : 1 Median : 1
Mean : 1355 Mean : 1213 Mean : 1168
3rd Qu.: 183 3rd Qu.: 89 3rd Qu.: 90
Max. :1722281 Max. :6355384 Max. :5701625

TCGA-BC-A10W-N TCGA-BC-A10X-T TCGA-BC-A10X-N
Min. : 0 Min. : 0 Min. : 0
1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
Median : 2 Median : 2 Median : 2
Mean : 1355 Mean : 1481 Mean : 1655
3rd Qu.: 200 3rd Qu.: 151 3rd Qu.: 127
Max. :1833691 Max. :6701929 Max. :9555183

TCGA-BC-A10Z-T TCGA-BC-A10Z-N TCGA-BC-A110-T
Min. : 0 Min. : 0 Min. : 0
1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
Median : 2 Median : 2 Median : 1
Mean : 1418 Mean : 1175 Mean : 637
3rd Qu.: 149 3rd Qu.: 106 3rd Qu.: 68
Max. :1437660 Max. :5699334 Max. :2522799

TCGA-BC-A110-N TCGA-BC-A216-T TCGA-BC-A216-N
Min. : 0 Min. : 0 Min. : 0
1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
Median : 1 Median : 3 Median : 1
Mean : 1131 Mean : 1204 Mean : 1265
3rd Qu.: 73 3rd Qu.: 211 3rd Qu.: 90
Max. :5161790 Max. :3153306 Max. :4076229

TCGA-BD-A2L6-T TCGA-BD-A2L6-N TCGA-BD-A3EP-T
Min. : 0 Min. : 0 Min. : 0.0
1st Qu.: 0 1st Qu.: 0 1st Qu.: 0.0
Median : 2 Median : 2 Median : 2.0
Mean : 1382 Mean : 1292 Mean : 907.5
3rd Qu.: 156 3rd Qu.: 175 3rd Qu.: 155.0
Max. :2671074 Max. :6829693 Max. :1994074.0

TCGA-BD-A3EP-N TCGA-DD-A113-T TCGA-DD-A113-N
Min. : 0 Min. : 0 Min. : 0
1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
Median : 1 Median : 2 Median : 1
Mean : 988 Mean : 1461 Mean : 1008
3rd Qu.: 77 3rd Qu.: 181 3rd Qu.: 67
Max. :7399284 Max. :8729599 Max. :2716549

TCGA-DD-A114-T TCGA-DD-A114-N TCGA-DD-A116-T
Min. : 0.0 Min. : 0 Min. : 0
1st Qu.: 0.0 1st Qu.: 0 1st Qu.: 0
Median : 1.0 Median : 2 Median : 1
Mean : 624.3 Mean : 1331 Mean : 1070
3rd Qu.: 150.0 3rd Qu.: 151 3rd Qu.: 121
Max. :1030351.0 Max. :5764313 Max. :1815554

TCGA-DD-A116-N TCGA-DD-A118-T TCGA-DD-A118-N
Min. : 0 Min. : 0 Min. : 0
1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
Median : 1 Median : 2 Median : 0
Mean : 1373 Mean : 1527 Mean : 774
3rd Qu.: 104 3rd Qu.: 145 3rd Qu.: 52
Max. :10541905 Max. :1887429 Max. :4897499

TCGA-DD-A119-T TCGA-DD-A119-N TCGA-DD-A11A-T
Min. : 0 Min. : 0 Min. : 0
1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
Median : 1 Median : 1 Median : 2
Mean : 1138 Mean : 1182 Mean : 1300
3rd Qu.: 89 3rd Qu.: 75 3rd Qu.: 132
Max. :2470780 Max. :2651164 Max. :4898883

Maybe you can figure out what's going on with a bit of exploration. Is it really the case that the gene you pasted above has a mean count of 278 trillion reads? Take a look at the counts for that gene across samples with counts(dds, normalized=TRUE).

(Again showing only part of the data, but the mean was calculated over all samples.)

counts(dds, normalized=TRUE)[5176,]

TCGA-BC-A10Q-T TCGA-BC-A10Q-N TCGA-BC-A10R-T TCGA-BC-A10R-N
      3584.647      50295.225      11458.843      53618.712
TCGA-BC-A10T-T TCGA-BC-A10T-N TCGA-BC-A10U-T TCGA-BC-A10U-N
     52236.447      35884.749       6243.326      39861.797
TCGA-BC-A10W-T TCGA-BC-A10W-N TCGA-BC-A10X-T TCGA-BC-A10X-N
     36771.806       6421.304      26461.536      51103.550

counts <- counts(dds, normalized=TRUE)[5176,]
mean(counts)
[1] 27802.48
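
As a rough illustration of this check, the sketch below reuses the twelve normalized counts posted above and asks whether a candidate baseMean could plausibly be an average of values in that range. The helper `plausible()` and the variable names are hypothetical, and because only a subset of the samples is used, this is a heuristic rather than an exact test.

```r
# Normalized counts for ENSG00000121691.4, copied from the output above
# (a subset of the samples only)
norm_counts <- c(
  "TCGA-BC-A10Q-T" =  3584.647, "TCGA-BC-A10Q-N" = 50295.225,
  "TCGA-BC-A10R-T" = 11458.843, "TCGA-BC-A10R-N" = 53618.712,
  "TCGA-BC-A10T-T" = 52236.447, "TCGA-BC-A10T-N" = 35884.749,
  "TCGA-BC-A10U-T" =  6243.326, "TCGA-BC-A10U-N" = 39861.797,
  "TCGA-BC-A10W-T" = 36771.806, "TCGA-BC-A10W-N" =  6421.304,
  "TCGA-BC-A10X-T" = 26461.536, "TCGA-BC-A10X-N" = 51103.550
)

reported_in_r        <- 27802.48          # mean over all samples, as posted
shown_in_spreadsheet <- 278024793462181   # baseMean from the question, dots removed

# Hypothetical helper: a mean cannot lie outside the range of the values it averages
plausible <- function(x, values) x >= min(values) && x <= max(values)

print(plausible(reported_in_r, norm_counts))
print(plausible(shown_in_spreadsheet, norm_counts))
print(range(norm_counts))
```

The value R reports sits inside the range of the posted counts, while the value from the spreadsheet is many orders of magnitude above the largest one.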

I exported the results as .csv, so maybe the problem is that Excel is reading the numbers wrong, and not my counts!

res[5176,]
log2 fold change (MLE): condition Target vs Control
Wald test p-value: condition Target vs Control
DataFrame with 1 row and 6 columns
                   baseMean log2FoldChange     lfcSE      stat      pvalue        padj
                  <numeric>      <numeric> <numeric> <numeric>   <numeric>   <numeric>
ENSG00000121691.4   27802.5       -1.47193  0.170613   -8.6273 6.28152e-18 2.6215e-16

How about that?
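
The digits in the spreadsheet row from the question match R's values once the decimal point is removed, which fits the guess that the spreadsheet treats "." as a thousands separator. The sketch below shows this, then writes a small table with `write.csv2()`, which uses a comma as the decimal mark and a semicolon as the field separator. The full-precision strings are reconstructed from the digits in the question, and whether `write.csv2()` suits a given Excel installation depends on its locale, so both are assumptions.

```r
# Full-precision values for ENSG00000121691.4, reconstructed (assumption) by
# putting the decimal point back where R's printed output has it
r_values <- c(
  baseMean       = "27802.4793462181",
  log2FoldChange = "-1.47192720471788",
  stat           = "-8.62730379240558"
)

# Assumed import behaviour: "." is read as a thousands separator and dropped
digits_only <- gsub(".", "", r_values, fixed = TRUE)
print(digits_only)

# The same shift applied to a p-value with 15 significant digits moves the
# decimal point 14 places in the mantissa
print(6.28152e-18 * 1e14)

# One possible export for comma-decimal locales: semicolon-separated fields
# and "," as the decimal mark
res_df <- data.frame(
  gene           = "ENSG00000121691.4",
  log2FoldChange = -1.47193,
  pvalue         = 6.28152e-18
)
out_file <- tempfile(fileext = ".csv")
write.csv2(res_df, out_file, row.names = FALSE)
print(readLines(out_file))

# Reading the file back with the matching reader recovers the numbers
back <- read.csv2(out_file)
print(back)
```

With a real DESeq2 result, `as.data.frame(res)` could take the place of `res_df`. Another option is to keep the standard CSV and set the decimal and thousands separators in the spreadsheet's import dialog.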

OK, so everything is solved now?
