Question

zero counts and deseq2

0

Entering edit mode

scamiolo ▴ 10

@scamiolo-14389

Last seen 6.4 years ago

Dear Community,

I am writing to kindly ask support on the software deseq2. I am currently working on differential expression in transcripts of vitis vinifera. I aligned several RNAseq experiments on the transcripts sequences of the vitis reference genome using bwa and computed fpkm and effective reads counts with the software express. I ran the deseq pipeline but there is some part that is not clear to me. I am working on two experiments with 2 replicates each. My coldata file (named coldata.txt) is the following:

condition type

CAN3FI2_exp.txt untreated paired-end

CAN3FI3_exp.txt untreated paired-end

CAN3NI2_exp.txt treated paired-end

CAN3NI3_exp.txt treated paired-end

The first rows of my expression (reads count) input table (named expressionTable.txt) are the following:

CAN3FI2_exp.txt CAN3FI3_exp.txt CAN3NI2_exp.txt CAN3NI3_exp.txt

678 1003 598 488

40 87 25 15

9 12 7 7

914 1028 1063 856

0 0 0 0

48 74 31 25

221 281 181 140

776 1225 664 539

0 0 0 0

finally the first rows of the rownames file (named rownames.txt) are the following:

chr5.gff3_MRNA_VIT_05s0020g02570.t01

chrUn.gff3_MRNA_VIT_00s0144g00290.t01

chr15.gff3_MRNA_VIT_15s0048g00090.t01

chr11.gff3_MRNA_VIT_11s0118g00830.t01

chr5.gff3_MRNA_VIT_05s0165g00290.t01

chr9.gff3_MRNA_VIT_09s0070g00570.t01

chr18.gff3_MRNA_VIT_18s0001g07490.t01

chr12.gff3_MRNA_VIT_12s0035g01940.t01

chr16.gff3_MRNA_VIT_16s0022g02170.t01

chr3.gff3_MRNA_VIT_03s0038g02130.t01

I run the following pipeline to compute the genes that are differentially expressed:

library("DESeq2")

expSet <- read.table("expressionTable.txt",header=T)

rownames <- unlist(read.table("rownames.txt"))

row.names(expSet) <- rownames

coldata <- read.table("coldata.txt",header=T)

cts <- as.matrix(expSet)

dds <- DESeqDataSetFromMatrix(countData=cts,colData=coldata, design=~ condition)

dds <- DESeq(dds)

res <- results(dds)

write.table(res,"expDiff.txt")

ntd <- normTransform(dds)

write.table(assay(ntd),"normalizedExpressions.txt")

If I consider the transcript named chr16.gff3_MRNA_VIT_16s0100g00990.t01 and I look for it in the produced expDiff.txt file I find:

"chr16.gff3_MRNA_VIT_16s0100g00990.t01" 32.3186500941916 1.09546846712526 0.29492459931321 3.71440181550224 0.000203684932436758 0.00480523551483428

As you can see the log2Fold change value is 1.095; this means that the treated sample is more highly expressed than the untreated. If I look for that gene in the rowname.txt file in order to get the line number I obtain the following:

grep -nr "VIT_16s0100g00990" rownames.txt

28170:chr16.gff3_MRNA_VIT_16s0100g00990.t01

Since the expressionTable.txt file has also the field names line, I used the following command to retrieve the reads counts from it corresponding to the transcript chr16.gff3_MRNA_VIT_16s0100g00990.t01

sed -n 28171p expressionTable.txt

19 162 0 0

My question is the following: if the treated sample have zero count in both the replicates, how can it be more highly expressed than the untreated as reported in the expDiff file? I thought it was a matter of normalization and for this reason I used the function normTransform to have the normalized data output in the file normalizedData.txt. However if I look for this transcript in this file I get the following:

grep "chr16.gff3_MRNA_VIT_16s0100g00990.t01" normalizedExpressions.txt

"chr16.gff3_MRNA_VIT_16s0100g00990.t01" 4.29263896969935 6.80318233956259 0 0

So even here the treated have zero count. Am I doing something wrong?

Sorry to bother you with a so long e-mail but I preferred to be as detailed as possible in order to give you the possibility to have a precise picture of what I am doing

Thanks a lot for your support

deseq2 • 1.1k views

ADD COMMENT • link updated 6.4 years ago by Michael Love 41k • written 6.4 years ago by scamiolo ▴ 10

score 0 · Answer 1 · 2017-11-14

0

Entering edit mode

Michael Love 41k

@mikelove

Last seen 2 hours ago

United States

Check the top of the results table when you print it to console, it will tell you the order of the factor levels In the fold change. This can be controlled with input from the user of course, see the vignette, “Note on factor levels”.

ADD COMMENT • link 6.4 years ago Michael Love 41k