Question: zero counts and deseq2
gravatar for scamiolo
13 months ago by
scamiolo10 wrote:

Dear Community,


I am writing to kindly ask support on the software deseq2. I am currently working on differential expression in transcripts of vitis vinifera. I aligned several RNAseq experiments on the transcripts sequences of the vitis reference genome using bwa and computed  fpkm and effective reads counts with the software express. I ran the deseq pipeline but there is some part that is not clear to me. I am working on two experiments with 2 replicates each. My coldata file (named coldata.txt) is the following:


condition type

CAN3FI2_exp.txt untreated paired-end

CAN3FI3_exp.txt untreated paired-end

CAN3NI2_exp.txt treated paired-end

CAN3NI3_exp.txt treated paired-end


The first rows of my expression (reads count) input table (named expressionTable.txt) are the following:


CAN3FI2_exp.txt CAN3FI3_exp.txt CAN3NI2_exp.txt CAN3NI3_exp.txt

678 1003 598 488

40 87 25 15

9 12 7 7

914 1028 1063 856

0 0 0 0

48 74 31 25

221 281 181 140

776 1225 664 539

0 0 0 0


finally the first rows of the rownames file (named rownames.txt) are the following:













I run the following pipeline to compute the genes that are differentially expressed:



expSet <- read.table("expressionTable.txt",header=T)

rownames <- unlist(read.table("rownames.txt"))

row.names(expSet) <- rownames

coldata <- read.table("coldata.txt",header=T)

cts <- as.matrix(expSet)

dds <- DESeqDataSetFromMatrix(countData=cts,colData=coldata, design=~ condition)

dds <- DESeq(dds)

res <- results(dds)


ntd <- normTransform(dds)



If I consider the transcript named chr16.gff3_MRNA_VIT_16s0100g00990.t01 and I look for it in the produced expDiff.txt file I find:


"chr16.gff3_MRNA_VIT_16s0100g00990.t01" 32.3186500941916 1.09546846712526 0.29492459931321 3.71440181550224 0.000203684932436758 0.00480523551483428


As you can see the log2Fold change value is 1.095; this means that the treated sample is more highly expressed than the untreated. If I look for that gene in the rowname.txt file in order to get the line number I obtain the following:


grep -nr "VIT_16s0100g00990" rownames.txt



Since the expressionTable.txt file has also the field names line, I used the following command to retrieve the reads counts from it corresponding to the transcript chr16.gff3_MRNA_VIT_16s0100g00990.t01


sed -n 28171p expressionTable.txt

19 162 0 0


My question is the following: if the treated sample have zero count in both the replicates, how can it be more highly expressed than the untreated as reported in the expDiff file? I thought it was a matter of normalization and for this reason I used the function normTransform to have the normalized data output in the file normalizedData.txt. However if I look for this transcript in this file I get the following:


grep "chr16.gff3_MRNA_VIT_16s0100g00990.t01" normalizedExpressions.txt

"chr16.gff3_MRNA_VIT_16s0100g00990.t01" 4.29263896969935 6.80318233956259 0 0


So even here the treated have zero count. Am I doing something wrong?

Sorry to bother you with a so long e-mail but I preferred to be as detailed as possible in order to give you the possibility to have a precise picture of what I am doing


Thanks a lot for your support

ADD COMMENTlink modified 13 months ago by Michael Love20k • written 13 months ago by scamiolo10
gravatar for Michael Love
13 months ago by
Michael Love20k
United States
Michael Love20k wrote:

Check the top of the results table when you print it to  console, it will tell you the order of the factor levels In the fold change. This can be controlled with input from the user of course, see the vignette, “Note on factor levels”.

ADD COMMENTlink written 13 months ago by Michael Love20k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 302 users visited in the last hour