Andres Eduardo Rodriguez Cubillos
Good night Dr. Simon Anders,
I'm a master's degree student at Universidad de los Andes (Bogotá,
Colombia) currently doing some research on differential gene
expression between three conditions (2 treated and 1 untreated). I'm
using DESeq in R to compare the FPKMs of the genes under these three
conditions with one replicate per condition.
The data come from single-end RNA-seq, and the reads were aligned to
a reference genome using Bowtie. Cufflinks and, subsequently,
CuffCompare were used to obtain the reads mapped in a given sample to
a given gene.
Overall, I organized my tables in much the same way as the
pasilla_gene_counts.tsv file you used in your guide "Analyzing RNA-seq
Data with the DESeq Package" from 2012. At first I had two tables: one
per replicate containing FPKMs for the three conditions analyzed. I
then merged the FPKMs of both files to obtain three tables; each one
now with the FPKMs for both replicates compared between two of the
three conditions (TreatedA vs Untreated; TreatedB vs Untreated;
TreatedA vs TreatedB). The final format for each of these tables was
as follows:
Gene    Treated1 (Replicate 1)  Treated2 (Replicate 2)  Untreated1 (Replicate 1)  Untreated2 (Replicate 2)
tag_id  FPKM                    FPKM                    FPKM                      FPKM
tag_id  FPKM                    FPKM                    FPKM                      FPKM
...
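To make the merging step concrete, here is a minimal sketch of how two per-replicate tables could be combined into the four-column layout above. It is written in Python rather than R purely for illustration. The function name `merge_replicates`, the gene ids, and all numeric values are hypothetical placeholders, not my real data. The sketch also reports genes that appear in only one of the two tables, since such genes would otherwise end up with missing values in the merged table.

```python
def merge_replicates(rep1, rep2, conditions):
    """Merge two per-replicate tables into one table with replicate columns.

    rep1, rep2: dict mapping tag_id -> dict mapping condition -> value.
    conditions: the two conditions to compare, in column order.
    Returns (rows, unmatched), where rows maps tag_id to
    [cond1_rep1, cond1_rep2, cond2_rep1, cond2_rep2] and unmatched lists
    the tag_ids found in only one of the two tables.
    """
    shared = sorted(set(rep1) & set(rep2))
    unmatched = sorted(set(rep1) ^ set(rep2))
    rows = {}
    for tag in shared:
        row = []
        for cond in conditions:
            row.append(rep1[tag][cond])  # replicate 1 column
            row.append(rep2[tag][cond])  # replicate 2 column
        rows[tag] = row
    return rows, unmatched


# Hypothetical placeholder values, for illustration only.
replicate1 = {
    "gene_a": {"Treated": 12.0, "Untreated": 3.0},
    "gene_b": {"Treated": 0.0, "Untreated": 7.5},
}
replicate2 = {
    "gene_a": {"Treated": 10.0, "Untreated": 4.0},
    "gene_c": {"Treated": 1.0, "Untreated": 1.0},
}

merged, unmatched = merge_replicates(replicate1, replicate2, ["Treated", "Untreated"])
print(merged)
print(unmatched)
```

Only genes present in both replicate tables are kept in the merged rows; the rest are listed separately so they can be checked by hand.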
I understand that DESeq must first normalize the expression values of
each treatment by dividing each column by its own size factor.
However, when I try to estimate the size factors for any of my three
final tables, I get an "NA" ("Not Available") value for each
treatment. It only happens with the tables that include both
replicates, not with the two previous tables that only contain
information for one replicate (results are attached).
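For reference, this is my understanding of the size factor calculation, sketched in Python rather than R. It assumes the published median-of-ratios method: for each gene, take the geometric mean across samples, divide each sample's value by it, and use the per-sample median of those ratios as the size factor. The function name `size_factors` and the example numbers are hypothetical, and this is not the package's actual code. The second example shows one situation in which such a calculation has nothing to work with, namely when no gene has a positive value in every column. I do not know whether that is what happens with my tables.

```python
import math
from statistics import median


def size_factors(table):
    """Median-of-ratios size factors for a table of non-negative values.

    table: list of rows (one per gene), each a list with one value per sample.
    Returns one size factor per sample, or NaN for every sample when no gene
    is positive in all samples.
    """
    n_samples = len(table[0])
    # The log geometric mean is only defined for genes that are positive
    # in every sample, so all other genes are skipped.
    usable = [row for row in table if all(v > 0 for v in row)]
    if not usable:
        return [math.nan] * n_samples
    log_geo_means = [sum(math.log(v) for v in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Hypothetical values: columns 2 and 4 are sequenced twice as deeply.
dense = [[10, 20, 10, 20], [100, 200, 100, 200], [5, 10, 5, 10]]
print(size_factors(dense))

# Hypothetical values: every gene has a zero in at least one column.
sparse = [[0, 20, 10, 20], [100, 0, 100, 200], [5, 10, 0, 10]]
print(size_factors(sparse))
```

In the first table the deeper columns get a size factor twice as large as the others. In the second table every gene is skipped, so no size factor can be computed for any column.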
We don't know what might be causing this problem, because the tables
that contain information for one replicate have the same format as
the tables that have both replicates (and in the example you use a
table that contains replicates).
Because of this problem, I was planning to run two separate analyses
with DESeq without any replicates (Section 3.3 of your guide), but I
read that one must then assume gene expression levels are quite
similar between treatments (this is not our case). The idea I had in
mind was to perform one analysis for each replicate, separately, and
then compare the results to pick only genes that show differential
expression in both analyses. Is this right, even though we know the
expression levels vary between conditions?
What could be the cause of the "NA" output when trying to estimate the
size factors for the tables that contain both replicates?
Can I analyze three treatments at once (all in one table) using DESeq,
or only two conditions per table (analysis)?
We would appreciate your help on this matter a lot because we do want
to continue using DESeq for our differential gene expression analysis.
Sincerely,
Andrés E. Rodríguez C.
Graduate Assistant
LAMFU - Universidad de los Andes
(Bogotá D.C., Colombia)