Question: Differential expression of RNA seq data at gene level (for Beginner)
21 months ago, islamshaikhul2014 wrote:


First, I should mention that I am a beginner in bioinformatics, so I apologize if my questions seem basic.

I am trying to run a differential gene expression analysis on my RNA-seq data. My experiment is a transcriptome analysis of the early phase of chlorosis in transgenic tobacco (N. tabacum) plants. We want to analyze the transcriptional changes shortly after Dex treatment (dexamethasone, used to activate an inducible promoter) in tobacco and compare them with other transcriptome analyses of virus-infected plants. I am planning to use DESeq2 for this.

DESeq2 requires count data in the form of a rectangular table of integer values. How can I prepare this table from my RNA-seq data? I have read your article "Differential expression of RNA-Seq data at the gene level (the DESeq package)". There you suggest using the summarizeOverlaps function from Bioconductor. I tried to read the summarizeOverlaps.pdf file, but could not understand it properly.

I would really appreciate any suggestions on how to start using DESeq2 to analyze my RNA-seq data.

Thank you.


Answer: Differential expression of RNA seq data at gene level (for Beginner)
21 months ago, Ram wrote:

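Once you have the mapped (BAM) files, you can compute the counts with either featureCounts or htseq-count. The htseq-count output files can be loaded into DESeq2 with DESeqDataSetFromHTSeqCount (a featureCounts count matrix would go through DESeqDataSetFromMatrix instead). You then label the count files by condition (treated and untreated):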


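library(DESeq2)
directory <- "/directory-path/"   # folder with one htseq-count output file per sample
# Assumes every file name contains either "treated" or "untreated"
sampleFiles <- grep("treated", list.files(directory), value=TRUE)
sampleCondition <- ifelse(grepl("untreated", sampleFiles), "untreated", "treated")
sampleTable <- data.frame(sampleName=sampleFiles, fileName=sampleFiles, condition=sampleCondition)
ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable=sampleTable, directory=directory, design= ~ condition)
# Make "untreated" the reference level so fold changes read as treated vs untreated
colData(ddsHTSeq)$condition <- factor(colData(ddsHTSeq)$condition, levels=c("untreated","treated"))
dds <- DESeq(ddsHTSeq)   # fit the model
res <- results(dds)      # per-gene log2 fold changes and p-values
rld <- rlog(dds)         # regularized-log values for PCA or clustering, not for testing
# Count genes passing both cutoffs (res$padj is usually preferred over res$pvalue)
sum(abs(res$log2FoldChange) >= 0.5 & res$pvalue < 0.05, na.rm=TRUE)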
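
To make the "rectangular table of integer values" from the question concrete, here is a minimal base R sketch that merges per-sample count tables into one genes-by-samples integer matrix. The sample names, gene IDs and counts are invented for illustration; with real htseq-count output you would read each file with something like read.delim(path, header=FALSE, col.names=c("gene","count")) and drop the summary rows whose IDs start with "__".

```r
# Toy stand-ins for per-sample htseq-count outputs (gene ID, count).
# Sample names, gene IDs and counts are hypothetical.
per_sample <- list(
  untreated_1 = data.frame(gene = c("geneA", "geneB", "geneC"), count = c(10L, 0L, 25L)),
  untreated_2 = data.frame(gene = c("geneA", "geneB", "geneC"), count = c(12L, 1L, 30L)),
  treated_1   = data.frame(gene = c("geneA", "geneB", "geneC"), count = c(40L, 3L, 22L)),
  treated_2   = data.frame(gene = c("geneA", "geneB", "geneC"), count = c(35L, 2L, 28L))
)

# Use the first sample's gene order as the reference row order.
genes <- per_sample[[1]]$gene

# One column per sample; match() aligns rows even if files are sorted differently.
counts <- sapply(per_sample, function(df) df$count[match(genes, df$gene)])
rownames(counts) <- genes
storage.mode(counts) <- "integer"

# Sample annotation: one row per column of `counts`, in the same order.
coldata <- data.frame(
  condition = factor(sub("_[0-9]+$", "", colnames(counts)),
                     levels = c("untreated", "treated")),
  row.names = colnames(counts)
)

print(counts)
```

A matrix and annotation table shaped like this can then be passed to DESeqDataSetFromMatrix(countData = counts, colData = coldata, design = ~ condition), which is the alternative to DESeqDataSetFromHTSeqCount when the counts are already in memory.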
Answer: Differential expression of RNA seq data at gene level (for Beginner)
21 months ago, arfranco (European Union) wrote:

You can take one of two similar approaches:

1. Map your RNA reads to a reference genome or transcriptome to get a BAM file. This can be done with TopHat, STAR, HISAT and the like, and it takes time and computing power. With this BAM file and a GTF or GFF annotation file, programs like htseq-count or featureCounts (available in R through the Rsubread package) will easily give you the tabular data you need.

2. Do the mapping against a reference CDS FASTA file using a pseudoaligner such as kallisto, Salmon and the like. This has some advantages: the mapping finishes in minutes on a regular computer without large resources. Each mapping run produces three files, which can be brought into the DESeq2 pipeline with the help of the tximport R package. The DESeq2 vignette and help files contain enough information to do this.

Some published papers find the second approach advantageous.
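
The second approach could look roughly like the sketch below. It assumes Salmon-style output (one quant.sf per sample folder), a hypothetical folder layout, and a tx2gene data frame (transcript ID in the first column, gene ID in the second) that you would have to build for your own tobacco reference. The helper names quant_files and import_to_deseq2 are mine, not part of any package. For kallisto, pass type = "kallisto" and point at abundance.h5 or abundance.tsv instead.

```r
# Sketch only: import_to_deseq2() needs the Bioconductor packages tximport
# and DESeq2, plus real quantification output on disk.

# Build a named vector of quantification file paths, one per sample.
# `quant_root` and `samples` are hypothetical; adjust to your folder layout.
quant_files <- function(quant_root, samples, file_name = "quant.sf") {
  files <- file.path(quant_root, samples, file_name)
  names(files) <- samples
  files
}

# Summarize transcript-level estimates to genes and wrap them for DESeq2.
# `coldata` needs a `condition` column, with rows in the same order as `files`.
import_to_deseq2 <- function(files, coldata, tx2gene, type = "salmon") {
  txi <- tximport::tximport(files, type = type, tx2gene = tx2gene)
  DESeq2::DESeqDataSetFromTximport(txi, colData = coldata, design = ~ condition)
}

# Example of the path helper alone (no packages required):
files <- quant_files("quants", c("untreated_1", "treated_1"))
print(files)
```

The object returned by import_to_deseq2() can be passed to DESeq() and results() in the same way as one built from htseq-count files.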
