Question: How to find Top 10 expressed genes in DESeq2
0
gravatar for melkile26
11 weeks ago by
melkile260
melkile260 wrote:

I am a newbie to RNAseq analysis. Briefly, I mapped the RNA-seq reads to the reference genome using STAR and I used featureCounts to quantify the reads mapped to the reference genes and found the gene counts. I removed rows that have no counts, or only a single count across all samples using the code below as it is mentioned in RNA-seq workflow

dds <- dds[ rowSums(counts(dds)) > 1, ]

Then I run the function DESeq on the raw counts.

dds <- DESeq(dds)

I would like to extract top 10 expressed genes and then I sorted the result based on baseMean and found the following genes. Is it the right approach to extrct the top expressed genes?

baseMean    log2FoldChange  lfcSE   stat    pvalue  padj LUKE00000033088    24608.95    2.492   0.481   5.180   2.22038382101822E-07    5.88213544449403E-05 LUKE00000005348    23004.02    -1.559  0.366   -4.265  1.9950729451741E-05 0.001090627809135 LUKE00000038686   17687.10    -2.207  0.368   -5.993  2.06555823272055E-09    2.01779219858888E-06 LUKE010978756  9213.70 -2.064  0.354   -5.826  5.66036709013022E-09    4.21293036279692E-06 LUKE00000004836    8685.09 -1.576  0.378   -4.171  3.03873604272807E-05    0.001294153796944 LUKE00000002600   8268.26 -1.522  0.427   -3.565  0.000363547763205   0.005474230769653 LUKE010993945 7758.28 -1.873  0.495   -3.786  0.000153046441589   0.003240726698392 LUKE00000032342   7452.12 1.966   0.480   4.098   4.16481516975401E-05    0.001557322035963 LUKE00000003460   6717.35 -1.686  0.392   -4.304  1.68102625403878E-05    0.000991488315118 LUKE00000018131   6508.93 -1.722  0.402   -4.280  1.86954821598158E-05    0.00106646126335

Is it the right approach to extract the top expressed genes?

ADD COMMENTlink modified 11 weeks ago by swbarnes2150 • written 11 weeks ago by melkile260
Answer: How to find Top 10 expressed genes in DESeq2
2
gravatar for Michael Love
11 weeks ago by
Michael Love23k
United States
Michael Love23k wrote:

We have some code in the workflow:

https://bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html

Basically:

head(res[order(res$padj),], 10)
ADD COMMENTlink written 11 weeks ago by Michael Love23k

Dear Michael,

Thanks for your replay,

Are you recommending to extract the top expressed genes based on adjusted p-value rather than sorting baseMean?

Best regards,

Mel

ADD REPLYlink written 11 weeks ago by melkile260

Oh, I missed that you want to rank by expression. You should really use TPM for this, rather than counts. Counts are not proportional to expression across genes.

ADD REPLYlink written 11 weeks ago by Michael Love23k

Is this because baseMean doesn't take into account the differences in library sizes between samples

ADD REPLYlink written 10 weeks ago by melkile260
Answer: How to find Top 10 expressed genes in DESeq2
1
gravatar for swbarnes2
11 weeks ago by
swbarnes2150
swbarnes2150 wrote:

This will give you the genes with the highest number of counts, but for some library preps, count numbers for a gene can be related to the length of the transcript, so you would want to correct for that if it was a factor with your library prep.

ADD COMMENTlink written 11 weeks ago by swbarnes2150
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 269 users visited in the last hour