Question: How to find Top 10 expressed genes in DESeq2
0
8 months ago by
melkile260 wrote:

I am a newbie to RNAseq analysis. Briefly, I mapped the RNA-seq reads to the reference genome using STAR and I used featureCounts to quantify the reads mapped to the reference genes and found the gene counts. I removed rows that have no counts, or only a single count across all samples using the code below as it is mentioned in RNA-seq workflow

dds <- dds[ rowSums(counts(dds)) > 1, ]


Then I run the function DESeq on the raw counts.

dds <- DESeq(dds)


I would like to extract top 10 expressed genes and then I sorted the result based on baseMean and found the following genes. Is it the right approach to extrct the top expressed genes?

baseMean    log2FoldChange  lfcSE   stat    pvalue  padj LUKE00000033088    24608.95    2.492   0.481   5.180   2.22038382101822E-07    5.88213544449403E-05 LUKE00000005348    23004.02    -1.559  0.366   -4.265  1.9950729451741E-05 0.001090627809135 LUKE00000038686   17687.10    -2.207  0.368   -5.993  2.06555823272055E-09    2.01779219858888E-06 LUKE010978756  9213.70 -2.064  0.354   -5.826  5.66036709013022E-09    4.21293036279692E-06 LUKE00000004836    8685.09 -1.576  0.378   -4.171  3.03873604272807E-05    0.001294153796944 LUKE00000002600   8268.26 -1.522  0.427   -3.565  0.000363547763205   0.005474230769653 LUKE010993945 7758.28 -1.873  0.495   -3.786  0.000153046441589   0.003240726698392 LUKE00000032342   7452.12 1.966   0.480   4.098   4.16481516975401E-05    0.001557322035963 LUKE00000003460   6717.35 -1.686  0.392   -4.304  1.68102625403878E-05    0.000991488315118 LUKE00000018131   6508.93 -1.722  0.402   -4.280  1.86954821598158E-05    0.00106646126335


Is it the right approach to extract the top expressed genes?

modified 8 months ago by swbarnes2330 • written 8 months ago by melkile260
Answer: How to find Top 10 expressed genes in DESeq2
2
8 months ago by
Michael Love25k
United States
Michael Love25k wrote:

We have some code in the workflow:

https://bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html

Basically:

head(res[order(res\$padj),], 10)


Dear Michael,

Are you recommending to extract the top expressed genes based on adjusted p-value rather than sorting baseMean?

Best regards,

Mel

Oh, I missed that you want to rank by expression. You should really use TPM for this, rather than counts. Counts are not proportional to expression across genes.

Is this because baseMean doesn't take into account the differences in library sizes between samples

Answer: How to find Top 10 expressed genes in DESeq2
1
8 months ago by
swbarnes2330
swbarnes2330 wrote:

This will give you the genes with the highest number of counts, but for some library preps, count numbers for a gene can be related to the length of the transcript, so you would want to correct for that if it was a factor with your library prep.