Some discrepancy between DE genes in cBioPortal vs DESeq2
Alexandre • 0
@04bd68e3
Last seen 17 months ago
Brazil

Hi,

I ran one analysis through the cBioPortal website and another locally, using TCGAbiolinks to download the TCGA files and DESeq2 for the testing. Although there is a reasonable overlap (58%) of differentially expressed (DE) genes, I'd like to understand why the two pipelines don't agree more closely.
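A single overlap percentage depends on which list is used as the denominator, so it can help to report all variants side by side. The sketch below uses base R only; the gene sets are hypothetical placeholders, not results from either pipeline.

```r
# Hypothetical DE gene lists for illustration only.
de_deseq2 <- c("GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E", "GENE_F", "GENE_G")
de_portal <- c("GENE_A", "GENE_B", "GENE_C", "GENE_H")

shared <- intersect(de_deseq2, de_portal)

# The same shared set gives three different "overlap" values.
overlap_summary <- c(
  frac_of_deseq2 = length(shared) / length(de_deseq2),
  frac_of_portal = length(shared) / length(de_portal),
  jaccard        = length(shared) / length(union(de_deseq2, de_portal))
)
print(round(overlap_summary, 3))

# Genes called DE by only one of the two pipelines.
only_deseq2 <- setdiff(de_deseq2, de_portal)
only_portal <- setdiff(de_portal, de_deseq2)
```

Listing `only_deseq2` and `only_portal` separately makes it easier to see whether the disagreement is concentrated in one kind of gene, such as low-count genes.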

This is the analysis using cBioPortal: https://www.cbioportal.org/comparison/mrna?comparisonId=63b2d2551cec6922c422d9a2

I noticed that the genes that are DE in my analysis but not statistically significant in cBioPortal are mostly genes with low counts. I have already tried to filter these genes using:

keep <- rowSums( counts(dds) >= 5 ) >= 50 #since I am working with >950 samples
dds <- dds[keep,]

But I still see very lowly expressed genes among the top DE genes (lowest padj) in my analysis.
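A toy example may show why the filter above still lets such genes through: a gene needs counts of 5 or more in only 50 of roughly 950 samples, so a gene that is zero in most samples can pass. The matrix and the stricter threshold below are hypothetical and use base R only, not the actual `dds` object.

```r
n_samples <- 950

# Toy gene with counts in only 60 samples and zeros elsewhere.
sparse_gene <- c(rep(20L, 60), rep(0L, n_samples - 60))
# Toy gene expressed consistently in every sample.
broad_gene <- rep(200L, n_samples)

cts <- rbind(sparse_gene = sparse_gene, broad_gene = broad_gene)

# Same rule as in the post: at least 5 counts in at least 50 samples.
keep_original <- rowSums(cts >= 5) >= 50

# Hypothetical stricter rule: at least 10 counts in a quarter of the samples.
keep_stricter <- rowSums(cts >= 10) >= n_samples / 4

print(rbind(keep_original, keep_stricter))
print(rowMeans(cts))  # the sparse gene has a mean count of about 1.3
```

Here the sparse gene passes the original rule but not the stricter one. What threshold is appropriate depends on the group sizes in the actual design, which are not given in the post.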

Some examples that could be checked in the link above; these genes are DE in my analysis, but not in cBioPortal: "ALDH3A1" "STEAP1B" "GHSR" "KRT12"

One gene, "OR3A3", for example, is up-regulated in the “High” group in cBioPortal, but downregulated in the “High” group in my analysis.
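One possible cause of a flipped sign, offered as an assumption because the DESeq2 call is not shown: R orders factor levels alphabetically, so with groups named "High" and "Low" the reference level is "High" and a default comparison reports Low relative to High. The base-R sketch below uses made-up log2 expression values and `lm()` as a stand-in for the model fit.

```r
# Made-up log2 expression values for one gene in four samples.
expr  <- c(8, 5, 9, 6)
group <- factor(c("High", "Low", "High", "Low"))

levels(group)  # "High" "Low", so "High" is the reference level

# Default fit: the coefficient is Low minus High.
effect_default <- coef(lm(expr ~ group))[["groupLow"]]

# After releveling, the coefficient is High minus Low.
group_low_ref <- relevel(group, ref = "Low")
effect_releveled <- coef(lm(expr ~ group_low_ref))[["group_low_refHigh"]]

print(c(low_vs_high = effect_default, high_vs_low = effect_releveled))
```

In DESeq2 the direction can be fixed explicitly with the `contrast` argument of `results()`, for example `contrast = c("group", "High", "Low")`, where `group` is a hypothetical name for the design column.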

Is there a way to get a better overlap, or is that just how it is?

I will be glad to provide more details.

Thank you,

Alex

DESeq2 TCGAbiolinks • 1.2k views

I didn't see an indication of using the cBioPortalData R package. I've added the TCGAbiolinks package tag instead.

FWIW, you should be able to get the same data from cBioPortalData with studyId = "brca_tcga_pan_can_atlas_2018" .

Best,

Marcel


Hi Marcel,

Thank you for your response.

Your suggestion works for me; I can use the data obtained from cBioPortalData. What do you recommend for performing a differential expression analysis on the “data_mrna_seq_v2_rsem” data? I would prefer to use DESeq2 if possible.
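Since DESeq2 is designed for un-normalized integer counts, a first step could be to check whether the downloaded matrix looks like raw counts at all. The helper `looks_like_raw_counts` and both toy matrices below are hypothetical; the real values in “data_mrna_seq_v2_rsem” would need to be inspected directly.

```r
# Hypothetical helper: TRUE if all non-missing values are non-negative integers.
looks_like_raw_counts <- function(mat) {
  vals <- mat[!is.na(mat)]
  all(vals >= 0) && all(abs(vals - round(vals)) < 1e-8)
}

# Toy matrix with fractional values, as normalized RSEM output might have.
rsem_like <- matrix(
  c(10.52, 0, 233.1, 4.75), nrow = 2,
  dimnames = list(c("GENE_A", "GENE_B"), c("S1", "S2"))
)

# Toy matrix of whole-number counts.
count_like <- matrix(
  c(10, 0, 233, 5), nrow = 2,
  dimnames = list(c("GENE_A", "GENE_B"), c("S1", "S2"))
)

print(looks_like_raw_counts(rsem_like))
print(looks_like_raw_counts(count_like))
```

If the check fails on the real matrix, rounding the values just to satisfy DESeq2 would not turn normalized data back into counts, so a method suited to the actual data type may be the safer choice.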

Thank you for your time.

Alex

@mikelove
Last seen 3 hours ago
United States

Without knowing what code is used in the other pipeline, I can’t offer much guidance.

Note that Bioconductor RNA-seq packages often have a lot of agreement on their DE gene sets.
