Performing differential gene analysis on RT PCR experimental data
Nithisha • 10
@nithisha-14272
Last seen 3.0 years ago

Hi everyone, 

I have a question about a GEO dataset: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE92776. The entry says that the data are expression profiling by RT-PCR. If I download the raw non-normalized data at the bottom of the page, are the values gene counts, Ct values, or dCt values? Could anyone advise on how to convert the numbers into a format that limma or DESeq can use, or how else I could perform differential expression analysis on them to get log2FoldChange values?

Thanks!


The values are reported by the submitter to be "batch-corrected deltaCT" values.


Thank you Davis.


Hello Nithisha!

We believe that this post does not fit the main topic of this site.

This question doesn't have anything to do with any Bioconductor package, and is instead a general question about how to analyze data. You might try asking on stackoverflow.com or biostars.com.

For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.

If you disagree, please tell us why in a reply below; we'll be happy to talk about it.

Cheers!

@gordon-smyth
Last seen 7 hours ago
WEHI, Melbourne, Australia

RT-PCR data are almost always delta CT values, and the GEO entry tells you that this is so. You can analyse delta CT values using the limma package.

Just read the values into a data frame, with one row per assay and one column per sample:

> x <- read.delim("GSE92776_Non-normalized_data.txt",row.names="ID_REF")
> dim(x)
[1] 347 337

CT values are inversely related to expression (a higher value means less transcript), so you need to reverse them to get values on a log2-expression scale:

> y <- max(x) - x
> y[1:4,1:3]
                    X100.B0100V1.BASELINE X101.B0101V1.BASELINE X102.B0102V1.BASELINE
ABCA1-Hs01059118_m1              18.23124              17.36424              16.88424
ACP1-Hs00962877_m1               16.58378              17.52378              18.24978
ADAR-Hs00241666_m1               19.92366              19.17866              19.89866
ADM-Hs00181605_m1                16.34361              15.83161              15.63761
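
As a small illustration of that reversal, here is a sketch on a toy matrix. The gene names, sample names and numbers are made up and are not taken from GSE92776. It also assumes the table might contain missing wells, in which case na.rm = TRUE is needed so that the maximum is not NA:

```r
# Toy delta CT values (hypothetical, not from GSE92776).
# Columns are samples, rows are assays; one well is missing.
dct <- matrix(c(5.2, 7.9, NA, 3.1, 6.4, 8.8),
              nrow = 3,
              dimnames = list(c("geneA", "geneB", "geneC"),
                              c("s1", "s2")))

# A higher delta CT means lower expression, so subtract from the maximum.
# na.rm = TRUE stops a single missing value from turning the maximum into NA.
y_toy <- max(dct, na.rm = TRUE) - dct
y_toy
```

The assay with the largest delta CT ends up at 0 and everything else is positive, so larger values now mean higher expression. The choice of max(x) as the offset is arbitrary and does not affect fold changes, because a constant cancels out in any difference between samples.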

Then you can enter the expression matrix into a limma analysis. You will need to create a design matrix, which will require quite a bit of care as this is a complex dataset.
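
To give a rough idea of what building a design matrix involves, here is a minimal sketch for a paired comparison. It assumes that the column names encode a subject and a visit, as the headers above seem to suggest. Only "BASELINE" appears in the output above; the "FOLLOWUP" label and the V2 sample codes are invented for illustration, and the real study design should be taken from the GEO sample annotation:

```r
# Hypothetical sample names following the pattern of the column headers.
# "FOLLOWUP" and the V2 codes are invented; check the GEO sample table
# for the real factors before building a design for GSE92776.
samples <- c("X100.B0100V1.BASELINE", "X100.B0100V2.FOLLOWUP",
             "X101.B0101V1.BASELINE", "X101.B0101V2.FOLLOWUP",
             "X102.B0102V1.BASELINE", "X102.B0102V2.FOLLOWUP")

# Split each name on "." and pick out the assumed subject and visit fields
parts   <- strsplit(samples, ".", fixed = TRUE)
subject <- factor(sapply(parts, `[`, 1))
visit   <- factor(sapply(parts, `[`, 3), levels = c("BASELINE", "FOLLOWUP"))

# Paired design: subject effects plus one coefficient for the visit effect
design <- model.matrix(~ subject + visit)
rownames(design) <- samples
colnames(design)
```

With this layout the "visitFOLLOWUP" coefficient is the within-subject change from baseline. The actual dataset is likely to need more factors than this.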

I would suggest that you also filter out genes with very low expression. For example, you might do this:

> keep <- rowMeans(y) > 7
> y <- y[keep,]

Then you do a standard limma analysis, starting with:

> fit <- lmFit(y, design)

and so on.
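
For readers unsure what "and so on" covers, here is a sketch of the usual remaining steps (lmFit, eBayes, topTable) run on simulated data. The matrix, the two-group layout and the names y_sim, group and design_sim are all made up for illustration and stand in for the real expression matrix and design:

```r
library(limma)

set.seed(1)
# Simulated log2-expression-like values: 50 genes x 6 samples (toy data only)
y_sim <- matrix(rnorm(300, mean = 10), nrow = 50,
                dimnames = list(paste0("gene", 1:50), paste0("s", 1:6)))

# Hypothetical two-group layout, three samples per group
group <- factor(rep(c("control", "treated"), each = 3))

# Spike a shift into the first five genes of the treated samples
y_sim[1:5, group == "treated"] <- y_sim[1:5, group == "treated"] + 3

design_sim <- model.matrix(~ group)

fit <- lmFit(y_sim, design_sim)   # gene-wise linear models
fit <- eBayes(fit)                # moderated t-statistics
# number = Inf returns every gene rather than only the top ten
res <- topTable(fit, coef = "grouptreated", number = Inf)
head(res)
```

The logFC column of the topTable output is the log2 fold change asked about in the question. Because the values were reversed before fitting, a positive logFC means higher expression.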

PS. PCR never produces "gene counts", so I don't know where you might have got that idea from. You certainly can't analyse these data using an RNA-seq package like DESeq or edgeR.


Hi Gordon,

Sorry for the late reply. This was very informative, thank you so much.

 
