Question

Performing differential gene analysis on RT PCR experimental data

Nithisha ▴ 10

@nithisha-14272

Last seen 7.7 years ago

Hi everyone,

I have a question about a GEO dataset: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE92776. The page says that the data are expression profiling by RT-PCR. If I download the raw non-normalized data at the bottom of the page, will the values be gene counts, Ct values or dCt values? Could anyone advise on how to convert the numbers into a format that limma or DESeq can use, or how else I can perform differential expression analysis on them to get log2FoldChange values?

Thanks!

differential gene expression • 3.1k views

updated 7.7 years ago by Gordon Smyth 53k • written 7.7 years ago by Nithisha ▴ 10

The values are reported by the submitter to be "batch-corrected deltaCT" values.

7.7 years ago • Sean Davis 21k

Thank you Davis.

7.7 years ago • Nithisha ▴ 10

Hello Nithisha!

We believe that this post does not fit the main topic of this site.

This question doesn't have anything to do with any Bioconductor package, and is instead a general question about how to analyze data. You might try asking on stackoverflow.com or biostars.com.

For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.

If you disagree, please tell us why in a reply below; we'll be happy to talk about it.

Cheers!

7.7 years ago • James W. MacDonald 68k

score 4 · Accepted Answer · 2018-02-08

RT-PCR data are almost always delta CT values, and the GEO entry tells you that this is so. You can analyse delta CT values using the limma package.

Just read the values into a matrix:

> x <- read.delim("GSE92776_Non-normalized_data.txt",row.names="ID_REF")
> dim(x)
[1] 347 337

CT values are inversely related to expression (a higher CT means lower expression), so you need to reverse them to represent log2-expression values:

> y <- max(x) - x
> y[1:4,1:3]
                    X100.B0100V1.BASELINE X101.B0101V1.BASELINE X102.B0102V1.BASELINE
ABCA1-Hs01059118_m1              18.23124              17.36424              16.88424
ACP1-Hs00962877_m1               16.58378              17.52378              18.24978
ADAR-Hs00241666_m1               19.92366              19.17866              19.89866
ADM-Hs00181605_m1                16.34361              15.83161              15.63761
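To see why subtracting from the maximum works, here is a minimal sketch on a made-up delta CT matrix (the gene names, sample names and numbers are all hypothetical, not taken from GSE92776). The constant that is subtracted from does not affect differences between samples; it only flips their sign, so that larger values mean higher expression:

```r
# Toy delta CT values (made up): rows are genes, columns are samples.
dct <- matrix(c( 5, 7,
                10, 9),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("geneA", "geneB"), c("s1", "s2")))

# Same transformation as above: reverse the scale.
y_toy <- max(dct) - dct

# For geneA, s2 has a delta CT that is 2 higher than s1 ...
diff_dct <- dct["geneA", "s2"] - dct["geneA", "s1"]
# ... so on the reversed scale s2 is 2 log2 units lower than s1.
diff_y <- y_toy["geneA", "s2"] - y_toy["geneA", "s1"]

print(c(delta_CT_difference = diff_dct, log2_expression_difference = diff_y))
```

Any other constant would give the same between-sample differences; using the maximum simply keeps all the values non-negative.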

Then you can enter the expression matrix into a limma analysis. You will need to create a design matrix, which will require quite a bit of care as this is a complex dataset.
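As a rough illustration of what building a design matrix involves, here is a sketch for a hypothetical repeated-measures layout. The `targets` table, the subject IDs and the visit labels are assumptions made up for this example; the real sample annotation has to come from the GEO record, and the real design may need more terms than this:

```r
# Hypothetical sample annotation: 3 subjects, each measured at 2 visits.
targets <- data.frame(
  subject = factor(rep(c("S1", "S2", "S3"), times = 2)),
  visit   = factor(rep(c("BASELINE", "FOLLOWUP"), each = 3),
                   levels = c("BASELINE", "FOLLOWUP"))
)

# Block on subject so that the visit effect is estimated within subjects.
design <- model.matrix(~ subject + visit, data = targets)
print(design)
```

With this coding, the `visitFOLLOWUP` column is the within-subject change from baseline on the log2 scale. The rows of `design` must be in the same order as the columns of the expression matrix.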

I would suggest that you also filter out genes with very low expression. For example, you might do this:

> keep <- rowMeans(y) > 7
> y <- y[keep,]

Then you do a standard limma analysis, starting with:

> library(limma)
> fit <- lmFit(y, design)

and so on.
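For reference, the remaining steps of a standard limma analysis might look like the sketch below. It uses simulated data and a simple two-group design, both of which are assumptions for illustration only (the object names `y_sim`, `group` and `tab` are made up, and the real dataset will need a more careful design). The `logFC` column of the `topTable` output is the log2 fold change:

```r
library(limma)

set.seed(1)
# Simulated log2-expression matrix: 50 genes x 6 samples (made-up numbers).
y_sim <- matrix(rnorm(50 * 6, mean = 10, sd = 0.5), nrow = 50,
                dimnames = list(paste0("gene", 1:50), paste0("s", 1:6)))
group <- factor(rep(c("control", "treated"), each = 3))

# Make gene1 higher in the treated group by 3 log2 units.
y_sim["gene1", group == "treated"] <- y_sim["gene1", group == "treated"] + 3

# Two-group design: the intercept is the control mean,
# "grouptreated" is treated minus control.
design_sim <- model.matrix(~ group)

fit_sim <- lmFit(y_sim, design_sim)   # gene-wise linear models
fit_sim <- eBayes(fit_sim)            # moderated t-statistics

# Full results table for the treated-vs-control coefficient.
tab <- topTable(fit_sim, coef = "grouptreated", number = Inf)
print(head(tab))
```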

PS. PCR never produces "gene counts", so I don't know where you might have got that idea from. You certainly can't analyse these data using an RNA-seq package like DESeq or edgeR.