DESeq2 and continuous data
@carandres74-9625
Last seen 6.6 years ago

Hello

My name is Carlos. I am a PhD student working on proteomic analysis of CSF. I am currently using the DESeq2 package in R to analyze the data, and wanted to ask whether it is OK to use the negative binomial distribution on continuous data.
Is the correction factor in the DESeq2 package based on a ratio or on a multiplicative factor?
Kind regards,
CA

proteomics deseq2 • 1.4k views
@mikelove
Last seen 1 day ago
United States

It's really based on the idea that the input is on the count scale. Can you explain exactly what kind of data you have?

@carandres74-9625
Last seen 6.6 years ago

Hello Michael

I have SWATH proteomic data from cerebrospinal fluid. This is continuous data.

Best,

CA

0
Entering edit mode

SWATH is MS1 quantitation in data independent (DIA) mode, i.e. not the kind of data one would typically view as count data.

1
Entering edit mode

If it's positive, but not counts or estimated counts of observations, I don't think count-based models are the right approach.

You might try limma-voom for finding differences (so scaling columns, log transforming, and estimating variance mean dependence from the data to inform the linear model)? Maybe Laurent can weigh in on whether that sounds reasonable.
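
The preprocessing steps listed in parentheses above can be sketched in base R. This is only an illustration on simulated intensities: the object names (`intens`, `scaled`, `logint`) and the simulated values are assumptions, and the lowess fit merely stands in for the kind of mean-variance trend that would be estimated from the data. It is not limma's or voom's implementation.

```r
set.seed(1)
# Hypothetical data: 200 proteins x 6 samples of positive, continuous intensities
intens <- matrix(rlnorm(200 * 6, meanlog = 10, sdlog = 1), nrow = 200,
                 dimnames = list(paste0("prot", 1:200), paste0("s", 1:6)))

# 1. Scale columns so that every sample ends up with the same median intensity
col_med <- apply(intens, 2, median)
scaled  <- sweep(intens, 2, col_med / mean(col_med), "/")

# 2. Log transform
logint <- log2(scaled)

# 3. Estimate the mean-variance dependence from the data:
#    smooth sqrt(sd) of each protein against its mean log2 intensity
row_mean <- rowMeans(logint)
row_sd   <- apply(logint, 1, sd)
trend    <- lowess(row_mean, sqrt(row_sd))
```

To inspect the trend, `plot(row_mean, sqrt(row_sd)); lines(trend, col = "red")` draws the per-protein values with the smoothed curve on top. A linear model fitted per protein to `logint` could then take this dependence into account.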

0
Entering edit mode

limma is definitely a great option. I am not sure about the voom transformation.

@carandres74-9625
Last seen 6.6 years ago

Hello

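This is what I was trying to do in order to normalize the data:

# Assumption: knnImputation() comes from DMwR, estimateSizeFactorsForMatrix() from DESeq2
library(DMwR)
library(DESeq2)

df = as.matrix(x)
df[df == 0] = NA   # treat zeros as missing values

kdf = knnImputation(df)   # fill the NAs by k-nearest-neighbour imputation

cds = estimateSizeFactorsForMatrix(kdf)   # one size factor per column (sample)
ndf = sweep(kdf, 2, cds, "/")             # divide each column by its size factor

boxplot(log(ndf))

Which actually gives a pretty well normalized box plot, but when trying to plot the SVD using

s = svd(ndf)   # stored as s so that the svd() function is not masked

plot(s$v[, 1], s$v[, 2], col = as.numeric(pdata$Group))

I get a strange PCA plot with no separation of the data!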

Best,

CA
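
The two steps attempted above can be sketched in base R as follows. The block assumes proteins in rows and samples in columns, runs on simulated data, and re-implements the median-of-ratios idea behind size factors by hand: `size_factors()` is a hypothetical helper, not a DESeq2 function. Each factor is the median ratio of a sample to the row-wise geometric mean and is then used as a divisor. The sketch also runs the PCA on log-transformed, centred data, because an SVD of a raw, uncentred intensity matrix tends to be dominated by the overall intensity level; whether that explains the plot for the real data is an assumption.

```r
set.seed(42)
n_prot <- 300
group  <- factor(rep(c("A", "B"), each = 4))   # hypothetical grouping

# Simulated positive intensities; the first 30 proteins are 4x higher in group B
base_level <- rlnorm(n_prot, meanlog = 10, sdlog = 1.5)
noise      <- matrix(rlnorm(n_prot * 8, sdlog = 0.1), n_prot, 8)
intens     <- matrix(base_level, n_prot, 8) * noise
intens[1:30, group == "B"] <- intens[1:30, group == "B"] * 4

# Per-sample loading differences that normalization should remove
loading <- c(0.5, 1, 2, 1, 0.8, 1.5, 1, 1.2)
intens  <- sweep(intens, 2, loading, "*")

# Median-of-ratios size factors: for each sample, the median ratio
# of its values to the row-wise geometric mean across samples
size_factors <- function(m) {
  log_geo_mean <- rowMeans(log(m))
  apply(m, 2, function(col) exp(median(log(col) - log_geo_mean)))
}
sf  <- size_factors(intens)
ndf <- sweep(intens, 2, sf, "/")   # divide every column by its size factor

# PCA on the log scale with samples as rows; prcomp() centres each protein
pca <- prcomp(t(log2(ndf)), center = TRUE, scale. = FALSE)
```

`plot(pca$x[, 1], pca$x[, 2], col = as.numeric(group), pch = 16)` then draws the samples on the first two principal components.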

0
Entering edit mode

I don't have any comment on normalizing this data, as I have no idea about the distribution, what is measured, etc.

You might consider making a new post, with a more descriptive title, e.g. "how to normalize SWATH proteomic data", as we've determined that it doesn't make much sense to use count-based tools like DESeq2.
