Question: DESeq2 and continuous data
2.7 years ago by
carandres740 wrote:

Hello

My name is Carlos. I am a PhD student working on proteomic analysis of CSF. I am currently using the DESeq2 package in R to analyze the data, and wanted to ask whether it is OK to use the negative binomial distribution on continuous data.
Is the correction factor in the DESeq2 package based on a ratio or on a multiplicative factor?
Kind regards,
CA

2.7 years ago by
Michael Love wrote:

It's really based on the idea that the input is on the count scale. Can you explain exactly what kind of data you have?

2.7 years ago by
carandres740 wrote:

Hello Michael

I have SWATH proteomic data from cerebrospinal fluid. This is continuous data.

Best,

CA

SWATH is MS1 quantitation in data-independent acquisition (DIA) mode, i.e. not the kind of data one would typically view as count data.


If it's positive, but not counts or estimated counts of observations, I don't think count-based models are the right approach.

You might try limma-voom for finding differences (that is, scaling columns, log transforming, and estimating the mean-variance dependence from the data to inform the linear model). Maybe Laurent can weigh in on whether that sounds reasonable.

limma is definitely a great option. I am not sure about the voom transformation.
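
For readers who want to see the shape of the workflow suggested above (scale columns, log transform, fit a linear model per protein), here is a minimal sketch in base R. It is not limma and does not estimate a mean-variance trend or moderate the variances. It only illustrates the steps on simulated data. The matrix `x`, the `group` factor, the sample loadings, and the choice of median centering are all assumptions made for illustration.

```r
# Simulated positive, continuous intensities: 200 proteins x 6 samples.
# All values here are made up for illustration.
set.seed(1)
n_prot <- 200
group <- factor(rep(c("control", "case"), each = 3),
                levels = c("control", "case"))
base <- rlnorm(n_prot, meanlog = 10, sdlog = 1)      # per-protein abundance
loading <- c(1.0, 1.3, 0.8, 1.1, 0.9, 1.2)           # per-sample scaling
x <- outer(base, loading) *
  matrix(rlnorm(n_prot * 6, 0, 0.1), n_prot, 6)      # multiplicative noise
x[1:10, group == "case"] <- x[1:10, group == "case"] * 4  # 10 changed proteins

# 1. Log transform, then scale columns by subtracting each sample's median.
logx <- log2(x)
norm <- sweep(logx, 2, apply(logx, 2, median), "-")

# 2. Per-protein linear model on the log scale; keep the p-value of the
#    group coefficient (row 2, column 4 of the coefficient table).
pvals <- apply(norm, 1, function(y) summary(lm(y ~ group))$coefficients[2, 4])

# 3. Adjust for multiple testing (Benjamini-Hochberg).
padj <- p.adjust(pvals, method = "BH")
```

With real data, limma's `lmFit()` and `eBayes()` would replace step 2 and borrow information across proteins, which usually matters when there are only a few samples per group.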

2.7 years ago by
carandres740 wrote:

Hello

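This is what I was trying to do in order to normalize the data:

library(DESeq2)  # provides estimateSizeFactorsForMatrix()
library(DMwR)    # assumed source of knnImputation()

df = as.matrix(x)
df[df==0]=NA                            # treat zeros as missing values

kdf = knnImputation(df)                 # impute the missing values

cds=estimateSizeFactorsForMatrix(kdf)   # one size factor per column (sample)
ndf=sweep(kdf, 2, cds, "/")             # divide each column by its size factor

boxplot(log(ndf))

Which actually gives a pretty well normalized box plot, but when I try to plot the SVD using

svd=svd(ndf)

plot(svd$v[,1],svd$v[,2],col=as.numeric(pdata$Group))

I get a strange PCA plot with no separation of the data!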

Best,

CA
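
On the earlier question of whether the correction factor is a ratio or a multiplicative factor: the size factor is estimated from ratios (each sample's median ratio to a per-row geometric mean) and is then applied as a per-sample divisor. The sketch below reproduces that idea in base R on simulated data and then runs a PCA on the log-transformed result with samples as rows. It is an illustration, not DESeq2's own code, and the matrix `m`, the factor `grp`, and the loading values are made up. One possible contributor to the odd plot above is that `svd()` was applied to the untransformed, uncentered matrix, in which case the first component mostly reflects overall abundance. That is a guess, not a diagnosis.

```r
# Simulated positive intensities: 100 proteins (rows) x 6 samples (columns).
set.seed(2)
grp <- factor(rep(c("A", "B"), each = 3))
m <- matrix(rlnorm(600, meanlog = 8, sdlog = 1), nrow = 100,
            dimnames = list(NULL, paste0("s", 1:6)))
m <- sweep(m, 2, c(1, 2, 0.5, 1.5, 1, 0.8), "*")   # unequal sample loading

# Median-of-ratios size factors: each sample's median ratio to the
# per-row geometric mean, computed on the log scale.
log_geo_mean <- rowMeans(log(m))
sf <- apply(m, 2, function(col) exp(median(log(col) - log_geo_mean)))

# Apply the size factors: divide every column by its own factor.
norm <- sweep(m, 2, sf, "/")

# PCA on log-scale data with samples as rows; prcomp() centers each protein.
pca <- prcomp(t(log2(norm)), center = TRUE, scale. = FALSE)
scores <- pca$x[, 1:2]   # PC1 and PC2 coordinates, one row per sample
```

The sample scores can then be drawn with `plot(scores, col = as.numeric(grp))`. Because this simulation contains no group effect, no separation should be expected here. With real data, separation appears only if the group difference is one of the dominant sources of variation.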

I don't have any comment on normalizing this data, as I have no idea about the distribution, what is measured, etc.

You might consider making a new post, with a more descriptive title, e.g. "how to normalize SWATH proteomic data", as we've determined that it doesn't make much sense to use count-based tools like DESeq2.
