Question: DESeq2 and continuous data
carandres74 wrote (21 months ago):

Hello

My name is Carlos. I am a PhD student working on proteomic analysis of CSF. I am currently using the DESeq2 package in R to analyze the data, and wanted to ask whether it is OK to use the negative binomial distribution on continuous data.
Is the correction factor in the DESeq2 package based on a ratio or on a multiplicative factor?
Kind regards,
CA
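
Editorial sketch on the size-factor question above: as the method is usually described, DESeq2's size factors are estimated as a median of ratios (each sample's values divided by the per-feature geometric mean across samples) and then act as a multiplicative factor on the expected count, so both descriptions apply. The base-R function below is an illustrative re-implementation under that assumption, not DESeq2's own code; `median_of_ratios` and the toy matrix are hypothetical.

```r
# Median-of-ratios size factors, written in base R for illustration only.
# Rows are features, columns are samples.
median_of_ratios <- function(mat) {
  log_geo_means <- rowMeans(log(mat))  # log geometric mean of each feature
  apply(mat, 2, function(col) {
    # use only features with a finite reference and a positive value
    keep <- is.finite(log_geo_means) & col > 0
    exp(median((log(col) - log_geo_means)[keep]))
  })
}

# Toy matrix (hypothetical): s2 is exactly twice s1, s3 is half of s1.
base <- c(10, 20, 40, 80, 160)
toy <- cbind(s1 = base, s2 = 2 * base, s3 = 0.5 * base)

sf <- median_of_ratios(toy)            # size factors: s1 = 1, s2 = 2, s3 = 0.5
normalized <- sweep(toy, 2, sf, "/")   # divide each column by its size factor
print(sf)
```

Dividing each column by its size factor puts the samples on a common scale. It says nothing about whether a negative binomial model suits the data.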

Michael Love wrote (21 months ago):

It's really based on the idea that the input is on the count scale. Can you explain exactly what kind of data you have? 

carandres74 wrote (21 months ago):

Hello Michael

I have SWATH proteomic data from cerebrospinal fluid. This is continuous data.

I hope this answers your question.

Best,

CA


SWATH is MS1 quantitation in data independent (DIA) mode, i.e. not the kind of data one would typically view as count data.

(reply written 21 months ago by Laurent Gatto)

If it's positive, but not counts or estimated counts of observations, I don't think count-based models are the right approach.

You might try limma-voom for finding differences (that is, scaling the columns, log-transforming, and estimating the mean-variance dependence from the data to inform the linear model). Maybe Laurent can weigh in on whether that sounds reasonable.

(reply written 21 months ago by Michael Love)

limma is definitely a great option. I am not sure about the voom transformation.

(reply written 21 months ago by Laurent Gatto)
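
Editorial sketch of the limma suggestion: one way it might look on positive, continuous intensities is shown below. Because voom was left as an open question in this exchange, the sketch swaps in limma's trend option (`eBayes(trend = TRUE)`), which also lets the prior variance depend on mean intensity; that substitution is an assumption, not something the replies recommended. The simulated matrix, group labels, and median centering are hypothetical choices for illustration.

```r
library(limma)  # Bioconductor package

set.seed(42)
n_prot <- 200
group <- factor(rep(c("control", "case"), each = 4),
                levels = c("control", "case"))

# Hypothetical positive, continuous intensities (proteins x samples)
intens <- matrix(rlnorm(n_prot * 8, meanlog = 12, sdlog = 0.2),
                 nrow = n_prot,
                 dimnames = list(paste0("prot", seq_len(n_prot)),
                                 paste0("s", 1:8)))
# Make the first 10 proteins 4-fold higher in the case group
intens[1:10, group == "case"] <- intens[1:10, group == "case"] * 4

# Log-transform, then median-center each sample (column)
y <- log2(intens)
y <- sweep(y, 2, apply(y, 2, median), "-")

# Linear model with moderated t-statistics and an intensity-dependent prior
design <- model.matrix(~ group)      # coefficient 2 is case vs control
fit <- lmFit(y, design)
fit <- eBayes(fit, trend = TRUE)
tt <- topTable(fit, coef = 2, number = Inf)
head(tt)
```

Whether this kind of normalization and model is appropriate for SWATH data depends on the intensity distribution and the missing-value pattern, which the thread does not settle.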
carandres74 wrote (21 months ago):

Hello

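This is what I was trying to do in order to normalize the data:

library(DMwR)    # assumed source of knnImputation()
library(DESeq2)  # provides estimateSizeFactorsForMatrix()

df = as.matrix(x)
df[df==0]=NA   # treat zeros as missing values

kdf = knnImputation(df)   # impute missing values from nearest neighbours

cds=estimateSizeFactorsForMatrix(kdf)   # one size factor per column (sample)
ndf=sweep(kdf, 2, cds, "/")   # divide each column by its size factor

boxplot(log(ndf))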

Which actually gives a nicely normalized box plot, but when I try to plot the SVD using

sv=svd(ndf)   # named sv so that the svd() function is not masked

plot(sv$v[,1],sv$v[,2],col=as.numeric(pdata$Group))   # first two right singular vectors, one point per sample

I get a strange PCA plot with no separation of the data!

Could someone please advise?

Best,

CA
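
Editorial sketch: one possible reason for the lack of separation is that `svd()` was applied to the untransformed, uncentered matrix, in which case the first singular vector mostly reflects the overall intensity level rather than differences between samples. The base-R example below log-transforms and centers each feature before the SVD, assuming features in rows and samples in columns. The simulated data and group labels are hypothetical, and this is not a diagnosis of the poster's data.

```r
set.seed(1)
n_feat <- 100
group <- factor(rep(c("A", "B"), each = 4))

# Hypothetical positive intensities; the first 20 features are 4x higher in group B
mat <- matrix(rlnorm(n_feat * 8, meanlog = 10, sdlog = 0.2), nrow = n_feat)
mat[1:20, group == "B"] <- mat[1:20, group == "B"] * 4

logmat <- log2(mat)
centered <- logmat - rowMeans(logmat)  # center each feature across samples

s <- svd(centered)
pc1 <- s$v[, 1] * s$d[1]  # sample scores on the first component
pc2 <- s$v[, 2] * s$d[2]  # sample scores on the second component

plot(pc1, pc2, col = as.numeric(group), pch = 19,
     xlab = "PC1", ylab = "PC2")
```

With centered data, the scores computed this way match what `prcomp(t(logmat))` returns, up to the sign of each component.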


I don't have any comment on normalizing this data, as I have no idea about the distribution, what is measured, etc.

You might consider making a new post, with a more descriptive title, e.g. "how to normalize SWATH proteomic data", as we've determined that it doesn't make much sense to use count-based tools like DESeq2.

(reply written 21 months ago by Michael Love, modified 21 months ago)