Question

DESeq2 and continuous data

carandres74 • 0

@carandres74-9625

Last seen 9.2 years ago

Hello

My name is Carlos. I am a PhD student working on proteomic analysis of CSF. I am currently using the DESeq2 package in R to analyze the data, and wanted to ask whether it is OK to use the negative binomial distribution on continuous data.
Is the correction factor in the DESeq2 package based on a ratio or on a multiplicative factor?
Kind regards,
CA

proteomics deseq2 • 2.4k views

ADD COMMENT • link 9.2 years ago carandres74 • 0

score 0 · Answer 1 · 2016-01-29

Michael Love 43k

@mikelove

Last seen 9 days ago

United States

It's really based on the idea that the input is on the count scale. Can you explain exactly what kind of data you have?
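
For the correction-factor part of the question, here is a minimal base R sketch of the median-of-ratios idea behind DESeq2's default size factors: each sample gets one factor (the median ratio of its values to a per-feature geometric mean), and normalization divides each column by that factor. The matrix `counts` and its values are made up for illustration, and this is a simplified re-implementation, not the package's own code.

```r
# Median-of-ratios size factors, written out in base R.
# `counts` is a made-up features x samples matrix (hypothetical data).
counts <- matrix(c( 10,  20,  40,
                   100, 200, 400,
                     5,  10,  20,
                    50, 100, 200),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("f", 1:4), paste0("s", 1:3)))

# Per-feature log geometric mean across samples (the pseudo-reference).
log_geo_means <- rowMeans(log(counts))

# Size factor of a sample = median ratio of its values to the reference.
size_factors <- apply(counts, 2, function(col) {
  exp(median(log(col) - log_geo_means))
})

# Normalization divides every column by its own size factor.
normalized <- sweep(counts, 2, size_factors, "/")

print(size_factors)  # 0.5, 1, 2 for this toy matrix
```

So the factor is estimated from ratios and then applied as a per-sample divisor. That by itself does not make the negative binomial model appropriate for continuous intensities.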

ADD COMMENT • link 9.2 years ago Michael Love 43k

score 0 · Answer 2 · 2016-02-01

carandres74 • 0

@carandres74-9625

Last seen 9.2 years ago

Hello Michael

I have SWATH proteomic data from cerebrospinal fluid. This is continuous data.

Hope this answers your questions

Best,

CA

ADD COMMENT • link 9.2 years ago carandres74 • 0

SWATH is MS1 quantitation in data-independent acquisition (DIA) mode, i.e. not the kind of data one would typically view as count data.

ADD REPLY • link 9.2 years ago Laurent Gatto 1.6k

If it's positive, but not counts or estimated counts of observations, I don't think count-based models are the right approach.

You might try limma-voom for finding differences (so scaling columns, log transforming, and estimating the mean-variance dependence from the data to inform the linear model)? Maybe Laurent can weigh in on whether that sounds reasonable.
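
A minimal base R sketch of the three steps named above (column scaling, log transformation, and a look at the mean-variance dependence), run on simulated intensities. Everything here is an assumption for illustration: the simulated matrix, the choice of median scaling, and the use of `lowess()` for the trend. It is not what voom or limma compute internally, only a way to inspect the same quantities.

```r
set.seed(1)

# Hypothetical positive, continuous intensities: 200 proteins x 6 samples,
# with a different overall scale per sample.
n_prot <- 200
sample_scale <- c(0.5, 1, 2, 1, 0.8, 1.5)
base_level <- rlnorm(n_prot, meanlog = 10, sdlog = 2)
intensities <- sapply(sample_scale, function(s) {
  base_level * s * rlnorm(n_prot, meanlog = 0, sdlog = 0.2)
})

# 1. Scale columns so that every sample has the same median intensity.
col_medians <- apply(intensities, 2, median)
scaled <- sweep(intensities, 2, col_medians / mean(col_medians), "/")

# 2. Log-transform (all values are strictly positive here).
log_int <- log2(scaled)

# 3. Per-protein mean and standard deviation on the log scale,
#    with a smooth trend of sqrt(sd) against the mean.
row_mean <- rowMeans(log_int)
row_sd <- apply(log_int, 1, sd)
trend <- lowess(row_mean, sqrt(row_sd))

plot(row_mean, sqrt(row_sd), pch = 16, cex = 0.5,
     xlab = "mean log2 intensity", ylab = "sqrt(sd)")
lines(trend, col = "red")
```

A flat trend suggests the log transform has roughly stabilized the variance; a clear slope suggests the variance still depends on the mean and should be modelled.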

ADD REPLY • link 9.2 years ago Michael Love 43k

limma is definitely a great option. I am not sure about the voom transformation.

ADD REPLY • link 9.2 years ago Laurent Gatto 1.6k

score 0 · Answer 3 · 2016-02-01

carandres74 • 0

@carandres74-9625

Last seen 9.2 years ago

Hello

This is what I was trying to do in order to normalize the data:

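library(DMwR)    # assumed source of knnImputation()
library(DESeq2)  # estimateSizeFactorsForMatrix()

# Treat zeros as missing values, then impute them by k-nearest neighbours
df = as.matrix(x)
df[df == 0] = NA

kdf = knnImputation(df)

# One size factor per sample (column); divide each column by its factor
cds = estimateSizeFactorsForMatrix(kdf)
ndf = sweep(kdf, 2, cds, "/")

boxplot(log(ndf))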

Which actually gives a nicely normalized box plot, but when I try to plot the SVD using

sv = svd(ndf)

plot(sv$v[,1], sv$v[,2], col=as.numeric(factor(pdata$Group)))

I get a strange PCA plot with no separation of the data!

Could someone please advise?

Best,

CA

ADD COMMENT • link 9.2 years ago carandres74 • 0
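
One possible reason for the plot above, offered as an assumption since the thread does not confirm it: `svd()` on an untransformed, uncentered intensity matrix tends to put overall abundance into the first singular vector, which can hide group differences. A minimal sketch of a PCA on log-scale, centered data using base R's `prcomp()` follows. The matrix `ndf` and the factor `group` are simulated stand-ins for the normalized data and `pdata$Group`.

```r
set.seed(42)

# Hypothetical normalized intensities: 100 proteins x 8 samples, two groups.
group <- factor(rep(c("A", "B"), each = 4))
ndf <- matrix(rlnorm(100 * 8, meanlog = 8, sdlog = 1), nrow = 100)

# Raise the first 20 proteins in group B so there is structure to find.
ndf[1:20, group == "B"] <- ndf[1:20, group == "B"] * 4

# prcomp() expects samples in rows, so transpose the log-scale matrix.
# center = TRUE subtracts each protein's mean before the decomposition.
pca <- prcomp(t(log2(ndf)), center = TRUE, scale. = FALSE)

# Percentage of variance carried by each principal component.
var_explained <- 100 * pca$sdev^2 / sum(pca$sdev^2)

plot(pca$x[, 1], pca$x[, 2], col = as.numeric(group), pch = 19,
     xlab = sprintf("PC1 (%.1f%%)", var_explained[1]),
     ylab = sprintf("PC2 (%.1f%%)", var_explained[2]))
```

With real data, a lack of separation after this step may simply mean that group is not the dominant source of variation.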

I don't have any comment on normalizing this data, as I have no idea about the distribution, what is measured, etc.

You might consider making a new post, with a more descriptive title, e.g. "how to normalize SWATH proteomic data", as we've determined that it doesn't make much sense to use count-based tools like DESeq2.

ADD REPLY • link 9.2 years ago Michael Love 43k
