Question

Finding DEPs in TMT-labeled proteome time-course data with limma

maleknias • 0

Dear all,

I have worked on RNA-seq and microarray data before, but I am now working on TMT-labeled proteome data for the first time. My data cover 5 time points (days 0, 1, 3, 6, 12), and at each time point I have data from two TMT 10-plex runs (two batches). So at each time point there are 2 replicates per TMT run. As you know, TMT data span a large range, so following earlier similar studies I first took log2 of the data. I then normalized the data with the "scale" function and removed the batch effect between the two TMT runs with ComBat (or removeBatchEffect from limma). After these steps my data were ready for analysis. My questions are:

1- Am I right to use the limma package and the "eBayes" function to find DEPs (differentially expressed proteins) between a specific time point and the other time points?

2- If the answer to question 1 is yes, how should I write the "makeContrasts" call?

3- For Day 3, when I label the Day 3 columns D3 and the columns of all other time points other, and use:

cont.matrix <- makeContrasts("D3-other", levels=design)

==> No DEPs are found.

When I label the Day 3 columns D3, the Day 0 and Day 1 columns before, and the Day 6 and Day 12 columns after:

# Two contrasts: Day 3 versus earlier days, and Day 3 versus later days
cont.matrix <- makeContrasts("D3-before", "D3-after", levels=design)
fit2 <- contrasts.fit(fit, cont.matrix)
fit2 <- eBayes(fit2, 0.01)
tT <- topTable(fit2, adjust="BH", number=Inf)
if (!is.null(ann)) tT <- cbind(tT, ann[as.numeric(rownames(tT)),,drop=FALSE])
# Average the two contrast estimates into a single ad hoc logFC
colnames(tT)[1:2] <- c("before","after")
tT$logFC <- rowMeans(cbind(tT$before, tT$after), na.rm=TRUE)
# Keep proteins with adjusted p-value < 0.05 and |logFC| > 1
DEPs <- tT[tT$adj.P.Val<0.05,]
DEPs$ABS_logFC <- abs(DEPs$logFC)
DEPs <- DEPs[DEPs$ABS_logFC >1,]

==> I could find 646 DEPs.

4- Please note that at least 400 DEPs are obtained when an independent t-test is performed between the data of the given time point and the data of the other time points.

Gordon Smyth and other colleagues, what is your view? How can I design the contrast to find true DEPs?

Best regards

Samaneh

limma proteomics TMTlabeling timecoursedata • 1.4k views

updated 3.5 years ago by SamGG ▴ 350 • written 3.5 years ago by maleknias • 0

score 0 · Answer 1 · 2020-11-07

I normalized data by "scale" function

That seems a simplistic way to normalize, but I don't know anything about TMT data.

removed the batch effect between two TMT runs by Combat (or removeBatchEffect from limma)

It is incorrect to batch correct and then analyse the data as if no batch correction had been done. You must instead include the batch effect in the limma linear model.
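As a rough sketch of what "include the batch effect in the linear model" can look like: the layout below (5 days, 2 TMT batches, 2 replicates each) follows the question, but the object names `targets` and `y`, the labels `B1`/`B2`, and the simulated intensities are placeholders, not the poster's data. The input is assumed to be log2 values that have not been batch corrected.

```r
library(limma)

# Hypothetical sample sheet: 5 days x 2 TMT batches x 2 replicates = 20 samples
days <- c("D0", "D1", "D3", "D6", "D12")
targets <- expand.grid(Repl = 1:2, Day = days, Batch = c("B1", "B2"))
targets$Day <- factor(targets$Day, levels = days)
targets$Batch <- factor(targets$Batch, levels = c("B1", "B2"))

# Simulated log2 intensities standing in for the real, uncorrected data
set.seed(1)
y <- matrix(rnorm(100 * nrow(targets)), nrow = 100)
rownames(y) <- paste0("protein", seq_len(nrow(y)))

# One coefficient per day plus an additive batch term
design <- model.matrix(~ 0 + Day + Batch, data = targets)
colnames(design) <- sub("^Day", "", colnames(design))

fit <- lmFit(y, design)

# The batch term uses one degree of freedom: 20 samples - 6 coefficients
fit$df.residual[1]
```

Fitting the batch this way lets the residual degrees of freedom reflect the batch adjustment, which a model fitted to already-corrected data would not account for.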

1- Am I right to use the limma package and the "eBayes" function to find DEPs (differentially expressed proteins) between a specific time point and the other time points?

Sure, why not?

2- If the answer to question 1 is yes, how should I write the "makeContrasts" call?

The use of the makeContrasts function is well documented. The purpose of this forum isn't to write your code for you. We can help if you have difficulty with usage, but you have to mention a specific problem to get help with.

3- For Day3, when I put colnames=D3 for columns related to Day3 and put colnames=other for columns related to other time points and: ...

Well, first, this isn't a question.

Second, I can't follow what you mean by other or before or after. It is unclear what you are inputting to makeContrasts. Indeed, it is unclear to me what null hypothesis you are hoping to test or what trend you are trying to find. Are you possibly trying to find genes that reach an expression peak at Day 3? You don't explain. Anyway, why don't you use limma in the way it is documented to be used? There are many examples of using makeContrasts in the User's Guide, and you can easily compare one time with averages of other times.
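For illustration only, comparing one time with the average of the other times might look like the sketch below. It assumes a design with one coefficient per day plus a batch term; the contrast name `D3vsOthers`, the objects `targets` and `y`, and the simulated data are all hypothetical.

```r
library(limma)

# Hypothetical layout from the question: 5 days x 2 batches x 2 replicates
days <- c("D0", "D1", "D3", "D6", "D12")
targets <- expand.grid(Repl = 1:2, Day = days, Batch = c("B1", "B2"))
targets$Day <- factor(targets$Day, levels = days)
targets$Batch <- factor(targets$Batch, levels = c("B1", "B2"))

# Simulated log2 intensities (placeholder for real data)
set.seed(1)
y <- matrix(rnorm(200 * nrow(targets)), nrow = 200)
rownames(y) <- paste0("protein", seq_len(nrow(y)))

design <- model.matrix(~ 0 + Day + Batch, data = targets)
colnames(design) <- sub("^Day", "", colnames(design))
fit <- lmFit(y, design)

# Day 3 versus the average of the four other days, adjusted for batch
cont.matrix <- makeContrasts(
  D3vsOthers = D3 - (D0 + D1 + D6 + D12) / 4,
  levels = design
)
fit2 <- eBayes(contrasts.fit(fit, cont.matrix))

# Counts of down, non-significant, and up proteins at the default 5% FDR
summary(decideTests(fit2))

# Full ranked table with limma's own logFC and BH-adjusted p-values
tT <- topTable(fit2, coef = "D3vsOthers", adjust.method = "BH", number = Inf)
```

Here the null hypothesis is that the Day 3 mean equals the average of the other four day means, which may or may not be the hypothesis the poster had in mind.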

Alternatively, you could use F-tests to test for time trends in the usual way. Or you could fit time course trends as in Section 9.6.2 of the limma User's Guide or Section 4.8 of the edgeR User's Guide.
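One possible reading of "F-tests to test for time trends" is a single F-test on all day coefficients at once, sketched below. It treats day as a factor with Day 0 as the reference level and keeps batch in the model; the object names and simulated data are assumptions, and the spline approach described in the User's Guide is not shown here.

```r
library(limma)

# Hypothetical layout from the question: 5 days x 2 batches x 2 replicates
days <- c("D0", "D1", "D3", "D6", "D12")
targets <- expand.grid(Repl = 1:2, Day = days, Batch = c("B1", "B2"))
targets$Day <- factor(targets$Day, levels = days)
targets$Batch <- factor(targets$Batch, levels = c("B1", "B2"))

# Simulated log2 intensities (placeholder for real data)
set.seed(1)
y <- matrix(rnorm(200 * nrow(targets)), nrow = 200)
rownames(y) <- paste0("protein", seq_len(nrow(y)))

# Intercept is Day 0 in batch B1; coefficients 2 to 5 are D1, D3, D6, D12 versus D0
design <- model.matrix(~ Day + Batch, data = targets)
fit <- eBayes(lmFit(y, design))

# Moderated F-test on 4 df: does the protein change at any time point?
tT <- topTable(fit, coef = 2:5, number = Inf)
head(tT)
```

A significant F-statistic says only that some day differs from Day 0 after adjusting for batch; follow-up contrasts would be needed to say which one.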

Third, I can't read your complicated code and you don't explain what you are trying to do. If you want to know how many DE genes there are, just use

summary(decideTests(fit2))

Fourth, I strongly recommend against the use of fold-change cutoffs when assessing DE genes. And you seem to be using your own ad hoc measure of logFC anyway rather than using what is output by limma, so it is quite unclear to me what your DE list represents.

4- Please note that at least 400 DEPs are obtained when an independent t-test is performed between the data of the given time point and the data of the other time points.

Again, this is not a question.

score 0 · Answer 2 · 2020-11-07

Hi,

I agree with Gordon’s response, and I will simply add a few details.

I agree with the log2 transform but disagree with the use of scale. What is the rationale, in your opinion?

You probably don't need a complicated batch correction. I typically observe a small offset between samples when looking at a pairs scatter plot.

The batch should be taken into account in the design.

I think it would be valuable to read sections 9.6 and 9.7 of the limma User's Guide.

I think your design looks like the following. This could help Gordon suggest the right fit for comparing the average of D3 with the average of the other time points while taking the batch effect into account.

| Batch|Day | Replicate|
|-----:|:---|---------:|
|     1|D00 |         1|
|     1|D00 |         2|
|     1|D01 |         1|
|     1|D01 |         2|
|     1|D03 |         1|
|     1|D03 |         2|
|     1|D06 |         1|
|     1|D06 |         2|
|     1|D12 |         1|
|     1|D12 |         2|
|     2|D00 |         1|
|     2|D00 |         2|
|     2|D01 |         1|
|     2|D01 |         2|
|     2|D03 |         1|
|     2|D03 |         2|
|     2|D06 |         1|
|     2|D06 |         2|
|     2|D12 |         1|
|     2|D12 |         2|
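The table above can be rebuilt in base R and turned into a design matrix, as in the minimal sketch below. The data frame name `targets` and the choice of an additive `Day + Batch` model are assumptions made for illustration.

```r
# Rebuild the sample table: replicate varies fastest, then day, then batch
days <- c("D00", "D01", "D03", "D06", "D12")
targets <- expand.grid(Repli = 1:2, Day = days, Batch = 1:2)
targets <- targets[, c("Batch", "Day", "Repli")]
targets$Day <- factor(targets$Day, levels = days)
targets$Batch <- factor(targets$Batch)

# Each day appears twice in each batch
table(targets$Day, targets$Batch)

# Additive design: one column per day plus one column for batch 2
design <- model.matrix(~ 0 + Day + Batch, data = targets)
dim(design)
```

Because every day is present in both batches, the batch column is not confounded with any day column and the design has full rank.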