Finding DEPs in TMT-labeled proteome time-course data with limma
maleknias • 0
@maleknias-14080

Dear all,

I have worked with RNA-seq and microarray data before, but I am now working on TMT-labeled proteome data for the first time. My data cover 5 time points (days 0, 1, 3, 6 and 12), and at each time point I have data from two TMT 10-plex runs (two batches). So at each time point there are 2 replicates per TMT run. As you know, the range of TMT data is large, so, following previous similar studies, I first took the log2 of the data. Then I normalized the data with the "scale" function and removed the batch effect between the two TMT runs with ComBat (or removeBatchEffect from limma). After these steps, my data were ready to analyze. My questions are:

1- Am I right to use the limma package and the "eBayes" function to find DEPs (differentially expressed proteins) between a specific time point and the other time points?

2- If the answer to question 1 is yes, how should I write the "makeContrasts" call?

3- For Day 3, when I set the column names to D3 for the Day 3 columns and to other for the columns of all other time points, and use:

cont.matrix <- makeContrasts("D3-other",levels=design)

==> No DEPs are found.

When I set the column names to D3 for the Day 3 columns, before for the Day 0 and Day 1 columns, and after for the Day 6 and Day 12 columns:

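# Contrasts: Day 3 versus "before" (Day 0, Day 1) and versus "after" (Day 6, Day 12)
cont.matrix <- makeContrasts("D3-before", "D3-after", levels = design)
fit2 <- contrasts.fit(fit, cont.matrix)
fit2 <- eBayes(fit2, proportion = 0.01)
# With two contrasts and no coef argument, topTable returns the F-test table
tT <- topTable(fit2, adjust.method = "BH", number = Inf)
if (!is.null(ann)) tT <- cbind(tT, ann[as.numeric(rownames(tT)), , drop = FALSE])
colnames(tT)[1:2] <- c("before", "after")
# Ad hoc logFC: mean of the two contrast estimates
tT$logFC <- rowMeans(cbind(tT$before, tT$after), na.rm = TRUE)
DEPs <- tT[tT$adj.P.Val < 0.05, ]
DEPs$ABS_logFC <- abs(DEPs$logFC)
DEPs <- DEPs[DEPs$ABS_logFC > 1, ]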

==> I found 646 DEPs.

4- Please note that at least 400 DEPs are obtained when an independent t-test is performed between the data of the given time point and the data of the other time points.

Gordon Smyth and other colleagues, what is your opinion? How can I design the contrasts to find true DEPs?

Best regards

Samaneh

limma proteomics TMTlabeling timecoursedata
@gordon-smyth
WEHI, Melbourne, Australia

> I normalized data by "scale" function

That seems a simplistic way to normalize, but I don't know anything about TMT data.

> removed the batch effect between two TMT runs by Combat (or removeBatchEffect from limma)

It is incorrect to batch correct and then analyse the data as if no batch correction had been done. You must instead include the batch effect in the limma linear model.
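As a minimal sketch of what including the batch in the linear model could look like, assuming the layout described in the question (2 TMT batches, 5 days, 2 replicates per batch and day). The matrix `y` is simulated noise standing in for the log2 protein intensities, and the names `Day`, `Batch` and `Batch2` are illustrative, not taken from the original post:

```r
library(limma)

# Assumed sample layout: 2 batches x 5 days x 2 replicates = 20 samples
Day   <- factor(rep(rep(c("D00", "D01", "D03", "D06", "D12"), each = 2), times = 2))
Batch <- factor(rep(c("B1", "B2"), each = 10))

# Simulated log2 intensities for 100 proteins (placeholder for the real data)
set.seed(1)
y <- matrix(rnorm(100 * 20), nrow = 100)

# One coefficient per day plus an additive batch term,
# instead of removing the batch effect beforehand
design <- model.matrix(~ 0 + Day + Batch)
colnames(design) <- c(levels(Day), "Batch2")

fit <- lmFit(y, design)
```

With this design the day coefficients are estimated after adjusting for the batch, and the residual degrees of freedom correctly reflect that a batch term was fitted.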

> 1- Am I right to use the limma package and the "eBayes" function to find DEPs (differentially expressed proteins) between a specific time point and the other time points?

Sure, why not?

> 2- If the answer to question 1 is yes, how should I write the "makeContrasts" call?

The use of the makeContrasts function is well documented. The purpose of this forum isn't to write your code for you. We can help if you have difficulty with usage, but you have to mention a specific problem to get help with.

> 3- For Day 3, when I set the column names to D3 for the Day 3 columns and to other for the columns of all other time points, and use: ...

Well, first, this isn't a question.

Second, I can't follow what you mean by other or before or after. It is unclear what you are inputting to makeContrasts. Indeed, it is unclear to me what null hypothesis you are hoping to test or what trend you are trying to find. Are you possibly trying to find genes that reach an expression peak at Day 3? You don't explain. Anyway, why don't you use limma in the way it is documented to be used? There are many examples of using makeContrasts in the User's Guide and you can easily compare one time with averages of other times.
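For illustration, a contrast comparing Day 3 with the average of the other four days might be sketched as follows. This assumes a design with one coefficient per day plus a batch term; the data in `y` are simulated, and the contrast name `D3vsOthers` and the column names are hypothetical:

```r
library(limma)

# Assumed layout: 2 batches x 5 days x 2 replicates
Day   <- factor(rep(rep(c("D00", "D01", "D03", "D06", "D12"), each = 2), times = 2))
Batch <- factor(rep(c("B1", "B2"), each = 10))
design <- model.matrix(~ 0 + Day + Batch)
colnames(design) <- c(levels(Day), "Batch2")

# Day 3 minus the mean of the other four days
cont.matrix <- makeContrasts(
  D3vsOthers = D03 - (D00 + D01 + D06 + D12) / 4,
  levels = design
)

# Simulated log2 intensities (placeholder for the real data)
set.seed(1)
y <- matrix(rnorm(100 * 20), nrow = 100)

fit  <- lmFit(y, design)
fit2 <- eBayes(contrasts.fit(fit, cont.matrix))

summary(decideTests(fit2))
top <- topTable(fit2, coef = "D3vsOthers", number = 5)
```

The contrast weights sum to zero, so the null hypothesis is that the Day 3 mean equals the average of the other day means. The batch coefficient gets weight zero because it is already adjusted for in the model.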

Alternatively, you could use F-tests to test for time trends in the usual way. Or you could fit time course trends as in Section 9.6.2 of the limma User's Guide or Section 4.8 of the edgeR User's Guide.
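One way the F-test option could be sketched, assuming Day 0 is taken as the reference level and the batch enters as an additive term. The data are simulated and the coefficient positions depend on this particular formula, so treat it as an illustration rather than a recipe:

```r
library(limma)

# Assumed layout: 2 batches x 5 days x 2 replicates
Day   <- factor(rep(rep(c("D00", "D01", "D03", "D06", "D12"), each = 2), times = 2))
Batch <- factor(rep(c("B1", "B2"), each = 10))

# Columns: (Intercept), BatchB2, DayD01, DayD03, DayD06, DayD12
design <- model.matrix(~ Batch + Day)

# Simulated log2 intensities (placeholder for the real data)
set.seed(1)
y <- matrix(rnorm(100 * 20), nrow = 100)

fit <- eBayes(lmFit(y, design))

# Passing several coefficients to topTable gives a moderated F-test:
# here, "is there any difference between days?" on 4 degrees of freedom
day_coefs <- grep("^Day", colnames(design))
tt <- topTable(fit, coef = day_coefs, number = Inf)
```

A protein that is significant here changes somewhere over the time course; follow-up contrasts are then needed to say at which day.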

Third, I can't read your complicated code and you don't explain what you are trying to do. If you want to know how many DE genes there are, just use

summary(decideTests(fit2))

Fourth, I strongly recommend against the use of fold-change cutoffs when assessing DE genes. And you seem to be using your own ad hoc measure of logFC anyway rather than using what is output by limma, so it is quite unclear to me what your DE list represents.

> 4- Please note that at least 400 DEPs are obtained when an independent t-test is performed between the data of the given time point and the data of the other time points.

Again, this is not a question.

SamGG ▴ 360
@samgg-6428
France/Marseille/Inserm

Hi,

I agree with Gordon’s response, and I will simply add a few details.

I agree with the log2 transform but disagree with the use of scale. What is the rationale for it, in your opinion?

You probably don't need a complicated batch correction. I typically observe only a small offset between samples when looking at a pairs scatter plot.

The batch should be taken into account in the design.

I think it would be valuable to read sections 9.6 and 9.7 of the limma User's Guide.

I think your design looks like the following. This could help Gordon to suggest the right fit in order to compare the average of D3 versus the average of the other time points, taking the batch effect into account.

| Batch|Day | Replicate|
|-----:|:---|---------:|
|     1|D00 |     1|
|     1|D00 |     2|
|     1|D01 |     1|
|     1|D01 |     2|
|     1|D03 |     1|
|     1|D03 |     2|
|     1|D06 |     1|
|     1|D06 |     2|
|     1|D12 |     1|
|     1|D12 |     2|
|     2|D00 |     1|
|     2|D00 |     2|
|     2|D01 |     1|
|     2|D01 |     2|
|     2|D03 |     1|
|     2|D03 |     2|
|     2|D06 |     1|
|     2|D06 |     2|
|     2|D12 |     1|
|     2|D12 |     2|
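The table above could be encoded in R roughly as follows. This is a sketch in base R only; the object names `targets` and `cell_counts` are illustrative, and the additive `Day + Batch` design is one possible choice, not something confirmed in the thread:

```r
# Replicate varies fastest, then day, then batch, as in the table
targets <- expand.grid(
  Repli = 1:2,
  Day   = c("D00", "D01", "D03", "D06", "D12"),
  Batch = 1:2
)[, c("Batch", "Day", "Repli")]
targets$Batch <- factor(targets$Batch)

# Check that the layout is balanced: samples per batch-by-day cell
cell_counts <- table(targets$Batch, targets$Day)
print(cell_counts)

# Additive day + batch design; check that it has full column rank
design <- model.matrix(~ 0 + Day + Batch, data = targets)
print(qr(design)$rank == ncol(design))
```

Because every day is present in both batches, the batch term can be estimated separately from the day effects.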

This article proposes a web interface for TMT normalization and gives references to many methods.

proteiNorm − A User-Friendly Tool for Normalization and Analysis of TMT and Label-Free Protein Quantification

The authors cite limma and other packages. I am not convinced by the DAtest package as it relies on permutation of samples which are not numerous in your design.
