Question

How to find differentially expressed proteins from multiple conditions (>2) using proDA?

t.b.n.nguyen-9 • 0

@a45b6d4d

Netherlands

I am using proDA to analyze my proteomics data, with a matrix as input. Here I use the example data from the package (system.file("extdata/proteinGroups.txt", package = "proDA", mustWork = TRUE)). I removed the reference condition because our dataset does not have one, and we have 4 conditions to compare:

fit <- proDA(normalized_abundance_matrix, design = ~ condition, col_data = sample_info_df)

My goal is to find proteins that are differentially expressed across the conditions, with something like an ANOVA test. How can I do this using test_diff?

I tried test_res <- test_diff(fit, "condition"), as suggested in a previous post, but it does not work. I get the error: object 'condition' not found

I have another question about creating a heatmap: I want to show all significant proteins in a heatmap. Should I use the non-imputed matrix for this?

Any help would be much appreciated.

Proteomics proDA MultipleComparison • 1.2k views

ADD COMMENT • link 19 months ago t.b.n.nguyen-9 • 0

score 1 · Answer 1 · 2023-08-30

James W. MacDonald 68k

@james-w-macdonald-5106

United States

If I am not mistaken, this is all covered in the vignette for proDA. Read through that a couple of times and then let us know if you have any questions that aren't already answered there.

ADD COMMENT • link 19 months ago James W. MacDonald 68k

Thank you for your answer. I went over the vignette again, and I think I should do all of the pairwise comparisons and then combine all of the significant proteins. Setting reduced_model = ~ 1, which compares the full model against an intercept-only model and so tests all conditions at once, may also be an option? But I am not sure if my assumption is correct.

ADD REPLY • link 19 months ago t.b.n.nguyen-9 • 0

If all you care about is knowing whether any protein is different in any of the conditions, then the LRT will tell you that. I don't see the use case for that though (don't you want to know which conditions differ?).
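For reference, here is a minimal sketch of that omnibus test in proDA. It assumes a toy matrix with four hypothetical conditions A to D; the data and the names abundance, sample_info, and omnibus are made up for illustration. Passing reduced_model = ~ 1 instead of a contrast asks test_diff to compare the full model against an intercept-only model. The sketch also hints at why test_diff(fit, "condition") fails: a contrast has to be built from the coefficient names that result_names(fit) returns, and "condition" on its own is not one of them.

```r
library(proDA)

set.seed(1)

# Toy data (made up for illustration): 50 proteins x 12 samples,
# 4 conditions (A-D) with 3 replicates each, on a log2-like scale.
condition <- factor(rep(c("A", "B", "C", "D"), each = 3))
abundance <- matrix(
  rnorm(50 * 12, mean = 20, sd = 1), nrow = 50,
  dimnames = list(paste0("protein_", 1:50), paste0("sample_", 1:12))
)
# Shift the first 5 proteins upwards in condition D.
abundance[1:5, condition == "D"] <- abundance[1:5, condition == "D"] + 3
# Drop roughly 70% of the low-intensity values to mimic
# intensity-dependent missingness.
low <- which(abundance < 18.5)
abundance[low[runif(length(low)) < 0.7]] <- NA

sample_info <- data.frame(name = colnames(abundance), condition = condition)

fit <- proDA(abundance, design = ~ condition, col_data = sample_info)

# Coefficient names that can be used in a contrast;
# "condition" alone is not among them.
result_names(fit)

# Omnibus test: full model (~ condition) against the intercept-only model (~ 1).
omnibus <- test_diff(fit, reduced_model = ~ 1)
head(omnibus[order(omnibus$pval), c("name", "pval", "adj_pval")])
```

A small adj_pval here only says that a protein differs somewhere among the conditions, not which conditions differ.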

As you note, an alternative is to make all possible comparisons. Combining all the significant proteins will get you pretty close to what you get for the LRT (different models produce slightly different results).
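A rough sketch of the pairwise route, again on made-up toy data with four hypothetical conditions A to D (abundance, sample_info, pairwise, and sig_any are illustrative names, and the 0.05 cutoff is an arbitrary choice). With design = ~ condition and A as the first factor level, a coefficient such as conditionB is already the B versus A difference, and the remaining pairs are differences between coefficients.

```r
library(proDA)

set.seed(1)

# Toy data (made up for illustration): 50 proteins x 12 samples,
# 4 conditions (A-D) with 3 replicates each, on a log2-like scale.
condition <- factor(rep(c("A", "B", "C", "D"), each = 3))
abundance <- matrix(
  rnorm(50 * 12, mean = 20, sd = 1), nrow = 50,
  dimnames = list(paste0("protein_", 1:50), paste0("sample_", 1:12))
)
# Shift the first 5 proteins upwards in condition D.
abundance[1:5, condition == "D"] <- abundance[1:5, condition == "D"] + 3

sample_info <- data.frame(name = colnames(abundance), condition = condition)
fit <- proDA(abundance, design = ~ condition, col_data = sample_info)

# All six pairwise comparisons. A is the baseline absorbed into the
# intercept, so comparisons against A use a single coefficient.
pairwise <- list(
  B_vs_A = test_diff(fit, "conditionB"),
  C_vs_A = test_diff(fit, "conditionC"),
  D_vs_A = test_diff(fit, "conditionD"),
  C_vs_B = test_diff(fit, "conditionC - conditionB"),
  D_vs_B = test_diff(fit, "conditionD - conditionB"),
  D_vs_C = test_diff(fit, "conditionD - conditionC")
)

# Union of proteins that pass the cutoff in at least one comparison.
sig_any <- unique(unlist(lapply(pairwise, function(res) {
  res$name[!is.na(res$adj_pval) & res$adj_pval < 0.05]
}), use.names = FALSE))
sig_any
```

Note that adj_pval is adjusted within each comparison only, so taking the union over six comparisons adds no extra correction for having run six tests.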

ADD REPLY • link 19 months ago James W. MacDonald 68k

In my research, I aim to distinguish five different enriched cell populations based on their unique expression profiles. My strategy is to first identify differentially expressed proteins across these cell populations and then cluster them. I have explored both pairwise comparisons and ANOVA for this task on a different dataset (no missing values), and ANOVA seems to be more effective (just my gut feeling). I am still new to this type of analysis and learning.

ADD REPLY • link 19 months ago t.b.n.nguyen-9 • 0