Missing value imputation
@kamalfartiyal84-7976
Cancer Research UK Cambridge Institute

Hi,

I want to perform missing value imputation on TMT-based quantitative proteomics data. I plan to use mixed imputation, applying two different methods (MCAR/MNAR) to two different groups within the same dataset. Should I perform the imputation on raw or log-transformed peptide intensities?

Kamal

msnbase • 2.3k views
@laurent-gatto-5645
Belgium

I don't think it matters, as long as you don't use zero imputation for MNAR.
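To make the raw-versus-log question concrete, here is a minimal sketch using made-up intensities for a single peptide. It assumes simple per-peptide minimum and mean imputation as stand-ins for an MNAR and a MAR-style method; the helper names `impute_min` and `impute_mean` are hypothetical and not part of any package. Minimum imputation gives the same result on either scale because the log is monotonic, mean imputation differs slightly (often by a small amount), and zeros imputed on the raw scale cannot be log-transformed afterwards.

```python
import numpy as np

# Hypothetical raw intensities for one peptide across 5 replicates (NaN = missing).
raw = np.array([1200.0, 950.0, np.nan, 1500.0, np.nan])

def impute_min(x):
    """Replace NaN with the smallest observed value (simple MNAR-style stand-in)."""
    out = x.copy()
    out[np.isnan(out)] = np.nanmin(x)
    return out

def impute_mean(x):
    """Replace NaN with the mean of the observed values (simple MAR-style stand-in)."""
    out = x.copy()
    out[np.isnan(out)] = np.nanmean(x)
    return out

# Minimum imputation: log2 is monotonic, so imputing before or after the
# transformation yields the same values.
min_commutes = np.allclose(np.log2(impute_min(raw)), impute_min(np.log2(raw)))

# Mean imputation: the log of the mean is not the mean of the logs,
# so the two orders give slightly different imputed values.
mean_commutes = np.allclose(np.log2(impute_mean(raw)), impute_mean(np.log2(raw)))

# Zero imputation on the raw scale turns into -inf after the log transformation.
zero_filled = np.where(np.isnan(raw), 0.0, raw)
with np.errstate(divide="ignore"):
    zero_logged = np.log2(zero_filled)

print(min_commutes, mean_commutes, zero_logged)
```

Whether the small scale-dependent difference for mean-like methods matters in practice depends on the data; it is worth checking on a few peptides before committing to one order.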

However, as you use TMT tags, one would expect your missing values to result from genuinely absent peptides rather than from features missed by the MS, because the samples were combined. If it is a typical shotgun experiment, one wouldn't expect many missing values; some features may still have many missing values, and these should probably be filtered out completely.
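As a rough illustration of filtering out features with many missing values, the sketch below drops peptides whose fraction of missing values exceeds a cutoff. The matrix and the 50% cutoff (`MAX_MISSING_FRACTION`) are assumptions for illustration only; the appropriate threshold depends on the experiment.

```python
import numpy as np

# Hypothetical peptide-by-sample matrix (rows = peptides, NaN = missing).
X = np.array([
    [10.0, 11.0, 10.5, 10.8],
    [np.nan, np.nan, np.nan, 9.0],
    [8.0, np.nan, 8.2, 8.1],
])

MAX_MISSING_FRACTION = 0.5  # assumed cutoff, not a recommendation

# Fraction of missing values per peptide (row).
missing_fraction = np.isnan(X).mean(axis=1)

# Keep only peptides at or below the cutoff.
keep = missing_fraction <= MAX_MISSING_FRACTION
X_filtered = X[keep]

print(missing_fraction, X_filtered.shape)
```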

@kamalfartiyal84-7976
Cancer Research UK Cambridge Institute

Thanks, Laurent, for your reply. In my dataset, one condition is expected to have more missing peptides than the other for biological reasons. The filtering strategy I am employing is as follows:

Condition A (5 replicates) is expected to have more peptides than Condition B (5 replicates).

-> Remove all peptides that are completely missing from both Condition A and Condition B.
-> Keep peptides that are present in at least 3 replicates of Condition A.
-> Apply no such restriction to Condition B.
-> Apply a MAR method to Condition A and an MNAR method to Condition B (as peptides there are expected to be biologically missing).
-> Alternatively, for each peptide, assign the average intensity across the Condition A replicates to its missing values in Condition A, and assign the minimum intensity across the Condition B replicates to its missing values in Condition B.
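The outline above, using the simple per-peptide mean (Condition A) and minimum (Condition B) variant from the last step, could be sketched roughly as follows. The function `filter_and_impute`, its parameters, and the example values are all hypothetical. One assumption to flag: the outline does not say what to do when a peptide has no observed value at all in Condition B, so such rows are left as NaN unless an explicit `b_fallback` value is supplied.

```python
import numpy as np

def filter_and_impute(a, b, min_present_a=3, b_fallback=None):
    """Filter peptides, then impute A with the row mean and B with the row minimum.

    a, b: peptide-by-replicate arrays for Condition A and B (NaN = missing).
    Returns (a_imputed, b_imputed, keep_mask).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    # Requiring >= min_present_a observations in A also removes peptides
    # that are completely missing from both conditions.
    keep = (~np.isnan(a)).sum(axis=1) >= min_present_a
    a, b = a[keep], b[keep]

    # Condition A: fill gaps with the per-peptide mean of observed values.
    a_means = np.nanmean(a, axis=1, keepdims=True)
    a_imp = np.where(np.isnan(a), a_means, a)

    # Condition B: fill gaps with the per-peptide minimum of observed values.
    b_has_obs = (~np.isnan(b)).any(axis=1)
    b_mins = np.full((b.shape[0], 1), np.nan)
    b_mins[b_has_obs, 0] = np.nanmin(b[b_has_obs], axis=1)
    if b_fallback is not None:
        # Assumption: caller-chosen value for peptides never observed in B.
        b_mins[~b_has_obs, 0] = b_fallback
    b_imp = np.where(np.isnan(b), b_mins, b)

    return a_imp, b_imp, keep

nan = np.nan
cond_a = [[10, 12, nan, 11, nan], [nan, nan, nan, 9, 9.5],
          [8, 8, 8, 8, 8], [nan] * 5]
cond_b = [[7, nan, 6, nan, nan], [5, 5, 5, 5, 5],
          [nan] * 5, [nan] * 5]

a_imp, b_imp, keep = filter_and_impute(cond_a, cond_b)
print(keep)
print(a_imp)
print(b_imp)
```

Peptides never observed in Condition B are arguably the most interesting ones here, so the choice of `b_fallback` (for example, a low global value) deserves some thought rather than a default.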

I usually remove missing values in all my analyses, but in this specific dataset, due to the nature of the biology, I have to keep them. I would therefore highly appreciate your feedback on the above outline, as this is the first time I am using imputation in an analysis.

Thanks.

Kamal


Yes, that seems reasonable. I am unsure about using the peptide average rather than another suitable MAR method (as this will artificially reduce the variability for that peptide, and the statistical tests might then be too optimistic), but I guess that by trying it and inspecting the results, you will see.
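The concern about variability can be seen in a small made-up example: filling two missing replicates with the peptide mean leaves the sum of squared deviations unchanged while increasing the sample size, so both the standard deviation and the standard error shrink. The values below are hypothetical.

```python
import numpy as np

# Hypothetical peptide observed in 3 of 5 replicates.
observed = np.array([10.0, 12.0, 11.0])

# Fill the 2 missing replicates with the mean of the observed values.
imputed = np.concatenate([observed, np.full(2, observed.mean())])

# Sample standard deviation before and after mean imputation.
sd_observed = observed.std(ddof=1)
sd_imputed = imputed.std(ddof=1)

# Standard error of the mean, as it would enter a t-like statistic.
se_observed = sd_observed / np.sqrt(observed.size)
se_imputed = sd_imputed / np.sqrt(imputed.size)

print(sd_observed, sd_imputed)
print(se_observed, se_imputed)
```

A smaller standard error with no new information is what could make a downstream test look more significant than the data warrant.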


Thanks very much for your comments.
