limma technical repeat correction
1
0
Entering edit mode
john • 0
@cb255338
Last seen 33 minutes ago
United Kingdom

Hello, Would someone please help me verify if i'm on the right track in correcting for technical replicates using limma and also if there is a better approach to use (duplicateCorrelation() vs lmer()) for my protein expression data.

I've got 8 disease patients per group (DiseaseA, DiseaseB). There are 2 biological samples(_1,_2) with 4 technical replicates each (rep1-4).

Quite confused on what to include in the block when handling technical replicates - is it the column - patientID or replicates_1 or replicates_2 below. Please note I have manually created replicate_2 column


> sample.sheet
                   group   patientID replicates_1 replicates_2
258736519273_1 Disease_A Disease_A_1            1            1
258736519272_4 Disease_A Disease_A_1            2            2
258736519276_1 Disease_A Disease_A_1            3            3
258736519273_6 Disease_A Disease_A_1            4            4
258736519273_2 Disease_A Disease_A_2            1            1
258736519272_1 Disease_A Disease_A_2            2            2
258736519275_3 Disease_A Disease_A_2            3            3
258736519272_8 Disease_A Disease_A_2            4            4
258736519273_5 Disease_B Disease_B_1            1            5
258736519273_3 Disease_B Disease_B_1            2            6
258736519274_2 Disease_B Disease_B_1            3            7
258736519275_7 Disease_B Disease_B_1            4            8
258736519274_3 Disease_B Disease_B_2            1            5
258736519274_7 Disease_B Disease_B_2            2            6
258736519275_1 Disease_B Disease_B_2            3            7
258736519272_2 Disease_B Disease_B_2            4            8
258736519273_8   Healthy   Healthy_1            1            9
258736519273_4   Healthy   Healthy_1            2           10
258736519272_5   Healthy   Healthy_1            3           11
258736519274_4   Healthy   Healthy_1            4           12
258736519274_8   Healthy   Healthy_2            1            9
258736519276_3   Healthy   Healthy_2            2           10
258736519272_7   Healthy   Healthy_2            3           11
258736519274_5   Healthy   Healthy_2            4           12

group <- factor(sample.sheet$group);
patient <- factor(sample.sheet$patientID); **# unsure whether to use patientID or replicates_1 or replicates_2**

#> table(sample.sheet$group)
#Disease_A Disease_B   Healthy
#        8         8         8

#> table(sample.sheet$patientID)
#Disease_A_1 Disease_A_2 Disease_B_1 Disease_B_2   Healthy_1   Healthy_2
#          4           4           4           4           4           4

# Create a design matrix for your comparison (e.g., control vs treatment)
design <- model.matrix(~0 + group);
colnames(design) <-  gsub('group', '', colnames(design));

# Use duplicateCorrelation to estimate within-patient correlation
corfit <- duplicateCorrelation(exp.data.subset, design, block = patient);

fit <- lmFit(
    exp.data.subset,
    design,
    block = patient,
    correlation = corfit$consensus.correlation
    );

# create contrast matrix
contrast.matrix <- makeContrasts(
    'Disease_AvsHealthy' = Disease_A-Healthy,
    'Disease_BvsHealthy' = Disease_B-Healthy,
    'Disease_A-HealthyvsDisease_B-Healthy' = (Disease_A-Healthy) - (Disease_B-Healthy),
    levels = design
);

fit2 <- contrasts.fit(fit, contrast.matrix);
fit2 <- eBayes(fit2, trend = TRUE); **# robust = TRUE, trend = TRUE? what is more appropriate for proteomics **
limma lmer duplicateCorrelation technicalrepeats • 81 views
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 2 hours ago
WEHI, Melbourne, Australia

The technical samples should be averaged. Then use duplicateCorrelation() with block=PatientID.

ADD COMMENT
0
Entering edit mode

Hi Gordon, Thank you very much for the prompt response.

If i understand correctly, wouldnt this remove the need to use duplicateCorrelation(). Since averaging condenses these replicates into a single value per biological replicate, would that mean we no longer need to account for correlations between them? Thanks for clarifying!

         sampleID     group   patientID
1  258736519273_1 Disease_A Disease_A_1
5  258736519273_2 Disease_A Disease_A_2
9  258736519273_5 Disease_B Disease_B_1
13 258736519274_3 Disease_B Disease_B_2
17 258736519273_8   Healthy   Healthy_1
21 258736519274_8   Healthy   Healthy_2
ADD REPLY

Login before adding your answer.

Traffic: 766 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6