I have two RNA-seq experiments performed in different batches several months apart. Both experiments include technical replicates of the same 10 samples from condition A (with identical sample names: A1, A2, ..., A10). Exactly the same control samples were used in both experiments.
Experiment 1: 10 samples from condition A and 10 samples from condition B.
Experiment 2: 10 samples from condition A and 10 samples from condition C.
How can I use the technical replicates (condition A) to normalize for batch effects and perform differential gene expression analysis between conditions B and C?
So far I have processed my reads with kallisto and tximport.
I appreciate all your help!
Thank you Gordon! Apologies if this is too basic a question. I'm a biologist and want to ensure I understand correctly. Here is my metadata and a script. Is this what you suggested?
Here is my metadata:
Here is the script I am using for analysis:
Please come back to the support site and view your post - I've fixed it for you. If you want to show people your targets file, just paste that in rather than the code someone could use to generate it. Also, when you post the targets file or any code, please either put a triple backtick (the upper left key on a QWERTY keyboard) before and after the code, or highlight the block of code and click the CODE button that is immediately above the dialog box you type in.
While you are composing your post you can always see what it will look like by scrolling down and looking at the presentation box right below the dialog box. If it looks like a bunch of gibberish, that's what your post will look like, so it gives you a chance to edit for clarity.
Thank you James, that really helped! I tried CODE `enter code here` but I must have done it incorrectly, as it all showed in red.
Oh, right. There are two ways to enter code. If you are in the middle of a sentence and you want to include a function name, you wrap the name in backticks. And if you are typing a sentence and click CODE, it just types 'enter code here' with backticks. That's why it was red, because code in a sentence should be red like that.
But if you want a block of code, you can add a line of three backticks, then put in the code, and then another three backticks. An alternative (which is what the CODE button does if you are on a new line) is to indent by four spaces. Everything that is indented will have the code typeface.
That looks basically correct, although the terms that you are inputting to `makeContrasts` do not match the condition names in the targets file, so I assume the targets info you have shown here is not your real data.
You would find it easier to use `voomLmFit`, which automates the process of using `voom` with `duplicateCorrelation`, and has some additional advantages.
Thank you! This is actually quite close to my real dataset. I simplified the names to make it easier as an example, but I have now corrected it so that it is more useful and clearer for others to follow.
I am not sure how to interpret this, as I expected to see a rather strong positive correlation. I have double-checked the metadata, and everything seems correct. However, the data were generated several months apart, potentially on different machines. Any comments or suggestions would be greatly appreciated!
Since the intra-block correlation is negative, it would be best to remove `block` from the model.
My experience with genetically identical lab mice at the WEHI is that repeat observations on the same mouse are virtually the same as observations from different mice, so they don't lead to strong intra-mouse correlations.
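For readers following along, here is a minimal sketch of what the model might look like once `block` is dropped. It is an illustration under stated assumptions, not the poster's actual script: the targets table, the simulated counts, the batch labels `b1`/`b2` and the contrast name `BvsC` are all invented for the example. Because condition A is present in both batches, a design with a group term plus a batch term is full rank, so the `B - C` contrast is adjusted for the batch difference estimated from the A samples.

```r
library(edgeR)  # attaches limma as well

set.seed(1)

# Hypothetical targets table mirroring the layout described in the question:
# batch 1 holds A1..A10 and B1..B10, batch 2 holds A1..A10 and C1..C10.
targets <- data.frame(
  sample = c(paste0("A", 1:10), paste0("B", 1:10),
             paste0("A", 1:10), paste0("C", 1:10)),
  group  = factor(rep(c("A", "B", "A", "C"), each = 10)),
  batch  = factor(rep(c("b1", "b2"), each = 20))
)

# Simulated counts standing in for the real tximport output (assumption).
counts <- matrix(rnbinom(500 * 40, mu = 100, size = 5), nrow = 500,
                 dimnames = list(paste0("gene", 1:500), NULL))

# One coefficient per condition plus a batch coefficient.
design <- model.matrix(~ 0 + group + batch, data = targets)
colnames(design) <- c("A", "B", "C", "batch2")

# Contrast names must match the design column names.
contr <- makeContrasts(BvsC = B - C, levels = design)

y <- DGEList(counts)
y <- y[filterByExpr(y, design), , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)

# Model without blocking, as suggested once the correlation came out negative.
# The earlier blocked variant would have been:
#   fit <- voomLmFit(y, design, block = targets$sample)
fit <- voomLmFit(y, design)
fit <- contrasts.fit(fit, contr)
fit <- eBayes(fit)

topTable(fit, coef = "BvsC")
```

With real data, `counts` and `targets` would come from your own tximport object and metadata, and the column names passed to `makeContrasts` would need to match your actual condition names.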