Batch Effect Correction, Gene Filtering
Amit • 0
@b648b3f5
Last seen 6 months ago
India

Hi, I am using 26 .CEL files from GEO, covering 3 disease conditions, for DGE analysis. The PCA plot after RMA normalization shows the samples mixed together.

So when I use

BATCH.cor <- readData(NORM.data, factor = PHENO)

BATCH.data <- ARSyNseq(BATCH.cor, factor="Group", batch = FALSE, norm = "n", logtransf = TRUE)

to remove the batch effect, the samples separate on the PCA plot.

But when I use

varFilter(BATCH.data, var.func=sd, var.cutoff=0.25, filterByQuantile=TRUE)

it reduces the total number of genes from 54000 to 10000.

Also, I get more down-regulated than up-regulated DEGs after running eBayes.

Is there any other simple batch correction and gene filtering package available that doesn't interfere with the results?

DGE limma • 829 views

I answered almost the same question from you 3 days ago: DEG Filtering. It is a bit disappointing to find that you haven't taken any notice of my advice. I told you that the varFilter and ARSyNseq steps were wrong, yet you have continued to use them without any change. Should I conclude that you don't want any advice from me in the future?

@james-w-macdonald-5106
Last seen 1 day ago
United States

You should never filter data based on variance if you are using limma.


Hi James, what about batch correction and simple gene filtering? Is there any package available? I am new to this.


Generally speaking you don't gain much by filtering genes using microarray data. You could do some filtering, but I never felt it was necessary or helpful to do so. If the batches are orthogonal to the treatment (condition, whatever), then you can simply add a batch effect to your model. If the batches aren't orthogonal (e.g., one or more treatments/conditions is found only in a subset of the batches), then there will be confounding between batch and treatment and that's mostly insurmountable.
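As a minimal sketch of what "add a batch effect to your model" could look like, assuming a sample annotation with a Group and a Batch column (the column names, factor levels, and the simulated expression matrix below are hypothetical, not taken from the original data):

```r
# Hypothetical annotation: 3 conditions, 2 batches, and every condition
# is present in both batches (the orthogonal case).
pheno <- data.frame(
  Group = factor(rep(c("A", "B", "C"), times = 4)),
  Batch = factor(rep(c("b1", "b2"), each = 6))
)

# Batch enters the design as an additive covariate next to the condition.
design <- model.matrix(~ 0 + Group + Batch, data = pheno)

# Full column rank means group and batch effects can be estimated separately.
full_rank <- qr(design)$rank == ncol(design)

# The limma fit itself, run only if limma is installed.
if (requireNamespace("limma", quietly = TRUE)) {
  set.seed(1)
  # Simulated log2 expression (100 probes) standing in for RMA output.
  expr <- matrix(rnorm(100 * nrow(pheno)), nrow = 100)
  fit <- limma::lmFit(expr, design)
  contr <- limma::makeContrasts(BvsA = GroupB - GroupA, levels = design)
  fit2 <- limma::eBayes(limma::contrasts.fit(fit, contr))
  top <- limma::topTable(fit2, coef = "BvsA")
}
```

Here the expression values are left untouched and the batch term is estimated inside the linear model, so the group contrasts are adjusted for batch without a separate correction step.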

As a simple example, let's say you want to do a study to compare weights of people in two different states. In one state, the person getting the weights is using an old scale that uses springs and is in poor condition. In the other state they are using one of those new electronic scales. The weights in this situation are nested in scale (e.g., all the weights from state A are on the old scale, and all the weights from state B are on the more accurate electronic scale), and it is impossible to know if any differences are due to some technical differences between the scales, or are actual differences in weight. You could fix that by getting the scales together and testing with a set of reference weights.

But in the case of microarrays that were processed in different batches in different labs using different reagents, it's not possible to adjust for the lab/batch-specific differences unless each batch has all the same treatments and controls.
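One way to check for this kind of confounding before fitting anything is to cross-tabulate condition against batch and look at the rank of the design matrix. The sketch below uses a hypothetical annotation in which each condition comes from its own batch (the Group and Batch names and levels are assumptions for illustration):

```r
# Hypothetical annotation: each condition was processed in exactly one batch
# (the confounded case).
pheno_conf <- data.frame(
  Group = factor(rep(c("A", "B", "C"), each = 4)),
  Batch = factor(rep(c("b1", "b2", "b3"), each = 4))
)

# The cross-tabulation has non-zero counts only on the diagonal, which means
# no condition is shared between batches.
crosstab <- table(pheno_conf$Group, pheno_conf$Batch)

# The batch columns duplicate the group columns, so the design loses rank.
design_conf <- model.matrix(~ 0 + Group + Batch, data = pheno_conf)
rank_conf <- qr(design_conf)$rank
is_confounded <- rank_conf < ncol(design_conf)
```

When is_confounded is TRUE, the model cannot tell condition differences apart from batch differences, which is the situation described in the scale example.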


And I gave you advice on simple gene filtering 3 days ago.


Unfortunately OP keeps getting essentially the same answer though. ;-D
