Question

TR: HTqPCR package using the Biomark Fluidigm 96.96 array

Heidi Dvinge ★ 2.0k

Hello Balbine,

(I'm forwarding this reply to the Bioconductor list, since it might be of use to people there. Fluidigm seems to be increasing in popularity. A recent discussion, http://article.gmane.org/gmane.science.biology.informatics.conductor/29566/match=htqpcr+fluidigm, might also be of interest to you.)

On 24 Jun 2010, at 18:03, ROUSSEL BALBINE wrote:

> Hello,
> I need advice on normalizing my data.
>
> With the Fluidigm chips, we can measure expression of 96 genes in 96 samples on one plate.
>
> We have 15 plates concerning samples so 1536 samples and 3 sets concerning genes so we have finally 288 genes.
>
> Totally, we have 15*3=45 plates
>
> The problem is that we have not housekeeping genes and no calibrator samples for each set or for each plate. But we have a test sample on all plates and a gene on all sets so we can verify if the normalization takes away much of the impact plates and if we keep the same information for the same sample or the same gene.

Just to make sure I understand you correctly: you have 3 different plate types (let's call them A, B and C) with genes geneA1->geneA96, geneB1->geneB96 and geneC1->geneC96. Sample X is present on all 45 plates. A single gene, geneY, is present on plate types A, B and C. So you should have a single value out of the 96x96 that is identical on all 45 plates. So do you have 15x96 = 1440 or 1536 samples in total?

For normalising within each plate type you have several options. You can use e.g. rank-invariant normalisation as you suggest, but given that you have an entire sample, i.e. 96 Ct values, that should be the same for all 15 plates, you can also select these 96 values and do a deltaCt normalisation. That corresponds to using these 96 values as "housekeeping" genes, since they should be identical across all plates of the same type (A, B or C).

Normalising across plates A, B and C is a bit trickier.
In principle you can designate the single sample x gene value common across all 45 plates as a pseudo-housekeeping gene and normalise against that using deltaCt. But because there are no replicates within each plate, if that single reaction didn't work well for whatever reason, it will affect the entire plate after normalisation. Risky! What's the correlation you see for this one Ct value across all 45 plates? Within each of the 3 groups of 15 plates?

Alternatively you can use quantile normalisation as you suggest. Note though that this is quite a "harsh" procedure. No matter what your data look like to begin with, it will force them into having the same Ct-value distribution. That might be okay if all your genes and samples are completely randomised across all 45 plates. But what if for example 10 samples on one plate (e.g. a particular treatment) all give very high Ct values, whereas another 10 samples (a different treatment) on another plate all give very low Ct values? Then you can't assume that the Ct value distribution on each plate should be identical. In that case a rank-invariant normalisation is probably the safest bet.

If you're not going to compare Ct values directly across plate types, such as sampleAA-gene1 versus sampleBA-gene4, then technically you wouldn't even have to normalise between plate types A, B and C. Presumably you want to find differential expression of samples across each individual gene, right? Since the same type of gene will always be present on the same type of plate, regardless of sample, you should be okay with just normalising within each A/B/C set.

I can't give you any solid advice on what normalisation to do, since it will depend on the distribution of your data, how the samples have been grouped together on plates and other factors.
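To make the difference between these options concrete, here is a minimal base-R sketch on simulated Ct values. It is not HTqPCR's implementation (the package exposes these methods through normalizeCtData()), and the matrix size, the plate shifts, the choice of rows 1-5 as the shared test sample, and the helper name quantile_normalise are all assumptions made up for illustration.

```r
# Toy Ct matrix; all numbers are simulated, none come from the original data.
# Rows = reactions (gene x sample wells), columns = plates of one type.
set.seed(1)
n.reactions <- 20                  # stand-in for the 96 x 96 = 9216 wells
n.plates    <- 3                   # stand-in for the 15 plates of one type
ref.rows    <- 1:5                 # assumption: wells of the shared test sample

ct <- matrix(rnorm(n.reactions * n.plates, mean = 25, sd = 2),
             nrow = n.reactions)

# The shared test sample has the same underlying Ct on every plate,
# plus a little technical noise (the vector recycles down each column).
truth.ref <- rnorm(length(ref.rows), mean = 25, sd = 2)
ct[ref.rows, ] <- truth.ref +
  matrix(rnorm(length(ref.rows) * n.plates, sd = 0.2), ncol = n.plates)

# Hypothetical plate effects: a constant shift per plate.
ct <- sweep(ct, 2, c(0, 1.5, -1), "+")
colnames(ct) <- paste0("plate", seq_len(n.plates))

# QC before normalising: do the shared wells agree between plates?
ref.cor <- cor(ct[ref.rows, ])

# deltaCt-style normalisation: subtract each plate's mean Ct over the
# shared wells, i.e. treat them as pseudo-housekeeping reactions.
ref.mean <- colMeans(ct[ref.rows, , drop = FALSE])
ct.delta <- sweep(ct, 2, ref.mean, "-")

# Quantile normalisation: the k-th smallest value on every plate is
# replaced by the across-plate mean of the k-th smallest values.
quantile_normalise <- function(x) {
  ranks  <- apply(x, 2, rank, ties.method = "first")
  target <- rowMeans(apply(x, 2, sort))
  out    <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(x)
  out
}
ct.quant <- quantile_normalise(ct)

# After quantile normalisation every plate has exactly the same set of
# values, whatever the biology on that plate looked like.
apply(ct.quant, 2, quantile)
```

The deltaCt step only removes a per-plate offset and leaves the shape of each plate's distribution alone, whereas the quantile step makes all plates share one distribution, which is why it is only appropriate when samples are well randomised across plates.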
I would probably first spend a lot of time on initial data QC and comparison, and depending on how the data look do something along these lines:

- Load the three different plate types, A, B and C, into separate qPCRset objects. Each object would then consist of 9216 rows (96 genes x 96 samples) and 15 columns (individual plates).
- Normalise each of these objects separately, using either quantile normalisation (strictest choice), rank-invariant normalisation or deltaCt normalisation based on the 96 rows corresponding to the sample that has been loaded on all plates.
- Combine the three objects together (cbind/rbind), and potentially change the layout (changeCtLayout) so that you have 1 gene per row and 1 sample per column, such that the object can be used for statistical testing.

or perhaps:

- Load all the plates into a single object with 9216 rows (one 96x96 plate) and one column per individual plate = 45.
- Do e.g. rank-invariant normalisation across all these.

I would probably use some of the diagnostic functions, like clusterCt and plotCtCor, both before and after normalisation, to see if the samples group together as expected based on the biology.

HTH
\Heidi

> So I thought realize normalizations by quantile, or by rank-invariant.
>
> But I do not know what strategy used because :
>
> - I can have a plate effect on the 3 set of genes
> - I can also have a plate effect concerning the different samples
>
> Is it necessary that I start combining all "qPCRset objects" or not?
>
> For plate 1 and 3 gene sets:
>
> An object of class "qPCRset"
> Size: 288 features, 96 samples
> Feature types: P1
> Feature names: AACS AADACL1 ABHD5 ...
> Feature classes:
> Feature categories: OK
> Sample names: Sample1 Sample2 Sample3 ...
>
> For plate 2 and 3 gene sets:
>
> An object of class "qPCRset"
> Size: 288 features, 96 samples
> Feature types: P1
> Feature names: AACS AADACL1 ABHD5 ...
> Feature classes:
> Feature categories: OK
> Sample names: Sample1 Sample2 Sample3 ...
>
> .... up to the plate 15 and 3 gene sets
>
> > q.features=cbind(qPCRset1,qPCRset2,.....,qPCRset15)
>
> > q.features
> An object of class "qPCRset"
> Size: 288 features, 1536 samples
> Feature types:
> Feature names: AACS AADACL1 ABHD5 ...
> Feature classes:
> Feature categories: OK
> Sample names: Sample1 Sample2 Sample3 ...
>
> >group=read.table(file="group 1536 samples.csv",h=T,sep=";",dec=".")
> >attach(group)
> >groupCID=c(as.character(group$CID))
> >sample=c(as.character(group$Subject))
> >sampleNames(q.features)=sample
> >q.features2=setCategory(q.features,groups=groupCID,flag=TRUE,flag.out="Failed")
> >q.features3=filterCategory(q.features2,na.categories="Undetermined")
>
> do the following strategy may be good? :
>
> - to do the quantile normalisation on the 96 samples and 96*3 genes (g=288)
> - then to do the global quantile normalisation on all samples and all genes (n=1536, g=288)
>
> what is with the function "normalizeCtData()" two steps are performed simultaneously?
> if not how can I do? Do I have to do a normalization for each plate with 3 gene sets? or Is what I specify in my script that it is these 15 different plates?
>
> Do you see another strategy more suitable for my data to realize normalization?
>
> How would you do?
>
> Perhaps if we combine the two methods (quantile and rank-invariant):
>
> - to do the quantile normalisation on the 96 samples and 96*3 genes (g=288)
> - then to do rank-invariant normalization on all samples and all genes (n=1536, g=288)
>
> How would you do?
>
> Thanks to your response,
>
> Balbine
