maziz@tgen.org
Hi,
Will CellHTS2 work if I do not have positive controls on my plate? I
only have negative controls.
Thanks,
Meraj
-----Original Message-----
From: bioconductor-bounces@r-project.org [mailto:bioconductor-bounces@r-project.org] On Behalf Of maziz@tgen.org
Sent: Saturday, July 28, 2012 12:11 PM
To: joseph.barry at embl.de
Cc: bioconductor at r-project.org
Subject: Re: [BioC] Question regarding cellhts2 output
Hi Joseph,
We are getting ready to write up our findings.
I am still wondering about one question. I apologize if you have
answered it before, but to clarify, please help me understand
the process by which CellHTS2 processes our data.
So, as I mentioned before, we have 900 siRNA (x4 siRNA per gene) and no
replicates in our experiments.
The input to cellhts2 is already a simple ratio between a
phosphorylated vs baseline protein.
I am using the online version of cellhts2
(http://web-cellhts2.dkfz.de/cellHTS-java/CellHTS2) and not the R version.
The following parameters are part of the output generated by
the online web cellhts2.
//////////////////////////////////////////////////////////////////////
orgDir=getwd()
setwd("/temp/cellHTS2/JOB8905381608534530143")
Indir="/temp/cellHTS2/JOB8905381608534530143"
zz <- file("/temp/cellHTS2/JOB8905381608534530143_RUN4416107305046484487/R_OUTPUT.TXT", open="w")
sink(file=zz,type="message" )
Name="test"
Outdir_report="/temp/cellHTS2/JOB8905381608534530143_RUN4416107305046484487"
LogTransform=FALSE
PlateList="Platelist.txt"
Plateconf="PlateConfig.txt"
Description="Description.txt"
NormalizationMethod="Bscore"
NormalizationScaling="additive"
VarianceAdjust="byPlate"
SummaryMethod="mean"
Screenlog="Screenlog.txt"
Score="zscore"
Annotation="GeneIDs.txt"
library(cellHTS2)
x=readPlateList(PlateList, name = Name, path = Indir)
x=configure(x, descripFile=Description, confFile=Plateconf, logFile=Screenlog,path=Indir)
xn=normalizePlates(x, scale =NormalizationScaling , log =LogTransform,method=NormalizationMethod, varianceAdjust=VarianceAdjust)
comp=compare2cellHTS(x, xn)
xsc=scoreReplicates(xn, sign = "-", method = Score)
xsc=summarizeReplicates(xsc, summary = SummaryMethod)
scores=Data(xsc)
ylim=quantile(scores, c(0.001, 0.999), na.rm = TRUE)
xsc=annotate(xsc, geneIDFile = Annotation)
out=writeReport(raw = x, normalized = xn, scored = xsc, outdir = Outdir_report, force = TRUE, settings = list(xrange = c(0.5,3),zrange = c(-4, 8), ar = 1))
setwd(orgDir)
sink()
//////////////////////////////////////////////////////////////////////
I chose Bscore as the normalization method and "byPlate" for
VarianceAdjust.
One of my questions is: after cellhts2 does the Bscore
normalization/smoothing of the plate, does it then take those values
and calculate Zscores?
You have been very helpful, and I really appreciate it.
Thanks,
Meraj
From: Joseph Barry [mailto:joseph.barry@embl.de]
Sent: Monday, July 16, 2012 12:22 AM
To: Meraj Aziz
Subject: Re: Question regarding cellhts2 output
Dear Meraj,
Wolfgang has already replied to your questions on the mailing list.
Please make sure that you are properly subscribed: bioconductor at
r-project.org
Best wishes,
Joseph Barry
On Jul 14, 2012, at 1:07 AM, <maziz at tgen.org> wrote:
Thank you for responding. Somehow I did not receive your reply email
and I got to your response to my question when I was searching for a
solution online.
So given the variances are accounted for:
According to wikipedia:
FPR = FP/(FP+TN)
Suppose I have 50 wells of "-ve" controls in total across all plates
and 20 (TP) show up in the "hit list".
This will give me True Positive Rate (TPR) sensitivity:
TPR = TP/(TP+FN)
TPR= 20/(20+30) = 0.4
I am not sure how to translate that to an FPR since I do not know the FPs
and TNs.
If we had done a confirmation screen, we could have found
the false positives and true negatives.
Am I on the right track?
meraj
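A note on the arithmetic above, offered as a sketch rather than a definitive answer. Wells designated as negative controls are expected to show no effect, so under that assumption a negative control that lands in the hit list would usually be counted as a false positive (FP) and the remaining negative controls as true negatives (TN). With the counts from the message (50 negative controls, 20 of them in the hit list), an empirical FPR follows directly. Estimating the TPR, by contrast, would need wells known to be active, such as positive controls. The variable names below are illustrative only.

```r
# Counts taken from the message above; variable names are illustrative.
n_neg_controls <- 50   # negative-control wells across all plates
n_neg_in_hits  <- 20   # negative controls that appear in the hit list

# Assumption: negative controls are truly inactive, so a called hit is an FP
# and an uncalled negative control is a TN.
fp <- n_neg_in_hits
tn <- n_neg_controls - n_neg_in_hits

# FPR = FP / (FP + TN), as in the formula quoted above.
fpr <- fp / (fp + tn)
print(fpr)
```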
From: Meraj Aziz
Sent: Saturday, June 16, 2012 11:49 PM
To: 'Joseph Barry'
Cc: 'bioconductor at r-project.org'
Subject: RE: Question regarding cellhts2 output
I am using CellHTS2 to calculate Bscores. My experiment has only one
replicate.
There are approx 900 genes (x4 siRNA).
From: Meraj Aziz
Sent: Saturday, June 16, 2012 5:05 PM
To: 'Joseph Barry'
Cc: 'bioconductor at r-project.org'
Subject: RE: Question regarding cellhts2 output
Hi,
Is there a way to calculate the False Discovery Rate (FDR) for an RNAi
experiment?
Thanks,
Meraj
From: Joseph Barry [mailto:joseph.barry@embl.de]
Sent: Wednesday, June 13, 2012 2:45 PM
To: Meraj Aziz
Subject: Re: Question regarding cellhts2 output
Hi Meraj,
Yes, that would be great. Thanks for being understanding.
Best wishes,
Joseph
On Jun 13, 2012, at 7:49 PM, <maziz at tgen.org> wrote:
So next time I ask a question I will include bioconductor at
r-project.org in my CC.
I apologize for this.
Thanks,
Meraj
From: Joseph Barry [mailto:joseph.barry@embl.de]
Sent: Wednesday, June 13, 2012 2:55 AM
To: Meraj Aziz
Subject: Re: Question regarding cellhts2 output
Hi Meraj,
The negative/positive controls are defined by the user, and their
"significance" varies greatly from experiment to experiment. Some have
no negative controls, others do. It depends on experimental design.
Most of the time they are for quality control, as you say. However,
the normalization method "negatives" does make use of this
information. See the package documentation for further details.
As regards the assignment of probabilities, I would not interpret the
Z or Bscores in this way. Each well can be viewed as being independent
from the others (again depending on exp design) so you are not really
sampling in the way you are suggesting. The setting of a threshold is
usually an arbitrary choice based on the data. It is fine to just
state your threshold and present the results directly.
I am happy to answer any further questions, should you have any.
However, it would be great if you could send any such questions out
through the bioconductor mailing list so that other users may
contribute to the discussion and benefit from the commentary.
Many thanks,
Joseph
On Jun 13, 2012, at 1:46 AM, <maziz at tgen.org> wrote:
Hi Joseph
One more question regarding CellHTS2 concerns the use of negative controls.
When I run CellHTS2 with Bscore normalization, what is the
significance of the negative controls on the plates? Are negative and
positive controls only for quality control and visualization purposes,
or are they actually used somehow in the Bscore calculations?
Thanks,
Meraj
From: Meraj Aziz
Sent: Tuesday, June 12, 2012 1:12 PM
To: 'Joseph Barry'
Subject: RE: Question regarding cellhts2 output
Hi Joseph,
Thanks for your reply.
The reason I was interested in knowing whether my screen is normally
distributed is that I want to use the Bscores (assuming the scores are
standard deviations from the median) to assign a probability to each siRNA
(using something like Zscore-to-probability tables/calculators).
Converting a Z/Bscore to a probability should give the probability
that the observed siRNA effect arose by chance.
For example:
At threshold "2" (Bscore) the probability is 0.023, or about 2.3%.
This means that the probability of an siRNA giving the
observed effect by chance is less than about 2.3%.
For threshold "3" the probability is 0.00135, or 0.135%.
For that I need to be sure whether our assumption of normality holds or
not.
I hope I am interpreting the results from CellHTS2 Bscore
normalization the right way.
Our aim is to justify why we are using a particular Bscore cutoff.
Thanks,
Meraj
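The tail probabilities quoted above can be reproduced with base R's `pnorm`. This is only a sketch: it assumes the scores follow a standard normal distribution, which may not hold for a given screen, and the variable names are illustrative.

```r
# One-sided upper-tail probability that a standard normal variable
# exceeds a given threshold. Normality of the scores is an assumption here.
thresholds <- c(2, 3)
tail_prob <- pnorm(thresholds, lower.tail = FALSE)

# Roughly 0.023 for a threshold of 2 and 0.00135 for a threshold of 3.
print(signif(tail_prob, 3))
```

If the score distribution has heavier tails than a normal distribution, these values would understate how often such scores occur by chance.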
From: Joseph Barry [mailto:joseph.barry@embl.de]
Sent: Tuesday, June 12, 2012 12:07 PM
To: Meraj Aziz
Subject: Re: Question regarding cellhts2 output
Hi Meraj,
The density plot just shows the distribution of scores for your screen
and conveniently marks the positions of positive/negative controls.
Your screen is not fully normal as it does not have the classical bell
shape. However I would not read too much into whether a screen is
normally distributed or not. Scores which seem to break the trend
(such as your SMG1) tend to lie further from the line on the Q-Q plot
but I would not waste too much time looking at this. They are
primarily for quality control, to check that the distribution does not
look "funny".
Best wishes,
Joseph
On Jun 12, 2012, at 8:18 PM, <maziz at tgen.org> wrote:
Hi Joseph
The Q-Q plot gives a way of testing the normality of our RNAi
distribution.
Attached is my screen's Q-Q plot.
What does the density plot imply, and is my screen normally
distributed?
You have been really helpful.
Thanks,
Meraj
From: Joseph Barry [mailto:joseph.barry@embl.de]
Sent: Monday, June 11, 2012 12:55 PM
To: Meraj Aziz
Subject: Re: Question regarding cellhts2 output
Hi Meraj,
My apologies, I had not spotted the line:
xsc=scoreReplicates(xn, sign = "-", method = Score)
, which is calculating the zscore at this stage and multiplying by -1.
This is absolutely fine.
Therefore I don't think there is anything wrong with your analysis. I
would not be concerned that you get a score of -76 s.d. This is
perfectly reasonable, given that the standard deviation is ~0.07, i.e.
the scores seem high simply because you divide by a small number.
Hope this helps,
Joseph
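To illustrate the point about dividing by a small number: the normalized value of -5.34 below is hypothetical, chosen only so that dividing it by the standard deviation of about 0.07 mentioned above lands near -76. It is not taken from the actual screen.

```r
# Hypothetical normalized value; NOT from the real data set.
normalized_value <- -5.34
# Standard deviation of roughly 0.07, as reported in the message above.
plate_sd <- 0.07

# A modest deviation becomes a large score when the spread is small.
score <- normalized_value / plate_sd
print(round(score, 1))
```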
On Jun 11, 2012, at 9:26 PM, Joseph Barry wrote:
Hi Meraj,
I noticed in your output that
VarianceAdjust="none"
so I guess that you have not divided by the MAD (or standard
deviation) using cellHTS2, but have rather done this as a
post-processing step?
Can you check that you have not made a mistake in calculating the
zscore? In R, I quickly manually divided by MAD and obtained a more
conservative range:
range(x$normalized_r1_ch1/mad(x$normalized_r1_ch1, na.rm=TRUE), na.rm=TRUE)
[1] -12.46033 63.93462
The median is zero, as it should be, so the subtraction of the median
is working fine.
As a solution, I recommend you reanalyze your data with the
VarianceAdjust="byPlate" option turned on.
Best wishes,
Joseph
On Jun 11, 2012, at 9:04 PM, <maziz at tgen.org> wrote:
Attached is my output from CellHTS2.
So I was interested in gene "SMG1", and at a cutoff of "-2 BScore"
I get all 4 siRNA, which is good. But the score in the negative direction
goes down to -76.26 standard deviations, which seems a lot.
My parameters are as follows:
orgDir=getwd()
setwd("/temp/cellHTS2/JOB5676000587616137010")
Indir="/temp/cellHTS2/JOB5676000587616137010"
zz <- file("/temp/cellHTS2/JOB5676000587616137010_RUN1370378843309182339/R_OUTPUT.TXT", open="w")
sink(file=zz,type="message" )
Name="SCNA_with_pos_ctrl"
Outdir_report="/temp/cellHTS2/JOB5676000587616137010_RUN1370378843309182339"
LogTransform=FALSE
PlateList="Platelist.txt"
Plateconf="PlateConfig.txt"
Description="Description.txt"
NormalizationMethod="Bscore"
NormalizationScaling="additive"
VarianceAdjust="none"
SummaryMethod="mean"
Screenlog="Screenlog.txt"
Score="zscore"
Annotation="GeneIDs.txt"
library(cellHTS2)
x=readPlateList(PlateList, name = Name, path = Indir)
x=configure(x, descripFile=Description, confFile=Plateconf, logFile=Screenlog,path=Indir)
xn=normalizePlates(x, scale =NormalizationScaling , log =LogTransform,method=NormalizationMethod, varianceAdjust=VarianceAdjust)
comp=compare2cellHTS(x, xn)
xsc=scoreReplicates(xn, sign = "-", method = Score)
xsc=summarizeReplicates(xsc, summary = SummaryMethod)
scores=Data(xsc)
ylim=quantile(scores, c(0.001, 0.999), na.rm = TRUE)
xsc=annotate(xsc, geneIDFile = Annotation)
out=writeReport(raw = x, normalized = xn, scored = xsc, outdir = Outdir_report, force = TRUE, settings = list(xrange = c(0.5,3),zrange = c(-4, 8), ar = 1))
setwd(orgDir)
sink()
Any comments from you will really help guide me in the right
direction.
meraj
From: Joseph Barry [mailto:joseph.barry@embl.de]
Sent: Monday, June 11, 2012 11:53 AM
To: Meraj Aziz
Cc: bioconductor at r-project.org
Subject: Re: Question regarding cellhts2 output
Hi Meraj,
One clarification: the Bscore method in cellHTS2 does not
automatically divide by the MAD. One must explicitly specify
varianceAdjust="byPlate" to enforce this.
Best wishes,
Joseph
On Jun 11, 2012, at 8:37 PM, Joseph Barry wrote:
Hi Meraj,
I would recommend that you use the method="median" and
varianceAdjust="byPlate" (or alternatively "byExperiment" or
"byBatch", depending on the context) options to normalizePlates. This
will subtract the median and divide by the median absolute deviation
(MAD), which is slightly more robust than the classical zscore, where
one subtracts the mean and divides by the standard deviation.
The Bscore normalization method subtracts the plate median and divides
by the plate MAD, but also applies a two-way median polish to correct
for row and column effects. Thus it is essentially a zscore with a few
more bells and whistles attached, if you will. The references at the
bottom of the ?Bscore documentation explain this in more detail and
will help you to decide whether or not this is appropriate for your
data.
(cc'd to the bioconductor mailing list for future googlers :) )
Best wishes,
Joseph
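The difference between these scores can be sketched in base R on a simulated plate. This is a rough illustration and not cellHTS2's own implementation: the 8 x 12 plate, the artificial row gradient, and all variable names are assumptions made for the example, and `medpolish` from the stats package stands in for the two-way median polish.

```r
# Simulated 8 x 12 plate (96 wells); values are synthetic, not screen data.
set.seed(1)
plate <- matrix(rnorm(96, mean = 1, sd = 0.1), nrow = 8, ncol = 12)

# Add an artificial row gradient (recycled down each column) to mimic
# a positional effect on the plate.
plate <- plate + seq(0, 0.35, length.out = 8)

# Classical z-score: subtract the mean, divide by the standard deviation.
classic_z <- (plate - mean(plate)) / sd(plate)

# Robust z-score: subtract the plate median, divide by the plate MAD.
robust_z <- (plate - median(plate)) / mad(plate)

# Two-way median polish estimates and removes row and column effects;
# the residuals, scaled by their MAD, give a B-score-like value.
mp <- medpolish(plate, trace.iter = FALSE)
b_like <- mp$residuals / mad(mp$residuals)

# Estimated row effects should roughly track the gradient added above.
print(round(mp$row, 2))
```

In this sketch the MAD scaling is applied by hand. In cellHTS2 itself the corresponding division is controlled through the varianceAdjust argument of normalizePlates, as clarified in the follow-up message in this thread.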
On Jun 11, 2012, at 8:11 PM, <maziz at tgen.org> wrote:
Hi Joseph
I have a question regarding the scores generated by cellhts2.
I would really appreciate it if you could answer it.
In your paper
http://genomebiology.com/content/pdf/gb-2006-7-7-r66.pdf
you mention zscore as the basis of your score. The online
cellhts2 does not have a zscore normalization mechanism/option.
My questions are:
1) How can I choose only zscore normalization?
2) If I choose Bscore normalization, is the score really the number of
standard deviations from the mean/median?
In the R_OUTPUT file I see:
NormalizationMethod="Bscore"
Score="zscore"
(what exactly does this imply)
Thank you for your help
Meraj
<cellhts2_output_scna_project.xlsx>
<density.pdf><qqplot.pdf>
_______________________________________________
Bioconductor mailing list
Bioconductor at r-project.org
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives:
http://news.gmane.org/gmane.science.biology.informatics.conductor