I have an Affymetrix library for which I compute the fold change
and PA call for every probe with Bioconductor.
I found cases where a probe may have a high fold change (>= 5)
but the PA call is 0.
My question is, how can we interpret such cases?
Should we assign the probe fold change to be 0 or 1?
We need to assign values because later we will
perform hierarchical clustering with R hclust().
If we assign "NA" we will have problems with clustering later.
G.V.
Hi,
On Fri, Dec 6, 2013 at 8:53 PM, Gundala Viswanath <gundalav at gmail.com> wrote:
> I have an Affymetrix library which I compute the fold change
> and PA call for every probe with Bioconductor.
Without you providing any code to show us what you mean, it's hard to
guess what you have done. Bioconductor is a large ecosystem of
packages, and there are many different ones that could have been used
to process an "affymetrix library" and compute fold changes.
So, please provide the code (or more specific detail) to show us how
your data was processed.
Also, by "PA call" I will assume that you mean "present / absent"
call, but for future reference it is very common to first use the
whole word/phrase before you use an acronym for it, even if you think
the acronym is extremely obvious.
> I found cases where probe may have a high fold change (>= 5)
> but the Pa call is 0.
>
> My question is, how can we interpret such cases?
Confused -- Are you computing the fold change for a probe
(do you mean "probe set"?) between two arrays, where only one of the
arrays has an absent call? Or do both arrays have an "absent" call?
Where is this fold change coming from? This is where better detail on
how you processed the data (as asked for above) would be helpful.
> Should we assign the probe fold change to be 0 or 1?
>
> We need to assign values because later we will
> perform hierarchical clustering with R hclust().
> If we assign "NA" we will have problem with clustering later.
If you *really* have missing (NA) data, one option you can consider is
to impute the missing values so that you can perform downstream
analyses that require complete data.
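
As a rough illustration of the imputation idea, here is a minimal sketch
in base R. It assumes a small, made-up matrix of log2 fold changes and
uses simple row-mean imputation; the function name impute_row_mean and
all values are hypothetical. A k-nearest-neighbour method such as
impute.knn() from the Bioconductor package impute may well be a better
choice for real data.

```r
# Toy log2 fold-change matrix: rows are probe sets, columns are comparisons.
# All values are made up for illustration.
set.seed(1)
lfc <- matrix(rnorm(40), nrow = 8,
              dimnames = list(paste0("probeset_", 1:8), paste0("cmp_", 1:5)))
lfc[2, 3] <- NA
lfc[5, 1] <- NA

# Hypothetical helper: replace each NA with the mean of the observed
# values in the same row.
impute_row_mean <- function(m) {
  row_means <- rowMeans(m, na.rm = TRUE)
  missing <- which(is.na(m), arr.ind = TRUE)
  m[missing] <- row_means[missing[, "row"]]
  m
}

lfc_complete <- impute_row_mean(lfc)

# No missing values remain, so the distances are all defined.
hc <- hclust(dist(lfc_complete), method = "average")
```

Row-mean imputation is only the simplest possible choice here; it pulls
the imputed probe sets towards their own average and so can understate
differences between comparisons.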
--
Steve Lianoglou
Computational Biologist
Genentech
Yes. PA call means present/absent call.
My main question is actually:
if a probe has a PA call of zero,
what is a reasonable expression value to assign to it?
NA or zero?
> > I found cases where probe may have a high fold change (>= 5)
> > but the Pa call is 0.
> >
> > My question is, how can we interpret such cases?
>
> Confused -- Are you computing the fold change for a probe
> (do you mean "probe set"?) between two arrays, where only one of the
> arrays has an absent call? Or do both arrays have an "absent" call?
Correct. Both arrays have absent calls. When both PA=0,
I assign PA=0 for that fold change.
>
> Where is this fold change coming from? This is where better detail on
> how you processed the data (as asked for above) would be helpful.
From two arrays, control and test sample.
G.V.
Hi,
On Sat, Dec 7, 2013 at 3:28 PM, Gundala Viswanath <gundalav at gmail.com> wrote:
> Yes. PA call means present and absent calls.
>
> My main questions is actually
> If a probe have a pa call zero.
> What is the reasonable expression value we can assign to it?
> NA or zero.
Since you didn't directly answer my question(s), I'll return the favor
;-)
At this stage of the "affy array game," I think you'd need a very good
reason to choose to preprocess your data with something other than
RMA.
So, since RMA doesn't use P/A calls, and pre-processing your data w/
RMA would give you something else besides either NA or 0, I wouldn't
pick either option you propose.
How about simply RMA-normalizing your data and using that output for
downstream analysis (clustering, or whatever else you're trying to
do)?
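
A minimal sketch of what that could look like, under some assumptions:
the helper rma_log2_matrix() is hypothetical and is not called here
(it needs the affy package and a directory of CEL files), and the
expression matrix below is simulated as a stand-in for RMA output, with
three control and three test arrays that are not part of your
description.

```r
# Hypothetical helper: read every CEL file in `cel_dir` and return the
# RMA-summarized log2 expression matrix (probe sets x arrays).
# Requires the Bioconductor package 'affy'; not called in this sketch.
rma_log2_matrix <- function(cel_dir) {
  raw <- affy::ReadAffy(celfile.path = cel_dir)
  Biobase::exprs(affy::rma(raw))
}

# Stand-in for RMA output so the rest runs without CEL files.
# Values are simulated, not real data.
set.seed(42)
expr <- matrix(rnorm(60, mean = 8, sd = 1.5), nrow = 10,
               dimnames = list(paste0("probeset_", 1:10),
                               c(paste0("control_", 1:3),
                                 paste0("test_", 1:3))))

# RMA values are on the log2 scale, so a log2 fold change is a
# difference of group means; every probe set gets a finite value.
log2_fc <- rowMeans(expr[, 4:6]) - rowMeans(expr[, 1:3])

# Cluster the probe sets on their expression profiles.
hc_probesets <- hclust(dist(expr))
```

Because every probe set gets a numeric value this way, there is nothing
to set to NA or 0 before calling hclust().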
If you still *really* want an answer on how to deal with the absent
calls, you will find many hits (some of them publications) if you run a
Google search for:
"rma present absent calls"
HTH,
-steve
--
Steve Lianoglou
Computational Biologist
Genentech