Benjamin Otto
Hi guys,
in principle the problem is how to compute a statistic for ultra-tiny
group sizes with paired samples.
Here is the Model:
-------------------------
Assumption 1:
A data set of microarrays consists of four classes describing the
disease phenotype: type 1, type 2, type 3 and a control group. Because
the type 2 and type 3 phenotypes of the disease are extremely rare,
there are only two samples in each of these two groups. The data set
consists of:
control: 8 samples
type 1: 7 samples
type 2: 2 samples
type 3: 2 samples
Assumption 2:
We assume that gender and age might have an influence on the
phenotype. Therefore the samples in the control group were selected so
that their age and gender match the samples in the other three groups.
Unfortunately, as the disease is so rare, the age and gender of the
patients in the groups are not all the same. So we end up with some
kind of semi-paired comparison: "paired" because for each type 1/2/3
sample we pick a control sample defined by age and gender, and "semi"
because the control sample does not really come from the same
patient.
We suppose (but that IS an assumption) that differences between
type1/2/3 samples and controls with non-matching age and gender might
naturally exhibit bigger (disease-unrelated) variance, so the
selection of the control-disease pairs is targeted.
In the end, each of the type 1/2/3 groups shall be compared with the
control group. As group 1 has 7 samples, a paired analysis is possible
there. The problem lies within groups 2 and 3.
Here is a suggested analysis approach:
------------------------------------------------------
As there is no real statistical test that can be applied to groups of
size 2, one idea is to introduce a bootstrapping approach in which, for
each gene, no test statistic but only the fold change is computed. The
fold changes computed over the resampled pairings form a distribution,
and the location of the native fold change (e.g. the mean fold change
for the correct pairs) within that distribution is used as the
significance statistic.
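To make the idea concrete, here is a minimal Python sketch for a single gene, assuming log2 expression values, two disease samples and eight controls. The function name `repairing_null`, the one-sided comparison and all numbers are made up for illustration. It is not a validated test, only one possible reading of the re-pairing scheme described above.

```python
from itertools import permutations


def repairing_null(disease, controls, observed_pairing):
    """Mean paired log2 fold change for every way of pairing each
    disease sample with a distinct control sample (illustrative only)."""
    n = len(disease)

    def mean_diff(pairing):
        # Equals the mean of the paired differences disease[i] - controls[c]
        return sum(disease) / n - sum(controls[c] for c in pairing) / n

    observed = mean_diff(observed_pairing)
    # All ordered selections of n distinct controls, including the observed one
    null = [mean_diff(p) for p in permutations(range(len(controls)), n)]
    # One-sided: fraction of pairings with a mean difference >= observed
    p_value = sum(d >= observed for d in null) / len(null)
    return observed, null, p_value


# Hypothetical log2 expression values for one gene (not real data)
disease = [9.1, 8.7]
controls = [7.0, 7.4, 6.9, 7.2, 7.1, 7.3, 6.8, 7.5]

# Assume the matched controls are the first two entries of `controls`
observed, null, p = repairing_null(disease, controls, (0, 1))
print(len(null), round(observed, 2), round(p, 3))
```

With 2 disease samples and 8 controls there are 8 * 7 = 56 ordered pairings. The mean difference depends only on which controls are drawn, not on which disease sample each is paired with, so there are at most 28 distinct values and the re-pairing distribution reflects variability among the controls only.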
Now here are the questions:
--------------------------------------
1) As the samples are "paired", is it at all convincing to dissolve
the pairings in order to perform a bootstrapping? Is such a
bootstrapping the correct approach for "paired" samples anyway in such
a case?
2) Should the samples of group 2/3 "only" be randomly re-paired with
control samples other than the initial ones? Or does it make more
sense to randomly assign the control and type 2/3 samples to the
groups?
3) If the samples were randomly assigned to the groups, does at least
one "disease" sample always have to remain in the type 2/3 group? Or
would it be legitimate in this case to use a permutation in which two
control samples are compared to two other control samples?
4) Any preferable idea how to calculate a statistic here?
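For questions 2 and 3 it may help to count how many distinct arrangements each scheme allows, since an exact one-sided permutation p-value cannot be smaller than 1 divided by that count. The sketch below uses the group sizes from the data set description. The scheme labels (a) to (d) are my own shorthand, not established terminology, and the counts say nothing about whether a scheme is statistically valid.

```python
from math import comb, perm

n_controls = 8  # size of the control group
n_disease = 2   # size of the type 2 (or type 3) group

# (a) Keep the two pairs fixed and only swap labels within each pair
sign_flips = 2 ** n_disease

# (b) Re-pair each disease sample with a distinct control sample
repairings = perm(n_controls, n_disease)

# (c) Pool all samples and redraw which ones are called "disease"
#     (this ignores the pairing)
relabelings = comb(n_controls + n_disease, n_disease)

# (d) As (c), but at least one true disease sample must stay in the
#     "disease" group, i.e. exclude the all-control draws
relabelings_keep_one = relabelings - comb(n_controls, n_disease)

for name, n in [("(a) sign flips", sign_flips),
                ("(b) re-pairings", repairings),
                ("(c) relabelings", relabelings),
                ("(d) relabelings, keep one", relabelings_keep_one)]:
    print(f"{name}: {n} arrangements, smallest one-sided p = {1 / n:.3f}")
```

Under these assumptions, scheme (a) gives only 4 arrangements, so a strictly paired permutation test cannot go below p = 0.25 with two pairs.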
Thanks and best regards,
Benjamin
___________________________________________
Benjamin Otto, PhD
University Medical Center Hamburg-Eppendorf
Institute For Clinical Chemistry / Central Laboratories
Campus Forschung N27
Martinistr. 52,
D-20246 Hamburg
Tel.: +49 40 7410 51908
Fax.: +49 40 7410 54971
___________________________________________