Daniel Brewer
This is not strictly a Bioconductor question, but I use Bioconductor in
the downstream processing and someone might have had a similar
experience.
I use "apt-probeset-summarize" to produce exon-level and gene-level
signals. Probesets are assigned to a gene or exon according to the
evidence supporting the association; I use the "core" grouping.
This grouping is defined by two files: a probeset file (ps), which is
simply a list of identifiers, and a meta-probeset file (mps), which has
four columns:
1) probeset_id
2) transcript_cluster_id (always the same as 1)
3) probeset_list (list of probesets associated with the transcript
cluster)
4) probe_count (the total number of probes)
I might be confused about the true meaning of the meta-probeset file,
but from what I can see, a probeset in a particular grouping that is
associated with a gene should appear in both the mps and the ps files.
This does not appear to be the case. For example, take the PTEN gene
(3256689).
The mps file (HuEx-1_0-st-v2.r2.dt1.hg18.core.mps) has the following line:
3256689 3256689 3256702 3256703 3256704 3256705 3256740 3256780 24
i.e. there are 6 probesets associated (3256702, 3256703, 3256704,
3256705, 3256740 & 3256780).
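To make my reading of that line explicit, here is a minimal Python sketch of how I am splitting it into the four columns. It assumes the fields are whitespace-separated as the line appears in this post (the real file may use tabs between columns), and the function name parse_mps_line is just something I made up for illustration.

```python
def parse_mps_line(line):
    """Split one meta-probeset line into its four columns.

    Assumption: the first two fields are the IDs, the last field is the
    probe count, and everything in between is the probeset list.
    """
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("expected at least four fields: %r" % line)
    return {
        "probeset_id": fields[0],
        "transcript_cluster_id": fields[1],
        "probeset_list": fields[2:-1],
        "probe_count": int(fields[-1]),
    }


# The PTEN line quoted above
pten = parse_mps_line(
    "3256689 3256689 3256702 3256703 3256704 3256705 3256740 3256780 24"
)
print(len(pten["probeset_list"]), pten["probe_count"])  # prints "6 24"
```

Read this way, the 24 probes are spread over 6 probesets.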
Using NETAFFX, or HuEx-1_0-st-v2.r2.dt1.hg18.core.ps combined with
HuEx-1_0-st-v2.na21.hg18.probeset.csv, suggests that there are 23 core
probesets associated with this gene:
"3256702","3256703","3256704","3256705","3256706",
"3256707","3256708","3256709","3256710",
"3256711","3256725","3256736","3256738",
"3256739","3256740","3256764","3256767",
"3256772","3256773","3256777","3256778","3256779" & "3256780".
This difference could significantly affect the gene summary results.
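To make the discrepancy concrete, here is a small Python sketch of the comparison using plain set arithmetic. The IDs are copied from this post; the variable names are only illustrative, and in practice the two sets would be read from the mps file and from the ps/annotation files rather than typed in.

```python
# Probesets listed for PTEN (3256689) in the core mps file
mps_probesets = {
    "3256702", "3256703", "3256704", "3256705", "3256740", "3256780",
}

# Core probesets for the same gene according to the ps + annotation files
annotated_core = {
    "3256702", "3256703", "3256704", "3256705", "3256706",
    "3256707", "3256708", "3256709", "3256710",
    "3256711", "3256725", "3256736", "3256738",
    "3256739", "3256740", "3256764", "3256767",
    "3256772", "3256773", "3256777", "3256778", "3256779", "3256780",
}

# Annotated as core but absent from the mps line
missing = sorted(annotated_core - mps_probesets)
# In the mps line but not annotated as core (expected to be empty)
extra = sorted(mps_probesets - annotated_core)

print(len(missing), "probesets missing from the mps line:", missing)
print("probesets only in the mps line:", extra)
```

So every probeset in the mps line is also in the annotated core set, but 17 of the 23 annotated core probesets do not appear in the mps line.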
Does anyone know whether this discrepancy is intentional and, if so,
why?
Am I using the correct mps file?
Thanks
--
**************************************************************
Daniel Brewer, Ph.D.
Institute of Cancer Research
Email: daniel.brewer at icr.ac.uk
**************************************************************