Hello,
I've read the help page and the vignette and I want to have the strand
information of the probes used, but I can't see what the required
format of the names of the vectors is. Can anyone provide a small
example of how it's done? It would be nice if there was a simple way
to do this based on a data.frame or GRanges object of probe
information, rather than having to create lots of specially
named vectors and mess around with environments. Also, the chr
argument to segChrom has to be an integer vector, which means X and Y
for human have to be artificially renamed as 23 and 24. The design
could be more streamlined.
- Dario.
--------------------------------------
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia
Dear Dario,
next time, please be clearer about which package you are referring to.
'probeAnno' objects are defined in at least two packages: 'tilingArray'
and 'Ringo'. (R does not assume that a class name is used by only one
package in the world; this is why R has namespaces.)
Assuming that you mean the tilingArray package, the 1st sentence of
Section 2 in the vignette says:
The function plotAlongChrom accepts an environment as its first
argument, which is expected to contain objects of class segmentation
with names given by paste(chr, c("+", "-"), sep="."), where chr is the
chromosome identifier.
So the strand is implied by the name of the segmentation object, e.g.
"1.+" and "1.-" correspond to the Watson and Crick strands of
chromosome 1, respectively.
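
The naming convention quoted above can be sketched in base R. This is only an illustration under stated assumptions: the chromosome identifiers are made up, and the entries are placeholder lists, whereas real entries would be objects of class segmentation.

```r
# Chromosome identifiers (illustrative assumption, not taken from the vignette).
chr <- c("1", "2")

# One name per chromosome/strand pair, built with the expression quoted above.
seg_names <- unlist(lapply(chr, function(ch) paste(ch, c("+", "-"), sep = ".")))
print(seg_names)

# A plain environment with one entry per name. The entries are placeholders;
# in real use each would be an object of class segmentation.
seg_env <- new.env()
for (nm in seg_names) {
  assign(nm, list(placeholder = TRUE), envir = seg_env)
}

# Look up the Watson (+) and Crick (-) strand entries of chromosome 1.
watson <- get("1.+", envir = seg_env)
crick  <- get("1.-", envir = seg_env)
```

The strand never appears as a separate field; it is recovered purely from the suffix of the object's name.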
Regarding the design: the tilingArray package was written in 2005-06.
It predates GRanges, the oligo package, and almost anything else of
that sort. The Ringo package was written later and corrected many of
the less streamlined aspects of the initial attempt. The IRanges
infrastructure was developed more recently, and is far more elegant
and performant.
Personally, I do not see much value in refactoring the old tiling
array code from the middle of the last decade, when most people have
since moved on to high-throughput sequencing, and those who still use
tiling arrays normally have lots of legacy code. However, you should
feel free to do so, and you could contribute a tiling array package
that you actually like :)
Best wishes
Wolfgang
On May/23/12 11:00 AM, Dario Strbenac wrote:
--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber
> Assuming that you mean the tilingArray package, the 1st sentence of
> Section 2 in the vignette says:
> The function plotAlongChrom accepts an environment as its first
> argument, which is expected to contain objects of class segmentation
> with names given by paste(chr, c("+", "-"), sep="."), where chr is
> the chromosome identifier.
But there is only one probeAnno class, defined in Ringo, right? Thanks
for the reference to the vignette, that's what I needed to know. I
assumed ?probeAnno was the best place to find out about creating an
object. My understanding is that vignettes show how the package works
on real data for an analysis, rather than defining how to use class
constructors that are not documented in the basic help files.
Another aspect I found unclear was that the first two parameters to
segChrom can be a matrix and a probeAnno object, and how the two are
linked. Without reading the source code of segChrom, the user wouldn't
know that the chrNumber.index items of a probeAnno object correspond
to the row numbers/names of the intensity matrix. That seems logical,
but it could be stated somewhere. I've made assumptions in the past
about how public software works, and it turned out it didn't work how
I assumed, so I'm wary of ambiguities nowadays.
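
The linkage described above can be sketched in base R. This is a toy illustration under stated assumptions: a plain list stands in for the probeAnno object, the element names follow the "chr.strand.start" pattern visible in this thread's output, and the ".index" suffix is taken from the description above rather than checked against the package source.

```r
# Hypothetical intensity matrix: 6 probes (rows) by 2 samples (columns).
intens <- matrix(seq_len(12), nrow = 6,
                 dimnames = list(paste0("probe", 1:6), c("s1", "s2")))

# A plain list standing in for a probeAnno object (assumption: a real object
# is accessed by the same kind of names). Each ".index" vector holds row
# numbers of 'intens', in the same order as the matching ".start" vector.
anno <- list(
  "chr1.+.start" = c(100L, 150L, 200L),
  "chr1.+.index" = c(2L, 5L, 1L),
  "chr1.-.start" = c(120L, 180L),
  "chr1.-.index" = c(4L, 6L)
)

# Intensities of the probes on the + strand of chr1, in genomic order.
y_plus <- intens[anno[["chr1.+.index"]], , drop = FALSE]
print(y_plus)
```

The point is that the annotation never stores intensities itself; it only stores which matrix rows belong to which chromosome and strand.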
> However, you should feel free to do so, and you could contribute a
> tiling array package that you actually like :)
Done it once before for my package - it was an adventure! I think I
might pass on your offer.
- Dario.
Also worth describing in ?probeAnno (the required sort order):
> probes
Error in validObject(object) :
  invalid class “probeAnno” object: 1: Probe matches are not sorted (in
  increasing order) by their middle position on chromosome chr1.- and
  possibly others.
invalid class “probeAnno” object: 2: Probe matches are not sorted (in
  increasing order) by their middle position on chromosome chr1.+ and
  possibly others.
invalid class “probeAnno” object: 3: Probe matches are not sorted (in
  increasing order) by their middle position on chromosome chr10.- and
  possibly others.
... ...
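
One way to satisfy the ordering named in the error message is to sort each chromosome/strand's probes by middle position before building the object. The following is a sketch in base R only; the data frame and its column names are hypothetical, and it does not call any Ringo or tilingArray function.

```r
# Hypothetical probe table for a single chromosome/strand.
probe_tab <- data.frame(
  index = 1:4,                          # row of the intensity matrix
  start = c(300L, 100L, 200L, 150L),
  end   = c(324L, 124L, 224L, 174L)
)

# Middle position of each probe match.
mid <- (probe_tab$start + probe_tab$end) / 2

# Reorder all columns together so start, end and index stay aligned.
probe_tab <- probe_tab[order(mid), ]

# Confirm the middle positions are now in increasing order.
sorted_mid <- (probe_tab$start + probe_tab$end) / 2
stopifnot(!is.unsorted(sorted_mid))
```

Sorting the whole table rather than one vector at a time keeps the index vector consistent with the reordered positions.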
Also, I need to write a workaround for segChrom for any chromosomes
that don't have probes on both strands (boutique design array).
... ...
Running 'segment' on chromosome chr12.+ ... complete
Running 'segment' on chromosome chr12.- ... complete
Running 'segment' on chromosome chr13.+Error in probeAnno[w] :
No mapping 'chr13.+.start' in this 'probeAnno' object.
> probes
A 'probeAnno' object holding the mapping between
reporters and genomic positions.
Chromosomes: chr1.- chr1.+ chr10.- chr10.+ chr11.- chr11.+ chr12.- chr12.+
chr14.- chr14.+ chr15.- chr15.+ chr16.- chr16.+ chr17.- chr17.+ chr18.- chr18.+
chr19.- chr19.+ chr2.- chr2.+ chr20.- chr20.+ chr21.- chr21.+ chr22.- chr22.+
chr3.- chr3.+ chr4.- chr4.+ chr5.- chr5.+ chr6.- chr6.+ chr7.- chr7.+ chr8.-
chr8.+ chr9.- chr9.+ chrX.- chrX.+ chrY.- chrY.+
Microarray platform:
Genome:
Gives a valid object, so segChrom should work with it.
- Dario
Sorry, bad example for the last part. Here's what I meant to show.
> segments <- segChrom(intensRBNorm, probes, chr = c("chr9", "chr10"),
    strands = c('+', '-'), nrBasesPerSegment = 500000, maxk = 500, step = 7)
Running 'segment' on chromosome chr9.+ ... complete
Running 'segment' on chromosome chr9.- ... complete
Running 'segment' on chromosome chr10.+Error in segment(ychr, maxseg = nsegs, maxk = maxk) :
  maxseg must be an integer of length 1 between 1 and nrow(y)=0
> head(probes["chr10.+.start"])
integer(0)
> head(probes["chr10.-.start"])
[1] 44872512 44872538 44872560 44872588 44872612 44872634
Only a few genes on the - strand of chr10 are tiled, and none on the +
strand. I'll get around it by creating one probeAnno object for the
chromosomes that have both strands tiled, one for those with only +,
and one for those with only -, and likewise for the intensity matrices.
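
The grouping step of that workaround could look roughly like the sketch below. It is base R only and rests on assumptions: a plain list stands in for the probeAnno object, the element names follow the "chr.strand.start" pattern shown in the output above, and has_probes is a hypothetical helper, not a function from either package.

```r
# Plain list standing in for the probeAnno object (toy positions, except the
# two chr10 - strand values copied from the output above).
anno <- list(
  "chr9.+.start"  = c(10L, 40L),
  "chr9.-.start"  = c(15L, 60L),
  "chr10.+.start" = integer(0),
  "chr10.-.start" = c(44872512L, 44872538L)
)
chrs <- c("chr9", "chr10")

# Hypothetical helper: TRUE if the chromosome/strand has at least one probe.
has_probes <- function(chr, strand) {
  v <- anno[[paste(chr, strand, "start", sep = ".")]]
  !is.null(v) && length(v) > 0
}

plus  <- vapply(chrs, has_probes, logical(1), strand = "+")
minus <- vapply(chrs, has_probes, logical(1), strand = "-")

# Three groups, each of which would get its own segChrom call.
both_strands <- chrs[plus & minus]
plus_only    <- chrs[plus & !minus]
minus_only   <- chrs[!plus & minus]
```

Each group could then be passed to segChrom with the matching strands argument, so that no call ever reaches a chromosome/strand with zero probes.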
I could be wrong, but as far as I remember from when I used
tilingArray, the probeAnno object was just a simple list or
environment. It was not wrapped in a class structure at all. I
suggest you look at the davidTiling package, which contains a
probeAnno object. It was pretty easy to figure out how the object
should be structured.
Kasper
On Fri, May 25, 2012 at 2:15 AM, Dario Strbenac
<d.strbenac at garvan.org.au> wrote: