Ramil Nurtdinov
Dear colleagues,

My experience with R/Bioconductor and the Affymetrix Human Exon 1.0 ST
array started with the oligo package. Unfortunately, for my 19 HuExon 1.0
arrays R asks for approximately 6-7 gigabytes of memory, whereas the RMA
algorithm in Affymetrix Expression Console takes 40 minutes on my Sony
Vaio notebook. Second, there was no good annotation for this chip in R,
except X:Map, my competitor for the paper :))

So I solved the first problem with Expression Console, and for the
second problem we developed PLANdbAffy:
http://nar.oxfordjournals.org/content/38/suppl_1/D726.long

I am now finishing the Ensembl plus hg19 version of the database. I
understand that Bioconductor is widely used in the scientific world, but
my workload is rather heavy because of many new projects.
If somebody gives me the annotation format, I can make the
corresponding database summary file.

Yours sincerely,
Ramil Nurtdinov, PhD
On 12/7/10, B.Misovic at lumc.nl <b.misovic at lumc.nl> wrote:
> Dear Ramil,
>
>
>
> I see I forgot to add you to the email below, which I've sent to the
> Bioconductor mailing list and our collaborators in Poland... just in
> case you have some comments.
>
>
>
> Best,
>
> Branko
>
>
>
> ________________________________
>
> From: Misovic, B. (TOXGEN)
> Sent: 07 December 2010 15:09
> To: 'roman.jaksik at polsl.pl'; 'bioconductor at r-project.org'
> Cc: 'cstrato'
> Subject: PLANdbAffy + Alternative Exon Annotation +XPS,aroma,oligo,RMAExpress
>
>
>
> Dear Roman, all
>
>
>
> Recently we tried your version of the annotation files for the Gene 1.0
> ST array that your team built from the PLANdbAffy DB. I encountered some
> problems, so I hope you can help.
>
>
>
> You provide nice CDF and Affy PGF/CLF files, but the PGF/CLF files were
> not useful in the Bioconductor packages for Affy Exon/Gene type arrays,
> namely oligo & XPS, as they require an annotation file in csv format. I
> tried the annotation csv file from Affymetrix and after that the one
> from the PLANdbAffy DB. The PLANdbAffy csv file is very different from
> the Affymetrix one, so import is not possible (actually the csv file on
> the website is TAB delimited instead of comma delimited, so the problem
> already starts there, and it requires reformatting).
>
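
The delimiter part of the reformatting mentioned above can be sketched in a few lines of Python. This is only a minimal illustration: the function name and the sample rows are made up, and it does not address the larger problem that the columns themselves differ from the Affymetrix ones.

```python
import csv
import io

def tsv_to_csv(tsv_text):
    """Re-serialise tab-delimited text as comma-delimited text."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    # The default csv.writer quoting puts quotes around any field that
    # itself contains a comma, so such fields survive the conversion.
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()

# Hypothetical rows for illustration only, not real PLANdbAffy content.
sample = "Probe\tProbe_Sets\tGene\nP1\tPS1\tGENE_A, alias\n"
converted = tsv_to_csv(sample)
print(converted)
```

Going through a csv reader and writer, rather than replacing tabs with commas directly, matters as soon as a field contains a comma of its own.
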
> Christian from XPS was kind enough to inform me that:
>
>
>>... PLANdbAffy annotation columns have nothing to do with the
>>Affymetrix annotation columns. Thus xps will not read these annotation
>>files. Alternative annotation files must contain exactly the same
>>columns as the Affymetrix annotation files.
>
>
>
>>For whole genome and exon arrays it is not possible to use only the
>>PGF-files w/o the annotation files, since I extract most of the
>>important information from the probeset-annotation file first, so this
>>file is absolutely essential. For example, column "level" contains the
>>information Core/Extended/Full, see the corresponding annotation README
>>files for an explanation of all columns.
>
>
>
>>xps error you get simply says that their PGF-file does not contain the
>>AFFX controls, so maybe adding the AFFX controls to their PGF-file
>>might help. However, as you mention, they use their own Probesetids,
>>which will not match the Probesetids of the Affymetrix annotation
>>files, thus it may not work anyhow.
>
>
>
>>It is not quite clear to me why they created their own PGF-file. The
>>Affymetrix PGF-file contains only 1-4 probes for each probeset, where
>>each exon consists of one or more probesets, thus the probability that
>>a probe within a probeset is not correct should be pretty small.
>>However, a probeset could be mapped to a wrong exon/gene or no gene at
>>all, so it should be sufficient to correct the Affymetrix annotation
>>files.
>
>
>
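
Christian's requirement that an alternative annotation file contain exactly the same columns as the Affymetrix one can be checked mechanically before trying an import. The sketch below is an assumption-laden illustration: it assumes comment lines start with "#", the helper names are invented, and apart from "level" the column names are placeholders rather than the real Affymetrix header.

```python
import csv
import io

def header_columns(csv_text, comment_prefix="#"):
    """Return the column names from the first non-empty, non-comment line."""
    lines = (ln for ln in io.StringIO(csv_text)
             if ln.strip() and not ln.startswith(comment_prefix))
    return next(csv.reader(lines))

def compare_headers(reference_text, candidate_text):
    """Report which reference columns are missing or extra in the candidate."""
    ref = header_columns(reference_text)
    cand = header_columns(candidate_text)
    return {
        "missing": [c for c in ref if c not in cand],
        "extra": [c for c in cand if c not in ref],
        "same_order": ref == cand,
    }

# Placeholder files for illustration; only "level" is named in the thread.
reference = "#%comment line\nprobeset_id,seqname,level\n1,chr1,core\n"
candidate = "probeset_id,gene,level\n9,GENE_A,full\n"
report = compare_headers(reference, candidate)
print(report)
```

An empty "missing" list and a true "same_order" would be the minimum to expect before the file has any chance of being read in place of the original.
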
> Tools like RMAExpress, EC, and aroma.affymetrix can work with the CDF
> only. So after using RMAExpress (in command line mode) I did get an
> expression matrix out, but I could not link the 19532 probeset ids to
> the PLANdbAffy annotation csv file to collect basic gene information.
> What I did was first load the full annotation file (not filtered) from
> PLANdbAffy:
> http://affymetrix2.bioinf.fbb.msu.ru/files.html
>
> and then search the 2nd column (Probe_Sets) for the ids obtained after
> RMA, and I found 0... then I tried the 1st column (the Probes) and found
> 8664... but I would expect the reverse situation?
>
>
>
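
The lookup described above (searching each annotation column for the ids that come out of RMA) can be reproduced with a small script, which makes it easy to see at a glance which column the ids actually match. This is a sketch under assumptions: the function name is invented, the file is taken to be tab delimited with one header row, and the sample ids and rows are fabricated for illustration.

```python
import csv
import io

def count_id_matches(expression_ids, annotation_text, delimiter="\t"):
    """Count how many expression-matrix ids occur in each annotation column."""
    wanted = {i.strip() for i in expression_ids}
    reader = csv.reader(io.StringIO(annotation_text), delimiter=delimiter)
    header = next(reader)
    # Collect the distinct values seen in every column.
    seen = {name: set() for name in header}
    for row in reader:
        for name, value in zip(header, row):
            seen[name].add(value.strip())
    return {name: len(wanted & values) for name, values in seen.items()}

# Made-up annotation rows and ids, chosen to mimic the reported situation.
annotation = "Probe\tProbe_Sets\n100\tA1\n200\tA1\n300\tB2\n"
ids_after_rma = ["100", "300", "999"]
matches = count_id_matches(ids_after_rma, annotation)
print(matches)
```

A result where the probe column matches and the probe set column does not would suggest that the ids in the expression matrix are not the identifiers stored in the Probe_Sets column.
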
> So Roman, can you please:
> 1) advise how to get the real ids after an RMAExpress run?
> 2) say whether you plan to build an annotation csv file as Affymetrix
> does, so that other Bioconductor software (oligo & XPS) can use it?
> 3) comment on Christian's feedback.
>
>
>
> Btw. Christian, how come RMAExpress, EC, and aroma.affymetrix can work
> with CDFs only, while oligo & XPS require extra annotation? From what I
> gather (after peeking into the CDF and PGF files), they show which
> probes belong to each probe_set. So for probe_set level analysis (or
> more exon-like analysis) the PGF/CLF files alone seem to be enough?
>
>
>
> For the bioc list, just to bring attention to this article & DB:
>
>
>
> PLANdbAffy: probe-level annotation database for Affymetrix expression
> microarrays, Ramil N. Nurtdinov et al.
>
> http://nar.oxfordjournals.org/content/38/suppl_1/D726.full
>
>
>
> http://affymetrix2.bioinf.fbb.msu.ru/
>
>
>
> Maybe some of the BioC experts have comments about it?
>
>
>
> Best,
>
> Branko
>
>
>
> --------------------------
>
> Branislav Misovic,
>
> Department of Toxicogenetics
>
> Leiden University Medical Center
>
> Einthovenweg 20, 2333 ZC Leiden
>
> PO.box 9600, Building2,Room:T3-11
>
> 2300 RC Leiden
>
> The Netherlands
>
> Phone: +31 71 526 9636
>
> Mob: 0653135855
>
> E-mail:
>
> b.misovic at lumc.nl
>
> braniti at gmail.com
>
>
>
>