On Mon, Jun 17, 2013 at 7:09 AM, Renaud Gaujoux
<renaud at mancala.cbio.uct.ac.za> wrote:
> Hi,
>
> I am getting incorrect feature annotation data when loading a dataset from
> GPL4133.
> The feature data looks like this:
>
> head(fData(eset)[, 1:2])
>        ID  COL
> 12     12  266
> NA   <NA> <NA>
> NA.1 <NA> <NA>
> 15     15  266
> 16     16  266
> NA.2 <NA> <NA>
>
> This possibly also results in having fewer features in the final expression
> matrix, if it is at some point restricted to feature names matching the
> ones in the loaded annotation data.
>
> The real issue here seems to be with the soft file being badly formatted,
> with lines having double quotes where there should not be:
>
> 12 266 148 A_24_P66027 A_24_P66027 FALSE
> NM_004900 NM_004900 9582 APOBEC3B apolipoprotein B
> mRNA editing enzyme, catalytic polypeptide-like 3B" Hs.226307 ...
>
> Looking at the way GEOquery loads the annotation soft files, we see that
> they are read using `quote="\""`, which clearly returns a messed up
> data.frame.
Thanks, Renaud, for the report. I finally got around to making this
adjustment, so this should work for you now.
Sean
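
For readers hitting the same symptom, here is a minimal sketch of the quoting
problem described in the report. It does not reproduce GEOquery's own parsing
code, which is not shown in this thread; base R's `read.delim` stands in for
it. The four-column sample, its column names, and the second and third data
rows are made up for illustration; only the stray trailing double quote
mirrors the GPL4133 line quoted above.

```r
# Hypothetical tab-separated sample (not the real GPL4133 table). The first
# data row ends its description field with a stray, unbalanced double quote.
lines <- c(
  "ID\tCOL\tDESC\tUNIGENE",
  "12\t266\tapolipoprotein B mRNA editing enzyme 3B\"\tHs.226307",
  "15\t266\tsome other gene\tHs.000001",
  "16\t266\tyet another gene\tHs.000002"
)

# With quote = "\"" the parser treats double quotes as field delimiters, so an
# unbalanced one can corrupt the rows that follow. The exact damage depends on
# the data, so warnings and errors are caught here rather than assumed.
bad <- tryCatch(
  suppressWarnings(read.delim(text = lines, quote = "\"",
                              stringsAsFactors = FALSE)),
  error = function(e) NULL
)

# With quote = "" double quotes are ordinary characters and every row is kept.
good <- read.delim(text = lines, quote = "", stringsAsFactors = FALSE)

print(nrow(good))
print(good$DESC[1])
```

Disabling quote processing is a reasonable default for tab-separated
annotation tables, where fields are delimited by tabs alone and any quote
character is part of the data.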