GEOquery: incomplete feature data from GPL soft file
@sean-davis-490
On Mon, Jun 17, 2013 at 7:09 AM, Renaud Gaujoux <renaud at mancala.cbio.uct.ac.za> wrote:

> Hi,
>
> I am getting incorrect feature annotation data when loading a dataset from
> GPL4133.
> The feature data looks like this:
>
> head(fData(eset)[, 1:2])
>        ID  COL
> 12     12  266
> NA   <NA> <NA>
> NA.1 <NA> <NA>
> 15     15  266
> 16     16  266
> NA.2 <NA> <NA>
>
> This possibly also results in having less features in the final expression
> matrix, if it is at some point restricted to feature names matching the
> ones in the loaded annotation data.
>
> The real issue here seems to be with the soft file being badly formatted,
> with lines having double quotes where there should not be:
>
> 12 266 148 A_24_P66027 A_24_P66027 FALSE
> NM_004900 NM_004900 9582 APOBEC3B apolipoprotein B
> mRNA editing enzyme, catalytic polypeptide-like 3B" Hs.226307 ...
>
> Looking at the way GEOquery loads the annotation soft files, we see that
> they are read using `quote="\""`, which clearly returns a messed up
> data.frame.

Thanks, Renaud, for the report. I finally got around to making this adjustment, so this should work for you now.

Sean
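For anyone wanting to see the quoting problem in isolation, here is a minimal sketch in base R that does not involve GEOquery at all. The toy table, its column names, and its values are made up for illustration and are only loosely modelled on the GPL4133 row quoted above. The thread does not say what form the adjustment in GEOquery took, so this only demonstrates the effect of the `quote` argument, not the actual fix.

```r
# Toy tab-separated table containing stray double quotes.
# All names and values here are hypothetical, chosen only to mimic the report.
txt <- paste(
  "ID\tCOL\tDESCRIPTION",
  "12\t266\tpolypeptide-like 3B\"",
  "13\t266\tsome other gene",
  "15\t266\tyet another gene\"",
  sep = "\n"
)

# Double quotes are treated as quoting characters: the stray quote on the
# first data row opens a quoted string that runs on into the following lines.
bad <- read.delim(text = txt, quote = "\"", stringsAsFactors = FALSE)

# Quoting disabled: each input line is parsed as its own row, and the
# stray quote characters are kept as literal text.
good <- read.delim(text = txt, quote = "", stringsAsFactors = FALSE)

# Compare how many rows each call produced.
nrow(bad)
nrow(good)
```

With quoting disabled, all three data rows should come back. With `quote = "\""`, the lines between the two stray quotes are expected to be absorbed into a single field, so fewer rows are returned.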
Annotation GEOquery