Why does procset(files, procol) require exactly one column name?
klv2706w ▴ 10


I started using ArrayExpress today to read processed data from a recent study: https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6153/

I used the example from the getAE() documentation to retrieve and load the data I am interested in.

library(ArrayExpress)
## Download all files (raw and processed) for the example experiment
mexp1422 = getAE("E-MEXP-1422", type = "full")
## Build an ExpressionSet from the processed data
cnames = getcolproc(mexp1422)
MEXP1422proc = procset(mexp1422, cnames[2])

I am puzzled as to why procset() requires exactly one column name; I cannot see the reason for this restriction. It is also very slow for my dataset (19,000 single cells). I would like to speed it up by rewriting the code that reads the processed data with data.table, as I suggest here: https://github.com/vitkl/myArrayExpress/commit/53858134241a326dc73f4f70f7fd2622e1702ee5 (line 78). To do that, I need to know why procset() requires exactly one column name and whether I can safely remove this condition.
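
To make the idea concrete, here is a minimal sketch of the kind of data.table read I have in mind. It does not reproduce the code in procset() or in the linked commit. The file contents and the column names (ID, cell1, cell2, cell3) are made up for illustration, and I am assuming a tab-delimited processed file with one identifier column followed by one value column per cell.

```r
library(data.table)

# Hypothetical stand-in for a processed-data file: tab-delimited,
# first column holds feature IDs, remaining columns hold one value per cell.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "ID\tcell1\tcell2\tcell3",
  "geneA\t1.5\t0\t2.25",
  "geneB\t0\t3\t4.5"
), tmp)

# Read the ID column plus several requested value columns in a single pass.
wanted <- c("cell1", "cell3")  # more than one column name
dt <- fread(tmp, sep = "\t", header = TRUE, select = c("ID", wanted))

# Convert to a numeric matrix with feature IDs as row names.
expr <- as.matrix(dt, rownames = "ID")
```

With select, fread() parses only the listed columns, so several columns can be pulled from a large file in one call instead of one call per column name.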

Thank you,



