Question: Question on importing large dataset (1.4GB) into R-Bioconductor
Anqi wrote (9.3 years ago):
To whom it may concern,

I am a student at Peking University, China. I am currently doing microarray data analysis with the Bioconductor packages for R.

A problem arises when I try to import my dataset into R: it contains 109 samples (total size more than 1.4 GB). R's memory limit makes importing all the samples into one AffyBatch object a "mission impossible" for me.

It would be possible to import the data into several AffyBatch objects and preprocess each one separately. In that case, however, the results of background correction or normalization are not desirable, because not all of the available information (namely all 109 samples) is used to obtain a baseline or the like.

An alternative approach would be to pre-process the data in dChip and then export it into R. However, I am looking for an approach that relies solely on R.

Would you please give some suggestions on this issue, even though it might be more a technical problem than a scientific (statistical) one? Many thanks for your help! I look forward to your reply. All the best with your work!

Best regards,
Anqi
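To make the concern about separate preprocessing concrete, here is a minimal sketch in Python (standard library only) of plain quantile normalization on made-up intensities. It is only an illustration of the general idea, not the code used by the affy package or dChip; the function name `quantile_normalize` and the toy numbers are assumptions introduced for this example.

```python
from statistics import mean

def quantile_normalize(arrays):
    """Quantile-normalize equal-length arrays against their shared mean quantiles."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across all arrays given.
    reference = [mean(s[k] for s in sorted_arrays) for k in range(n)]
    result = []
    for a in arrays:
        # Positions of the probes ordered by intensity within this array.
        order = sorted(range(n), key=lambda i: a[i])
        normalized = [0.0] * n
        for rank, i in enumerate(order):
            normalized[i] = reference[rank]
        result.append(normalized)
    return result

# Toy intensities: 4 arrays x 3 probes (made-up numbers, not real data).
arrays = [
    [2.0, 4.0, 6.0],
    [1.0, 3.0, 5.0],
    [10.0, 20.0, 30.0],
    [12.0, 18.0, 36.0],
]

joint = quantile_normalize(arrays)                                      # all arrays together
split = quantile_normalize(arrays[:2]) + quantile_normalize(arrays[2:])  # two separate batches

print("joint:", joint[0])
print("split:", split[0])
```

The first array receives different normalized values depending on whether the reference distribution was built from all four arrays or only from its own batch, which is the effect described above.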
Sean Davis wrote (9.3 years ago):
On Thu, Jul 16, 2009 at 12:14 AM, Anqi <dotzaq@126.com> wrote:
> Problem arises when I try to import into R my dataset which contains 109 samples (total size more than 1.4 GB). The memory limit of R makes importing all the samples into one AffyBatch object a "mission impossible" for me. [...]

You could try using the xps or aroma.affymetrix packages. I think both are designed to deal with large datasets.

Sean
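As a rough illustration of how a tool can use every sample for the baseline without holding every sample in memory, the sketch below computes a quantile-normalization reference in a first pass, one array at a time, and then normalizes each array in a second pass. It is written in Python with the standard library only and is not the actual implementation of xps or aroma.affymetrix; `load_array`, `streaming_reference`, `normalize_one`, and the toy numbers are all hypothetical stand-ins.

```python
N_ARRAYS = 4

def load_array(index):
    """Hypothetical stand-in for reading one array's intensities from disk."""
    toy = [
        [2.0, 4.0, 6.0],
        [1.0, 3.0, 5.0],
        [10.0, 20.0, 30.0],
        [12.0, 18.0, 36.0],
    ]
    return toy[index]

def streaming_reference(n_arrays):
    """Pass 1: average the sorted intensities, holding one array at a time."""
    sums = None
    for idx in range(n_arrays):
        values = sorted(load_array(idx))
        if sums is None:
            sums = [0.0] * len(values)
        for k, v in enumerate(values):
            sums[k] += v
    return [s / n_arrays for s in sums]

def normalize_one(values, reference):
    """Pass 2: map each probe to the reference value at its within-array rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    for rank, i in enumerate(order):
        out[i] = reference[rank]
    return out

reference = streaming_reference(N_ARRAYS)
for idx in range(N_ARRAYS):
    # In a real workflow each result would be written back to disk here.
    print(idx, normalize_one(load_array(idx), reference))
```

Peak memory in this sketch is one array plus the running sums, yet the reference is still built from all arrays, so the result matches what an all-in-memory normalization of the same data would give.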
Sean Davis wrote (9.3 years ago):
2009/7/16 Anqi <dotzaq@126.com>
> Hi Sean,
> I have tried my dataset out in the aroma.affymetrix package and it DOES work. Thanks so much for your help!
>
> Best,
> Anqi

Glad to hear that did the trick.

Sean