Help on PLGEM R Package Data Import
@norman-pavelka-4875
Dear Qi,

Thank you for your interest in PLGEM! I am CC'ing my reply to the Bioconductor mailing list, as this is the best forum to address your question. I strongly recommend that you subscribe and send further queries there. (You may always CC me to get a more rapid response.)

Your question is about how to load your data into R/Bioconductor. Since the object that PLGEM needs as input is of type 'ExpressionSet', you'll have to learn how to build such an object in R. Doing it from scratch is a bit cumbersome, but you could use the function 'readExpressionSet' from package Biobase to make your life easier. Type the following at your R prompt to get the help page:

    library(Biobase)
    ?readExpressionSet

For PLGEM, you will only need a single 'exprsFile' and a single 'phenoDataFile':

* The 'exprsFile' is going to be a tab-delimited text file in which the first column contains your protein identifiers and the subsequent columns contain NSAF values from the various MS runs you performed. Be sure to put a meaningful header on top of each column (except for the first column). Do not use any spaces or special characters in your column headers, though, because they will cause problems. For those proteins that were not identified in all your runs, replace the missing values with a zero.

* The 'phenoDataFile' instead is going to be a description of the columns in your 'exprsFile', i.e. a description of your experimental design. Note that the row names of the 'phenoDataFile' need to exactly match the column names of the 'exprsFile'.

To make it easier, I'm attaching an example with some random numbers. Copy these two files into your working directory and run the following code:

    library(plgem)
    eset <- readExpressionSet("example-exprsFile.txt", "example-phenoDataFile.txt")
    plgemResult <- run.plgem(eset)

(Of course the results are going to look awful, because I just put in some random numbers...)

Please direct further queries directly to the Bioconductor mailing list.
Good luck and let me know how it worked!

Cheers,
Norman

On Wed, Sep 21, 2011 at 10:37 AM, Wu Qi <qwu at dicp.ac.cn> wrote:
> Dear Norman,
>
> My name is Qi Wu, I'm a Chinese student working on quantitative proteomics,
> recently your PLGEM algorithm interested me. It seems a better choice than
> conventional t test.
>
> I'm a beginner in statistics, after installing PLGEM R package, I followed
> the instruction on "An introduction to PLGEM, Mattia Pelizzola and Norman
> Pavelka, April 13, 2011" running the wrapper mode and got the sample
> figures. But I don't know how to import my own data. I couldn't open the
> sample data named "LPSeset" using Excel or UltraEdit, so I had no idea how
> the data was organized. Now I could generate replicate Excel or plain text
> files containing proteins abundance values of different status, could you
> tell me how can I import such data in PLGEM R package and get a list
> containing those significantly changed proteins? I searched the internet for
> quite a long time and got nothing.
>
> Thanks very much for contributing your wonderful algorithm, your reply is
> high appreciated.
>
> Best Regards,
>
> Qi Wu

-------------- next part --------------
            run1         run2         run3         run4         run5         run6
protein1    0.272105158  0.578182635  0.754035242  0.145940092  0.769239904  0.701316978
protein2    0.764643801  0.894118526  0.548281721  0.711020065  0.464191732  0.465665834
protein3    0.812542045  0.725245955  0.355091457  0.733231007  0.3719215    0.573931768
protein4    0.532098925  0.576436094  0.79932147   0.923533755  0.240087811  0.298329571
protein5    0.069322058  0.342420665  0.595869364  0.871297892  0.290066368  0.262239274
protein6    0.651908779  0.08587925   0.41497138   0.593448418  0.135282783  0.237435055
protein7    0.00676442   0.868917734  0.123948979  0.309431888  0.39153302   0.974454413
protein8    0.252936075  0.82627297   0.753852382  0.637724441  0.818835489  0.536654514
protein9    0.939906975  0.411477747  0.831390849  0.081767998  0.666160992  0.893382316
protein10   0.168850397  0.485616703  0.205180179  0.156619923  0.508252278  0.471131498
protein11   0.196314728  0.32468134   0.761186946  0.558866463  0.67901396   0.489973861
protein12   0.861746035  0.102611596  0.855005987  0.73792888   0.32983249   0.21108694
protein13   0.951873673  0.30003462   0.006940712  0.629648448  0.629938638  0.832315979
protein14   0.640402383  0.294533828  0.529832604  0.371573318  0.506894077  0.143067205
protein15   0.73583378   0.14851005   0.595233972  0.885030862  0.325172829  0.026206465
protein16   0.908470945  0.718603101  0.708392513  0.50143069   0.193894918  0.665979581
protein17   0.496500532  0.535995474  0.094783653  0.401989162  0.94661853   0.601725604
protein18   0.194425978  0.340924645  0.712342405  0.080625936  0.819505964  0.616081334
protein19   0.430001631  0.441905248  0.402944474  0.843559756  0.31844547   0.901831521
protein20   0.639123746  0.624013128  0.359589383  0.342791867  0.564777258  0.244177715
protein21   0.924487693  0.757481476  0.891131304  0.609658293  0.010763033  0.127580434
protein22   0.218318705  0.452645588  0.911919944  0.078377704  0.567760287  0.602822591
protein23   0.736702867  0.985518833  0.369754987  0.422809818  0.928399206  0.608567008
protein24   0.320451014  0.071640958  0.839071677  0.426467979  0.442474916  0.222783007
protein25   0.616781335  0.935056527  0.721135255  0.155921482  0.674421545  0.945110022
protein26   0.141025926  0.21758066   0.56221364   0.172119983  0.464236048  0.481820792
protein27   0.924781234  0.994016109  0.901588658  0.131451283  0.749907303  0.546611201
protein28   0.82062542   0.731001115  0.261704493  0.015184267  0.340863612  0.311089976
protein29   0.808924799  0.06599185   0.711024246  0.303225398  0.934253553  0.441136691
protein30   0.829397695  0.962024924  0.701677997  0.269678982  0.905625469  0.696369106
protein31   0.688649426  0.429196772  0.850694014  0.530922811  0.489442554  0.340990442
protein32   0.271072588  0.310993208  0.066286859  0.667186432  0.216603295  0.98393354
protein33   0.177641473  0.947536386  0.314142345  0.335356201  0.03103037   0.308629266
protein34   0.083279722  0.505583206  0.149014369  0.93389466   0.689173154  0.393801431
protein35   0.307865094  0.893705322  0.66594377   0.346686825  0.545788992  0.935939284
protein36   0.658938081  0.117871285  0.664053321  0.076622596  0.963525499  0.52271653
protein37   0.627671082  0.017791532  0.643219901  0.768227117  0.492449358  0.363140614
protein38   0.355743028  0.385378738  0.160308018  0.471810777  0.086925445  0.174162284
protein39   0.337959389  0.385260383  0.177478255  0.736542854  0.566532246  0.265694763
protein40   0.96810809   0.918746911  0.162265268  0.255419097  0.908330676  0.274357173
protein41   0.015773201  0.355726353  0.480739577  0.169213899  0.342018637  0.535936092
protein42   0.577332614  0.29894483   0.75724934   0.395812026  0.566983833  0.707412429
protein43   0.515129951  0.357157564  0.526637623  0.854330064  0.049939994  0.695080804
protein44   0.871384099  0.616546877  0.697200572  0.127922319  0.219454732  0.495248987
protein45   0.374477308  0.594947101  0.417492497  0.686996851  0.640584105  0.088030237
protein46   0.071587003  0.989157339  0.413504575  0.37602752   0.747968579  0.759316849
protein47   0.51427987   0.860039769  0.769387922  0.490167111  0.490964504  0.592018772
protein48   0.484559854  0.356943493  0.339023292  0.995063496  0.30532164   0.608724785
protein49   0.912141251  0.558137217  0.680421621  0.99945879   0.537633599  0.188452237
protein50   0.494583561  0.611372237  0.854799472  0.541990958  0.03840154   0.350093271
protein51   0.011361507  0.012479676  0.918867369  0.912596163  0.682556294  0.975385196
protein52   0.778622728  0.912687805  0.944790793  0.883330044  0.973987955  0.206676026
protein53   0.474411517  0.182070146  0.232991395  0.485088865  0.508959881  0.206590012
protein54   0.911200361  0.354936593  0.056552516  0.944233546  0.650099845  0.145641756
protein55   0.775455866  0.127316086  0.44492017   0.291546211  0.500757509  0.552814652
protein56   0.550983571  0.496775038  0.089711212  0.609431834  0.950519686  0.564260335
protein57   0.588955589  0.399663477  0.268178686  0.195061129  0.677796033  0.271139318
protein58   0.528449282  0.637165612  0.388329992  0.40286496   0.498295389  0.30062392
protein59   0.601779252  0.813951251  0.972581676  0.606104355  0.99651101   0.50984962
protein60   0.358177875  0.971566658  0.939295489  0.258080793  0.046821332  0.432076408
protein61   0.215898664  0.16481797   0.267180855  0.296904479  0.536706898  0.499094133
protein62   0.721319369  0.977751638  0.399863891  0.114349684  0.191042264  0.13360937
protein63   0.526770676  0.13305538   0.528330076  0.42729097   0.899472657  0.20495134
protein64   0.74041248   0.358460646  0.013588585  0.53335124   0.196911565  0.270794911
protein65   0.423650942  0.037028606  0.029929215  0.91885699   0.944606949  0.250795396
protein66   0.013598182  0.124505757  0.098784472  0.110043485  0.336810969  0.468085919
protein67   0.875384392  0.666533999  0.998699727  0.921972974  0.269602531  0.999853779
protein68   0.389763473  0.543996193  0.453664108  0.357524658  0.256674508  0.138297241
protein69   0.989995993  0.933590185  0.728864597  0.642563731  0.63626545   0.986435677
protein70   0.344739693  0.455033467  0.277329572  0.236647829  0.99447734   0.962611081
protein71   0.344029962  0.258233081  0.27946336   0.504306631  0.059642877  0.042321403
protein72   0.990149755  0.43033956   0.727804593  0.315031256  0.428771405  0.606973588
protein73   0.980010936  0.536351496  0.792459902  0.21475268   0.497742157  0.973566377
protein74   0.029378461  0.169899454  0.490532397  0.540477243  0.17628776   0.699109742
protein75   0.024287626  0.291580497  0.153676365  0.874232996  0.735014392  0.058511081
protein76   0.23614577   0.176915025  0.614894395  0.956701471  0.822677541  0.398301527
protein77   0.872337677  0.680133671  0.276904392  0.469358016  0.020074098  0.438384384
protein78   0.702711553  0.745110836  0.481211218  0.556333357  0.861230363  0.227543124
protein79   0.496590638  0.537850551  0.24012069   0.971338596  0.666206152  0.485584098
protein80   0.805569513  0.264843593  0.728664985  0.535440314  0.459766895  0.441764153
protein81   0.72210255   0.599214674  0.693628097  0.370340448  0.012107564  0.11413817
protein82   0.058792544  0.837276592  0.535703814  0.54429201   0.351159191  0.504381629
protein83   0.945161518  0.238986507  0.699804522  0.4247302    0.754882812  0.076056326
protein84   0.058176154  0.696600805  0.112801656  0.611122357  0.630338211  0.472786578
protein85   0.4964451    0.31601527   0.255416708  0.300014553  0.284052374  0.857228019
protein86   0.661603154  0.527571481  0.898159514  0.953075851  0.370684365  0.486787466
protein87   0.675233086  0.854044781  0.633968255  0.868082896  0.006404833  0.20081543
protein88   0.176833932  0.12980006   0.059860796  0.037318478  0.923219281  0.255300512
protein89   0.336967914  0.962323181  0.585833243  0.494059335  0.659501162  0.520424374
protein90   0.114827679  0.953159749  0.641332223  0.390621256  0.977909639  0.310468422
protein91   0.36771061   0.345454185  0.447244909  0.783510116  0.510032541  0.745687023
protein92   0.990137239  0.618745078  0.221793003  0.336632617  0.207446643  0.21701026
protein93   0.672596151  0.557426961  0.23452454   0.68006612   0.328876782  0.216499769
protein94   0.43904726   0.84772286   0.728493344  0.390138814  0.959153602  0.28976034
protein95   0.744142277  0.186217811  0.029561071  0.671092008  0.097159255  0.764376307
protein96   0.841812656  0.440726883  0.90302471   0.497747151  0.13072324   0.126721984
protein97   0.20488616   0.566345413  0.316886685  0.286125691  0.042550611  0.005176815
protein98   0.990144408  0.629420237  0.162497339  0.235526658  0.133771507  0.403680368
protein99   0.54433037   0.655007858  0.473160857  0.295717332  0.347074454  0.588217328
protein100  0.558155605  0.98697914   0.950869889  0.894547726  0.914887557  0.292393484

-------------- next part --------------
      gender  treatment
run1  male    drug
run2  male    no drug
run3  female  drug
run4  female  no drug
run5  male    drug
run6  female  no drug
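For readers who want to see the two-file layout from this post end to end, here is a minimal sketch in R. It is only an illustration under stated assumptions: the numbers are random stand-ins for NSAF values, the files are written to a temporary directory, and the variable names (`exprs_mat`, `pheno`, `names_match`) are made up for this example. The `readExpressionSet` call is the one named in the post and is only attempted if Biobase is installed.

```r
# Minimal sketch: build the two input files with made-up numbers and check
# that they are consistent before handing them to readExpressionSet().
set.seed(1)

runs <- paste0("run", 1:6)

# Expression matrix: proteins in rows, MS runs in columns. The values are
# random stand-ins for NSAF; a protein missing from a run would be entered as 0.
exprs_mat <- matrix(runif(10 * 6), nrow = 10,
                    dimnames = list(paste0("protein", 1:10), runs))

# Phenotype table: one row per run, with row names equal to the column
# names of the expression matrix.
pheno <- data.frame(
  gender    = c("male", "male", "female", "female", "male", "female"),
  treatment = c("drug", "no drug", "drug", "no drug", "drug", "no drug"),
  row.names = runs
)

exprs_file <- file.path(tempdir(), "example-exprsFile.txt")
pheno_file <- file.path(tempdir(), "example-phenoDataFile.txt")

# col.names = NA writes an empty header cell above the row-name column,
# so the first column has no header, as described in the post.
write.table(exprs_mat, exprs_file, sep = "\t", quote = FALSE, col.names = NA)
write.table(pheno, pheno_file, sep = "\t", quote = FALSE, col.names = NA)

# Re-read with base R and confirm that the names line up.
exprs_check <- read.table(exprs_file, header = TRUE, sep = "\t", row.names = 1)
pheno_check <- read.table(pheno_file, header = TRUE, sep = "\t", row.names = 1)
names_match <- identical(colnames(exprs_check), rownames(pheno_check))
stopifnot(names_match)

# Only attempted when Biobase is available.
if (requireNamespace("Biobase", quietly = TRUE)) {
  eset <- Biobase::readExpressionSet(exprs_file, pheno_file)
  print(dim(Biobase::exprs(eset)))
}
```

Checking the names with base R first tends to give a clearer error than a failed import, since a mismatch between the column headers and the phenotype row names is the easiest mistake to make with this layout.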
@norman-pavelka-4875
Dear Qi,

As I already suggested, please direct your questions to the mailing list and I will be happy to reply to you there. You are welcome to put me in CC for a more rapid response.

The parameter 'delta' is related to the false positive rate. You can get a more comprehensive answer in the original paper about PLGEM: http://www.biomedcentral.com/1471-2105/5/203

The short answer is: Yes, for proteomics datasets in which you only survey a few hundred proteins at most, you can increase the false positive rate ('delta') up to the level that you feel comfortable with. For example, if you only have 500 proteins in a MudPIT dataset, then choosing 'delta'=0.01 will result in roughly 5 false positive identifications. If you run such an analysis and you find 50 proteins differentially expressed, then your FDR will be roughly 10% (5 expected false positives out of 50 selected proteins).

Please note that proteomics datasets also have a few other special features compared to microarrays, the most important of which is the presence of many missing observations. I highly recommend using the parameters trimAllZeroRows=TRUE and zeroMeanOrSD="trim" for proteomics data, in order to improve the fitting of the model in case of many missing values.

Out of curiosity, what values do you get for the fitting (slope, r.squared, etc.)?

Hope this helps!
Norman

On Fri, Sep 23, 2011 at 3:16 PM, Wu Qi <qwu at dicp.ac.cn> wrote:
> Dear Norman,
>
> Thanks for your kind help, it worked just fine, and I have subscribed to the
> Bioconductor mailing list.
> I have two other questions:
> I found that if I use the default delta=0.001 I would get no significantly
> changed proteins, Does delta=0.001 means p=0.001? So perhaps for proteomics
> data, shall I choose a bigger delta like 0.01 or 0.05?
> And how can I evaluate the FDR of the data?
>
> Best Regards,
> Qi Wu
>
> -----Original Message-----
> From: Norman Pavelka [mailto:normanpavelka at gmail.com]
> Sent: Thursday, September 22, 2011 10:29 AM
> To: Wu Qi
> Cc: bioconductor at stat.math.ethz.ch; mattia.pelizzola at gmail.com
> Subject: Re: Help on PLGEM R Package Data Import
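The back-of-the-envelope FDR estimate in this reply can be written out as a few lines of R. This is only a sketch of that arithmetic, not a function from plgem: `expected_fdr` and its argument names are hypothetical, and the estimate assumes that every tested protein is truly unchanged, so it is a rough upper bound rather than a formal FDR procedure.

```r
# Rough expected-FDR arithmetic from the reply; expected_fdr() is a
# hypothetical helper, not part of plgem.
expected_fdr <- function(n_tested, delta, n_selected) {
  # Expected number of false positives if delta is the false positive rate.
  expected_false_pos <- n_tested * delta
  # With no selected proteins the ratio is undefined.
  if (n_selected == 0) return(NA_real_)
  # Cap at 1, since an FDR cannot exceed 100%.
  min(1, expected_false_pos / n_selected)
}

# The example from the reply: 500 proteins, delta = 0.01, 50 proteins selected.
fdr_example <- expected_fdr(n_tested = 500, delta = 0.01, n_selected = 50)
print(fdr_example)  # roughly 0.1, i.e. about 10%

# The recommended settings for proteomics data would be passed to the wrapper
# along these lines. ASSUMPTION: 'signLev' is taken here to be the wrapper's
# counterpart of 'delta'; please confirm the argument names with ?run.plgem.
# plgemResult <- run.plgem(eset, signLev = 0.01,
#                          trimAllZeroRows = TRUE, zeroMeanOrSD = "trim")
```

The same function makes it easy to see how the estimate moves with 'delta': at the default 0.001 the expected number of false positives among 500 proteins drops to about 0.5, which is consistent with very few or no proteins being selected.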