Question

limma and marray data import problem


Piotr Stępniak ▴ 90

@piotr-stepniak-2827


Dear Gordon,

Thank you very much for your answer. When I try the arguments nrows=2 or nrows=2640, I get this error:

Read 15sz_szk10_Results1.gpr
Error in read.table(file = file, header = TRUE, col.names = allcnames, :
  formal argument "nrows" matched by multiple actual arguments

The csv files give the same error with the nrows argument, but they do read fine with the following instruction:

bialkoRaw<- read.maimages( targets$FileName, columns=list(G="Ch1\ Median", Gb="Ch1\ B\ Median", R="Ch2\ Median", Rb="Ch2\ B\ Median"), sep=",")

And the object is almost fine:

An object of class "RGList"
$G
     15sz_szk10_Results1 16kr_szk11_Results1 19sz_szk19_Results1
[1,] 3016 1632 2413
[2,] 2587 1385 2145
[3,] 3088 1526 2015
[4,] 137 109 596
[5,] 137 114 649
     20sz_szk9_Results1 22sz_szk18_Results1 26kr_szk12_Results1
[1,] 2529 4421 12481
[2,] 2459 4576 11637
[3,] 2387 4557 10468
[4,] 186 975 2032
[5,] 174 961 906
     26sz_szk13_Results1 27sz_szk14_Results1 30kr_szk16_Results1
[1,] 5946 4747 3374
[2,] 5368 4022 2839
[3,] 4368 4251 3526
[4,] 2164 280 210
[5,] 2087 282 221
     30sz_szk17_Results1 34kr_szk20_Results1 35kr_szk6_Results1
[1,] 1814 2108 4703
[2,] 2092 1949 4396
[3,] 1830 2094 4634
[4,] 307 910 171
[5,] 255 805 187
     35sz_szk7_Results1 38kr_szk6_Results1 ML_szk15_Results1_BRZYDKIE
[1,] 3053 1852 8598
[2,] 2813 1785 7524
[3,] 2755 1683 8625
[4,] 197 631 548
[5,] 227 656 629
     ML_szk15_Results1_BRZYDKIEpopluk
[1,] 10964
[2,] 12696
[3,] 13298
[4,] 7176
[5,] 5768
2636 more rows ...

$Gb
     15sz_szk10_Results1 16kr_szk11_Results1 19sz_szk19_Results1
[1,] 172 125 1158
[2,] 168 121 1110
[3,] 168 119 1148
[4,] 178 122 1190
[5,] 162 114 1217
     20sz_szk9_Results1 22sz_szk18_Results1 26kr_szk12_Results1
[1,] 285 1513 5559
[2,] 345 1533 5839
[3,] 331 1570 5667
[4,] 328 1565 5544
[5,] 299 1644 5369
     26sz_szk13_Results1 27sz_szk14_Results1 30kr_szk16_Results1
[1,] 2490 1025 1086
[2,] 2330 962 1078
[3,] 2206 884 1043
[4,] 2307 836 982
[5,] 2349 775 956
     30sz_szk17_Results1 34kr_szk20_Results1 35kr_szk6_Results1
[1,] 645 1258 311
[2,] 618 1267 321
[3,] 647 1300 289
[4,] 679 1276 295
[5,] 708 1267 302
     35sz_szk7_Results1 38kr_szk6_Results1 ML_szk15_Results1_BRZYDKIE
[1,] 305 294 3992
[2,] 307 274 3759
[3,] 282 273 3575
[4,] 318 292 3419
[5,] 303 302 3123
     ML_szk15_Results1_BRZYDKIEpopluk
[1,] 8085
[2,] 7296
[3,] 7552
[4,] 7125
[5,] 6407
2636 more rows ...

$R
     15sz_szk10_Results1 16kr_szk11_Results1 19sz_szk19_Results1
[1,] 1487 1054 1149
[2,] 1387 913 1106
[3,] 1559 924 1006
[4,] 147 109 268
[5,] 161 125 273
     20sz_szk9_Results1 22sz_szk18_Results1 26kr_szk12_Results1
[1,] 1259 1459 2621
[2,] 1246 1391 2430
[3,] 1231 1483 2373
[4,] 176 197 334
[5,] 173 182 251
     26sz_szk13_Results1 27sz_szk14_Results1 30kr_szk16_Results1
[1,] 1604 1183 1181
[2,] 1624 1047 1090
[3,] 1476 1141 1159
[4,] 361 268 139
[5,] 345 276 145
     30sz_szk17_Results1 34kr_szk20_Results1 35kr_szk6_Results1
[1,] 1015 1731 1593
[2,] 1198 1776 1618
[3,] 1195 1666 1741
[4,] 269 561 165
[5,] 257 566 161
     35sz_szk7_Results1 38kr_szk6_Results1 ML_szk15_Results1_BRZYDKIE
[1,] 1851 1287 2707
[2,] 1819 1224 1981
[3,] 1755 1087 2400
[4,] 208 186 444
[5,] 217 193 487
     ML_szk15_Results1_BRZYDKIEpopluk
[1,] 2403
[2,] 2506
[3,] 2733
[4,] 1166
[5,] 1223
2636 more rows ...

$Rb
     15sz_szk10_Results1 16kr_szk11_Results1 19sz_szk19_Results1
[1,] 929 176 313
[2,] 943 180 308
[3,] 906 174 306
[4,] 969 171 309
[5,] 991 169 315
     20sz_szk9_Results1 22sz_szk18_Results1 26kr_szk12_Results1
[1,] 509 168 816
[2,] 588 173 815
[3,] 592 171 776
[4,] 621 159 785
[5,] 637 162 834
     26sz_szk13_Results1 27sz_szk14_Results1 30kr_szk16_Results1
[1,] 683 304 204
[2,] 651 289 196
[3,] 646 279 187
[4,] 665 278 179
[5,] 643 280 179
     30sz_szk17_Results1 34kr_szk20_Results1 35kr_szk6_Results1
[1,] 419 667 663
[2,] 410 688 635
[3,] 409 706 648
[4,] 425 703 630
[5,] 419 692 605
     35sz_szk7_Results1 38kr_szk6_Results1 ML_szk15_Results1_BRZYDKIE
[1,] 691 525 1518
[2,] 700 520 1330
[3,] 710 506 1216
[4,] 724 474 1111
[5,] 698 454 1075
     ML_szk15_Results1_BRZYDKIEpopluk
[1,] 2131
[2,] 2023
[3,] 1857
[4,] 1711
[5,] 1542
2636 more rows ...

$targets
[1] "15sz_szk10_Results1.csv" "16kr_szk11_Results1.csv"
[3] "19sz_szk19_Results1.csv" "20sz_szk9_Results1.csv"
[5] "22sz_szk18_Results1.csv"
11 more rows ...

$source
[1] "generic"

The columns are not shifted to the right, so I get the correct Median values as desired.

However, there is something wrong with the object, because any following instruction gives this error:

Error in if (is.int(totalPlate)) { : argument is of length zero

Presumably this is because the printer layout is missing, so I do this:

bialkoRaw$printer<-getLayout("bialko.gal")
Error in getLayout("bialko.gal") : gal needs to have columns Block, Row and Column

Strangely enough, the function readGAL works, but if I use it in place of the instruction above, there are no printer parameters such as ngrid.c, ngrid.r, etc., just an array.

Is there a way to check RGList object integrity?

Kind Regards,
Piotrek

On Wed, Jun 4, 2008 at 6:56 AM, Gordon K Smyth <smyth at wehi.edu.au> wrote:
> Dear Piotr,
>
> I can't diagnose your problem, because the shortened version of your data
> file that you emailed reads fine for me when I put the lines in a text file,
> as I show below.
> I used sep="" in my code because email doesn't preserve tab separators.
> Presumably the problem appears further into the file, perhaps near the
> bottom. Or else your file has inconsistent separators.
>
> Can you try the arguments nrows=2 and nrows=2640?
>
> I would also expect the csv file to read with the following:
>
> read.maimages("file.csv",columns=list(G="F543 Median",Gb="B543 Median",
> R="F633 Median", Rb="B633 Median"),sep=",",nrows=2640)
>
> Best wishes
> Gordon
>
> My code:
>
> > read.maimages("temp.txt",source="genepix",columns=list(G="F543 Median",
> Gb="B543 Median", R="F633 Median", Rb="B633 Median"),sep="")
> Read temp.txt
> An object of class "RGList"
> $G
>      temp
> [1,] 5946
> [2,] 5368
>
> $Gb
>      temp
> [1,] 2490
> [2,] 2330
>
> $R
>      temp
> [1,] 1604
> [2,] 1624
>
> $Rb
>      temp
> [1,] 683
> [2,] 651
>
> $targets
>      FileName
> temp temp.txt
>
> $genes
>   Block Row Column ID Name
> 1 1 1 1 2078 ERG_Operon
> 2 1 1 2 2078 ERG_Operon
>
> $source
> [1] "genepix"
>
> $printer
> $ngrid.r
> [1] 1
>
> $ngrid.c
> [1] 1
>
> $nspot.r
> [1] 1
>
> $nspot.c
> [1] 2
>
> attr(,"class")
> [1] "PrintLayout"
>
>
> On Mon, 2 Jun 2008, Piotr Stępniak wrote:
>
>> Dear Gordon,
>>
>> Thank you for your reply.
>>
>> I tried using source="genepix", it did not work better than "scanarray".
>> The following commands give:
>>
>> > bialkoRaw<-read.maimages(dir(pattern="gpr"), source="genepix")
>> Error in read.table(file = file, header = TRUE, col.names = allcnames, :
>>   duplicate 'row.names' are not allowed
>>
>> It turns out the format is not 100% valid GenePix, e.g. it does not
>> have any index column, so I try this:
>>
>> > bialkoRaw<-read.maimages(dir(pattern="gpr"), source="genepix",
>> > row.names=NULL)
>> Error in RG[[a]][, i] <- obj[, columns[[a]]] :
>>   number of items to replace is not a multiple of replacement length
>> In addition: Warning message:
>> In getLayout(RG$genes, guessdups = FALSE) : NAs introduced by coercion
>>
>> I tried different parameter combinations which got me to the command
>> you've seen in the previous messages (I'm sorry for sending it 3
>> times...).
>>
>> The file is finally read, but wrongly as described earlier.
>>
>> Same happens to gal file:
>>
>> > gal<-readGAL("Bialko.gal")
>> Error in read.table(file = file, header = TRUE, col.names = allcnames, :
>>   duplicate 'row.names' are not allowed
>>
>> > gal<-readGAL("Bialko.gal", row.names=NULL)
>> Error in if (is.int(totalPlate)) { : argument is of length zero
>>
>> To answer your further questions shortly:
>> 2. Yes, these are the files straight from the scanner software.
>> ScanArrayExpress also offers csv export, but reading them is another
>> problem. They do have Index column,
>>
>> > bialkoRaw<- read.maimages( dir(pattern="csv"), columns=list(G="Ch1\
>> > Median", Gb="Ch1\ B\ Median", R="Ch2\ Median", Rb="Ch2\ B\ Median"),
>> > sep=",")
>>
>> reads the file and the values are under correct columns but I get no
>> printer layout read and other function to process the data gives:
>> Error in if (is.int(totalPlate)) { : argument is of length zero
>>
>> 3.
Yes, I'd be happy to if you please look at it: >> >> Beginning of GPR file: >> >> ATF 1.0 >> >> 21 82 >> >> "Type=GenePix Results 2" >> >> "DateTime=2008/03/28 10:30:03" >> >> "Settings=Easy Quant" >> >> "GalFile=D:\Luiza\Grant bialaczkowy_BADANIA\BIALACZKI_skany\DRUGI >> RZUT\BIALACZKI_2_25luty2008_popr.gal" >> >> "Scanner=Model: Express Serial No.: 432617" >> >> "Comment=<f1>Alexa 555<f2>Alexa 647<f1 offset="">0,0<f2 offset="">0,0<comment>" >> >> "PixelSize=10" >> >> "Wavelengths=543 nm 633 nm" >> >> "ImageFiles=D:\Luiza\Grant bialaczkowy_BADANIA\BIALACZKI_skany\DRUGI >> RZUT\12_03_2008\Skan >> Agi\HL60_szk13_PMT65_roz10_Alexa555.tif D:\Luiza\Grant >> bialaczkowy_BADANIA\BIALACZKI_skany\DRUGI RZUT\12_03_2008\Skan >> Agi\26sz_szk13_PMT60_roz10_Alexa647.tif" >> >> "PMTGain=65 60" >> >> "NormalizationMethod=LOWESS" >> >> "NormalizationFactors=0.000 0.000" >> >> "JpegImage=" >> >> "RatioFormulations=W2/W1(633/543)" >> >> "Barcode=" >> >> "ImageOrigin=1500 11600" >> >> "JpegOrigin=0 0" >> >> "Creator=ScanArray Express, Microarray Analysis System 3.0.0.16" >> >> "Temperature=0.0" >> >> "LaserPower=90 90 0 0" >> >> "LaserOnTime=0 0 0 0" >> >> "Block" "Column" "Row" "Name" "ID" "X" "Y" "Dia." >> "F543 Median" "F543 >> Mean" "F543 SD" "B543 Median" "B543 Mean" "B543 SD" "% >> > B543+1SD" "% >>> >>> B543+2SD" "F543 % Sat." "F633 Median" "F633 Mean" "F633 SD" >>> "B633 >> >> Median" "B633 Mean" "B633 SD" "% > B633+1SD" "% > B633+2SD" >> "F633 % >> Sat." "F3 Median" "F3 Mean" "F3 SD" "B3 Median" "B3 Mean" >> "B3 SD" "% > >> B3+1SD" "% > B3+2SD" "F3 % Sat." "F4 Median" "F4 Mean" >> "F4 SD" "B4 >> Median" "B4 Mean" "B4 SD" "% > B4+1SD" "% > B4+2SD" "F4 % >> Sat." "Ratio >> of Medians (633/543)" "Ratio of Means (633/543)" "Median of Ratios >> (633/543)" "Mean of Ratios (633/543)" "Ratios SD (633/543)" >> "Rgn Ratio >> (633/543)" "Rgn R? 
(633/543)" "Ratio of Medians (Ratio/2)" >> "Ratio of >> Means (Ratio/2)" "Median of Ratios (Ratio/2)" "Mean of Ratios >> (Ratio/2)" "Ratios SD (Ratio/2)" "Rgn Ratio (Ratio/2)" "Rgn R? >> (Ratio/2)" "Ratio of Medians (Ratio/3)" "Ratio of Means >> (Ratio/3)" "Median of Ratios (Ratio/3)" "Mean of Ratios >> (Ratio/3)" "Ratios SD (Ratio/3)" "Rgn Ratio (Ratio/3)" "Rgn R? >> (Ratio/3)" "F Pixels" "B Pixels" "Sum of Medians" >> "Sum of Means" "Log >> Ratio (633/543)" "Log Ratio (Ratio/2)" "Log Ratio (Ratio/3)" >> "F543 >> Median - B543" "F633 Median - B633" "F3 Median - B3" "F4 Median >> - >> B4" "F543 Mean - B543" "F633 Mean - B633" "F3 Mean - B3" >> "F4 Mean - >> B4" "Flags" "Normalize" >> >> 1 1 1 ERG_Operon 2078 2805 13125 230 >> 5946 6035 1754 2490 2506 529 97 92 0 1604 >> 1636 517 683 698 194 94 84 0 0 0 >> 0 0 0 0 0 0 0 0 0 >> 0 0 0 0 0 0 0 0.266 0.269 >> 0.270 0.329 0.329 0.232 0.621 0.000 0.000 0.000 0.000 >> 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 >> 0.000 384 734 4377 4498 -1.908 0.000 0.000 3456 921 >> 0 0 3545 953 0 0 100 1 >> >> 1 2 1 ERG_Operon 2078 3250 13128 220 >> 5368 5457 1634 2330 2378 537 96 91 0 1624 >> 1651 531 651 671 188 95 88 0 0 0 >> 0 0 0 0 0 0 0 0 0 >> 0 0 0 0 0 0 0 0.320 0.320 >> 0.318 0.567 0.567 0.254 0.608 0.000 0.000 0.000 0.000 >> 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 >> 0.000 351 858 4011 4127 -1.643 0.000 0.000 3038 973 >> 0 0 3127 1000 0 0 100 1 >> >> 1 3 1 ERG_Operon 2078 3698 13124 220 >> 4368 4676 1646 2206 2240 490 90 81 0 1476 >> 1562 592 646 673 182 90 80 0 0 0 >> 0 0 0 0 0 0 0 0 0 >> 0 0 0 0 0 0 0 0.384 0.371 >> 0.377 0.498 0.498 0.281 0.610 0.000 0.000 0.000 0.000 >> 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 >> 0.000 348 858 2992 3386 -1.381 0.000 0.000 2162 830 >> 0 0 2470 916 0 0 100 1 >> >> And for comparison here is a corresponding csv: >> >> BEGIN HEADER >> >> PerkinElmer Inc. 
>> >> ScanArrayCSVFileFormat,2.00 >> >> ScanArray Express,2.00 >> >> Number_of_Columns,62 >> >> END HEADER >> >> >> >> BEGIN GENERAL INFO >> >> DateTime,2008/03/28 10:30 >> >> GalFile,D:\Luiza\Grant bialaczkowy_BADANIA\BIALACZKI_skany\DRUGI >> RZUT\BIALACZKI_2_25luty2008_popr.gal >> >> Scanner,Model: Express Serial No.: 432617 >> >> User Name,Luiza >> >> Computer Name, >> >> Protocol,Easy Quant >> >> Quantitation Method,Adaptive Circle >> >> Quality Confidence Calculation,Footprint >> >> User comments, >> >> Image Origin,1500,11600 >> >> Temperature,0 >> >> Laser Powers,90,90 >> >> Laser On Time,0 >> >> PMT Voltages,65,60 >> >> END GENERAL INFO >> >> >> >> BEGIN QUANTITATION PARAMETERS >> >> Min Percentile,30 >> >> Max Percentile,300 >> >> END QUANTITATION PARAMETERS >> >> >> >> BEGIN QUALITY MEASUREMENTS >> >> Max Footprint,100 >> >> END QUALITY MEASUREMENTS >> >> >> >> BEGIN ARRAY PATTERN INFO >> >> Units,?m >> >> Array Rows,10 >> >> Array Columns,4 >> >> Spot Rows,9 >> >> Spot Columns,9 >> >> Array Row Spacing,4500.000000 >> >> Array Column Spacing,4500.000000 >> >> Spot Row Spacing,450.000000 >> >> Spot Column Spacing,450.000000 >> >> Spot Diameter,200 >> >> Interstitial,0 >> >> Spots Per Array,81 >> >> Total Spots,2640 >> >> END ARRAY PATTERN INFO >> >> >> >> BEGIN IMAGE INFO >> >> ImageID,Channel,Image,Fluorophore,Barcode,Units,X Units Per Pixel,Y >> Units Per Pixel,X Offset,Y Offset,Status >> >> -1,CH1,D:\Luiza\Grant bialaczkowy_BADANIA\BIALACZKI_skany\DRUGI >> RZUT\12_03_2008\Skan Agi\HL60_szk13_PMT65_roz10_Alexa555.tif,Alexa >> 555,,?m,10.000000,10.000000,0.000000,0.000000,Control Image >> >> -1,CH2,D:\Luiza\Grant bialaczkowy_BADANIA\BIALACZKI_skany\DRUGI >> RZUT\12_03_2008\Skan Agi\26sz_szk13_PMT60_roz10_Alexa647.tif,Alexa >> 647,,?m,10.000000,10.000000,0.000000,0.000000, >> >> END IMAGE INFO >> >> >> >> BEGIN NORMALIZATION INFO >> >> Normalization Method,LOWESS >> >> END NORMALIZATION INFO >> >> >> >> BEGIN DATA >> >> Index,Array Row,Array Column,Spot 
Row,Spot >> Column,Name,ID,X,Y,Diameter,F Pixels,B Pixels,Footprint,Flags,Ch1 >> Median,Ch1 Mean,Ch1 SD,Ch1 B Median,Ch1 B Mean,Ch1 B SD,Ch1 % > B + 1 >> SD,Ch1 % > B + 2 SD,Ch1 F % Sat.,Ch1 Median - B,Ch1 Mean - B,Ch1 >> SignalNoiseRatio,Ch2 Median,Ch2 Mean,Ch2 SD,Ch2 B Median,Ch2 B >> Mean,Ch2 B SD,Ch2 % > B + 1 SD,Ch2 % > B + 2 SD,Ch2 F % Sat.,Ch2 >> Median - B,Ch2 Mean - B,Ch2 SignalNoiseRatio,Ch2 Ratio of Medians,Ch2 >> Ratio of Means,Ch2 Median of Ratios,Ch2 Mean of Ratios,Ch2 Ratios >> SD,Ch2 Rgn Ratio,Ch2 Rgn R?,Ch2 Log Ratio,Sum of Medians,Sum of >> Means,Ch1 N Median,Ch1 N Mean,Ch1 N (Median-B),Ch1 N (Mean-B),Ch2 N >> Median,Ch2 N Mean,Ch2 N (Median-B),Ch2 N (Mean-B),Ch2 N Ratio of >> Medians,Ch2 N Ratio of Means,Ch2 N Median of Ratios,Ch2 N Mean of >> Ratios,Ch2 N Rgn Ratio,Ch2 N Log Ratio >> >> >> 1,1,1,1,1,"ERG_Operon","2078",2805,13125,230,384,734,0,3,5946,6035, 1754.26,2490,2506,529.19,97.4,92.2,0.0,3456,3545,11.24,1604,1636,517.2 7,683,698,194.19,94.3,84.1,0.0,921,953,8.26,0.27,0.27,0.27,0.33,0.39,0 .23,0.62,-1.908,4377,4498,5946,6035,3456,3545,3027,2984,1446,2664,0.42 ,0.75,0.42,0.92,0.44,-1.257 >> >> >> 2,1,1,1,2,"ERG_Operon","2078",3250,13128,220,351,858,0,3,5368,5457, 1634.22,2330,2378,537.27,96.0,90.9,0.0,3038,3127,9.99,1624,1651,531.34 ,651,671,188.42,94.9,88.0,0.0,973,1000,8.62,0.32,0.32,0.32,0.57,2.14,0 .25,0.61,-1.643,4011,4127,5368,5457,3038,3127,3100,3039,1536,2956,0.51 ,0.95,0.50,1.68,0.48,-0.984 >> >> >> 3,1,1,1,3,"ERG_Operon","2078",3698,13124,220,348,858,0,3,4368,4676, 1645.59,2206,2240,490.01,90.2,81.0,0.0,2162,2470,8.91,1476,1562,591.68 ,646,673,182.34,90.2,80.2,0.0,830,916,8.09,0.38,0.37,0.38,0.50,0.92,0. 28,0.61,-1.381,2992,3386,4368,4676,2162,2470,2947,2941,283,797,0.13,0. 32,0.13,0.43,0.56,-2.934 >> >> >> Kind Regards, >> Piotr >> >> On Mon, Jun 2, 2008 at 3:57 AM, Gordon K Smyth <smyth at="" wehi.edu.au=""> wrote: >>> >>> Dear Piotr, >>> >>> The file extension "gpr" is short for GenePix Results file. 
>>> If ScanArray Express outputs a file with this extension, you should
>>> have every expectation that it is formatted exactly the same as a gpr
>>> file from GenePix, and therefore you should be able to read it using
>>> read.maimages(source="genepix"). If this is not true, then ScanArray is
>>> irresponsible to use this extension.
>>>
>>> Same comments for the GAL file. It is obviously not a GAL file as
>>> defined by GenePix, otherwise it would be read using readGAL().
>>>
>>> From your description below, a possible explanation for the problem is
>>> that your files have an extra column with no corresponding heading,
>>> e.g., a column of row numbers. However no one on this mailing list can
>>> tell that for sure without you showing us some lines from your file.
>>>
>>> Questions:
>>> 1. Why have you set row.names=NULL? This prevents R from detecting a
>>> column of row numbers. What happens if you remove this?
>>>
>>> 2. Are these files exactly as output by ScanArray, or have they been
>>> further processed?
>>>
>>> 3. Can you post the first few lines of an example file?
>>>
>>> Best wishes
>>> Gordon
>>>
>>> PS. You posted the same question to the BioC mailing list on three
>>> consecutive days during the weekend. Please post the question just once.
>>>
>>>
>>>> Date: Sat, 31 May 2008 12:55:25 +0200
>>>> From: "Piotr Stępniak" <piotrek.stepniak at gmail.com>
>>>> Subject: [BioC] limma and marray data import problem
>>>> To: bioconductor at stat.math.ethz.ch
>>>>
>>>> Hello Everyone,
>>>>
>>>> I am Piotr Stępniak, B.Sc. in Biotechnology, currently on an M.Sc.
>>>> course at Adam Mickiewicz University in Poznań, Poland. I am working
>>>> at the Polish Academy of Sciences in the microarray experiments group.
>>>>
>>>> I'm a newbie in R and BioC, so please forgive me if my question is
>>>> easy...
>>>>
>>>> I'm having a problem with data import to RGList or marrayRaw objects.
>>>> Using the following instruction:
>>>> bialkoRaw<- read.maimages( dir(pattern="gpr"), columns=list(G="F543
>>>> Median", Gb="B543 Median", R="F633 Median", Rb="B633 Median"),
>>>> annotation=c("Block", "Column", "Row", "Name", "ID"), row.names=NULL)
>>>> The data seems to load, but the $genes table looks odd. I guess the
>>>> column names are shifted right by 1 column:
>>>> $genes
>>>>   Block Column Row Name ID
>>>> 1 1 1 ERG_Operon 2078 2647
>>>> 2 2 1 ERG_Operon 2078 3102
>>>> 3 3 1 ERG_Operon 2078 3549
>>>> 4 4 1 FLT3_Operon 2322 3994
>>>> 5 5 1 FLT3_Operon 2322 4444
>>>> 2635 more rows ...
>>>> This, I think, causes the printer layout to be imported wrongly, and
>>>> then any other attempt to process the data (e.g. quality tests)
>>>> produces this error message:
>>>> Error in if (is.int(totalPlate)) { : argument is of length zero
>>>>
>>>> The data is obtained with ScanArrayExpress software, so I have it in
>>>> gpr or csv files. Both give similar errors, but loading the csv files
>>>> also seems to fail to import the values for each channel and gets only
>>>> the file name headers.
>>>>
>>>> Marray import also fails. I will skip the info about it so as not to
>>>> enlarge the mail unnecessarily.
>>>>
>>>> My R session info is as follows:
>>>>
>>>> > sessionInfo()
>>>> R version 2.6.2 (2008-02-08)
>>>> i486-pc-linux-gnu
>>>>
>>>> locale:
>>>> C
>>>>
>>>> attached base packages:
>>>> [1] grid splines tools stats graphics grDevices utils
>>>> [8] datasets methods base
>>>>
>>>> other attached packages:
>>>> [1] arrayQuality_1.18.0 gridBase_0.4-3 hexbin_1.14.0
>>>> [4] convert_1.16.0 RColorBrewer_1.0-2 cluster_1.11.10
>>>> [7] arrayMagic_1.16.1 genefilter_1.16.0 survival_2.34-1
>>>> [10] marray_1.18.0 vsn_3.6.0 limma_2.14.1
>>>> [13] affy_1.16.0 preprocessCore_1.0.0 affyio_1.8.0
>>>> [16] Biobase_1.16.3 lattice_0.17-7
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] AnnotationDbi_1.0.6 DBI_0.2-4 RSQLite_0.6-8
>>>> [4] annotate_1.18.0 rcompgen_0.1-17
>>>>
>>>>
>>>> I think I should also say that these data cause import problems in
>>>> any other data analysis software :( I also tried to read the printer
>>>> layout from the gal file, but all I got was a "Block, Row, Column, ID
>>>> columns not found" error.
>>>>
>>>> I'd greatly appreciate any help, please.
>>>>
>>>> Yours faithfully,
>>>> Piotr Stępniak

Microarray Normalization limma PROcess marray • 1.7k views

updated 15.9 years ago by Gordon Smyth 50k • written 15.9 years ago by Piotr Stępniak ▴ 90

score 0 · Answer 1 · 2008-06-04


Gordon Smyth 50k

@gordon-smyth


WEHI, Melbourne, Australia

On Wed, 4 Jun 2008, Piotr Stępniak wrote:

> Dear Gordon,
>
> Thank you very much for your answer. When I try the arguments nrows=2
> or 2640 I get the error:
> Read 15sz_szk10_Results1.gpr

This shows that the first gpr file has read correctly. Try putting file=targets$FileName[1] to read one file at a time.

This shows that your gpr files have a problem after the last line of data.

> Error in read.table(file = file, header = TRUE, col.names = allcnames, :
>   formal argument "nrows" matched by multiple actual arguments
>
> Csv files give the same error with nrows attribute but they do read
> fine with the following instruction:
> bialkoRaw<- read.maimages( targets$FileName, columns=list(G="Ch1\
> Median", Gb="Ch1\ B\ Median", R="Ch2\ Median", Rb="Ch2\ B\ Median"),
> sep=",")
> And the object is almost fine:

> The columns are not shifted to right, so I get the correct Median
> values as desired.
>
> However there is something wrong with the object because any following
> instructions give this error:
> Error in if (is.int(totalPlate)) { : argument is of length zero
> Presumably it is because the printer layout is missing,

There's nothing wrong with the object, the printer layout is simply not set. Since limma doesn't "know" the ScanArray csv format, you have to set the printer layout yourself manually.

> Is there a way to check RGList object integrity?

Just look at the object, show(RG).

Best wishes
Gordon

> Kind Regards,
> Piotrek
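The advice above to read one file at a time can be sketched in base R as follows. This is only an illustration: `try_each_file` is a hypothetical helper (not part of limma), and the two toy files, one with junk after the last data line, stand in for real scanner output.

```r
# Hypothetical helper: apply a reader to each file on its own and record
# whether it succeeded, so that a single bad file can be identified.
try_each_file <- function(files, reader) {
  ok <- vapply(files, function(f) {
    result <- tryCatch(reader(f), error = function(e) e)
    !inherits(result, "error")
  }, logical(1))
  names(ok) <- basename(files)
  ok
}

# Toy tab-separated files (assumed content, not real scanner output).
# The second file has a stray line after the last row of data.
good <- tempfile(fileext = ".txt")
bad  <- tempfile(fileext = ".txt")
writeLines(c("Block\tRow\tColumn", "1\t1\t1", "1\t1\t2"), good)
writeLines(c("Block\tRow\tColumn", "1\t1\t1", "1\t1\t2", "END"), bad)

status <- try_each_file(
  c(good, bad),
  function(f) read.table(f, header = TRUE, sep = "\t")
)
print(status)
```

With limma loaded, the `reader` argument could instead wrap `read.maimages()` with whatever `columns` and `sep` settings are being tested, which would show which of the sixteen files is the first to fail.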

15.9 years ago • Gordon Smyth 50k


Dear Gordon,

Thank you again for your prompt reply.

On Wed, Jun 4, 2008 at 10:25 AM, Gordon K Smyth <smyth at wehi.edu.au> wrote:
> On Wed, 4 Jun 2008, Piotr Stępniak wrote:
>
>> Dear Gordon,
>>
>> Thank you very much for your answer. When I try the arguments nrows=2
>> or 2640 I get the error:
>> Read 15sz_szk10_Results1.gpr
>
> This shows that the first gpr file has read correctly. Try putting
> file=targets$FileName[1] to read one file at a time.

If I do that, how do I put them together later on, so that all the files are in one object for further analysis?

> This shows that your gpr files have a problem after the last line of data.

Is it possible that this is because the files have a different EOF sign? Would changing them to Unix-like or Windows-like be of any help, and would it be considered data manipulation?

>> Error in read.table(file = file, header = TRUE, col.names = allcnames, :
>>   formal argument "nrows" matched by multiple actual arguments
>>
>> Csv files give the same error with nrows attribute but they do read
>> fine with the following instruction:
>> bialkoRaw<- read.maimages( targets$FileName, columns=list(G="Ch1\
>> Median", Gb="Ch1\ B\ Median", R="Ch2\ Median", Rb="Ch2\ B\ Median"),
>> sep=",")
>> And the object is almost fine:
>
>> The columns are not shifted to right, so I get the correct Median
>> values as desired.
>>
>> However there is something wrong with the object because any following
>> instructions give this error:
>> Error in if (is.int(totalPlate)) { : argument is of length zero
>> Presumably it is because the printer layout is missing,
>
> There's nothing wrong with the object, the printer layout is simply not set.
> Since limma doesn't "know" the ScanArray csv format, you have to set the
> printer layout yourself manually.

OK, I do this:

bialkoRaw$printer$ngrid.r=10
bialkoRaw$printer$ngrid.c=4
bialkoRaw$printer$nspot.c=9
bialkoRaw$printer$nspot.r=9
bialkoRaw$printer$spacing=1
bialkoRaw$printer$ndups=3

Is there anything else I should add? When I try to use e.g. maQualityPlots() I get:

> maQualityPlots(bialkoRaw)
Error: dims [product 2641] do not match the length of object [3240]
In addition: Warning message:
In samplesub & which & subset & good :
  longer object length is not a multiple of shorter object length

This is because the array has 36 grids instead of 40 (it is not a full rectangle). Is there a way to work around this situation? Maybe I can set a grid number value?

Kind Regards,
Piotr

>> Is there a way to check RGList object integrity?
>
> Just look at the object, show(RG).
>
> Best wishes
> Gordon
>
>> Kind Regards,
>> Piotrek
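The mismatch in that error message can be checked by hand before calling any layout-dependent function. The sketch below uses only base R; `layout_spots` is a hypothetical helper, the layout values are the ones set above, and 2641 is the row count taken from the error message.

```r
# Hypothetical helper: number of spots implied by a print layout, i.e. the
# product of grid rows, grid columns, spot rows and spot columns.
layout_spots <- function(printer) {
  with(printer, ngrid.r * ngrid.c * nspot.r * nspot.c)
}

printer  <- list(ngrid.r = 10, ngrid.c = 4, nspot.r = 9, nspot.c = 9)
expected <- layout_spots(printer)  # spots in a complete 10 x 4 array of 9 x 9 grids
observed <- 2641                   # row count reported in the error message

cat("layout implies", expected, "spots; object has", observed, "rows\n")
cat("rows missing for a complete array:", expected - observed, "\n")
```

When the two numbers differ, functions that reshape the data by print layout cannot work, whatever other settings are added to the `printer` component.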

15.9 years ago • Piotr Stępniak ▴ 90


On Wed, 4 Jun 2008, Piotr Stępniak wrote:

> Dear Gordon,
>
> Thank you again for your prompt reply.
>
>> This shows that the first gpr file has read correctly. Try putting
>> file=targets$FileName[1] to read one file at a time.
>
> If I do that, how do I put them together later on to have all the
> files in one object for further analysis?

cbind()
?cbind.RGList

>> This shows that your gpr files have a problem after the last line of data.
>
> Is it possible it is because the files have different eof sign? Would
> changing them to unix like or windows like be of any help and would it
> be considered as data manipulation?

Sounds unlikely, but strange things can happen. Storing files in appropriate platform mode is sensible to me.

>> There's nothing wrong with the object, the printer layout is simply not set.
>> Since limma doesn't "know" the ScanArray csv format, you have to set the
>> printer layout yourself manually.
>
> OK, I do this:
> bialkoRaw$printer$ngrid.r=10
> bialkoRaw$printer$ngrid.c=4
> bialkoRaw$printer$nspot.c=9
> bialkoRaw$printer$nspot.r=9
> bialkoRaw$printer$spacing=1
> bialkoRaw$printer$ndups=3
>
> Is there anything else I should add? When I try to use e.g.
> maQualityPlots() I get:
>> maQualityPlots(bialkoRaw)
> Error: dims [product 2641] do not match the length of object [3240]
> In addition: Warning message:
> In samplesub & which & subset & good :
>   longer object length is not a multiple of shorter object length
>
> Which is because the array has 36 grids instead of 40 (it is not a
> full rectangle). Is there a way to go around this situation? Maybe I
> can set a grid number value?

There is no way to set the print layout in limma. limma strictly assumes complete arrays. You'd have to pad out your data object with NAs for the other 4 grids if you want to use print layout functions in limma.

You do not need the layout to create MA-plots or to loess normalize.

I can't give advice on maQualityPlots().

Best wishes
Gordon
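Padding with NAs, as suggested above, could look roughly like the base R sketch below. It makes several assumptions that are not confirmed in the thread: the intensity components are treated as plain matrices with one row per spot, the missing grids are assumed to come after all existing rows, and `pad_rows` is a hypothetical helper. Toy sizes (6 spots padded to 8) are used instead of the real ones.

```r
# Hypothetical helper: append rows of NA so that a matrix has n_total rows.
pad_rows <- function(x, n_total) {
  stopifnot(is.matrix(x), nrow(x) <= n_total)
  pad <- matrix(NA, nrow = n_total - nrow(x), ncol = ncol(x),
                dimnames = list(NULL, colnames(x)))
  rbind(x, pad)
}

# Toy stand-in for the G, Gb, R and Rb components: 2 arrays, 6 spots each
arrays <- c("array1", "array2")
rg <- list(
  G  = matrix(1:12,  nrow = 6, dimnames = list(NULL, arrays)),
  Gb = matrix(13:24, nrow = 6, dimnames = list(NULL, arrays)),
  R  = matrix(25:36, nrow = 6, dimnames = list(NULL, arrays)),
  Rb = matrix(37:48, nrow = 6, dimnames = list(NULL, arrays))
)

# Pad every component to the number of spots a complete layout would have
padded <- lapply(rg, pad_rows, n_total = 8)
print(padded$G)
```

For a real RGList the same padding would also have to be applied to the annotation (the `$genes` data frame), and if the absent grids sit in the middle of the array the NA rows would need to be inserted at the matching positions rather than appended.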

15.9 years ago • Gordon Smyth 50k


Dear Gordon,

Thank you VERY much for all your answers. One more question and I should be able to proceed with my analysis:

>> Error: dims [product 2641] do not match the length of object [3240]
>> In addition: Warning message:
>> In samplesub & which & subset & good :
>>   longer object length is not a multiple of shorter object length
>>
>> Which is because the array has 36 grids instead of 40 (it is not a
>> full rectangle). Is there a way to go around this situation? Maybe I
>> can set a grid number value?
>
> There is no way to set the print layout in limma. limma strictly assumes
> complete arrays. You'd have to pad out your data object with NAs for the
> other 4 grids if you want to use print layout functions in limma.
>
> You do not need the layout to create MA-plots or to loess normalize.

Could you please advise on the best way to pad out the object with NAs? I tried this:

bialkoRaw[2641:3240,]<- read.maimages( targets$FileName[1], columns=list(G="Ch1\ Median", Gb="Ch1\ B\ Median", R="Ch2\ Median", Rb="Ch2\ B\ Median"), sep=",", na.strings="NA", blank.lines.skip = FALSE)

and I did not get any additional columns or rows written.

Kind Regards,
Piotr

15.9 years ago • Piotr Stępniak ▴ 90


Hello Everyone,

In case anyone has been following this thread: we have found what is wrong with the gpr files produced by our ScanArrayExpress software that makes them impossible to read with limma or marray.

It turned out that each data row has an additional tab after the last value in the row, just before the newline. This somehow makes, for example, read.GenePix() fail with the error "duplicate row.names not allowed" even though row.names is set to NULL.

My question is whether automatic batch removal of these tabs is considered data manipulation. After all, we are just fixing what the software did wrong on output; no values are changed.

Kind Regards,
Piotr Stępniak

On Thu, Jun 5, 2008 at 9:45 AM, Piotr Stępniak <piotrek.stepniak at gmail.com> wrote:
> Dear Gordon,
>
> Thank you VERY much for all your answers. One more question and I
> should be able to proceed with my analysis:
>
>>> Error: dims [product 2641] do not match the length of object [3240]
>>> In addition: Warning message:
>>> In samplesub & which & subset & good :
>>>   longer object length is not a multiple of shorter object length
>>>
>>> Which is because the array has 36 grids instead of 40 (it is not a
>>> full rectangle). Is there a way to go around this situation? Maybe I
>>> can set a grid number value?
>>
>> There is no way to set the print layout in limma. limma strictly assumes
>> complete arrays. You'd have to pad out your data object with NAs for the
>> other 4 grids if you want to use print layout functions in limma.
>>
>> You do not need the layout to create MA-plots or to loess normalize.
>
> Could you please advice on the best way to pad out the object with
> NAs? I tried this:
> bialkoRaw[2641:3240,]<- read.maimages( targets$FileName[1],
> columns=list(G="Ch1\ Median", Gb="Ch1\ B\ Median", R="Ch2\ Median",
> Rb="Ch2\ B\ Median"), sep=",", na.strings="NA", blank.lines.skip =
> FALSE)
>
> and I did not get any additional columns or rows written.
>
> Kind Regards,
> Piotr
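A small base R sketch of the trailing-tab problem and one possible batch fix follows. `strip_trailing_tab` is a hypothetical helper, and the toy file is only an assumption about the shape of the real gpr data rows (a heading line without a trailing tab, data rows with one). It relies on documented `read.table()` behaviour: when the data rows have exactly one more field than the heading line, the first column is taken as row names.

```r
# Hypothetical helper: write a copy of `infile` in which a single trailing
# tab has been removed from every line that ends with one.
strip_trailing_tab <- function(infile, outfile) {
  lines <- readLines(infile, warn = FALSE)
  writeLines(sub("\t$", "", lines), outfile)
  invisible(outfile)
}

# Toy file: the heading has 5 fields, each data row has a 6th, empty field
raw <- tempfile(fileext = ".gpr")
writeLines(c("Block\tColumn\tRow\tName\tID",
             "1\t1\t1\tERG_Operon\t2078\t",
             "1\t2\t1\tERG_Operon\t2078\t"), raw)

# The first column (Block) is taken as row names, and its repeated values
# are rejected as duplicates.
before <- try(read.table(raw, header = TRUE, sep = "\t"), silent = TRUE)

# With row.names = NULL the file reads, but the headings slide one column
# to the right of the values they belong to.
shifted <- read.table(raw, header = TRUE, sep = "\t", row.names = NULL)

# After stripping the trailing tabs, headings and values line up again
clean <- strip_trailing_tab(raw, tempfile(fileext = ".gpr"))
after <- read.table(clean, header = TRUE, sep = "\t")

print(names(shifted))
print(after)
```

Writing the cleaned copy to a new file leaves the scanner output untouched, so the original and the corrected version can both be kept alongside a note of what was changed.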

15.9 years ago • Piotr Stępniak ▴ 90