Reading Illumina data from GenomeStudio in limma


Guest User ★ 13k

@guest-user-4897

Last seen 11.4 years ago

I have data exported from GenomeStudio (1.6) that I can read and pre-process correctly with lumi. However, when I try read.ilmn in limma, I get this error:

Error in `rownames<-`(`*tmp*`, value = list(PROBE_ID = c("ILMN_1762337", : length of 'dimnames' [1] not equal to array extent

I found that this is caused by the extra columns in my probe profiles (TargetID, ProbeID, SPECIES, SOURCE, TRANSCRIPT and so on). When I remove them, read.ilmn reads the probe and control profiles correctly. Is there a better fix than manually removing the columns?

-- output of sessionInfo():

R version 3.0.0 (2013-04-03)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] lumi_2.12.0 Biobase_2.20.0 BiocGenerics_0.6.0 limma_3.16.2

--
Sent via the guest posting facility at bioconductor.org.

probe lumi • 2.2k views

updated 12.4 years ago by Wei Shi ★ 3.6k • written 12.5 years ago by Guest User ★ 13k


Wei Shi ★ 3.6k

@wei-shi-2183

Last seen 11 days ago

Australia/Melbourne

Dear Josh,

Could you please provide the command you used and also the first 10 lines of your probe profile file? The read.ilmn function allows any number of columns to be included in your data files. It just extracts those columns whose names match the names specified in the parameters. So I cannot tell what caused your problem unless I can have a look at your data.

Best regards,
Wei
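To make the idea of name-based column extraction concrete, here is a minimal base R sketch. The header vector is illustrative, and `match_columns` is a hypothetical helper, not a limma function; the case-insensitive keyword match is an assumption about how such matching might work rather than a copy of limma's source.

```r
# Illustrative column names in the style of a GenomeStudio probe profile.
header <- c("PROBE_ID", "SYMBOL",
            "9273370068_A.AVG_Signal", "9273370068_A.Detection Pval",
            "9273370068_B.AVG_Signal", "9273370068_B.Detection Pval",
            "SPECIES", "SOURCE")

# Hypothetical helper: return the column names that contain a keyword,
# ignoring case. This only mimics name-based selection for illustration.
match_columns <- function(keyword, header) {
  header[grepl(tolower(keyword), tolower(header), fixed = TRUE)]
}

# Columns that would be picked up as expression values.
expr_cols <- match_columns("AVG_Signal", header)
print(expr_cols)

# Extra annotation columns such as SPECIES are simply not selected.
extra_cols <- setdiff(header, c(expr_cols, match_columns("PROBE_ID", header)))
print(extra_cols)
```

Under this kind of matching, extra columns are harmless as long as each keyword picks out only the columns it is meant to.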

12.5 years ago by Wei Shi ★ 3.6k


Wei Shi ★ 3.6k

@wei-shi-2183

Last seen 11 days ago

Australia/Melbourne

Hi Chris,

Please keep your posts on the list so you can get help from other people as well.

More than one column in your data has a name containing the keyword 'Probe'. However, the read.ilmn function expects only one such column so that it can properly assign probe identifiers to the probes. It looks like the first column of your data contains the probe identifiers, so you should set probeid="PROBE_ID" when you run read.ilmn.

Best wishes,
Wei

On Aug 17, 2013, at 4:33 AM, Chris Risothal wrote:

> Wei,
>
> I am using the following command:
>
> x <- read.ilmn(files="probe profile.txt", ctrlfiles="control probe profiles.txt")
>
> My "probe profile.txt" does not contain header information, but here is the first line with two lines of data.
>
> PROBE_ID SYMBOL 9273370068_A.AVG_Signal 9273370068_A.Detection Pval 9273370068_A.NARRAYS 9273370068_A.ARRAY_STDEV 9273370068_A.BEAD_STDERR 9273370068_A.Avg_NBEADS 9273370068_B.AVG_Signal 9273370068_B.Detection Pval 9273370068_B.NARRAYS 9273370068_B.ARRAY_STDEV 9273370068_B.BEAD_STDERR 9273370068_B.Avg_NBEADS 9273370068_C.AVG_Signal 9273370068_C.Detection Pval 9273370068_C.NARRAYS 9273370068_C.ARRAY_STDEV 9273370068_C.BEAD_STDERR 9273370068_C.Avg_NBEADS 9273370068_D.AVG_Signal 9273370068_D.Detection Pval 9273370068_D.NARRAYS 9273370068_D.ARRAY_STDEV 9273370068_D.BEAD_STDERR 9273370068_D.Avg_NBEADS 9273370068_E.AVG_Signal 9273370068_E.Detection Pval 9273370068_E.NARRAYS 9273370068_E.ARRAY_STDEV 9273370068_E.BEAD_STDERR 9273370068_E.Avg_NBEADS 9273370068_F.AVG_Signal 9273370068_F.Detection Pval 9273370068_F.NARRAYS 9273370068_F.ARRAY_STDEV 9273370068_F.BEAD_STDERR 9273370068_F.Avg_NBEADS 9273370068_G.AVG_Signal 9273370068_G.Detection Pval 9273370068_G.NARRAYS 9273370068_G.ARRAY_STDEV 9273370068_G.BEAD_STDERR 9273370068_G.Avg_NBEADS 9273370068_H.AVG_Signal 9273370068_H.Detection Pval 9273370068_H.NARRAYS 9273370068_H.ARRAY_STDEV 9273370068_H.BEAD_STDERR 9273370068_H.Avg_NBEADS 9273370068_I.AVG_Signal 9273370068_I.Detection Pval 9273370068_I.NARRAYS 9273370068_I.ARRAY_STDEV 9273370068_I.BEAD_STDERR 9273370068_I.Avg_NBEADS 9273370068_J.AVG_Signal 9273370068_J.Detection Pval 9273370068_J.NARRAYS 9273370068_J.ARRAY_STDEV 9273370068_J.BEAD_STDERR 9273370068_J.Avg_NBEADS 9273370068_K.AVG_Signal 9273370068_K.Detection Pval 9273370068_K.NARRAYS 9273370068_K.ARRAY_STDEV 9273370068_K.BEAD_STDERR 9273370068_K.Avg_NBEADS 9273370068_L.AVG_Signal 9273370068_L.Detection Pval 9273370068_L.NARRAYS 9273370068_L.ARRAY_STDEV 9273370068_L.BEAD_STDERR 9273370068_L.Avg_NBEADS SEARCH_KEY ILMN_GENE CHROMOSOME DEFINITION SYNONYMS TargetID ProbeID SPECIES SOURCE TRANSCRIPT SOURCE_REFERENCE_ID REFSEQ_ID UNIGENE_ID ENTREZ_GENE_ID GI ACCESSION PROTEIN_PRODUCT ARRAY_ADDRESS_ID PROBE_TYPE PROBE_START PROBE_SEQUENCE PROBE_CHR_ORIENTATION PROBE_COORDINATES CYTOBAND ONTOLOGY_COMPONENT ONTOLOGY_PROCESS ONTOLOGY_FUNCTION
> ILMN_1762337 7A5 236.0978 0.7441558 1 NaN 14.72377 22 262.2536 0.4545455 1 NaN 17.13116 21 259.018 0.4857143 1 NaN 20.26811 21 244.6225 0.5766234 1 NaN 13.17049 20 293.7429 0.1545455 1 NaN 19.53695 20 252.7831 0.4649351 1 NaN 14.87979 26 247.0036 0.4272727 1 NaN 9.686612 25 254.1092 0.5168831 1 NaN 25.24741 19 238.7411 0.5805195 1 NaN 11.25148 23 248.3546 0.4714286 1 NaN 14.3304 24 257.9403 0.5389611 1 NaN 18.90778 10 255.1144 0.4883117 1 NaN 15.00543 22 NM_182762.2 7A5 7 Homo sapiens putative binding protein 7a5 (7A5), mRNA. 7A5 6450255 Homo sapiens RefSeq ILMN_183371 NM_182762.2 NM_182762.2 346389 47271497 NM_182762.2 NP_877439.2 0006450255 S 2725 GTGTTACAAGACCTTCAGTCAGCTTTGGACAGAATGAAAAACCCTGTGAC - 20147187-20147236 7p15.3e
> ILMN_2055271 A1BG 317.0753 0.08961039 1 NaN 17.60022 12 365.0513 0.01948052 1 NaN 29.1449 17 362.9214 0.01688312 1 NaN 20.55249 22 297.9507 0.138961 1 NaN 23.81178 14 383.1979 0.007792208 1 NaN 24.62921 21 286.5022 0.1805195 1 NaN 14.57291 20 349.5719 0.02337662 1 NaN 29.1553 19 373.5414 0.01428571 1 NaN 22.20049 19 269.0879 0.2688312 1 NaN 13.6713 11 370.8822 0.01298701 1 NaN 29.8101 21 392.3283 0.01428571 1 NaN 41.74369 15 369.3197 0.01948052 1 NaN 20.21619 27 NM_130786.2 A1BG 19 Homo sapiens alpha-1-B glycoprotein (A1BG), mRNA. A1B; GAB; HYST2477; ABG; DKFZp686F0970 A1BG 2570615 Homo sapiens RefSeq ILMN_175569 NM_130786.2 NM_130786.2 1 21071029 NM_130786.2 NP_570602.2 0002570615 S 3151 GGGATTACAGGGGTGAGCCACCACGCCCAGCCCCAGCTTAGTTTTTTAAA - 63548541-63548590 19q13.43c The space external to the outermost structure of a cell. For cells without external protective or external encapsulating structures this refers to space outside of the plasma membrane. This term covers the host cell environment outside an intracellular parasite [goid 5576] [pmid 3458201] [evidence IDA] Any process specifically pertinent to the functioning of integrated living units: cells, tissues, organs, and organisms. A process is a collection of molecular events with a defined beginning and end [goid 8150] [evidence ND ] Elemental activities, such as catalysis or binding, describing the actions of a gene product at the molecular level. A given gene product may exhibit one or more molecular functions [goid 3674] [evidence ND ]
> ILMN_1736007 A1BG 304.7664 0.1454545 1 NaN 22.61858 25 343.8062 0.03766234 1 NaN 24.54836 22 306.6519 0.1324675 1 NaN 18.23108 15 286.0155 0.1987013 1 NaN 18.77761 24 316.0644 0.06883117 1 NaN 22.10342 17 364.5466 0.02337662 1 NaN 23.77021 16 328.7427 0.03246753 1 NaN 22.41896 19 334.654 0.05194805 1 NaN 16.66162 23 295.7652 0.1246753 1 NaN 15.93948 30 324.818 0.04285714 1 NaN 15.19761 27 425.6782 0.005194805 1 NaN 42.57351 21 336.354 0.04415584 1 NaN 11.02559 15 NM_130786.2 A1BG 19 Homo sapiens alpha-1-B glycoprotein (A1BG), mRNA. A1B; GAB; HYST2477; ABG; DKFZp686F0970 A1BG 6370619 Homo sapiens RefSeq ILMN_18893 NM_130786.2 NM_130786.2 1 21071029 NM_130786.2 NP_570602.2 0006370619 S 2512 GCAGAGCTGGACGCTGTGGAAATGGCTGGATTCCTCTGTGTTCTTTCCCA - 63549180-63549229 19q13.43c The space external to the outermost structure of a cell. For cells without external protective or external encapsulating structures this refers to space outside of the plasma membrane. This term covers the host cell environment outside an intracellular parasite [goid 5576] [pmid 3458201] [evidence IDA] Any process specifically pertinent to the functioning of integrated living units: cells, tissues, organs, and organisms. A process is a collection of molecular events with a defined beginning and end [goid 8150] [evidence ND ] Elemental activities, such as catalysis or binding, describing the actions of a gene product at the molecular level. A given gene product may exhibit one or more molecular functions [goid 3674] [evidence ND ]
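The diagnosis in this reply can be checked with a short sketch. The header below is abbreviated from the one quoted in this post, `probe_hits` is a hypothetical helper, and the case-insensitive matching is an assumption based on the description in the thread. The read.ilmn call uses limma's `probeid` argument and only runs if limma and both files are present; whether the control profile also has a PROBE_ID column is not shown in the thread, so that part is an assumption too.

```r
# Column names abbreviated from the probe profile header quoted in this post.
header <- c("PROBE_ID", "SYMBOL", "9273370068_A.AVG_Signal",
            "TargetID", "ProbeID", "PROBE_TYPE", "PROBE_START",
            "PROBE_SEQUENCE", "PROBE_CHR_ORIENTATION", "PROBE_COORDINATES")

# Hypothetical helper: columns whose names contain the keyword, ignoring case.
probe_hits <- function(keyword, header) {
  header[grepl(tolower(keyword), tolower(header), fixed = TRUE)]
}

# The keyword "Probe" is ambiguous here: it matches several columns.
ambiguous <- probe_hits("Probe", header)
print(ambiguous)

# The full column name picks out a single identifier column.
unique_id <- probe_hits("PROBE_ID", header)
print(unique_id)

# File names taken from the command quoted in this post.
probe_file <- "probe profile.txt"
ctrl_file  <- "control probe profiles.txt"

# Attempt the real read only when limma and both files are available.
if (requireNamespace("limma", quietly = TRUE) &&
    file.exists(probe_file) && file.exists(ctrl_file)) {
  x <- limma::read.ilmn(files = probe_file, ctrlfiles = ctrl_file,
                        probeid = "PROBE_ID")
}
```

If the control profile names its identifier column differently, it may need to be checked in the same way before reading both files together.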

12.4 years ago by Wei Shi ★ 3.6k
