Dear Bioconductor readers,
My name is Irene and I am trying to analyze a dataset from GEO obtained with an Agilent platform. I recently analyzed other Agilent data from GEO successfully using the information in the limma package, but this time I cannot read the txt files. It seems that the columns read.maimages is looking for are either missing from the txt files or have different names. If I knew the names of these columns I could supply them through the columns argument of read.maimages, but I do not know them.
I would be extremely grateful if you could give me some advice.
Thank you in advance, Irene.
Here is my code:
### Fetching the data
workingDir<-"C:/Users/iroman/Documents/Master_Omics/Project"
setwd(workingDir)
GEO48872<-getGEOSuppFiles("GSE48872",makeDirectory=TRUE, fetch_files = TRUE)
setwd(paste(workingDir,"GSE48872",sep="/"))
untar("GSE48872_RAW.tar", exdir = getwd())
### Targets file
SampleNumber<-c(1,2,3,4,5,6,7)
FileName<-c("GSM1186204_raw_data_ActivatedaOPCs_1.txt","GSM1186205_raw_data_ActivatedaOPCs_2.txt",
"GSM1186206_raw_data_ActivatedaOPCs_3.txt","GSM1186207_raw_data_ActivatedaOPCs_4.txt",
"GSM1186208_raw_data_NonactivatedaOPCs_1.txt","GSM1186209_raw_data_NonactivatedaOPCs_2.txt",
"GSM1186210_raw_data_NonactivatedaOPCs_3.txt")
Condition<-c("Cupri","Cupri","Cupri","Cupri","Ctr","Ctr","Ctr")
designO<-as.data.frame(cbind(SampleNumber,FileName,Condition))
write.table(designO,file="targetsO.txt",sep="\t")
targetsO = readTargets("targetsO.txt")
### Reading the files
rawO = read.maimages(targetsO, source="agilent",green.only=FALSE,ext = "gz",other.columns="gIsWellAboveBG")
#Error in readGenericHeader(fullname, columns = columns, sep = sep) :
# Specified column headings not found in file
> traceback()
3: file(file, "r")
2: readGenericHeader(fullname, columns = columns, sep = sep)
1: read.maimages(targetsO, source = "agilent", green.only = FALSE,
ext = "gz", other.columns = "gIsWellAboveBG")
Here is the sessionInfo:
R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=Spanish_Spain.1252 LC_CTYPE=Spanish_Spain.1252 LC_MONETARY=Spanish_Spain.1252
[4] LC_NUMERIC=C LC_TIME=Spanish_Spain.1252
attached base packages:
[1] grid parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] agilp_3.14.0 mgug4122a.db_3.2.3 topGO_2.34.0
[4] SparseM_1.77 graph_1.60.0 dplyr_0.8.0.1
[7] sva_3.30.1 mgcv_1.8-27 nlme_3.1-137
[10] casper_2.16.1 a4Base_1.30.0 a4Core_1.30.0
[13] a4Preproc_1.30.0 glmnet_2.0-16 foreach_1.4.4
[16] Matrix_1.2-15 multtest_2.38.0 genefilter_1.64.0
[19] mpm_1.0-22 KernSmooth_2.23-15 MASS_7.3-51.1
[22] annaffy_1.54.0 KEGG.db_3.2.3 GO.db_3.7.0
[25] ReactomePA_1.26.0 tidyr_0.8.3 oligo_1.46.0
[28] Biostrings_2.50.2 XVector_0.22.0 oligoClasses_1.44.0
[31] mogene10sttranscriptcluster.db_8.7.0 org.Mm.eg.db_3.7.0 annotate_1.60.0
[34] XML_3.98-1.19 AnnotationDbi_1.44.0 GEOquery_2.50.5
[37] limma_3.38.3 gplots_3.0.1.1 scatterplot3d_0.3-41
[40] affyQCReport_1.60.0 lattice_0.20-38 affyPLM_1.58.0
[43] preprocessCore_1.44.0 gcrma_2.54.0 affy_1.60.0
[46] SummarizedExperiment_1.12.0 DelayedArray_0.8.0 BiocParallel_1.16.6
[49] matrixStats_0.54.0 Biobase_2.42.0 GenomicRanges_1.34.0
[52] GenomeInfoDb_1.18.2 IRanges_2.16.0 S4Vectors_0.20.1
[55] BiocGenerics_0.28.0
loaded via a namespace (and not attached):
[1] proto_1.0.0 tidyselect_0.2.5 RSQLite_2.1.1 munsell_0.5.0
[5] codetools_0.2-16 chron_2.3-53 statmod_1.4.30 colorspace_1.4-1
[9] GOSemSim_2.8.0 knitr_1.22 rstudioapi_0.10 DOSE_3.8.2
[13] simpleaffy_2.58.0 urltools_1.7.3 GenomeInfoDbData_1.2.0 polyclip_1.10-0
[17] bit64_0.9-7 farver_2.0.1 coda_0.19-3 xfun_0.6
[21] affxparser_1.54.0 R6_2.4.0 graphlayouts_0.5.0 VGAM_1.1-2
[25] bitops_1.0-6 fgsea_1.8.0 gridGraphics_0.4-1 assertthat_0.2.1
[29] scales_1.0.0 ggraph_2.0.0 enrichplot_1.2.0 gtable_0.3.0
[33] tidygraph_1.1.2 rlang_0.3.4 splines_3.5.3 rtracklayer_1.42.2
[37] lazyeval_0.2.2 europepmc_0.3 checkmate_1.9.1 BiocManager_1.30.4
[41] yaml_2.2.0 reshape2_1.4.3 GenomicFeatures_1.34.3 backports_1.1.3
[45] qvalue_2.14.1 tools_3.5.3 ggplotify_0.0.4 ggplot2_3.1.1
[49] affyio_1.52.0 ff_2.2-14 RColorBrewer_1.1-2 ggridges_0.5.1
[53] gsubfn_0.7 Rcpp_1.0.1 plyr_1.8.4 progress_1.2.0
[57] zlibbioc_1.28.0 purrr_0.3.2 RCurl_1.95-4.12 prettyunits_1.0.2
[61] sqldf_0.4-11 viridis_0.5.1 cowplot_0.9.4 ggrepel_0.8.1
[65] cluster_2.0.7-1 magrittr_1.5 data.table_1.12.2 DO.db_2.9
[69] triebeard_0.3.0 reactome.db_1.66.0 hms_0.4.2 xtable_1.8-3
[73] gaga_2.28.1 gridExtra_2.3 compiler_3.5.3 biomaRt_2.38.0
[77] tibble_2.1.1 crayon_1.3.4 DBI_1.0.0 tweenr_1.0.1
[81] rappdirs_0.3.1 readr_1.3.1 gdata_2.18.0 igraph_1.2.4.1
[85] pkgconfig_2.0.2 rvcheck_0.1.7 GenomicAlignments_1.18.1 xml2_1.2.0
[89] EBarrays_2.46.0 stringr_1.4.0 digest_0.6.18 fastmatch_1.1-0
[93] curl_3.3 Rsamtools_1.34.1 gtools_3.8.1 graphite_1.28.2
[97] jsonlite_1.6 viridisLite_0.3.0 pillar_1.3.1 httr_1.4.0
[101] survival_2.43-3 glue_1.3.1 UpSetR_1.4.0 iterators_1.0.10
[105] bit_1.1-14 ggforce_0.3.1 stringi_1.4.3 blob_1.1.1
[109] caTools_1.17.1.2 memoise_1.1.0
Dear Gordon,
Thank you so much for your quick response.
I tried using "genepix" as source but it did not work. Do you know what could be the problem?
Thank you so much, sincerely, Irene.
That's because these are one-color microarrays, so you need to specify green.only=TRUE. I have edited my answer above to reflect this.
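If it helps, one way to confirm which channels a file contains is to read its column headings directly before calling read.maimages. The following is only a minimal sketch in base R, assuming the files follow the usual Agilent Feature Extraction layout in which the headings sit on the line beginning with FEATURES; I have not checked this against the GSE48872 files themselves. The helper name agilent_columns and the small demo file are hypothetical, made up for illustration.

```r
# Hypothetical helper: return the column headings of an Agilent
# Feature Extraction text file.
# Assumption: the headings are on the line whose first tab-separated
# field is "FEATURES".
agilent_columns <- function(path, max_lines = 50) {
  con <- file(path, "r")  # file() also reads gzip-compressed files
  on.exit(close(con))
  lines <- readLines(con, n = max_lines, warn = FALSE)
  header <- grep("^FEATURES\t", lines, value = TRUE)
  if (length(header) == 0) {
    stop("no FEATURES line found in the first ", max_lines, " lines")
  }
  # Split on tabs and drop the leading "FEATURES" field.
  strsplit(header[1], "\t", fixed = TRUE)[[1]][-1]
}

# Made-up stand-in file so the sketch runs without downloading anything.
demo_file <- tempfile(fileext = ".txt")
writeLines(c(
  "TYPE\ttext\tinteger",
  "FEPARAMS\tProtocol_Name\tScan_NumChannels",
  "DATA\tGE1_demo\t1",
  "*",
  "TYPE\tinteger\ttext\tfloat\tfloat\tinteger",
  "FEATURES\tFeatureNum\tProbeName\tgMedianSignal\tgBGMedianSignal\tgIsWellAboveBG",
  "DATA\t1\tA_demo_1\t120.5\t30.0\t1"
), demo_file)

cols <- agilent_columns(demo_file)
print(cols)

# Red-channel signal columns start with "r"; one-color files have none.
has_red <- any(grepl("^r.*Signal$", cols))
print(has_red)

# With the real files (not run here), the call from the question with
# only green.only changed would be:
# library(limma)
# rawO <- read.maimages(targetsO, source = "agilent", green.only = TRUE,
#                       ext = "gz", other.columns = "gIsWellAboveBG")
```

If has_red comes back FALSE for one of the downloaded files, that is consistent with single-channel data, and the same list of headings shows which names could be passed through the columns argument if the defaults still do not match.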