Guest User
@guest-user-4897
I am working with an OpenArray miRNA dataset with 72 samples.
I am having a little trouble with the file input. I had been told by
the lab scientist who gathered the data that there were 750 genes
measured by this array, so I tried this:
> fileList <- c("Runs 1-4.csv","Runs 5-8.csv")
> memStickPath <- "E:/Work/miRNomics/miRNA data/raw"
>
> sampleCounts <- c(36,36)
>
> raw <- readCtData(files = fileList, path = memStickPath, format = "OpenArray", n.features = 750, n.data = sampleCounts)
Warning messages:
1: In matrix(sample[, column.info[["Ct"]]], ncol = n.data[i]) :
data length [26994] is not a sub-multiple or multiple of the number
of rows [750]
2: In matrix(sample[, column.info[["flag"]]], ncol = n.data[i]) :
data length [26994] is not a sub-multiple or multiple of the number
of rows [750]
The first odd thing here is that my file has 29448 rows, not the 26994
quoted in the warnings (2454 fewer, which is exactly 3 samples' worth).
Because the warnings complain about a multiple of 750, I looked at the
file and discovered that there appear to be 818 rows per sample, so I
tried:
> fileList <- c("Runs 1-4.csv","Runs 5-8.csv")
> memStickPath <- "E:/Work/miRNomics/miRNA data/raw"
>
> sampleCounts <- c(36,36)
>
> raw <- readCtData(files = fileList, path = memStickPath, format = "OpenArray", n.features = 818, n.data = sampleCounts)
Error in `[<-.data.frame`(`*tmp*`, undeter, value = "Undetermined") :
only logical matrix subscripts are allowed in replacement
In addition: Warning message:
In matrix(sample[, column.info[["Ct"]]], ncol = n.data[i]) :
data length [26994] is not a sub-multiple or multiple of the number
of rows [750]
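The row arithmetic described above can be checked directly. The following is a minimal base-R sketch, not part of HTqPCR: the first half only restates the numbers quoted in the post, and the second half counts rows per sample on a toy data frame. The toy values and the idea that the export can be read with read.csv() into a long table with a "SampleInfo.SampleID" column are assumptions for illustration.

```r
# Numbers quoted in the post
rows_in_file    <- 29448  # rows counted in one 36-sample file
rows_in_warning <- 26994  # "data length" reported by the warnings
rows_per_sample <- 818    # rows per sample found by inspecting the file

samples_in_file <- rows_in_file / rows_per_sample
samples_missing <- (rows_in_file - rows_in_warning) / rows_per_sample

# Toy stand-in for the exported table (hypothetical values). With the real
# data this might be something like read.csv(file.path(memStickPath, "Runs 1-4.csv")),
# assuming the file parses as a plain CSV with a header row.
toy <- data.frame(
  SampleInfo.SampleID = rep(c("S1", "S2", "S3"), each = 4),
  Ct = seq(20, 31),
  stringsAsFactors = FALSE
)

# Rows per sample; every sample should contribute the same number of rows
rows_by_sample <- table(toy$SampleInfo.SampleID)
all_same_size  <- length(unique(as.vector(rows_by_sample))) == 1
print(rows_by_sample)
```

Running the same table() call on the real file would show whether every sample really has 818 rows, or whether a few samples are shorter than the rest.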
I have since tried joining the two files (of 36 samples each) into one
file (of 72):
> thisPath <- "C:/Users/sr216a/Documents/PreEc_miRNA/raw"
>
> sampleCount <- 72
>
> raw <- readCtData(files = "allRuns.csv", path = thisPath, format = "OpenArray", n.features = 818, n.data = sampleCount)
Error in `[<-.data.frame`(`*tmp*`, undeter, value = "Undetermined") :
only logical matrix subscripts are allowed in replacement
In addition: Warning message:
In matrix(sample[, column.info[["Ct"]]], ncol = n.data[i]) :
data length [56442] is not a sub-multiple or multiple of the number
of rows [784]
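One possible reading of the row counts in these warnings (750 and 784) is that they do not come from n.features at all. The warning text shows matrix() being called with only ncol = n.data[i], and in that case base R derives the number of rows as the data length divided by the number of columns, rounded up. The sketch below reproduces that arithmetic; implied_rows is a hypothetical helper written for this illustration, and nothing here explains why only 26994 or 56442 values were read.

```r
# Hypothetical helper: the row count matrix() settles on when only ncol is given
implied_rows <- function(data_length, n_cols) {
  ceiling(data_length / n_cols)
}

rows_two_files <- implied_rows(26994, 36)  # data length and n.data from the first two attempts
rows_one_file  <- implied_rows(56442, 72)  # data length and n.data from the combined file

# Small demonstration of the same base-R behaviour: 10 values in 3 columns
# gives 4 rows, with values recycled and the same kind of warning raised
# (suppressed here so the example runs quietly).
m <- suppressWarnings(matrix(1:10, ncol = 3))
print(c(rows_two_files, rows_one_file))
print(dim(m))
```

Under this reading, the 750 in the earlier warnings matching the lab scientist's figure of 750 genes could be a coincidence of 26994 / 36 rounding up to 750.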
I returned to trying "n.features = 750" and the command appears to
work! I am quite confused as to what is going on here and would very
much appreciate any help regarding:
- which 750 of the 818 are being picked up, why and how
- how samples are distinguished from one another (does the method read
  the "SampleInfo.SampleID" column, or should the files be pre-ordered
  by sample?)
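Whichever way the reader handles sample IDs, it is cheap to check whether a file is already ordered by sample. The sketch below is base R only and makes no claim about what readCtData does internally: check_sample_blocks is a hypothetical helper, and the toy data frame stands in for the real export. It tests whether each sample ID occupies one contiguous block of rows and whether all blocks have the same length, which is what a column-by-column fill of a matrix would need.

```r
# Hypothetical helper: are the sample IDs grouped into contiguous, equal-sized blocks?
check_sample_blocks <- function(ids) {
  runs <- rle(as.character(ids))
  list(
    contiguous = !any(duplicated(runs$values)),   # each ID appears in one run only
    balanced   = length(unique(runs$lengths)) == 1 # every run has the same length
  )
}

# Toy stand-in for the exported table (hypothetical values)
toy <- data.frame(
  SampleInfo.SampleID = rep(c("S1", "S2", "S3"), each = 4),
  Ct = seq(20, 31),
  stringsAsFactors = FALSE
)

# Same rows, but interleaved by feature instead of grouped by sample
interleaved <- toy[order(rep(1:4, times = 3)), ]

ordered_check     <- check_sample_blocks(toy$SampleInfo.SampleID)
interleaved_check <- check_sample_blocks(interleaved$SampleInfo.SampleID)
print(ordered_check)
print(interleaved_check)
```

If the real file fails the contiguous check, sorting it by sample ID before import (while keeping the feature order within each sample) would be one way to pre-order it.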
I am also a little confused about how one associates samples with
classifications e.g. case and control. It seems that most of the
methods utilising this info use "groups = files$Treatment", but I
don't seem to be able to find a description of the format of this
file. Is "phenoData" meant to contain similar info? Is the "phenoData"
important for standard usage of the package or is this an additional
helpful data structure?
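To make the question concrete, here is a minimal sketch of what such a sample table could look like. It assumes that the "files" object in examples using "groups = files$Treatment" is an ordinary data frame with one row per sample and a Treatment column; the post does not confirm that format. All names below (sample_sheet, ct_columns, the sample IDs) are hypothetical, and the code is base R only.

```r
# Hypothetical sample sheet: one row per sample, with its classification
sample_sheet <- data.frame(
  SampleID  = c("S1", "S2", "S3", "S4"),
  Treatment = c("case", "control", "case", "control"),
  stringsAsFactors = FALSE
)

# Hypothetical column order of the imported Ct data
ct_columns <- c("S3", "S1", "S4", "S2")

# Reorder the sheet so that row i describes column i of the Ct data
aligned <- sample_sheet[match(ct_columns, sample_sheet$SampleID), ]
stopifnot(!anyNA(aligned$SampleID))  # every column must have an annotation row

# A grouping vector in column order, usable wherever a 'groups' argument is expected
groups <- factor(aligned$Treatment, levels = c("control", "case"))
print(data.frame(column = ct_columns, group = groups))
```

The important property is that the grouping vector has one entry per sample, in the same order as the columns of the data, however that table ends up being stored.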
Any help would be very much appreciated,
Scott
-- output of sessionInfo():
> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] HTqPCR_1.12.0 limma_3.14.1 RColorBrewer_1.0-5 Biobase_2.18.0
[5] BiocGenerics_0.4.0
loaded via a namespace (and not attached):
[1] affy_1.36.0 affyio_1.26.0 BiocInstaller_1.8.3
[4] gdata_2.12.0 gplots_2.11.0 gtools_2.7.0
[7] preprocessCore_1.20.0 stats4_2.15.2 tools_2.15.2
[10] zlibbioc_1.4.0
--
Sent via the guest posting facility at bioconductor.org.