Question: Does anyone have experience using the massiR package?
9 weeks ago, Steve Lowe wrote:

I am having trouble aligning the probes (built into the package) to my dataset. I am concerned my dataset is not formatted correctly or there is some step I am missing. Does anyone have experience with massiR and willing to lend a hand?

Thanks!

Tags: microarray, probe, R, massir

I use this library occasionally. If I recall correctly, I had to change a few lines of code to get it working. I also do not use the built-in sets of Y-chromosome probes; instead I extract them from an OrgDb. It would be helpful if you could post some code that shows where you get stuck.

library(massiR)
set8 <- read.table(file.choose()) ## the file is a matrix from GEO, I did not change anything about this file
data(y.probes)
names(y.probes)
[1] "illuminahumanwg6v1" "illuminahumanwg6v2" "illuminahumanwg6v1"
[4] "illuminahumanht12" "affyhugene10stv1" "affyhgu133plus2"
probes <- data.frame(y.probes["affyhgu133plus2"]) ## GEO names this platform
massi.y.out <- massiy(set8, probes)
massiy_plot(massi.y.out)
Error in plot.window(xlim, ylim, log = log, ...) : need finite 'xlim' values
In addition: Warning messages:
1: In min(w.l) : no non-missing arguments to min; returning Inf
2: In max(w.r) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf

Your lines of code worked fine for me when I started directly from a set of raw data (CEL) files, which suggests that your input data (i.e. set8) is the problem. Assuming you would like to use the expression data in the Series Matrix File available at GEO, I would recommend importing the data with functions from the GEOquery library rather than with read.table.
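A quick way to check whether the input is the problem is to count how many Y-chromosome probe IDs actually occur among the row names of the expression data. The sketch below uses only base R and small made-up tables; the object names (expr_no_ids, expr_with_ids, toy_probes, count_matches) are hypothetical, and it assumes that massiR matches probes by row name, which is not confirmed in this thread.

```r
# Toy stand-ins (hypothetical data, not taken from the thread).
# Case 1: a table read without row names, so rows are labelled "1", "2", ...
expr_no_ids <- data.frame(
  ID_REF    = c("201909_at", "1007_s_at", "1053_at"),
  GSM000001 = c(9.1, 7.3, 5.2),
  GSM000002 = c(3.4, 7.1, 5.0)
)

# Case 2: the same values, with the probe IDs moved into the row names.
expr_with_ids <- expr_no_ids[, -1]
rownames(expr_with_ids) <- expr_no_ids$ID_REF

# Hypothetical Y-probe table with probe IDs as row names
# (assumed to mirror how the y.probes tables are organised).
toy_probes <- data.frame(row.names = c("201909_at", "204409_s_at"))

# Count how many Y probes can be found among the row names of a data set.
count_matches <- function(expr, probes) {
  length(intersect(rownames(expr), rownames(probes)))
}

n_no_ids   <- count_matches(expr_no_ids, toy_probes)    # no probe IDs in row names
n_with_ids <- count_matches(expr_with_ids, toy_probes)  # one Y probe found
```

If the count is zero for the real data, no Y probes can be matched, and an empty selection would be consistent with the "need finite 'xlim' values" error. Making sure the probe IDs end up as row names, or importing the file with GEOquery, would be the next thing to try.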

Here is an example using a data set from an hgu133plus2-based experiment from GEO (GSE20986; platform GPL570) that I had lying around. Note that I manually downloaded the (compressed) Series Matrix File to my computer and put it in my working directory:

> library(GEOquery)
> library(massiR)
> set8 <- getGEO( filename = "GSE20986_series_matrix.txt.gz")
> set8
ExpressionSet (storageMode: lockedEnvironment)
assayData: 54675 features, 12 samples
element names: exprs
protocolData: none
phenoData
sampleNames: GSM524662 GSM524663 ... GSM524673 (12 total)
varLabels: title geo_accession ... tissue:ch1 (34 total)
featureData
featureNames: 1007_s_at 1053_at ... AFFX-TrpnX-M_at (54675 total)
fvarLabels: ID GB_ACC ... Gene Ontology Molecular Function (16 total)
fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
pubMedIds: 22028475
Annotation: GPL570
>
> data(y.probes)
> probes <- data.frame(y.probes["affy_hg_u133_plus_2"])
> massi.y.out <- massi_y(set8, probes)
> massi_y_plot(massi.y.out)
>
> # Use the 50% most variable probes (threshold = 3, the default) to identify the sex of each sample
> massi.select.out <- massi_select(set8, probes, threshold=3)
>
> # Now predict the sex of the samples using massi_cluster
> results <- massi_cluster(massi.select.out)
>
> # Extract the results for each sample from the returned list:
> sample.results <- data.frame(results[[2]])
> head(sample.results)
         ID mean_y_probes_value y_probes_sd     z_score    sex
1 GSM524662            4.175512    2.936009  0.20297740   male
2 GSM524663            3.851902    2.645961 -0.31148554   male
3 GSM524664            3.020613    1.878319 -0.70793786 female
4 GSM524665            3.052636    1.878140 -0.29445062 female
5 GSM524666            3.951170    2.777243 -0.07732056   male
6 GSM524667            4.158953    2.975387  0.22673768   male
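Once the per-sample table exists, a short summary makes it easy to compare the predictions with the sample annotation. The sketch below is base R only and rebuilds the six rows printed above by hand so that it runs on its own; in a real session sample.results would come from massi_cluster, and the names sex.counts and ids.by.sex are hypothetical.

```r
# The six rows of sample.results shown above, re-entered by hand.
sample.results <- data.frame(
  ID = c("GSM524662", "GSM524663", "GSM524664",
         "GSM524665", "GSM524666", "GSM524667"),
  mean_y_probes_value = c(4.175512, 3.851902, 3.020613,
                          3.052636, 3.951170, 4.158953),
  y_probes_sd = c(2.936009, 2.645961, 1.878319,
                  1.878140, 2.777243, 2.975387),
  z_score = c(0.20297740, -0.31148554, -0.70793786,
              -0.29445062, -0.07732056, 0.22673768),
  sex = c("male", "male", "female", "female", "male", "male"),
  stringsAsFactors = FALSE
)

# Number of samples assigned to each sex.
sex.counts <- table(sample.results$sex)

# Sample IDs grouped by predicted sex, e.g. for checking against phenoData.
ids.by.sex <- split(sample.results$ID, sample.results$sex)
```

For these six samples, sex.counts gives 2 female and 4 male, and ids.by.sex$female holds GSM524664 and GSM524665.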

written 9 weeks ago by Guido Hooiveld

Just got this step to work. Thank you so much for your help!!

Thank you for lending a hand! I really appreciate it

--Steve