I am new to R, and I am sure this is a very trivial question.
I am following the 'RNA-Seq Data Pathway and Gene-set Analysis Workflows' (here: http://bioconductor.org/packages/release/bioc/vignettes/gage/inst/doc/RNA-seqWorkflow.pdf).
At steps 3 and 4 (sections 4.4 and 4.5), there are the following commands:
> library(gage)
> ref.idx=5:8
> samp.idx=1:4
> data(kegg.gs)
> #knockdown and control samples are unpaired
> cnts.kegg.p <- gage(cnts.norm, gsets = kegg.gs, ref = ref.idx, samp = samp.idx, compare ="unpaired")
> #differential expression: log2 ratio or fold change, unpaired samples
> cnts.d= cnts.norm[, samp.idx]-rowMeans(cnts.norm[, ref.idx])
where 'ref.idx' and 'samp.idx' are the numeric vectors of column numbers for the reference and target conditions respectively.
In the example taken in this workflow, the data frame consists of 4 target samples (columns 1 to 4) and 4 reference samples (columns 5 to 8).
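To make the role of the two index vectors concrete, here is a minimal sketch of the same subtraction on a made-up matrix. The object 'toy.norm', its values, and its column names are hypothetical stand-ins for 'cnts.norm', not data from the workflow.

```r
# Toy stand-in for cnts.norm: 3 genes x 8 samples (values are made up)
set.seed(1)
toy.norm <- matrix(rnorm(24, mean = 8), nrow = 3,
                   dimnames = list(paste0("gene", 1:3),
                                   c(paste0("KD", 1:4), paste0("CTRL", 1:4))))
samp.idx <- 1:4   # target (knockdown) columns
ref.idx  <- 5:8   # reference (control) columns

# Each target column minus the per-gene mean of the reference columns.
# The row-mean vector has one entry per gene, so it is recycled down
# every column of the target sub-matrix.
toy.d <- toy.norm[, samp.idx] - rowMeans(toy.norm[, ref.idx])
dim(toy.d)
```

With several columns on each side, both subsets stay matrices, so 'rowMeans' and the subtraction behave as the workflow expects.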
But my data only contains 1 reference sample (column 1) and 1 target sample (column 2). So I get 'ref.idx' and 'samp.idx' by pattern matching on the column names as follows:
> cn=colnames(cnts)
> ref.idx=grep('REF',cn,ignore.case=T)
> samp.idx=grep('SAMP',cn,ignore.case=T)
> ref.idx
[1] 1
> samp.idx
[1] 2
Since gage expects a numeric vector, the last command below returns an error:
> exp.d = exp.fc[, samp.idx]-rowMeans(exp.fc[, ref.idx])
Error in exp.fc[, samp.idx] : incorrect number of dimensions
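For reference, a minimal sketch of two ways single-column data can trip up this kind of expression. The objects 'toy' and 'vec' are hypothetical. How 'exp.fc' was built is not shown, so it is only an assumption that one of these cases applies to it.

```r
# Toy 2-column matrix standing in for the real data (values are made up)
toy <- matrix(c(5, 7, 9, 4, 8, 6), nrow = 3,
              dimnames = list(paste0("gene", 1:3), c("REF", "SAMP")))
ref.idx  <- 1
samp.idx <- 2

# Case 1: selecting a single column drops the result to a plain vector
is.null(dim(toy[, ref.idx]))

# rowMeans() needs at least two dimensions, so keep the matrix shape
ref.mean <- rowMeans(toy[, ref.idx, drop = FALSE])
toy.d <- toy[, samp.idx, drop = FALSE] - ref.mean

# Case 2: if the object itself is already a plain vector, any [, j]
# indexing fails, and drop = FALSE cannot help
vec <- c(gene1 = 0.5, gene2 = -1.2)
msg <- tryCatch(vec[, 1], error = function(e) conditionMessage(e))
msg
```

If 'exp.fc' turns out to be a vector, for instance one fold-change value per gene, the difference between target and reference may already be contained in it, but that depends on how it was computed.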
I tried that, but it returned the same error. Anyway, I used the edgeR fork to bypass the 'exp.d' calculation, and it works now. Thanks!