Issue with RUVs
@asifzubair-6770

Hi, 

I am trying to use RUVs to normalise my data. I have 12 samples which are replicated in pairs. 

i.e. I can specify my differences like this:

differences = matrix(data= seq(1,12,by=1), byrow=TRUE, nrow=6)

where each row holds the column indices of one pair of biological replicates.

I create a "counts" matrix which is n X 12 matrix of the read counts of my genes. I create a vector "geneNames" which lists all the genes in the count matrix. 

I call the RUVs function as such:

RUVs(counts, geneNames, k=1, differences)

However, I am getting the following error:

Error in Y[scIdx[ii, jj], , drop = FALSE] : 
  no 'dimnames' attribute for array

Please note my "counts" matrix is not a data.frame but simply a numeric matrix. 

I then tried using the newSeqExpressionSet function:

x = as.factor(rep(c(1,2,3,4,5,6), each=2))

set = newSeqExpressionSet(as.matrix(counts), phenoData = data.frame(x,row.names=geneNames))

and run 

RUVs(set, genes, k=1, differences)

However, I get the following error:

Error in Y[scIdx[ii, jj], , drop = FALSE] : subscript out of bounds

Could someone suggest how I should modify my commands to make this work?

Thank you!

davide risso ▴ 980
@davide-risso-5075
University of Padova

Hi Asif,

My guess is that your count matrix doesn't have the "rownames" attribute. You should give your matrix row names that match the gene IDs you use in the "genes" vector. Alternatively, you can identify the genes with a numeric index instead of character strings.
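A minimal sketch of that fix, assuming simulated counts in place of the real data. The gene and sample names are hypothetical, and the RUVs call only runs if RUVSeq is installed:

```r
# Simulated stand-in for the real data: 100 genes x 12 samples (hypothetical values)
set.seed(1)
counts <- matrix(rnbinom(100 * 12, mu = 50, size = 2), nrow = 100, ncol = 12)
geneNames <- paste0("gene", seq_len(nrow(counts)))

# Without row names, selecting genes by character ID raises an error
res <- tryCatch(counts[geneNames[1:3], , drop = FALSE], error = function(e) e)

# Fix: attach the gene IDs as row names (sample names are optional but helpful)
rownames(counts) <- geneNames
colnames(counts) <- paste0("sample", seq_len(ncol(counts)))

# Each row holds the column indices of one pair of replicates
differences <- matrix(seq_len(12), byrow = TRUE, nrow = 6)

if (requireNamespace("RUVSeq", quietly = TRUE)) {
  # cIdx: genes used to estimate the unwanted factors; scIdx: replicate groups
  ruv <- RUVSeq::RUVs(counts, cIdx = geneNames, k = 1, scIdx = differences)
  str(ruv$W)
}
```

Passing `cIdx = seq_len(nrow(counts))` instead of `geneNames` would be the numeric-index alternative, which does not depend on row names at all.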

I hope this helps.

davide


Hi Davide,

Thank you, it worked! Can I use my normalized counts directly in a downstream analysis with, say, edgeR?

Best, 

Asif 


No. We don't recommend using the normalized counts (except for exploration/visualization). You should include the estimated factors (the W matrix) in edgeR's GLM design matrix (see the RUVSeq vignette for a detailed example).
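As a rough sketch of what that could look like for a 6-group, 2-replicate layout like the one in this thread: the counts are simulated, the names `group` and `W_1` are illustrative, and a random placeholder stands in for W if RUVSeq is not installed. The RUVSeq vignette remains the reference for the recommended workflow.

```r
# Simulated stand-in data: 200 genes x 12 samples (hypothetical values)
set.seed(1)
counts <- matrix(rnbinom(200 * 12, mu = 50, size = 2), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("sample", 1:12)))
group <- factor(rep(1:6, each = 2))
differences <- matrix(seq_len(12), byrow = TRUE, nrow = 6)

have_ruv <- requireNamespace("RUVSeq", quietly = TRUE)
have_edger <- requireNamespace("edgeR", quietly = TRUE)

# Estimated factor of unwanted variation (random placeholder if RUVSeq is missing)
W <- if (have_ruv) {
  RUVSeq::RUVs(counts, rownames(counts), k = 1, differences)$W
} else {
  matrix(rnorm(12), ncol = 1)
}
W_1 <- W[, 1]

# The unwanted factor enters the model as a covariate next to the group effect
design <- model.matrix(~ group + W_1)

if (have_edger) {
  # Note: the original counts go into edgeR, not the RUV-normalized counts
  y <- edgeR::DGEList(counts = counts, group = group)
  y <- edgeR::calcNormFactors(y, method = "upperquartile")
  y <- edgeR::estimateDisp(y, design)
  fit <- edgeR::glmFit(y, design)
  lrt <- edgeR::glmLRT(fit, coef = 2)  # group 2 vs group 1, adjusted for W_1
  print(edgeR::topTags(lrt))
}
```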

@asifzubair-6770

I plotted the PCA of my RUVs-normalized counts, but it doesn't show anything sensible. When I instead normalize using DESeq's size factor estimation, I get sensible results. The RUVs call I made is below:

ruv_de = RUVs(data.matrix(de), genes, k=1, differences)

x=rep(c(1,2,3,4,5,6),2)

EDASeq::plotPCA(data.matrix(ruv_de$normalizedCounts), col=colors[x], xlim=c(-0.5,0.5), cex=0.90)

Is there something I am doing wrong?


What do you mean by sensible? There are some choices you have to make: for instance, you can try different values of k.

It's hard to comment on this without knowing more about your experiment. If you're happy with the DESeq size factors, it may be that your dataset doesn't need any normalization beyond scaling for sequencing depth, in which case you should stick with DESeq normalization. But again, it's hard to tell without looking at the data.
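One way to explore the choice of k is to compare PCA plots of the normalized counts across a few values. The following is only a sketch on simulated counts; `pca_scores` is a hypothetical helper built on base R's `prcomp`, not part of RUVSeq or EDASeq, and the loop only runs if RUVSeq is installed.

```r
# Simulated stand-in data: 200 genes x 12 samples (hypothetical values)
set.seed(1)
counts <- matrix(rnbinom(200 * 12, mu = 50, size = 2), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("sample", 1:12)))
group <- factor(rep(1:6, each = 2))
differences <- matrix(seq_len(12), byrow = TRUE, nrow = 6)

# Hypothetical helper: first two principal components of log-transformed counts
pca_scores <- function(mat) {
  pc <- prcomp(t(log1p(mat)), center = TRUE, scale. = FALSE)
  pc$x[, 1:2]
}

# Baseline: PCA of the unnormalized counts
baseline <- pca_scores(counts)

if (requireNamespace("RUVSeq", quietly = TRUE)) {
  for (k in 1:3) {
    norm <- RUVSeq::RUVs(counts, rownames(counts), k = k, differences)$normalizedCounts
    sc <- pca_scores(norm)
    # Replicates of the same group share a colour; check whether they sit together
    plot(sc, col = as.integer(group), pch = 19, main = paste("RUVs, k =", k))
  }
}
```

The normalized counts are used here for visualization only, in line with the advice earlier in the thread.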

 


Yes, I think you are right. 

By sensible I meant that the replicates don't cluster together as I would expect. However, when I simply use the DESeq size factors, they do.

I like the RUVSeq approach and so wanted to try it on my dataset as well. I found the counterintuitive argument about spike-in genes in your paper particularly interesting.

I am currently using only part of the dataset. Perhaps this normalisation will be of more use when I use the whole dataset?

Thank you for your help. Appreciate it. 
