Question

Orthogonal regression (edgeR, DESeq, limma or others)?


Panos Bolan ▴ 20

@panos-bolan-6456

Last seen 9.6 years ago

Dear list,

I am a postdoc in Bioinformatics, working on gene/gene regulation using RNA-seq data. I would like to find the associations for a set of gene pairs that my collaborator sent me. I have 1000 such pairs whose counts are measured for 400 samples. One way to do it would be by simple correlations (Spearman correlation of the CPMs) or by using limma (faster than edgeR and DESeq for this task) and modelling the voom-transformed data as Gene1 ~ Gene2.

The problem I see with the 'correlations' solution is that it's a very simple model that does not take into account the dispersion of the data, while limma, edgeR or others would possibly give different answers for Gene1 ~ Gene2 and Gene2 ~ Gene1, so it would be confusing if I wanted to estimate a bootstrap P-value of significance.

I would like to ask if there is any model that uses orthogonal regression for RNA-seq data (assuming that all measurements come with error and that the error variances are equal).

Thank you,
Pan
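To make the asymmetry concern concrete, here is a minimal base-R sketch on simulated data (the variables `truth`, `x` and `y` are hypothetical and are not RNA-seq counts). It assumes equal error variances in both variables, as the question does, and takes the first principal component from `prcomp()` as the orthogonal-regression line. It also shows that, for an unweighted simple regression, the slope t-statistic is the same in either direction even though the slopes differ; this need not hold once voom weights are used.

```r
set.seed(42)
n <- 400
truth <- rnorm(n)                    # latent signal shared by both variables
x <- truth + rnorm(n, sd = 0.5)      # both variables measured with error,
y <- 2 * truth + rnorm(n, sd = 0.5)  # with equal error variance (assumption)

# Ordinary least squares depends on which variable is the response
fit_yx <- lm(y ~ x)
fit_xy <- lm(x ~ y)
b_yx <- coef(fit_yx)[["x"]]
b_xy <- coef(fit_xy)[["y"]]

# The slopes are not reciprocals of each other: their product is r^2
slope_product <- b_yx * b_xy
r <- cor(x, y)

# The unweighted slope t-statistics agree in both directions
t_yx <- summary(fit_yx)$coefficients["x", "t value"]
t_xy <- summary(fit_xy)$coefficients["y", "t value"]

# Orthogonal regression: slope given by the first principal component
pc1 <- prcomp(cbind(x, y))$rotation[, 1]
b_orth_yx <- pc1[["y"]] / pc1[["x"]]

# Swapping the variables gives exactly the reciprocal slope
pc1_swapped <- prcomp(cbind(y, x))$rotation[, 1]
b_orth_xy <- pc1_swapped[["x"]] / pc1_swapped[["y"]]
```

So the orthogonal fit is symmetric in the two variables, while the two least-squares slopes differ. The significance of the unweighted slope is nevertheless the same in both directions, and it matches what `cor.test(x, y)` would report for the Pearson correlation.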

limma edgeR DESeq • 1.3k views

updated 10.1 years ago by Gordon Smyth 50k • written 10.1 years ago by Panos Bolan ▴ 20

score 0 · Answer 1 · 2014-03-22

Dear Pan,

Orthogonal regression doesn't seem very relevant for your problem, for one thing because the error variances aren't equal.

There are many ways to correlate expression values. One could easily use the voom log-cpm values and weights if one had voom-transformed data, but this would require a design matrix to be defined.

Another easy way is

    library(edgeR)
    logCPM <- cpm(y, log=TRUE, prior.count=4)

where y is your count matrix or DGEList object. Then

    design <- matrix(1, ncol(y), 1)
    interGeneCorrelation(logCPM[i,], design)

would compute the correlation between any set of genes selected by 'i'. This could be a pair of genes, or it could be more than 2 genes.

Best wishes
Gordon

> Date: Thu, 20 Mar 2014 20:40:13 +0900
> From: Panos Bolan <panbolan at hotmail.com>
> To: "bioconductor at r-project.org" <bioconductor at r-project.org>
> Subject: [BioC] Orthogonal regression (edgeR, DESeq, limma or others)?
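For readers who want to try the recipe above end to end, here is a minimal sketch on simulated counts. The count matrix, the gene names (`gene1`, `gene2`, ...) and the negative binomial parameters are all made up for illustration. The sketch assumes that edgeR and limma are installed and reads the result from the `correlation` element of the list returned by `interGeneCorrelation()`.

```r
library(edgeR)  # provides DGEList() and cpm()
library(limma)  # provides interGeneCorrelation()

set.seed(1)
n_genes <- 200
n_samples <- 400

# Hypothetical data: gene1 and gene2 share a latent factor, the rest do not
shared <- rnorm(n_samples)
mu <- matrix(100, n_genes, n_samples)
mu[1, ] <- 100 * exp(0.8 * shared)
mu[2, ] <- 100 * exp(0.8 * shared)
counts <- matrix(
  rnbinom(n_genes * n_samples, mu = mu, size = 10),
  n_genes, n_samples,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("s", seq_len(n_samples)))
)

y <- DGEList(counts)
logCPM <- cpm(y, log = TRUE, prior.count = 4)

# Intercept-only design: no covariates are adjusted for
design <- matrix(1, ncol(y), 1)

# Correlation for one gene pair after removing the design effects
i <- c("gene1", "gene2")
fit <- interGeneCorrelation(logCPM[i, ], design)
pair_cor <- fit$correlation

# For two genes and an intercept-only design this should match Pearson's r
pearson <- cor(logCPM["gene1", ], logCPM["gene2", ])

# Rank-based alternative mentioned in the question
spearman <- cor(logCPM["gene1", ], logCPM["gene2", ], method = "spearman")
```

With 1000 pairs, the same call could be wrapped in a loop or `sapply()` over a two-column table of gene names. Adding columns to `design` (for example batch) would remove those effects before the correlation is computed.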