Regularization of ballgown object on Limma
1
0
Entering edit mode
@komuracom16-22770
Last seen 4.8 years ago

Hi,

I'm doing transcription level analysis with my RNA-seq data while seeing nature protocol(https://www.nature.com/articles/nprot.2016.095).

I did alignment, assembly and quantification expressed genes and transcripts with HISAT2 and Stringtie. However, I found my sample sizes was small (n < 4 per group) as mentioned in nature protocol.

Note that Ballgown’s statistical test is a standard linear model based comparison. For small sample sizes (n < 4 per group) it is often better to perform regularization. This can be done using the limma33 package in Bioconductor.

Therefore, I did regularization on Limma, but I'm not sure if this is correct. So, I want to ask everyone to confirm my command below.

#use data in nature protocal
#create ballgown instance
pheno_data = read.csv("chrX_data/geuvadis_phenodata.csv")
bg_chrX = ballgown(dataDir = "ballgown", samplePattern = "ERR", pData=pheno_data)
bg_chrX_filt = subset(bg_chrX,"rowVars(texpr(bg_chrX)) >1",genomesubset=TRUE)

#make design for limma
design <- model.matrix(~0+sex+population, data=pheno_data)
colnames(design) <- gsub("sex", "", colnames(design))

#regularization on limma
v <- voom(texpr(bg_chrX_filt), design, plot=TRUE)
vfit <- lmFit(v, design)
efit <- eBayes(vfit)
plotSA(efit, main="Final model: Mean-variance trend")
contr <- makeContrasts(female - male, levels = colnames(coef(efit)))
tmp <- contrasts.fit(efit, contr)
tmp <- eBayes(tmp)
top.table <- topTable(tmp, sort.by = "P", n = Inf)

#see DEGs
head(top.table, 10)

I referred to those sites; https://github.com/alyssafrazee/ballgown

https://ucdavis-bioinformatics-training.github.io/2018-June-RNA-Seq-Workshop/thursday/DE.html

https://www.bioconductor.org/packages/devel/workflows/vignettes/RNAseq123/inst/doc/limmaWorkflow.html#normalising-gene-expression-distributions

Thanks!

limma ballgown RNA-seq • 1.1k views
ADD COMMENT
1
Entering edit mode
@gordon-smyth
Last seen 1 hour ago
WEHI, Melbourne, Australia

I am the limma author. I am not familiar with ballgown output so I can't advise you on how to analyse it, but there are several aspects of your question and code that fill me with concern:

  1. limma is designed to analyse either gene level expression or exon level expression but not transcript-level
  2. limma is not designed to analyse FPKM values
  3. limma is incompatible with variance filtering.
  4. voom in particular is only designed to analyse counts, not any other type of expression data..

There has been a lot written about all these points, so I won't repeat it all here. The various limma workflows show how the use limma with RNA-seq data.

ADD COMMENT
0
Entering edit mode

Thanks Gordon Smyth! I try to use limma based on counts.

ADD REPLY

Login before adding your answer.

Traffic: 685 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6