Question: PCA from TPM
23 months ago, tanyabioinfo wrote:


I am trying to run a PCA on my samples. I generated the matrix using the tximport package; the rows are transcript IDs and the columns are sample names.

txi <- tximport(files, type="salmon", tx2gene=NULL, ignoreTxVersion=TRUE,dropInfReps=TRUE,txOut = TRUE)
tpm <- (txi$abundance[apply(txi$abundance, MARGIN = 1, FUN = function(x) sd(x) != 0),])

tpm = log2(tpm + 1)
tpm_centered <- t(tpm-rowMeans(tpm))

pca = prcomp(tpm_centered , scale=TRUE, center=TRUE)

cols <- as.factor(as.numeric(colnames(tpm_centered)))

plot(pca$x[,1],pca$x[,2], xlab = "PC1", ylab = "PC2",main ="PCA replicate1", col =cols)

text(pca$x[,1],pca$x[,2], row.names(pca$x), cex=0.5, pos=3)

I have a couple of questions.

1. Is it a good idea to generate the PCA plot from txi$abundance?

2. I am unable to get the points colored by sample.

Can someone please help me?




Hi Michael

I am now using the following code:

txi <- tximport(files, type="salmon", tx2gene=tx2gene, ignoreTxVersion=TRUE,dropInfReps=TRUE)
sampleTable <- data.frame(condition =samples$condition,time=factor(samples$time))
rownames(sampleTable) <- colnames(txi$counts)
dds <- DESeqDataSetFromTximport(txi, sampleTable, ~condition+time)
dds <- dds[ rowSums(counts(dds)) > 1, ]
rld <- rlog(dds, blind = FALSE)
plotPCA(rld, intgroup = c("condition", "time"))

My conditions are wild type and mutant, and the time points are 0 hr, 6 hr, and 24 hr, with 4 replicates for each combination. In the PCA plot the replicates do not group together. Do you think this is normal, or am I making a mistake?





written 23 months ago by tanyabioinfo

Make sure that the files and the sample table are in the same order. This is very important.
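One way to guard against a mismatch is to name the quantification files by sample ID and compare those names with the sample table before building the dataset. The following is only a minimal sketch: the `run` column, the sample IDs, and the `quants` directory are hypothetical and do not appear in this thread.

```r
# Hypothetical sample table; in practice this would be read from a file.
samples <- data.frame(
  run       = c("s1", "s2", "s3", "s4"),
  condition = factor(c("wt", "wt", "mut", "mut")),
  time      = factor(c("0hr", "6hr", "0hr", "6hr"))
)

# Build the file paths from the sample table itself, so the order
# matches by construction. The "quants" directory is an assumption.
files <- file.path("quants", samples$run, "quant.sf")
names(files) <- samples$run  # named files give named columns after import

sampleTable <- samples[, c("condition", "time")]
rownames(sampleTable) <- samples$run

# Stop early if the two orderings ever diverge.
stopifnot(identical(names(files), rownames(sampleTable)))
```

After the import, the same comparison can be made with identical(colnames(txi$counts), rownames(sampleTable)). Assigning colnames(txi$counts) to the row names of the sample table, by contrast, would hide a mismatch rather than detect it.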

written 23 months ago by Michael Love
Answer: PCA from TPM
23 months ago, Michael Love (United States) wrote:

I would recommend generating the PCA plot from the normalized, transformed counts. There are statistical reasons to prefer variance-stabilized measurements, and the normalization accounts for sequencing depth as well as any biases that were corrected by the quantification software.

This would look like:

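# colData and design are required arguments; sampleTable is assumed to hold
# one row per sample, in the same order as files
dds <- DESeqDataSetFromTximport(txi, colData = sampleTable, design = ~ condition)
vsd <- vst(dds)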
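For the coloring part of the question, a transformed matrix such as assay(vsd) can be passed to prcomp and the points colored from a per-sample factor. Here is a rough sketch in base R only, using simulated values as a stand-in for the real matrix; the group labels, sample names, and colors are made up for illustration.

```r
set.seed(1)

# Stand-in for assay(vsd): 200 features x 6 samples of simulated values.
# With real data this matrix would come from the transformed object.
mat <- matrix(rnorm(200 * 6), nrow = 200,
              dimnames = list(paste0("tx", 1:200), paste0("sample", 1:6)))

# Hypothetical grouping, one entry per column of mat, in the same order.
group <- factor(c("wt", "wt", "wt", "mut", "mut", "mut"))

# Shift the mutant samples on the first 50 features so the groups separate.
mat[1:50, group == "mut"] <- mat[1:50, group == "mut"] + 3

# prcomp expects samples in rows, so transpose; prcomp does the centering.
pca <- prcomp(t(mat), center = TRUE, scale. = FALSE)
pct <- round(100 * pca$sdev^2 / sum(pca$sdev^2), 1)

# One color per sample, looked up by group label.
palette_cols <- c(mut = "firebrick", wt = "steelblue")
point_cols <- palette_cols[as.character(group)]

plot(pca$x[, 1], pca$x[, 2], col = point_cols, pch = 19,
     xlab = paste0("PC1 (", pct[1], "%)"),
     ylab = paste0("PC2 (", pct[2], "%)"))
text(pca$x[, 1], pca$x[, 2], rownames(pca$x), cex = 0.5, pos = 3)
legend("topright", legend = names(palette_cols), col = palette_cols, pch = 19)
```

In the code from the question, the colors come from as.numeric(colnames(tpm_centered)). After the transpose those column names are transcript IDs, not samples, so the conversion presumably returns NA and no usable colors result. Building the colors from a factor with one entry per sample avoids this. With a DESeq2 object, plotPCA(vsd, intgroup = "condition") handles the coloring directly.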