Question: PCA from TPM
19 months ago by
tanyabioinfo20 wrote:


I am trying to run a PCA on my samples. I generated the abundance matrix with the tximport package; transcript IDs are the rows and sample names are the columns.

txi <- tximport(files, type="salmon", tx2gene=NULL, ignoreTxVersion=TRUE,dropInfReps=TRUE,txOut = TRUE)
tpm <- (txi$abundance[apply(txi$abundance, MARGIN = 1, FUN = function(x) sd(x) != 0),])

tpm = log2(tpm + 1)
tpm_centered <- t(tpm-rowMeans(tpm))

pca = prcomp(tpm_centered , scale=TRUE, center=TRUE)

cols <- as.factor(as.numeric(colnames(tpm_centered)))

plot(pca$x[,1],pca$x[,2], xlab = "PC1", ylab = "PC2",main ="PCA replicate1", col =cols)

text(pca$x[,1],pca$x[,2], row.names(pca$x), cex=0.5, pos=3)

I have a couple of questions.

1. Is it a good idea to generate the PCA plot from txi$abundance?

2. I am unable to get the points colored by sample.

Can someone please help me?




Hi Michael

I am now using the following code:

txi <- tximport(files, type="salmon", tx2gene=tx2gene, ignoreTxVersion=TRUE,dropInfReps=TRUE)
sampleTable <- data.frame(condition =samples$condition,time=factor(samples$time))
rownames(sampleTable) <- colnames(txi$counts)
dds <- DESeqDataSetFromTximport(txi, sampleTable, ~condition+time)
dds <- dds[ rowSums(counts(dds)) > 1, ]
rld <- rlog(dds, blind = FALSE)
plotPCA(rld, intgroup = c("condition", "time"))

My conditions are wild type and mutant, and the time points are 0 hr, 6 hr, and 24 hr, with 4 replicates for each combination. In the PCA plot the replicates do not group together. Do you think this is normal, or am I making a mistake?





written 19 months ago by tanyabioinfo20

Make sure that the files and the sample table are in the same order. This is very important.
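
A minimal sketch of one way to check this, in base R. The file paths, sample IDs, and the `samples` data frame below are made up for illustration; substitute your own. It relies on tximport using names(files) as the column names of its output matrices, so naming `files` by sample ID gives you something to compare against.

```r
# Hypothetical quant file paths, named by sample ID.
# tximport uses names(files) as the column names of txi$counts.
files <- c(WT_0h_1  = "quants/WT_0h_1/quant.sf",
           WT_0h_2  = "quants/WT_0h_2/quant.sf",
           MUT_0h_1 = "quants/MUT_0h_1/quant.sf",
           MUT_0h_2 = "quants/MUT_0h_2/quant.sf")

# Hypothetical sample sheet that happens to be in a different order.
samples <- data.frame(
  id        = c("MUT_0h_1", "MUT_0h_2", "WT_0h_1", "WT_0h_2"),
  condition = c("mutant", "mutant", "wildtype", "wildtype"),
  time      = c("0", "0", "0", "0"),
  stringsAsFactors = FALSE
)

# Every file must match exactly one row of the sample sheet.
stopifnot(setequal(names(files), samples$id), !anyDuplicated(samples$id))

# Reorder the sample sheet to follow the files, then label rows by ID.
sampleTable <- samples[match(names(files), samples$id), ]
rownames(sampleTable) <- sampleTable$id

# TRUE means the rows of sampleTable line up with the columns tximport will produce.
order_ok <- identical(rownames(sampleTable), names(files))
print(order_ok)
```

Note that `rownames(sampleTable) <- colnames(txi$counts)` only relabels the rows; it does not reorder them, so it cannot reveal a mismatch on its own.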

written 19 months ago by Michael Love23k
Answer: PCA from TPM
19 months ago by
Michael Love23k
United States
Michael Love23k wrote:

I would recommend generating the PCA plot from normalized, transformed counts. There are statistical reasons to prefer variance-stabilized measurements, and the normalization accounts for sequencing depth as well as any biases that were corrected by the quantification software.

This would look like:

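# sampleTable: one row per sample, in the same order as the columns of txi$counts
dds <- DESeqDataSetFromTximport(txi, colData = sampleTable, design = ~ condition + time)
vsd <- vst(dds)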
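
Once you have a transformed matrix (for example `assay(vsd)`), coloring the PCA points by group can be sketched in base R as below. This is only an illustration under stated assumptions: the matrix is simulated as a stand-in for real transformed data, and the `group` factor, sample names, and colors are made up.

```r
set.seed(1)

# Simulated stand-in for a transformed matrix such as assay(vsd):
# 200 features (rows) x 6 samples (columns), two groups of three.
group <- factor(rep(c("wildtype", "mutant"), each = 3))
mat <- matrix(rnorm(200 * 6, mean = 8), nrow = 200,
              dimnames = list(paste0("tx", 1:200), paste0("s", 1:6)))
# Shift the first 50 features in the mutant samples so the groups differ.
mat[1:50, group == "mutant"] <- mat[1:50, group == "mutant"] + 3

# prcomp expects samples in rows, so transpose; it centers each feature itself.
pca <- prcomp(t(mat), center = TRUE, scale. = FALSE)
pct <- round(100 * pca$sdev^2 / sum(pca$sdev^2))

# Color by group: the factor's integer codes index into the palette.
palette_cols <- c("steelblue", "firebrick")
cols <- palette_cols[as.integer(group)]

plot(pca$x[, 1], pca$x[, 2], col = cols, pch = 19,
     xlab = paste0("PC1 (", pct[1], "%)"),
     ylab = paste0("PC2 (", pct[2], "%)"))
text(pca$x[, 1], pca$x[, 2], rownames(pca$x), cex = 0.7, pos = 3)
legend("topright", legend = levels(group), col = palette_cols, pch = 19)
```

The colors come from a factor describing the samples, not from the matrix's names. In the question's code, `tpm_centered` has already been transposed, so `colnames(tpm_centered)` are transcript IDs, and `as.numeric()` on those strings returns `NA` instead of usable color codes.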