Question: PCA from TPM
2.1 years ago by
tanyabioinfo20 wrote:


I am trying to run a PCA on my samples. I generated the matrix using the tximport package. Transcript IDs are the rows and sample names are the columns.

txi <- tximport(files, type="salmon", tx2gene=NULL, ignoreTxVersion=TRUE,dropInfReps=TRUE,txOut = TRUE)
tpm <- (txi$abundance[apply(txi$abundance, MARGIN = 1, FUN = function(x) sd(x) != 0),])

tpm = log2(tpm + 1)
tpm_centered <- t(tpm-rowMeans(tpm))

pca = prcomp(tpm_centered , scale=TRUE, center=TRUE)

cols <- as.factor(as.numeric(colnames(tpm_centered)))

plot(pca$x[,1],pca$x[,2], xlab = "PC1", ylab = "PC2",main ="PCA replicate1", col =cols)

text(pca$x[,1],pca$x[,2], row.names(pca$x), cex=0.5, pos=3)

I have a couple of questions.

1. Is it a good idea to generate the PCA plot from txi$abundance?

2. I am unable to get the points colored by sample.

Can someone please help me?




Hi Michael

I am now using the following code:

txi <- tximport(files, type="salmon", tx2gene=tx2gene, ignoreTxVersion=TRUE,dropInfReps=TRUE)
sampleTable <- data.frame(condition =samples$condition,time=factor(samples$time))
rownames(sampleTable) <- colnames(txi$counts)
dds <- DESeqDataSetFromTximport(txi, sampleTable, ~condition+time)
dds <- dds[ rowSums(counts(dds)) > 1, ]
rld <- rlog(dds, blind = FALSE)
plotPCA(rld, intgroup = c("condition", "time"))

My conditions are wild type and mutant, and my time points are 0 hr, 6 hr, and 24 hr. I have 4 replicates for each of them. In the PCA plot the replicates are not grouping together. Do you think this is normal, or am I making a mistake?





written 2.1 years ago by tanyabioinfo20

Make sure that the files and the sample table are in the same order. This is very important.
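
One way to enforce that ordering is sketched below in base R. It is only an illustration: the file paths, the sample IDs, and the `sample` column are hypothetical, and it assumes `files` is a named character vector whose names are the sample IDs.

```r
# Hypothetical paths to Salmon output, named by sample ID (all names are assumptions).
files <- c(wt_0h_1  = "quants/wt_0h_1/quant.sf",
           wt_6h_1  = "quants/wt_6h_1/quant.sf",
           mut_0h_1 = "quants/mut_0h_1/quant.sf")

# Hypothetical sample sheet, deliberately listed in a different order than `files`.
samples <- data.frame(
  sample    = c("mut_0h_1", "wt_0h_1", "wt_6h_1"),
  condition = c("mut", "wt", "wt"),
  time      = c("0h", "0h", "6h"),
  stringsAsFactors = FALSE
)

# Reorder the sample sheet so that row i describes files[i].
idx <- match(names(files), samples$sample)
stopifnot(!anyNA(idx))               # every file must have a row in the sheet
samples <- samples[idx, ]
rownames(samples) <- samples$sample

# Fail early if the two are still out of step.
stopifnot(identical(rownames(samples), names(files)))
```

As far as I know, tximport carries the names of `files` through as the column names of `txi$counts`, so the same `identical()` check can also be run against `colnames(txi$counts)` before building the DESeqDataSet.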

written 2.1 years ago by Michael Love26k
Answer: PCA from TPM
2.1 years ago by
Michael Love26k
United States
Michael Love26k wrote:

I would recommend generating the PCA plot from the normalized, transformed counts. There are statistical reasons to prefer variance-stabilized measurements, and the normalization accounts for sequencing depth as well as any biases that were corrected by the quantification software.

This would look like:

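# Schematic call: DESeqDataSetFromTximport() also requires colData and design
dds <- DESeqDataSetFromTximport(txi)
# Variance stabilizing transformation of the normalized counts
vsd <- vst(dds)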
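
For the coloring part of the question, here is a rough base R sketch of a PCA plot colored by group. It uses a simulated genes-by-samples matrix as a stand-in for a transformed matrix (for example, what `assay(vsd)` would be expected to return); the group labels and sample names are made up for illustration. In the code from the question, `colnames(tpm_centered)` are transcript IDs after the transpose, so `as.numeric()` on them most likely yields `NA` and no usable colors. Building the color vector from a per-sample grouping factor avoids that.

```r
# Simulated stand-in for a transformed genes-by-samples matrix (an assumption).
set.seed(1)
groups <- factor(rep(c("wt", "mut"), each = 4))    # hypothetical sample groups
mat <- matrix(rnorm(200 * 8), nrow = 200,
              dimnames = list(paste0("gene", 1:200), paste0("s", 1:8)))
# Shift the first 20 genes in the "mut" samples so the groups differ.
mat[1:20, groups == "mut"] <- mat[1:20, groups == "mut"] + 3

# prcomp() expects samples in rows, so transpose; it centers the columns itself.
pca <- prcomp(t(mat), center = TRUE, scale. = FALSE)
pct <- round(100 * pca$sdev^2 / sum(pca$sdev^2), 1)  # % variance per component

# One color per sample, taken from the integer codes of the grouping factor.
cols <- as.integer(groups)
plot(pca$x[, 1], pca$x[, 2], col = cols, pch = 19,
     xlab = paste0("PC1 (", pct[1], "%)"),
     ylab = paste0("PC2 (", pct[2], "%)"),
     main = "PCA colored by group")
text(pca$x[, 1], pca$x[, 2], rownames(pca$x), cex = 0.5, pos = 3)
legend("topright", legend = levels(groups),
       col = seq_along(levels(groups)), pch = 19)
```

The important detail is that `cols` has one entry per sample, in the same order as the rows of `pca$x`.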