Question: PCA from TPM
10 months ago, tanyabioinfo wrote:

Hi

I am trying to do PCA analysis of my samples. I generated the matrix using the tximport package. I have transcript ids as my rows and the sample names are the columns.

txi <- tximport(files, type="salmon", tx2gene=NULL, ignoreTxVersion=TRUE,dropInfReps=TRUE,txOut = TRUE)
tpm <- (txi$abundance[apply(txi$abundance, MARGIN = 1, FUN = function(x) sd(x) != 0),])

tpm = log2(tpm + 1)
tpm_centered <- t(tpm-rowMeans(tpm))

pca = prcomp(tpm_centered , scale=TRUE, center=TRUE)

cols <- as.factor(as.numeric(colnames(tpm_centered)))

plot(pca$x[,1],pca$x[,2], xlab = "PC1", ylab = "PC2",main ="PCA replicate1", col =cols)

text(pca$x[,1],pca$x[,2], row.names(pca$x), cex=0.5, pos=3)

I have a couple of questions.

1. Is txi$abundance a good basis for a PCA plot?

2. I am unable to get the points colored by sample.

Can someone please help me?

Thanks

Tanya
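
On the second question: in the posted code, `tpm_centered` has already been transposed, so `colnames(tpm_centered)` returns transcript IDs rather than sample names, and `as.numeric()` on non-numeric names gives `NA`, which leaves `col` without usable values. A minimal base-R sketch of one way to color by group follows. The simulated matrix, the sample names and the `samples` annotation table are all hypothetical stand-ins for `txi$abundance` and the real sample sheet.

```r
set.seed(42)

# Hypothetical sample annotation: 2 conditions x 3 time points x 4 replicates
samples <- expand.grid(rep = 1:4, time = c("0h", "6h", "24h"),
                       condition = c("WT", "MUT"))
sample_names <- paste(samples$condition, samples$time, samples$rep, sep = "_")

# Simulated TPM matrix (transcripts x samples), standing in for txi$abundance
tpm <- matrix(rexp(500 * length(sample_names), rate = 0.1), nrow = 500,
              dimnames = list(paste0("tx", 1:500), sample_names))

# Drop constant rows, then log-transform
keep <- apply(tpm, 1, sd) != 0
log_tpm <- log2(tpm[keep, ] + 1)

# prcomp expects samples in rows; center = TRUE centers each transcript.
# Unlike the posted code, no per-transcript scaling is applied here.
pca <- prcomp(t(log_tpm), center = TRUE, scale. = FALSE)

# Build the color factor from the sample annotation, one level per group
group <- factor(paste(samples$condition, samples$time, sep = "_"))

plot(pca$x[, 1], pca$x[, 2], xlab = "PC1", ylab = "PC2",
     col = as.integer(group), pch = 19)
text(pca$x[, 1], pca$x[, 2], labels = rownames(pca$x), cex = 0.5, pos = 3)
legend("topright", legend = levels(group),
       col = seq_along(levels(group)), pch = 19, cex = 0.7)
```

The key point is that the factor passed to `col` has one entry per sample, in the same order as the rows of `pca$x`.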


Hi Michael

I am now using the following code:

txi <- tximport(files, type="salmon", tx2gene=tx2gene, ignoreTxVersion=TRUE,dropInfReps=TRUE)
sampleTable <- data.frame(condition =samples$condition,time=factor(samples$time))
rownames(sampleTable) <- colnames(txi$counts)
dds <- DESeqDataSetFromTximport(txi, sampleTable, ~condition+time)
dds <- dds[ rowSums(counts(dds)) > 1, ]
rld <- rlog(dds, blind = FALSE)
plotPCA(rld, intgroup = c("condition", "time"))


I have wild type and mutant as conditions and time points of 0 hr, 6 hr and 24 hr, with 4 replicates for each combination. In the PCA plot the replicates are not grouping together. Do you think this is normal, or am I making a mistake?

Regards

Tanya

 

Reply written 10 months ago by tanyabioinfo

Make sure that the files and the sample table are in the same order. This is very important.

Reply written 10 months ago by Michael Love
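
One way to check the ordering before calling tximport could look like the sketch below. The `samples` table, its `sample` column and the `salmon/<sample>/quant.sf` directory layout are assumptions made for illustration; the real paths and identifiers will differ.

```r
# Hypothetical sample sheet; the column names are assumptions
samples <- data.frame(sample = c("WT_0h_1", "WT_0h_2", "MUT_0h_1", "MUT_0h_2"),
                      condition = c("WT", "WT", "MUT", "MUT"),
                      stringsAsFactors = FALSE)

# Build the file vector from the sample sheet so both share one order
files <- file.path("salmon", samples$sample, "quant.sf")
names(files) <- samples$sample  # tximport uses these names as column names

# Each path should contain the sample ID from the same row of the sheet
in_order <- mapply(grepl, samples$sample, files, MoreArgs = list(fixed = TRUE))
stopifnot(all(in_order))

# After tximport, the same check can be repeated on the result, e.g.:
# stopifnot(identical(colnames(txi$counts), samples$sample))
```

Naming the `files` vector means the sample IDs travel with the data, so a mismatch shows up as a failed check rather than as replicates that do not cluster.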
10 months ago, Michael Love (United States) wrote:
I would recommend generating the PCA plot from normalized, transformed counts. There are statistical reasons to prefer variance-stabilized measurements, and the normalization takes care of any biases that were corrected by the quantification software, as well as sequencing depth.

This would look like:

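# sampleTable: a data.frame with one row per sample, in the same order as the columns of txi$counts
dds <- DESeqDataSetFromTximport(txi, colData = sampleTable, design = ~ condition)
vsd <- vst(dds)
plotPCA(vsd, intgroup = "condition")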
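
For readers without their own data at hand, the same steps can be tried on simulated counts. This is only a sketch: `makeExampleDESeqDataSet()` is DESeq2's simulator and stands in for the `txi`-based object, and the sizes `n` and `m` are arbitrary choices. Passing `returnData = TRUE` to `plotPCA()` returns the coordinates as a data frame, which can help when the coloring needs to be customized.

```r
library(DESeq2)

set.seed(1)
# Simulated counts: 2000 genes, 12 samples, with a colData column
# 'condition' (levels A and B); a stand-in for real data
dds <- makeExampleDESeqDataSet(n = 2000, m = 12)

# Variance-stabilizing transformation, blind to the design
vsd <- vst(dds, blind = TRUE)

# Return the PCA coordinates instead of drawing the default plot
pca_df <- plotPCA(vsd, intgroup = "condition", returnData = TRUE)

# Color the samples by condition using base graphics
plot(pca_df$PC1, pca_df$PC2, xlab = "PC1", ylab = "PC2",
     col = as.integer(pca_df$condition), pch = 19)
legend("topright", legend = levels(pca_df$condition),
       col = seq_along(levels(pca_df$condition)), pch = 19)
```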