Dear All,
I am using the IsoformSwitchAnalyzeR package to identify recurrent isoform switches in a set of 24 rectal tumor samples. Isoform and gene quantification was done with RSEM.
However, the package requires at least two different conditions in the $condition column of the design matrix so that it can make pairwise comparisons. Since all my samples are rectal tumor samples, I am interested in all possible pairwise comparisons between the samples themselves, to identify recurrently occurring switches. I created the $condition column with only one label, "CRC", and the program terminated with an error. This is the code I was using:
library(IsoformSwitchAnalyzeR)
library(factoextra)
library(RColorBrewer) # provides brewer.pal(), used in the PCA plot below
## create isoform quantification data (step 1: importIsoformExpression())
myquant <- importIsoformExpression("RSEM_isoform_files/", normalizationMethod = 'TMM', showProgress = TRUE)
meta_data <- read.csv('transcriptome_meta_data_test.tsv', sep='\t')
# Test whether the program's normalization is working by calculating PCA
tpm_mat <- myquant$abundance
row_head <- tpm_mat[,1] # convert column1 (isoform identifiers) as row headers
row.names(tpm_mat) <- row_head
tpm_mat[,1] <- NULL
pca <- prcomp(t(tpm_mat))
fviz_pca_ind(pca,col.ind = "contrib", pointsize ="contrib",
gradient.cols = brewer.pal(10, "Spectral"),
repel = TRUE, labelsize = 4)
# create the design matrix for importRdata(), used in the next step
sampleID <- colnames(myquant$counts)[-1]
condition <- rep('CRC', each=24)
intron_level <- c(rep("high", each=6), 'low', rep('high', each=7), rep('low', each=10))
batch <- as.character(meta_data$BATCH)
design_matrix <- cbind(sampleID, condition, intron_level, batch)
design_matrix <- data.frame(design_matrix)
# import R data from previous step to create a switchAnalyzeRlist list (step 2: importRdata())
myswitchlist <- importRdata(
isoformCountMatrix = myquant$counts,
isoformRepExpression = myquant$abundance,
designMatrix = design_matrix,
isoformExonAnnoation = "Homo_sapiens.GRCh38.94.gtf",
showProgress = TRUE,
ignoreAfterPeriod = TRUE
)
I got this error at the second step, importRdata():
Step 1 of 6: Checking data...
Error in importRdata(isoformCountMatrix = myquant$counts, isoformRepExpression = myquant$abundance, :
The supplied 'designMatrix' only contains 1 condition
Output of traceback():
2: stop("The supplied 'designMatrix' only contains 1 condition")
1: importRdata(isoformCountMatrix = myquant$counts, isoformRepExpression = myquant$abundance,
designMatrix = pheno_data, isoformExonAnnoation = "Homo_sapiens.GRCh38.94.gtf",
showProgress = TRUE, ignoreAfterPeriod = TRUE)
Sample of my design matrix:
sampleID   condition   intronlevel   batch
RIT1       CRC
RIT2       CRC
RIT3       CRC
RIT4       CRC
RIT5       CRC
...        ...         ...           ...
I don't have multiple conditions across which I can make a comparison.
Is there a way to do all possible pairwise comparisons between samples? How can I modify the requirements for the condition column in the design matrix?
Could you elaborate on what you mean by "recurrently occurring switches"?
I have RNA-Seq data from 24 tumor samples. By "recurrent switches" I mean an isoform switch, in other words a preferential isoform usage, that shows up across multiple samples.
For example: if a gene has 2 isoforms, A and B, I would like to identify the percentage of samples in which isoform A is used more than B, or vice versa, indicating a switch between the two forms.
The idea is this: if 10 out of 24 samples show a higher (positive) dIF score for isoform A than for B, I can then take a step back and check whether these 10 samples belong to a specific pathologic group (tumor grade, stage, etc.). This is the opposite direction from first classifying the samples into groups according to their conditions and then performing pairwise comparisons between those conditions.
The usual workflow in IsoformSwitchAnalyzeR compares conditions to identify switches. Here, I want to make all possible comparisons among the 24 samples themselves and only afterwards classify them into different conditions. I hope this makes sense.
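To make the per-sample idea concrete, here is a minimal base R sketch of the kind of calculation I have in mind. It does not use IsoformSwitchAnalyzeR at all, and it works on the isoform fraction (IF) within each sample rather than on dIF between conditions. The toy table, the gene_id column, and the sample names S1 to S4 are made up for illustration; with real data the isoform-to-gene mapping would have to come from the annotation.

```r
# Hypothetical toy data: 2 genes, 2 isoforms each, 4 samples (TPM-like values).
abund <- data.frame(
  isoform_id = c("A", "B", "C", "D"),
  gene_id    = c("g1", "g1", "g2", "g2"),
  S1 = c(8, 2, 6, 4),
  S2 = c(9, 1, 1, 9),
  S3 = c(3, 7, 2, 8),
  S4 = c(6, 4, 3, 7),
  stringsAsFactors = FALSE
)
sample_cols <- c("S1", "S2", "S3", "S4")

# Isoform fraction (IF): isoform abundance divided by the summed abundance
# of all isoforms of the same gene, computed separately for each sample.
# Genes with zero total abundance in a sample would give NaN here.
expr <- as.matrix(abund[, sample_cols])
gene_totals <- rowsum(expr, group = abund$gene_id)   # genes x samples
if_mat <- expr / gene_totals[abund$gene_id, , drop = FALSE]
rownames(if_mat) <- abund$isoform_id

# TRUE where an isoform has the highest IF of its gene in that sample.
# Ties are counted as dominant for every tied isoform.
is_dominant <- apply(if_mat, 2, function(x) {
  x == ave(x, abund$gene_id, FUN = max)
})

# Fraction of samples in which each isoform is the dominant one.
recurrence <- rowMeans(is_dominant)
print(recurrence)

# Samples in which isoform A dominates its gene; these could then be
# checked against grade, stage, or other pathologic groupings.
samples_with_A <- names(which(is_dominant["A", ]))
print(samples_with_A)
```

In this toy example isoform A is dominant in 3 of 4 samples and B in the remaining one. This only counts which isoform dominates per sample; it involves no statistical test, so it is not a substitute for the switch testing that the package performs between conditions.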