Question: Identify recurrent isoform switches using IsoformSwitchAnalyzeR in multiple rectal tumor samples
asmita wrote, 9 weeks ago:

Dear All,

I am using the IsoformSwitchAnalyzeR package to identify recurrent isoform switches in a set of 24 rectal tumor samples. The isoform and gene quantification was done with RSEM.

However, the package requires two different conditions in the $condition column of the design matrix so that it can make pairwise comparisons. Since all my samples are rectal tumor samples, I was interested in all possible pairwise comparisons between these samples to identify recurrently occurring switches. I created the $condition column with only one label, "CRC", and the program terminated with an error. This is the code I was using:

library(IsoformSwitchAnalyzeR)
library(factoextra)
library(RColorBrewer)    # provides brewer.pal(), used for the PCA colour gradient below

## create isoform quantification data (step 1: importIsoformExpression())

myquant <- importIsoformExpression("RSEM_isoform_files/", normalizationMethod = 'TMM', showProgress = TRUE)
meta_data <- read.csv('transcriptome_meta_data_test.tsv', sep='\t')

# Test whether the program's normalization is working by calculating PCA
tpm_mat <- myquant$abundance    
row_head <- tpm_mat[,1]                       # convert column1 (isoform identifiers) as row headers
row.names(tpm_mat) <- row_head
tpm_mat[,1] <- NULL

pca <- prcomp(t(tpm_mat))
fviz_pca_ind(pca,col.ind = "contrib", pointsize ="contrib", 
             gradient.cols = brewer.pal(10, "Spectral"),
             repel = TRUE, labelsize = 4)

# create the design matrix for importRdata() in the next step
sampleID <- colnames(myquant$counts)[-1]
condition <- rep('CRC', each=24)
intron_level <- c(rep("high", each=6), 'low', rep('high', each=7), rep('low', each=10))
batch <- as.character(meta_data$BATCH)

design_matrix <- cbind(sampleID, condition, intron_level, batch)
design_matrix <- data.frame(design_matrix) 

# import R data from previous step to create a switchAnalyzeRlist list (step 2: importRdata())

myswitchlist <- importRdata(
    isoformCountMatrix = myquant$counts,
    isoformRepExpression = myquant$abundance,
    designMatrix = design_matrix,
    isoformExonAnnoation = "Homo_sapiens.GRCh38.94.gtf",
    showProgress = TRUE,
    ignoreAfterPeriod = TRUE
)

I got this error at the second step - importRdata()

Step 1 of 6: Checking data...
Error in importRdata(isoformCountMatrix = myquant$counts, isoformRepExpression = myquant$abundance,  : 
  The supplied 'designMatrix' only contains 1 condition

output of traceback()

2: stop("The supplied 'designMatrix' only contains 1 condition")
1: importRdata(isoformCountMatrix = myquant$counts, isoformRepExpression = myquant$abundance, 
       designMatrix = pheno_data, isoformExonAnnoation = "Homo_sapiens.GRCh38.94.gtf", 
       showProgress = TRUE, ignoreAfterPeriod = TRUE)

Sample of my design matrix -

sampleID  condition  intron_level  batch
RIT1      CRC        ...           ...
RIT2      CRC        ...           ...
RIT3      CRC        ...           ...
RIT4      CRC        ...           ...
RIT5      CRC        ...           ...
...       ...        ...           ...

I don't have multiple conditions across which I can do a comparison.

Is there a way to do all possible pairwise comparisons between samples? How can I modify the requirements of the condition column in the design matrix?


Could you elaborate on what you mean by "recurrently occurring switches"?

written 9 weeks ago by k.vitting.seerup

I have RNA-Seq data from 24 tumor samples. By "recurrent switches", I mean an isoform switch, in other words a preferential isoform usage, that shows up across samples.

For example, if a gene has 2 isoforms A and B, I would like to identify the percentage of samples where isoform A is used more than B, or vice versa, showing a switch between these two forms.

The idea is this: if 10 out of 24 samples show a higher or positive dIF score for isoform A than for B, I can then take a step back and check whether these 10 samples belong to a specific pathologic group (tumor grade, stage etc.). This is the opposite direction to the usual approach, which is to classify the samples into groups according to their conditions first and then perform pairwise comparisons between these conditions.

The usual workflow in IsoformSwitchAnalyzeR involves comparisons between conditions to identify switches. Here, I want to make all possible comparisons between the 24 samples themselves and then classify them into different conditions. I hope I am making some sense here.
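
The per-gene tally described above can be sketched in base R. This is only a minimal illustration on made-up isoform fractions for a single gene; the object names (if_gene, dominant, a_samples) are hypothetical and not part of IsoformSwitchAnalyzeR. Note that it counts which isoform dominates within each sample rather than computing a dIF, since a dIF is a difference in isoform fraction between two conditions.

```r
# Toy isoform fractions for one gene with isoforms A and B (illustrative values)
if_gene <- rbind(
    A = c(0.8, 0.7, 0.3, 0.6, 0.2, 0.9),
    B = c(0.2, 0.3, 0.7, 0.4, 0.8, 0.1)
)
colnames(if_gene) <- paste0("S", 1:6)

# For each sample, find the isoform with the highest fraction
dominant <- rownames(if_gene)[apply(if_gene, 2, which.max)]
names(dominant) <- colnames(if_gene)

# Percentage of samples in which each isoform dominates
pct <- 100 * table(dominant) / length(dominant)

# Samples where A dominates, to cross-reference against grade/stage later
a_samples <- names(dominant)[dominant == "A"]
```

The sample list in a_samples could then be compared against clinical annotation, which is the "step back" described above.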

written 9 weeks ago by asmita
Answer: Identify recurrent isoform switches using IsoformSwitchAnalyzeR in multiple rect
k.vitting.seerup wrote, 9 weeks ago:

Hi Asmita

Thanks for reaching out. You describe an interesting, but very hard to implement, idea, since switches in two different genes could use different groupings. IsoformSwitchAnalyzeR unfortunately does not support such an analysis: it requires the groupings up front. Doing all possible divisions of 24 samples into two groups results in more than 16 million possible groupings, so brute-forcing it is not feasible either.
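
The size of that search space can be checked with a few lines of R. This is a sketch of the arithmetic only; the variable names are illustrative, and reading "more than 16 million" as 2^24 labelled assignments is an assumption about how the figure was obtained.

```r
n_samples <- 24

# Each sample gets one of two group labels: 2^24 labelled assignments
labelled <- 2^n_samples

# Treating {A, B} and {B, A} as the same split, and requiring both groups
# to be non-empty, leaves 2^(n - 1) - 1 distinct two-group divisions
unordered <- 2^(n_samples - 1) - 1

format(c(labelled = labelled, unordered = unordered), big.mark = ",")
```

Either way the count is in the millions per gene, before any switch test is run on each split.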

The only suggestion I have is to try unsupervised clustering (PCA, dendrogram etc.) on the isoform fractions, to see whether some grouping shows itself when you analyse the relative isoform usage.

To get the isoform fractions from the TPM matrix you can use the isoformToIsoformFraction() function from IsoformSwitchAnalyzeR.
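
As a rough illustration of the idea, the sketch below computes isoform fractions by hand in base R on a toy TPM matrix and then runs PCA and hierarchical clustering on them. It does not call isoformToIsoformFraction(); the manual calculation (isoform TPM divided by the summed TPM of the parent gene) is a stand-in, and the toy data and object names (iso_anno, tpm, if_mat) are assumptions made for the example.

```r
set.seed(1)

# Toy annotation and TPM matrix: 3 genes x 2 isoforms, 6 samples (illustrative only)
iso_anno <- data.frame(
    isoform_id = paste0("iso", 1:6),
    gene_id    = rep(paste0("gene", 1:3), each = 2),
    stringsAsFactors = FALSE
)
tpm <- matrix(
    runif(6 * 6, min = 1, max = 100),
    nrow = 6,
    dimnames = list(iso_anno$isoform_id, paste0("S", 1:6))
)

# Gene-level expression = sum of the TPMs of the gene's isoforms, per sample
gene_tpm <- rowsum(tpm, group = iso_anno$gene_id)

# Isoform fraction (IF) = isoform TPM / parent gene TPM
if_mat <- tpm / gene_tpm[iso_anno$gene_id, ]

# Drop isoforms with undefined IF (parent gene not expressed in some sample)
if_mat <- if_mat[complete.cases(if_mat), , drop = FALSE]

# Unsupervised views of the samples: PCA and a dendrogram
pca <- prcomp(t(if_mat))
hc  <- hclust(dist(t(if_mat)))
plot(hc)
```

With real data, the sample groups suggested by the PCA or the dendrogram could then be used as the condition column for the regular switch analysis.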

Cheers,
Kristoffer


Thanks for the response! I will try it out.

written 8 weeks ago by asmita