Isoforms detection using bioconductor
1
0
Entering edit mode
@seungjoonkim-13660
Last seen 4.5 years ago
Berlin

Dear all,

hi we have RNA-seq data with long reads and try to find transcript isoforms(AS) for every genes. I have now aligned BAM file and want to detect isoforms using IRanges, GenomicRanges and biomart. I have no idea how to detect and quantify the number of isoforms for each gene. Could you give me any advice about work flow or tips how to analyze my data?

 

Thank you in advance!!

 

bioconductor biomart granges rna-seq • 1.3k views
ADD COMMENT
0
Entering edit mode
@michael-lawrence-3846
Last seen 6 weeks ago
United States

It's a complex topic but here are some tips:

One option is to quantify junctions and use those to make inferences about transcript structure and quantity. The summarizeJunctions() function in GenomicAlignments might help with this. There are also specialized packages like SGSeq.

External tools like Salmon and Kallisto use deconvolution approaches to quantifying isoforms. The tximport package loads the output of those tools into R/Bioconductor.

ADD COMMENT
0
Entering edit mode

Thanks for the advice. So I tried to use SGSeq to analyze data and some questions and errors came out. I tried to follow the method in splice-variant-identification written by Leonard D Goldstein and how do you retrieve GRanges object of FBXO31 gene. Because it is possible to retrieve a certain chromosome but not the specific gene. 

So I manually create GRanges object and proceed the protocol and in the last step using analyzeFeathres, it caused errors

> sgfc_ucsc <- analyzeFeatures(si, features = txf_ucsc)
Error: sample_info is missing column(s) paired_end, read_length, frag_length, lib_size

Could you figure out how to figure out this error?

 

ADD REPLY
0
Entering edit mode

Hi, 

analyzeFeatures needs some basic information on your data (e.g. whether reads are paired-end, the total number of aligned reads in the BAM file etc.) All required info can be obtained with

> si <- getBamInfo(si) 

Providing a gene locus is optional, you can do your analysis genome-wide, but this may be computationally intensive, so I would test it on one gene first. Please check the package vignette or help page for more information. 

Leonard

ADD REPLY
0
Entering edit mode

Thanks. I will try it.

ADD REPLY
0
Entering edit mode

In the beginning, I found this kind of error.

> bamPath <- "/data/murphy/home/XXX/XXX/sorted_ccs.bam"

> bamFile <- BamFile(bamPath)

> getBamInfo(bamFile)
Error: sample_info must be a data frame with columns sample_name, file_bam

In this case, how can I add sample_name?

Also, I used the same file previously and why other sample_info is missing? (e.g. paired_end, read_length, frag_length, lib_size)

Thanks in advance!!

ADD REPLY
0
Entering edit mode

The input for getBamInfo is a data.frame with columns sample_name and file_bam (see example below). Please read the vignette or help pages.

> si <- data.frame(sample_name = c("a", "b"), file_bam = c("file_a.bam", "file_b.bam"), stringsAsFactors = FALSE)
> si <- getBamInfo(si)
ADD REPLY
0
Entering edit mode

Thanks for the comment. it works well with analyzing isoforms

but still has problem when plotting the results.

When I followed this script:

par(mfrow = c(2,1), mar = c(2,2,1,2), pty = "m")
plotSpliceGraph(rowRanges(sgfc_pred), geneID = 1, toscale = "none", color_novel = "red")
for (j in 1:4){
  plotCoverage(sgfc_pred[, j], geneID = 1, toscale = "none")
}

It gives this messages: 

Error: subscript contains out-of-bounds indices

And the the plot goes out of bound so I cannot see the whole plot at all.

Could you help me to figure out of this problem?

 

 

ADD REPLY
0
Entering edit mode

Based on the error message it looks like you are working with less than 4 samples, but you try to make plots for columns 1-4 of the SGFeatureCounts object. Regarding out-of-bound plots, the vignette includes several examples on how to specify what genomic regions to include in the plots. Please have a look at the examples in the vignette. 

 

ADD REPLY

Login before adding your answer.

Traffic: 270 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6