Question

Single end STAR Chimeric.out.junction fails to read with chimera::importFusionData

wresch • 0

@wresch-7286

Last seen 10.2 years ago

United States

Hi,

I have a 1x50 nt Illumina RNA-seq data set that was only intended for gene-level expression analysis. I was asked to look for fusion transcripts even though I emphasized that the sensitivity for finding such fusions would be lousy. I decided to start with a STAR (alignment to mm9) -> chimera workflow. Each sample reports a small number of fusion reads, and only a few fusions have 5 or more supporting reads as determined by awk. My guess is that they are all false positives. One example from a Chimeric.out.junction file (full file: https://s3.amazonaws.com/idata.drgang.net/temp/Chimeric.out.junction):

chr3 138267132 + chr2 181382061 + 0 0 0 DFXGT8Q1:294:C5A6EACXX:8:1105:20793:74047 138267108 24M26S 181382062 24S26M

chimera::importFusionData("star", "path/to/file", org = "mm", min.support = 1)

returns NULL and complains:

The input file does not have any spanning read.
Your fusion lacking of spanning reads are most probably artifacts
The analysis of fusions lacking spanning reads is not supported.

I'm new to fusion transcript detection, so this may be a stupid question, but the read above seems to me to be a spanning read, right? So what is wrong with what I'm doing?
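To make the reasoning about the example read concrete, here is a minimal Python sketch that splits the line above into named columns and checks whether the read itself crosses the breakpoint. The column order follows the 14-column layout described in the STAR manual for Chimeric.out.junction. The field names and helper functions are made up for this sketch, and the rule "junction type -1 means the junction lies between the mates, anything else means a split read" is an assumption taken from that manual, not from chimera.

```python
import re

# Column order as described in the STAR manual for Chimeric.out.junction;
# the Python field names are hypothetical and chosen only for this sketch.
FIELDS = [
    "donor_chr", "donor_pos", "donor_strand",
    "acceptor_chr", "acceptor_pos", "acceptor_strand",
    "junction_type", "repeat_left", "repeat_right",
    "read_name", "seg1_start", "seg1_cigar", "seg2_start", "seg2_cigar",
]
INT_FIELDS = ("donor_pos", "acceptor_pos", "junction_type",
              "repeat_left", "repeat_right", "seg1_start", "seg2_start")


def parse_junction_line(line):
    """Split one whitespace-separated junction line into a dict of named fields."""
    values = line.split()
    if len(values) < len(FIELDS):
        raise ValueError("expected at least 14 columns, got %d" % len(values))
    record = dict(zip(FIELDS, values[:len(FIELDS)]))
    for key in INT_FIELDS:
        record[key] = int(record[key])
    return record


def crosses_junction(record):
    """True when the read sequence itself is split at the breakpoint.

    Assumption: junction type -1 marks a junction lying between the two
    mates of a pair; any other value means one read covers the junction.
    """
    return record["junction_type"] != -1


def aligned_and_clipped(cigar):
    """Return (matched bases, soft-clipped bases) for a simple CIGAR string."""
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    matched = sum(int(n) for n, op in ops if op == "M")
    clipped = sum(int(n) for n, op in ops if op == "S")
    return matched, clipped


line = ("chr3 138267132 + chr2 181382061 + 0 0 0 "
        "DFXGT8Q1:294:C5A6EACXX:8:1105:20793:74047 "
        "138267108 24M26S 181382062 24S26M")
rec = parse_junction_line(line)
print(rec["donor_chr"], rec["acceptor_chr"], crosses_junction(rec))
print(aligned_and_clipped(rec["seg1_cigar"]), aligned_and_clipped(rec["seg2_cigar"]))
```

Under these assumptions the example read is split 24 nt / 26 nt between chr3 and chr2, which is what one would expect from a read that covers the junction. Whether chimera counts it as "spanning" depends on the package's own definition.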

Thanks in advance for any help

Wolfgang

> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] chimera_1.8.4
 [2] TxDb.Hsapiens.UCSC.hg19.knownGene_3.0.0
 [3] GenomicFeatures_1.18.2
 [4] BSgenome.Hsapiens.UCSC.hg19_1.4.0
 [5] BSgenome_1.34.0
 [6] rtracklayer_1.26.2
 [7] org.Hs.eg.db_3.0.0
 [8] RSQLite_1.0.0
 [9] DBI_0.3.1
[10] AnnotationDbi_1.28.1
[11] GenomicAlignments_1.2.1
[12] Rsamtools_1.18.2
[13] Biostrings_2.34.0
[14] XVector_0.6.0
[15] GenomicRanges_1.18.3
[16] GenomeInfoDb_1.2.3
[17] IRanges_2.0.0
[18] S4Vectors_0.4.0
[19] Biobase_2.26.0
[20] BiocGenerics_0.12.1

loaded via a namespace (and not attached):
 [1] base64enc_0.1-2    BatchJobs_1.5      BBmisc_1.8         BiocParallel_1.0.0
 [5] biomaRt_2.22.0     bitops_1.0-6       brew_1.0-6         checkmate_1.5.0
 [9] codetools_0.2-9    digest_0.6.4       fail_1.2           foreach_1.4.2
[13] iterators_1.0.7    RCurl_1.95-4.5     sendmailR_1.2-1    stringr_0.6.2
[17] tools_3.1.1        XML_3.98-1.1       zlibbioc_1.12.0

chimera STAR • 2.7k views

ADD COMMENT • link updated 10.1 years ago by rcaloger ▴ 500 • written 10.2 years ago by wresch • 0

raffaele calogero ▴ 500

@raffaele-calogero-294

Last seen 9.3 years ago

Italy/Turin/University of Torino

Hi,

We have committed chimera 1.8.5 to the Bioconductor repository, and we think it fixes the issue encountered when importing STAR data.

It should be available for download within 24 hours.

Cheers

Raf

ADD COMMENT • link 10.2 years ago raffaele calogero ▴ 500

Hi Raf,

I was receiving the same error message using chimera 1.6 with STAR data when I found this post via google. I installed version 1.8.5 and received this error message:

tmp <- importFusionData('star',"Chimeric.out.junction",org="mm", min.support=1)

chrM is removed from fusion acceptor

chrM is removed from fusion donor

The input file does not seems to have any fusion.
Please contact the developers.

Is this the same issue with the C++ parser or something else?

thanks in advance

ADD REPLY • link 10.1 years ago balli.dave • 0

rcaloger ▴ 500

@rcaloger-1888

Last seen 10.1 years ago

European Union

I think the problem is related to the version of the human genome you have used.

Are you using hg38?

The current stable version only allows the use of hg19. This issue is solved in the devel version.

Could you please try the devel version?

If the devel version does not solve the problem, could you please send me the STAR output so that I can understand the issue?

Cheers

Raf

ADD COMMENT • link 10.1 years ago rcaloger ▴ 500

score 1 · Accepted Answer · 2015-01-24

raffaele calogero ▴ 500

@raffaele-calogero-294

Last seen 9.3 years ago

Italy/Turin/University of Torino

Hi Wolfgang,

You are the second person to highlight this problem with loading STAR data into chimera.

We have identified the problem in the C++ parser that counts the reads in the Chimeric.out.junction file and we are going to fix it by next week in version 1.8.5.
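Until the fixed version is available, read counts per junction can be cross-checked independently of the package. The following is a minimal Python sketch, not the package's parser. It assumes one row per chimeric read in Chimeric.out.junction and that the first six columns (donor chromosome, position, strand; acceptor chromosome, position, strand) identify a junction. The function names, the read names `read_a` and `read_b`, and the chr1/chr5 row are hypothetical and exist only to exercise the code.

```python
from collections import Counter


def count_supporting_reads(lines):
    """Count rows per junction, keyed by the first six columns of each row."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip blank, truncated, header, or comment rows: a data row has at
        # least 14 columns and a numeric donor position in column 2.
        if len(fields) < 14 or not fields[1].isdigit():
            continue
        counts[tuple(fields[:6])] += 1
    return counts


def junctions_with_support(counts, min_support=5):
    """Keep only junctions supported by at least min_support rows."""
    return {junction: n for junction, n in counts.items() if n >= min_support}


# Demo rows: the coordinates of the first two come from the example in the
# question; the read names and the third row are made up.
rows = [
    "chr3 138267132 + chr2 181382061 + 0 0 0 read_a 138267108 24M26S 181382062 24S26M",
    "chr3 138267132 + chr2 181382061 + 0 0 0 read_b 138267108 24M26S 181382062 24S26M",
    "chr1 1000 + chr5 2000 - 1 0 0 read_c 976 24M26S 2001 24S26M",
]
counts = count_supporting_reads(rows)
supported = junctions_with_support(counts, min_support=2)
for junction, n in supported.items():
    print("\t".join(junction), n)
```

For a real file, pass an open file handle instead of `rows`, for example `count_supporting_reads(open("Chimeric.out.junction"))`.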

Thanks for highlighting the problem.

Raffaele

ADD COMMENT • link 10.2 years ago raffaele calogero ▴ 500

Hi Raffaele,

Great. Thanks for the fast reply and the upcoming fix.

Wolfgang

ADD REPLY • link 10.2 years ago wresch • 0