BAMBU extended gtf file?
Seongwoo Han ▴ 10
@6d55f695
Last seen 20 days ago
United States

Hello there, I would like to know what "extended_annotations.gtf", one of BAMBU's main outputs, contains (see https://github.com/GoekeLab/sg-nex-data/blob/master/docs/SG-NEx_Bambu_tutorial.md#running-bambu). It sounds like extended_annotations.gtf holds the entire reference annotation plus all discovered novel transcripts; the file is about 200 MB. What I am trying to get is something like "transcript_models.gtf", which has just the constructed transcripts (both known and novel) without the rest of the reference annotation. To my knowledge, its size is about 90 to 100 MB. Is there a way to get that filtered GTF through the command line?

I am using cDNA ONT and cDNA PacBio datasets. Below is the command line I used for the cDNA ONT data on the way from the .fastq file to the BAM file, in case I missed something.

./minimap2 -t 8 -ax splice /home/seong/R/x86_64-pc-linux-gnu-library/4.1/bambu/extdata/hg38.fa /data/long_read/ENCBS944CBA/ENCFF263YFG.fastq -o /data/long_read/ENCBS944CBA/ENCFF263YFG.sam



Another question I have: does BAMBU detect intron retention? Please let me know about these questions. Thanks a lot!

BAMBU bambu • 184 views

Hello Seongwoo, were you able to run bambu successfully? I am facing technical problems, so it would be great to get some help.

Andre ▴ 20
@c3f05232
Last seen 5 weeks ago
Singapore

Hi Seongwoo,

I think I addressed this on the GitHub issue, but I am repeating the answer here for the sake of users who might find this post.

The first line below filters the bambu output (se) down to transcripts with full-length read support; you can then write the result as usual.

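# se is the SummarizedExperiment returned by bambu()
# keep only transcripts with at least one full-length read
constructedAnnotations = se[assays(se)$fullLengthCounts > 0]
# write the filtered GTF and count tables to the chosen directory
writeBambuOutput(constructedAnnotations, path = "./YOUR_PATH_HERE/")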
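
If you would rather do the filtering from the shell after bambu has already written its output, a rough sketch along the following lines might work. It is not part of bambu. It assumes a tab-separated transcript count table named counts_transcript.txt with a header row, transcript IDs in column 1, gene IDs in column 2 and one count column per sample, and a GTF whose ninth column carries transcript_id "..." attributes. The toy files at the top only stand in for real output so that the snippet runs on its own, and transcript_models.gtf is just an illustrative output name. Note that this filters on whatever counts are in the table, which is not necessarily the same as the fullLengthCounts filter above; if your bambu version writes a separate full-length count table, that file could be substituted.

```shell
# Toy stand-ins for bambu output files (names and layout are assumptions).
printf 'TXNAME\tGENEID\tsample1\n'  > counts_transcript.txt
printf 'tx1\tgeneA\t12\n'          >> counts_transcript.txt
printf 'tx2\tgeneA\t0\n'           >> counts_transcript.txt
printf 'BambuTx1\tgeneB\t3\n'      >> counts_transcript.txt

printf 'chr1\tBambu\ttranscript\t1\t100\t.\t+\t.\tgene_id "geneA"; transcript_id "tx1";\n'         > extended_annotations.gtf
printf 'chr1\tBambu\ttranscript\t1\t200\t.\t+\t.\tgene_id "geneA"; transcript_id "tx2";\n'        >> extended_annotations.gtf
printf 'chr2\tBambu\ttranscript\t5\t300\t.\t-\t.\tgene_id "geneB"; transcript_id "BambuTx1";\n'   >> extended_annotations.gtf

# Pass 1 (count table): remember IDs with a nonzero count in any sample column.
# Pass 2 (GTF): print only lines whose transcript_id is in that set.
awk -F'\t' '
  NR == FNR {
    if (FNR > 1)
      for (i = 3; i <= NF; i++)
        if ($i > 0) keep[$1] = 1
    next
  }
  match($9, /transcript_id "[^"]+"/) {
    # skip the 15-character prefix and drop the closing quote
    id = substr($9, RSTART + 15, RLENGTH - 16)
    if (id in keep) print
  }
' counts_transcript.txt extended_annotations.gtf > transcript_models.gtf
```

On the toy input this keeps the lines for tx1 and BambuTx1 and drops tx2. For real data, remove the printf lines and point awk at the files in your bambu output directory.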