SomaticSignatures - Getting Started & Troubleshooting
jpluta26 • 0

Hello, I am new to Bioconductor and new to genetics in general, so sorry if this is an obvious question. I am trying to get started with the SomaticSignatures package. I have several TCGA subjects with .vcf files, and each subject has a .bam sequence file. I understand that I need to load both the sequence and the reference to calculate signatures.

First, I converted the .bam file to .fa, and generated an index, using samtools. Then, I loaded the data as follows:

fa_A <- FaFile("sub1.fa")
dat <- readVcfAsVRanges("sub1.vcf", fa_A)
vr_A <- mutationContext(dat, fa_A)

The last line returns: Error in value[[3L]](cond) : record 1 (1:12837280-12837282) failed

file: sub1.fa

As a starting point, can anyone tell me what this error means?

Are you sure you have the reference genome in your BAM file? BAM files normally contain sequencing reads and the positions they align to in a reference, but not the reference sequence itself.
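
One way to check this is to look at which sequences the converted FASTA actually contains. The sketch below assumes the `sub1.fa` file and its index from the question; if the conversion was done with something like `samtools fasta`, the records are most likely individual reads rather than chromosomes, so a lookup of `1:12837280-12837282` would have nothing to match.

```r
library(Rsamtools)

# "sub1.fa" is the file from the question; its .fai index must already exist.
fa_A <- FaFile("sub1.fa")

# One GRanges entry per record in the FASTA (record name and length).
idx <- scanFaIndex(fa_A)
record_names <- as.character(seqnames(idx))

# Chromosome-sized records with names like "1" or "chr1" would suggest a reference;
# many short records named after read IDs would suggest sequencing reads.
head(record_names)
summary(width(idx))

# The failing record in the error message was on contig "1".
"1" %in% record_names
```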

Given the TCGA samples are human, you can probably use the BSgenome.Hsapiens.UCSC.hg19 reference package, as in section 4.2 of the SomaticSignatures vignette.
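
A minimal sketch of that approach, assuming the VCF coordinates are on hg19/GRCh37 (worth confirming for your TCGA files) and reusing the `sub1.vcf` name from the question. The renaming step is an assumption based on the error message, which shows the contig as `1` rather than the UCSC-style `chr1`.

```r
library(SomaticSignatures)
library(VariantAnnotation)
library(GenomeInfoDb)
library(BSgenome.Hsapiens.UCSC.hg19)

# Read the variant calls; "hg19" only labels the genome, it does not load sequence.
vr <- readVcfAsVRanges("sub1.vcf", "hg19")

# Rename contigs from "1", "2", ... to "chr1", "chr2", ... to match the UCSC package.
seqlevelsStyle(vr) <- "UCSC"

# Keep single-nucleotide variants only; indels have no single-base substitution context.
vr <- vr[isSNV(vr)]

# Look up the flanking bases of each variant in the reference genome.
vr_context <- mutationContext(vr, BSgenome.Hsapiens.UCSC.hg19)
head(vr_context)
```

If the VCF contains contigs that are not present in the BSgenome package, they may need to be dropped before the `mutationContext` call.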

Thanks for the reply, it was helpful. I was able to get some basic code working using the BSgenome reference package. The SomaticSignatures vignette states that a FASTA file can be used naturally, and I have one for each subject, so I was hoping to incorporate that. Maybe I can figure out how to include the reference sequence in the FASTA file. Thanks again for your help.
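
For the FASTA route, a rough sketch is below. It assumes a FASTA of the reference genome itself (the same build the VCF was called against), not a per-subject file of reads. The path `reference_genome.fa` and the genome label `"GRCh37"` are placeholders, not names from this thread.

```r
library(SomaticSignatures)
library(VariantAnnotation)
library(Rsamtools)

# Hypothetical path to a reference genome FASTA with chromosome-level records.
ref_path <- "reference_genome.fa"
indexFa(ref_path)            # writes reference_genome.fa.fai next to the FASTA
ref_fa <- FaFile(ref_path)

# Placeholder genome label; use whatever build the VCF actually refers to.
vr <- readVcfAsVRanges("sub1.vcf", "GRCh37")
vr <- vr[isSNV(vr)]

# Contigs used in the VCF but absent from the FASTA would make the lookup fail.
setdiff(seqlevels(vr), seqlevels(seqinfo(ref_fa)))

vr_context <- mutationContext(vr, ref_fa)
```

The `setdiff` line should come back empty before `mutationContext` is run; a non-empty result usually points to a naming mismatch such as `1` versus `chr1`.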
