Question: Error when predicting features with SGSeq
4.1 years ago by
Helen Zhou140
United States
Helen Zhou140 wrote:

I'm trying to use the analyzeFeatures function in SGSeq following the vignette, but I get this error:

> library(SGSeq)
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
> sgfc <- analyzeFeatures(bam.info, which = gr)
Predict features...
Error in mergeTxFeatures(list_features, min_n_sample = min_n_sample) : 
  ... must be one or more TxFeatures or a single
            list of TxFeatures

The bam.info object contains information about my BAM files, after running getBamInfo():

> bam.info
  sample_name                                                     file_bam paired_end read_length frag_length  lib_size
1     sample1    /home/projects/gene_expression/Tmp/A549.cytosol.polyA.bam       TRUE         101         262 126299548
2     sample2    /home/projects/gene_expression/Tmp/A549.nucleus.polyA.bam       TRUE         101         293 149162537
6     sample3 /home/projects/gene_expression/Tmp/GM12878.cytosol.polyA.bam       TRUE          76         276  62103999
7     sample4 /home/projects/gene_expression/Tmp/GM12878.nucleus.polyA.bam       TRUE          76         239  60730206

These files have been merged from two separate BAM files, processed with RSEM and TopHat. They include transcripts from UCSC (hg19), but also some from other sources.

When I dug into the functions a bit, the problem seems to occur in predictTxFeaturesPerSample. If I run that directly on a single sample, I get:

> sample_info <- bam.info
> i <- 1
> SGSeq:::predictTxFeaturesPerSample(file_bam = sample_info$file_bam[i], 
+         paired_end = sample_info$paired_end[i], read_length = sample_info$read_length[i], 
+         frag_length = sample_info$frag_length[i], lib_size = sample_info$lib_size[i], which=gr, alpha = 2, psi = 0, beta = 0.2, 
+     gamma = 0.2)
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘mcols’ for signature ‘"character"’
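
The top-level message from mergeTxFeatures is less informative than the per-sample one, and one plausible reason is that per-sample failures are captured rather than raised immediately, so the merge step is the first to complain about what it received. The sketch below imitates that pattern in base R only. `predict_one` is a hypothetical stand-in and not an SGSeq function, and whether SGSeq behaves exactly this way internally is an assumption.

```r
# Hypothetical stand-in for a per-sample prediction step; sample 2 fails.
predict_one <- function(i) {
  if (i == 2) stop("unable to find an inherited method")
  paste0("features_", i)
}

# Imitate an apply that captures errors instead of stopping:
# each failure comes back as an object of class "try-error".
results <- lapply(1:3, function(i) try(predict_one(i), silent = TRUE))

# Find the failed elements and recover the original error messages.
failed <- vapply(results, inherits, logical(1), what = "try-error")
messages <- vapply(results[failed],
                   function(x) conditionMessage(attr(x, "condition")),
                   character(1))

print(which(failed))
print(messages)
```

Inspecting the individual results in this way, as was done above by calling the per-sample function directly, tends to surface the underlying error.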

 

Does SGSeq simply not work for samples that have been mapped using annotation that only partly overlaps with the UCSC data?

~~Helen

 

 

Answer: Error when predicting features with SGSeq
4.1 years ago by
United States
Leonard Goldstein80 wrote:

Hi Helen,

Thanks for your question. Can I ask what version of SGSeq you are using? The output of sessionInfo() would be helpful.

BAM files used as input for SGSeq should contain RNA-seq reads mapped to a reference genome with a splice-aware aligner (e.g. GSNAP, STAR, TopHat) that includes the custom tag XS in the BAM output. If you have multiple BAM files that are intended to be analyzed together, they should all include mappings to the same reference genome. It sounds like your BAM files might contain alignments against a transcriptome rather than a genome?
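
Both points (genome reference sequences and the XS tag) can be checked directly on a BAM file. The following is a minimal sketch using Rsamtools, not part of SGSeq. The helper names `summarise_refs` and `check_bam`, the 10,000-record sample, and the `^chr` naming heuristic (which assumes UCSC-style hg19 sequence names) are all assumptions made for illustration.

```r
# Pure helper: split reference sequence names from a BAM header into names
# that look like UCSC-style genome sequences ("chr...") and everything else.
# The "^chr" pattern is a heuristic assumption, not an SGSeq rule.
summarise_refs <- function(seqnames) {
  is_genome <- grepl("^chr", seqnames)
  list(n_genome = sum(is_genome),
       n_other = sum(!is_genome),
       other_examples = head(seqnames[!is_genome], 5))
}

# Hypothetical wrapper: inspect one BAM file (requires the Rsamtools package).
check_bam <- function(file_bam, n_records = 10000) {
  # Reference sequences declared in the BAM header
  header <- Rsamtools::scanBamHeader(file_bam)[[1]]
  refs <- summarise_refs(names(header$targets))

  # Read CIGAR strings and the XS tag for the first n_records alignments
  param <- Rsamtools::ScanBamParam(what = "cigar", tag = "XS")
  bam <- Rsamtools::BamFile(file_bam, yieldSize = n_records)
  aln <- Rsamtools::scanBam(bam, param = param)[[1]]

  # Spliced alignments carry an "N" operation in the CIGAR string
  spliced <- !is.na(aln$cigar) & grepl("N", aln$cigar, fixed = TRUE)
  xs <- aln$tag$XS
  has_xs <- if (is.null(xs)) rep(FALSE, length(spliced)) else !is.na(xs)

  c(refs, list(n_spliced = sum(spliced),
               n_spliced_with_xs = sum(spliced & has_xs)))
}

# Usage (the path is a placeholder):
# check_bam("/path/to/sample.bam")
```

Many non-"chr" reference names, or spliced alignments without an XS tag, would suggest the file is not plain genome-aligned output from a splice-aware aligner.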

Leonard

Answer: Error when predicting features with SGSeq
4.1 years ago by
Helen Zhou140
United States
Helen Zhou140 wrote:

My apologies, Leonard. Session info is shown below, along with the error.

The BAM files do contain multiple alignments. I asked our bioinformatics facility to follow the approach outlined in this G&D article (http://genesdev.cshlp.org/content/27/9/1032.long), since we are also looking at neuronal stem cells. It is my understanding (but I am not sure) that the reads were mapped both to the genome and the transcriptome (human).

 

> sgfc <- analyzeFeatures(bam.info, which = NULL)
Predict features...
Error in mergeTxFeatures(list_features, min_n_sample = min_n_sample) : 
  ... must be one or more TxFeatures or a single
            list of TxFeatures
In addition: Warning message:
In mclapply(seq_len(n), do_one, mc.preschedule = mc.preschedule,  :
  20 function calls resulted in an error

> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: Ubuntu 14.04.2 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] TxDb.Hsapiens.UCSC.hg19.knownGene_3.1.2
 [2] GenomicFeatures_1.20.1                 
 [3] AnnotationDbi_1.30.1                   
 [4] Biobase_2.28.0                         
 [5] SGSeq_1.2.2                            
 [6] GenomicRanges_1.20.5                   
 [7] GenomeInfoDb_1.4.1                     
 [8] IRanges_2.2.5                          
 [9] S4Vectors_0.6.2                        
[10] BiocGenerics_0.14.0                    

loaded via a namespace (and not attached):
 [1] igraph_1.0.1            XVector_0.8.0           magrittr_1.5           
 [4] zlibbioc_1.14.0         GenomicAlignments_1.4.1 BiocParallel_1.2.9     
 [7] tools_3.2.1             DBI_0.3.1               lambda.r_1.1.7         
[10] futile.logger_1.4.1     rtracklayer_1.28.6      futile.options_1.0.0   
[13] bitops_1.0-6            RCurl_1.95-4.7          biomaRt_2.24.0         
[16] RSQLite_1.0.0           Biostrings_2.36.1       Rsamtools_1.20.4       
[19] XML_3.98-1.3           

Answer: Error when predicting features with SGSeq
4.1 years ago by
United States
Leonard Goldstein80 wrote:

Hi Helen,

Thanks for providing the sessionInfo output. 

It sounds like these are non-standard BAM files (merged from different sources). Can you try running SGSeq on the unmodified BAM files that were generated by TopHat? 

Leonard

 

Answer: Error when predicting features with SGSeq
4.1 years ago by
Helen Zhou140
United States
Helen Zhou140 wrote:

Leonard,

Unfortunately, we have to remove temporary files due to size constraints on our folders. I will try running SGSeq the next time I have some of the intermediate result files from our bioinformatics core.

~~Helen


Perhaps you could try extracting the reads from the BAM files you have and then realigning them to a reference genome. Googling for "realign reads bam file" will turn up several hits around that theme that you might be able to try.

Reply written 4.1 years ago by Steve Lianoglou
Answer: Error when predicting features with SGSeq
4.1 years ago by
United States
Leonard Goldstein80 wrote:

Hi Helen,

If it is an option to make available the current BAM files, I can have a look at the data to understand what is causing the problem. 

Otherwise, I suggest running SGSeq on the unmodified TopHat output when you get the chance. If you are still encountering problems, please do report them on the support site. 

Many thanks,

Leonard
