Hi Nico,
Thanks for your answer. I took the time to work through the simpleRNASeq function and followed your advice on the annotation. I got an error message that I cannot figure out. I include the short code, the error message, and the sessionInfo below. Hope you can be as helpful as always!
Best
Mayte
library("easyRNASeq")
library(Rsamtools)
library(DESeq)
library(edgeR)
library(GenomicRanges)
library(parallel)
library(S4Vectors)
fls.bam = list.files(path= BamPath,recursive=FALSE, pattern="sorted\\.bam$", full.names=FALSE)
bamFiles <- getBamFileList(filenames= fls.bam)
annotParam <- AnnotParam(datasource="/mydir/Homo_sapiens.GRCh37.75.tran.gtf", type="gtf")
> Counts <- simpleRNASeq(
+ bamFiles=bamFiles,
+ param= RnaSeqParam(annotParam=annotParam, countBy='genes'),
+ verbose=TRUE,
+ nnodes=6
+ )
==========================
simpleRNASeq version 2.2.0
==========================
Creating a SummarizedExperiment.
==========================
Processing the alignments.
==========================
Pre-processing 84 BAM files.
Validating the BAM files.
Extracted 93 reference sequences information.
Error in checkForRemoteErrors(val) :
84 nodes produced errors; first error: could not find function "DataFrame"
sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] edgeR_3.8.5 limma_3.22.4 DESeq_1.18.0
[4] lattice_0.20-29 locfit_1.5-9.1 Biobase_2.26.0
[7] Rsamtools_1.18.2 Biostrings_2.34.1 XVector_0.6.0
[10] GenomicRanges_1.18.4 GenomeInfoDb_1.2.4 IRanges_2.0.1
[13] S4Vectors_0.4.0 BiocGenerics_0.12.1 easyRNASeq_2.2.0
loaded via a namespace (and not attached):
[1] annotate_1.44.0 AnnotationDbi_1.28.1 base64enc_0.1-2
[4] BatchJobs_1.5 BBmisc_1.8 BiocParallel_1.0.1
[7] biomaRt_2.22.0 bitops_1.0-6 brew_1.0-6
[10] checkmate_1.5.1 codetools_0.2-10 DBI_0.3.1
[13] digest_0.6.8 fail_1.2 foreach_1.4.2
[16] genefilter_1.48.1 geneplotter_1.44.0 genomeIntervals_1.22.0
[19] GenomicAlignments_1.2.1 GenomicFeatures_1.18.3 grid_3.1.1
[22] hwriter_1.3.2 intervals_0.15.0 iterators_1.0.7
[25] latticeExtra_0.6-26 LSD_3.0 plyr_1.8.1
[28] RColorBrewer_1.1-2 Rcpp_0.11.4 RCurl_1.95-4.5
[31] RSQLite_1.0.0 rtracklayer_1.26.2 sendmailR_1.2-1
[34] ShortRead_1.24.0 splines_3.1.1 stringr_0.6.2
[37] survival_2.37-7 tools_3.1.1 XML_3.98-1.1
[40] xtable_1.7-4
Hi Mayte,
Ensembl switched from using human assembly GRCh37 to GRCh38 in August 2014 with their release 76:
http://lists.ensembl.org/pipermail/announce/2014-August/thread.html
This is probably why easyRNASeq() is complaining that "your annotation is not in sync with your alignments".
AFAIK the current release of Ensembl (release 78) is still using the GRCh38 assembly. Note that this assembly is the same as hg38 from UCSC except for the chromosome/scaffold names.
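To illustrate the naming difference, here is a minimal sketch in base R. The sequence names are made up for the example (they are not read from your BAM or GTF files), and `ucsc_to_ensembl` is a hypothetical helper that only handles the primary chromosomes; scaffold names differ in more complicated ways and are not covered.

```r
# Hypothetical sequence names, for illustration only
ensembl_names <- c("1", "2", "X", "Y", "MT")
ucsc_names    <- c("chr1", "chr2", "chrX", "chrY", "chrM")

# Hypothetical helper: convert UCSC-style primary chromosome names
# to Ensembl style (drop the "chr" prefix, rename M to MT)
ucsc_to_ensembl <- function(x) {
  x <- sub("^chr", "", x)
  x[x == "M"] <- "MT"
  x
}

# Without conversion, the two naming styles share no names
shared_before <- intersect(ensembl_names, ucsc_names)

# After conversion, the primary chromosome names line up
shared_after <- intersect(ensembl_names, ucsc_to_ensembl(ucsc_names))
```

The same kind of `intersect()` check on the sequence names of the alignments and of the annotation is a quick way to spot a naming mismatch between the two.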
Cheers,
H.