Entering edit mode
Paul Geeleher
★
1.3k
@paul-geeleher-2679
Last seen 10.3 years ago
I'm attempting to read some aligned RNA-seq paired end reads in
Bioconductor. The reads were aligned with maq (.fastq to .bqf to .map)
and converted to SAM then BAM format by samtools as I couldn't get the
paired end reads into biocondcutor in MAQ format.
I'm basically following this tutorial:
http://bioconductor.org/help/course-
materials/2010/CSAMA10/Lab-8-RNAseqUseCase.pdf
The following step causes me an error:
> seqlengths(alnRanges) <-
seqlengths(hgTxDb)[names(seqlengths(alnRanges))]
Error in validObject(.Object) :
invalid class "GRanges" object: slot 'ranges' contains values
outside of sequence bounds
I think it means that there is something aligned to outside the range
of my reference genome but I don't understand why this would be as I
used HG18 for everything.
Any suggestions as to how I can possibly figure out what went wrong
here would be greatly appreciated. I can provide more information if
necessary. Summaries of my GenomicRanges and GenomicFeatures objects
are below.
> alnRanges
GRanges with 11634719 ranges and 1 elementMetadata value
seqnames ranges strand |
pData.alignData.from...notNA...
<rle> <iranges> <rle> |
<integer>
[1] chr1 [3345, 3381] + |
163
[2] chr1 [3391, 3427] - |
147
[3] chr1 [4223, 4259] + |
163
[4] chr1 [4247, 4283] - |
147
[5] chr1 [4275, 4311] + |
163
[6] chr1 [4304, 4340] + |
163
[7] chr1 [4320, 4356] + |
163
[8] chr1 [4343, 4379] - |
147
[9] chr1 [4343, 4379] - |
147
... ... ... ... ...
...
[11634711] chrM [16531, 16567] - |
147
[11634712] chrM [16532, 16568] + |
161
[11634713] chrM [16532, 16568] - |
147
[11634714] chrM [16532, 16568] - |
147
[11634715] chrM [16532, 16568] - |
147
[11634716] chrM [16534, 16570] - |
147
[11634717] chrM [16534, 16570] - |
147
[11634718] chrM [16534, 16570] - |
147
[11634719] chrM [16541, 16577] - |
147
seqlengths
chr1 chr2 chr3 chr4 chr5 chr6 ... chr20 chr21 chr22 chrX chrY
chrM
NA NA NA NA NA NA ... NA NA NA NA NA
NA
> hgTxDb <- makeTranscriptDbFromUCSC(genome="hg18",
tablename="ensGene")
> hgTxDb
TranscriptDb object:
| Db type: TranscriptDb
| Data source: UCSC
| Genome: hg18
| UCSC Table: ensGene
| Type of Gene ID: Ensembl gene ID
| Full dataset: yes
| transcript_nrow: 63280
| exon_nrow: 276075
| cds_nrow: 225373
| Db created by: GenomicFeatures package from Bioconductor
| Creation time: 2010-08-17 15:28:19 +0100 (Tue, 17 Aug 2010)
| GenomicFeatures version at creation time: 1.0.6
| RSQLite version at creation time: 0.9-1
>
--
Paul Geeleher
School of Mathematics, Statistics and Applied Mathematics
National University of Ireland
Galway
Ireland
--
www.bioinformaticstutorials.com