GRange and bam files
2
0
Entering edit mode
Merlin ▴ 10
@merlin-15723
Last seen 4.3 years ago
Vancouver

Hello, 

I'm trying to do a differential expression analysis with Rstudio.

I know that  GRanges represents a collection of genomic ranges that each have a single start and end location on the genome but I don't know how to set it up.

Assuming that I'm interested of a certain region of the chromosome to analyze the expression of a particular gene, 
Where I can find the region that I'm looking for and how to set up the code? 

Here is an example:

> gr <- GRanges(
seqnames = Rle(c("chr1", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),ranges = IRanges(101:110, end = 111:120, names = head(letters, 10)),strand = Rle(strand(c("-", "+", "*", "+", "-")), c(1, 2, 2, 3, 2)),score = 1:10,GC = seq(1, 0, length=10))
> gr
       <Rle> <IRanges>  <Rle> | <integer>
GRanges object with 10 ranges and 2 metadata columns:
    seqnames    ranges strand |     score
a  chr1
b  chr2
c  chr2
. ...
h  chr3
i  chr3
j  chr3
-------
seqinfo: 3 sequences from an unspecified genome; no seqlengths

Thank you

bioconductor Grange • 949 views
ADD COMMENT
1
Entering edit mode
@james-w-macdonald-5106
Last seen 34 minutes ago
United States

Generally speaking, you get that sort of information from either a TxDb package, or maybe a GTF or GFF file. You don't say what species you are working with, so here's an example for human. Say you are working with human data, and you want to use the GRCh38 genome. And further assume you want to use UCSC's knownGene table to define what is and isn't a gene.

library(BiocInstaller)
biocLite("TxDb.Hsapiens.UCSC.hg38.knownGene")
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
ex <- exonsBy(TxDb.Hsapiens.UCSC.hg38.knownGene, by = "gene")

And you will now have a GRangesList that defines the exonic locations for all the genes in that table. The names for the list will be the Entrez Gene ID, so if you just care about one or a set of those genes, you could use the org.Hs.eg.db package to get those IDs and subset.

library(org.Hs.eg.db)
genesILike <- c("BRCA1","BRCA2")
ids <- mapIds(org.Hs.eg.db, genesILike, "ENTREZID","SYMBOL")
ex.subset <- ex[[ids]]

You can also create a GRangesList based on the transcript, or the gene, etc. See the vignettes for GenomicRanges and AnnotationDbi for more information.

ADD COMMENT
0
Entering edit mode

I should also note that if you are working with a different species, particularly a non-model species, or you want to use Ensembl mappings, etc., you can use the AnnotationHub.

biocLite("AnnotationHub")
library(AnnotationHub)
hub <- AnnotationHub()
query(hub, c("homo sapiens","ensembldb"))

See the AnnotationHub vignette for more in depth information.

ADD REPLY
0
Entering edit mode
Merlin ▴ 10
@merlin-15723
Last seen 4.3 years ago
Vancouver

Hi James, 

Thank you, now it's clear to me 

 

-M

ADD COMMENT

Login before adding your answer.

Traffic: 1093 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6