Search
Question: How to show chromosome position and SNP label at the same time in x axis
0
gravatar for oswaldo.lorenzo
14 months ago by
oswaldo.lorenzo10 wrote:

Dear all,

I'm trying to plot a certain number of SNPs and have in the x-axis both, the chromosomal position and their labels.

I have imported the SNP information from a bed file into a GRanges object.

My bed file looks like this:

chr17    78191000    78191000    rsAAA    1    +
chr17    78191900    78191900    rsBBB    1    +
chr17    78194002    78194002    rsCCC    1    +
chr17    78197170    78197170    rsDDD    1    +

The function I used to convert the bed file into a GRanges object is the one from this website: http://davetang.org/muse/2015/02/04/bed-granges/

bed_to_granges <- function(file){
        df <- read.table(file,
                         header=F,
                         stringsAsFactors=F)
        
        if(length(df) > 6){
                df <- df[,-c(7:length(df))]
        }
        
        if(length(df)<3){
                stop("File has less than 3 columns")
        }
        
        header <- c('chr','start','end','id','score','strand')
        names(df) <- header[1:length(names(df))]
        
        if('strand' %in% colnames(df)){
                df$strand <- gsub(pattern="[^+-]+", replacement = '*', x = df$strand)
        }
        
        library("GenomicRanges")
        
        if(length(df)==3){
                gr <- with(df, GRanges(chr, IRanges(start, end)))
        } else if (length(df)==4){
                gr <- with(df, GRanges(chr, IRanges(start, end), id=id))
        } else if (length(df)==5){
                gr <- with(df, GRanges(chr, IRanges(start, end), id=id, score=as.character(score)))
        } else if (length(df)==6){
                gr <- with(df, GRanges(chr, IRanges(start, end), id=id, score=as.character(score), strand=strand))
        }
        return(gr)
}

The code to import the bed file and reformat it according to the human hg19 build is the following one:

library(ggbio)
data(hg19Ideogram, package = "biovizBase")
setwd(".../Test")

## Import bed file as GRanges file
SNP <- bed_to_granges("SNP_position.bed")
seqlengths(SNP) <- seqlengths(hg19Ideogram)[names(seqlengths(SNP))]
SNP_dn <- keepSeqlevels(SNP, paste0("chr", c(1:22, "X", "Y")))

 

I have tried to plot the SNPs in the following ways:

SNP_location <-  autoplot(SNP_dn) +
        theme(text = element_text(size=8),
              axis.text.x = element_text(angle=45, hjust=1)) +
        theme(legend.position="none") +        
        xlim(78190000,78200000) +
        scale_x_sequnit("Mb")
fixed(SNP_location) <- TRUE
SNP_location

This code returns a plot with the chromosomal positions in the x-axis and the SNPs in their correct location:

SNP_IDs <-  autoplot(SNP_dn) +
        scale_x_continuous(name = "\nSNP IDs",
                           breaks = as.vector(start(SNP_dn)),
                           labels = as.factor (SNP_dn$id)) +
        theme(text = element_text(size=8),
              axis.text.x = element_text(angle=45, hjust=1)) +
        theme(legend.position="none") +        
        xlim(78190000,78200000)
fixed(SNP_IDs) <- TRUE
SNP_IDs

This code returns a re-scaled x-axis where the x-axis ticks correspond to the positions and labels of the SNPs themselves, but I loose the chromosomal references:

I would like to get a figure like the first one with the x-axis scaled according to chromosomal positions with a second line located anywhere containing the SNP names in the same figure.

I want to combine this figure afterwards with other plots showing other features from the same region with the ggbio tracks function and in order to do so they need to have the same chromosomal limits.

Is there an easy way to label SNPs keeping the original x-axis in chromosomal scale?

Thank you very much,

Best,

R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
 [1] grid      stats4    parallel  stats     graphics  grDevices utils     datasets  methods  
[10] base     

other attached packages:
 [1] Homo.sapiens_1.3.1                      TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
 [3] org.Hs.eg.db_3.3.0                      GO.db_3.3.0                            
 [5] OrganismDbi_1.14.1                      GenomicFeatures_1.24.5                 
 [7] AnnotationDbi_1.34.4                    Biobase_2.32.0                         
 [9] GenomicRanges_1.24.2                    GenomeInfoDb_1.8.3                     
[11] IRanges_2.6.1                           S4Vectors_0.10.3                       
[13] biovizBase_1.20.0                       ggbio_1.20.2                           
[15] ggplot2_2.1.0                           BiocGenerics_0.18.0                    

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.6                   lattice_0.20-33               Rsamtools_1.24.0             
 [4] Biostrings_2.40.2             digest_0.6.10                 mime_0.5                     
 [7] R6_2.1.3                      plyr_1.8.4                    chron_2.3-47                 
[10] acepack_1.3-3.3               RSQLite_1.0.0                 httr_1.2.1                   
[13] BiocInstaller_1.22.3          zlibbioc_1.18.0               data.table_1.9.6             
[16] rpart_4.1-10                  Matrix_1.2-6                  labeling_0.3                 
[19] splines_3.3.1                 BiocParallel_1.6.6            AnnotationHub_2.4.2          
[22] stringr_1.1.0                 foreign_0.8-66                RCurl_1.95-4.8               
[25] biomaRt_2.28.0                munsell_0.4.3                 shiny_0.13.2                 
[28] httpuv_1.3.3                  rtracklayer_1.32.2            htmltools_0.3.5              
[31] nnet_7.3-12                   SummarizedExperiment_1.2.3    gridExtra_2.2.1              
[34] interactiveDisplayBase_1.10.3 Hmisc_3.17-4                  XML_3.98-1.4                 
[37] reshape_0.8.5                 GenomicAlignments_1.8.4       bitops_1.0-6                 
[40] RBGL_1.48.1                   xtable_1.8-2                  GGally_1.2.0                 
[43] gtable_0.2.0                  DBI_0.5                       magrittr_1.5                 
[46] scales_0.4.0                  graph_1.50.0                  stringi_1.1.1                
[49] XVector_0.12.1                reshape2_1.4.1                latticeExtra_0.6-28          
[52] Formula_1.2-1                 RColorBrewer_1.1-2            ensembldb_1.4.7              
[55] tools_3.3.1                   dichromat_2.0-0               BSgenome_1.40.1              
[58] survival_2.39-5               colorspace_1.2-6              cluster_2.0.4                
[61] VariantAnnotation_1.18.7     
ADD COMMENTlink modified 14 months ago • written 14 months ago by oswaldo.lorenzo10
1
gravatar for oswaldo.lorenzo
14 months ago by
oswaldo.lorenzo10 wrote:

I think I have found the parameter I was looking for: it is all about working with the

geom_text() 

function.

You can generate a int vector with the positions of the SNPs (int_vector) and a chr vector (chr_vector) for the SNP names.

vector_breaks <- as.vector(start(SNP_dn))

vector_labels <- as.character (SNP_dn$id)

Afterwards adding + geom_text(x = int_vector, y = rep(1.3,4), label = chr_vector, angle = 45, hjust = -0.4, vjust = 0.2, size = 3) would make it. There might be easier ways of doing it and I would appreciate it, if you share them.

The final code would look something like this:

SNP_location <-  autoplot(SNP_dn, layout = "linear") +
                 theme(text = element_text(size=8),
                 axis.text.x = element_text(angle=45, hjust=1)) +
                 theme(legend.position="none") +       
                 xlim(78190000,78200000) +
                 scale_x_sequnit("Mb")

SNP_label2 <- SNP_location +
                geom_text(x = vector_breaks,
                          y = rep(1.3,4),
                          label = vector_labels,
                          angle = 45,
                          hjust = -0.4,
                          vjust = 0.2,
                          size = 3)
               

If you know an easier way of doing it, I would really appreciate if you share your code.

Thanks,

 

ADD COMMENTlink modified 14 months ago • written 14 months ago by oswaldo.lorenzo10
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 164 users visited in the last hour