I am new to R and cummeRbund and am trying to figure it all out on my own, with no programming or Unix experience. I am trying to create a heatmap of the 418 genes that are significantly differentially expressed between 2 groups. The RNA-seq data were aligned with STAR and analyzed with Cufflinks. These are the commands I have run so far:
> library(cummeRbund)
> cuff<-readCufflinks()
> cuff
CuffSet instance with:
2 samples
24341 genes
34535 isoforms
27685 TSS
26885 CDS
24272 promoters
27685 splicing
20510 relCDS
> mySigGeneIds<-getSig(cuff,alpha=0.05,level='genes')
> head(mySigGeneIds,n=50)
[1] "1190007I07Rik" "2310007B03Rik" "2310034O05Rik" "4833419F23Rik" "4932411E22Rik" "4932438H23Rik"
[7] "6030443J06Rik" "A2m" "AI661453" "Abca8b" "Abcc4" "Abhd2"
[13] "Ackr3" "Acnat1" "Acta2" "Actg2" "Adam7" "Adamts1"
[19] "Adamts16" "Adgrf1" "Adh1" "Adh6a" "Adora2b" "Ago3"
[25] "Ahrr" "Aim1" "Akr1b7" "Alas2" "Aldh2" "Aldh3b2"
[31] "Angptl4" "Ankfn1" "Anxa2" "Aox3" "Apod" "Apol7a"
[37] "Aqp3" "Aqp5" "Arg1" "Art5" "Asprv1" "Aurka"
[43] "B2m" "B3glct" "BC005561" "BC018473" "BC100530" "Bace2"
[49] "Baiap2l2" "Bdh2"
> length(mySigGeneIds)
[1] 418
> mySigGenes<-getGenes(cuff,mySigGeneIds)
> mySigGenes
CuffGeneSet instance for 418 genes
Slots:
annotation
fpkm
repFpkm
diff
count
isoforms CuffFeatureSet instance of size 602
TSS CuffFeatureSet instance of size 490
CDS CuffFeatureSet instance of size 505
promoters CuffFeatureSet instance of size 418
splicing CuffFeatureSet instance of size 490
relCDS CuffFeatureSet instance of size 418
> h<-csHeatmap(mySigGenes,cluster='both')
Using tracking_id, sample_name as id variables
No id variables; using all as measure variables
> h
There are 2 things that I want to do.
1) I cannot read the gene names because there are too many. I would like to either resize the heatmap so that all the names are legible, or create several heatmaps, with the first one listing only the first 100 genes.
2) I need to order the genes from the greatest to the smallest log2 fold change (or from the most to the least significant q-value). I believe they are currently in alphabetical order.
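For reference, both points could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses a small made-up table in place of the real `diffData(mySigGenes)` output (same column names `gene_id`, `log2_fold_change`, `q_value`), the gene IDs `geneA` to `geneE` are hypothetical, and `chunk_ids` is a helper invented here, not a cummeRbund function. The idea is to rank the genes first and then cut the ranked IDs into groups small enough to label.

```r
# Toy stand-in for diffData(mySigGenes): same column names, made-up values.
# geneA..geneE are hypothetical IDs, not genes from the real data set.
d <- data.frame(
  gene_id          = c("geneA", "geneB", "geneC", "geneD", "geneE"),
  log2_fold_change = c(1.2, -3.5, Inf, 0.8, -2.0),
  q_value          = c(0.001, 0.002, 0.002, 0.040, 0.010),
  stringsAsFactors = FALSE
)

# Rank by size of change in either direction; ties are broken by q-value.
by_fc <- d[order(-abs(d$log2_fold_change), d$q_value), ]

# Alternative: rank by q-value first, then by size of change.
by_q <- d[order(d$q_value, -abs(d$log2_fold_change)), ]

# Split a vector of IDs into consecutive groups of at most `size`.
chunk_ids <- function(ids, size = 100) {
  split(ids, ceiling(seq_along(ids) / size))
}

rankedIds <- by_fc$gene_id
chunks <- chunk_ids(rankedIds, size = 2)  # size = 100 for the real 418 genes
print(chunks)

# With the real objects, the same steps might look like this (untested here):
# d         <- diffData(mySigGenes)
# rankedIds <- d$gene_id[order(-abs(d$log2_fold_change), d$q_value)]
# chunks    <- chunk_ids(rankedIds, 100)
# h <- csHeatmap(getGenes(cuff, chunks[[1]]), cluster = 'both')
# pdf("heatmap_1.pdf", width = 7, height = 14)  # taller page, more room per label
# print(h)
# dev.off()
```

Note that with `cluster='both'` the rows are arranged by the clustering, so the ranking here decides which genes go into each heatmap rather than the row order inside it. Whether `csHeatmap` keeps the input order when clustering is turned off is not something this sketch assumes.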
Any help is appreciated!
Here is my session info:
> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 10586)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] grid stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] cummeRbund_2.14.0 Gviz_1.16.1 rtracklayer_1.32.2 GenomicRanges_1.24.2 GenomeInfoDb_1.8.3
[6] IRanges_2.6.1 S4Vectors_0.10.2 fastcluster_1.1.20 reshape2_1.4.1 ggplot2_2.1.0
[11] RSQLite_1.0.0 DBI_0.4-1 BiocGenerics_0.18.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.6 biovizBase_1.20.0 lattice_0.20-33
[4] Rsamtools_1.24.0 Biostrings_2.40.2 digest_0.6.9
[7] mime_0.5 R6_2.1.2 plyr_1.8.4
[10] chron_2.3-47 acepack_1.3-3.3 httr_1.2.1
[13] BiocInstaller_1.22.3 zlibbioc_1.18.0 GenomicFeatures_1.24.5
[16] data.table_1.9.6 rpart_4.1-10 Matrix_1.2-6
[19] labeling_0.3 splines_3.3.1 BiocParallel_1.6.3
[22] AnnotationHub_2.4.2 stringr_1.0.0 foreign_0.8-66
[25] RCurl_1.95-4.8 biomaRt_2.28.0 munsell_0.4.3
[28] shiny_0.13.2 httpuv_1.3.3 mgcv_1.8-13
[31] htmltools_0.3.5 nnet_7.3-12 SummarizedExperiment_1.2.3
[34] gridExtra_2.2.1 interactiveDisplayBase_1.10.3 Hmisc_3.17-4
[37] matrixStats_0.50.2 XML_3.98-1.4 GenomicAlignments_1.8.4
[40] bitops_1.0-6 nlme_3.1-128 xtable_1.8-2
[43] gtable_0.2.0 magrittr_1.5 scales_0.4.0
[46] stringi_1.1.1 XVector_0.12.1 latticeExtra_0.6-28
[49] Formula_1.2-1 RColorBrewer_1.1-2 ensembldb_1.4.7
[52] tools_3.3.1 dichromat_2.0-0 BSgenome_1.40.1
[55] Biobase_2.32.0 survival_2.39-5 AnnotationDbi_1.34.4
[58] colorspace_1.2-6 cluster_2.0.4 VariantAnnotation_1.18.5
>