Hello,
I am trying to compare the expression of a set of genes between pancreatic cancer (TCGA-PAAD) and normal pancreas (GTEx), either at the normalized-count level or through differential expression analysis (edgeR or DESeq2). I have used TCGAquery_recount2() from TCGAbiolinks (code below) to obtain the count data for both studies. Are these raw counts? If so, can I simply merge the two matrices and then scale/normalize them together for my downstream analysis?
Any advice would be greatly appreciated!
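library(TCGAbiolinks)
library(SummarizedExperiment)  # provides assay()
# Download the recount2 pancreas data for each project
pancreas.tcga <- TCGAquery_recount2(project = "tcga", tissue = "pancreas")
pancreas.gtex <- TCGAquery_recount2(project = "gtex", tissue = "pancreas")
# Extract the count matrices (genes x samples)
count.tcga <- assay(pancreas.tcga$tcga_pancreas)
count.gtex <- assay(pancreas.gtex$gtex_pancreas)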
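To make the "merge, then normalize" idea concrete, here is a minimal sketch of what I have in mind. It uses small made-up matrices (toy.tcga, toy.gtex, both hypothetical) in place of count.tcga and count.gtex, and it assumes the two matrices hold comparable raw counts, which is exactly the point I am unsure about. TMM followed by log-CPM is just one possible choice, and the sketch does not address batch differences between the two studies.

```r
library(edgeR)

# Toy stand-ins for count.tcga and count.gtex (made-up values, not real data)
set.seed(1)
toy.tcga <- matrix(rpois(400, lambda = 50), nrow = 100,
                   dimnames = list(paste0("gene", 1:100), paste0("tcga", 1:4)))
toy.gtex <- matrix(rpois(300, lambda = 50), nrow = 100,
                   dimnames = list(paste0("gene", 21:120), paste0("gtex", 1:3)))

# Keep only the genes present in both matrices, in the same row order
common <- intersect(rownames(toy.tcga), rownames(toy.gtex))
merged <- cbind(toy.tcga[common, ], toy.gtex[common, ])

# Label each sample by its source study
group <- factor(c(rep("tumor", ncol(toy.tcga)), rep("normal", ncol(toy.gtex))))

# TMM normalization factors, then log2 counts per million
dge <- DGEList(counts = merged, group = group)
dge <- calcNormFactors(dge, method = "TMM")
logcpm <- cpm(dge, log = TRUE)
```

With the real data, merged would be built from count.tcga and count.gtex in the same way, provided both use the same gene identifiers as row names.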
sessionInfo()
R version 4.0.4 (2021-02-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] genefilter_1.72.1 gProfileR_0.7.0 RColorBrewer_1.1-2 survminer_0.4.9
[5] ggpubr_0.4.0 survival_3.2-7 gplots_3.1.1 SummarizedExperiment_1.20.0
[9] Biobase_2.50.0 GenomicRanges_1.42.0 GenomeInfoDb_1.26.7 IRanges_2.24.1
[13] S4Vectors_0.28.1 BiocGenerics_0.36.1 MatrixGenerics_1.2.1 matrixStats_0.61.0
[17] caret_6.0-90 lattice_0.20-41 FactoMineR_2.4 factoextra_1.0.7
[21] ggplot2_3.3.5 glmnet_4.1-3 Matrix_1.3-2 edgeR_3.32.1
[25] limma_3.46.0 TCGAbiolinks_2.18.0
loaded via a namespace (and not attached):
[1] backports_1.3.0 BiocFileCache_1.14.0 plyr_1.8.6 splines_4.0.4
[5] listenv_0.8.0 digest_0.6.28 foreach_1.5.1 htmltools_0.5.2
[9] fansi_0.5.0 magrittr_2.0.1 memoise_2.0.0 cluster_2.1.0
[13] tzdb_0.2.0 annotate_1.68.0 recipes_0.1.17 globals_0.14.0
[17] readr_2.0.2 gower_0.2.2 R.utils_2.11.0 askpass_1.1
[21] prettyunits_1.1.1 colorspace_2.0-2 blob_1.2.2 rvest_1.0.2
[25] rappdirs_0.3.3 ggrepel_0.9.1 xfun_0.26 dplyr_1.0.7
[29] crayon_1.4.2 RCurl_1.98-1.5 jsonlite_1.7.2 zoo_1.8-9
[33] iterators_1.0.13 glue_1.4.2 gtable_0.3.0 ipred_0.9-12
[37] zlibbioc_1.36.0 XVector_0.30.0 DelayedArray_0.16.3 car_3.0-12
[41] future.apply_1.8.1 shape_1.4.6 abind_1.4-5 scales_1.1.1
[45] DBI_1.1.1 rstatix_0.7.0 Rcpp_1.0.7 xtable_1.8-4
[49] progress_1.2.2 flashClust_1.01-2 bit_4.0.4 km.ci_0.5-2
[53] lava_1.6.10 prodlim_2019.11.13 DT_0.19 htmlwidgets_1.5.4
[57] httr_1.4.2 ellipsis_0.3.2 pkgconfig_2.0.3 XML_3.99-0.8
[61] R.methodsS3_1.8.1 nnet_7.3-15 dbplyr_2.1.1 locfit_1.5-9.4
[65] utf8_1.2.2 tidyselect_1.1.1 rlang_0.4.12 reshape2_1.4.4
[69] AnnotationDbi_1.52.0 munsell_0.5.0 tools_4.0.4 cachem_1.0.6
[73] downloader_0.4 generics_0.1.1 RSQLite_2.2.8 broom_0.7.10
[77] stringr_1.4.0 fastmap_1.1.0 ModelMetrics_1.2.2.2 knitr_1.36
[81] bit64_4.0.5 survMisc_0.5.5 caTools_1.18.2 purrr_0.3.4
[85] future_1.23.0 nlme_3.1-152 R.oo_1.24.0 leaps_3.1
[89] xml2_1.3.2 biomaRt_2.46.3 compiler_4.0.4 curl_4.3.2
[93] ggsignif_0.6.3 tibble_3.1.5 stringi_1.7.5 TCGAbiolinksGUI.data_1.10.0
[97] KMsurv_0.1-5 vctrs_0.3.8 pillar_1.6.4 lifecycle_1.0.1
[101] BiocManager_1.30.16 data.table_1.14.2 bitops_1.0-7 R6_2.5.1
[105] gridExtra_2.3 KernSmooth_2.23-18 parallelly_1.28.1 codetools_0.2-18
[109] MASS_7.3-53 gtools_3.9.2 assertthat_0.2.1 openssl_1.4.5
[113] withr_2.4.2 GenomeInfoDbData_1.2.4 hms_1.1.1 grid_4.0.4
[117] rpart_4.1-15 timeDate_3043.102 tidyr_1.1.4 class_7.3-18
[121] carData_3.0-4 pROC_1.18.0 scatterplot3d_0.3-41 lubridate_1.8.0
[125] tinytex_0.35