TPP and NPARC: A problem with converting the data frame.
yangwanqi • 0


# tppData <- readRDS("../data/tppData.Rds")


sessionInfo()

Could you please help us convert the TPP result to a "tidy format"?


A first comment upfront: I would have found it hard to understand your question without the email you sent earlier. To clarify: the problem concerns converting the ExpressionSet structure, obtained after importing the data of a "thermal proteome profiling" experiment with the TPP package, into the long-format data frame that the NPARC package needs as input (https://bioconductor.org/packages/release/bioc/vignettes/NPARC/inst/doc/NPARC.html).

Secondly, it would help if you also posted the output that appears in your console when you type sessionInfo().

Assuming you have imported data using the functions from the TPP package similar to:

library(TPP)
data("hdacTR_smallExample")
trData <- tpptrImport(configTable = hdacTR_config, data = hdacTR_data)

you can convert the obtained ExpressionSet into a long format data frame by using the Bioconductor package biobroom:

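library(dplyr)  # provides bind_rows(), mutate() and %>%

# Tidy each ExpressionSet in the list and record which data set it came from
trTidyData <- bind_rows(lapply(names(trData), function(eset_name){
   biobroom::tidy.ExpressionSet(trData[[eset_name]], addPheno = TRUE) %>% 
      mutate(dataset = eset_name)
}))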

This results in the following data frame:

trTidyData

# # A tibble: 20,340 x 7
#    gene   sample      label temperature normCoeff value dataset
#    <chr>  <chr>       <chr>       <dbl> <lgl>     <dbl> <chr>
#  1 AAK1   rel_fc_131L 131L           37 NA            1 Vehicle_1
#  2 AAMDC  rel_fc_131L 131L           37 NA            1 Vehicle_1
#  3 ACACA  rel_fc_131L 131L           37 NA            1 Vehicle_1
#  4 ACAP2  rel_fc_131L 131L           37 NA            1 Vehicle_1
#  5 ACBD6  rel_fc_131L 131L           37 NA            1 Vehicle_1
#  6 ACO2   rel_fc_131L 131L           37 NA            1 Vehicle_1
#  7 ACTR1B rel_fc_131L 131L           37 NA            1 Vehicle_1
#  8 ADI1   rel_fc_131L 131L           37 NA            1 Vehicle_1
#  9 AIMP1  rel_fc_131L 131L           37 NA            1 Vehicle_1
# 10 AIMP2  rel_fc_131L 131L           37 NA            1 Vehicle_1
# # … with 20,330 more rows
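
As a side note, the stacking step can be tried out without the TPP data. The following is only a minimal sketch on a made-up named list (toy_list and its values are hypothetical, not part of TPP or biobroom): bind_rows() from dplyr can write each list name into a new column via its .id argument, which has the same effect as the mutate(dataset = ...) call in the conversion code.

```r
library(dplyr)

# Toy named list standing in for the per-dataset tidy tables (made-up values).
toy_list <- list(
  Vehicle_1 = tibble(gene = c("AAK1", "AAMDC"), value = c(1, 1)),
  Vehicle_2 = tibble(gene = c("AAK1", "AAMDC"), value = c(1, 1))
)

# bind_rows() stacks the tables; .id stores each list name in a new column.
toy_long <- bind_rows(toy_list, .id = "dataset")
toy_long
```

This shortcut only works if the list carries names, which is why the toy list is built with explicit names here.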

To adapt the column names to match the ones from the NPARC example:

library(NPARC)
data("stauro_TPP_data_tidy")
stauro_TPP_data_tidy

# # A tibble: 307,080 x 7
#    dataset uniqueID relAbundance temperature compoundConcent… replicate
#    <chr>   <chr>           <dbl>       <dbl>            <dbl>     <int>
#  1 Stauro… 15 KDA …        1.00           40               20         1
#  2 Stauro… 15 KDA …        1.39           43               20         1
#  3 Stauro… 15 KDA …        0.987          46               20         1
#  4 Stauro… 15 KDA …        1.33           49               20         1
#  5 Stauro… 15 KDA …        0.959          52               20         1
#  6 Stauro… 15 KDA …        0.789          55               20         1
#  7 Stauro… 15 KDA …        0.807          58               20         1
#  8 Stauro… 15 KDA …        1.27           61               20         1
#  9 Stauro… 15 KDA …        0.688          64               20         1
# 10 Stauro… 15 KDA …        0.655          67               20         1
# # … with 307,070 more rows, and 1 more variable: uniquePeptideMatches <dbl>

you can now use dplyr:

trTidyData %>% dplyr::select(dataset, uniqueID = gene, relAbundance = value, temperature) # and so on

# # A tibble: 20,340 x 4
#    dataset   uniqueID relAbundance temperature
#    <chr>     <chr>           <dbl>       <dbl>
#  1 Vehicle_1 AAK1                1          37
#  2 Vehicle_1 AAMDC               1          37
#  3 Vehicle_1 ACACA               1          37
#  4 Vehicle_1 ACAP2               1          37
#  5 Vehicle_1 ACBD6               1          37
#  6 Vehicle_1 ACO2                1          37
#  7 Vehicle_1 ACTR1B              1          37
#  8 Vehicle_1 ADI1                1          37
#  9 Vehicle_1 AIMP1               1          37
# 10 Vehicle_1 AIMP2               1          37
# # … with 20,330 more rows

Based on this data frame you should be able to perform the NPARC analysis as described in the vignette.
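
The "and so on" in the select() call leaves the remaining columns open. One of them, replicate, could be derived from the dataset name. The sketch below is only an illustration on made-up toy data: it assumes that your dataset names follow the pattern "<condition>_<replicate>" (as in "Vehicle_1"), and the condition column name and the "Treatment" label are hypothetical. Please check how your own names are built before using it.

```r
library(dplyr)

# Toy stand-in for the data frame after the select() step (values are made up).
toy <- tibble(
  dataset      = c("Vehicle_1", "Vehicle_2", "Treatment_1", "Treatment_2"),
  uniqueID     = c("AAK1", "AAK1", "AAK1", "AAK1"),
  relAbundance = c(1, 1, 1, 1),
  temperature  = c(37, 37, 37, 37)
)

# Assumption: dataset names have the form "<condition>_<replicate>".
toy_annotated <- toy %>%
  mutate(
    # hypothetical column: everything before the trailing "_<number>"
    condition = sub("_[0-9]+$", "", dataset),
    # trailing number, stored as integer like the replicate column in the NPARC example
    replicate = as.integer(sub("^.*_([0-9]+)$", "\\1", dataset))
  )

toy_annotated
```

If your dataset names do not end in an underscore followed by a number, the regular expressions leave the name unchanged and as.integer() returns NA with a warning, so a quick look at the resulting replicate column is worthwhile.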

sessionInfo()

#R version 4.0.0 Patched (2020-05-04 r78358)
#Platform: x86_64-apple-darwin17.0 (64-bit)
#Running under: macOS Mojave 10.14.6
#
#Matrix products: default
#BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
#LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
#
#locale:
#[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#
#attached base packages:
#[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     
#
#other attached packages:
#[1] TPP_3.17.0          tidyr_1.1.0         magrittr_1.5        dplyr_1.0.0        
#[5] Biobase_2.48.0      BiocGenerics_0.34.0 NPARC_1.1.1        
#
#loaded via a namespace (and not attached):
# [1] pkgload_1.1.0        VGAM_1.1-3           splines_4.0.0       
# [4] foreach_1.5.0        assertthat_0.2.1     stats4_4.0.0        
# [7] nls2_0.2             cellranger_1.1.0     yaml_2.2.1          
# [10] remotes_2.1.1        sessioninfo_1.1.1    pillar_1.4.4        
# [13] backports_1.1.7      lattice_0.20-41      glue_1.4.1          
# [16] limma_3.44.1         digest_0.6.25        RColorBrewer_1.1-2  
# [19] colorspace_1.4-1     htmltools_0.5.0      plyr_1.8.6          
# [22] pkgconfig_2.0.3      devtools_2.3.0       broom_0.5.6         
# [25] purrr_0.3.4          scales_1.1.1         processx_3.4.2      
# [28] VennDiagram_1.6.20   openxlsx_4.1.5       BiocParallel_1.22.0 
# [31] tibble_3.0.1         generics_0.0.2       ggplot2_3.3.2       
# [34] usethis_1.6.1        ellipsis_0.3.1       withr_2.2.0         
# [37] cli_2.0.2            crayon_1.3.4         readxl_1.3.1        
# [40] evaluate_0.14        memoise_1.1.0        ps_1.3.3            
# [43] fs_1.4.1             fansi_0.4.1          doParallel_1.0.15   
# [46] nlme_3.1-148         MASS_7.3-51.6        pkgbuild_1.0.8      
# [49] tools_4.0.0          data.table_1.12.8    prettyunits_1.1.1   
# [52] formatR_1.7          lifecycle_0.2.0      stringr_1.4.0       
# [55] munsell_0.5.0        zip_2.0.4            lambda.r_1.2.4      
# [58] callr_3.4.3          compiler_4.0.0       rlang_0.4.6         
# [61] RCurl_1.98-1.2       futile.logger_1.4.3  grid_4.0.0          
# [64] iterators_1.0.12     rstudioapi_0.11      bitops_1.0-6        
# [67] rmarkdown_2.2        testthat_2.3.2       gtable_0.3.0        
# [70] codetools_0.2-16     reshape2_1.4.4       R6_2.4.1            
# [73] gridExtra_2.3        knitr_1.28           utf8_1.1.4          
# [76] rprojroot_1.3-2      futile.options_1.0.1 desc_1.2.0          
# [79] stringi_1.4.6        Rcpp_1.0.4.6         biobroom_1.20.0     
# [82] vctrs_0.3.0          tidyselect_1.1.0     xfun_0.14