BiocCheck object initialization - NOTE
ullrich ▴ 20
@ullrich-23697
Germany

Hi,

BiocCheck gives the following NOTE:

NOTE: Consider clarifying how 9 object(s) are initialized. Maybe they are part of a data set loaded with data(), or perhaps part of an object referenced in with() or within().

function (object)
codonmat2xy (Codon)
codonmat2xy (syn)
codonmat2xy (nonsyn)
codonmat2xy (SynSum)
codonmat2xy (NonSynSum)
codonmat2xy (IndelSum)
codonmat2xy (SynMean)
codonmat2xy (NonSynMean)
codonmat2xy (IndelMean)

However, the objects are initialized within a parallel foreach loop and then processed with dplyr.

Is there a way to handle this kind of object initialization without getting a NOTE from BiocCheck?

Thank you in anticipation

Best regards

Kristian Ullrich

codonmat2xy <- function(codonmat, threads = 1){
  doMC::registerDoMC(threads)
  i <- NULL
  j <- NULL
  k <- NULL
  OUT <- foreach(i = seq(from = 1, to = nrow(codonmat)), .combine=rbind) %dopar% {
    foreach(j = seq(from = 1, to = ncol(codonmat) - 1), .combine=rbind) %do% {
      foreach(k = seq(from = j + 1, to = ncol(codonmat)), .combine=rbind) %do% {
        c(setNames(i, "Codon"),
          setNames(j, "Comp1"),
          setNames(k, "Comp2"),
          setNames(compareCodons(codonmat[i, j], codonmat[i, k]), c("syn", "nonsyn", "indel")))
      }
    }
  }
  OUT <- as.data.frame(OUT)
  OUT.NAs <- OUT %>% dplyr::group_by(Codon) %>% dplyr::filter(!is.na(syn)) %>%
                     dplyr::count(Codon)
  OUT.SynSum <- OUT %>% dplyr::group_by(Codon) %>%
                        dplyr::summarise(SynSum = sum(syn, na.rm = TRUE))
  OUT.NonSynSum <- OUT %>% dplyr::group_by(Codon) %>%
                           dplyr::summarise(NonSynSum = sum(nonsyn, na.rm = TRUE))
  OUT.IndelSum <- OUT %>% dplyr::group_by(Codon) %>%
                          dplyr::summarise(IndelSum = sum(indel, na.rm = TRUE))
  OUT.join <- dplyr::left_join(OUT.NAs, OUT.SynSum) %>%
             dplyr::left_join(OUT.NonSynSum) %>% dplyr::left_join(OUT.IndelSum)
  OUT.xy <- OUT.join %>% dplyr::mutate(SynMean = SynSum/n,
                                       NonSynMean = NonSynSum/n,
                                       IndelMean = IndelSum/n)
  OUT.xy <- OUT.xy %>% dplyr::mutate(CumSumSynMean = cumsum(SynMean),
                                       CumSumNonSynMean = cumsum(NonSynMean),
                                       CumSumIndelMean = cumsum(IndelMean))
  return(OUT.xy)
}
BiocCheck • 1.1k views
@marcel-ramos-7325
United States

Hi Ullrich,

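This is often the case when using non-standard evaluation (NSE) in dplyr: the bare column names look like undefined global variables to the code checker. You can call utils::globalVariables() at the top of your R file to declare these names and avoid the NOTE.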
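As a rough sketch of that suggestion, the call could look like the following. The names are the nine objects listed in the NOTE; the file name in the comment is only an assumption, since any file under R/ in the package should work.

```r
# Place at the top level of an R file in the package, for example
# R/globals.R (hypothetical file name, not taken from the thread).
# The names below are the nine objects listed in the BiocCheck NOTE.
utils::globalVariables(c(
    "Codon", "syn", "nonsyn",
    "SynSum", "NonSynSum", "IndelSum",
    "SynMean", "NonSynMean", "IndelMean"
))
```

This only tells the code checker that the names are intentional; it does not create any objects or change how the dplyr calls are evaluated.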

Best regards,

Marcel


Thank you Marcel,

I was able to solve it by adding #' @importFrom rlang .data to the roxygen block and referring to the columns through the .data pronoun.

Now the code looks like this:

codonmat2xy <- function(codonmat, threads = 1){
    cl <- parallel::makeForkCluster(threads)
    doParallel::registerDoParallel(cl)
    i <- NULL
    j <- NULL
    k <- NULL
    OUT <- foreach::foreach(i = seq(from = 1, to = nrow(codonmat)),
            .combine=rbind, .packages = c('foreach')) %dopar% {
        foreach::foreach(j = seq(from = 1, to = ncol(codonmat) - 1),
            .combine=rbind) %do% {
            foreach::foreach(k = seq(from = j + 1, to = ncol(codonmat)),
                .combine=rbind) %do% {
                c(setNames(i, "Codon"),
                    setNames(j, "Comp1"),
                    setNames(k, "Comp2"),
                    setNames(distSTRING::compareCodons(codonmat[i, j],
                        codonmat[i, k]), c("syn", "nonsyn", "indel")))
            }
        }
    }
    parallel::stopCluster(cl)
    OUT <- as.data.frame(OUT)
    OUT.NAs <- OUT %>% dplyr::group_by(.data$Codon) %>%
        dplyr::filter(!is.na(.data$syn)) %>%
        dplyr::count(.data$Codon)
    OUT.SynSum <- OUT %>% dplyr::group_by(.data$Codon) %>%
        dplyr::summarise(SynSum = sum(.data$syn, na.rm = TRUE))
    OUT.NonSynSum <- OUT %>% dplyr::group_by(.data$Codon) %>%
        dplyr::summarise(NonSynSum = sum(.data$nonsyn, na.rm = TRUE))
    OUT.IndelSum <- OUT %>% dplyr::group_by(.data$Codon) %>%
        dplyr::summarise(IndelSum = sum(.data$indel, na.rm = TRUE))
    OUT.join <- dplyr::left_join(OUT.NAs, OUT.SynSum) %>%
        dplyr::left_join(OUT.NonSynSum) %>% dplyr::left_join(OUT.IndelSum)
    OUT.xy <- OUT.join %>% dplyr::mutate(SynMean = .data$SynSum/.data$n,
        NonSynMean = .data$NonSynSum/.data$n,
        IndelMean = .data$IndelSum/.data$n)
    OUT.xy <- OUT.xy %>%
        tibble::add_column(CumSumSynMean = cumsum(OUT.xy$SynMean),
        CumSumNonSynMean = cumsum(OUT.xy$NonSynMean),
        CumSumIndelMean = cumsum(OUT.xy$IndelMean))
    return(OUT.xy)
}
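
For readers who want to try the .data pattern in isolation, here is a minimal sketch on a toy data frame. The data values and the helper name syn_sum_by_codon are made up for illustration and are not part of the package above; it assumes dplyr is installed.

```r
# Toy data standing in for the OUT data frame (values are invented).
toy <- data.frame(
    Codon = c(1, 1, 2, 2),
    syn   = c(0.5, NA, 1, 0)
)

# Hypothetical helper: sums 'syn' per codon. Columns are accessed via the
# .data pronoun, so no bare column names appear in the function body.
# Inside a package, the roxygen tag "@importFrom rlang .data" would
# additionally be needed so that '.data' itself is a known symbol.
syn_sum_by_codon <- function(df) {
    grouped <- dplyr::group_by(df, .data$Codon)
    dplyr::summarise(grouped, SynSum = sum(.data$syn, na.rm = TRUE))
}

res <- syn_sum_by_codon(toy)
print(res)
```

Compared with utils::globalVariables(), this approach keeps the column references explicit at the point of use, at the cost of slightly more verbose dplyr calls.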