FlowSOM and ConsensusClusterPlus: reproducibility issue
@lukas-weber-10656
Johns Hopkins Bloomberg School of Publi…

Hi,

I have a question about reproducibility of meta-clustering results in FlowSOM.

My colleague Malgorzata Nowicka noticed a while ago that the 'metaClustering_consensus' function does not give reproducible results, even after setting a seed with 'set.seed()'. We then traced this to the internal call to 'ConsensusClusterPlus::ConsensusClusterPlus', which sets its own seed to 'as.numeric(Sys.time())' whenever the 'seed' argument is not specified, and so overrides any seed set externally with 'set.seed()'.
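
To illustrate the mechanism, here is a minimal base R sketch. The function 'clock_seeded_draw' is a hypothetical stand-in that only mimics the seeding behaviour described above; it is not the actual ConsensusClusterPlus code.

```r
# Toy stand-in (hypothetical): reseeds from the clock whenever its own
# `seed` argument is left as NULL, as described for ConsensusClusterPlus.
clock_seeded_draw <- function(n, seed = NULL) {
  if (is.null(seed)) seed <- as.numeric(Sys.time())
  set.seed(seed)                 # overwrites the RNG state set by the caller
  sample.int(1e6, n)
}

set.seed(42)
expected <- sample.int(1e6, 5)   # what the caller's seed alone would give

set.seed(42)
observed <- clock_seeded_draw(5) # internal set.seed() replaces the seed 42;
                                 # the result now depends on the clock

# Passing the seed through the function's own argument is repeatable
fixed_a <- clock_seeded_draw(5, seed = 42)
fixed_b <- clock_seeded_draw(5, seed = 42)
identical(fixed_a, fixed_b)
```

In this sketch, 'observed' is determined by the system time rather than by the 'set.seed(42)' call that precedes it, which is the same pattern we ran into.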

The FlowSOM authors kindly provided us with a bug fix which solved the problem (by including an additional seed argument in 'metaClustering_consensus'), but we have noticed that this bug fix was never included in the version on Bioconductor.

It would be great if this bug fix could be included in the release version on Bioconductor. We have found the FlowSOM package to be very useful in our CyTOF data analysis pipelines, and having this seed argument in the release version would make things easier for getting reproducible results.

In addition, it may also be useful for the ConsensusClusterPlus authors to consider removing the default setting of the seed to 'as.numeric(Sys.time())', since users will often set a seed with 'set.seed()' at the top of their analysis script, and expect it to propagate through.
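
As a rough sketch of what I mean (the function name 'caller_seeded_draw' is hypothetical, and this is only one possible design, not a patch for ConsensusClusterPlus), a function could leave the random number generator untouched unless a seed is explicitly supplied:

```r
# Hypothetical alternative: only touch the RNG state when a seed is supplied,
# so that a set.seed() call at the top of a script propagates through.
caller_seeded_draw <- function(n, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  sample.int(1e6, n)
}

set.seed(42)
run_a <- caller_seeded_draw(5)

set.seed(42)
run_b <- caller_seeded_draw(5)

identical(run_a, run_b)  # both runs start from the same caller-set RNG state
```

With this design, users who want the old behaviour could still pass 'seed = as.numeric(Sys.time())' themselves.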

I have pasted a copy of the updated 'FlowSOM::metaClustering_consensus' function below (with the additional seed argument), for reference.

Thanks again for creating this very useful package.

Best regards,

Lukas

 

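# Updated version with an additional 'seed' argument, which is passed
# straight through to ConsensusClusterPlus (seed=NULL keeps the old behaviour)
metaClustering_consensus <- function(data, k=7, seed=NULL){
    results <- suppressMessages(ConsensusClusterPlus::ConsensusClusterPlus(
                                t(data),
                                maxK=k, reps=100, pItem=0.9, pFeature=1,
                                title=tempdir(), plot="pdf", verbose=FALSE,
                                clusterAlg="hc",
                                distance="euclidean",
                                seed=seed
    ))

    # return the meta-cluster label of each row of 'data' for k clusters
    results[[k]]$consensusClass
}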
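
For anyone who wants to check this on their own installation, here is a small sketch. It assumes a FlowSOM build whose 'metaClustering_consensus' already has the 'seed' argument (the code is skipped otherwise), and the toy matrix is made up purely for illustration.

```r
# Assumes FlowSOM is installed and its metaClustering_consensus() has the
# `seed` argument described in this post; otherwise nothing is run.
if (requireNamespace("FlowSOM", quietly = TRUE) &&
    "seed" %in% names(formals(FlowSOM::metaClustering_consensus))) {
  set.seed(1)                               # only used to generate toy data
  toy <- matrix(rnorm(50 * 4), nrow = 50)   # 50 rows (e.g. nodes) x 4 markers

  # Same data, same seed argument: the two label vectors can be compared
  meta_a <- FlowSOM::metaClustering_consensus(toy, k = 5, seed = 1234)
  meta_b <- FlowSOM::metaClustering_consensus(toy, k = 5, seed = 1234)
  print(identical(meta_a, meta_b))
}
```

Calling the function twice with 'seed = NULL' instead is the quickest way to see whether the clock-based seeding is still in effect.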