Question: Question about seed in ConsensusClusterPlus, can't repeat the result
13 months ago, afraTW wrote:

Hello everyone,

I am trying to use ConsensusClusterPlus to subgroup my samples. When no seed is supplied, the package generates a random seed and writes it to the output log file. However, I found that I could not reproduce the result by passing that seed back in, even with the demo dataset. The code and the results I get are listed below.

My other question is about reproducibility. With my own data, the output from a manually assigned seed (seed = as.numeric(Sys.time())) is very different from the output when the package generates the seed itself. All the "non_seeded_outputs" I get without supplying a seed are very similar to each other (with bootstrap=1000), and all the "seeded_outputs" are also similar to each other (also with bootstrap=1000). But the "non_seeded_outputs" and the "seeded_outputs" are very different. :(

I appreciate any suggestions.
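For reference, the seeding behaviour of base R can be checked in isolation. The following is a minimal sketch of `set.seed()` only, not of ConsensusClusterPlus internals; `draw_subsample` is a hypothetical helper standing in for a single resampling step. It checks that the same seed reproduces the same draws, and that `set.seed()` interprets its argument as an integer, so a seed such as 1475989793.02773 behaves like 1475989793.

```r
# Hypothetical helper: seed the RNG, then draw a subsample of item indices.
draw_subsample <- function(seed, n_items = 128, p_item = 0.8) {
  set.seed(seed)
  sort(sample(n_items, size = floor(p_item * n_items)))
}

a     <- draw_subsample(1475989793.02773)
b     <- draw_subsample(1475989793.02773)  # same seed again
c_int <- draw_subsample(1475989793)        # fractional part dropped

print(identical(a, b))      # same seed, same draws
print(identical(a, c_int))  # set.seed() treats the seed as an integer
```

If both comparisons print TRUE, the fractional part of a time-based seed is not by itself a source of differences between runs.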

                                                                                                                          


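##### prepare data #####
library(ALL)                  # provides the ALL expression dataset
library(ConsensusClusterPlus)
data(ALL)
d = exprs(ALL)
mads = apply(d, 1, mad)
d = d[rev(order(mads))[1:5000],] # keep the 5000 probes with the highest MAD
d = sweep(d,1,apply(d,1,median, na.rm=T)) # subtract each row's median from the values in that row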
######## running consensus clustering ##########

rep_times = 50

title = paste("pam_no_seed_rep_", rep_times, sep="")

results1 = ConsensusClusterPlus(d,maxK=6, reps= rep_times, pItem=0.8, pFeature=0.8, title=title,clusterAlg="pam",distance="pearson",plot="png", writeTable = TRUE)

# read the seed that the unseeded run wrote to its log file
logInfo = read.delim(paste(getwd(), "/", title, "/", title, ".log.csv", sep=""), sep=",")
seed = logInfo[13,2]

# first run with an explicit seed
title=paste(title, "_seed_", seed, sep="")
results2 = ConsensusClusterPlus(d,maxK=6, reps=rep_times, pItem=0.8, pFeature=0.8, title=title,clusterAlg="pam",distance="pearson",seed = 1475989793.02773, plot="png", writeTable = TRUE)

# second run with the same explicit seed
title=paste(title, "_2_seed_", seed, sep="")
results3 = ConsensusClusterPlus(d,maxK=6, reps=rep_times, pItem=0.8, pFeature=0.8, title=title,clusterAlg="pam",distance="pearson",seed = 1475989793.02773, plot="png", writeTable = TRUE)

# unseeded run vs. first seeded run
identical(results1, results2)

# first seeded run vs. second seeded run
identical(results2, results3)
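`identical()` on the full result lists is a strict test: a difference in any component makes it return FALSE. A more targeted check is to compare the cluster assignments for each k. The sketch below assumes the result structure described in the package documentation, where element k of the returned list holds a `consensusClass` vector. `same_partition` and `compare_runs` are hypothetical helpers, and the mock lists only stand in for real ConsensusClusterPlus output.

```r
# Hypothetical helper: TRUE if two label vectors define the same grouping,
# even when the cluster numbers are permuted.
same_partition <- function(x, y) {
  if (length(x) != length(y)) return(FALSE)
  tab <- table(x, y)
  # every cluster in x must map onto exactly one cluster in y, and vice versa
  all(rowSums(tab > 0) == 1) && all(colSums(tab > 0) == 1)
}

# Hypothetical helper: compare consensusClass for each k >= 2 between two runs.
compare_runs <- function(res_a, res_b) {
  ks <- 2:min(length(res_a), length(res_b))
  out <- vapply(ks, function(k) {
    same_partition(res_a[[k]]$consensusClass, res_b[[k]]$consensusClass)
  }, logical(1))
  names(out) <- paste0("k=", ks)
  out
}

# Mock data standing in for two runs (not real output).
run_a <- list(NULL,
              list(consensusClass = c(s1 = 1, s2 = 1, s3 = 2, s4 = 2)),
              list(consensusClass = c(s1 = 1, s2 = 2, s3 = 3, s4 = 3)))
run_b <- list(NULL,
              list(consensusClass = c(s1 = 2, s2 = 2, s3 = 1, s4 = 1)),
              list(consensusClass = c(s1 = 1, s2 = 1, s3 = 2, s4 = 3)))

print(compare_runs(run_a, run_b))
```

With real output, `compare_runs(results2, results3)` would show for which values of k the two seeded runs agree, which narrows down where any difference arises.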

 

 


I haven't had this problem running ConsensusClusterPlus: with the seed parameter set, my results are always the same. However, I don't normally resample the features. Try setting that parameter (pFeature) to 1.
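The suggestion above could be tried along these lines. This is a sketch only: it uses a small random matrix rather than the ALL data, `run_ccp` is a hypothetical wrapper, the seed value 42 is arbitrary, and the code is guarded so that it only runs where ConsensusClusterPlus is installed. It disables feature resampling (`pFeature = 1`), passes an explicit integer seed, and compares the class assignments of two runs.

```r
# Hypothetical wrapper around the call from the question, with pFeature = 1
# (no feature resampling) and an explicit seed. Plotting and table output
# are left at the package defaults.
run_ccp <- function(d, seed, reps = 20) {
  ConsensusClusterPlus::ConsensusClusterPlus(
    d, maxK = 4, reps = reps, pItem = 0.8, pFeature = 1,
    clusterAlg = "pam", distance = "pearson", seed = seed
  )
}

if (requireNamespace("ConsensusClusterPlus", quietly = TRUE)) {
  # Random stand-in data: 200 features (rows) x 30 samples (columns).
  set.seed(1)
  d_demo <- matrix(rnorm(200 * 30), nrow = 200,
                   dimnames = list(paste0("g", 1:200), paste0("s", 1:30)))

  r1 <- run_ccp(d_demo, seed = 42)
  r2 <- run_ccp(d_demo, seed = 42)

  # Compare the class assignments of the two runs for each k.
  for (k in 2:4) {
    cat("k =", k, ":",
        identical(r1[[k]]$consensusClass, r2[[k]]$consensusClass), "\n")
  }
} else {
  message("ConsensusClusterPlus is not installed; skipping the demo run.")
}
```

If the assignments match with `pFeature = 1` but not with `pFeature = 0.8`, that would point at the feature resampling step rather than at the seed itself.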

written 13 months ago by chris86340