A question about the metadata included with a particular ExperimentHub dataset.
HSMMSingleCell is the most popular RNA-seq dataset in the Bioconductor ExperimentHub as of early 2022.
The vignette describes how the sequencing was performed and how expression was calculated, but several metadata columns are left unexplained, perhaps because they seemed too obvious to document.
Each sample has a Media annotation that overlaps with the timepoint, so I can guess what it means.
Each sample has a 'State' value of 1, 2, or 3 that roughly correlates with other variables such as timepoint. If I filter for a particular state I get clearer results, so I guess the samples were segregated on some phenotype. This feels important for interpreting results, because samples within one state are more homogeneous than samples within one timepoint.
There's also a Pseudotime column that roughly correlates with timepoint, but there is no explanation of what it is.
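To make the "correlates with timepoint" observation concrete, here is a minimal sketch of the kind of cross-tabulation I mean. It is plain Python rather than the Bioconductor workflow itself, the rows are made up for illustration and are not taken from HSMMSingleCell, and `timepoint` is a placeholder name for whatever the real column is called.

```python
from collections import Counter

# Toy sample annotations: these rows are invented for illustration only,
# they are NOT values from HSMMSingleCell. "timepoint" is a placeholder key.
samples = [
    {"timepoint": 0, "State": 1},
    {"timepoint": 0, "State": 1},
    {"timepoint": 24, "State": 1},
    {"timepoint": 24, "State": 2},
    {"timepoint": 48, "State": 2},
    {"timepoint": 48, "State": 3},
    {"timepoint": 72, "State": 3},
    {"timepoint": 72, "State": 2},
]


def cross_tab(rows, row_key, col_key):
    """Count samples for every (row_key, col_key) combination."""
    return Counter((r[row_key], r[col_key]) for r in rows)


table = cross_tab(samples, "timepoint", "State")
states = sorted({r["State"] for r in samples})

# Print one row per timepoint and one column per State.
print("timepoint", *[f"State{s}" for s in states], sep="\t")
for tp in sorted({r["timepoint"] for r in samples}):
    print(tp, *[table[(tp, s)] for s in states], sep="\t")
```

If State were just a relabeling of timepoint, each row of such a table would have a single non-zero column. A table with counts spread across columns is what makes me think State captures something else.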
Is there a publication for this dataset?
It's a few years old, but it is top ranked among RNA-seq expression sets, so presumably others have used this dataset successfully.