Splatter (unknown correlation between gene expression level in different cell types)
Fatima • 0
United States


I'm using Splatter to generate single-cell simulated data.

I'm using these parameters:

### Generate single-cell simulated object
library(splatter)

K <- 5    # number of cell types
Ng <- 10  # number of genes
Ns <- 20  # number of samples
vcf <- mockVCF(n.samples = Ns)
gff <- mockGFF(n.genes = Ng)
params.group <- newSplatPopParams(batchCells = 100,  # number of cells in each batch
                                  similarity.scale = 1,
                                  de.downProb = c(0.1, 0.4, 0.3, 0.6, 0.5),
                                  de.prob = c(0.3, 0.1, 0.2, 0.01, 0.4),
                                  de.facLoc = c(0.6, 5, 0.1, 0.01, 2),
                                  de.facScale = c(0.1, 0.4, 2, 0.5, 0.4),
                                  group.prob = rep(1/K, K))
sim.means <- splatPopSimulateMeans(vcf = vcf, gff = gff,
                                   params = params.group)
sim.sc.gr <- splatPopSimulateSC(sim.means = sim.means$means,  # simulated gene means from the previous step
                                params = params.group,
                                key = sim.means$key,
                                sparsify = FALSE)

sce <- sim.sc.gr

Figure 1 shows the heatmap of the normalized count matrix (SCcount).

Then I aggregated the SCcount matrix across cluster-sample groups.

### Aggregate count matrix across cluster-sample groups
library(SummarizedExperiment)  # assays()
library(Matrix.utils)          # aggregate.Matrix()
library(ggplot2)               # ggsave()

# `groups` holds the cluster-sample label of each cell (its construction is not shown here)
pb <- aggregate.Matrix(t(assays(sce)$logcounts),
                       groupings = groups, fun = "mean")

annot_cols <- data.frame(
  Group = rep(c("CellA", "CellB", "CellC", "CellD", "CellE"), each = Ns),
  row.names = colnames(t(pb))
)
g <- pheatmap::pheatmap(t(pb), annotation_col = annot_cols, cluster_rows = FALSE, cluster_cols = FALSE, show_colnames = FALSE)
ggsave(filename = paste0(path, "sample-Celltypeheatmap.pdf"), plot = g, width = 30, height = 10, limitsize = FALSE)

Figure 2 shows the aggregated Sample-Celltype matrix.
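Since `groups` is not shown in this post, here is a minimal base-R sketch of what the aggregation step does, on a made-up toy matrix. The labels `cell_type` and `sample_id` and all values are hypothetical; in the real data they would come from the simulated object's cell metadata.

```r
# Toy log-expression matrix: 3 genes x 6 cells (values are made up).
set.seed(1)
logexpr <- matrix(rnorm(18, mean = 2), nrow = 3,
                  dimnames = list(paste0("g", 1:3), paste0("cell", 1:6)))

# Hypothetical per-cell labels, combined into one cluster-sample label per cell.
cell_type <- c("CellA", "CellA", "CellB", "CellB", "CellA", "CellB")
sample_id <- c("S1", "S1", "S1", "S2", "S2", "S2")
groups <- paste(cell_type, sample_id, sep = ".")

# rowsum() sums the rows (cells) that share a label;
# dividing each row by its group size turns the sums into means.
sums <- rowsum(t(logexpr), group = groups)
n_cells <- as.vector(table(groups)[rownames(sums)])
pb_toy <- sums / n_cells

print(pb_toy)  # one row per cluster-sample group, one column per gene
```

The result has the same orientation as `pb` above (groups in rows, genes in columns), so it would also need `t()` before plotting with genes in rows.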

Then, from the aggregated Sample-Celltype matrix, I built another matrix (cellg) that shows the gene expression level in each cell type across all samples:

### Gene expression levels in each cell type across all samples
# `cellg` (rows: cell type x gene, columns: samples) and `gen` (gene indices) are built in code not shown here

rownames(cellg) <- apply(expand.grid(paste0("celltype", LETTERS[1:K]), paste0("g", gen)), 1, paste, collapse = ".")
colnames(cellg) <- paste0("S", 1:Ns)

annot_rows <- data.frame(
  Group = rep(paste0("g", gen), each = K),
  row.names = rownames(cellg)
)
g <- pheatmap::pheatmap(cellg, cluster_rows = FALSE, cluster_cols = FALSE, fontsize = 10, annotation_row = annot_rows, show_rownames = FALSE)
ggsave(filename = paste0(path, "cellgheatmap.pdf"), plot = g, width = 10, height = 5, limitsize = FALSE)

Figure 3 shows the heatmap of the cellg matrix.


Then I calculated the correlation between the rows of the cellg matrix.

### Correlation between the rows of the cellg matrix
# Note: cor() correlates the columns of its input; `mm` is built in code not shown here
g <- pheatmap::pheatmap(cor(mm), cluster_rows = FALSE, cluster_cols = FALSE, display_numbers = FALSE, fontsize = 20)
ggsave(filename = paste0(path, "Corelation.pdf"), plot = g, dpi = 1000, width = 50, height = 50, limitsize = FALSE)

Figure 4 shows the heatmap of this correlation matrix.
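One detail worth checking in the step above: base R's `cor()` correlates the columns of a matrix, so a row-wise correlation needs the matrix transposed first. A small sketch on a made-up matrix `m` (a hypothetical stand-in, not the real cellg):

```r
# Toy matrix with 3 rows (e.g. cell type x gene) and 4 columns (samples).
set.seed(42)
m <- matrix(rnorm(12), nrow = 3,
            dimnames = list(paste0("row", 1:3), paste0("S", 1:4)))

col_cor <- cor(m)     # correlation between columns (samples)
row_cor <- cor(t(m))  # correlation between rows

print(dim(col_cor))
print(dim(row_cor))
```

If the matrix passed to `cor()` in the post is not already transposed, the heatmap would show sample-by-sample rather than row-by-row correlations.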


Now I have some questions:

In figure 3, there is an unexplained structure (the blue blocks). It seems that some genes are off in all of the cell types. This leads to a high correlation between the rows of the cellg matrix (as shown in figure 4).
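To check whether the blocks really come from genes that are low in every cell type, one option is to flag those genes and look at the correlation without them. The sketch below uses a made-up matrix `toy` with the same row naming as cellg; the threshold `0.1` is an arbitrary assumption, and this only filters rows, it does not explain why the simulation produced them.

```r
# Toy matrix: 2 cell types x 3 genes in rows, 4 samples in columns (values made up).
K <- 2
genes <- paste0("g", 1:3)
Ns <- 4
set.seed(7)
toy <- matrix(runif(K * length(genes) * Ns, min = 1, max = 5), ncol = Ns)
rownames(toy) <- paste(rep(c("celltypeA", "celltypeB"), times = length(genes)),
                       rep(genes, each = K), sep = ".")
colnames(toy) <- paste0("S", 1:Ns)

# Force gene g2 to be "off" in every cell type and sample.
toy[grepl("\\.g2$", rownames(toy)), ] <- 0.01

# Gene name of each row (the part after the last dot).
gene_of_row <- sub("^.*\\.", "", rownames(toy))

# A gene counts as "off" if its maximum over all cell types and samples
# stays below the (arbitrary) threshold.
gene_max <- tapply(apply(toy, 1, max), gene_of_row, max)
off_genes <- names(gene_max)[gene_max < 0.1]

# Keep only rows belonging to genes that are expressed somewhere.
kept <- toy[!gene_of_row %in% off_genes, , drop = FALSE]
print(off_genes)
print(rownames(kept))
```

Comparing `cor(t(toy))` with `cor(t(kept))` on the real matrix would show how much of the block structure is due to these genes.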

1- How can I get rid of the blue and red blocks in figure 3, and also the red blocks in the correlation matrix (figure 4)?
2- How are samples generated in the Splatter object?
3- Is a multiplicative factor used to create the samples? Is each sample just a multiplicative version of all the others?

I appreciate your help.

SingleCellData splatter • 125 views
luke.zappia ▴ 50

Duplicate of GitHub issue https://github.com/Oshlack/splatter/issues/147
