Question

Using GSVA All the pathway score are same

0

Entering edit mode

initialqy • 0

@initialqy-23722

Last seen 3.9 years ago

Hi, I analyzed scRNA-seq data. Using Seurat, we have an object with 19742genes in 2516 cells. In that, control group 525 cells, treatment group 1991 cells. by "FindMarkers" function I got the different gene expression matrix between the control (pct.1) and treatment (pct.2) groups.

>dim(mydata)
[1] 540   2
>mydata[1:10,1:2]
      pct.1 pct.2
Fth1  0.996 1.000
Fosb  0.389 0.092
Prdx5 0.693 0.856
Slpi  0.175 0.529
Ccl7  0.425 0.161
Ass1  0.131 0.458
Apoe  0.413 0.164
Ly6a  0.187 0.500
Junb  0.916 0.888
Nos2  0.025 0.319

By "getGmt", I could get specific MSigDB dataset. then the codes are :

myC7 <- getGmt("h.all.v7.1.symbols.gmt")
mask <- lengths(geneIds(myC7)) > 1  ## make sure all dataset contain more than one gene
myC7filt <- myC7[mask]
score <- gsva(mydata, myC7filt, method= "ssgsea",
              kcdf=c("Gaussian", "Poisson", "none"),abs.ranking=F,
              min.sz=1,max.sz=Inf,  parallel.sz=1L, mx.diff=T, ssgsea.norm=TRUE,verbose=T)

Estimating ssGSEA scores for 3 gene sets.
  |============================================================================| 100%

Warning message:
In .gsva(expr, mapped.gset.idx.list, method, kcdf, rnaseq, abs.ranking,  :
  Some gene sets have size one. Consider setting 'min.sz > 1'.
> score
                                 pct.1   pct.2
HALLMARK_COMPLEMENT            1.28481 2.28481
HALLMARK_XENOBIOTIC_METABOLISM 1.28481 2.28481
HALLMARK_COAGULATION           1.28481 2.28481

All these scores are the same in each group.???

if change "min.sz=2"

score <- gsva(mydata, myC7filt, method= "ssgsea",
              kcdf=c("Gaussian", "Poisson", "none"),abs.ranking=F,
              min.sz=2,max.sz=Inf,  parallel.sz=1L, mx.diff=T, ssgsea.norm=TRUE,verbose=T)

>Error in .gsva(expr, mapped.gset.idx.list, method, kcdf, rnaseq, abs.ranking,  : 
  The gene set list is empty! Filter may be too stringent.

I'm sorry I'm really amateur in learning R.

GSVA scRNA-seq data • 1.9k views

ADD COMMENT • link updated 3.9 years ago by Robert Castelo ★ 3.3k • written 3.9 years ago by initialqy • 0

score 2 · Accepted Answer · 2020-06-22

2

Entering edit mode

Robert Castelo ★ 3.3k

@rcastelo

Last seen 7 days ago

Barcelona/Universitat Pompeu Fabra

hi,

what happens probably is that the gene symbols in the gene sets you load from h.all.v7.1.symbols.gmt are all written in upper case letters, while the genes that you show in your input expression data matrix are only capitalized in the first letter. please, try the following right before calling gsva():

rownames(mydata) <- touppper(rownames(mydata))

this instruction will set the row names of the matrix mydata, which correspond to the gene symbols, to uppercase letter.

cheers,

robert.

ADD COMMENT • link 3.9 years ago Robert Castelo ★ 3.3k

0

Entering edit mode

It's done. Thank you so much.

ADD REPLY • link 3.9 years ago initialqy • 0