Dotplots for differential gene expression data?
@sunnykevin97-23726

Hi,

I worked through the clusterProfiler tutorial and performed GO and KEGG analysis on the significant differentially expressed genes selected after my edgeR analysis, following the steps given in the tutorial. However, I'm unable to generate a dot plot for the filtered pathways. Here is my code; suggestions please.

Here is my sgenelist

head(sgenelist)
100003547    404621    568653    568653    450072    450072 
0.5079993 0.4388901 0.4072887 0.4072887 0.3639646 0.3639646 

The mydf data frame (columns: row name, Entrez, FC, group, othergroup):

    56649   56649   1.587225    upregulated A
    3898    3898    1.586980    upregulated A
    28424   28424   1.586624    upregulated A
    952 952 1.585856    upregulated A
    26279   26279   1.585371    upregulated A
    8792    8792    1.582148    upregulated A
    3294    3294    1.571530    upregulated A
    79725   79725   1.561914    upregulated A
    1493    1493    1.554131    upregulated A
    6352    6352    1.554130    upregulated A
    25907   25907   1.546584    upregulated A
    9768    9768    1.543440    upregulated A
    80020   80020   1.542459    upregulated A
    6663    6663    1.533759    upregulated A
    51378   51378   1.533411    upregulated A
    8842    8842    1.530652    upregulated A
    4998    4998    1.529966    upregulated A
    11339   11339   1.513516    upregulated A
    3070    3070    1.512863    upregulated A
    57211   57211   1.510945    upregulated A
    4288    4288    1.504310    upregulated A
    7804    7804    1.500046    upregulated A
    10733   10733   1.497867    upregulated A
    339479  339479  1.496286    upregulated A
    3613    3613    1.485338    upregulated A
    7453    7453    1.483423    upregulated A
    6536    6536    1.479826    upregulated A
    797 797 1.477915    upregulated A
    7037    7037    1.469708    upregulated A
    1163    1163    1.468578    upregulated A
    6884    6884    1.466794    upregulated A
    1001    1001    1.466047    upregulated A
    26472   26472   1.465092    upregulated A
    1824    1824    1.463979    upregulated A
    5026    5026    1.461255    upregulated A
    81831   81831   1.458972    upregulated A
    3934    3934    1.458640    upregulated A
    59272   59272   1.457920    upregulated A
    154754  154754  1.456948    upregulated A
    4316    4316    1.452348    upregulated A
    54801   54801   1.451846    upregulated A
    9134    9134    1.449722    upregulated A
    445 445 1.447451    upregulated A
    4599    4599    1.443001    upregulated A
    2175    2175    1.442596    upregulated A
    1717    1717    1.440275    upregulated A
    4175    4175    1.439526    upregulated A
    9435    9435    1.432742    upregulated A
    83937   83937   1.428212    upregulated A
    4173    4173    1.422334    upregulated A
    54210   54210   1.417809    upregulated A
    26150   26150   1.417003    upregulated A
    54959   54959   1.416347    upregulated A
    23560   23560   1.416187    upregulated A
    1075    1075    1.412488    upregulated A
    55711   55711   1.409885    upregulated A
    9603    9603    1.408101    upregulated A
    65009   65009   1.407076    upregulated A
    5307    5307    1.398947    upregulated A
    10797   10797   1.394704    upregulated A
    10663   10663   1.389748    upregulated A
    2237    2237    1.385611    upregulated A
    3755    3755    1.384549    upregulated A
    56833   56833   1.376687    upregulated A
    5214    5214    1.372449    upregulated A
    29899   29899   1.369043    upregulated A
    2019    2019    1.357915    upregulated A
    4199    4199    1.356848    upregulated A
    10549   10549   1.356045    upregulated A
    3932    3932    1.354111    upregulated A
    10926   10926   1.345737    upregulated A
    56938   56938   1.337508    upregulated A
    8139    8139    1.336826    upregulated A
    80380   80380   1.336333    upregulated A
    54962   54962   1.334750    upregulated A
    713 713 1.331600    upregulated A
    55010   55010   1.330514    upregulated A
    3110    3110    1.329481    upregulated A
    6502    6502    1.326637    upregulated A
    55633   55633   1.324729    upregulated A
    59336   59336   1.319593    upregulated A
    56942   56942   1.316442    upregulated A
    4111    4111    1.313481    upregulated A
    7378    7378    1.313374    upregulated A
    440 440 1.307646    upregulated A
    5551    5551    1.303901    upregulated A
    55353   55353   1.298713    upregulated A
    55612   55612   1.298083    upregulated A
    6627    6627    1.297917    upregulated A
    133 133 1.296697    upregulated A
    5100    5100    1.296196    upregulated A
    3559    3559    1.295228    upregulated A
    91646   91646   1.293538    upregulated A
    54954   54954   1.291815    upregulated A
    22948   22948   1.291598    upregulated A
    768 768 1.291301    upregulated A
    79315   79315   1.290375    upregulated A
    1230    1230    1.290361    upregulated A
    79943   79943   1.287475    upregulated A
    55526   55526   1.286864    upregulated A
    4597    4597    1.280029    upregulated A
    2537    2537    1.277486    upregulated A
    200315  200315  1.277134    upregulated A
    23007   23007   1.273399    upregulated A
    8538    8538    1.272534    upregulated A
    5984    5984    1.266485    upregulated A
    3823    3823    1.258140    upregulated A
    128872  128872  1.257428    upregulated A
    6772    6772    1.257088    upregulated A
    51311   51311   1.252924    upregulated A
    26692   26692   1.251941    upregulated A
    994 994 1.249583    upregulated A
    79931   79931   1.249106    upregulated A
    6347    6347    1.248129    upregulated A
    7345    7345    1.241996    upregulated A
    3507    3507    1.241548    upregulated A
    30848   30848   1.238564    upregulated A
    29094   29094   1.237265    upregulated A
    9654    9654    1.234895    upregulated A
    78997   78997   1.234607    upregulated A
    9918    9918    1.233430    upregulated A
    712 712 1.232196    upregulated A
    29980   29980   1.230867    upregulated A
    11240   11240   1.230517    upregulated A
    51373   51373   1.228151    upregulated A
    79581   79581   1.225722    upregulated A
    6402    6402    1.225665    upregulated A
    639 639 1.223682    upregulated A
    29028   29028   1.223041    upregulated A
    94025   94025   1.222887    upregulated A
    7298    7298    1.221392    upregulated A
    2888    2888    1.218675    upregulated A
    53347   53347   1.217192    upregulated A
    8875    8875    1.215134    upregulated A
    3838    3838    1.212955    upregulated A
    1058    1058    1.211648    upregulated A
    84296   84296   1.211371    upregulated A
    914 914 1.209554    upregulated A
    54478   54478   1.209381    upregulated A
    51338   51338   1.208520    upregulated A
    5320    5320    1.203571    upregulated A
    29078   29078   1.202691    upregulated A
    64581   64581   1.202508    upregulated A
    3126    3126    1.199839    upregulated A
    8833    8833    1.199775    upregulated A
    5650    5650    1.198893    upregulated A
    3112    3112    1.197845    upregulated A
    11135   11135   1.195057    upregulated A
    1690    1690    1.193322    upregulated A
    57823   57823   1.193254    upregulated A
    8326    8326    1.193043    upregulated A
    3015    3015    1.190929    upregulated A
    2643    2643    1.188264    upregulated A
    699 699 1.185457    upregulated A
    10437   10437   1.183906    upregulated A
    3768    3768    1.182054    upregulated A
    932 932 1.181259    upregulated A
    50802   50802   1.180658    upregulated A
    1482    1482    1.179333    upregulated A
    54913   54913   1.178350    upregulated A
    5645    5645    1.175839    upregulated A
    221692  221692  1.173809    upregulated A

Code:

    mydf <- data.frame(Entrez=names(sgenelist), FC=sgenelist)
    head(mydf)
    tail(mydf)
    nrow(mydf)
    ncol(mydf)
    # label each gene by the sign of its fold change
    mydf$group <- with(mydf,ifelse(FC < 0,"downregulated","upregulated"))
    head(mydf)
    nrow(mydf)
    mydf$othergroup <- "A"
    mydf$othergroup[abs(mydf$FC) > 0.2] <- "B"
    head(mydf)
    dat1 <- compareCluster(Entrez~group+othergroup,data = mydf,fun = "enrichKEGG")
    head(as.data.frame(dat1))
    dotplot(dat1)

When I run the line that creates dat1, I get this error:

    Error in compareCluster(Entrez ~ group + othergroup, data = mydf, fun = "enrichKEGG") :
      could not find function "compareCluster"

I'm looking for the two plots shown in the dotplot section (section 11) of the manual.

What are better ways to plot differential gene expression?


A few questions (since you didn't post the output from sessionInfo()):

  • did you actually load the library clusterProfiler?
  • does it work with the sample data set included with clusterProfiler/DOSE?
  • note your spelling error! (in 'upgregualted'; the 'l' and 'a' should be switched...)
  • could you please format all code? Makes reading much easier...
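
The first check above can be run directly in the R console. The sketch below is a minimal example, assuming clusterProfiler would be installed through Bioconductor; `pkg_ok` is only an illustrative variable name.

```r
# Check whether clusterProfiler is installed before trying to attach it.
pkg_ok <- requireNamespace("clusterProfiler", quietly = TRUE)

if (pkg_ok) {
  # Attaching the package puts compareCluster() on the search path.
  library(clusterProfiler)
} else {
  message("clusterProfiler is not installed; try BiocManager::install(\"clusterProfiler\")")
}

# FALSE here corresponds to the 'could not find function "compareCluster"' error.
print(exists("compareCluster"))

# Version details of R and all attached packages, to paste into the post.
print(sessionInfo())
```

If `exists("compareCluster")` prints FALSE, the package has not been attached in the current session, which would explain the error in the question.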

I tried the example from the DOSE package, and it works fine.

When I try with my own data, it does not work. I'm a beginner in R. I found only 225 significantly differentially expressed genes in the sample, and of those 225 only 165 had annotation (up/down genes). Because of the small number of significant genes, I'm unable to find over-represented genes when I perform ego and gene set enrichment analysis.

I would like to plot this data as shown in the manual, or otherwise make a similar plot that shows differential expression for the up- and downregulated genes.

> dat1 <- compareCluster(Entrez~group+othergroup,data = mydf,fun = "enrichKEGG")
Error in compareCluster(Entrez ~ group + othergroup, data = mydf, fun = "enrichKEGG") : 
  No enrichment found in any of gene cluster, please check your input...
> head(as.data.frame(dat1))
Error in as.data.frame(dat1) : object 'dat1' not found
> dotplot(dat1)
Error in dotplot(dat1) : object 'dat1' not found
>

I would recommend not applying a multiple-testing correction (such as BH) and setting both pvalueCutoff and qvalueCutoff to 1 when running compareCluster(). That way no filtering is applied, so all results will "survive". You can then fine-tune the cutoffs according to your needs.

Thus:

dat1 <- compareCluster(Entrez~group+othergroup, data=mydf, fun="enrichKEGG", pAdjustMethod = "none", pvalueCutoff = 1,  qvalueCutoff = 1)
dotplot(dat1,  showCategory = 50)

BTW: I also noticed that you only seem to have 'upregulated' for group, and 'A' for othergroup...??
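
One quick way to see which group/othergroup combinations are actually present is to cross-tabulate the two columns before calling compareCluster(). The sketch below uses only base R and a handful of values copied from this thread. Dropping duplicated Entrez IDs is an extra step assumed here, not something taken from the manual.

```r
# A few fold changes copied from the thread (note the repeated IDs).
sgenelist <- c("100003547" = 0.5079993, "404621" = 0.4388901,
               "568653" = 0.4072887, "568653" = 0.4072887,
               "450072" = 0.3639646, "450072" = 0.3639646,
               "563742" = -0.5763808, "561754" = -0.5932895,
               "322969" = -0.8099672)

mydf <- data.frame(Entrez = names(sgenelist), FC = unname(sgenelist),
                   stringsAsFactors = FALSE)

# Assumption: each gene should be counted only once.
mydf <- mydf[!duplicated(mydf$Entrez), ]

# Same labelling rules as in the question.
mydf$group <- ifelse(mydf$FC < 0, "downregulated", "upregulated")
mydf$othergroup <- ifelse(abs(mydf$FC) > 0.2, "B", "A")

# Rows are group, columns are othergroup; each non-empty cell is one
# gene cluster that compareCluster() would test.
counts <- table(mydf$group, mydf$othergroup)
print(counts)
```

With these values every gene lands in othergroup B, because all absolute fold changes exceed the 0.2 threshold used in the question.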


Not really; the other group category contains downregulated genes as well, I just didn't show them in the picture. I formatted my data like the example in the manual so that it is easy to plot. This time I ran it as you suggested, supplying the gene list, and it does not work at all. Suggestions please.

    1   100003547   0.5079993   upregulated B
    2   404621  0.4388901   upregulated B
    3   568653  0.4072887   upregulated B
    4   568653  0.4072887   upregulated B
    5   450072  0.3639646   upregulated B
    6   450072  0.3639646   upregulated B
    7   450072  0.3639646   upregulated B
    8   100535241   0.3588841   upregulated B
    9   100333757   0.3440863   upregulated B
    10  569058  0.3398989   upregulated B
    11  550424  0.3157571   upregulated B
    12  30199   0.3103427   upregulated B
    13  30199   0.3103427   upregulated B
    14  606663  0.2956820   upregulated B
    15  792196  0.2888694   upregulated B
    16  792196  0.2888694   upregulated B
    17  792196  0.2888694   upregulated B
    18  570807  0.2824910   upregulated B
    19  336965  0.2769353   upregulated B
    20  378742  0.2736060   upregulated B
    21  325802  0.2729100   upregulated B
    22  325758  0.2723399   upregulated B
    23  407642  0.2637653   upregulated B
    24  407642  0.2637653   upregulated B
    25  494097  0.2585510   upregulated B
    26  445208  0.2561739   upregulated B
    27  692276  0.2488975   upregulated B
    28  550572  0.2462167   upregulated B
    29  405832  0.2373560   upregulated B
    30  558917  0.2232176   upregulated B
    31  558917  0.2232176   upregulated B
    32  393809  0.2163555   upregulated B
    33  796066  0.2121404   upregulated B
    34  100534737   0.2109319   upregulated B
    35  58099   0.2050804   upregulated B
    36  58099   0.2050804   upregulated B
    37  58099   0.2050804   upregulated B
    38  334982  0.2015492   upregulated B
    39  436646  0.2011665   upregulated B
    40  436646  0.2011665   upregulated B
    41  436646  0.2011665   upregulated B

Downregulated genes in the same table:
111 561894  -0.2886679  downregulated   B
112 442930  -0.2891991  downregulated   B
113 570636  -0.2895023  downregulated   B
114 565909  -0.2920794  downregulated   B
115 563509  -0.2941317  downregulated   B
116 792865  -0.2992927  downregulated   B
117 101886283   -0.3006801  downregulated   B
118 563912  -0.3019769  downregulated   B
119 563912  -0.3019769  downregulated   B
120 394248  -0.3092518  downregulated   B
121 560639  -0.3138518  downregulated   B
122 553162  -0.3140999  downregulated   B
123 566768  -0.3213447  downregulated   B
124 100333310   -0.3217226  downregulated   B
125 792697  -0.3257150  downregulated   B
126 792697  -0.3257150  downregulated   B
127 555350  -0.3261360  downregulated   B
128 568935  -0.3267766  downregulated   B
129 100149852   -0.3310173  downregulated   B
130 100149322   -0.3324482  downregulated   B
131 100535764   -0.3404785  downregulated   B

My data frame (mydf5) looks like this:

> head(mydf5)
    Entrez        FC       group othergroup
 100003547 0.5079993 upregulated          B
    404621 0.4388901 upregulated          B
    568653 0.4072887 upregulated          B
    450072 0.3639646 upregulated          B
 100535241 0.3588841 upregulated          B
 100333757 0.3440863 upregulated          B

> tail(mydf5)
    Entrez         FC         group othergroup
    563742 -0.5763808 downregulated          B
    561754 -0.5932895 downregulated          B
    567335 -0.6057246 downregulated          B
 100003419 -0.6669306 downregulated          B
    114432 -0.7318652 downregulated          B
    322969 -0.8099672 downregulated          B
>

My input seems OK, but when I run the code I get the error below. How do I fix it?

> dat1 <- compareCluster(Entrez~group+othergroup, data=mydf5, fun="enrichKEGG", pAdjustMethod = "none", pvalueCutoff = 1,  qvalueCutoff = 1)
Error in compareCluster(Entrez ~ group + othergroup, data = mydf5, fun = "enrichKEGG",  : 
  No enrichment found in any of gene cluster, please check your input...
> dotplot(dat1,  showCategory = 50)
Error in dotplot(dat1, showCategory = 50) : object 'dat1' not found
>

1) For starters, please be sure to read all documentation for each package and function, because this will show all relevant details! Then you know what you are doing!

2) In addition, be consistent! The contents of the data frame in your first post seem to be human Entrez IDs, whereas those in mydf5 are zebrafish IDs! This is important because many functions that require annotation information default to human as the organism. If you don't explicitly change this, it may cause problems. See my first remark.

3) Finally, try to interpret the messages returned when running a function. In your case:

Error in compareCluster(Entrez ~ group + othergroup, data = mydf5, fun = "enrichKEGG",  : 
  No enrichment found in any of gene cluster, please check your input...

... which IMO clearly states that the input somehow doesn't match what compareCluster() is expecting. It also makes no sense to continue, because dat1 is NOT generated... Hence, the error

Error in dotplot(dat1, showCategory = 50) : object 'dat1' not found

... is perfectly logical!
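
Point 2 can be checked before running any enrichment. The sketch below, which assumes the org.Dr.eg.db and AnnotationDbi packages are installed, computes the fraction of input IDs that are known zebrafish Entrez IDs. The names `ids` and `frac_zebrafish` are illustrative only.

```r
# The first six Entrez IDs from head(mydf5) in this thread.
ids <- c("100003547", "404621", "568653", "450072", "100535241", "100333757")

if (requireNamespace("org.Dr.eg.db", quietly = TRUE) &&
    requireNamespace("AnnotationDbi", quietly = TRUE)) {
  # All Entrez IDs known to the zebrafish annotation package.
  zebrafish_ids <- AnnotationDbi::keys(org.Dr.eg.db::org.Dr.eg.db,
                                       keytype = "ENTREZID")
  # Fraction of the input IDs found in that annotation.
  frac_zebrafish <- mean(ids %in% zebrafish_ids)
  print(frac_zebrafish)
} else {
  message("org.Dr.eg.db and/or AnnotationDbi not installed; skipping the check")
}
```

A fraction near zero would suggest that the IDs belong to a different organism than the annotation being used, in which case the organism-related arguments of the enrichment function need to match the data.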


Post is split because of hitting the max character limit...

OK, to help you I will paste my code that shows that from a coding perspective you get results. It is up to you to decide whether these make sense, or not, by fine-tuning all settings required for an analysis (such as cutoff values, etc).

mydf5 contains the content of the 62 entries you pasted above.

Gene Ontology over-representation analysis

> dat1 <- compareCluster(Entrez~group+othergroup, data=mydf5, fun="enrichGO",
+   pAdjustMethod = "none", pvalueCutoff = 1,  qvalueCutoff = 1,
+   OrgDb='org.Dr.eg.db')
> head(as.data.frame(dat1))
          Cluster         group othergroup         ID
1 downregulated.B downregulated          B GO:0004713
2 downregulated.B downregulated          B GO:0035198
3 downregulated.B downregulated          B GO:0017134
4 downregulated.B downregulated          B GO:0061608
5 downregulated.B downregulated          B GO:0061980
6 downregulated.B downregulated          B GO:0004709
                              Description GeneRatio   BgRatio      pvalue    p.adjust
1        protein tyrosine kinase activity      2/17 130/17439 0.006969860 0.006969860
2                           miRNA binding      1/17  10/17439 0.009708108 0.009708108
3        fibroblast growth factor binding      1/17  13/17439 0.012603198 0.012603198
4 nuclear import signal receptor activity      1/17  14/17439 0.013566457 0.013566457
5                  regulatory RNA binding      1/17  15/17439 0.014528831 0.014528831
6       MAP kinase kinase kinase activity      1/17  19/17439 0.018369498 0.018369498
      qvalue        geneID Count
1 0.05892021 570636/563509     2
2 0.05892021     100535764     1
3 0.05892021        792865     1
4 0.05892021     100149852     1
5 0.05892021     100535764     1
6 0.05892021        560639     1
> 
> dotplot(dat1, showCategory = 50)
>

Output here.

KEGG pathway over-representation analysis. Note the difference in the enrichment function that is called, and the use of OrgDb vs organism.

> dat2 <- compareCluster(Entrez~group+othergroup, data=mydf5, fun="enrichKEGG",
+   pAdjustMethod = "none", pvalueCutoff = 1,  qvalueCutoff = 1,
+   organism = "dre") 
> head(as.data.frame(dat2))
          Cluster         group othergroup       ID
1 downregulated.B downregulated          B dre00534
2 downregulated.B downregulated          B dre04810
3 downregulated.B downregulated          B dre04330
4 downregulated.B downregulated          B dre04012
5 downregulated.B downregulated          B dre04621
6 downregulated.B downregulated          B dre04514
                                                 Description GeneRatio  BgRatio
1 Glycosaminoglycan biosynthesis - heparan sulfate / heparin       1/7  33/6862
2                           Regulation of actin cytoskeleton       2/7 304/6862
3                                    Notch signaling pathway       1/7  75/6862
4                                     ErbB signaling pathway       1/7 110/6862
5                        NOD-like receptor signaling pathway       1/7 173/6862
6                             Cell adhesion molecules (CAMs)       1/7 181/6862
      pvalue   p.adjust    qvalue        geneID Count
1 0.03319616 0.03319616 0.1119170        553162     1
2 0.03544038 0.03544038 0.1119170 561894/568935     2
3 0.07407619 0.07407619 0.1559499     101886283     1
4 0.10700223 0.10700223 0.1689509        570636     1
5 0.16374452 0.16374452 0.1797091        792697     1
6 0.17072366 0.17072366 0.1797091        792865     1
> 
> dotplot(dat2, showCategory = 50)
>

Output here.
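
Because both cutoffs were set to 1, the table contains every tested pathway, so any filtering happens afterwards. As a minimal base R sketch, the data frame below re-types three columns of the as.data.frame(dat2) output shown above. The thresholds 0.05 and 2 are arbitrary illustrations, not recommendations.

```r
# Three columns copied from the head(as.data.frame(dat2)) output above.
res <- data.frame(
  ID     = c("dre00534", "dre04810", "dre04330",
             "dre04012", "dre04621", "dre04514"),
  pvalue = c(0.03319616, 0.03544038, 0.07407619,
             0.10700223, 0.16374452, 0.17072366),
  Count  = c(1, 2, 1, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Keep pathways below an (arbitrary) unadjusted p-value threshold.
by_p <- res[res$pvalue < 0.05, ]

# Additionally require at least two genes per pathway.
by_p_and_count <- by_p[by_p$Count >= 2, ]

print(by_p$ID)
print(by_p_and_count$ID)
```

With the real result object, as.data.frame(dat2) would take the place of `res`.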


It makes sense to use dotplots, cnetplots and ridge plots to show the enrichment analysis. For bar plots it doesn't, since the p.adjust value is "1". Thanks for the help with compareCluster; I learned some things.
