Missing ontologies in goProfiles: no BP or CC, only MF
@arnaud-mounier-5957
Last seen 8.1 years ago
Hi,

I've built a DataFrame with Python pandas to compute ontology frequencies with goProfiles in Bioconductor. I use the basicProfile function with idType='GOTermsFrame', but without the optional 'Evidence' column. The DataFrame looks like this:

    In [1]: df.info()
    <class 'pandas.core.frame.DataFrame'>
    Int64Index: 119626 entries, 0 to 119625
    Data columns (total 3 columns):
    GeneID      119626 non-null object
    GOID        119626 non-null object
    Ontology    119626 non-null object
    dtypes: object(3)

So there are almost 120000 entries, split by ontology as follows:

    In [2]: df.groupby(['Ontology'])['Ontology'].count()
    Ontology
    BP    58802
    CC    26867
    MF    33957

When I compute the profile for all three ontologies (onto="ANY") at level 2, I get these frequencies:

    In [3]: rdf = com.convert_to_r_dataframe(df)
    In [4]: %%R -i rdf
    > library(goProfiles)
    > rdf <- as.data.frame(rdf)
    > print(head(rdf))
      GeneID               GOID       Ontology
    0 VIT_201s0011g00010.1 GO:0043565 MF
    1 VIT_201s0011g00010.1 GO:0003964 MF
    2 VIT_201s0011g00010.1 GO:0006278 BP
    3 VIT_201s0011g00010.1 GO:0006367 BP
    4 VIT_201s0011g00010.1 GO:0003743 MF
    5 VIT_201s0011g00010.1 GO:0005840 CC
    > profiles.ANY <- basicProfile(rdf,idType='GOTermsFrame',onto="ANY",level=2)
    > printProfiles(profiles.ANY,percentage=T,aTitle="Test GO Profile")
    Test GO Profile
    ========================
    [1] "MF ontology"
       Description                  GOID       Frequency
    12 antioxidant activity         GO:0016209  1.0
    9  binding                      GO:0005488 75.0
    4  catalytic activity           GO:0003824 65.1
    1  electron carrier activity... GO:0009055  3.5
    15 enzyme regulator activity... GO:0030234  1.6
    21 molecular transducer acti... GO:0060089  3.1
    3  nucleic acid binding tran... GO:0001071  2.8
    6  nutrient reservoir activi... GO:0045735  0.5
    2  protein binding transcrip... GO:0000988  0.1
    5  receptor activity            GO:0004872  1.2
    7  structural molecule activ... GO:0005198  2.8
    8  transporter activity         GO:0005215  8.2
    [1] "BP ontology"
    [1] Description GOID Frequency
    <0 rows> (or 0-length row.names)
    [1] "CC ontology"
    [1] Description GOID Frequency
    <0 rows> (or 0-length row.names)

So neither the BP nor the CC ontology shows up. But when I take a slice of the first 500 rows of this big DataFrame and compute the profile the same way (any ontology, level=2), I get this:

    In [5]: dft = df[0:500]
    In [6]: rdft = com.convert_to_r_dataframe(dft)
    In [7]: %%R -i rdft
    > profs.ANY <- basicProfile(rdft,idType='GOTermsFrame',onto="ANY",level=2)
    > printProfiles(profs.ANY,percentage=T,aTitle="Test GO Profile")
    Test Profile
    ============
    [1] "MF ontology"
      Description                  GOID       Frequency
    9 binding                      GO:0005488 77.8
    4 catalytic activity           GO:0003824 49.2
    1 electron carrier activity... GO:0009055  3.2
    3 nucleic acid binding tran... GO:0001071  1.6
    7 structural molecule activ... GO:0005198  1.6
    8 transporter activity         GO:0005215 12.7
    [1] "BP ontology"
    [1] Description GOID Frequency
    <0 rows> (or 0-length row.names)
    [1] "CC ontology"
       Description                GOID       Frequency
    3  cell                       GO:0005623 93.4
    6  cell junction              GO:0030054  3.3
    17 cell part                  GO:0044464 93.4
    2  extracellular region       GO:0005576  8.2
    9  macromolecular complex...  GO:0032991 21.3
    1  membrane                   GO:0016020 34.4
    8  membrane-enclosed lumen... GO:0031974  3.3
    15 membrane part              GO:0044425 19.7
    4  nucleoid                   GO:0009295  1.6
    10 organelle                  GO:0043226 75.4
    13 organelle part             GO:0044422 21.3
    19 symplast                   GO:0055044  3.3

I don't really understand why:

- there are no BP frequencies for either data frame, whereas there are 58802 entries with the BP ontology in the main frame;
- there are CC frequencies for the short frame but none at all for the main frame, even though the short frame is just the first part of the big one.

Could the level (2 in this case) explain this big difference?

Thanks a lot,
Arnome.

--
"When people consider certain situations as real, they are real in their consequences." The Thomas theorem.
Arnaud Mounier
INRA - UMR Agroécologie 1347
CNRS - ERL IPM 6300 (Plant-Microorganism Interaction)
17, rue Sully - BP 86510 - F-21065 Dijon Cedex - France
Work phone: +33 380 693 167 - Fax: +33 380 693 753
https://www6.dijon.inra.fr/umragroecologie/Personnel/IPM/ITA/MOUNIER-Arnaud
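
A note on the counts shown in In [2]: they are annotation rows, not genes, and one gene can carry several GO terms per ontology. The sketch below is only an illustration on toy data (the second gene ID is made up, and the set of expected labels BP/CC/MF is an assumption based on the post). It shows one way to compare rows against distinct genes per ontology and to check the Ontology column for unexpected labels before handing the frame to R. It does not by itself explain the empty BP and CC profiles.

```python
import pandas as pd

# Toy rows in the same three-column layout as the post. The first gene ID and
# its GO IDs are taken from the head(rdf) output; "HYPOTHETICAL_GENE.1" and
# its two rows are invented for illustration only.
df = pd.DataFrame({
    "GeneID": ["VIT_201s0011g00010.1"] * 6 + ["HYPOTHETICAL_GENE.1"] * 2,
    "GOID": ["GO:0043565", "GO:0003964", "GO:0006278", "GO:0006367",
             "GO:0003743", "GO:0005840", "GO:0005488", "GO:0006278"],
    "Ontology": ["MF", "MF", "BP", "BP", "MF", "CC", "MF", "BP"],
})

# Annotation rows per ontology (what the groupby/count in the post reports).
rows_per_onto = df.groupby("Ontology")["GOID"].count()

# Distinct genes per ontology (usually smaller than the row count).
genes_per_onto = df.groupby("Ontology")["GeneID"].nunique()

summary = pd.DataFrame({"rows": rows_per_onto, "genes": genes_per_onto})
print(summary)

# Labels other than the three expected ones (assumed: BP, CC, MF), e.g. stray
# whitespace or lower case, would show up here.
unexpected = sorted(set(df["Ontology"]) - {"BP", "CC", "MF"})
print(unexpected)
```

Running the same two groupby calls on the full frame and on the 500-row slice would show how many distinct genes each ontology actually covers in both cases.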
GO goProfiles