Tuesday, September 20, 2011

On the topic of ontology evaluation

With only a cursory look at the literature on ontology evaluation, I already get the feeling that the current measures completely miss the point. The answer doesn't lie in the syntax (format) or structure of the ontology (the number of classes and properties, subsumed classes, axioms, etc.). Rather, the effectiveness of an ontology (a representation of knowledge) lies in whether its semantics can be used for some task. So what we really want is to focus on the nature of the task, and on whether an ontology provides a competitive advantage over other technologies.

Off the top of my head, here are some tasks:
- search/browse (most sites using GO, etc.)
- text annotation (GoPubMed)
- data normalization and structured queries (Bio2RDF)
- answering questions that require background knowledge, e.g. across a yeast database
- data integration (heterogeneous types of data; data from different sources, at differing levels of granularity) (see my translational medicine paper)
- classification, e.g. domains or chemicals
- prediction, e.g. predicting phenotypes

Perhaps others can suggest some?

1 comment:

Staudinger's Molecules said...

Completely agree, Michel - and that is something that Robert and I have discussed many times along similar lines.

From a technical point of view, the trouble with task-focused evaluation of ontologies is simply that it is very hard to build general tooling that might assist in that evaluation, which is clearly the reason why people have focused on things like coverage.

Carrying out such evaluations is the equivalent of, for example, the procedures one would go through to validate text mining technologies (inter-annotator agreement, etc.), which are hugely expensive.

I guess one approach would be to think about whether one can, as you have done, build tooling around the high-level tasks that you have listed, with extension points that allow evaluation against domain use cases.
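
As a rough illustration of that last idea, here is a minimal Python sketch of a task-focused harness for the search task. Everything in it is hypothetical and made up for illustration: the toy documents, the synonym table standing in for an ontology, and the names `UseCase`, `evaluate`, `keyword_search` and `ontology_search`. The two extension points are the system being tested (any function from a query to a set of document ids) and the list of domain use cases, each with expert-judged relevant documents.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

# Extension point 1: a "system" maps a query string to a set of document ids.
System = Callable[[str], Set[str]]


@dataclass
class UseCase:
    """Extension point 2: a domain query plus the documents judged relevant."""
    query: str
    relevant: Set[str]


def f1(retrieved: Set[str], relevant: Set[str]) -> float:
    """Harmonic mean of precision and recall; 0.0 when nothing matches."""
    true_positives = len(retrieved & relevant)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(retrieved)
    recall = true_positives / len(relevant)
    return 2 * precision * recall / (precision + recall)


def evaluate(system: System, use_cases: List[UseCase]) -> float:
    """Mean F1 of a system over all use cases."""
    return sum(f1(system(u.query), u.relevant) for u in use_cases) / len(use_cases)


# Toy corpus and a toy synonym table standing in for an ontology (assumptions).
DOCS: Dict[str, str] = {
    "d1": "apoptosis in yeast",
    "d2": "programmed cell death pathway",
    "d3": "glucose transport",
}
SYNONYMS: Dict[str, Set[str]] = {"apoptosis": {"programmed cell death"}}


def keyword_search(query: str) -> Set[str]:
    """Baseline: plain substring match, no background knowledge."""
    return {doc_id for doc_id, text in DOCS.items() if query in text}


def ontology_search(query: str) -> Set[str]:
    """Same match, but the query is expanded with synonyms first."""
    terms = {query} | SYNONYMS.get(query, set())
    return {doc_id for doc_id, text in DOCS.items() if any(t in text for t in terms)}


USE_CASES = [UseCase(query="apoptosis", relevant={"d1", "d2"})]

SYSTEMS: Dict[str, System] = {"keyword": keyword_search, "ontology": ontology_search}
scores = {name: evaluate(system, USE_CASES) for name, system in SYSTEMS.items()}

for name, score in scores.items():
    print(f"{name}: mean F1 = {score:.2f}")
```

The harness itself knows nothing about ontologies: it only compares scores on a task, which is the point of the post. The expensive part, as noted above, is still producing the relevance judgments that go into each use case.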