NIST, Gaithersburg, MD
    (1A)
April 24, 2007
    (1B)
Under the appellation of "ontology" are found many different types of
artifacts created and used in different communities to represent
entities and their relations for purposes including annotating datasets,
supporting natural language understanding, integrating information
sources, and serving as background knowledge in various applications.
    (2A)
The Ontology Summit 2007 "Ontology, Taxonomy, Folksonomy: Understanding
the Distinctions" is an attempt to bring together various communities
(computer scientists, information scientists, philosophers, domain
experts) with different understandings of what an ontology is, and to
foster dialog and cooperation among these communities.
    (2B)
In practice, the name ontology covers a spectrum of artifacts, from
formal upper-level ontologies expressed in first order logic (e.g.,
Basic Formal Ontology (BFO) and DOLCE) to the simple lists of user-defined
keywords used, for example, to annotate resources on the Web. The latter
are called "folksonomies" and play an important role in Web 2.0. In
between the two extremities of the ontology spectrum are taxonomies and
controlled vocabularies (e.g., MeSH), often used for information
indexing and retrieval, whose organization is mostly hierarchical.
Finally, there are richer ontologies, often based on formalisms such as
frames or description logics, representing not only subsumption
relations, but also other kinds of relations among entities (e.g.,
functional, physical). Examples of such ontologies in the biomedical
domain include the Foundational Model of Anatomy, SNOMED CT and the NCI
Thesaurus.
    (2C)
The goal of the Ontology Summit is not to establish a definitive
definition of the word "ontology", which has proved extremely
challenging due to the diversity of artifacts it can refer to. Rather,
we propose to identify a limited number of key dimensions along which
ontologies can be characterized and to provide operational definitions
for these dimensions. The relative position of ontologies in the space
defined by these dimensions, the "Framework", is indicative of the
similarities and differences between these ontologies. The Framework has
been applied to the characterization of a dozen ontologies, whose
descriptions were collected through a survey.
    (2D)
The Ontology Summit is an outgrowth of the work
and discussions of the Ontolog Forum. Last
year the Ontology Summit was concerned with an
examination of Upper Ontologies. This year the
Ontology Summit was concerned with characterizing
a wide variety of ontology and ontology-like
activities.
    (3A)
One major goal of the Ontology Summit 2007 was to
bring together the diverse communities
working on ontology-like activities so as to encourage
cooperative efforts. Toward this end the summit
has attempted to characterize what an ontology is,
i.e., to construct a typology
of ontologies. The framework of dimensions
comprises two groups: semantic dimensions and
pragmatic dimensions. Semantic dimensions include
expressiveness, structure, and representational
granularity. Pragmatic dimensions include intended
use, use of automated reasoning, and prescriptive
vs. descriptive.
    (4A)
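As an illustration only, the two groups of dimensions could be recorded as a
small data structure. The sketch below is not part of the Framework itself:
the class names, field names, and the example profile are all hypothetical,
and the free-text values stand in for whatever operational definitions the
Framework adopts.

```python
from dataclasses import dataclass
from enum import Enum


class Stance(Enum):
    """Prescriptive vs. descriptive intent (hypothetical encoding)."""
    DESCRIPTIVE = "descriptive"
    PRESCRIPTIVE = "prescriptive"


@dataclass(frozen=True)
class SemanticDimensions:
    expressiveness: str  # e.g. "description logic"
    structure: str       # e.g. "taxonomy"
    granularity: str     # e.g. "fine-grained"


@dataclass(frozen=True)
class PragmaticDimensions:
    intended_use: tuple  # one or more free-text purposes
    automated_reasoning: bool
    stance: Stance


@dataclass(frozen=True)
class OntologyProfile:
    name: str
    semantic: SemanticDimensions
    pragmatic: PragmaticDimensions


# A made-up profile for a folksonomy-like artifact, for illustration only.
example_profile = OntologyProfile(
    name="Example Tag Set",
    semantic=SemanticDimensions(
        expressiveness="plain keywords",
        structure="term list",
        granularity="coarse-grained",
    ),
    pragmatic=PragmaticDimensions(
        intended_use=("resource annotation",),
        automated_reasoning=False,
        stance=Stance.DESCRIPTIVE,
    ),
)
```

Two profiles recorded this way can be compared field by field, which is one
simple reading of "relative position" in the space of dimensions.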
Expressiveness is a property of the knowledge
representation language (KRL) which describes the extent
and ease with which the KRL can describe increasingly
complex semantics, cf. propositional logic,
description logic(s), first order logic, sorted
logics, modal logics, ...
    (4B)
Structure is a property of the ontology, which
records how elaborate (or well organized)
the semantics encoded by the
ontology are. It may be the same as the expressiveness of
the KRL in which the ontology is encoded, or it may
be less than the expressiveness of the knowledge
representation language. Thus a simple taxonomy,
e.g., a tree, may be encoded in RDF, a description
logic language such as OWL-DL, or first order logic,
e.g., Common Logic. Viewed from a graph theoretic
perspective, the level of structure might be
a simple set of terms (glossary), a
tree structure (taxonomy), a directed acyclic
graph, e.g., a partial order (faceted classification
schemes), or an arbitrary directed graph (e.g., RDF).
    (4C)
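The graph theoretic levels of structure listed above can be told apart
mechanically. The following is a minimal sketch, assuming an ontology is
given as a collection of terms plus (parent, child) pairs; the function name
and the returned labels are hypothetical, and a forest with several roots is
reported here as a directed acyclic graph rather than as a tree.

```python
from collections import defaultdict


def classify_structure(terms, edges):
    """Classify a set of terms and (parent, child) edges by graph shape."""
    edges = set(edges)
    terms = set(terms)
    for parent, child in edges:
        terms.update((parent, child))  # tolerate terms named only in edges
    if not edges:
        return "term list"

    children = defaultdict(set)
    in_degree = {term: 0 for term in terms}
    for parent, child in edges:
        children[parent].add(child)
        in_degree[child] += 1

    # Kahn's algorithm: repeatedly remove nodes with no incoming edges.
    remaining = dict(in_degree)
    ready = [term for term in terms if remaining[term] == 0]
    visited = 0
    while ready:
        node = ready.pop()
        visited += 1
        for child in children[node]:
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)
    if visited < len(terms):
        return "directed graph"  # some nodes sit on a cycle

    roots = [term for term in terms if in_degree[term] == 0]
    if len(roots) == 1 and all(degree <= 1 for degree in in_degree.values()):
        return "tree"
    return "directed acyclic graph"
```

For instance, adding a second parent to any node of a tree moves the result
from "tree" to "directed acyclic graph", and adding an edge that closes a
loop moves it to "directed graph".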
The granularity dimension concerns the level of
detail at which the ontology is specified.
A crude measure of granularity would
be the number of concepts (nodes) and the number of
relation instances (links or edges in graph
representations). However, this fails to
recognize that some ontologies may have larger
scopes (domains) than others. A coarse-grained
ontology might be suitable for use as an upper
ontology or a broad subject index, while a
fine-grained ontology (such as SNOMED CT with
300K concepts) may be better suited for
encoding medical diagnoses.
    (4D)
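The crude measure described above amounts to two counts. A minimal sketch,
assuming relation instances are given as (subject, relation, object) triples;
the function name and the sample data are hypothetical, and, as the text
notes, the counts say nothing about the scope of the ontology.

```python
def crude_granularity(concepts, relation_instances):
    """Return (number of concepts, number of relation instances)."""
    nodes = set(concepts)
    edges = set(relation_instances)  # duplicate triples count once
    for subject, _relation, obj in edges:
        # Count concepts that appear only as endpoints of a relation.
        nodes.add(subject)
        nodes.add(obj)
    return len(nodes), len(edges)


# Made-up sample data, for illustration only.
sample_concepts = ["animal", "mammal", "whale", "fish"]
sample_relations = [
    ("mammal", "is_a", "animal"),
    ("whale", "is_a", "mammal"),
    ("fish", "is_a", "animal"),
    ("whale", "lives_in", "ocean"),
]
concept_count, relation_count = crude_granularity(sample_concepts, sample_relations)
```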
Intended use is the dimension which records the
original purpose(s) of the ontology. These may include
semantically informed search, data semantics specification
for databases or data entry, data integration across
multiple data sources, agent communication languages,
controlled vocabularies for recording medical diagnoses,
etc.
    (4E)
Automated reasoning is a dimension which records
the extent to which it is anticipated that an ontology
will be used by automated reasoning software, e.g.,
for question answering. If so, then one would
expect the ontology to be encoded
using some form of logic, e.g., first order logic.
    (4F)
Prescriptive vs. Descriptive is a dimension which
characterizes the intent of the ontology
developer. The intent may be simply to describe contemporary semantic
usage without much regard for the scientific
correctness of the encoded knowledge (e.g., a
whale might, in common parlance, be described as
a large fish). Examples of such descriptive ontologies
include folksonomies and most linguistic ontologies.
Alternatively, an ontology
may be intended as a normative, prescriptive document whose
correctness is of considerable concern, e.g.,
a whale is a mammal, not a fish. Other prescriptive ontologies
include medical diagnostic terminologies, legal or
regulatory ontologies, accounting ontologies,
mathematical or engineering ontologies, etc.
    (4G)
The governance dimension is concerned with how
decisions concerning the structure and (esp.) content
of an ontology are made. There was agreement at
the summit that ontology developers need to defer to
existing legal, regulatory, and professional organizations
concerning the natural language definitions
of concepts and semantic relationships.
Ontology development should be viewed
as an effort to organize and formalize concept definitions
and relationships which are conventionally defined by
existing institutions, not as an attempt to replace
existing definitions with de novo definitions generated
by autonomous computer scientists. As a corollary,
it was observed that it is necessary to record the
provenance of every definition, etc. incorporated into
an ontology, e.g., the controlling legislation, regulation,
standard, etc. from which a definition is taken.
    (4H)
One of the issues discussed was the relationship
between social tagging and folksonomies and more
traditional structured / formal ontologies such
as taxonomies and axiomatized ontologies. Until
recently these efforts have been viewed as
competing approaches. The consensus of the
Ontology Summit was that social tagging efforts should
be viewed as large-scale corpora to be used
for inferring and validating more formal ontologies,
akin to the use of large text corpora in computational
linguistics studies. In addition, more formal ontologies can
be used to inform social tagging by providing
improved tag sets and faceted tagging.
    (5A)
Tom Gruber and Paola Di Maio both argued that ontologies
should be considered a type of software artifact,
and that ontological engineering should be
thought of as a discipline akin to software
engineering or database design - i.e., a
standard component of the software
professional's toolkit, taught routinely to
every CS student.
    (6A)
To elicit the distinctions between various kinds of ontologies,
an interactive survey was designed and posted on the Web to
engage various communities. The respondents were invited to identify the
community of which they were a representative and to describe the value
of ontologies, as well as issues with ontologies, in that community. The
last section of the survey invited the respondents to describe and
characterize the ontologies or related artifacts in use in that
community.
    (7A)
Over fifty respondents from 24 communities submitted entries to the
survey. The best represented communities were Formal ontology,
Applications development, Standards development, Web 2.0 and
Biomedicine. 41 terms were identified as closely related to ontology,
including formal ontology, upper ontology, concept system and controlled
vocabulary. Some 70 ontologies from a variety of domains were
characterized in the survey, including formal ontologies (e.g., BFO,
DOLCE, SUMO), biomedical ontologies (e.g., Gene Ontology, SNOMED CT,
UMLS, ICD), thesauri (e.g., MeSH, National Agricultural Library
Thesaurus), folksonomies (e.g., social bookmarking tags), general
ontologies (e.g., WordNet, OpenCyc) and specific ontologies (e.g., Process
Specification Language). The list also includes markup languages (e.g.,
NeuroML), representation formalisms (e.g., Entity-Relation model, OWL,
WSDL-S) and various ISO standards (e.g., ISO 11179). This sample clearly
illustrates the diversity of artifacts collected under "ontology".
    (7B)
This is the proposed first draft.
    (7C)
See the finalized and adopted version of the document at: OntologySummit2007_Communique
    (7D)
This page has been migrated from the OntologWiki.
    (7E)