Ontolog Forum

Methodology to be used to build our first UBL-Ontology

Candidates

  • The 7-Step Knowledge-Engineering Methodology outlined in Noy & McGuinness' "Ontology 101" Paper
  • Methodology in Guarino & Welty's OntoClean / Methontology approach
  • Build it on our Wiki
  • Michael Daconta 2003-03-25: "I would recommend that the overarching framework be the Ontology 101 process with a "Welty and Guarino" pass in step 4." (ref. http://ontolog.cim3.net/forum/ontolog-forum/2003-03/msg00125.html)
  • Remarks: We don't necessarily have to choose ONE and discard all others. We should also consider adopting multiple approaches, or a hybrid of them, whichever is most effective given our context. (also, see discussion)

Adopted Methodology

    • Step 1. Determine the domain and scope of the ontology
    • Step 2. Consider reusing existing ontologies
    • Step 3. Enumerate important terms in the ontology
    • Step 4. Define the classes and the class hierarchy
    • Step 5. Define the properties of classes
    • Step 6. Define the additional properties related to or necessary for properties (e.g., cardinality, bidirectionality/inverse, etc.)
    • Step 7. Create instances
    • Step 8. Create axioms/rules
  • See also the thread on "Nuts and Bolts" started by Adam Pease on 2003-04-17.
    • Extract nouns and verbs from a source text
    • Find classes in SUMO for the nouns and verbs
    • Record a mapping as being either equal, subsuming or instance.
      • type a single word that relates to the UBL term in the "SUMO term" or "English Word" text areas in the SUMO browser
    • Create a subclass of SUMO if it's a subsuming mapping
    • Add properties to the subclass
      • reusing SUMO properties
      • extending SUMO properties by creating a &%subrelation of an existing property
    • Add English definition to the class
      • define constraints that express how the subclass is more specific than the superclass
    • Express the classes and properties in KIF and begin creating axioms, based on the English definitions created previously
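
The mapping step above can be sketched as a small record structure. This is only an illustration, assuming that a "subsuming" mapping means the SUMO class is more general than the UBL term; the class names `Mapping` and `MappingKind`, the helper `needs_new_subclass`, and the example term pairs are hypothetical and have not been checked against SUMO or UBL.

```python
from dataclasses import dataclass
from enum import Enum


class MappingKind(Enum):
    EQUAL = "equal"          # UBL term means the same as the SUMO term
    SUBSUMING = "subsuming"  # SUMO term is more general than the UBL term
    INSTANCE = "instance"    # UBL term is an instance of the SUMO class


@dataclass(frozen=True)
class Mapping:
    ubl_term: str
    sumo_term: str
    kind: MappingKind


def needs_new_subclass(mapping: Mapping) -> bool:
    """Only a subsuming mapping calls for a new subclass under the SUMO class."""
    return mapping.kind is MappingKind.SUBSUMING


# Hypothetical example mappings; the term pairs are illustrative only.
mappings = [
    Mapping("Invoice", "Text", MappingKind.SUBSUMING),
    Mapping("Currency", "CurrencyMeasure", MappingKind.EQUAL),
    Mapping("USDollar", "UnitOfMeasure", MappingKind.INSTANCE),
]

# UBL terms for which a new subclass would be created.
to_create = [m.ubl_term for m in mappings if needs_new_subclass(m)]
```

Keeping the mapping kind explicit makes it easy to list which UBL terms still need a subclass, properties, and an English definition.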

Representations of Choice

  • OntologyRepresentation

Tools of Choice

  • ToolsOntology

Elaboration on the Various Candidate Approaches

The "Nuts and Bolts" process

  • 1. Use Protege to define the class hierarchy and properties of classes
    • a. Find classes in SUMO that are relevant and record a mapping from UBL to SUMO as being either equal, subsuming or instance.
    • b. Create a subclass of SUMO if it's a subsuming mapping
    • c. Add properties to the subclass, either by reusing SUMO properties, or by extending SUMO properties by creating a &%subrelation of an existing property
    • d. Add English definition to the class to define constraints that express how the subclass is more specific than the superclass
  • 2. Switch from using Protege to using a text editor to express the classes and properties in KIF and begin creating axioms, based on the English definitions created previously.
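
As a rough sketch of what step 2 produces, the snippet below writes out KIF statements for a new subclass, a subrelation, and an English definition. The helper functions and the terms `UBLInvoice` and `invoiceIssuer` are hypothetical, and the parent terms are illustrative. The two-argument `documentation` form is an assumption; the arity of that predicate has differed across SUMO releases, so check it against the version in use.

```python
def kif_subclass(child: str, parent: str) -> str:
    """State that `child` is a subclass of `parent`."""
    return f"(subclass {child} {parent})"


def kif_subrelation(child: str, parent: str) -> str:
    """State that `child` is a more specific version of relation `parent`."""
    return f"(subrelation {child} {parent})"


def kif_documentation(term: str, text: str) -> str:
    """Attach an English definition (two-argument form, an assumption)."""
    # Simplification: swap double quotes for single quotes instead of escaping.
    escaped = text.replace('"', "'")
    return f'(documentation {term} "{escaped}")'


# Hypothetical terms, used only to show the shape of the output.
statements = [
    kif_subclass("UBLInvoice", "Text"),
    kif_subrelation("invoiceIssuer", "agent"),
    kif_documentation("UBLInvoice", 'A "Text" that requests payment.'),
]
kif_source = "\n".join(statements)
```

The resulting text is what one would then extend by hand with axioms derived from the English definitions.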

Our final candidate 8-step Approach

  • full text of LeoObrst's post part-1:

" ... Over the course of our last two telecons, we've raised some issues (and satisfied ourselves with some discussion), but we want to open these issues up to discussion among the larger membership.

1) Ontology Methodology

We are going to follow steps 1-5 in the Noy & McGuinness Ontology 101 methodology. Why? Because it is relatively simple and we are trying to adopt a KISS (Keep It Simple, Stupid) meta-methodology. The fused Methontology-OntoClean methodology is more complicated, because it tries to create an engineering discipline for the development of ontologies, an ontological engineering, borrowing from the more evolved software engineering/development lifecycle methodologies (Methontology) and from formal ontology analysis (OntoClean).

We are going to follow steps 1-5 only (and really, part of 5) because these are generic and make no assumptions about knowledge representation/ontology language. When you talk about "slot" and "facet", in particular, you are oriented toward a "frame-based" KR language, and we didn't want to force that perspective yet. [Aside: however, when we talk about KR languages and prospective tools based on those languages, we will possibly re-introduce these frame notions -- so stay tuned!]

  • Step 1. Determine the domain and scope of the ontology
  • Step 2. Consider reusing existing ontologies
  • Step 3. Enumerate important terms in the ontology
  • Step 4. Define the classes and the class hierarchy
  • Step 5. Define the properties of classes and their slots

( Step 6. Define the facets of the slots )

( Step 7. Create instances )

I'll reword the above to be the following, and probably we can adopt all seven steps with this rewording (and I'll add a distinct step 8, even though this step is partially included in steps 4-7, about which we'll have more discussion later):

  • Step 1. Determine the domain and scope of the ontology
  • Step 2. Consider reusing existing ontologies
  • Step 3. Enumerate important terms in the ontology
  • Step 4. Define the classes and the class hierarchy
  • Step 5. Define the properties of classes
  • Step 6. Define the additional properties related to or necessary for properties (e.g., cardinality, bidirectionality/inverse, etc.)
  • Step 7. Create instances
  • Step 8: Create axioms/rules

Question: are these revised 8 steps reasonable to folks? ... "

Choice of Upper Ontology

  • full text of LeoObrst's post part-2:

" ... 2) Upper Ontology/ies

It is much easier to develop domain ontologies (domain defined as a subject area or area of knowledge, e.g., business-to-business e-commerce) when these can use upper ontologies. In our experience, developing domain ontologies without upper ontologies causes you to spend a good portion of your time creating what should be in an upper ontology (if you had one), i.e., time, space, part-hood, abstract vs. concrete, organization, process, state, task, product, location, role, contiguity, synchronization, dependency, physical property, scalar measures, unit of measure, etc.

So it is useful at the beginning of the process in developing domain ontologies to have and use a set of upper ontologies. SUMO (Suggested Upper Merged Ontology) has been offered as one candidate by Adam Pease. He has mapped one of our targeted areas of focus (Invoicing) to the SUMO -- a very useful exercise.

This is an issue we need to address: let's pull in an Upper Ontology or set of upper ontologies, so we don't spin our wheels re-inventing stuff that may already be available. Which ones? Well, currently there are a few out there. SUMO, Upper Cyc, and some others probably not as extensive. I will try to dig up a summary.

Leo ... "

"Ontology 101" Approach

  • Synopsis
    1. Determine the domain and scope of the ontology
    2. Consider reusing existing ontologies
    3. Enumerate important terms in the ontology
    4. Define the classes and the class hierarchy
    5. Define the properties of classes and their slots
    6. Define the facets of the slots
    7. Create instances
    • The paper shows how one can use this approach in association with the Protégé-2000 Ontology Editor
  • Pros
  • Cons
  • Other Remarks
  • References

"OntoClean" Approach

  • Synopsis
  • Pros
  • Cons
  • Other Remarks
  • References

"Wiki" Approach

  • Synopsis
  • Pros
    • Easy to collaboratively create and edit.
  • Cons
    • No automated transform to a formal ontology language.
  • Other Remarks
    • I think we should do this informally. The question I would have is whether we feel we should do this as a formal deliverable. I would recommend that we use this as a "bootstrapping" tool to get general consensus and then formalize the general ideas expressed in the wiki in a formal ontology language via a specific tool.
  • References

Representation

  • The issue about whether to create a formal ontology in first order logic seems to have resurfaced. This is an attempt to summarize our reasons for our earlier consensus on using first order logic. (--AdamPease / 2003-07-31)
    • There were several reasons for the earlier consensus to use KIF but foremost among them was the stated purpose of this effort - to formalize the content in the UBL, rather than leaving definitions in English. If we create the ontology in Protege, we'll be left with stating definitions in English. We could state axioms in the comment fields, but Protege can't make any use of those axioms, and the tool does not contain even the most basic functions like hyperlinking the terms in a pseudo-axiom comment. (--AdamPease / 2003-07-31)
    • Protege is a good way to get one's feet wet in ontology. It's very easy to use. But it has to be seen as a mere stepping stone to our objective. If we linger too long with it, the resulting model will reflect Protege's representational capabilities. For those of us who are programmers, I'm sure we all will have an anecdote about another programmer who "writes Fortran in C" or who "writes C in C++". In order to create code that is idiomatic in a given language one needs to write in that language at the very least, and make use of its facilities over time, while one develops proficiency. (--AdamPease / 2003-07-31)
    • Frame representations cannot directly represent ternary (or higher-arity) relations. One has to split any ternary relation into three binary relations that share a new "relation ID" term. This is awkward for both representation and reasoning. For example, (between A B C) becomes (between1 A gensym1) (between2 B gensym1) (between3 C gensym1).
    • Another problem with frames is the representation of negation. In logic we can simply enclose any formula in (not ...) to negate it, so (likes Mary Bill) becomes (not (likes Mary Bill)). With a frame system we have to create the opposite version of the predicate, say "dislikes". That doubles the number of relations, since every predicate must have an opposite. Maybe that wouldn't be so bad, but one would still be limited to negating individual statements, rather than arbitrary formulas (which a frame system can't represent anyway). The list of things a frame system can't handle is extensive.
    • If we stick with a frame representation for long, we'll wind up recreating a lot of the content of SUMO, simply because frames can't handle the content that already exists.
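
The two frame limitations discussed above can be illustrated with a short sketch. It models formulas as nested Python tuples, which is an assumption made for illustration and not how any particular frame system or KIF tool stores them; the helpers `gensym`, `reify`, `negate`, and `to_kif` are hypothetical names.

```python
import itertools

_counter = itertools.count(1)


def gensym(prefix: str = "gensym") -> str:
    """Return a fresh identifier such as gensym1, gensym2, ..."""
    return f"{prefix}{next(_counter)}"


def reify(relation: str, *args: str) -> list:
    """Split one n-ary statement into n binary statements sharing a fresh ID."""
    rel_id = gensym()
    return [(f"{relation}{i}", arg, rel_id) for i, arg in enumerate(args, start=1)]


def negate(formula: tuple) -> tuple:
    """In logic, any formula is negated by wrapping it in (not ...)."""
    return ("not", formula)


def to_kif(formula) -> str:
    """Render a nested tuple as a parenthesized KIF-style string."""
    if isinstance(formula, tuple):
        return "(" + " ".join(to_kif(part) for part in formula) + ")"
    return str(formula)


# The frame-style workaround for (between A B C).
binary = reify("between", "A", "B", "C")
kif_binary = " ".join(to_kif(statement) for statement in binary)

# Negating a single statement, and negating a compound formula.
negated = to_kif(negate(("likes", "Mary", "Bill")))
negated_compound = to_kif(
    negate(("and", ("likes", "Mary", "Bill"), ("likes", "Bill", "Mary")))
)
```

The last value shows the case that an opposite predicate such as "dislikes" cannot cover: the negation applies to a whole formula, not to one statement.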