Ontolog Invited Speaker Presentation - Mr. Jack Park & Dr. Patrick Durusau - Thu 2006-04-27

Conference Call Details

Subject: Ontolog Invited Speaker Presentation by Jack Park & Patrick Durusau - Thu 2006-04-27
Agenda: Mr. Jack Park (SRI) and Dr. Patrick Durusau (INCITS/V1) will be presenting to the community on their talk entitled: "Avoiding Hobson's Choice In Choosing An Ontology"
Date: Thursday, April 27, 2006
Start: 10:30 AM PDT / 1:30 PM EDT / 17:30 UTC
- (see world clock for other time zones)
- Duration: 1.5~2.0 hours
Dial-in Number: +1-641-696-6600 (Iowa, USA)
- Participant Access Code: "686564#"
Shared-screen support
- a VNC session will be started 5 minutes before the call at: http://vnc2.cim3.net:5800/
  - view-only password: "ontolog"
- if you plan to be logging into this shared-screen option (which the speaker may be navigating), and you are not familiar with the process, please try to call in 5 minutes before the start of the session so that we can work out the connection logistics. Help on this will generally not be available once the presentation starts.
- people who have difficulty accessing the shared-screen service(s) may download the slides below and run them locally. The speaker will prompt you to advance the slides during the talk.

RSVP to peter.yim@cim3.com. This will help us prepare enough conferencing resources.

This session, like all other Ontolog events, is open to the public. Information relating to this session is shared on this wiki page: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2006_04_27

For Virtual Speaker Session Tips and Ground Rules - see: VirtualSpeakerSessionTips

Please note that this session will be recorded, and the audio archive is expected to be made available as open content to our community membership and the public at-large under our prevailing open IPR policy.

Attendees

Attended:
- Doug Engelbart
- Cecelia Hickel
- Peter P. Yim
- RoyRoebuck
- Arturo Sanchez
- Bob Smith
- Pat Cassidy (has to leave by 3:00pm EDT)
- Kurt Conrad
- Jennifer Boettcher (Georgetown University Library)
- Susan Turnbull
- Vinay Chaudhri
- Patrick Heinig
- Michael Grüninger
- Patrick Durusau
- Itzhak Roth
- Steve Ray
- Joshua Lieberman
- Paul Koch
- James Werner
- GaryBergCross
- Jack Park
- Jakub Kotowski

Also expected (and who may have joined us after the roll call):
- DavidCMartin
- V.V. Ramayya (Boeing)
- Michael Uschold
- Rex Brooks
- ...(to register for participation, please add your name here or e-mail <peter.yim@cim3.com> so that we can reserve enough resources to support the session.)...

Regrets:
- Lisa Colvin
- John Young
- Matthew West
- Kathleen Ellis
- EMichaelMaximilien
- Nicolas Rouquette
- Jonathan Cheyer
- Mary Keitelman
- Ruth Keeting-White (KMWG)
- John Bateman

Background

This is the 3rd event in the series of talks and discussions that revolve around the topic: "Ontologizing the Ontolog Body of Knowledge" during which the community will explore the "what's" and "how's" of developing a semantically interoperable application, using the improved access to the content of Ontolog as a case in point.

Agenda & Proceedings

Mr. Jack Park (SRI) and Dr. Patrick Durusau (INCITS/V1) will be presenting to the community on their talk entitled: "Avoiding Hobson's Choice In Choosing An Ontology"

[picture of Mr. Jack Park]

[picture of Dr. Patrick Durusau]

Abstract: (by Jack Park & PatrickDurusau)

Most users of ontologies have either participated in the development of the ontology they use and/or have used it for such a period of time that they have taken ownership of it. Like a hand that grows to fit a tool, users grow comfortable with "their" ontology and can use another only with difficulty and possibly high error rates.

When agencies discuss sharing information, the tendency is to offer other participants a "Hobson's Choice" of ontologies. "Of course we will use ontology X." which just happens to be the ontology of the speaker. Others make similar offers. Much discussion follows. But not very often effective integration of information.

In all fairness to the imagined participants in such a discussion, unfamiliar ontologies can lead to errors and/or misunderstandings that may actually impede the interchange, pardon, the accurate interchange of information. Super-ontologies don't help much when they lack the granularity needed for real tasks and simply put off the day of reckoning when actual data has to move between agencies.

The Topic Maps Reference Model is a paradigm for constructing a mapping of ontologies that enables users to use "their" ontologies while integrating information that may have originated in ontologies that are completely foreign or even unknown to the user. Such mappings can support full auditing of the process of integrating information to enable users to develop a high degree of confidence in the mapping.

Topic maps rely upon the fact that every part of an ontology is in fact representing a subject. And the subject that is being represented is known from the properties of those representatives. Such representatives are called subject proxies in the Topic Maps Reference Model. Those properties are used as the basis for determining when two or more subject proxies represent the same subject. Information from two or more representatives of the same subject can be merged together, providing users with information about a subject that may not have been known in their ontology.

Park and Durusau explore the philosophical, theoretical and practical steps needed to avoid a Hobson's Choice in ontology discussions and to use the Topic Maps Reference Model to effectively integrate information with a high degree of confidence in the results. All while enabling users to use the ontology that is most familiar and comfortable for them.

Session Format and Agenda:
- this will be a virtual session over a phone conference setting, augmented by shared computer screen support

1. The session will start with a brief self-introduction of the attendees (~10 min.)
2. Introduction of the Invited Speakers by Dr. Douglas Engelbart
3. Presentation by the invited speaker (45~60 min.)
4. Open discussion (30~45 min.)

Bio of Mr. Jack Park & Dr. Patrick Durusau:

Mr. Jack Park is a research scientist in the AI Laboratory at SRI International in Menlo Park. He works with Adam Cheyer's integration team on the DARPA-funded CALO project, where he created the prototype from which the team evolved the IRIS desktop knowledge workstation. During employment with VerticalNet, Park served on the XTM Authoring Committee which created the XTM topic maps specification, now a part of the ISO 13250 Topic Maps standard. In a former life, while serving as the president of the American Wind Energy Association, Park was constructing microprocessor-based weather stations used for siting wind energy farms and in agricultural applications. The massive amounts of data being collected by those stations led to investigations into AI applications in data mining and data organization. Ontologies and inference engines naturally followed. Park has crafted Java-based inference engines for a large banking enterprise and a clinical informatics enterprise, and participated in the construction of the VerticalNet B2B ontology editor. Park authored _The Wind Power Book_ in 1981, and co-authored and edited _XML Topic Maps: Creating and Using Topic Maps for the Web_, published in 2002. He has taught university courses in renewable energy resources in the U.S., and lectured on those subjects in the U.S., parts of Europe and Africa. He spends most of his time now evolving applications for subject maps related to the Douglas Engelbart call for continuous improvement of human capabilities.

Dr. Patrick Durusau is the Chair of V1, the US Technical Advisory Group (TAG) to ISO/IEC JTC 1/SC 34, the committee responsible for the development of the Topic Maps family of standards. He is a co-editor of ISO 13250-5, the Topic Maps Reference Model. . . . In the Fall of 2006 he will be teaching what is thought to be the first graduate course devoted exclusively to topic maps at the School of Library and Information Science at the University of Illinois at Urbana-Champaign. . . . He is deeply interested in the integration of diverse information systems (including ontologies) while preserving the ability of users to identify the subjects of their conversations in ways that work best for them.

The speakers' prepared slides can be accessed by pointing your web browsers to (the link immediately below):
- http://ontolog.cim3.net/file/resource/presentation/JackPark-PatrickDurusau_20060427/Avoiding_Hobson-s_Choice_In_Choosing_An_Ontology--JackPark-PatrickDurusau_20060427.ppt
  - applicable licence to the above presentation - http://creativecommons.org/licenses/by/2.5/legalcode
- links to additional relevant resources:
  - The latest Topic Maps Reference Model draft - http://www.jtc1sc34.org/repository/0710.pdf
  - (please post any additional resources here)
- Any material outside of the prepared presentation, if it is called up during the session, may be shared under the VNC session detailed above

If you have questions for the presenter, we appreciate your posting them here: (please identify yourself)

Arturo Sanchez: I have several questions/comments:
- Slide 15 states: "Abstract model with no syntax or data model". How can it be a model w/o having a syntax or data model (semantics?)
  - Patrick Durusau: "No syntax" means that a subject map could be expressed in XML syntax (I hope to release one such syntax in May) or as tables in a relational database, or in some other syntax. Or even virtually, reading data held in some other format and treating the data as properties of virtual subject proxies.
  - Patrick Durusau: "No data model" means that there is no model for subject identity. Contrast with XTM (an XML topic map syntax), where, according to the Topic Maps Data Model, certain constructs merge when defined properties are equivalent (equivalence is also defined).
  - Patrick Durusau: "No data model for subject identity" would be more accurate. The TMRM does presume without proof that subjects can be identified by properties (key/value pairs) and that it is possible to define, record and compare such properties. (The details of which are left up to legend writers.) Oh, and recall that all keys are defined as references to proxies.
- As I hear your presentation, the following deja vu's come to mind: Frames, Object-Oriented Classes, Type Theory, Denotational Semantics, Model Theory, Russell's Paradox (anything can be anything). So what is new about topic maps and subject maps that these approaches have not taken into account?
  - Patrick Durusau: I am not sure I can manage a succinct answer to that question. I will try to say what I see as different about subject maps and perhaps we can compare that to the approaches you mention separately. First, subject maps do not presume that properties are "inherent" in the subjects they identify. That is to say that the author of a subject proxy is attributing properties to a subject that in their mind identify the subject in question. An author is free to take the position that such properties are inherent in the subjects they identify, but the paradigm recognizes that different authors may have (and are likely to have) different properties that identify the same subjects.
  - Patrick Durusau: Second, there is no limitation such as occurs in Description Logic with an "atomic concept." In terms of explaining a new concept to a colleague, no one would consider arriving at a term that is simply repeated in response to all further questions. Granted that different authors will find different stopping places due to cost or other factors, but since all keys are in fact references to proxies, it is always possible to create further mappings to identify more subjects.
  - Patrick Durusau: Third, the starting premise of the TMRM is not to posit a world view, a set of conditions within which subject identification will occur. Whether that view is based on logic (DL), or a theory of semantics, etc. All of those views can be used to identify subjects in a subject map. But so can views that are based on irrationality, cultures innocent of computer technology, etc. The key difference is that the TMRM says: "Tell me how you identify your subjects?", and sets forth disclosure requirements that enable legends to merge those with other views of identifying subjects. Whether those mergings are meaningful or even sensible is not an issue that the TMRM attempts to address. Ultimately that is an issue for the author to consider in creating merging rules and users to evaluate when viewing the results of merging.
  - Patrick Durusau: One caveat: As Jack said during the conference call, no one is required to federate an ontology or merge their subject map with any other. There are use cases where that would be bad practice if not criminally irresponsible. Simply because merging is possible by no means implies that it should be done.

- How do you handle contradictions and ambiguities? (are there such things?)
  - Patrick Durusau: I am not certain what you mean by contradiction? Recall that subject proxies identify subjects. Do you mean an author using the same properties to ostensibly identify two different subjects? If all that exists are these two proxies, I am not sure how we would say that a contradiction exists. If the identifying properties are the same, how would we know? On the other hand, in a more complex environment, where the subject proxies are used in other proxies that represent relationships, for example, then contradictions could be detected. For example, a subject proxy could not be both the biological father and child in a parent-child relationship represented by a subject proxy. (Leaving aside the possibility of cloning of course.) Nothing compels the existence of such a restriction on a relationship but I would think it would be common in most useful subject maps. We attempted to leave as much as possible to the authors of legends and subject maps, which means there is enough power in the TMRM to do serious damage. A good legend will be necessary to protect users from such mistakes.
  - Patrick Durusau: I am less certain about how to answer your question on ambiguities. The requirements for merging rules are completely unconstrained by the TMRM. I think Jack has an application underway that actually "votes" (I have not actually seen the system.) on merging conditions. How ambiguity would be handled, if allowed at all, is something that a legend would specify.
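
As a rough illustration of the answers above (and not anything prescribed by the TMRM, which defines no syntax or data model), subject proxies can be sketched as bags of key/value properties, with a legend disclosing the rule for deciding when two proxies represent the same subject. Everything named below (`Proxy`, `same_subject`, `merge_all`, the property keys, and the "share a value on an identity key" rule) is a hypothetical choice made for this sketch. Plain strings stand in for keys, whereas the TMRM treats keys as references to proxies.

```python
from dataclasses import dataclass, field


@dataclass
class Proxy:
    """A subject proxy: properties an author attributes to a subject."""
    properties: dict = field(default_factory=dict)  # key -> set of values


def same_subject(a, b, identity_keys):
    """Hypothetical legend rule: two proxies represent the same subject
    if they share at least one value on any of the identity keys."""
    return any(
        a.properties.get(k, set()) & b.properties.get(k, set())
        for k in identity_keys
    )


def merge(a, b):
    """Build a new proxy holding the union of both proxies' properties."""
    merged = {}
    for proxy in (a, b):
        for key, values in proxy.properties.items():
            merged.setdefault(key, set()).update(values)
    return Proxy(merged)


def merge_all(proxies, identity_keys):
    """Merge pairwise until no two proxies satisfy the legend rule."""
    result = list(proxies)
    changed = True
    while changed:
        changed = False
        for i in range(len(result)):
            for j in range(i + 1, len(result)):
                if same_subject(result[i], result[j], identity_keys):
                    combined = merge(result[i], result[j])
                    result = [p for k, p in enumerate(result) if k not in (i, j)]
                    result.append(combined)
                    changed = True
                    break
            if changed:
                break
    return result


# Two agencies describe the same supplier with their own vocabularies;
# a third proxy describes a different subject. All values are made up.
agency_a = Proxy({"tax_id": {"T-123"}, "vendor_name": {"Acme"}})
agency_b = Proxy({"tax_id": {"T-123"}, "supplier_name": {"Acme Corp."}})
unrelated = Proxy({"tax_id": {"T-999"}, "vendor_name": {"Other Co."}})

merged = merge_all([agency_a, agency_b, unrelated], identity_keys=["tax_id"])
```

Each agency keeps its own property names, and the merged proxy carries both. Whether sharing a `tax_id` value is a sensible merging rule is exactly the kind of decision the answers above leave to the legend author; a stricter legend could also reject merges that violate a declared constraint.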

RoyRoebuck: Comments on slide 10.
- I believe you will find utility in using the OMG 4 Layer metamodel in characterizing the "upper ontology" on slide 10 as Layer 3 (for object models) and Layer 2 (for upper ontology classes), while using Layer 1 for Applications and Layer 0 for Data / Content. The same approach could be applied to slide 11.
  - Patrick Durusau: Thanks! Yes, both of those slides could be better presented. Thanks!

Susan Turnbull: I have several questions/comments:
- Could you comment further on building an "improvement capability" into an existing business service/program/app?
  - Patrick Durusau: Yes, but let me do that in the context of responding to your question below.
  - I believe Doug Lenat, Cycorp said at the UpperOntologySummit that Cyc is being continuously improved by a "word game" available at their site.
  - Sounds like NASA clickworker site where volunteers are mapping Mars (appropriately chunked task) and aggregate performance matches/surpasses professional geologists.
  - Did I understand that you were suggesting something like this: People using a service/app in their everyday work - could click on select "highlighted words" (selected for that business quarter) - then complete easy task tapping their sense of concept relationships that would aggregate towards improved understanding/ ontology merging - of what's "behind the one-way mirror" of multiple-contractor developed business services - and catalyze improvements in the value of the service?
    - Patrick Durusau: There are a number of issues that are related here so let me try to untangle some of them. In terms of user interface, the mappings could be improved by displaying to a user a limited subset of all the information known from a subject proxy to allow them to approve/disapprove of using that proxy. The information that would be displayed to a contracting officer, for example, due to their familiarity with the terms used by contractors and subcontractors, might be quite different from that displayed for a classification officer, who sees the same terms but from a different perspective. The premise being that a mapping can be improved by users who are most familiar with whether the mapping makes any sense or not.
    - Patrick Durusau: To put it another way, while I describe a simplistic merging process and rules, that is not really an accurate depiction of the sorts of merging processes that are available to the author of a subject map. I am glad you and other participants noticed that as I really need to improve that part of the presentation. Ultimately, the merging process is unconstrained by the TMRM. Some may be more appropriate for some circumstances than others. Which one is chosen, so long as it is disclosed, is a matter of indifference to the TMRM. To do otherwise, by the TMRM, would be to predetermine some courses of action (or merging) are by definition better than others. That decision is one that is best left to authors of legends and subject maps, at least in the opinion of the editors of the TMRM.
    - Patrick Durusau: A related issue is how to identify subjects in documents without the tedium of either having users mark every word or simply relying upon clever but ultimately limited NLP algorithms. In the OpenDocument Format (ODF) metadata SC, I have proposed that users be able to declare a vocabulary (think subject proxies) that identifies the terms used in a document. There is a lot of interest in the proposal which will not declare any such vocabularies but will enable others to do so for users. Then when searching large batches of documents, search engines (with a subject map component) can offer more meaningful access with a minimal cost in terms of user effort.
- Michel Biezunski gave a presentation recently that is relevant to today's excellent talk.

For those who have further questions for Jack Park or Patrick Durusau, please post them to the ontolog forum so that we can all benefit from the discourse.

Session ended 2006-04-27 12:29 pm PDT

Session Recording of the JackPark-PatrickDurusau Talk

(Thanks to Kurt Conrad, Bob Smith and Peter P. Yim for their help with getting the session recorded. =ppy)

To download the audio recording of the presentation, click here
- the playback of the audio files requires the proper setup, and an MP3 compatible player on your computer.
Conference Date and Time: Apr 27, 2006 10:43am~12:28pm Pacific Daylight Time
Duration of Recording: 1 Hr. 44 Min.
Recording File Size: 18.3 MB (in mp3 format)
Telephone Playback Expiration Date: May 7, 2006 12:41 PM Pacific Daylight Time
- Prior to the above Expiration Date, one can call-in and hear the telephone playback of the session.
- Playback Dial-in Number: +1-805-620-4002 (Ventura, CA, USA)
- Playback Access Code: 636218#
- suggestions:
  - it's best that you listen to the session while having the presentation opened in front of you. You'll be prompted to advance slides by the speaker.
