Session	Yolanda Gil
Duration	1 hour
Date/Time	27 May 2020 16:00 GMT
	9:00am PDT/12:00pm EDT
	5:00pm BST/6:00pm CEST
Convener	KenBaclawski
Track	Use Cases

Ontology Summit 2020 Yolanda Gil

Knowledge graphs (KGs), closely related to ontologies and semantic networks, have emerged in the last few years as an important semantic technology and research area. As structured representations of semantic knowledge that are stored in a graph, KGs are lightweight versions of semantic networks that scale to massive datasets such as the entire World Wide Web. Industry has devoted a great deal of effort to the development of knowledge graphs, and they are now critical to the functions of intelligent virtual assistants such as Siri and Alexa. KGs are relevant to many research communities, including Ontologies, Big Data, Linked Data, Open Knowledge Network, Artificial Intelligence, and Deep Learning.

Agenda

Presentation by Yolanda Gil: Seven Ontologies for Publishing the Scientific Record on the Web. Slides, Video Recording, YouTube Video
Abstract:

This talk will describe our work on seven ontologies that we have developed to describe complementary aspects of scientific work and that, linked together, present a path towards publishing the scientific record on the Web. The Linked Earth Ontology extends existing standards, and was developed collaboratively and entirely online by scientists. The OntoSoft ontology describes scientific software artifacts with information relevant to scientists. The W3C PROV-O ontology represents provenance of scientific data, whether observable or derived through computation. The P-PLAN ontology extends PROV-O to describe high-level general plans, and the OPMW-PROV ontology extends both to describe abstract computational workflows linked to their executions. The DISK Hypothesis ontology describes hypothesis statements, their supporting evidence, and their revisions as new data is analyzed. The Software Description Ontology for Models characterizes the development of models so they can be understood and compared. These seven ontologies provide essential capabilities, but much work remains to be done to capture the scientific record more comprehensively. Are we far from a day when each scientific article will be properly linked to hypotheses, models, software, provenance, workflows, and other key scientific entities on the Web? Will AI research tools then be able to access this information to generate new results? Will AI systems ultimately be capable of writing scientific papers in the future?

Bio:

Dr. Yolanda Gil is Director of Knowledge Technologies at the Information Sciences Institute of the University of Southern California, and Research Professor in Computer Science and in Spatial Sciences. She is also Associate Director for Data Science, and Director of the USC Center for Knowledge-Powered Interdisciplinary Data Science. She received her M.S. and Ph.D. degrees in Computer Science from Carnegie Mellon University, with a focus on artificial intelligence. Her research is on intelligent interfaces for knowledge capture and discovery, which she investigates in a variety of projects concerning scientific discovery, knowledge-based planning and problem solving, information analysis and assessment of trust, semantic annotation and metadata, and community-wide development of knowledge bases. Dr. Gil collaborates with scientists in many domains on semantic workflows and metadata capture, social knowledge collection, computer-mediated collaboration, and automated discovery. She is a Fellow of the Association for Computing Machinery (ACM), and Past Chair of its Special Interest Group in Artificial Intelligence. She is also a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI), and was elected as its 24th President in 2016.

Discussion

Conference Call Information

Date: Wednesday, 27 May 2020
Start Time: 9:00am PDT / 12:00pm EDT / 6:00pm CEST / 5:00pm BST / 1600 UTC
- ref: World Clock
Expected Call Duration: 1 hour
The Video Conference URL is https://zoom.us/j/689971575
- iPhone one-tap:
  - US: +16699006833,,689971575# or +16465588665,,689971575#
- Telephone:
  - Dial (for higher quality, dial a number based on your current location): US: +1 669 900 6833 or +1 646 558 8665
  - Meeting ID: 689 971 575
  - International numbers available: https://zoom.us/u/Iuuiouo
Chat Room: http://bit.ly/2LkAbKj
- If the chat room is not available, then use the Zoom chat room.

Attendees

Discussion

[12:09:00] Ravi Sharma: Yolanda: Human limitations agree but human innovation is cause for scientific progress and new discoveries. Of course there is distinction between sci and tech

[12:10:21] Ravi Sharma: collaboration is the new key to large area scientific projects: LHC, ITER, Genomics, etc.

[12:16:05] Ravi Sharma: Yolanda what is the proxy archive or sensor?

[12:17:43] Todd Schneider: What analysis techniques were used in reviewing/developing domain vocabularies?

[12:18:45] Ravi Sharma: Yolanda - did you imply that crowd sourcing by scientist participants resulted in linked Earth Ontology?

[12:19:26] Ravi Sharma: or Just Vocabularies and not all relations?

[12:26:24] Ravi Sharma: What is OPM?

[12:26:38] David Eddy: OPM = open provenance model

[12:27:24] Ravi Sharma: David - thanks

[12:27:45] David Eddy: or… if you prefer… “other people’s money”

[12:28:23] Ravi Sharma: :)

[12:30:17] Ravi Sharma: Yolanda - did you describe collaborative methods?

[12:32:47] Ravi Sharma: Yolanda - How are these models represented in RDBMS or in a visualizer tool or in UML?

[12:33:18] Ravi Sharma: Can the ontology describe the model itself?

[12:36:56] Ravi Sharma: Yolanda - where do BIOARXIV and ARXIV stand in modern publishing? Are there examples of any collaborative interactive publishing?

[12:37:59] Leia Dickerson: Have you interacted at all with the bibliometrics community in regards to your work? I’m curious what conversations you may have had with them.

[12:39:13] Ravi Sharma: ken I also want to ask

[12:42:48] Janet Singer: To have AI write papers wouldn’t they need to have an upper-level ontology that related the various ontologies you presented to a high standard of quality. Do you think that is feasible?

[12:44:15] Ravi Sharma: enigma hosp neuro imaging

[12:44:31] Ravi Sharma: hundreds of authors?

[12:45:54] David Eddy: what about the editors who pound the writings of “authors” into usable words?

[12:48:01] Mike Bennett: @Janet not only need UO for AI to write papers, we also need to clear up the ways we distinguish between ontologies of data and ontologies of the things. Still v early days on that.

[12:51:40] Mike Bennett: Yolanda has some great bookshelf action going here.

[12:52:59] Patrick Kamongi: In implementing and populating these ontologies, how do you deal with the lack of good open source back-end triple stores to manage and use the generated knowledge bases/graphs? And what have been your best practice in selecting and maintaining a back-end graph/triple store for these ontologies? Thanks!

[12:53:51] Janet Singer: @David —You have used the example of the timeline for chemistry. I’ve wondered where we are and who the players are: KGs are practitioners working with and identifying elements and their properties; ontolog it’s are developing broad theory, but how still reflecting alchemy origins?

[12:54:36] Janet Singer: ontolog it’s —> ontologists

[12:56:51] Mark Underwood: I am overjoyed -- kid you not -- to see standardization endeavors elevated to the highest level in AAAI

[12:57:14] Ravi Sharma: Yolanda- How do you harmonize vocabularies across connected domains?

[12:57:26] Janet Singer: @David —as in “earth air fire water”?

[12:57:42] Ravi Sharma: Ken - if time permits!

[12:59:20] David Eddy: @Janet… that’s an excellent time line addition… gotta include the 4 elements… how long to get to Mendeleev’s 90 elements table?

[13:04:28] Ravi Sharma: Patrick K - Great Q KG as triple stores - back end, great.

[13:04:32] Marcia Zeng: Some of the issues might be covered by the W3C DCAT version 2. https://www.w3.org/TR/vocab-dcat-2/

[13:05:27] Marcia Zeng: Data Catalog Vocabulary (DCAT), an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web.

[13:06:18] Mark Underwood: Reminder for those interested in cultural ontologies, we had a presentation previously on Arches https://www.archesproject.org/

[13:09:32] TerryLongstreth: so the hypothesis is that we can predict flooding?

[13:11:41] Mark Underwood: Yolanda, thanks!

[13:12:20] Ravi Sharma: Yolanda - great talk with both width and depth! thanks

[13:12:27] ToddSchneider: Open source triple store: https://github.com/SemWebCentral/parliament

[13:13:42] Patrick Kamongi: Thank you!

[13:13:44] Mark Underwood: FYI I had hoped we'd have a presenter from Neo4J here

[13:14:04] Marcia Zeng: Thank you!

Resources

Previous Meetings

Next Meetings