Ontolog Forum

Revision as of 18:30, 16 December 2022 by Forum (talk | contribs)

Session: COVID-19 KGs
Duration: 1 hour
Date/Time: 09 Feb 2022 17:00 GMT
    9:00am PST / 12:00pm EST
    5:00pm GMT / 6:00pm CET
Convener: Ravi Sharma
Track: Disaster Landscape

Ontology Summit 2022 COVID-19 KGs

Dealing with Disasters

The COVID-19 pandemic as well as other pandemics and disasters have prompted an impressive, worldwide response by governments, industry, and the academic community. Ontologies can play a significant role in search, data description, interoperability and harmonization of the increasingly large data sources that are relevant to disasters such as the COVID-19 pandemic. The Ontology Summit 2022 examined the overall landscape of disasters and related ontologies. A framework consisting of a set of dimensions was developed to characterize this landscape. The framework was applied to health-related disasters, environmental disasters, as well as aerospace and cyberspace disasters. It was found that there are many cross-domain linkages between different kinds of disasters and that ontologies developed for one kind of disaster can be repurposed for other kinds. A representative sample of projects that have been developing and using ontologies for disaster monitoring and response management is presented to illustrate best practices and lessons learned. The Communiqué ends by presenting the findings and recommendations of the summit.

Agenda

  • Chris Mungall and Justin Reese COVID-19 Knowledge Graphs
  • Chris Mungall is Department Head of Biosystems Data Science at Lawrence Berkeley National Laboratory. His research interests center around the capture, computational integration, and dissemination of biological research data, and the development of methods for using this data to elucidate biological mechanisms underpinning the health of humans and of the planet. He is particularly interested in developing and applying knowledge-based AI methods, especially Knowledge Graphs (KGs), as an approach for integrating and reasoning over multiple types of data. Dr. Mungall and his team have led the creation of key biological ontologies for the integration of resources covering gene function, anatomy, phenotypes and the environment. He is a PI on major projects such as the Gene Ontology (GO) Consortium, the Monarch Initiative, the NCATS Biomedical Data Translator, and the National Microbiome Data Collaborative project. In collaboration with Stanford University, Dr. Mungall is leading the creation of a new BioPortal Knowledge Graph.
  • Justin Reese is a Computer Research Scientist at LBNL. His research focuses on using computational methods to extract actionable knowledge from biomedical and biological data, and in particular, developing performant graph machine learning algorithms to extract knowledge from biomedical knowledge graphs. Dr. Reese, along with Dr. Mungall, led the KG-Covid-19 project, which included methods for performing inference over the KG. The knowledge graphs generated by the project have been leveraged by the National Virtual Biotechnology Laboratory (NVBL) Therapeutics project and the National COVID Cohort Collaborative (N3C) for accessing integrated COVID-related data. Starting in 2022, Dr. Reese will lead the LBNL team in a new project to develop machine learning tools and best practices to improve the response to COVID-19, leveraging a large range of patient-level COVID-19 data in the National COVID Cohort Collaborative (N3C) Enclave.
  • Video Recording

Conference Call Information

  • Date: Wednesday, 09 Feb 2022
  • Start Time: 9:00am PST / 12:00pm EST / 6:00pm CET / 5:00pm GMT / 1700 UTC
  • Expected Call Duration: 1 hour
  • The Video Conference URL is https://bit.ly/3rTKSGQ
    • Meeting ID: 881 4427 2329
    • Passcode: 553714
  • Chat Room: https://bit.ly/37g93pC
    • If the chat room is not available, then use the Zoom chat room.

Attendees

Discussion

[12:12] Chris Mungall: Hi everyone!

[12:13] RaviSharma: hello and welcome

[12:14] RaviSharma: It seems that one of the labs in DoD, such as those doing cancer and related research at NIH at Ft Detrick, could also be a stakeholder?

[12:18] RaviSharma: Are you also connected to the Livermore lab?

[12:19] Chris Mungall: We are a distinct lab, we have a few collaborators at other national labs, but I personally don't have many contacts at LLNL.

[12:20] Chris Mungall: If you have any contacts at Ft Detrick would love to talk to them!

[12:20] Chris Mungall: Justin will talk about the NIH collaboration shortly

[12:23] RaviSharma: How is the drug's in situ interaction with ENT tissue or the nose simulated, and similarly for pulmonary and other organs?

[12:23] BobbinTeegarden: Can you talk more about semantic similarity calculations?

[12:24] Chris Mungall: KGX: https://github.com/biolink/kgx/ -- this is a simple TSV-based exchange format for semantic Labeled Property Graphs

[12:24] Chris Mungall: The semantic web community are moving in this direction with 1G
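
Editor's sketch: the idea of a TSV-based exchange format for a labeled property graph can be illustrated with one table of nodes and one table of edges. The snippet below does not use the KGX library itself. It relies only on the Python standard library, and the column names (id, category, name, subject, predicate, object) and the EX: identifiers are assumptions chosen for illustration, loosely following the convention described in the KGX repository.

```python
import csv
import io

# Hypothetical node and edge tables. Column names and EX: identifiers
# are assumptions made for this illustration.
NODES_TSV = "\n".join([
    "id\tcategory\tname",
    "EX:drug1\tbiolink:Drug\texample drug",
    "EX:disease1\tbiolink:Disease\texample disease",
])
EDGES_TSV = "\n".join([
    "subject\tpredicate\tobject",
    "EX:drug1\tbiolink:treats\tEX:disease1",
])


def read_tsv(text):
    """Parse tab-separated text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def build_graph(nodes_text, edges_text):
    """Return (nodes, edges): nodes keyed by id, edges as property dicts."""
    nodes = {row["id"]: row for row in read_tsv(nodes_text)}
    edges = read_tsv(edges_text)
    # Every edge endpoint must refer to a declared node.
    for edge in edges:
        for end in ("subject", "object"):
            if edge[end] not in nodes:
                raise ValueError(f"unknown node: {edge[end]}")
    return nodes, edges


nodes, edges = build_graph(NODES_TSV, EDGES_TSV)
for edge in edges:
    print(nodes[edge["subject"]]["name"], edge["predicate"], nodes[edge["object"]]["name"])
```

Keeping nodes and edges in separate flat tables means each row can carry arbitrary extra property columns, which is what makes the graph a property graph rather than a bare list of triples.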

[12:25] Chris Mungall: Semantic Similarity: let me pull up some references..

[12:26] Chris Mungall: Ravi: that kind of contextualized knowledge may go beyond what we have in the KG at the moment. We may have tissue-independent representations of the underlying biological pathways but are lacking tissue-specific models

[12:27] Chris Mungall: Here is a good summary of semantic similarity with ontologies: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000443

[12:29] Chris Mungall: This is the performant rust graph library: https://github.com/AnacletoLAB/ensmallen

[12:37] Chris Mungall: going back to the semantic similarity question, Fig2 in the phenomizer paper gives a sense of how this can be used to match patients to diseases/phenotypic clusters, making use of the ontology hierarchy: https://www.ncbi.nlm.nih.gov/labs/pmc/articles/PMC2756558/

[12:37] BobbinTeegarden: Thank you, Chris!
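
Editor's sketch: one common way to use an ontology hierarchy for semantic similarity is a Resnik-style score, where two terms are as similar as the information content of their most informative shared ancestor. The snippet below is only an illustration of that general idea, not the Phenomizer implementation. The toy hierarchy, the term names, and the annotation frequencies are all hypothetical.

```python
import math

# Hypothetical toy hierarchy: each term maps to its direct parents.
PARENTS = {
    "root": [],
    "abnormal_heart": ["root"],
    "abnormal_lung": ["root"],
    "arrhythmia": ["abnormal_heart"],
    "cardiomyopathy": ["abnormal_heart"],
}

# Hypothetical fraction of annotated items carrying each term or a descendant.
FREQ = {
    "root": 1.0,
    "abnormal_heart": 0.5,
    "abnormal_lung": 0.5,
    "arrhythmia": 0.25,
    "cardiomyopathy": 0.25,
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """Rarer terms are more informative: IC = -log(frequency)."""
    return -math.log(FREQ[term])


def resnik_similarity(a, b):
    """Information content of the most informative common ancestor."""
    common = ancestors(a) & ancestors(b)
    return max(information_content(t) for t in common)


print(resnik_similarity("arrhythmia", "cardiomyopathy"))
print(resnik_similarity("arrhythmia", "abnormal_lung"))
```

In this toy setup the two heart terms share the fairly specific ancestor abnormal_heart, so they score higher than a heart term paired with a lung term, whose only shared ancestor is the uninformative root.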

[12:39] Chris Mungall: We are not using semantic similarity explicitly with the KG embedding approach, but there are similarities -- we can measure similarity between the embedded vectors for any two concepts or individuals using standard measures like cosine similarity. This exploits a wider range of links in the KG than standard semantic similarity, but with some loss in explainability.
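
Editor's sketch: the cosine similarity between two embedded vectors mentioned above can be computed as follows. The embedding values and the EX: identifiers are hypothetical. In practice the vectors would come from a graph embedding method, which is not shown here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical embeddings for three KG concepts.
embeddings = {
    "EX:concept_a": [0.9, 0.1, 0.3],
    "EX:concept_b": [0.8, 0.2, 0.4],
    "EX:concept_c": [-0.5, 0.9, 0.0],
}

query = "EX:concept_a"
# Rank the other concepts by similarity to the query concept.
ranked = sorted(
    (name for name in embeddings if name != query),
    key=lambda name: cosine_similarity(embeddings[query], embeddings[name]),
    reverse=True,
)
print(ranked)
```

The score depends only on the direction of the vectors, not their length, so it says two concepts are close in the embedding space without saying which links in the KG made them close. That is the loss in explainability noted above.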

[12:45] RaviSharma: I have questions on ecosystem and drug interactions?

[12:50] Mike Bennett: Have you needed to do anything to reflect the distinction between data qua data, and things represented by data in the KG? Or are they indistinguishable in this scenario?

[12:54] Chris Ahern: Which KG database do you use?

[12:55] Chris Mungall: We basically do everything in memory using the ensmallen library... but we also make everything available in a triplestore, and the KGX files can easily be loaded into Neo4j. We have been using Blazegraph as the triplestore, but it is no longer developed, so we are looking for (open source) replacements

[12:56] BobbinTeegarden: If you change contexts/perspectives, can you discover new things in the same collection of data/ontology/KG (in a holonic way)?

[12:57] Chris Mungall: Good Q Mike, I would say we focus on knowledge over raw data, and there are upstream processes to normalize, e.g. RNAseq data -> simple X expressed-in Y triples

[12:58] Chris Mungall: Thanks - very interesting. We've used both Neo4j and AllegroGraph but not extensively enough to gauge their performance at scale. Your in-memory approach no doubt speaks to that.

[13:00] Chris Mungall: https://ontologforum.org/index.php/KGSQL?

[13:01] Ken Baclawski: Yes, that is the reference for KGSQL.

Resources

Previous Meetings

                             Session
ConferenceCall 2022 02 02    CYC
ConferenceCall 2022 01 26    General Disaster Parametric Landscape
ConferenceCall 2022 01 19    Launch

Next Meetings

                             Session
ConferenceCall 2022 02 16    Mark Fox
ConferenceCall 2022 02 23    Synthesis
ConferenceCall 2022 03 02    Workshop Report