KE4WoTChallengeWWW2018

From Knoesis wiki
Revision as of 13:43, 10 November 2017 by Amelie (Talk | contribs) (Challenge Task 1: Exploiting the Web of Things Knowledge Base)

Knowledge Extraction for the Web of Things (KE4WoT) Challenge: Co-located with The Web Conference 2018 (WWW 2018): [1]

Short Description of the KE4WoT Challenge

The Web of Things (WoT) extends the Internet of Things (IoT) by using Web technologies to ease access to data. Data is generated by things/devices and then exploited by a growing number of web-based applications, for example to monitor health or control home automation devices. There is growing interest within standardization bodies in designing models to represent devices and the data they produce, as demonstrated by standards such as oneM2M, W3C SSN, W3C WoT, SmartM2M, and ETSI M2M. The purpose of this challenge is to automatically extract knowledge (e.g., the most common concepts and properties) from Knowledge Bases (KBs), such as datasets and/or models, that have already been designed and released on the Web. We will focus on KBs designed by standards bodies and/or by ontology-based WoT research projects already applied to numerous domains. The challenge will demonstrate that complementary knowledge is constantly being redesigned in different communities, and it should encourage semantic interoperability among WoT projects and KBs.

Flyer: KE4WoT Challenge at WWW 2018

Description of the KE4WoT Challenge

The Web of Things (WoT) extends the Internet of Things (IoT) by using Web technologies to ease access to data. Data is generated by things/devices and then exploited by a growing number of web-based applications, for example to monitor health or control home automation devices. There is growing interest within standardization bodies in designing models to represent devices and the data they produce. Such models should be used to design interoperable, smart, web-based WoT applications, as demonstrated by the following standards and initiatives:

  • W3C Semantic Sensor Network (SSN) is the first initiative to address interoperability issues in describing sensor networks through an ontology, since devices are required to build WoT applications. A new version of the ontology [2] was recently released and became a W3C Recommendation in October 2017. It is a joint contribution with the Open Geospatial Consortium (OGC), extending and improving the SSN ontology published in 2011.
  • The W3C Web of Things (WoT) Interest Group is designing a vocabulary to describe interactions between objects through the Web; a potential implementation is the WoT ontology [3]. At the time of writing, the WoT ontology is not aligned with the W3C SSN ontologies. Among several use cases, a healthcare scenario, "Remote health monitoring system" [4], has been designed.
  • oneM2M is an international standard for Machine-to-Machine (M2M) communication whose work includes the development of the oneM2M ontology [5]. It extends the European ETSI M2M standard. At the time of writing, oneM2M is not aligned with W3C SSN. The MyOntoSens ontology, based on SSN V1, is being standardized as a Technical Specification (TS) within the SmartBAN (Body Area Networks) Technical Committee of the ETSI standardization body [Nachabe et al. 2015]. This ontology is relevant for building health applications based on smart devices.
  • Smart Appliances REFerence (SAREF) [6] is a European standard supported by ETSI M2M and SmartM2M. It mainly covers the smart building application domain. The SAREF ontology was designed by reusing SSN and oneM2M.
  • Schema.org is a well-known schema catalogue for structuring data on Web pages to describe locations, persons, etc. An IoT extension of Schema.org [7] is planned; nothing concrete has been developed yet, but discussions are ongoing.
  • Haystack [8] is a project that aims to standardize semantic data models and web services. For instance, the Haystack Tagging Ontology, which employs the SSN V1 ontology, has been developed [9] [Charpenay et al. 2015].


It would be interesting to have methodologies that can answer questions such as:

  • What are the sensors designed within the models (e.g., Body Thermometer)?
  • What are the logical rules (IF THEN ELSE) designed within the models (e.g., if body temperature is greater than 38 degrees Celsius, then fever)?
  • What is the application domain of a model (e.g., healthcare)? This is useful when the ontology covers several domains (e.g., Ambient Assisted Living combines the smart home and healthcare domains).
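
To make the second question concrete, here is a minimal sketch of how an extracted IF-THEN rule could be represented once it has been pulled out of a model. The `Observation` structure, the sensor type string, and the unit code are hypothetical names chosen for illustration; only the 38 degrees Celsius fever example comes from the text above.

```python
from dataclasses import dataclass
from typing import Optional

# Threshold taken from the example in the text (body temperature > 38 degrees Celsius).
FEVER_THRESHOLD_CELSIUS = 38.0


@dataclass
class Observation:
    """A single sensor reading (hypothetical structure, for illustration only)."""
    sensor_type: str  # e.g. "BodyThermometer"
    value: float
    unit: str         # e.g. "Cel" for degrees Celsius (assumed unit code)


def infer_symptom(obs: Observation) -> Optional[str]:
    """Apply the rule: IF body temperature > 38 degrees Celsius THEN fever."""
    if obs.sensor_type == "BodyThermometer" and obs.unit == "Cel":
        if obs.value > FEVER_THRESHOLD_CELSIUS:
            return "Fever"
    return None  # the rule does not fire


print(infer_symptom(Observation("BodyThermometer", 38.6, "Cel")))  # Fever
print(infer_symptom(Observation("BodyThermometer", 36.9, "Cel")))  # None
```

In an actual ontology such a rule would more likely be encoded declaratively (for example as a rule attached to the model) rather than as application code; the function above only shows the kind of knowledge the challenge aims to extract.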

The purpose of this challenge is to automatically extract knowledge (e.g., the most common concepts and properties) from Knowledge Bases (e.g., datasets and/or models) that have already been designed and released on the Web. We will focus on KBs from standards and/or ontology-based WoT research projects applied to numerous domains. The challenge will demonstrate that complementary knowledge is constantly being redesigned in different communities.

This research challenge could be addressed with knowledge extraction technologies. However, most existing extraction techniques are applied to text from documents and social networks. The main novelty of this challenge is to apply web-based extraction techniques to the models employed to structure data. Indeed, data can be considered the new oil, but what is still neglected is the reuse of the models used to structure and/or link data (e.g., Linked Data) in order to ease knowledge extraction from data.

In this challenge, we suggest focusing on the healthcare domain, using health ontologies to build domain-specific WoT applications and for challenge evaluation purposes. Ideally, the solutions designed for the challenge could be applied to any other application domain.

Important Dates

Challenge paper submission: 12 January 2018

Challenge paper acceptance notification: 14 February 2018

Challenge test data published: 14 February 2018

Camera-ready version of authors' papers: to be announced

Proclamation of challenge winners: during the conference, 23-27 April 2018

Where: Lyon, France

Co-located with The Web Conference 2018 (WWW 2018): https://www2018.thewebconf.org/

Submission Guidelines

Please express your interest in the KE4WoT Challenge, and indicate which tasks you are interested in, by filling in this form:

https://docs.google.com/forms/d/e/1FAIpQLSeXBx7Y1QZ01FCnJbiLqiOpNJroS3cCNdR49XESukKc41S7_Q/viewform?usp=sf_link

Submission web page on EasyChair: https://easychair.org/conferences/?conf=ke4wotchallengewww2018

Papers are limited to a maximum of 6 pages and should follow the ACM format (see the WWW 2018 template).

Feel free to send any questions to: ke4wotchallenge-www2018@easychair.org, amelie.gyrard@emse.fr

Challenge Task 1: Exploiting the Web of Things Knowledge Base

To open the challenge to a larger audience and to complementary web communities, we have designed the following tasks, some of which can be addressed independently:

LOV4IoT is an ontology catalogue referencing almost 380 WoT research projects in various areas such as home automation, smart cities, smart agriculture, and healthcare. More information can be found on the LOV4IoT project page: http://lov4iot.appspot.com/. Other ontology catalogues can be employed in the same way (e.g., Ready4SmartCities, LOV, OpenSensingCity); we suggest the LOV4IoT dataset because the organizers can help with any requests during the challenge process. BioPortal is another ontology catalogue, dedicated to the healthcare domain.

This task can be split into two subtasks that can be addressed by different communities:

  • Task 1.1: Extracting the most popular terms and properties

Definition: Load a set of WoT-related ontologies and extract the most popular/important terms and properties. For example, when querying all ontologies from the healthcare domain and analyzing the most popular terms, we expect the results to include Body Temperature, Blood Pressure, etc. It would be interesting to have algorithms that answer questions such as: (1) What are the sensors designed within the ontology (e.g., Body Thermometer)? (2) What are the logical rules (IF THEN ELSE) designed within the ontology (e.g., if body temperature is greater than 38 degrees Celsius, then fever)? (3) What is the application domain of the ontology (e.g., healthcare)? This is useful when the ontology covers several domains (e.g., Ambient Assisted Living combines the smart home and healthcare domains).
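
As a rough illustration of the first step (loading an ontology and listing its concepts), the sketch below parses a small RDF/XML snippet with Python's standard library. The toy ontology, its `http://example.org/health#` namespace, and the helper names are assumptions made for this example. Real ontologies may declare classes in other ways (for instance via `rdf:Description` with `rdf:type`, or in Turtle), so a dedicated RDF library would be more robust in practice.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}

# Hypothetical toy ontology, not taken from LOV4IoT.
TOY_ONTOLOGY = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/health#BodyThermometer">
    <rdfs:label>Body Thermometer</rdfs:label>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/health#BloodPressureSensor"/>
  <owl:DatatypeProperty rdf:about="http://example.org/health#hasValue"/>
</rdf:RDF>
"""


def local_name(uri):
    """Return the fragment or last path segment of a URI."""
    return uri.rsplit("#", 1)[-1].rsplit("/", 1)[-1]


def list_classes(rdf_xml):
    """List (URI, display name) for each named owl:Class directly under rdf:RDF."""
    root = ET.fromstring(rdf_xml)
    about = "{%s}about" % NS["rdf"]
    classes = []
    for cls in root.findall("owl:Class", NS):
        uri = cls.get(about)
        if uri is None:
            continue  # skip anonymous classes
        label = cls.findtext("rdfs:label", default=None, namespaces=NS)
        # Fall back to the URI local name when no rdfs:label is provided.
        classes.append((uri, label or local_name(uri)))
    return classes


for uri, name in list_classes(TOY_ONTOLOGY):
    print(name, "->", uri)
```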

Input: A set of ontologies from LOV4IoT ontology catalogue: health ontologies.

 LOV4IoT Tutorial to get health ontologies: http://lov4iot.appspot.com/?p=queryHealthOntologiesWS (using a web service or a dump of ontologies)

Output: For each ontology, the 20 most relevant concepts and properties.

 Suggestion:
 OntoKhoj: a semantic web portal for ontology searching, ranking and classification [Patel et al. 2003] [10]
 Identifying potentially important concepts and relations in an ontology [Wu et al. 2008] [11]
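
As a naive starting point, and not the ranking methods of the papers suggested above, popularity can be approximated by counting in how many ontologies a normalized term appears. The input dictionary and the ontology names below are hypothetical; in practice the terms would come from the ontologies retrieved from LOV4IoT.

```python
import re
from collections import Counter


def normalize(term):
    """Split camelCase and separators, then lowercase:
    'BodyTemperature' and 'body_temperature' both become 'body temperature'."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", term)
    return re.sub(r"[\s_\-]+", " ", spaced).strip().lower()


def most_popular_terms(ontologies, top_n=20):
    """Rank terms by the number of ontologies that define them.

    ontologies: dict mapping an ontology name to a list of term names.
    """
    counts = Counter()
    for terms in ontologies.values():
        # Use a set so a term counts at most once per ontology.
        counts.update({normalize(t) for t in terms})
    return counts.most_common(top_n)


# Hypothetical input, for illustration only.
ontologies = {
    "onto_a": ["BodyTemperature", "BloodPressure", "HeartRate"],
    "onto_b": ["body_temperature", "Blood Pressure"],
    "onto_c": ["Body Temperature", "Glucose"],
}
print(most_popular_terms(ontologies, top_n=2))
# [('body temperature', 3), ('blood pressure', 2)]
```

This baseline ignores the ontology structure (hierarchy depth, number of relations, and so on), which is exactly the kind of signal more elaborate ranking approaches exploit.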

Impact: Such algorithms would reveal the most relevant concepts and properties in a set of domains. Ideally, the algorithm will be generic enough to be applied to any domain. Such algorithms would also help in creating iot.schema.org, for instance.

Audience: WoT/IoT and healthcare communities who want to discover and study already designed models, developers and/or data scientists wishing to compute statistics, and knowledge extraction experts.

Evaluation: For evaluation purposes, we will choose some of the health ontologies from the set mentioned above.

Genericity: To test the genericity of your algorithm, you can experiment with additional datasets:

  LOV4IoT Tutorial to get city ontologies: http://lov4iot.appspot.com/?p=queryCityOntologiesWS (using a web service or a dump of ontologies)
  LOV4IoT Tutorial to get Web of Things (WoT) ontologies: http://lov4iot.appspot.com/?p=queryWoTOntologiesWS (using a web service or a dump of ontologies)


  • Task 1.2: Ontology Matching algorithms and software.

Definition: Instead of using the OAEI benchmark, ontology matching experts could apply their algorithms to the LOV4IoT benchmark to align WoT-related ontologies, for instance by aligning all ontologies related to health.

Input: A set of ontologies: LOV4IoT ontology catalogue (see useful links above)

Output: Alignment for ontologies referenced within LOV4IoT

Impact: Ontology matching experts would observe that the referenced ontologies are not all structured in the same way. For instance, some ontologies may provide no labels or comments, which is a serious problem since most matching methods assume they are present. This would lead to the design of new ontology matching tools relevant for WoT.
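
The missing-label problem can be illustrated with a very simple string-based matcher. This is a minimal sketch, not an OAEI-grade matching system: it compares labels with `difflib.SequenceMatcher` and falls back to the URI local name when an ontology provides no label. The URIs, the input format (a dict from URI to label or `None`), and the 0.85 threshold are assumptions for illustration.

```python
from difflib import SequenceMatcher


def local_name(uri):
    """Return the fragment or last path segment of a URI."""
    return uri.rsplit("#", 1)[-1].rsplit("/", 1)[-1]


def align(source, target, threshold=0.85):
    """Propose at most one target concept per source concept.

    source, target: dict mapping a concept URI to its label, or None if unlabeled.
    Returns a list of (source_uri, target_uri, similarity) tuples.
    """
    matches = []
    for s_uri, s_label in source.items():
        # Fall back to the URI local name when the label is missing.
        s_text = (s_label or local_name(s_uri)).lower()
        best = None
        for t_uri, t_label in target.items():
            t_text = (t_label or local_name(t_uri)).lower()
            score = SequenceMatcher(None, s_text, t_text).ratio()
            if score >= threshold and (best is None or score > best[2]):
                best = (s_uri, t_uri, score)
        if best is not None:
            matches.append(best)
    return matches


# Hypothetical concepts; the second ontology provides no label for its first concept.
source = {
    "http://example.org/a#BodyTemperature": "Body Temperature",
    "http://example.org/a#Glucose": None,
}
target = {
    "http://example.org/b#body_temperature": None,
    "http://example.org/b#HeartRate": "Heart Rate",
}
for s_uri, t_uri, score in align(source, target):
    print(s_uri, "=", t_uri, round(score, 2))
```

A purely lexical matcher like this one fails as soon as local names are opaque identifiers, which is one reason structure-based or instance-based techniques may be needed for WoT ontologies.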

Audience: Ontology Matching Experts

Challenge Task 2: Creating a System for extracting named entities using ontologies and a Q/A System over it

To open the challenge to a larger audience and to complementary web communities, we have designed the following tasks, some of which can be addressed independently:

  • Task 2.1: Extracting named entities using ontologies

Definition: Create a system and submit the output results. Our system [12] will compare the results and generate the F-score.
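
The exact scoring procedure of the organizers' system [12] is not described here. As an assumption, the sketch below uses the standard balanced F-score (F1) over sets of extracted entities, where each entity is a hypothetical (document id, entity string) pair.

```python
def precision_recall_f1(predicted, gold):
    """Compute precision, recall, and balanced F-score between two collections."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold standard and system output.
gold = {("t1", "fever"), ("t1", "thermometer"), ("t2", "blood pressure")}
predicted = {("t1", "fever"), ("t2", "blood pressure"), ("t2", "pulse")}

p, r, f1 = precision_recall_f1(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
# precision=0.67 recall=0.67 f1=0.67
```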

Input: A domain-specific corpus of social media text (e.g., tweets).

Output: Named entities extracted using statistical methods and knowledge bases/ontologies.
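
The knowledge-base side of this task can be sketched as a dictionary (gazetteer) lookup of ontology labels in the text. This covers only the ontology-driven part, not the statistical methods; the `ex:` identifiers, the labels, and the example tweet are hypothetical.

```python
import re


def build_gazetteer(labels_by_uri):
    """Map each lowercased label to the URI of the concept it names."""
    return {label.lower(): uri for uri, label in labels_by_uri.items()}


def spot_entities(text, gazetteer):
    """Return sorted (start, end, uri) spans of gazetteer labels found in text."""
    found = []
    # Try longer labels first so that "blood pressure monitor"
    # takes precedence over the shorter "blood pressure".
    for label in sorted(gazetteer, key=len, reverse=True):
        pattern = r"\b" + re.escape(label) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            start, end = m.start(), m.end()
            overlaps = any(s < end and start < e for s, e, _ in found)
            if not overlaps:
                found.append((start, end, gazetteer[label]))
    return sorted(found)


# Hypothetical labels, as they might be extracted from health ontologies.
gazetteer = build_gazetteer({
    "ex:BloodPressure": "blood pressure",
    "ex:BloodPressureMonitor": "blood pressure monitor",
    "ex:Fever": "fever",
})
tweet = "Got a fever today, and my new Blood Pressure Monitor says 140/90."
for start, end, uri in spot_entities(tweet, gazetteer):
    print(tweet[start:end], "->", uri)
```

Exact label matching misses spelling variants and informal wording that are common in social media text, which is where the statistical methods mentioned above would come in.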

Impact: LOV4IoT is a valuable resource combining semantic-based projects relevant to IoT. This task encourages participating teams to develop an NLP system over this interlinked IoT vocabulary in order to identify named entities and linked entities across various semantics-related projects. Its output will be leveraged by a question answering system to provide better answers to user queries.

Audience: Natural Language Processing (NLP) Experts


  • Task 2.2: Q/A System

Definition: The entity recognition task (Task 2.1: Extracting named entities using ontologies) is combined with a relationship extraction task to generate triples from the data. These triples will be stored in RDF format (linked data). The Q/A system is expressed in terms of SPARQL queries. For instance, we will run a set of 20-30 SPARQL queries and generate a result set. The participants will run the same set of 20-30 SPARQL queries over their own RDF linked data and generate their results. They then submit their results, which we compare with ours to report the F-score for the task.

Input: The participants' own results from Task 2.1 (Extracting named entities using ontologies) and the queries provided by us.

Output: RDF Triples and the results from the given queries.
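
To illustrate the idea without depending on a particular RDF store, the sketch below keeps triples in a plain Python list and answers a single triple pattern, which corresponds roughly to a SPARQL query of the form `SELECT ?tweet WHERE { ?tweet ex:mentions ex:Fever }`. The `ex:` terms, the triples, and the reference answers are hypothetical; an actual submission would use real RDF data and the SPARQL queries provided by the organizers.

```python
def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts like a SPARQL variable."""
    return [
        (ts, tp, to)
        for ts, tp, to in triples
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    ]


# Hypothetical triples produced by entity and relationship extraction.
triples = [
    ("ex:tweet1", "ex:mentions", "ex:Fever"),
    ("ex:tweet2", "ex:mentions", "ex:BloodPressure"),
    ("ex:tweet3", "ex:mentions", "ex:Fever"),
    ("ex:Fever", "rdf:type", "ex:Symptom"),
]

# Which tweets mention fever?
answers = {s for s, _, _ in match(triples, p="ex:mentions", o="ex:Fever")}

# Compare with a (hypothetical) reference result set for the same query.
reference = {"ex:tweet1", "ex:tweet3", "ex:tweet4"}
print("missing:", sorted(reference - answers))  # missing: ['ex:tweet4']
print("extra:", sorted(answers - reference))    # extra: []
```

Comparing result sets per query in this way yields the true positives, false negatives, and false positives from which a per-query F-score can be derived.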

Impact: Question Answering (QA) systems have been a focal point of research in deep learning and linked open data. Such a system offers the user an interactive framework that exposes the expressive power of querying linked open data while keeping the complexity in the back end. There has been little research on developing QA systems over IoT linked open data, because IoT has so far had limited influence within the Semantic Web community. LOV4IoT provides a heterogeneous IoT data lake over which participants have to create a QA approach for gleaning relevant insights, using state-of-the-art approaches from the QA domain. This challenge will have an impact on the Semantic Web of Things community by providing a system that leverages natural language processing and IoT RDF data. These challenges aim to promote deep innovation in IoT-related research, leading to improvements in people's lives.

Audience: Natural Language Processing (NLP) Experts, Semantic Web Experts (RDF, RDFS, OWL, SPARQL)

Organisation

The challenge co-chairs:


Amelie Gyrard

Homepage: http://sensormeasurement.appspot.com/?p=AmelieGyrard

Institute: Univ Lyon, MINES Saint-Etienne, CNRS, Laboratoire Hubert Curien, Saint-Etienne, France

Amelie Gyrard is a postdoctoral researcher at Ecole des Mines de Saint-Etienne, France, working within the Connected Intelligence - Knowledge Representation and Reasoning team. Previously, she was a postdoctoral researcher at the Insight Center for Data Analytics, National University of Ireland, Galway, where she worked actively on the scientific development and coordination of the FIESTA-IoT (Federated Interoperable Semantic IoT/Cloud Testbeds and Applications) EU H2020 project. She has co-organized tutorials, workshops, and hackathons on topics related to the Semantic Web of Things. Her research interests are software engineering for the Semantic Web of Things and the Internet of Things (IoT), semantic web best practices and methodologies, ontology engineering, reasoning, and interoperability of IoT data. She received her Ph.D. from Eurecom in 2015, where she designed and implemented the Machine-to-Machine Measurement (M3) framework. The title of her dissertation is "Designing Cross-Domain Semantic Web of Things Applications". Her work is published in conferences, journals, and book chapters. She has also disseminated her work in standardization bodies such as ETSI M2M, oneM2M, and W3C Web of Things. She is also a reviewer for journals and conferences related to IoT and the Semantic Web.


Mihaela Juganaru-Mathieu

Homepage: https://www.linkedin.com/in/mihaela-juganaru-4893635/

Institute: Univ Lyon, MINES Saint-Etienne, CNRS, Laboratoire Hubert Curien, Saint-Etienne, France

Mihaela Juganaru-Mathieu is an Associate Professor in computer science and data science at Ecole des Mines de Saint-Etienne, France, in the Department of Computer Science and Intelligent Systems. Her current research concerns text mining and topic modeling. She has participated in various text mining challenges at the CLEF competitions, in author identification (PAN) and in mining huge structured text (INEX). She is also responsible for the interdisciplinary specialization "Big Data".


Manas Gaur

Homepage: http://knoesis.org/people/manas

Institute: Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Wright State University, USA

Manas Gaur is a Ph.D. student at the Kno.e.sis Ohio Center of Excellence in Knowledge-enabled Computing at Wright State University in Dayton, Ohio, within the Kno.e.sis Knowledge Graph Development and Social and Physical Sensing Enabled Decision Support teams. His research lies at the intersection of deep learning, text mining, and knowledge graphs, addressing challenges in biomedical and clinical natural language text. He was a Data Science for Social Good Fellow at the University of Chicago, working on improving patients' healthcare outcomes using semantics and machine learning. As a researcher at Kno.e.sis, he has used clinical and biomedical information in UMLS, BKR, and PubMed to create a healthcare knowledge base that can help in classifying musculoskeletal diseases and can be leveraged for analyzing forum and Twitter data. He has actively participated in various North American hackathons such as SteelHacks (University of Pittsburgh), HackDuke (Duke University), and BoilerMake (Purdue). Previously, he completed his graduate studies in Computer Science at Delhi College of Engineering, India, where he organized and co-organized various hackathons and tutorials on machine learning, data mining, and natural language processing. His work has been published in conferences, book chapters, and journals.


Swati Padhee

Homepage: http://knoesis.org/resources/researchers/swati

Institute: Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Wright State University, USA

Swati Padhee is a Graduate Research Assistant at the Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis). Her current research involves the semantic web, knowledge representation, dynamically evolving knowledge graphs, ontologies, information extraction, natural language processing, and machine learning. Publicly available knowledge bases do not capture the changing dynamics of events occurring in the real world, and they lack many domain-specific relationships between entities. Swati is working on capturing the dynamics of domain-specific raw data over time. She is also working on knowledge graph-based similarity measures for enhancing the semantic analysis of social data clustering. Prior to joining Kno.e.sis, she earned a Master's degree in Electrical Engineering in India. Her thesis involved predictive analysis to prevent electrical hazards using background knowledge for power systems. She organized robotics competitions and research seminars during her studies in India.


Amit Sheth

Homepage: http://knoesis.org/amit/

Institute: Kno.e.sis, Wright State University, USA

Amit Sheth is an educator, researcher, and entrepreneur. He is the LexisNexis Ohio Eminent Scholar, an IEEE Fellow, and the executive director of Kno.e.sis, the Ohio Center of Excellence in Knowledge-enabled Computing. Kno.e.sis' faculty and researchers are computer scientists, cognitive scientists, biomedical researchers, and clinicians. It has the largest US academic research group in the area of the Semantic Web and maintains a very high publication impact. Prof. Sheth is one of the top 100 computer scientists based on publication impact (h-index = 95). He has founded three companies by licensing his university research outcomes. He has organized over 75 international events (as General/Organization Committee/Steering Committee/Program Chair) and given over 35 tutorials. Examples of his relevant activities include (a) initiating and co-chairing the W3C Semantic Sensor Networking group, whose outcomes serve as the de facto international standard, and (b) serving as the IoT department editor for IEEE Intelligent Systems.

References

[Bakerally et al. 2016] N. Bakerally, O. Boissier, and A. Zimmermann. Smart city artifacts web portal. In International Semantic Web Conference, Springer, 2016.

[Charpenay et al. 2015] An Ontology Design Pattern for IoT Device Tagging Systems. In 5th International Conference on the Internet of Things (IoT), 2015. [13]

[Compton et al. 2012] M. Compton, P. Barnaghi, L. Bermudez, R. Garcia-Castro, O. Corcho, S. Cox, J. Graybeal, M. Hauswirth, C. Henson, A. Herzog, et al. The SSN ontology of the W3C Semantic Sensor Network Incubator Group. Web Semantics: Science, Services and Agents on the World Wide Web, 2012. http://www.w3.org/2005/Incubator/ssn/ssnx/ssn.

[Guha et al. 2016] R. V. Guha, D. Brickley, and S. Macbeth. Schema.org: Evolution of structured data on the web. Communications of the ACM, 2016.

[Gyrard et al. 2016] A. Gyrard, G. Atemezing, C. Bonnet, K. Boudaoud, and M. Serrano. Reusing and Unifying Background Knowledge for Internet of Things with LOV4IoT. In 4th International Conference on Future Internet of Things and Cloud (FiCloud). IEEE, 2016.

[Gyrard et al. 2016] A. Gyrard, C. Bonnet, K. Boudaoud, and M. Serrano. LOV4IoT: A second life for ontology-based domain knowledge to build Semantic Web of Things applications. In 4th International Conference on Future Internet of Things and Cloud (FiCloud). IEEE, 2016.

[Moreira et al. 2017] J. Moreira et al. Towards IoT platforms' integration: Semantic translations between W3C SSN and ETSI SAREF. In SIS-IoT: Semantic Interoperability and Standardization in the IoT Workshop at Semantics Conference, 2017.

[Nachabe et al. 2015] L. Nachabe, M. Girod-Genet, and B. El Hassan. Unified data model for wireless sensor network. IEEE Sensors Journal, 2015.

[Noy et al. 2009] N. F. Noy, N. H. Shah, P. L. Whetzel, B. Dai, M. Dorf, N. Griffith, C. Jonquet, D. L. Rubin, M.-A. Storey, C. G. Chute, et al. BioPortal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Research, 2009.

[Parsia et al. 2015] B. Parsia, N. Matentzoglu, R. S. Goncalves, B. Glimm, and A. Steigmiller. The OWL reasoner evaluation (ORE) 2015 competition report. Journal of Automated Reasoning, 2015.

[Parsia et al. 2016] B. Parsia, N. Matentzoglu, R. S. Goncalves, B. Glimm, and A. Steigmiller. The OWL reasoner evaluation (ORE) 2015 resources. In International Semantic Web Conference, Springer, 2016.

[Patel et al. 2003] C. Patel, K. Supekar, Y. Lee, and E. Park. OntoKhoj: a semantic web portal for ontology searching, ranking and classification. In Proceedings of the 5th ACM international workshop on Web information and data management, ACM, 2003.

[Paulheim 2017] H. Paulheim. Knowledge graph refinement: A survey of approaches and evaluation methods. Semantic Web, 2017.

[Poveda Villalon et al. 2014] M. Poveda Villalon, R. Garcia Castro, and A. Gomez-Perez. Building an ontology catalogue for smart cities. 2014.

[Sheth et al. 2017] A. Sheth, S. Perera, S. Wijeratne, and K. Thirunarayan. Knowledge will propel machine understanding of content: extrapolating from current examples. arXiv preprint arXiv:1707.05308, 2017.

[Vandenbussche et al. 2016] P.-Y. Vandenbussche, G. A. Atemezing, M. Poveda-Villalon, and B. Vatant. Linked Open Vocabularies (LOV): a gateway to reusable semantic vocabularies on the Web. Semantic Web Journal, 2016.

[Wu et al. 2008] G. Wu, J. Li, L. Feng, and K. Wang. Identifying potentially important concepts and relations in an ontology. In International Semantic Web Conference, Springer, 2008.

[Zhang et al. 2012] Y. Zhang, P. M. Duc, O. Corcho, and J.-P. Calbimonte. SRBench: a streaming RDF/SPARQL benchmark. In International Semantic Web Conference, Springer, 2012.