Seminar: Semantic Alignment for Agent Interactions: making communication meaningful in open environments

Date: 11:15, 12 September 2016

Venue: F.17. Colin Maclaurin Building, Heriot-Watt University

Title: Semantic Alignment for Agent Interactions: making communication meaningful in open environments

Speaker: Paula Chocrón, Artificial Intelligence Research Institute (IIIA-CSIC)

Abstract: The fact that the meaning of words depends on the context in which they are used is evident to any speaker: if someone asks for chips in a cafeteria, she is unlikely to be expecting electronic circuits. In human dialogues this kind of semantic alignment happens constantly and has been extensively studied.

In this talk I will discuss how these ideas can also be applied to help achieve meaningful communication in artificial multi-agent systems, in which heterogeneous interlocutors will likely use different vocabularies. I will start by presenting a notion of context that is based on the formal specifications of the tasks performed by agents. I will then show how this context can be used by the agents to align their vocabularies dynamically, by learning mappings from the experience of previous interactions. In doing so, we will also rethink the traditional approach to semantic matching and its evaluation, tackling the following questions: What does it mean for agents to “understand each other”? When is an alignment good for a particular application? How can the interaction context help interoperability?

Bio: Paula Chocrón is a PhD student at the Artificial Intelligence Research Institute (IIIA-CSIC) in Barcelona, Spain. She is part of the ESSENCE Marie Curie ITN, which funds PhD projects on topics related to the evolution of shared semantics in artificial environments in different European institutes. Paula is currently interested in studying the relation between the fields of ontology matching and multi-agent communication.

HCLS Community Profile for Dataset Descriptions

My latest publication [1] describes the process followed in developing the W3C Health Care and Life Sciences Interest Group (HCLSIG) community profile for dataset descriptions, which was published last year. The diagram below provides a summary of the data model for describing datasets, which covers 61 metadata terms drawn from 18 vocabularies.

Figure: Overview of the HCLS Community Profile for Dataset Descriptions

[1] M. Dumontier, A. J. G. Gray, S. M. Marshall, V. Alexiev, P. Ansell, G. Bader, J. Baran, J. T. Bolleman, A. Callahan, J. Cruz-Toledo, P. Gaudet, E. A. Gombocz, A. N. Gonzalez-Beltran, P. Groth, M. Haendel, M. Ito, S. Jupp, N. Juty, T. Katayama, N. Kobayashi, K. Krishnaswami, C. Laibe, N. Le Novère, S. Lin, J. Malone, M. Miller, C. J. Mungall, L. Rietveld, S. M. Wimalaratne, and A. Yamaguchi, “The health care and life sciences community profile for dataset descriptions,” PeerJ, vol. 4, p. e2331, 2016.
[Bibtex]
@article{Dumontier2016HCLS,
abstract = {Access to consistent, high-quality metadata is critical to finding, understanding, and reusing scientific data. However, while there are many relevant vocabularies for the annotation of a dataset, none sufficiently captures all the necessary metadata. This prevents uniform indexing and querying of dataset repositories. Towards providing a practical guide for producing a high quality description of biomedical datasets, the {W3C} Semantic Web for Health Care and the Life Sciences Interest Group ({HCLSIG}) identified Resource Description Framework ({RDF}) vocabularies that could be used to specify common metadata elements and their value sets. The resulting guideline covers elements of description, identification, attribution, versioning, provenance, and content summarization. This guideline reuses existing vocabularies, and is intended to meet key functional requirements including indexing, discovery, exchange, query, and retrieval of datasets, thereby enabling the publication of {FAIR} data. The resulting metadata profile is generic and could be used by other domains with an interest in providing machine readable descriptions of versioned datasets.},
author = {Dumontier, Michel and Gray, Alasdair J.G. and Marshall, M Scott and Alexiev, Vladimir and Ansell, Peter and Bader, Gary and Baran, Joachim and Bolleman, Jerven T and Callahan, Alison and Cruz-Toledo, Jos{\'{e}} and Gaudet, Pascale and Gombocz, Erich A and Gonzalez-Beltran, Alejandra N. and Groth, Paul and Haendel, Melissa and Ito, Maori and Jupp, Simon and Juty, Nick and Katayama, Toshiaki and Kobayashi, Norio and Krishnaswami, Kalpana and Laibe, Camille and {Le Nov{\`{e}}re}, Nicolas and Lin, Simon and Malone, James and Miller, Michael and Mungall, Christopher J and Rietveld, Laurens and Wimalaratne, Sarala M and Yamaguchi, Atsuko},
doi = {10.7717/peerj.2331},
issn = {2167-8359},
journal = {PeerJ},
month = aug,
title = {The health care and life sciences community profile for dataset descriptions},
volume = {4},
pages = {e2331},
year = {2016},
url = {https://peerj.com/articles/2331/}
}

Discussion: Describing Learning Resources with schema.org

Date: 11:15, 8 August 2016

Venue: F.17. Colin Maclaurin Building, Heriot-Watt University

Lorna Johnstone is an MSc student conducting a project with Phil Barker. Her project has been examining previous efforts at resource description and requirements analysis to identify a subset of schema.org that is adequate for learning resources, demonstrating its use and evaluating its suitability.

Seminar: Theoretical Models of Decision Making in Ultimatum Game

Date: 11:15, 1 August 2016

Venue: F.17. Colin Maclaurin Building, Heriot-Watt University

Title: Theoretical Models of Decision Making in Ultimatum Game

Speaker: Tatiana V. Guy, Head of Department of Adaptive Systems, Institute of Information Theory and Automation, Czech Academy of Sciences, Prague

Abstract: Decision-making (DM) is considered the most essential phase in a human volitional act, and according to traditional economic models humans could be replaced by “rational agents”. The predictions implied by this view are well illustrated by the Ultimatum Game (UG).

In a short informal talk I will discuss i) fairness as a cause of the deviations of UG responders’ behaviour from the game-theoretical predictions and ii) how the impact of the limited deliberation effort allocated by a human responder can be modelled in a multi-proposer UG.
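As background for readers new to the Ultimatum Game, the contrast between the "rational agent" prediction and a fairness-sensitive responder can be sketched in a few lines. This is only an illustration and not the speaker's model: the function names, the pot of 10 units and the 30% acceptance threshold are all assumptions made for the example.

```python
def rational_responder(offer: float) -> bool:
    """Textbook 'rational agent': any strictly positive offer beats nothing."""
    return offer > 0


def fairness_responder(offer: float, total: float, min_share: float = 0.3) -> bool:
    """Hypothetical fairness-sensitive responder.

    Accepts only if the offered share of the pot reaches min_share.
    The 0.3 default is an illustrative assumption, not an empirical value.
    """
    return offer / total >= min_share


def play(offer: float, total: float, responder) -> tuple:
    """Play one round and return (proposer_payoff, responder_payoff).

    If the responder rejects, both players receive nothing.
    """
    if responder(offer):
        return (total - offer, offer)
    return (0, 0)


total = 10      # size of the pot (assumed)
low_offer = 1   # a very uneven split

# The rational responder accepts the low offer.
print(play(low_offer, total, rational_responder))

# The fairness-sensitive responder rejects it, so both get nothing.
print(play(low_offer, total, lambda o: fairness_responder(o, total)))
```

Under these assumptions the low offer is accepted by the first responder and rejected by the second, which is the kind of deviation from the game-theoretical prediction that the abstract refers to.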

Bio: Tatiana V. Guy is Head of Department of Adaptive Systems, Institute of Information Theory and Automation, Czech Academy of Sciences, Prague.

Research interests include conceptual, theoretical and algorithmic aspects of multiple-participant decision-making (DM) problems in complex, dynamic and uncertain environments; descriptive DM under uncertainty; nature-inspired patterns of cooperation.

Degrees: Dipl. Eng. – Polytechnic Institute, Kiev, USSR, 1991;

Ph.D. – Faculty of Electrical Engineering, Czech Technical University, Prague, 1999.

Seminar: Ontology-Driven Resource Description for Software Defined Wireless Networks

Date: 11:15, 20 June 2016

Venue: F.17. Colin Maclaurin Building, Heriot-Watt University

Title: Ontology-Driven Resource Description for Software Defined Wireless Networks

Speaker: Qianru Zhou, PhD student, Advanced Wireless Technologies (AWiTec) Lab, Electrical, Electronic and Computer Engineering, Heriot-Watt University

Abstract: The future management and control of wireless communication networks will rely on developing the most appropriate abstraction to represent various network elements. In order to provide high-level abstraction and enhance network programmability to mine these data, a semantic-based network information modeling approach is required. In this presentation, an ontology is built for software defined wireless networks (SDWNs), and the methodology for modeling network information based on the proposed ontology is illustrated in detail. By applying data mining to extract implicit and valuable information from the proposed SDWN information model, “Lost Silence”, which can recognize the pattern of a disaster and provide an early alert service, is developed using a real-life scenario.

Bio: Qianru Zhou received her Bachelor's degree in Telecommunication Engineering from Shenzhen University, Guangdong, China, in 2009, and her MSc degree in Optical Engineering from Beijing University of Posts and Telecommunications, Beijing, China, in 2013. She worked as a System Programmer at Sanmina, Shenzhen, China, in 2014. Since January 2015 she has been a PhD student at Heriot-Watt University, Edinburgh, UK, under the supervision of Prof. Cheng-Xiang Wang and Prof. Stephen McLaughlin, both of Heriot-Watt University.

Schema.org Dataset Descriptions Meeting

On Monday 16 May, several interested individuals (including myself) from ELIXIR, bioCADDIE, Bioschemas and the W3C HCLS Community Profile met with representatives from Google involved in the schema.org activity on describing datasets.

Finding datasets, and understanding their content, is a challenging task for humans and currently not possible to automate. Schema.org is an initiative from the major web search engines to help with the discovery of web resources.

There are multiple parallel activities in the life sciences community working on developing ways to publish metadata about datasets. This is due to the wide variety of use cases that dataset descriptions need to satisfy, including data discovery, data citation, and provenance tracking. bioCADDIE has worked on an extensive analysis of use cases, mapping data models from existing efforts. We agreed this could be used as a basis to improve the existing schema.org dataset type and come up with a new Bioschemas dataset specification.
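For readers who have not seen schema.org markup, a dataset description of the kind discussed here might look roughly like the following sketch, which builds a JSON-LD record for the schema.org Dataset type. The property names used (name, description, url, license, creator, datePublished) are existing schema.org terms; the helper function and all the values are hypothetical, and this is not the Bioschemas specification, which had yet to be drafted.

```python
import json


def dataset_jsonld(name, description, url, license_url, creator_name, date_published):
    """Build a minimal schema.org Dataset description as a JSON-LD dictionary.

    The choice of properties is illustrative only; it does not represent
    an agreed profile.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "license": license_url,
        "creator": {"@type": "Organization", "name": creator_name},
        "datePublished": date_published,
    }


# All values below are made up for the example.
example = dataset_jsonld(
    name="Example protein interaction dataset",
    description="A hypothetical dataset used to illustrate the markup.",
    url="https://example.org/datasets/demo",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    creator_name="Example Lab",
    date_published="2016-05-16",
)

# Serialised form, as it could be embedded in a web page inside a
# <script type="application/ld+json"> element.
print(json.dumps(example, indent=2))
```

Embedding such a record in a dataset's landing page is what lets a search engine pick up the title, licence and creator without human help.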

The outcome of the meeting with Google was to focus on the findability of datasets, with an emphasis on data citation. For data citation the following key properties are important (as stated in the FORCE11 data citation principles).

Figure: key properties for data citation

Image from https://www.force11.org/node/4771

The next steps will be to develop some pilot projects to both publish and use dataset descriptions for discovery and citation.

ELIXIR-UK members are heavily involved in Bioschemas.org and will be involved in these efforts as part of the ELIXIR interoperability platform activities in collaboration with NIH and ELIXIR-EBI.

My thanks go to Rafael Jimenez for his help with the preparation of this post which will also appear in the ELIXIR-UK newsletter.

Seminar: Azmi Hassan – 9 May 2016

Talk Title: Communication and Tracking Ontology Development for Civilians Earthquake Disaster Assistance

Presenter: Azmi Hassan

Abstract:
One of the most important components of recovery and speedy response during and immediately after an earthquake disaster is a communication and tracking capability that can discover affected people and connect them with their families, friends, and communities, with first responders, and/or with supporting computational systems. With the capabilities of current mobile technologies, we believe that a smart earthquake disaster tool can help people in this situation. Ontologies are becoming crucial for facilitating effective communication and coordination across different parties and domains in providing assistance during earthquake disasters, especially where affected locations are remote, the affected population is large, and centralized coordination is poor. Several competing methodologies give guidelines on how an ontology may be built, but there is no single right way of building an ontology and no standard Disaster Relief Ontology exists, although separate related ontologies may be combined to create an initial version. This article discusses the ongoing development of an ontology for a Communication and Tracking System (CTS), based on existing related ontologies, that is intended to be used by mobile phone applications to support earthquake disaster relief in real time.

ISWC 2016 Deadlines Approaching

ISWC 2016 will be taking place in Kobe, Japan from 17-21 October. Tomorrow is the deadline for abstract submissions for ISWC, with full papers due on 30 April. There are three tracks for you to submit to:

  1. The Research Track: innovative and groundbreaking work on the cross between semantics and the web.
  2. The Applications Track: benefits and challenges of applying semantic technologies. This track is accepting three different types of submissions: in-use applications, industry applications and emerging applications.
  3. The Resources Track: reusable resources like datasets, ontologies, benchmarks and tools are crucial for many research disciplines and especially ours. Make sure you read the guidelines for describing a reusable resource.

To entice you to come to Kobe, Japan, there are three fantastic keynotes lined up:

  • Kathleen McKeown – Professor of Computer Science at Columbia University, Director of the Institute for Data Sciences and Engineering, and Director of the North East Big Data Hub.
  • Hiroaki Kitano – CEO of Sony Computer Science Laboratory and President of the Systems Biology Institute. A truly inspirational figure who has done everything from RoboCup to systems biology. He was even an invited artist at MoMA.
  • Chris Bizer – Professor at the University of Mannheim and Director of the Institute of Computer Science and Business Informatics there. If you’re in the Semantic Web community – you know the amazing work Chris has done. He really kicked the entire move toward Linked Data into high gear.

I am co-chairing the Resources Track with Marta Sabou. I hope to be able to welcome you to Kobe.

Thanks to Paul Groth as the text for this post is based on his post from a month ago.

The FAIR Principles herald more open, transparent, and reusable scientific data

Today, March 15 2016, the FAIR Guiding Principles for scientific data management and stewardship were formally published in the Nature Publishing Group journal Scientific Data. The problem the FAIR Principles address is the lack of widely shared, clearly articulated, and broadly applicable best practices around the publication of scientific data. While the history of scholarly publication in journals is long and well established, the same cannot be said of formal data publication. Yet, data could be considered the primary output of scientific research, and its publication and reuse are necessary to ensure validity and reproducibility, and to drive further discoveries. The FAIR Principles address these needs by providing a precise and measurable set of qualities a good data publication should exhibit – qualities that ensure that the data is Findable, Accessible, Interoperable, and Reusable (FAIR).

The principles were formulated after a Lorentz Center workshop in January 2014, where a diverse group of stakeholders, sharing an interest in scientific data publication and reuse, met to discuss the features required of contemporary scientific data publishing environments. The first-draft FAIR Principles were published on the Force11 website for evaluation and comment by the wider community – a process that lasted almost two years. This resulted in the clear, concise, broadly-supported principles that were published today. The principles support a wide range of new international initiatives, such as the European Open Science Cloud and the NIH Big Data to Knowledge (BD2K) initiative, by providing clear guidelines that help ensure all data and associated services in the emergent ‘Internet of Data’ will be Findable, Accessible, Interoperable and Reusable, not only by people, but notably also by machines.

The recognition that computers must be capable of accessing a data publication autonomously, unaided by their human operators, is core to the FAIR Principles. Computers are now an inseparable companion in every research endeavour. Contemporary scientific datasets are large, complex, and globally-distributed, making it almost impossible for humans to manually discover, integrate, inspect and interpret them. This (re)usability barrier has, until now, prevented us from maximizing the return-on-investment from the massive global financial support of big data research and development projects, especially in the life and health sciences. This wasteful barrier has not gone unnoticed by key agencies and regulatory bodies. As a result, rigorous data management stewardship – applicable to both human and computational “users” – will soon become a funded, core activity within modern research projects. In fact, FAIR-oriented data management activities will increasingly be made mandatory by public funding bodies.

The high level of abstraction of the FAIR Principles, sidestepping controversial issues such as the technology or approach used in the implementation, has already made them acceptable to a variety of research funding bodies and policymakers. Examples include FAIR Data workshops from EU-ELIXIR, inclusion of FAIR in the future plans of Horizon 2020, and advocacy from the American National Institutes of Health. As such, it seems assured that these principles will rapidly become a key basis for innovation in the global move towards Open Science environments. Therefore, the timing of the Principles publication is aligned with the Open Science Conference in April 2016.

With respect to Open Science, the FAIR Principles advocate being “intelligently open”, rather than “religiously open”. The Principles do not propose that all data should be freely available – in particular with respect to privacy-sensitive data. Rather, they propose that all data should be made available for reuse under clearly-defined conditions and licenses, available through a well-defined process, and with proper and complete acknowledgement and citation. This will allow much wider participation of players from, for instance, the biomedical domain and industry where rigorous and transparent data usage conditions are a core requirement for data reuse.

“I am very proud that, just over two years after the meeting where we came up with the early FAIR Principles, they play such an important role in many forward-looking policy documents around the world, and that the authors on this paper are also in positions that allow them to follow these Principles. I sincerely hope that FAIR data will become a ‘given’ in the future of Open Science, in the Netherlands and globally”, says Barend Mons, Professor in Biosemantics at the Leiden University Medical Center.