Discussion: Describing Learning Resources with schema.org

Date: 11:15, 8 August 2016

Venue: F.17. Colin Maclaurin Building, Heriot-Watt University

Lorna Johnstone is an MSc student conducting a project with Phil Barker. Her project has examined previous efforts at resource description and requirements analysis in order to identify a subset of schema.org that is adequate for learning resources, to demonstrate its use, and to evaluate its suitability.
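As a rough illustration of the kind of markup such a subset might produce (not taken from the project itself; the resource name and values below are invented), a learning resource can be described with the LRMI-derived properties that schema.org already includes, serialised as JSON-LD:

```python
import json

# A minimal schema.org/CreativeWork description of a learning resource,
# using LRMI-derived properties already in schema.org (learningResourceType,
# educationalUse, typicalAgeRange, timeRequired). All values are illustrative.
resource = {
    "@context": "http://schema.org/",
    "@type": "CreativeWork",
    "name": "Introduction to Linked Data",   # hypothetical resource
    "learningResourceType": "presentation",
    "educationalUse": "instruction",
    "typicalAgeRange": "18-",
    "timeRequired": "PT1H",                  # ISO 8601 duration: one hour
}

# Serialise for embedding in a web page as JSON-LD.
print(json.dumps(resource, indent=2))
```

Such a block could be embedded in a page inside a `<script type="application/ld+json">` element for search engines to pick up.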

Seminar: Theoretical Models of Decision Making in Ultimatum Game

Date: 11:15, 1 August 2016

Venue: F.17. Colin Maclaurin Building, Heriot-Watt University

Title: Theoretical Models of Decision Making in Ultimatum Game

Speaker: Tatiana V. Guy, Head of the Department of Adaptive Systems, Institute of Information Theory and Automation, Czech Academy of Sciences, Prague

Abstract: Decision-making (DM) is considered the most essential phase of a human volitional act, and according to traditional economic models humans could be replaced by “rational agents”. The predictions this assumption implies are clearly seen in the Ultimatum Game (UG) considered here.

In a short informal talk I will discuss (i) fairness as a cause of the deviations of UG responders’ behaviour from the predicted game-theoretic behaviour, and (ii) how the impact of the limited deliberation effort allocated by a human responder can be modelled in the multi-proposer UG.

Bio: Tatiana V. Guy is Head of the Department of Adaptive Systems at the Institute of Information Theory and Automation, Czech Academy of Sciences, Prague.

Her research interests include conceptual, theoretical and algorithmic aspects of multiple-participant decision-making (DM) problems in complex, dynamic and uncertain environments; descriptive DM under uncertainty; and nature-inspired patterns of cooperation.

Degrees: Dipl. Eng. – Polytechnic Institute, Kiev, USSR, 1991;

Ph.D. – Faculty of Electrical Engineering, Czech Technical University, Prague, 1999.

Seminar: Ontology-Driven Resource Description for Software Defined Wireless Networks

Date: 11:15, 20 June 2016

Venue: F.17. Colin Maclaurin Building, Heriot-Watt University

Title: Ontology-Driven Resource Description for Software Defined Wireless Networks

Speaker: Qianru Zhou, PhD student, Advanced Wireless Technologies (AWiTec) Lab, Electrical, Electronic and Computer Engineering, Heriot-Watt University

Abstract: The future management and control of wireless communication networks will rely on developing the most appropriate abstraction to represent the various network elements. A semantic network information modeling approach is required to provide this high-level abstraction and to enhance network programmability so that the network data can be mined. In this presentation, an ontology is built for software defined wireless networks (SDWNs), and the methodology for modeling network information based on the proposed ontology is illustrated in detail. By applying data mining to extract implicit and valuable information from the proposed SDWN information model, “Lost Silence”, a service that can recognize the pattern of a disaster and provide an early alert, is developed utilizing a real-life scenario.

Bio: Qianru Zhou received her Bachelor’s degree in Telecommunication Engineering from Shenzhen University, Guangdong, China, in 2009, and her MSc degree in Optical Engineering from Beijing University of Posts and Telecommunications, Beijing, China, in 2013. She worked as a System Programmer at Sanmina, Shenzhen, China, in 2014. Since January 2015 she has been a PhD student at Heriot-Watt University, Edinburgh, UK, under the supervision of Prof. Cheng-Xiang Wang and Prof. Stephen McLaughlin, both of Heriot-Watt University.

Schema.org Dataset Descriptions Meeting


On Monday 16 May, several interested individuals (including myself) from ELIXIR, bioCADDIE, Bioschemas and the W3C HCLS Community Profile met with representatives from Google involved in the schema.org activity on describing datasets.

Finding datasets, and understanding their content, is a challenging task for humans and currently not possible to automate. Schema.org is an initiative from the major web search engines to help with the discovery of web resources.

There are multiple parallel activities in the life sciences community working on developing ways to publish metadata about datasets. This is due to the wide variety of use cases that dataset descriptions need to satisfy, including data discovery, data citation, and provenance tracking. bioCADDIE has carried out an extensive analysis of use cases, mapping the data models from existing efforts. We agreed this could be used as a basis to improve the existing schema.org dataset type and come up with a new Bioschemas dataset specification.

The outcome of the meeting with Google was a decision to focus on the findability of datasets, with an emphasis on data citation. For data citation, the following key properties are important (as stated in the FORCE11 data citation principles):

[Image: the FORCE11 data citation properties, from https://www.force11.org/node/4771]

The next steps will be to develop some pilot projects to both publish and use dataset descriptions for discovery and citation.
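As a sketch of what such a pilot dataset description might look like (the dataset name, DOI and all other values below are invented placeholders, not real records), a schema.org/Dataset entry carrying the citation-relevant properties can be built and serialised as JSON-LD:

```python
import json

# A minimal schema.org/Dataset description carrying properties relevant to
# data citation: a persistent identifier, creator, publisher, date and
# version. Every concrete value here is a placeholder for illustration.
dataset = {
    "@context": "http://schema.org/",
    "@type": "Dataset",
    "name": "Example protein interaction dataset",        # hypothetical
    "description": "Placeholder description of the dataset contents.",
    "identifier": "https://doi.org/10.0000/example",      # placeholder DOI
    "creator": {"@type": "Person", "name": "A. Researcher"},
    "publisher": {"@type": "Organization", "name": "Example Repository"},
    "datePublished": "2016-05-16",
    "version": "1.0",
}

# Serialise for embedding in a landing page as JSON-LD.
print(json.dumps(dataset, indent=2))
```

A repository landing page exposing a block like this would let a search engine both discover the dataset and extract the elements needed to construct a citation.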

ELIXIR-UK members are heavily involved in Bioschemas.org and will be involved in these efforts as part of the ELIXIR interoperability platform activities in collaboration with NIH and ELIXIR-EBI.

My thanks go to Rafael Jimenez for his help with the preparation of this post which will also appear in the ELIXIR-UK newsletter.

Seminar: Azmi Hassan – 9 May 2016

Talk Title: Communication and Tracking Ontology Development for Civilians Earthquake Disaster Assistance

Presenter: Azmi Hassan

Abstract:
One of the most important components of recovery and speedy response during and immediately after an earthquake disaster is a communication and tracking system capable of discovering affected people and connecting them with their families, friends and communities, with first responders, and/or with supporting computational systems. Given the capabilities of current mobile technologies, we believe they can serve as smart earthquake-disaster aid tools to help people in this situation. Ontologies are becoming crucial for facilitating effective communication and coordination across different parties and domains when providing assistance during earthquake disasters, especially where the affected locations are remote, the affected population is large and centralised coordination is poor. Although several competing methodologies give guidelines on how an ontology may be built, there is no single right way of building one, and no standard Disaster Relief Ontology exists, although separate related ontologies may be combined to create an initial version. This talk discusses the ongoing development of an ontology for a Communication and Tracking System (CTS), based on existing related ontologies, that is intended to be used by mobile phone applications to support earthquake disaster relief in real time.

ISWC 2016 Deadlines Approaching


ISWC 2016 will be taking place in Kobe, Japan from 17-21 October. Tomorrow is the deadline for abstract submissions for ISWC, with full papers due on 30 April. There are three tracks for you to submit to:

  1. The Research Track: innovative and groundbreaking work at the intersection of semantics and the web.
  2. The Applications Track: benefits and challenges of applying semantic technologies. This track is accepting three different types of submissions: in-use applications, industry applications and emerging applications.
  3. The Resources Track: reusable resources like datasets, ontologies, benchmarks and tools are crucial for many research disciplines, and especially ours. Make sure you read the guidelines for describing a reusable resource.

To entice you to come to Kobe, Japan, there are three fantastic keynotes lined up:

  • Kathleen McKeown – Professor of Computer Science at Columbia University,
    Director of the Institute for Data Sciences and Engineering, and Director of the North East Big Data Hub.
  • Hiroaki Kitano – CEO of Sony Computer Science Laboratory and President of the Systems Biology Institute. A truly inspirational figure who has done everything from RoboCup to systems biology. He was even an invited artist at MoMA.
  • Chris Bizer – Professor at the University of Mannheim and Director of the Institute of Computer Science and Business Informatics there. If you’re in the Semantic Web community, you know the amazing work Chris has done. He really kicked the entire move toward Linked Data into high gear.

I am co-chairing the Resources Track with Marta Sabou. I hope to be able to welcome you to Kobe.

Thanks to Paul Groth as the text for this post is based on his post from a month ago.

Schema course extension progress update


I am chair of the Schema Course Extension W3C Community Group, which aims to develop an extension for schema.org concerning the discovery of any type of educational course. This progress update is cross-posted from there.

If the forming-storming-norming-performing model of group development still has any currency, then I am pretty sure that February was the “storming” phase. There was a lot of discussion, much of it around the modelling of the basic entities for describing courses and how they relate to core types in schema.org (the “Modelling Course” and “CourseOffering & Course, a new dawn?” threads). I am pleased to say that the discussion did its job, and we achieved some sort of consensus (norming) around modelling courses in two parts:

Course, a subtype of CreativeWork: A description of an educational course which may be offered as distinct instances at different times and places, or through different media or modes of study. An educational course is a sequence of one or more educational events and/or creative works which aims to build knowledge, competence or ability of learners.

CourseInstance, a subtype of Event: An instance of a Course offered at a specific time and place or through specific media or mode of study or to a specific section of students.

hasCourseInstance, a property of Course with expected range CourseInstance: An offering of the course at a specific time and place or through specific media or mode of study or to a specific section of students.

(see Modelling Course and CourseInstance on the group wiki)

This modelling, especially the subtyping from existing schema.org types, allows us to meet many of the requirements arising from the use cases quite simply. For example, the cost of a course instance can be provided using the offers property of schema.org/Event.
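To make the two-part model concrete, here is a sketch of a course marked up with the proposed types (the course name, dates and price are all invented for illustration), with the fee carried on the CourseInstance via the offers property inherited from schema.org/Event:

```python
import json

# A Course (proposed subtype of CreativeWork) with one CourseInstance
# (proposed subtype of Event), linked by the proposed hasCourseInstance
# property. The cost is expressed through the Event's existing "offers"
# property. All concrete values are illustrative.
course = {
    "@context": "http://schema.org/",
    "@type": "Course",
    "name": "Data Modelling 101",    # hypothetical course
    "description": "An introductory course on data modelling.",
    "hasCourseInstance": {
        "@type": "CourseInstance",
        "startDate": "2016-09-01",
        "location": "Heriot-Watt University, Edinburgh",
        "offers": {
            "@type": "Offer",
            "price": "200.00",
            "priceCurrency": "GBP",
        },
    },
}

# Serialise for embedding in a course page as JSON-LD.
print(json.dumps(course, indent=2))
```

Note how the same Course could list several hasCourseInstance values, one per presentation of the course, each with its own dates, location and offer.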

The wiki is working reasonably well as a place to record the outcomes of the discussion. Working from the outline use cases page you can see which requirements have pages; those pages that exist point to the relevant discussion threads on the mailing list and, where we have got this far, describe the current solution. The wiki is also the place to find examples for testing whether the proposed solution can be used to mark up real course information.

As well as the wiki, we have the proposal on GitHub, which can be used to build working test instances on appspot showing the proposed changes to the schema.org site.

The next phase of the work should see us performing: working through the requirements from the use cases and showing how they can be met. I think we should focus first on those that look easy to do with existing properties of schema.org/Event and schema.org/CreativeWork.

The FAIR Principles herald more open, transparent, and reusable scientific data


Today, March 15, 2016, the FAIR Guiding Principles for scientific data management and stewardship were formally published in the Nature Publishing Group journal Scientific Data. The problem the FAIR Principles address is the lack of widely shared, clearly articulated, and broadly applicable best practices around the publication of scientific data. While the history of scholarly publication in journals is long and well established, the same cannot be said of formal data publication. Yet data could be considered the primary output of scientific research, and its publication and reuse are necessary to ensure validity and reproducibility, and to drive further discoveries. The FAIR Principles address these needs by providing a precise and measurable set of qualities a good data publication should exhibit – qualities that ensure that the data is Findable, Accessible, Interoperable, and Reusable (FAIR).

The principles were formulated after a Lorentz Center workshop in January, 2014 where a diverse group of stakeholders, sharing an interest in scientific data publication and reuse, met to discuss the features required of contemporary scientific data publishing environments. The first-draft FAIR Principles were published on the Force11 website for evaluation and comment by the wider community – a process that lasted almost two years. This resulted in the clear, concise, broadly-supported principles that were published today. The principles support a wide range of new international initiatives, such as the European Open Science Cloud and the NIH Big Data to Knowledge (BD2K), by providing clear guidelines that help ensure all data and associated services in the emergent ‘Internet of Data’ will be Findable, Accessible, Interoperable and Reusable, not only by people, but notably also by machines.

The recognition that computers must be capable of accessing a data publication autonomously, unaided by their human operators, is core to the FAIR Principles. Computers are now an inseparable companion in every research endeavour. Contemporary scientific datasets are large, complex, and globally-distributed, making it almost impossible for humans to manually discover, integrate, inspect and interpret them. This (re)usability barrier has, until now, prevented us from maximizing the return-on-investment from the massive global financial support of big data research and development projects, especially in the life and health sciences. This wasteful barrier has not gone unnoticed by key agencies and regulatory bodies. As a result, rigorous data management stewardship – applicable to both human and computational “users” – will soon become a funded, core activity within modern research projects. In fact, FAIR-oriented data management activities will increasingly be made mandatory by public funding bodies.

The high level of abstraction of the FAIR Principles, sidestepping controversial issues such as the technology or approach used in the implementation, has already made them acceptable to a variety of research funding bodies and policymakers. Examples include FAIR Data workshops from EU-ELIXIR, inclusion of FAIR in the future plans of Horizon 2020, and advocacy from the American National Institutes of Health. As such, it seems assured that these principles will rapidly become a key basis for innovation in the global move towards Open Science environments. Therefore, the timing of the Principles publication is aligned with the Open Science Conference in April 2016.

With respect to Open Science, the FAIR Principles advocate being “intelligently open”, rather than “religiously open”. The Principles do not propose that all data should be freely available – in particular with respect to privacy-sensitive data. Rather, they propose that all data should be made available for reuse under clearly-defined conditions and licenses, available through a well-defined process, and with proper and complete acknowledgement and citation. This will allow much wider participation of players from, for instance, the biomedical domain and industry, where rigorous and transparent data usage conditions are a core requirement for data reuse.

“I am very proud that, just over two years after the meeting where we came up with the early FAIR Principles, they play such an important role in many forward-looking policy documents around the world, and that the authors on this paper are also in positions that allow them to follow these Principles. I sincerely hope that FAIR data will become a ‘given’ in the future of Open Science, in the Netherlands and globally”, says Barend Mons, Professor in Biosemantics at the Leiden University Medical Center.

Open PHACTS is dead, long live Open PHACTS!


I have spent the last five years working on the Open PHACTS project, which is sadly at an end. However, it is not the end of the Open PHACTS drug discovery platform. We have transitioned to a new era, with a foundation organisation running and developing the platform. The milestone was marked by the symbolic handover of the Open PHACTS flag (see photo: on the right, Barend Mons (Leiden University Medical Center) and Gerhard Ecker (University of Vienna) hand the flag to, on the left, Stefan Senger (GlaxoSmithKline), Derek Marren (Eli Lilly) and Herman van Vlijmen (Janssen Pharmaceutica)).

A nice summary of the closing symposium is available:

Linking Life Science Data: Design to Implementation, and Beyond

19 Feb, 2016 Open PHACTS project closing conference (Vienna, Austria)

On 18–19 February, 2016, we celebrated the completion of the Open PHACTS project with a conference at the University of Vienna, Austria. A total of 79 people attended to discuss the achievements of the Open PHACTS project, what they mean for the future of linked data, and how they can be carried forward.

Source: Linking Life Science Data: Design to Implementation, and Beyond – Open PHACTS Foundation