Open PHACTS is dead, long live Open PHACTS!

I have spent the last five years working on the Open PHACTS project, which is sadly at an end. However, it is not the end of the Open PHACTS drug discovery platform. We have transitioned to a new era in which a foundation organisation runs and develops the platform. The milestone was marked by the symbolic handover of the Open PHACTS flag (see photo: on the right, Barend Mons (Leiden Medical Center) and Gerhard Ecker (University of Vienna) hand the flag to, on the left, Stefan Senger (GlaxoSmithKline), Derek Marren (Eli Lilly), and Herman van Vlijmen (Janssen Pharmaceutica)).

A nice summary of the closing symposium is available:

Linking Life Science Data: Design to Implementation, and Beyond

19 Feb, 2016 Open PHACTS project closing conference (Vienna, Austria)

On 18–19 February, 2016, we celebrated the completion of the Open PHACTS project with a conference at the University of Vienna, Austria. A total of 79 people attended to discuss the achievements of the Open PHACTS project, what they mean for the future of linked data, and how they can be carried forward.

Source: Linking Life Science Data: Design to Implementation, and Beyond – Open PHACTS Foundation

 

Open PHACTS Closing Symposium

For the last 5 years I have had the pleasure of working with the Open PHACTS project. Sadly, the project is now at an end. To celebrate, we are having a two-day symposium to look over the contributions of the project and its future legacy.

The project has been hugely successful in developing an integrated data platform to enable drug discovery research (see a future post for details to support this claim). The result of the project is the Open PHACTS Foundation which will now own the drug discovery platform and sustain its development into the future.

Here are my slides on the state of the data in the Open PHACTS 2.0 platform.

Validata: An online tool for testing RDF data conformance

Validata is an online web application for validating an RDF document against a set of constraints. This is useful for data exchange applications or ensuring conformance of an RDF dataset against a community agreed standard. Constraints are expressed as a Shape Expression (ShEx) schema.
Validata extends the ShEx functionality to support multiple requirement levels. Validata can be repurposed for different deployments by providing it with a new ShEx schema.
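
As a rough illustration of what multiple requirement levels mean in practice, the toy checker below reports a missing property as an error or a warning depending on its level. It is not ShEx and not Validata's code: the constraint table, the level names (borrowed from RFC 2119), and the prefixed names in the data are all assumptions made for the sketch.

```python
# Hypothetical constraint table: (predicate, requirement level).
# Level names follow RFC 2119 keywords; real ShEx schema syntax is different.
CONSTRAINTS = [
    ("dct:title", "MUST"),
    ("dct:license", "SHOULD"),
    ("dct:description", "MAY"),
]


def validate(triples, subject):
    """Report missing predicates for one subject, grouped by severity."""
    present = {p for s, p, _ in triples if s == subject}
    report = {"errors": [], "warnings": []}
    for predicate, level in CONSTRAINTS:
        if predicate in present:
            continue
        if level == "MUST":
            report["errors"].append(predicate)
        elif level == "SHOULD":
            report["warnings"].append(predicate)
        # MAY: absence is acceptable, so nothing is reported
    return report


# Invented example data: a dataset with a title and description but no licence.
triples = [
    (":dataset", "dct:title", "Example dataset"),
    (":dataset", "dct:description", "A toy description"),
]
report = validate(triples, ":dataset")
print(report)
```

Swapping in a different constraint table changes what is checked without touching the checker, which is the same idea as redeploying the tool with a new schema.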

The Validata code is available from https://github.com/HW-SWeL/Validata. Existing deployments are available for:

Paper published at SWAT4LS 2015.

MACS Christmas Conference

I was asked to speak at the School (Faculty) of Mathematical and Computer Sciences (MACS) Christmas conference. I decided I would have some fun with the presentation.

Title: Project X

Abstract: For the last 11 months I have been working on a top secret project with a world renowned Scandinavian industry partner. We are now moving into the exciting operational phase of this project. I have been granted an early lifting of the embargo that has stopped me talking about this work up until now. I will talk about the data science behind this big data project and how semantic web technology has enabled the delivery of Project X.

You can find more details of flood defence work in this paper.

A short project on linking course data from Sharing and learning

During the summer my colleague Phil Barker (author of the Sharing and Learning blog) and I hosted a summer intern, Anna Grant.

Anna’s project was to investigate the feasibility of publishing the data about our courses as Linked Data. Phil subsequently wrote up a blog post about the work which I have been meaning to share for a long time, so here it is; long overdue.

Below I have picked out some quotes from Phil’s original blog post that describe the work that Anna did.

The objectives for Anna’s work were ambitious: survey existing HE [Higher Education] open data and ontologies in use; design an ontology that we can use; develop an interface we can use to create and publish our course data. Anna made great progress on all three fronts.

The ontologies reviewed were: AIISO, Teach, CourseWare, XCRI, MLO, ECIM and CEDS. A live working draft of the summary / review for these is available for comment as a Google Doc.

The final draft [of the extended MLO Ontology] is shown below. Key: Green = MLO, Purple = MLO extension, Blue = ECIM / previous alteration to MLO, Yellow = generic ontologies such as Dublin Core and SKOS.

MLO Extension to capture taught courses and their relationships to degree programmes.

Anna has finished her work here now and returns to Edinburgh Napier University to finish her Master’s project. Alasdair and I think she has done a really impressive job, not least considering she had no previous experience with RDF and semantic technologies. We’ve also found her a pleasure to work with and would like to thank her for her efforts on this project.

Source: A short project on linking course data | Sharing and learning

Crusade for Big Data Keynote

Today I gave the keynote presentation (slides below) at the Crusade for Big Data in the AAL domain workshop as part of the EU Ambient Assisted Living Forum. I gave an overview of the way that the Open PHACTS project has overcome various Big Data challenges to provide a production quality data integration platform that is being used to answer real pharmacology business questions.

The workshop then broke out into five breakout groups to discuss open challenges facing the AAL community that are posed by Big Data. The breakout groups were:

  1. Privacy and Ethics
  2. Business models for sustainability
  3. Data reuse and interoperability
  4. Data quality
  5. Feedback to the users

The organisers of the workshop (Femke Ongenae and Femke De Backere) will be sharing the outcomes of the brainstorming by proposing several working groups to focus on the issues in the area of AAL.

Open PHACTS wins European Linked Data Award

We are delighted to announce that Open PHACTS has been awarded first place in the Linked Open Data Award of the inaugural European Linked Data Contest (ELDC). An international jury of ambassadors from over 15 European countries elected Open PHACTS as the winner, judged by the following criteria:

Gerhard receiving ELDC award

  • Shows a high degree of innovation
  • triggers network effects
  • embraces open standards
  • proves technological matureness
  • shows great potential to be utilised in multiple domains
  • achieves a high degree of comprehensibility for the users

The ELDC has been established to recognise Europe’s crème de la crème of linked data and semantic web. Prizes are awarded to stories, products, projects or persons presenting novel and innovative projects, products and industry implementations involving linked data. The ELDC also aims to build a directory of the best European projects in the domains of linked data and the semantic web. Open PHACTS is honored to be chosen as the first winner of the ELDC’s Linked Open Data Award, and to be included in this directory.

Data Integration in a Big Data Context

Today I had the pleasure of visiting the Urban Big Data Centre (UBDC) to give a seminar on Data Integration in a Big Data context (slides below). The idea for the seminar came about due to my collaboration with Nick Bailey (Associate Director of the UBDC) in the Administrative Data Research Centre for Scotland (ADRC-S).

In the seminar I wanted to highlight the challenges of data integration that arise in a Big Data context and show examples from my past work that would be relevant to those in the UBDC. In the presentation, I argue that RDF provides a good approach for data integration but it does not solve the basic challenges of messy data and generating mappings between datasets. It does however lay these challenges bare on the table, as Frank van Harmelen highlighted in his SWAT4LS keynote in 2013.

The first use case is drawn from my work on the EU SemSorGrid4Env project where we were developing an integrated view for emergency response planning. The particular use case shown is that of coastal flooding on the south coast of England. Although this project finished in 2011, I am still involved with developing RDF and SPARQL continuous data extensions; see the W3C RDF Stream Processing Community Group for details.

The second use case is drawn from my work on the EU Open PHACTS project. I showed the approach we developed for supporting user-controlled views of the integrated data through Scientific Lenses. However, I also talked about the successes of the project and the fact that it is currently being actively used for pharmacology research and receiving over 20 million hits a month.
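
To give a feel for what a user-controlled view means, here is a toy sketch of the lens idea: each cross-dataset link carries the reason it was asserted, and a lens is simply the set of reasons a user accepts as equivalence. This is not the Open PHACTS implementation; every identifier, justification label, and lens name below is invented for illustration.

```python
# Hypothetical links between records in different datasets, each tagged
# with the reason it was asserted. All identifiers and labels are invented.
LINKS = [
    ("chembl:A", "chemspider:1", "same_structure"),
    ("chemspider:1", "drugbank:X", "same_structure"),
    ("chembl:A", "chembl:B", "same_parent_compound"),
]

# A lens names the justifications a user is willing to treat as equivalence.
LENSES = {
    "strict": {"same_structure"},
    "relaxed": {"same_structure", "same_parent_compound"},
}


def equivalents(start, lens):
    """Collect every identifier reachable from start via links the lens accepts."""
    accepted = LENSES[lens]
    seen = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for left, right, why in LINKS:
            if why not in accepted:
                continue
            # Links are treated as symmetric, so follow them in both directions.
            for a, b in ((left, right), (right, left)):
                if a == current and b not in seen:
                    seen.add(b)
                    frontier.append(b)
    return seen


strict_view = equivalents("chembl:A", "strict")
relaxed_view = equivalents("chembl:A", "relaxed")
print(sorted(strict_view))
print(sorted(relaxed_view))
```

The same underlying links yield a larger set of "equivalent" records under the relaxed lens than under the strict one, so the choice of lens, not the data, determines how much gets merged into a single view.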

I finished the talk with an overview of the Administrative Data Research Centre for Scotland (ADRC-S) and my work on linking birth, marriage, and death records. I am hoping that we can adopt the lenses approach together with incorporating feedback on the linkages from the researchers who will use the integrated views.

In the discussions following the talk, the notion of FAIR data came up. This is the idea that data should be Findable, Accessible, Interoperable, and Reusable by both humans and machines. RDF is one approach that could lead to this. The other area of discussion was around community initiatives for converting existing open datasets into an RDF format. I advocated adopting the approach followed by the Bio2RDF community who share the tasks of creating and maintaining such scripts for biological datasets. An important part of this jigsaw is tracking the provenance of the datasets, for which the W3C Health Care and Life Sciences Community Profile for Dataset Descriptions could be beneficial (there is nothing specific to the HCLS community in the profile).

W3C HCLS Dataset Descriptions Profile Published

After 3 years' hard work, countless telephone conferences, issues and drafts, the W3C Health Care and Life Sciences Community Group (HCLS) have finally published their community profile for describing datasets. The profile deals with different versions of a dataset, with each version being published in multiple formats. Below is the announcement from the W3C.

The Semantic Web Health Care and Life Sciences Interest Group has published a Group Note of Dataset Descriptions: HCLS Community Profile. Access to consistent, high-quality metadata is critical to finding, understanding, and reusing scientific data. This document describes a consensus among participating stakeholders in the Health Care and the Life Sciences domain on the description of datasets using the Resource Description Framework (RDF). This specification meets key functional requirements, reuses existing vocabularies to the extent that it is possible, and addresses elements of data description, versioning, provenance, discovery, exchange, query, and retrieval. Learn more about the Data Activity.
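
The structure described above, a dataset with several versions, each published in several formats, can be pictured as three nested levels. The sketch below models that shape with plain Python dataclasses; the class names, field names, and example URLs are assumptions chosen for readability, not the vocabulary terms the profile itself prescribes.

```python
from dataclasses import dataclass, field


@dataclass
class Distribution:
    """One concrete file of a given version, in one format."""
    file_format: str
    download_url: str


@dataclass
class Version:
    """One release of the dataset, available as several distributions."""
    label: str
    distributions: list = field(default_factory=list)


@dataclass
class DatasetSummary:
    """Version-independent description of the dataset."""
    title: str
    versions: list = field(default_factory=list)

    def formats_for(self, label):
        """List the formats in which a given version is published."""
        for version in self.versions:
            if version.label == label:
                return [d.file_format for d in version.distributions]
        return []


# Invented example: two versions, the second published in two formats.
summary = DatasetSummary(
    title="Example dataset",
    versions=[
        Version("1.0", [Distribution("text/turtle", "http://example.org/v1/data.ttl")]),
        Version("2.0", [
            Distribution("text/turtle", "http://example.org/v2/data.ttl"),
            Distribution("application/rdf+xml", "http://example.org/v2/data.rdf"),
        ]),
    ],
)
print(summary.formats_for("2.0"))
```

Keeping the version-independent description separate from each release means that metadata such as the title is stated once, while provenance and download details attach to the specific version or file they describe.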

 

 

SICSA Databases for the Environmental and Social Sciences

Today I attended the SICSA Databases for the Environmental and Social Sciences event hosted by Andy Cobley from the University of Dundee. I gave the below talk on the challenges of linking data.

Many areas of scientific discovery rely on combining data from multiple data sources. However, there are many challenges in linking data. This presentation highlights these challenges in the context of using Linked Data for environmental and social science databases.