SLiDInG 6

Today, the Semantic Web Lab hosted the 6th Scottish Linked Data Interest Group workshop at Heriot-Watt University, sponsored by the SICSA Data Science Theme. The event was well attended, with 30 researchers from across Scotland (and Newcastle) coming together for a day of flash talks and discussions. Live minutes were captured during the day and can be found here.

I gave a talk on the successes and challenges of FAIR data. My slides are embedded below.

UK Ontology Network 2018

This week I went to the UK Ontology Network meeting hosted at Keele University. There was an interesting array of talks in the programme showing the breadth of work going on in the UK.

I gave a talk on the Bioschemas Community (slides below) and Leyla Garcia presented a poster providing more details of the current Bioschemas Profiles.

The UK Ontology Network is going through a reflection phase and would like interested parties to complete the following online survey.

Bioschemas Samples Hackathon

Last week the Bioschemas Community hosted a workshop. The focus of the meeting was to get web resources describing biological samples to embed Schema.org markup in their pages. The embedded markup will make the web resources more discoverable, and therefore the biological samples too.
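
To make the idea of embedded markup concrete, here is a minimal Python sketch that generates a JSON-LD script element for a sample page. It is only an illustration: the function name `sample_jsonld` and the example values are made up, and the generic Schema.org type `Thing` is used as a stand-in, since the type and properties a real sample page should use are whatever the Bioschemas community agrees on.

```python
import json


def sample_jsonld(name: str, identifier: str, url: str, description: str) -> str:
    """Return a <script> element carrying Schema.org JSON-LD for one sample page."""
    markup = {
        "@context": "https://schema.org",
        # Generic stand-in type (assumption); a real page would use the type
        # chosen by the relevant Bioschemas specification.
        "@type": "Thing",
        "name": name,
        "identifier": identifier,
        "url": url,
        "description": description,
    }
    body = json.dumps(markup, indent=2)
    # "\/" is a valid JSON escape for "/", so this keeps a stray "</script>"
    # inside a value from closing the element early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


if __name__ == "__main__":
    # Hypothetical sample record, for illustration only.
    print(sample_jsonld(
        name="Example sample",
        identifier="EX:0001",
        url="https://example.org/samples/0001",
        description="Hypothetical sample record",
    ))
```

The resulting element would be placed in the HTML of the page describing the sample, where crawlers that understand Schema.org can pick it up.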

I was not able to attend the event but Justin Clark-Casey has written this blog post summarising the event.

ISWC2017 Papers

I have had two papers accepted within the events that make up ISWC2017.

My PhD student Qianru Zhou has been working on using RDF stream processing to detect anomalous events through telecommunication network messages. Our paper, which will be presented at the Web Stream Processing workshop, focuses on the scenario of detecting a disaster such as the capsizing of the Eastern Star on the Yangtze River [1].
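
As a rough illustration of window-based detection over a stream, here is a small Python sketch. It is not the algorithm from the paper and does not use RDF: the event shape (a timestamp and a phone id for each "signal lost" message), the function name `detect_bursts`, and the threshold rule are all assumptions made for the example. It raises an alarm when the number of distinct phones lost within a trailing time window reaches a threshold.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

# Hypothetical event shape: (timestamp in seconds, phone id) for a "signal lost" message.
Event = Tuple[float, str]


def detect_bursts(events: Iterable[Event], window: float, threshold: int) -> Iterator[float]:
    """Yield the timestamp at which the count of distinct phones lost within the
    trailing `window` seconds reaches `threshold`.

    Events are assumed to arrive in timestamp order. One alarm is raised per
    burst; the detector re-arms once the count drops below the threshold.
    """
    recent: Deque[Event] = deque()
    alarmed = False
    for ts, phone in events:
        recent.append((ts, phone))
        # Evict events that have fallen out of the trailing window.
        while recent and recent[0][0] <= ts - window:
            recent.popleft()
        distinct = len({p for _, p in recent})
        if distinct >= threshold:
            if not alarmed:
                alarmed = True
                yield ts
        else:
            alarmed = False


if __name__ == "__main__":
    stream = [(0.0, "a"), (10.0, "b"), (20.0, "c"), (20.5, "d"), (20.9, "e"), (21.0, "f")]
    for alarm_time in detect_bursts(stream, window=2.0, threshold=3):
        print(f"possible incident at t={alarm_time}")
```

The window size matters here in the same way the abstract suggests: too small and a genuine burst is split across windows, too large and unrelated losses are counted together.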

The second paper is a poster in the main conference that provides an overview of the Bioschemas project, in which we are identifying the Schema.org markup that is of primary importance for life science resources. Hopefully the paper title will pull the punters in for the session [2].

[1] Qianru Zhou, Stephen McLaughlin, Alasdair J. G. Gray, Shangbin Wu, and Chengxiang Wang. Lost Silence: An emergency response early detection service through continuous processing of telecommunication data streams. In Web Stream Processing 2017, 2017.
[Bibtex]
@InProceedings{ZhouEtal2017:LostSilence:WSP2017,
abstract = {Early detection of significant traumatic events, e.g. terrorist events, ship capsizes, is important to ensure that a prompt emergency response can occur. In the modern world telecommunication systems can and do play a key role in ensuring a successful emergency response by detecting such incidents through significant changes in calls and access to the networks. In this paper a methodology is illustrated to detect such incidents immediately (with the delay in the order of milliseconds), by processing semantically annotated streams of data in cellular telecommunication systems. In our methodology, live information of phones' positions and status are encoded as RDF streams. We propose an algorithm that processes streams of RDF annotated telecommunication data to detect abnormality. Our approach is exemplified in the context of capsize of a passenger cruise ship but is readily translatable to other incidents. Our evaluation results show that with properly chosen window size, such incidents can be detected effectively.},
author = {Qianru Zhou and Stephen McLaughlin and Alasdair J G Gray and Shangbin Wu and Chengxiang Wang},
title = {Lost Silence: An emergency response early detection service through continuous processing of telecommunication data streams},
booktitle = {Web Stream Processing 2017},
year = {2017}
}
[2] Alasdair J. G. Gray, Carole Goble, Rafael C. Jimenez, and The Bioschemas Community. Bioschemas: From Potato Salad to Protein Annotation. In ISWC 2017 Poster Proceedings, Vienna, Austria, 2017. Poster
[Bibtex]
@InProceedings{grayetal2017:bioschemas:iswc2017,
abstract = {The life sciences have a wealth of data resources with a wide range of overlapping content. Key repositories, such as UniProt for protein data or Entrez Gene for gene data, are well known and their content easily discovered through search engines. However, there is a long-tail of bespoke datasets with important content that are not so prominent in search results. Building on the success of Schema.org for making a wide range of structured web content more discoverable and interpretable, e.g. food recipes, the Bioschemas community (http://bioschemas.org) aim to make life sciences datasets more findable by encouraging data providers to embed Schema.org markup in their resources.},
author = {Alasdair J G Gray and Carole Goble and Rafael C Jimenez and {The Bioschemas Community}},
title = {Bioschemas: From Potato Salad to Protein Annotation},
booktitle = {ISWC 2017 Poster Proceedings},
year = {2017},
address = {Vienna, Austria},
note = {Poster}
}

Supporting Dataset Descriptions in the Life Sciences

Seminar talk given at the EBI on 5 April 2017.

Abstract: Machine-processable descriptions of datasets can help make data more FAIR; that is, Findable, Accessible, Interoperable, and Reusable. However, there are a variety of metadata profiles for describing datasets, some specific to the life sciences and others more generic in their focus. Each profile has its own set of properties and requirements as to which must be provided and which are more optional. Developing a dataset description for a given dataset to conform to a specific metadata profile is a challenging process.

In this talk, I will give an overview of some of the dataset description specifications that are available. I will discuss the difficulties in writing a dataset description that conforms to a profile and the tooling that I’ve developed to support dataset publishers in creating metadata descriptions and validating them against a chosen specification.
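
To give a flavour of what checking a description against a profile involves, here is a minimal Python sketch. It is not the tooling discussed in the talk: the `Profile` class, the `validate` function, and the property names are hypothetical, and a real profile would also constrain value types and cardinalities rather than just the presence of properties.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set


@dataclass
class Profile:
    """A hypothetical metadata profile: which properties must or should be present."""
    required: Set[str]
    recommended: Set[str] = field(default_factory=set)


def validate(description: Dict[str, Any], profile: Profile) -> Dict[str, Any]:
    """Report which required and recommended properties a description lacks."""
    # Treat empty values as absent so that placeholders do not count as provided.
    present = {key for key, value in description.items() if value not in (None, "", [])}
    return {
        "valid": profile.required <= present,
        "missing_required": sorted(profile.required - present),
        "missing_recommended": sorted(profile.recommended - present),
    }


if __name__ == "__main__":
    # Property names below are illustrative, not taken from any particular specification.
    profile = Profile(required={"title", "description", "license"}, recommended={"keywords"})
    description = {"title": "Example dataset", "description": "", "license": "CC-BY"}
    print(validate(description, profile))
```

Reporting the missing recommended properties separately from the required ones mirrors the distinction the abstract draws between what must be provided and what is more optional.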