Bioschemas Samples Hackathon

Last week the Bioschemas Community hosted a workshop. The aim of the meeting was to get web resources that describe biological samples to embed Schema.org markup in their pages. The embedded markup will make these web resources, and therefore the biological samples they describe, more discoverable.

I was not able to attend, but Justin Clark-Casey has written a blog post summarising the event.
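To make the idea of embedded markup concrete, here is a minimal sketch of how a sample page could generate a JSON-LD snippet. The function name, the example values, and the use of the generic `Thing` type are assumptions made for illustration; they are not taken from the workshop or from any agreed Bioschemas profile.

```python
import json


def sample_markup(name: str, identifier: str, url: str) -> str:
    """Return a <script> element carrying JSON-LD for one sample page."""
    record = {
        "@context": "https://schema.org",
        # Placeholder type (assumption): a sample-specific profile would refine this.
        "@type": "Thing",
        "name": name,
        "identifier": identifier,
        "url": url,
    }
    body = json.dumps(record, indent=2)
    # A raw "</" must not appear inside a script element; "\/" is a valid JSON escape.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical sample record, used only to show the output shape.
print(sample_markup("Example sample", "SAMPLE:0001", "https://example.org/samples/0001"))
```

The resulting element can be placed in the page's HTML, where a crawler that understands Schema.org can read it without scraping the visible text.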

NAR Database Paper

The new year started with a new publication, an article in the 2018 NAR Database issue about the IUPHAR Guide to Pharmacology Database.

  • Simon D. Harding, Joanna L. Sharman, Elena Faccenda, Chris Southan, Adam J. Pawson, Sam Ireland, Alasdair J. G. Gray, Liam Bruce, Stephen P. H. Alexander, Stephen Anderton, Clare Bryant, Anthony P. Davenport, Christian Doerig, Doriano Fabbro, Francesca Levi-Schaffer, Michael Spedding, Jamie A. Davies, and {NC-IUPHAR}. The IUPHAR/BPS Guide to PHARMACOLOGY in 2018: updates and expansion to encompass the new guide to IMMUNOPHARMACOLOGY. Nucleic Acids Research, 46(D1):D1091-D1106, 2018. doi:10.1093/nar/gkx1121

    The IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb, www.guidetopharmacology.org) and its precursor IUPHAR-DB, have captured expert-curated interactions between targets and ligands from selected papers in pharmacology and drug discovery since 2003. This resource continues to be developed in conjunction with the International Union of Basic and Clinical Pharmacology (IUPHAR) and the British Pharmacological Society (BPS). As previously described, our unique model of content selection and quality control is based on 96 target-class subcommittees comprising 512 scientists collaborating with in-house curators. This update describes content expansion, new features and interoperability improvements introduced in the 10 releases since August 2015. Our relationship matrix now describes ∼9000 ligands, ∼15 000 binding constants, ∼6000 papers and ∼1700 human proteins. As an important addition, we also introduce our newly funded project for the Guide to IMMUNOPHARMACOLOGY (GtoImmuPdb, www.guidetoimmunopharmacology.org). This has been ‘forked’ from the well-established GtoPdb data model and expanded into new types of data related to the immune system and inflammatory processes. This includes new ligands, targets, pathways, cell types and diseases for which we are recruiting new IUPHAR expert committees. Designed as an immunopharmacological gateway, it also has an emphasis on potential therapeutic interventions.

    @Article{Harding2018-GTP,
    abstract = {The IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb, www.guidetopharmacology.org) and its precursor IUPHAR-DB, have captured expert-curated interactions between targets and ligands from selected papers in pharmacology and drug discovery since 2003. This resource continues to be developed in conjunction with the International Union of Basic and Clinical Pharmacology (IUPHAR) and the British Pharmacological Society (BPS). As previously described, our unique model of content selection and quality control is based on 96 target-class subcommittees comprising 512 scientists collaborating with in-house curators. This update describes content expansion, new features and interoperability improvements introduced in the 10 releases since August 2015. Our relationship matrix now describes ∼9000 ligands, ∼15 000 binding constants, ∼6000 papers and ∼1700 human proteins. As an important addition, we also introduce our newly funded project for the Guide to IMMUNOPHARMACOLOGY (GtoImmuPdb, www.guidetoimmunopharmacology.org). This has been ‘forked’ from the well-established GtoPdb data model and expanded into new types of data related to the immune system and inflammatory processes. This includes new ligands, targets, pathways, cell types and diseases for which we are recruiting new IUPHAR expert committees. Designed as an immunopharmacological gateway, it also has an emphasis on potential therapeutic interventions.},
    author = {Harding, Simon D and Sharman, Joanna L and Faccenda, Elena and Southan, Chris and Pawson, Adam J and Ireland, Sam and Gray, Alasdair J G and Bruce, Liam and Alexander, Stephen P H and Anderton, Stephen and Bryant, Clare and Davenport, Anthony P and Doerig, Christian and Fabbro, Doriano and Levi-Schaffer, Francesca and Spedding, Michael and Davies, Jamie A and {NC-IUPHAR}},
    title = {The IUPHAR/BPS Guide to PHARMACOLOGY in 2018: updates and expansion to encompass the new guide to IMMUNOPHARMACOLOGY},
    journal = {Nucleic Acids Research},
    year = {2018},
    volume = {46},
    number = {D1},
    pages = {D1091--D1106},
    month = jan,
    OPTnote = {},
    OPTannote = {},
    url = {http://dx.doi.org/10.1093/nar/gkx1121},
    doi = {10.1093/nar/gkx1121}
    }

My involvement came from Liam Bruce’s honours project. Liam developed the RDB2RDF mappings that convert the existing relational content into an RDF representation. The mappings are executed using the Morph-RDB R2RML engine.
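The general idea behind a relational-to-RDF mapping can be sketched in a few lines. This is not R2RML syntax and not how Morph-RDB works internally; it is a toy illustration in which the base IRI, the table, and the column names are all hypothetical. Each row becomes a subject IRI built from its key, and each non-NULL column becomes one literal-valued triple.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical base IRI, not the real GtoPdb namespace


def rows_to_ntriples(conn, table, key, columns):
    """Map each row to a subject IRI and each listed column to a literal triple.

    Table and column names are interpolated directly, so they must be trusted.
    """
    cur = conn.execute(f"SELECT {key}, {', '.join(columns)} FROM {table}")
    triples = []
    for row in cur:
        subject = f"<{BASE}{table}/{row[0]}>"
        for col, value in zip(columns, row[1:]):
            if value is None:
                continue  # a NULL column produces no triple
            literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
            triples.append(f'{subject} <{BASE}vocab/{col}> "{literal}" .')
    return triples


# Toy relational content (made up for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ligand (ligand_id INTEGER PRIMARY KEY, name TEXT, ligand_type TEXT)")
conn.execute("INSERT INTO ligand VALUES (1, 'example ligand', NULL)")

for triple in rows_to_ntriples(conn, "ligand", "ligand_id", ["name", "ligand_type"]):
    print(triple)
```

A declarative mapping language such as R2RML expresses the same row-to-subject and column-to-predicate rules as data rather than code, which is what allows an engine to execute them over an existing database.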

To ensure that we abide by the FAIR data principles, we also generate machine-processable metadata descriptions of the data that conform to the HCLS Community Profile.


ISWC2017 Papers

I have had two papers accepted within the events that make up ISWC2017.

My PhD student Qianru Zhou has been working on using RDF stream processing to detect anomalous events through telecommunication network messages. The scenario in our paper, which will be presented at the Web Stream Processing workshop, focuses on detecting a disaster such as the capsizing of the Eastern Star on the Yangtze River [1].
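As a rough illustration of window-based detection over a stream, the sketch below flags a sudden drop in the number of active phones seen in an area. It is not the algorithm from the paper: it works on plain integer counts rather than RDF streams, and the window size, the threshold, and the sample numbers are all assumptions chosen for the example.

```python
from collections import deque


def detect_drops(counts, window=5, drop_ratio=0.5):
    """Return indices where a count falls below drop_ratio times the mean
    of the preceding `window` counts."""
    recent = deque(maxlen=window)
    alerts = []
    for i, count in enumerate(counts):
        if len(recent) == window:
            baseline = sum(recent) / window
            if baseline > 0 and count < drop_ratio * baseline:
                alerts.append(i)
        recent.append(count)  # the window slides forward by one reading
    return alerts


# Made-up readings: a steady level followed by an abrupt loss of signal.
active_phones = [100, 102, 98, 101, 99, 100, 40, 38]
print(detect_drops(active_phones))  # prints [6, 7]
```

The choice of window size matters: a window that is too short reacts to ordinary fluctuation, while one that is too long delays detection.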

The second paper is a poster in the main conference that provides an overview of the Bioschemas project where we are identifying the Schema.org markup that is of primary importance for life science resources. Hopefully the paper title will pull the punters in for the session [2].

[1] Qianru Zhou, Stephen McLaughlin, Alasdair J. G. Gray, Shangbin Wu, and Chengxiang Wang. Lost Silence: An emergency response early detection service through continuous processing of telecommunication data streams. In Web Stream Processing 2017, 2017.
[Bibtex]
@InProceedings{ZhouEtal2017:LostSilence:WSP2017,
abstract = {Early detection of significant traumatic events, e.g. terrorist events, ship capsizes, is important to ensure that a prompt emergency response can occur. In the modern world telecommunication systems can and do play a key role in ensuring a successful emergency response by detecting such incidents through significant changes in calls and access to the networks. In this paper a methodology is illustrated to detect such incidents immediately (with the delay in the order of milliseconds), by processing semantically annotated streams of data in cellular telecommunication systems. In our methodology, live information of phones' positions and status are encoded as RDF streams. We propose an algorithm that processes streams of RDF annotated telecommunication data to detect abnormality. Our approach is exemplified in the context of capsize of a passenger cruise ship but is readily translatable to other incidents. Our evaluation results show that with properly chosen window size, such incidents can be detected effectively.},
author = {Qianru Zhou and Stephen McLaughlin and Alasdair J G Gray and Shangbin Wu and Chengxiang Wang},
title = {Lost Silence: An emergency response early detection service through continuous processing of telecommunication data streams},
OPTcrossref = {},
OPTkey = {},
booktitle = {Web Stream Processing 2017},
year = {2017},
OPTeditor = {},
OPTvolume = {},
OPTnumber = {},
OPTseries = {},
OPTpages = {},
OPTmonth = {},
OPTaddress = {},
OPTorganization = {},
OPTpublisher = {},
OPTnote = {},
OPTannote = {}
}
[2] Alasdair J. G. Gray, Carole Goble, Rafael C. Jimenez, and The Bioschemas Community. Bioschemas: From Potato Salad to Protein Annotation. In ISWC 2017 Poster Proceedings, Vienna, Austria, 2017. Poster
[Bibtex]
@InProceedings{grayetal2017:bioschemas:iswc2017,
abstract = {The life sciences have a wealth of data resources with a wide range of overlapping content. Key repositories, such as UniProt for protein data or Entrez Gene for gene data, are well known and their content easily discovered through search engines. However, there is a long-tail of bespoke datasets with important content that are not so prominent in search results. Building on the success of Schema.org for making a wide range of structured web content more discoverable and interpretable, e.g. food recipes, the Bioschemas community (http://bioschemas.org) aim to make life sciences datasets more findable by encouraging data providers to embed Schema.org markup in their resources.},
author = {Alasdair J G Gray and Carole Goble and Rafael C Jimenez and {The Bioschemas Community}},
title = {Bioschemas: From Potato Salad to Protein Annotation},
OPTcrossref = {},
OPTkey = {},
booktitle = {ISWC 2017 Poster Proceedings},
year = {2017},
OPTeditor = {},
OPTvolume = {},
OPTnumber = {},
OPTseries = {},
OPTpages = {},
OPTmonth = {},
address = {Vienna, Austria},
OPTorganization = {},
OPTpublisher = {},
note = {Poster},
OPTannote = {}
}

Supporting Dataset Descriptions in the Life Sciences

Seminar talk given at the EBI on 5 April 2017.

Abstract: Machine processable descriptions of datasets can help make data more FAIR; that is Findable, Accessible, Interoperable, and Reusable. However, there are a variety of metadata profiles for describing datasets, some specific to the life sciences and others more generic in their focus. Each profile has its own set of properties and requirements as to which must be provided and which are more optional. Developing a dataset description for a given dataset to conform to a specific metadata profile is a challenging process.

In this talk, I will give an overview of some of the dataset description specifications that are available. I will discuss the difficulties in writing a dataset description that conforms to a profile and the tooling that I’ve developed to support dataset publishers in creating metadata descriptions and validating them against a chosen specification.
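The core of validating a description against a profile can be sketched as a set comparison. This is only an illustration of the idea, not the tooling mentioned in the talk; the `Profile` class, the property names, and the required/optional split are hypothetical and do not reproduce any real specification.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """A toy metadata profile: which properties must or may be provided."""
    name: str
    required: set = field(default_factory=set)
    optional: set = field(default_factory=set)


def validate(description: dict, profile: Profile) -> dict:
    """Compare the non-empty properties of a description with a profile."""
    present = {k for k, v in description.items() if v not in (None, "", [])}
    return {
        "missing_required": sorted(profile.required - present),
        "missing_optional": sorted(profile.optional - present),
        "unrecognised": sorted(present - profile.required - profile.optional),
    }


# Hypothetical profile and description, invented for this sketch.
toy_profile = Profile(
    name="toy-profile",
    required={"title", "description", "publisher"},
    optional={"license", "keywords"},
)
draft = {
    "title": "Example dataset",
    "description": "",
    "license": "CC-BY-4.0",
    "homepage": "https://example.org",
}
print(validate(draft, toy_profile))
```

A real profile also constrains value types and cardinalities, which is where most of the difficulty in writing a conforming description lies.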

Research Blog: Facilitating the discovery of public datasets

Google are doing some interesting work on making datasets, in particular scientific datasets, more discoverable with Schema.org markup. This is closely related to the Bioschemas community project.

Source: Research Blog: Facilitating the discovery of public datasets
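To show how markup of this kind can be picked up by a consumer, here is a minimal sketch that extracts JSON-LD blocks from a page and keeps those typed as a Schema.org `Dataset`. It is not a description of how Google's crawler works; the class and function names and the sample page are assumptions for illustration, and only top-level objects with a single `@type` string are handled.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.records = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.records.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip blocks that are not valid JSON


def find_datasets(html: str) -> list:
    """Return the JSON-LD objects in a page whose @type is Dataset."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [r for r in parser.records if isinstance(r, dict) and r.get("@type") == "Dataset"]


# A made-up page used only for this sketch.
page = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Dataset",
 "name": "Example dataset",
 "description": "A made-up dataset used only for this sketch."}
</script>
</head><body></body></html>"""

for dataset in find_datasets(page):
    print(dataset["name"])
```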