Journal Articles

Here is a list of my journal articles.

2018

  • Abiodun Akinyemi, Ming Sun, and Alasdair J. G. Gray. An ontology-based data integration framework for construction information management. Proceedings of the Institution of Civil Engineers – Management, Procurement and Law, 171(3):111–125, 2018. doi:10.1680/jmapl.17.00052
    [BibTeX] [Abstract] [Download PDF]

    Information management during the construction phase of a built asset involves multiple stakeholders using multiple software applications to generate and store data. This is problematic as data come in different forms and are labour intensive to piece together. Existing solutions to this problem are predominantly in proprietary applications, which are sometimes cost prohibitive for small engineering firms, or conceptual studies with use cases that cannot be easily adapted. In view of these limitations, this research presents an ontology-based data integration framework that makes use of open-source tools that support Semantic Web technologies. The proposed framework enables rapid answering of queries over construction data integrated from heterogeneous sources, data quality checks and reuse of project software resources. The attributes and functionalities of the proposed solution align with the requirements common to small firms with limited information technology skill and budget. Consequently, this solution can be of great benefit for their data projects.

    @article{Akinyemi_2018,
    abstract = {Information management during the construction phase of a built asset involves multiple stakeholders using multiple software applications to generate and store data. This is problematic as data come in different forms and are labour intensive to piece together. Existing solutions to this problem are predominantly in proprietary applications, which are sometimes cost prohibitive for small engineering firms, or conceptual studies with use cases that cannot be easily adapted. In view of these limitations, this research presents an ontology-based data integration framework that makes use of open-source tools that support Semantic Web technologies. The proposed framework enables rapid answering of queries over construction data integrated from heterogeneous sources, data quality checks and reuse of project software resources. The attributes and functionalities of the proposed solution align with the requirements common to small firms with limited information technology skill and budget. Consequently, this solution can be of great benefit for their data projects.},
    doi = {10.1680/jmapl.17.00052},
    url = {https://doi.org/10.1680/jmapl.17.00052},
    year = 2018,
    month = jun,
    publisher = {Thomas Telford Ltd.},
    volume = {171},
    number = {3},
    pages = {111--125},
    author = {Abiodun Akinyemi and Ming Sun and Alasdair J. G. Gray},
    title = {An ontology-based data integration framework for construction information management},
    journal = {Proceedings of the Institution of Civil Engineers - Management, Procurement and Law}
    }

  • Simon D. Harding, Joanna L. Sharman, Elena Faccenda, Christopher Southan, Adam J. Pawson, Sam Ireland, Alasdair J. G. Gray, Liam Bruce, Stephen P. H. Alexander, Stephen Anderton, Clare Bryant, Anthony P. Davenport, Christian Doerig, Doriano Fabbro, Francesca Levi-Schaffer, Michael Spedding, Jamie A. Davies, and NC-IUPHAR. The IUPHAR/BPS Guide to PHARMACOLOGY in 2018: updates and expansion to encompass the new guide to IMMUNOPHARMACOLOGY. Nucleic Acids Research, 46(Database-Issue):D1091–D1106, 2018. doi:10.1093/nar/gkx1121
    [BibTeX] [Abstract] [Download PDF]

    The IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb, www.guidetopharmacology.org) and its precursor IUPHAR-DB, have captured expert-curated interactions between targets and ligands from selected papers in pharmacology and drug discovery since 2003. This resource continues to be developed in conjunction with the International Union of Basic and Clinical Pharmacology (IUPHAR) and the British Pharmacological Society (BPS). As previously described, our unique model of content selection and quality control is based on 96 target-class subcommittees comprising 512 scientists collaborating with in-house curators. This update describes content expansion, new features and interoperability improvements introduced in the 10 releases since August 2015. Our relationship matrix now describes ∼9000 ligands, ∼15 000 binding constants, ∼6000 papers and ∼1700 human proteins. As an important addition, we also introduce our newly funded project for the Guide to IMMUNOPHARMACOLOGY (GtoImmuPdb, www.guidetoimmunopharmacology.org). This has been ‘forked’ from the well-established GtoPdb data model and expanded into new types of data related to the immune system and inflammatory processes. This includes new ligands, targets, pathways, cell types and diseases for which we are recruiting new IUPHAR expert committees. Designed as an immunopharmacological gateway, it also has an emphasis on potential therapeutic interventions.

    @article{DBLP:journals/nar/HardingSFSPIGBA18,
    abstract = {The IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb, www.guidetopharmacology.org) and its precursor IUPHAR-DB, have captured expert-curated interactions between targets and ligands from selected papers in pharmacology and drug discovery since 2003. This resource continues to be developed in conjunction with the International Union of Basic and Clinical Pharmacology (IUPHAR) and the British Pharmacological Society (BPS). As previously described, our unique model of content selection and quality control is based on 96 target-class subcommittees comprising 512 scientists collaborating with in-house curators. This update describes content expansion, new features and interoperability improvements introduced in the 10 releases since August 2015. Our relationship matrix now describes ∼9000 ligands, ∼15 000 binding constants, ∼6000 papers and ∼1700 human proteins. As an important addition, we also introduce our newly funded project for the Guide to IMMUNOPHARMACOLOGY (GtoImmuPdb, www.guidetoimmunopharmacology.org). This has been ‘forked’ from the well-established GtoPdb data model and expanded into new types of data related to the immune system and inflammatory processes. This includes new ligands, targets, pathways, cell types and diseases for which we are recruiting new IUPHAR expert committees. Designed as an immunopharmacological gateway, it also has an emphasis on potential therapeutic interventions.},
    author = {Simon D. Harding and
    Joanna L. Sharman and
    Elena Faccenda and
    Christopher Southan and
    Adam J. Pawson and
    Sam Ireland and
    Alasdair J. G. Gray and
    Liam Bruce and
    Stephen P. H. Alexander and
    Stephen Anderton and
    Clare Bryant and
    Anthony P. Davenport and
    Christian Doerig and
    Doriano Fabbro and
    Francesca Levi{-}Schaffer and
    Michael Spedding and
    Jamie A. Davies and
    Nc{-}Iuphar},
    title = {The {IUPHAR/BPS} Guide to {PHARMACOLOGY} in 2018: updates and expansion
    to encompass the new guide to {IMMUNOPHARMACOLOGY}},
    journal = {Nucleic Acids Research},
    volume = {46},
    number = {Database-Issue},
    pages = {D1091--D1106},
    year = {2018},
    url = {https://doi.org/10.1093/nar/gkx1121},
    doi = {10.1093/nar/gkx1121}
    }

2017

  • Mark D. Wilkinson, Ruben Verborgh, Luiz Olavo Bonino da Silva Santos, Tim Clark, Morris A. Swertz, Fleur D. L. Kelpin, Alasdair J. G. Gray, Erik A. Schultes, Erik M. van Mulligen, Paolo Ciccarese, Arnold Kuzniar, Anand Gavai, Mark Thompson, Rajaram Kaliyaperumal, Jerven T. Bolleman, and Michel Dumontier. Interoperability and FAIRness through a novel combination of Web technologies. PeerJ Computer Science, 3:e110, 2017. doi:10.7717/peerj-cs.110
    [BibTeX] [Download PDF]
    @article{DBLP:journals/peerj-cs/WilkinsonVSCSKG17,
    author = {Mark D. Wilkinson and
    Ruben Verborgh and
    Luiz Olavo Bonino da Silva Santos and
    Tim Clark and
    Morris A. Swertz and
    Fleur D. L. Kelpin and
    Alasdair J. G. Gray and
    Erik A. Schultes and
    Erik M. van Mulligen and
    Paolo Ciccarese and
    Arnold Kuzniar and
    Anand Gavai and
    Mark Thompson and
    Rajaram Kaliyaperumal and
    Jerven T. Bolleman and
    Michel Dumontier},
    title = {Interoperability and FAIRness through a novel combination of Web technologies},
    journal = {PeerJ Computer Science},
    volume = {3},
    pages = {e110},
    year = {2017},
    url = {https://doi.org/10.7717/peerj-cs.110},
    doi = {10.7717/peerj-cs.110}
    }

2014

  • Alasdair J. G. Gray. Dataset Descriptions for Linked Data Systems. IEEE Internet Computing, 18(4):66–69, 2014. doi:10.1109/MIC.2014.66
    [BibTeX] [Download PDF]
    @article{DBLP:journals/internet/Gray14,
    author = {Alasdair J. G. Gray},
    title = {Dataset Descriptions for Linked Data Systems},
    journal = {{IEEE} Internet Computing},
    volume = {18},
    number = {4},
    pages = {66--69},
    year = {2014},
    url = {https://doi.org/10.1109/MIC.2014.66},
    doi = {10.1109/MIC.2014.66}
    }

  • Alasdair J. G. Gray, Paul T. Groth, Antonis Loizou, Sune Askjaer, Christian Y. A. Brenninkmeijer, Kees Burger, Christine Chichester, Chris T. A. Evelo, Carole A. Goble, Lee Harland, Steve Pettifer, Mark Thompson, Andra Waagmeester, and Antony J. Williams. Applying linked data approaches to pharmacology: Architectural decisions and implementation. Semantic Web Journal, 5(2):101–113, 2014. doi:10.3233/SW-2012-0088
    [BibTeX] [Abstract] [Download PDF]

    The discovery of new medicines requires pharmacologists to interact with a number of information sources ranging from tabular data to scientific papers, and other specialized formats. In this application report, we describe a linked data platform for integrating multiple pharmacology datasets that form the basis for several drug discovery applications. The functionality offered by the platform has been drawn from a collection of prioritised drug discovery business questions created as part of the Open PHACTS project, a collaboration of research institutions and major pharmaceutical companies. We describe the architecture of the platform focusing on seven design decisions that drove its development with the aim of informing others developing similar software in this or other domains. The utility of the platform is demonstrated by the variety of drug discovery applications being built to access the integrated data.

    @article{DBLP:journals/semweb/GrayGLABBCEGHPTWW14,
    abstract = {The discovery of new medicines requires pharmacologists to interact with a number of information sources ranging from tabular data to scientific papers, and other specialized formats. In this application report, we describe a linked data platform for integrating multiple pharmacology datasets that form the basis for several drug discovery applications. The functionality offered by the platform has been drawn from a collection of prioritised drug discovery business questions created as part of the Open PHACTS project, a collaboration of research institutions and major pharmaceutical companies. We describe the architecture of the platform focusing on seven design decisions that drove its development with the aim of informing others developing similar software in this or other domains. The utility of the platform is demonstrated by the variety of drug discovery applications being built to access the integrated data.},
    author = {Alasdair J. G. Gray and
    Paul T. Groth and
    Antonis Loizou and
    Sune Askjaer and
    Christian Y. A. Brenninkmeijer and
    Kees Burger and
    Christine Chichester and
    Chris T. A. Evelo and
    Carole A. Goble and
    Lee Harland and
    Steve Pettifer and
    Mark Thompson and
    Andra Waagmeester and
    Antony J. Williams},
    title = {Applying linked data approaches to pharmacology: Architectural decisions
    and implementation},
    journal = {Semantic Web Journal},
    volume = {5},
    number = {2},
    pages = {101--113},
    year = {2014},
    url = {https://doi.org/10.3233/SW-2012-0088},
    doi = {10.3233/SW-2012-0088}
    }

  • Paul T. Groth, Antonis Loizou, Alasdair J. G. Gray, Carole A. Goble, Lee Harland, and Steve Pettifer. API-centric Linked Data integration: The Open PHACTS Discovery Platform case study. Journal of Web Semantics, 29:12–18, 2014. doi:10.1016/j.websem.2014.03.003
    [BibTeX] [Download PDF]
    @article{DBLP:journals/ws/GrothLGGHP14,
    author = {Paul T. Groth and
    Antonis Loizou and
    Alasdair J. G. Gray and
    Carole A. Goble and
    Lee Harland and
    Steve Pettifer},
    title = {API-centric Linked Data integration: The Open {PHACTS} Discovery Platform
    case study},
    journal = {Journal of Web Semantics},
    volume = {29},
    pages = {12--18},
    year = {2014},
    url = {https://doi.org/10.1016/j.websem.2014.03.003},
    doi = {10.1016/j.websem.2014.03.003}
    }

2013

  • Paolo Ciccarese, Stian Soiland-Reyes, Khalid Belhajjame, Alasdair J. G. Gray, Carole A. Goble, and Tim Clark. PAV ontology: provenance, authoring and versioning. Journal of Biomedical Semantics, 4:37, 2013. doi:10.1186/2041-1480-4-37
    [BibTeX] [Abstract] [Download PDF]

    Background: Provenance is a critical ingredient for establishing trust of published scientific content. This is true whether we are considering a data set, a computational workflow, a peer-reviewed publication or a simple scientific claim with supportive evidence. Existing vocabularies such as Dublin Core Terms (DC Terms) and the W3C Provenance Ontology (PROV-O) are domain-independent and general-purpose and they allow and encourage for extensions to cover more specific needs. In particular, to track authoring and versioning information of web resources, PROV-O provides a basic methodology but not any specific classes and properties for identifying or distinguishing between the various roles assumed by agents manipulating digital artifacts, such as author, contributor and curator. Results: We present the Provenance, Authoring and Versioning ontology (PAV, namespace http://purl.org/pav/): a lightweight ontology for capturing “just enough” descriptions essential for tracking the provenance, authoring and versioning of web resources. We argue that such descriptions are essential for digital scientific content. PAV distinguishes between contributors, authors and curators of content and creators of representations in addition to the provenance of originating resources that have been accessed, transformed and consumed. We explore five projects (and communities) that have adopted PAV illustrating their usage through concrete examples. Moreover, we present mappings that show how PAV extends the W3C PROV-O ontology to support broader interoperability. Method: The initial design of the PAV ontology was driven by requirements from the AlzSWAN project with further requirements incorporated later from other projects detailed in this paper. The authors strived to keep PAV lightweight and compact by including only those terms that have demonstrated to be pragmatically useful in existing applications, and by recommending terms from existing ontologies when plausible. Discussion: We analyze and compare PAV with related approaches, namely Provenance Vocabulary (PRV), DC Terms and BIBFRAME. We identify similarities and analyze differences between those vocabularies and PAV, outlining strengths and weaknesses of our proposed model. We specify SKOS mappings that align PAV with DC Terms. We conclude the paper with general remarks on the applicability of PAV.

    @article{DBLP:journals/biomedsem/CiccareseSBGGC13,
    abstract = {Background
    Provenance is a critical ingredient for establishing trust of published scientific content. This is true whether we are considering a data set, a computational workflow, a peer-reviewed publication or a simple scientific claim with supportive evidence. Existing vocabularies such as Dublin Core Terms (DC Terms) and the W3C Provenance Ontology (PROV-O) are domain-independent and general-purpose and they allow and encourage for extensions to cover more specific needs. In particular, to track authoring and versioning information of web resources, PROV-O provides a basic methodology but not any specific classes and properties for identifying or distinguishing between the various roles assumed by agents manipulating digital artifacts, such as author, contributor and curator.
    Results
    We present the Provenance, Authoring and Versioning ontology (PAV, namespace http://purl.org/pav/): a lightweight ontology for capturing “just enough” descriptions essential for tracking the provenance, authoring and versioning of web resources. We argue that such descriptions are essential for digital scientific content. PAV distinguishes between contributors, authors and curators of content and creators of representations in addition to the provenance of originating resources that have been accessed, transformed and consumed. We explore five projects (and communities) that have adopted PAV illustrating their usage through concrete examples. Moreover, we present mappings that show how PAV extends the W3C PROV-O ontology to support broader interoperability.
    Method
    The initial design of the PAV ontology was driven by requirements from the AlzSWAN project with further requirements incorporated later from other projects detailed in this paper. The authors strived to keep PAV lightweight and compact by including only those terms that have demonstrated to be pragmatically useful in existing applications, and by recommending terms from existing ontologies when plausible.
    Discussion
    We analyze and compare PAV with related approaches, namely Provenance Vocabulary (PRV), DC Terms and BIBFRAME. We identify similarities and analyze differences between those vocabularies and PAV, outlining strengths and weaknesses of our proposed model. We specify SKOS mappings that align PAV with DC Terms. We conclude the paper with general remarks on the applicability of PAV.},
    author = {Paolo Ciccarese and
    Stian Soiland{-}Reyes and
    Khalid Belhajjame and
    Alasdair J. G. Gray and
    Carole A. Goble and
    Tim Clark},
    title = {{PAV} ontology: provenance, authoring and versioning},
    journal = {Journal of Biomedical Semantics},
    volume = {4},
    pages = {37},
    year = {2013},
    url = {https://doi.org/10.1186/2041-1480-4-37},
    doi = {10.1186/2041-1480-4-37}
    }

  • Patrick Jackman, Alasdair J. G. Gray, Andrew Brass, Robert Stevens, Ming Shi, Derek Scuffell, Simon Hammersley, and Bruce Grieve. Processing online crop disease warning information via sensor networks using ISA ontologies. Agricultural Engineering International: CIGR Journal, 15(3):243–251, 2013.
    [BibTeX] [Abstract]

    Growing demand for food is driving the need for higher crop yields globally. Correctly anticipating the onset of damaging crop diseases is essential to achieve this goal. Considerable efforts have been made recently to develop early warning systems. However, these methods lack a direct and online measurement of the spores that attack crops. A novel disease information network has been implemented and deployed. Spore sensors have been developed and deployed. The measurements from these sensors are combined with similar measurements of important local weather readings to generate estimates of crop disease risk. It is combined with other crop disease information allowing overall local disease risk assessments and forecasts to be made. The resulting data is published through a SPARQL endpoint to support reuse and connection into the linked data cloud.

    @article{Jackman:2013Processing-Online-Crop-Disease,
    title = "Processing online crop disease warning information via sensor networks using ISA ontologies",
    abstract = "Growing demand for food is driving the need for higher crop yields globally. Correctly anticipating the onset of damaging crop diseases is essential to achieve this goal. Considerable efforts have been made recently to develop early warning systems. However, these methods lack a direct and online measurement of the spores that attack crops. A novel disease information network has been implemented and deployed. Spore sensors have been developed and deployed. The measurements from these sensors are combined with similar measurements of important local weather readings to generate estimates of crop disease risk. It is combined with other crop disease information allowing overall local disease risk assessments and forecasts to be made. The resulting data is published through a SPARQL endpoint to support reuse and connection into the linked data cloud.",
    keywords = "Crop disease assessment, Data queries, Investigation study assay, Online sensors, Sensor network, Web semantics",
    author = "Patrick Jackman and Alasdair J. G. Gray and Andrew Brass and Robert Stevens and Ming Shi and Derek Scuffell and Simon Hammersley and Bruce Grieve",
    year = "2013",
    language = "English",
    volume = "15",
    pages = "243--251",
    journal = "Agricultural Engineering International: CIGR Journal",
    issn = "1682-1130",
    publisher = "International Commission of Agricultural and Biosystems Engineering",
    number = "3",
    }

2011

  • Ixent Galpin, Christian Y. A. Brenninkmeijer, Alasdair J. G. Gray, Farhana Jabeen, Alvaro A. A. Fernandes, and Norman W. Paton. SNEE: a query processor for wireless sensor networks. Distributed and Parallel Databases, 29(1-2):31–85, 2011. doi:10.1007/s10619-010-7074-3
    [BibTeX] [Abstract] [Download PDF]

    A wireless sensor network (WSN) can be construed as an intelligent, large-scale device for observing and measuring properties of the physical world. In recent years, the database research community has championed the view that if we construe a WSN as a database (i.e., if a significant aspect of its intelligent behavior is that it can execute declaratively-expressed queries), then one can achieve a significant reduction in the cost of engineering the software that implements a data collection program for the WSN while still achieving, through query optimization, very favorable cost:benefit ratios. This paper describes a query processing framework for WSNs that meets many desiderata associated with the view of WSN as databases. The framework is presented in the form of compiler/optimizer, called SNEE, for a continuous declarative query language over sensed data streams, called SNEEql. SNEEql can be shown to meet the expressiveness requirements of a large class of applications. SNEE can be shown to generate effective and efficient query evaluation plans. 
More specifically, the paper describes the following contributions: (1) a user-level syntax and physical algebra for SNEEql, an expressive continuous query language over WSNs; (2) example concrete algorithms for physical algebraic operators defined in such a way that the task of deriving memory, time and energy analytical cost-estimation models (CEMs) for them becomes straightforward by reduction to a structural traversal of the pseudocode; (3) CEMs for the concrete algorithms alluded to; (4) an architecture for the optimization of SNEEql queries, called SNEE, building on well-established distributed query processing components where possible, but making enhancements or refinements where necessary to accommodate the WSN context; (5) algorithms that instantiate the components in the SNEE architecture, thereby supporting integrated query planning that includes routing, placement and timing; and (6) an empirical performance evaluation of the resulting framework.

    @article{DBLP:journals/dpd/GalpinBGJFP11,
    abstract = {A wireless sensor network (WSN) can be construed as an intelligent, large-scale device for observing and measuring properties of the physical world. In recent years, the database research community has championed the view that if we construe a WSN as a database (i.e., if a significant aspect of its intelligent behavior is that it can execute declaratively-expressed queries), then one can achieve a significant reduction in the cost of engineering the software that implements a data collection program for the WSN while still achieving, through query optimization, very favorable cost:benefit ratios. This paper describes a query processing framework for WSNs that meets many desiderata associated with the view of WSN as databases. The framework is presented in the form of compiler/optimizer, called SNEE, for a continuous declarative query language over sensed data streams, called SNEEql. SNEEql can be shown to meet the expressiveness requirements of a large class of applications. SNEE can be shown to generate effective and efficient query evaluation plans. 
More specifically, the paper describes the following contributions: (1) a user-level syntax and physical algebra for SNEEql, an expressive continuous query language over WSNs; (2) example concrete algorithms for physical algebraic operators defined in such a way that the task of deriving memory, time and energy analytical cost-estimation models (CEMs) for them becomes straightforward by reduction to a structural traversal of the pseudocode; (3) CEMs for the concrete algorithms alluded to; (4) an architecture for the optimization of SNEEql queries, called SNEE, building on well-established distributed query processing components where possible, but making enhancements or refinements where necessary to accommodate the WSN context; (5) algorithms that instantiate the components in the SNEE architecture, thereby supporting integrated query planning that includes routing, placement and timing; and (6) an empirical performance evaluation of the resulting framework.},
    author = {Ixent Galpin and
    Christian Y. A. Brenninkmeijer and
    Alasdair J. G. Gray and
    Farhana Jabeen and
    Alvaro A. A. Fernandes and
    Norman W. Paton},
    title = {{SNEE:} a query processor for wireless sensor networks},
    journal = {Distributed and Parallel Databases},
    volume = {29},
    number = {1-2},
    pages = {31--85},
    year = {2011},
    url = {https://doi.org/10.1007/s10619-010-7074-3},
    doi = {10.1007/s10619-010-7074-3}
    }

  • Alasdair J. G. Gray, Jason Sadler, Oles Kit, Kostis Kyzirakos, Manos Karpathiotakis, Jean-Paul Calbimonte, Kevin R. Page, Raúl García-Castro, Alex Frazer, Ixent Galpin, Alvaro A. A. Fernandes, Norman W. Paton, Óscar Corcho, Manolis Koubarakis, David De Roure, Kirk Martinez, and Asunción Gómez-Pérez. A Semantic Sensor Web for Environmental Decision Support Applications. Sensors, 11(9):8855–8887, 2011. doi:10.3390/s110908855
    [BibTeX] [Download PDF]
    @article{DBLP:journals/sensors/GraySKKKCPGFGFP11,
    author = {Alasdair J. G. Gray and
    Jason Sadler and
    Oles Kit and
    Kostis Kyzirakos and
    Manos Karpathiotakis and
    Jean-Paul Calbimonte and
    Kevin R. Page and
    Ra{\'{u}}l Garc{\'{\i}}a{-}Castro and
    Alex Frazer and
    Ixent Galpin and
    Alvaro A. A. Fernandes and
    Norman W. Paton and
    {\'{O}}scar Corcho and
    Manolis Koubarakis and
    David De Roure and
    Kirk Martinez and
    Asunci{\'{o}}n G{\'{o}}mez{-}P{\'{e}}rez},
    title = {A Semantic Sensor Web for Environmental Decision Support Applications},
    journal = {Sensors},
    volume = {11},
    number = {9},
    pages = {8855--8887},
    year = {2011},
    url = {https://doi.org/10.3390/s110908855},
    doi = {10.3390/s110908855}
    }

2010

  • Alasdair J. G. Gray, Norman Gray, Christopher W. Hall, and Iadh Ounis. Finding the right term: Retrieving and exploring semantic concepts in astronomical vocabularies. Information Processing and Management, 46(4):470–478, 2010. (Alphabetic authorship) doi:10.1016/j.ipm.2009.09.004
    [BibTeX] [Abstract] [Download PDF]

    Astronomy, like many domains, already has several sets of terminology in general use, referred to as controlled vocabularies. For example, the keywords for tagging journal articles, or the taxonomy of terms used to label image files. These existing vocabularies can be encoded into skos, a W3C proposed recommendation for representing vocabularies on the Semantic Web, so that computer systems can help users to search for and discover resources tagged with vocabulary concepts. However, this requires a search mechanism to go from a user-supplied string to a vocabulary concept. In this paper, we present our experiences in implementing the Vocabulary Explorer, a vocabulary search service based on the Terrier Information Retrieval Platform. We investigate the capabilities of existing document weighting models for identifying the correct vocabulary concept for a query. Due to the highly structured nature of a skos encoded vocabulary, we investigate the effects of term weighting (boosting the score of concepts that match on particular fields of a vocabulary concept), and query expansion. We found that the existing document weighting models provided very high quality results, but these could be improved further with the use of term weighting that makes use of the semantic evidence.

    @article{DBLP:journals/ipm/GrayGHO10,
    abstract = {Astronomy, like many domains, already has several sets of terminology in general use, referred to as controlled vocabularies. For example, the keywords for tagging journal articles, or the taxonomy of terms used to label image files. These existing vocabularies can be encoded into skos, a W3C proposed recommendation for representing vocabularies on the Semantic Web, so that computer systems can help users to search for and discover resources tagged with vocabulary concepts. However, this requires a search mechanism to go from a user-supplied string to a vocabulary concept.
    In this paper, we present our experiences in implementing the Vocabulary Explorer, a vocabulary search service based on the Terrier Information Retrieval Platform. We investigate the capabilities of existing document weighting models for identifying the correct vocabulary concept for a query. Due to the highly structured nature of a skos encoded vocabulary, we investigate the effects of term weighting (boosting the score of concepts that match on particular fields of a vocabulary concept), and query expansion. We found that the existing document weighting models provided very high quality results, but these could be improved further with the use of term weighting that makes use of the semantic evidence.},
    author = {Alasdair J. G. Gray and
    Norman Gray and
    Christopher W. Hall and
    Iadh Ounis},
    title = {Finding the right term: Retrieving and exploring semantic concepts
    in astronomical vocabularies},
    journal = {Information Processing and Management},
    volume = {46},
    number = {4},
    pages = {470--478},
    year = {2010},
    note = {(Alphabetic authorship)},
    url = {https://doi.org/10.1016/j.ipm.2009.09.004},
    doi = {10.1016/j.ipm.2009.09.004}
    }

2007

  • Alasdair J. G. Gray, Werner Nutt, and M. Howard Williams. Answering queries over incomplete data stream histories. International Journal of Web Information Systems (IJWIS), 3(1/2):41–60, 2007. doi:10.1108/17440080710829216
    [BibTeX] [Abstract] [Download PDF]

    Purpose: Distributed data streams are an important topic of current research. In such a setting, data values will be missed, e.g. due to network errors. This paper aims to allow this incompleteness to be detected and overcome with either the user not being affected or the effects of the incompleteness being reported to the user. Design/methodology/approach: A model for representing the incomplete information has been developed that captures the information that is known about the missing data. Techniques for query answering involving certain and possible answer sets have been extended so that queries over incomplete data stream histories can be answered. Findings: It is possible to detect when a distributed data stream is missing one or more values. When such data values are missing there will be some information that is known about the data and this is stored in an appropriate format. Even when the available data are incomplete, it is possible in some circumstances to answer a query completely. When this is not possible, additional meta-data can be returned to inform the user of the effects of the incompleteness. Research limitations/implications: The techniques and models proposed in this paper have only been partially implemented. Practical implications: The proposed system is general and can be applied wherever there is a need to query the history of distributed data streams. The work in this paper enables the system to answer queries when there are missing values in the data. Originality/value: This paper presents a general model of how to detect, represent, and answer historical queries over incomplete distributed data streams.

    @article{DBLP:journals/ijwis/GrayNW07,
    abstract = {Purpose
    Distributed data streams are an important topic of current research. In such a setting, data values will be missed, e.g. due to network errors. This paper aims to allow this incompleteness to be detected and overcome with either the user not being affected or the effects of the incompleteness being reported to the user.
    Design/methodology/approach
    A model for representing the incomplete information has been developed that captures the information that is known about the missing data. Techniques for query answering involving certain and possible answer sets have been extended so that queries over incomplete data stream histories can be answered.
    Findings
    It is possible to detect when a distributed data stream is missing one or more values. When such data values are missing there will be some information that is known about the data and this is stored in an appropriate format. Even when the available data are incomplete, it is possible in some circumstances to answer a query completely. When this is not possible, additional meta-data can be returned to inform the user of the effects of the incompleteness.
    Research limitations/implications
    The techniques and models proposed in this paper have only been partially implemented.
    Practical implications
    The proposed system is general and can be applied wherever there is a need to query the history of distributed data streams. The work in this paper enables the system to answer queries when there are missing values in the data.
    Originality/value
    This paper presents a general model of how to detect, represent, and answer historical queries over incomplete distributed data streams.},
    author = {Alasdair J. G. Gray and
    Werner Nutt and
    M. Howard Williams},
    title = {Answering queries over incomplete data stream histories},
    journal = {International Journal of Web Information Systems ({IJWIS})},
    volume = {3},
    number = {1/2},
    pages = {41--60},
    year = {2007},
    url = {https://doi.org/10.1108/17440080710829216},
    doi = {10.1108/17440080710829216}
    }

2005

  • Andrew W. Cooke, Alasdair J. G. Gray, and Werner Nutt. Stream Integration Techniques for Grid Monitoring. Journal on Data Semantics, 2:136–175, 2005. (Alphabetical authorship, equal responsibility) doi:10.1007/978-3-540-30567-5_6
    [BibTeX] [Download PDF]
    @article{DBLP:journals/jods/CookeGN05,
    author = {Andrew W. Cooke and
    Alasdair J. G. Gray and
    Werner Nutt},
    title = {Stream Integration Techniques for Grid Monitoring},
    journal = {Journal on Data Semantics},
    volume = {2},
    pages = {136--175},
    year = {2005},
    note = {(Alphabetical authorship, equal responsibility)},
    url = {https://doi.org/10.1007/978-3-540-30567-5\_6},
    doi = {10.1007/978-3-540-30567-5\_6}
    }

2004

  • Andrew W. Cooke, Alasdair J. G. Gray, Werner Nutt, James Magowan, Manfred Oevers, Paul Taylor, Roney Cordenonsi, Rob Byrom, Linda Cornwall, Abdeslem Djaoui, Laurence Field, Steve Fisher, Steve Hicks, Jason Leake, Robin Middleton, Antony J. Wilson, Xiaomei Zhu, Norbert Podhorszki, Brian A. Coghlan, Stuart Kenny, David O’Callaghan, and John Ryan. The Relational Grid Monitoring Architecture: Mediating Information about the Grid. Journal of Grid Computing, 2(4):323–339, 2004. (Alphabetical authorship by site, Heriot-Watt authored paper) doi:10.1007/s10723-005-0151-6
    [BibTeX] [Abstract] [Download PDF]

    We have developed and implemented the Relational Grid Monitoring Architecture (R-GMA) as part of the DataGrid project, to provide a flexible information and monitoring service for use by other middleware components and applications. R-GMA presents users with a virtual database and mediates queries posed at this database: users pose queries against a global schema and R-GMA takes responsibility for locating relevant sources and returning an answer. R-GMA’s architecture and mechanisms are general and can be used wherever there is a need for publishing and querying information in a distributed environment. We discuss the requirements, design and implementation of R-GMA as deployed on the DataGrid testbed. We also describe some of the ways in which R-GMA is being used.

    @article{DBLP:journals/grid/CookeGNMOTCBCDFFHLMWZPCKOR04,
    abstract = {We have developed and implemented the Relational Grid Monitoring Architecture (R-GMA) as part of the DataGrid project, to provide a flexible information and monitoring service for use by other middleware components and applications.
    R-GMA presents users with a virtual database and mediates queries posed at this database: users pose queries against a global schema and R-GMA takes responsibility for locating relevant sources and returning an answer. R-GMA’s architecture and mechanisms are general and can be used wherever there is a need for publishing and querying information in a distributed environment.
    We discuss the requirements, design and implementation of R-GMA as deployed on the DataGrid testbed. We also describe some of the ways in which R-GMA is being used.},
    author = {Andrew W. Cooke and
    Alasdair J. G. Gray and
    Werner Nutt and
    James Magowan and
    Manfred Oevers and
    Paul Taylor and
    Roney Cordenonsi and
    Rob Byrom and
    Linda Cornwall and
    Abdeslem Djaoui and
    Laurence Field and
    Steve Fisher and
    Steve Hicks and
    Jason Leake and
    Robin Middleton and
    Antony J. Wilson and
    Xiaomei Zhu and
    Norbert Podhorszki and
    Brian A. Coghlan and
    Stuart Kenny and
    David O'Callaghan and
    John Ryan},
    title = {The Relational Grid Monitoring Architecture: Mediating Information
    about the Grid},
    journal = {Journal of Grid Computing},
    volume = {2},
    number = {4},
    pages = {323--339},
    year = {2004},
    note = {(Alphabetical authorship by site, Heriot-Watt authored paper)},
    url = {https://doi.org/10.1007/s10723-005-0151-6},
    doi = {10.1007/s10723-005-0151-6}
    }