ISWC2017 Papers

I have had two papers accepted within the events that make up ISWC2017.

My PhD student Qianru Zhou has been working on using RDF stream processing to detect anomalous events through telecommunication network messages. The particular scenario in our paper that will be presented at the Web Stream Processing workshop focuses on detecting a disaster such as the capsizing of the Eastern Star on the Yangtze River [1].
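
The core idea of detecting an incident through a sudden mass silence in a windowed stream of phone-status events can be sketched roughly as follows. This is a minimal, hypothetical Python sketch, not the algorithm from the paper: the event shape, the "disconnected" status label, and the threshold rule are all assumptions for illustration.

```python
from collections import deque

# Hypothetical sketch: flag an anomaly when a large fraction of the phones
# seen in a sliding time window have gone silent (disconnected).
class SilenceDetector:
    def __init__(self, window_seconds, drop_threshold):
        self.window_seconds = window_seconds
        self.drop_threshold = drop_threshold  # fraction of phones gone silent
        self.events = deque()  # (timestamp, phone_id, status) triples

    def observe(self, timestamp, phone_id, status):
        self.events.append((timestamp, phone_id, status))
        # Evict events that have fallen out of the window.
        while self.events and self.events[0][0] < timestamp - self.window_seconds:
            self.events.popleft()
        return self._anomalous()

    def _anomalous(self):
        phones = {pid for _, pid, _ in self.events}
        silent = {pid for _, pid, st in self.events if st == "disconnected"}
        if not phones:
            return False
        return len(silent) / len(phones) >= self.drop_threshold

detector = SilenceDetector(window_seconds=60, drop_threshold=0.6)
```

As the paper's evaluation notes, the window size matters: too small and normal churn looks like an incident, too large and detection is delayed.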

The second paper is a poster in the main conference that provides an overview of the Bioschemas project where we are identifying the Schema.org markup that is of primary importance for life science resources. Hopefully the paper title will pull the punters in for the session [2].
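
To give an idea of what embedding Schema.org markup looks like in practice: the exact properties Bioschemas recommends are defined by the project at bioschemas.org, so the following Python sketch only shows the general shape of a Schema.org `Dataset` description serialised as JSON-LD for embedding in a page; the names and URL are illustrative, not real Bioschemas output.

```python
import json

# Illustrative Schema.org Dataset markup; property choices and the URL
# are placeholders, not the Bioschemas recommendation itself.
dataset_markup = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example protein annotation dataset",
    "description": "Curated annotations for a set of proteins.",
    "url": "https://example.org/dataset/proteins",  # hypothetical URL
    "keywords": ["protein", "annotation", "life sciences"],
    "license": "https://creativecommons.org/licenses/by/4.0/",
}

# JSON-LD is typically embedded in a page inside a script tag.
html_snippet = '<script type="application/ld+json">{}</script>'.format(
    json.dumps(dataset_markup)
)
```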

[1] Qianru Zhou, Stephen McLaughlin, Alasdair J. G. Gray, Shangbin Wu, and Chengxiang Wang. Lost Silence: An emergency response early detection service through continuous processing of telecommunication data streams. In Web Stream Processing 2017, 2017.
[Bibtex]
@InProceedings{ZhouEtal2017:LostSilence:WSP2017,
  author    = {Qianru Zhou and Stephen McLaughlin and Alasdair J G Gray and Shangbin Wu and Chengxiang Wang},
  title     = {Lost Silence: An emergency response early detection service through continuous processing of telecommunication data streams},
  booktitle = {Web Stream Processing 2017},
  year      = {2017},
  abstract  = {Early detection of significant traumatic events, e.g. terrorist events, ship capsizes, is important to ensure that a prompt emergency response can occur. In the modern world telecommunication systems can and do play a key role in ensuring a successful emergency response by detecting such incidents through significant changes in calls and access to the networks. In this paper a methodology is illustrated to detect such incidents immediately (with the delay in the order of milliseconds), by processing semantically annotated streams of data in cellular telecommunication systems. In our methodology, live information of phones' positions and status are encoded as RDF streams. We propose an algorithm that processes streams of RDF annotated telecommunication data to detect abnormality. Our approach is exemplified in the context of capsize of a passenger cruise ship but is readily translatable to other incidents. Our evaluation results show that with properly chosen window size, such incidents can be detected effectively.}
}
[2] Alasdair J. G. Gray, Carole Goble, Rafael C. Jimenez, and The Bioschemas Community. Bioschemas: From Potato Salad to Protein Annotation. In ISWC 2017 Poster Proceedings, Vienna, Austria, 2017. Poster
[Bibtex]
@InProceedings{grayetal2017:bioschemas:iswc2017,
  author    = {Alasdair J G Gray and Carole Goble and Rafael C Jimenez and {The Bioschemas Community}},
  title     = {Bioschemas: From Potato Salad to Protein Annotation},
  booktitle = {ISWC 2017 Poster Proceedings},
  year      = {2017},
  address   = {Vienna, Austria},
  note      = {Poster},
  abstract  = {The life sciences have a wealth of data resources with a wide range of overlapping content. Key repositories, such as UniProt for protein data or Entrez Gene for gene data, are well known and their content easily discovered through search engines. However, there is a long-tail of bespoke datasets with important content that are not so prominent in search results. Building on the success of Schema.org for making a wide range of structured web content more discoverable and interpretable, e.g. food recipes, the Bioschemas community (http://bioschemas.org) aim to make life sciences datasets more findable by encouraging data providers to embed Schema.org markup in their resources.}
}

SICSA Digital Humanities Event

On 24 August I attended the SICSA Digital Humanities event hosted at Strathclyde University. The event was organised by Martin Halvey and Frank Hopfgartner. The event brought together cultural heritage practitioners, and researchers from the humanities and computer science.

The day started off with a keynote from Lorna Hughes, Professor of Digital Humanities at the University of Glasgow. She highlighted that there is not a single definition for digital humanities (weblink presents a random definition from a set collected at another event). However, at the core, digital humanities consists of:

  • Digital content
  • Digital methods
  • Tools

The purpose of digital humanities is to enable better and/or faster outputs as well as conceptualising new research questions.

Lorna showcased several projects that she has been involved with, highlighting the issues that were faced before identifying a set of lessons learned and challenges going forward (see her blog and slideshare). She highlighted that only about 10% of content has been transformed into a digital form, and of that only 3% is openly available. Additionally, some artefacts have been digitised in multiple ways at different time points, and the differences in these digital forms tell a story about the object.

Lorna highlighted the following challenges:

  • Enabling better understanding of digital content
  • Developing underlying digital infrastructure
  • Supporting the use of open content
  • Enabling the community
  • Working with born-digital content.

The second part of the day saw us brainstorming ideas in groups. Two potential apps were outlined to help the public get more out of the cultural heritage environment around us.

An interesting panel discussion followed, focused on what you would do with a mythical £350m. It also involved locking up 3D scanners, at least until appropriate methodology and metadata were made available.

The day finished off with a keynote from Daniela Petrelli, Sheffield Hallam University, focussing on the outputs of the EU meSch project. She proposed a holistic design approach to the visitor experience, encompassing interaction design, product design, and content design.

Summary

There are lots of opportunities for collaboration between digital humanities and computing. From my perspective, there are lots of interesting challenges around capturing dataset metadata, linking between datasets, and capturing the provenance of workflows.

Throughout the day, various participants were tweeting with the #dhfest hashtag.

DUCS not LOD

The following is an excerpt from a blog post by Keir Winesmith, Head of Digital at the San Francisco Museum of Modern Art (@SFMOMAlab).

Linked Open Data may sound good and noble, but it’s the wrong way around. It is a truth universally acknowledged, that an organization in possession of good Data, must want it Open (and indeed, Linked).

Well, I call bullshit. Most cultural heritage organizations (like most organizations) are terrible at data. And most of those who are good at collecting it, very rarely use it effectively or strategically.

Instead of Linked Open Data (LOD), Keir argues for DUCS:

I propose an alternative anagram, and an alternative order of importance.

  • D. Data. Step one, collect the data that is most likely to help you and your organization make better decisions in the future. For example collection breadth, depth, accuracy, completeness, diversity, and relationships between objects and creators.
  • U. Utilise. Actually use the data to inform your decisions, and test your hypotheses, within the bounds of your mission.
  • C. Context. Provide context for your data, both internally and externally. What’s inside? How is it represented? How complete is it? How accurate? How current? How was it gathered?
  • S. Share. Now you’re ready to share it! Share it with context. Share it with the communities that are included in it first, follow the cultural heritage strategy of “nothing about me, without me”. Reach out to the relevant students, scholars, teachers, artists, designers, anthropologists, technologists, and whomever could use it. Get behind it and keep it up to date.

I’m against LOD, if it doesn’t follow DUCS first.

If you’re going to do it, do it right.

Source: Against Linked Open Data – Keir Winesmith – Medium

BTL Surpass for online assessment in Computer Science


Over the last couple of years I have been leading the introduction of BTL’s Surpass online assessment platform for exams in Computer Science. I posted the requirements for an online exam system we agreed on a few months ago. I have now written up an evaluation case study: Use of BTL Surpass for online exams in Computer Science, an LTDI report (local copy). TL;DR: nothing is perfect, but Surpass did what we hoped, and it is planned to continue & expand its use.

My colleague Hans-Wolfgang has also presented on our experiences of “Enhancing the Learning Experience on Programming-focused Courses via Electronic Assessment Tools” at the Trends in Functional Programming in Education Conference, Canterbury, 19-21. This paper includes work by Sanusi Usman on using Surpass for formative assessment.

A fill-the-blanks style question for online exams in computer coding, showing a few lines of Java code with gaps for the student to complete. (Not from a real exam!)

The post BTL Surpass for online assessment in Computer Science appeared first on Sharing and learning.

Quick notes: Naomi Korn on copyright and educational resources


I gate-crashed a lecture on copyright that Naomi Korn gave at Edinburgh University. I’ve had an interest in copyright for as long as I have been working with open access and open educational resources, about ten years. I think I understand the basic concepts pretty well, but even so Naomi managed to catch a couple of misconceptions I held and also crystallised some ideas with well chosen examples.

Hand-drawn copyright symbol and the word ‘copyright’ in cursive script, from naomikorn.com.

First, a quick intro to Naomi. Naomi is a copyright consultant (but not a lawyer). I first met her through her work for UKOER, which I really liked because she gave us pragmatic advice that helped us release resources openly, not just a list of all the things we couldn’t do. Through that and other work Naomi & colleagues have created a set of really useful resources on copyright for OER (which are themselves openly licensed).

Naomi has also done some work with the Imperial War Museum, from which she drew the story of Ethel Bilborough’s First World War diary. It’s there on her website so I won’t repeat it here. The key lessons (to me) revolved around copyright existing from the moment of creation until 70 years after the author’s death; copyright is a property which can be inherited; ownership of the physical artefact does not necessarily mean ownership of the copyright; and composite works (the diary contained press cuttings and photos) create more complex problems with several rights holders. All of these (and the last one especially) are relevant to modern teaching and learning resources.

In general, copyright supports the copying and use of resources through permission from the rights owner (a licence) and various copyright exceptions. However, sometimes it is necessary to fall back on a pragmatic approach of taking a reasonable risk, for example when the rights owner is not traceable. Naomi described some interesting issues around the use of copyright resources in teaching and learning. For example, there are exceptions to copyright for criticism, review or quotation and for teaching purposes. However, these are limited in that such use must be fair dealing (I learnt this: that fair dealing/fair use is an additional limitation on an exception, not a type of exception). Fair dealing is undefined, and may not include putting materials online. Naomi described how easy it is for use of a resource under an exception to become an infringement in the context of modern teaching, as the private space of teaching becomes more public: for example, a resource used in a lecture which is videoed, and the video then made public. All the more reason to be careful in the first place; all the more reason to use liberal licences such as Creative Commons, which are not limited to a specific scenario.

copyright pragmatics

All the way through her talk Naomi encouraged us to think about copyright in terms of being respectful of other people: giving the credit due to resource creators. She left us with some key points of advice:

  • make sure that you know the basics
  • make sure you know who can help you
  • ask when you’re not sure

fun fact

For copyright purposes, software is classed as a literary work.

The post Quick notes: Naomi Korn on copyright and educational resources appeared first on Sharing and learning.

An ending, and a beginning


On 30 June 2017 I will be leaving my current employment at Heriot-Watt University. I aim to continue to support the use of technology to enhance learning as an independent consultant.

I first joined Heriot-Watt’s Institute for Computer Based Learning in 1996 on a six month secondment. I was impressed that ICBL was part of a large, well-supported Learning Technology Centre, which was acknowledged at that time as one of the leading centres for the use of technology in teaching and learning. You can get a sense of the scope of the LTC by looking at the staff list from around that time. Working with, and learning from, colleagues with this common interest was hugely appealing to me; so when I had the opportunity I re-joined ICBL in 1997, and this time I stayed.

In my time at Heriot-Watt I have been fortunate beyond belief to collaborate with people in ICBL and through work such as the Engineering Subject Centre (and other subject centres of the LTSN and then HE Academy), EEVL, Cetis and many Jisc projects. But things change. The LTC was dismantled. A reduced ICBL moved to be a part of the Computer part of the School of Mathematics and Computer Sciences (MACS). Funding became difficult, and while I greatly appreciate the huge effort made by several individuals which kept me in continuous employment, like many in similar roles I frequently felt my position was precarious.  I really enjoyed teaching Computer Science and Information Systems students, and I worked with some great people in MACS, but the work became more internally focused, isolated from current developments…not what I had joined ICBL for.

What next?

When Heriot-Watt announced that it planned to offer staff voluntary severance terms, I applied and was happy to be accepted. My professional interests remain the same: supporting the selection and use of appropriate learning resources; supporting the management and dissemination of learning resources; open education; sharing and learning. I do this through work on resource description, course description, and OER platforms, using technologies such as schema.org, LRMI and WordPress. I intend to continue working in these areas, as an independent consultant and with colleagues in Cetis LLP. Contact me if you think I can help you.

Exit: photograph of a person climbing a flight of stairs into the open, by flickr user Allen (https://www.flickr.com/photos/78139009@N03/), licensed CC:BY. Click image for original.

The post An ending, and a beginning appeared first on Sharing and learning.

An Identifier Scheme for the Digitising Scotland Project

The Digitising Scotland project is having the vital records of Scotland transcribed from images of the original handwritten civil registers. Linking the resulting dataset of 24 million vital records covering the lives of 18 million people is a major challenge requiring improved record linkage techniques. Discussions within the multidisciplinary, widely distributed Digitising Scotland project team have been hampered by the teams in each of the institutions using their own identification scheme. To enable fruitful discussions within the Digitising Scotland team, we required a mechanism for uniquely identifying each individual represented on the certificates. From the identifier it should be possible to determine the type of certificate and the role each person played. We have devised a protocol to generate a unique identifier for any individual on a certificate, without using a computer, by exploiting the National Records of Scotland’s registration districts. Importantly, the approach does not rely on the handwritten content of the certificates, which reduces the risk of the content being misread and resulting in an incorrect identifier. The resulting identifier scheme has improved the internal discussions within the project. This paper discusses the rationale behind the chosen identifier scheme, and presents the format of the different identifiers.
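
To illustrate the general idea only: the actual identifier format is defined in the paper, so the components, codes, and layout in this Python sketch are hypothetical. It shows how an identifier might be composed from a certificate type, a registration district, a year, an entry number, and the person's role, using nothing from the handwritten content.

```python
# Hypothetical identifier composition; the real Digitising Scotland format
# differs. Codes and layout here are invented for illustration.
CERT_TYPES = {"birth": "B", "death": "D", "marriage": "M"}
ROLES = {"child": "C", "mother": "M", "father": "F",
         "deceased": "D", "bride": "B", "groom": "G"}

def make_identifier(cert_type, district, year, entry, role):
    """Compose an identifier from printed register metadata only."""
    return "{}{}/{}/{}/{}".format(
        CERT_TYPES[cert_type], district, year, entry, ROLES[role]
    )

# e.g. the mother on entry 123 of a birth register in district 685, 1855.
example = make_identifier("birth", "685", 1855, 123, "mother")
```

Because every component comes from printed or pre-assigned metadata (certificate type, district number, entry number), the identifier can be generated by hand and is immune to misreadings of the handwriting.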

The work reported in the paper was supported by the British ESRC under grants ES/K00574X/1 (Digitising Scotland) and ES/L007487/1 (Administrative Data Research Centre – Scotland).

My coauthors are:

  • Özgür Akgün, University of St Andrews
  • Ahmad Alsadeeqi, Heriot-Watt University
  • Peter Christen, Australian National University
  • Tom Dalton, University of St Andrews
  • Alan Dearle, University of St Andrews
  • Chris Dibben, University of Edinburgh
  • Eilidh Garrett, University of Essex
  • Graham Kirby, University of St Andrews
  • Alice Reid, University of Cambridge
  • Lee Williamson, University of Edinburgh

The work reported in this talk is the result of the Digitising Scotland Raasay Retreat. Also at the retreat were:

  • Julia Jennings, University of Albany
  • Christine Jones
  • Diego Ramiro-Farinas, Centre for Human and Social Sciences (CCHS) of the Spanish National Research Council (CSIC)

CMALT – Another open portfolio


I’ve finally made a start on drafting my CMALT Portfolio (and so has Lorna,* we’re writing buddies), and in the interests of open practice I’m going to attempt to write the whole thing as an open Google doc before moving it here on my blog. I have a shared folder on Google Drive, Phil’s CMALT, where I’ll be building up my portfolio over the coming weeks. I’ve made a start drafting the first two Core Areas: Operational Issues and Learning, Teaching and Assessment. I’ll be adding more sections shortly, I hope. I’d love to have some feedback on my portfolio, so if you’ve got any thoughts, comments or guidance I’d be very grateful indeed. I’d also be very interested to know if anyone else has created their portfolio as an exercise in open practice, and if so, how they found the experience.

Wish me luck!

An open door and the word ‘Openness’, in ALT branding. CC BY @BryanMMathers for ALT.

*Credit: This post is totally copied from Lorna’s post, with slight adaptation, under the terms of the Creative Commons Attribution 3.0 Unported License she used.

More importantly, she let me copy her idea as well. That’s open practice which can’t be formally licensed, though I know that she is cool with it.

The post CMALT – Another open portfolio appeared first on Sharing and learning.

Seminar: PhD Progression Talks

A double bill of PhD progression talks (abstracts below):

Venue: 3.07 Earl Mountbatten Building, Heriot-Watt University, Edinburgh

Time and Date: 11:15, 8 May 2017

Evaluating Record Linkage Techniques

Ahmad Alsadeeqi

Many computer algorithms have been developed to automatically link historical records based on a variety of string matching techniques. These generate an assessment of how likely two records are to be the same. However, it remains unclear how to assess the quality of the linkages computed due to the absence of absolute knowledge of the correct linkage of real historical records – the ground truth. The creation of synthetically generated datasets for which the ground truth linkage is known helps with the assessment of linkage algorithms but the data generated is too clean to be representative of historical records.

We are interested in assessing data linkage algorithms under different data quality scenarios, e.g. with errors typically introduced by a transcription process or where books have been nibbled by mice. We are developing a data corruption model that injects corruptions into datasets based on given corruption methods and probabilities. We have classified the different forms of corruption found in historical records into four types based on the effect scope of the corruption. Those types are character level (e.g. an f is represented as an s – OCR corruption), attribute level (e.g. gender swap – male changed to female due to a false entry), record level (e.g. missing records due to reasons such as loss of a certificate), and group-of-records level (e.g. coffee spilt over a page, or parish records lost in a fire). This will give us the ability to evaluate record linkage algorithms over synthetically generated datasets with known ground truth and with data corruptions matching a given profile.
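
A corruption model along these lines can be sketched in a few lines of Python. This is a simplified illustration of the idea described above, not the project's actual model: the method names, record fields, and probability scheme are assumptions.

```python
import random

# Character-level corruption: swap a commonly OCR-confused pair (f -> s).
def ocr_confusion(value, rng):
    return value.replace("f", "s", 1)

# Attribute-level corruption: flip a gender attribute (false entry).
def gender_swap(value, rng):
    return {"male": "female", "female": "male"}.get(value, value)

def corrupt_record(record, methods, rng):
    """Apply each (field, probability, corruption_fn) triple stochastically."""
    out = dict(record)
    for field, prob, fn in methods:
        if rng.random() < prob:
            out[field] = fn(out[field], rng)
    return out

# Seeded RNG so a corruption run is reproducible against the ground truth.
rng = random.Random(42)
methods = [("forename", 0.5, ocr_confusion), ("gender", 0.1, gender_swap)]
record = {"forename": "fiona", "gender": "female"}
corrupted = corrupt_record(record, methods, rng)
```

Because the clean record is kept alongside the corrupted copy, the ground-truth linkage is known exactly, which is what makes evaluation of the linkage algorithms possible.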

Computer-Aided Biomimetics: Knowledge Extraction

Ruben Kruiper

Biologically inspired design concerns copying ideas from nature to various other domains, e.g. natural computing. Biomimetics is a sub-field of biologically inspired design and focuses specifically on solving technical/engineering problems. Because engineers lack biological knowledge, the process of biomimetics is non-trivial and remains adventitious. Therefore, computational tools have been developed that aim to support engineers during a biomimetics process by integrating large amounts of relevant biological knowledge. Existing tools apply NLP techniques to biological research papers to build dedicated knowledge bases. However, these existing tools impose an engineering view on biological data. I will talk about the support that ‘Computer-Aided Biomimetics’ tools should provide, introducing a theoretical basis for further research on the appropriate computational techniques.