Several datasets were created for the project "A Text Analytic Approach to Rural and Urban Legal Histories". The data are the output of the project described in the paper "Text Analysis of Aberdeen Burgh Records 1530-1531".
The data represent annotations on the transcribed text of a selection of Aberdeen Burgh Records from 1530-1531. The annotations cover both non-legal (e.g. named entities) and legal (e.g. crimes and legal actions) terminology and relations as found in the text. The data were created with the General Architecture for Text Engineering (GATE) tool, along with some custom scripts, by a combination of manual and automated means. The data can also be explored using GATE or other compatible text analytic tools.
The purposes of making the data available are to: present the output of the project; demonstrate the effectiveness of the toolchain; serve as a proof of concept for text analytic approaches to legal, historical materials; make the data available for exploration and analysis; and facilitate augmenting the data with further information.
The data are:
- GATE Standoff XML (.xml) - a standoff XML version of the project output. The data contain the full range of annotations on the text. Importing this file into GATE will enable viewing of the texts and all their annotations.
- GATE RDF Triples (.rdf) - RDF triples generated from the GATE Standoff XML file, i.e. RDF triples for the annotated text. The RDF can be queried using SPARQL, allowing the user to perform complex conceptual searches over the corpus.
- OWL Ontology (.owl) - an ontology for a subdomain (i.e. shipping and church buildings) of the GATE analysis. An ontology allows us to enrich the annotations on the text beyond the explicit textual information.
- GATE Inline XML (.xml) - the inline GATE XML file for the project analysis. It presents the same data as the GATE Standoff XML file, but in a different format. The inline XML can also be imported and viewed in GATE. Inline XML is useful for viewing and presenting fragments of the annotated text without the GATE tool.
- GATE Inline XML Schema (.xml) - a schema for the Inline XML file that can be useful for formatting.
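For readers who want to inspect the standoff annotations without opening GATE, the following is a minimal Python sketch using only the standard library. It assumes the file follows the usual GATE document XML layout: a `TextWithNodes` element holding the text interleaved with `Node` markers, and `AnnotationSet` elements holding `Annotation` entries that point at those nodes. The embedded sample document, its annotation types (`Person`, `Crime`), the `role` feature, and the function name `read_annotations` are invented for illustration and are not taken from the project data.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document in the assumed GATE standoff layout.
# The text and annotations are invented, not drawn from the corpus.
SAMPLE = """<GateDocument>
<TextWithNodes><Node id="0"/>Alice<Node id="5"/> accused Bob of <Node id="21"/>theft<Node id="26"/>.</TextWithNodes>
<AnnotationSet>
<Annotation Id="1" Type="Person" StartNode="0" EndNode="5">
<Feature><Name className="java.lang.String">role</Name><Value className="java.lang.String">accuser</Value></Feature>
</Annotation>
<Annotation Id="2" Type="Crime" StartNode="21" EndNode="26"/>
</AnnotationSet>
</GateDocument>"""


def read_annotations(xml_text):
    """Return a list of dicts, one per annotation, with the covered text."""
    root = ET.fromstring(xml_text)
    text_el = root.find("TextWithNodes")

    # Rebuild the plain text and record the character offset of each node,
    # computed from its position rather than assumed from its id.
    parts = [text_el.text or ""]
    offsets = {}
    position = len(parts[0])
    for node in text_el:
        offsets[node.get("id")] = position
        tail = node.tail or ""
        parts.append(tail)
        position += len(tail)
    text = "".join(parts)

    annotations = []
    for ann_set in root.iter("AnnotationSet"):
        for ann in ann_set.iter("Annotation"):
            start = offsets[ann.get("StartNode")]
            end = offsets[ann.get("EndNode")]
            features = {
                feat.findtext("Name"): feat.findtext("Value")
                for feat in ann.findall("Feature")
            }
            annotations.append({
                "set": ann_set.get("Name"),  # None for the default set
                "type": ann.get("Type"),
                "text": text[start:end],
                "features": features,
            })
    return annotations


if __name__ == "__main__":
    for ann in read_annotations(SAMPLE):
        print(ann["type"], repr(ann["text"]), ann["features"])
```

To apply the same function to the published file, one could read it with `open(path, encoding="utf-8").read()` and pass the result to `read_annotations`, provided the file matches the layout assumed above.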