Jena Stable Turtle output plugin
This repository defines new RDF writers for Jena that produce Turtle output always sorted in the same way. It was developed to reduce diff noise when the data is stored in a git repository, but we are confident there are plenty of other use cases where it will be useful.
The repository contains two writers, for the Turtle and TriG formats.
Changes from the stock turtle output
Sorting some particular cases
Some cases always require arbitrary decisions. We made the following choices when sorting objects:
- first URIs (sorted) then literals (sorted) then blank nodes
- first rdf:langStrings then xsd:strings then numbers then everything else, sorted by type URI then value
- rdf:langStrings are sorted by lang then value, in the root Unicode collator (not in the locale corresponding to the language)
- numbers are sorted first by value then by type URI ("+1"^^xsd:integer < "1"^^xsd:integer < "+1"^^xsd:nonNegativeInteger < "1.2"^^xsd:float < "2"^^xsd:integer)
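The literal ordering rules above can be sketched in plain Java, without any Jena dependency. This is only an illustration under stated assumptions, not the plugin's implementation: the class LiteralOrderSketch, the Lit type and the short list of numeric datatypes are hypothetical, and the final tie-break on the lexical form (which makes "+1" sort before "1" for the same type) is inferred from the example ordering rather than stated in the text.

```java
import java.math.BigDecimal;
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustrative sketch only; these are not the plugin's classes.
class LiteralOrderSketch {
    static final String XSD = "http://www.w3.org/2001/XMLSchema#";
    static final String LANG_STRING =
        "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString";
    // Assumption: a small subset of numeric datatypes, enough for the demo.
    static final List<String> NUMERIC = Arrays.asList(
        XSD + "integer", XSD + "nonNegativeInteger", XSD + "float", XSD + "decimal");
    // Root collator: ordering does not depend on the literal's language.
    static final Collator ROOT = Collator.getInstance(Locale.ROOT);

    /** Hypothetical stand-in for an RDF literal; lang is "" when absent. */
    static final class Lit {
        final String lexical, datatype, lang;
        Lit(String lexical, String datatype, String lang) {
            this.lexical = lexical; this.datatype = datatype; this.lang = lang;
        }
        @Override public String toString() {
            String suffix = lang.isEmpty() ? "^^" + datatype.replace(XSD, "xsd:") : "@" + lang;
            return "\"" + lexical + "\"" + suffix;
        }
    }

    // rdf:langString first, then xsd:string, then numbers, then everything else.
    static int rank(Lit l) {
        if (l.datatype.equals(LANG_STRING)) return 0;
        if (l.datatype.equals(XSD + "string")) return 1;
        if (NUMERIC.contains(l.datatype)) return 2;
        return 3;
    }

    static int compare(Lit a, Lit b) {
        int c = Integer.compare(rank(a), rank(b));
        if (c != 0) return c;
        switch (rank(a)) {
        case 0: // language tag, then value in the root collator
            c = a.lang.compareTo(b.lang);
            return c != 0 ? c : ROOT.compare(a.lexical, b.lexical);
        case 2: // numeric value, then type URI, then lexical form (assumed tie-break)
            // BigDecimal cannot parse INF or NaN; a real writer must handle those.
            c = new BigDecimal(a.lexical).compareTo(new BigDecimal(b.lexical));
            if (c != 0) return c;
            c = a.datatype.compareTo(b.datatype);
            return c != 0 ? c : a.lexical.compareTo(b.lexical);
        default: // type URI, then value (code point order is an assumption)
            c = a.datatype.compareTo(b.datatype);
            return c != 0 ? c : a.lexical.compareTo(b.lexical);
        }
    }

    /** Returns the literals in sorted order, rendered as strings. */
    static List<String> sorted(List<Lit> in) {
        List<Lit> copy = new ArrayList<>(in);
        copy.sort(LiteralOrderSketch::compare);
        List<String> out = new ArrayList<>();
        for (Lit l : copy) out.add(l.toString());
        return out;
    }

    public static void main(String[] args) {
        sorted(Arrays.asList(
            new Lit("2", XSD + "integer", ""),
            new Lit("Banana", LANG_STRING, "en"),
            new Lit("1.2", XSD + "float", ""),
            new Lit("+1", XSD + "nonNegativeInteger", ""),
            new Lit("1", XSD + "integer", ""),
            new Lit("apple", LANG_STRING, "en"),
            new Lit("+1", XSD + "integer", ""))).forEach(System.out::println);
    }
}
```

With the root collator, "apple"@en sorts before "Banana"@en, whereas plain code point comparison would put the capitalized value first.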
Installation
Using Maven:
<dependency>
<groupId>io.bdrc</groupId>
<artifactId>jena-stable-turtle</artifactId>
<version>0.7.2</version>
</dependency>
Build and deploy:
mvn clean package
mvn deploy -DperformRelease=true
Then go to https://oss.sonatype.org/ to close and release the staging repository.
Use
// register the STTL writer
Lang sttl = STTLWriter.registerWriter();
// build a map of namespace priorities
SortedMap<String, Integer> nsPrio = ComparePredicates.getDefaultNSPriorities();
nsPrio.put(SKOS.getURI(), 1);
nsPrio.put("http://purl.bdrc.io/ontology/admin/", 5);
nsPrio.put("http://purl.bdrc.io/ontology/toberemoved/", 6);
// build a list of predicates URIs to be used (in order) for blank node comparison
List<String> predicatesPrio = CompareComplex.getDefaultPropUris();
predicatesPrio.add("http://purl.bdrc.io/ontology/admin/logWhen");
predicatesPrio.add("http://purl.bdrc.io/ontology/onOrAbout");
predicatesPrio.add("http://purl.bdrc.io/ontology/noteText");
// pass the values through a Context object
Context ctx = new Context();
ctx.set(Symbol.create(STTLWriter.SYMBOLS_NS + "nsPriorities"), nsPrio);
ctx.set(Symbol.create(STTLWriter.SYMBOLS_NS + "nsDefaultPriority"), 2);
ctx.set(Symbol.create(STTLWriter.SYMBOLS_NS + "complexPredicatesPriorities"), predicatesPrio);
// the base indentation, defaults to 4
ctx.set(Symbol.create(STTLWriter.SYMBOLS_NS + "nsBaseIndent"), 4);
// the minimal predicate width, defaults to 14
ctx.set(Symbol.create(STTLWriter.SYMBOLS_NS + "predicateBaseWidth"), 14);
Graph g = ... ; // fetch the graph you want to write
RDFWriter w = RDFWriter.create().source(g).context(ctx).lang(sttl).build();
w.output( ... ); // write somewhere
Note that for TriG ordering, you must use the same context namespace as for Turtle: STTLWriter.SYMBOLS_NS.
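To give an intuition for the nsPriorities and nsDefaultPriority settings, here is a minimal plain-Java sketch of one way a namespace priority map could order predicate URIs. It assumes that lower numbers sort first and that ties are broken by URI, neither of which is stated here; the class NsPrioritySketch and its methods are hypothetical and are not part of the plugin.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; the plugin's real comparator may differ.
class NsPrioritySketch {
    /** Priority of the longest registered namespace prefixing the URI, else the default. */
    static int priority(String uri, Map<String, Integer> nsPrio, int defaultPrio) {
        String best = null;
        for (String ns : nsPrio.keySet()) {
            if (uri.startsWith(ns) && (best == null || ns.length() > best.length())) {
                best = ns;
            }
        }
        return best == null ? defaultPrio : nsPrio.get(best);
    }

    /** Sorts by namespace priority (assumed: lower first), then by URI. */
    static List<String> sortPredicates(List<String> uris, Map<String, Integer> nsPrio,
                                       int defaultPrio) {
        List<String> out = new ArrayList<>(uris);
        out.sort((a, b) -> {
            int c = Integer.compare(priority(a, nsPrio, defaultPrio),
                                    priority(b, nsPrio, defaultPrio));
            return c != 0 ? c : a.compareTo(b);
        });
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> nsPrio = new TreeMap<>();
        nsPrio.put("http://www.w3.org/2004/02/skos/core#", 1);
        nsPrio.put("http://purl.bdrc.io/ontology/admin/", 5);
        // Default priority 2 applies to URIs outside any registered namespace.
        sortPredicates(Arrays.asList(
            "http://purl.bdrc.io/ontology/admin/logWhen",
            "http://purl.bdrc.io/ontology/onOrAbout",
            "http://www.w3.org/2004/02/skos/core#prefLabel"), nsPrio, 2)
            .forEach(System.out::println);
    }
}
```

Under these assumptions the SKOS predicate comes first (priority 1), then onOrAbout (default priority 2), then the admin predicate (priority 5).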
License
All the code in this repository is under the Apache 2.0 License.
The original parts are Copyright © 2017-2019 Buddhist Digital Resource Center, and the files TurtleShell.java (coming from the Jena repository) and TriGShell.java (extracted from this file) are Copyright © 2011-2017 Apache Software Foundation (ASF), see NOTICE for more information.