sssom-js

0.4.3 • Public • Published

Simple Standard for Sharing Ontology Mappings (SSSOM) JavaScript library

This Node package provides methods and a command line client to process mappings in SSSOM format.

It parses the SSSOM variants TSV, CSV and JSON, validates them, and transforms them to multiple formats, including JSKOS and RDF.

Install

Requires Node.js >= 20.19.

npm install sssom-js

For RDF export in the command line client also install jsonld2rdf.

npm install jsonld2rdf

The web interface can be deployed on any web server by copying the docs/ directory of the source code repository.

Usage

Command line

The package includes a command line client to parse and convert SSSOM. Usage and options:

sssom [options] [<mappings-file> [<metadata-file>]]

| short | long | argument | description |
|-------|------|----------|-------------|
| -f | --from | format | input format (csv, tsv, json) |
| -t | --to | format | output format (json, ndjson, jskos, ndjskos, nq, nt, ttl) |
| -o | --output | file | output filename (default: - for stdout) |
| -p | --propagate | | add propagatable slots to mappings |
| -b | --liberal | | parse less strictly than the specification |
| -c | --curie | file | additional CURIE map (JSON or YAML file) |
| -s | --schemes | file | JSKOS concept schemes to detect |
| -m | --mappings | | emit mappings only |
| -v | --verbose | | emit errors verbosely |
| -j | --json-errors | | emit detailed errors in JSON |
| -h | --help | | emit usage information |
| -V | --version | | emit the version number |

### Web interface

A web interface to validate and transform SSSOM/TSV is available at <https://gbv.github.io/sssom-js/>. The application is not included in the package release on npm.

### API

~~~js
import { parseSSSOM, TSVReader, toJskosRegistry, toJskosMapping } from "sssom-js"
~~~

parseSSSOM (input, options)

This asynchronous function parses SSSOM in an input format from a stream or file and returns a mapping set on success. The result can be serialized directly as SSSOM/JSON (or as JSKOS if the option "to" is set to "jskos").

import { parseSSSOM } from "sssom-js"
const { mappings, ...metadata } = await parseSSSOM(process.stdin)

A falsy input value skips processing of mappings, so only the mapping set metadata is returned:

const metadata = await parseSSSOM(false, { metadata: "metadata.sssom.yaml" })

See below for a description of common options. Additional options are:

  • metadataHandler (function) called for parsed metadata
  • mappingHandler (function) called for each parsed mapping

parseSSSOMString (input, options)

This is a utility function to parse SSSOM from a string. An equivalent implementation in Node.js:

import { Readable } from "stream"
const parseSSSOMString = (input, options = {}) => parseSSSOM(Readable.from(input), options)

TSVReader

This event emitter parses SSSOM/TSV from a stream and emits metadata and mapping events:

import fs from "fs"
import { TSVReader } from "sssom-js"

const input = fs.createReadStream("test/valid/minimal.sssom.tsv")
new TSVReader(input)
  .on("metadata", console.log)
  .on("mapping", console.log)
  .on("error", console.error)
  .on("end", console.log)

new TSVReader(input, { delimiter: "," }) // parse SSSOM/CSV

The following parsing options can be passed as an optional second argument object:

  • metadata (must be an object)
  • curie (must be an object)
  • propagate (boolean)
  • liberal (boolean)
  • delimiter (string)
  • storeMappings (boolean) whether to store parsed mappings and include them in the result

toJskosRegistry

Convert a parsed MappingSet to a JSKOS Registry object.

toJskosMapping

Convert a parsed Mapping to a JSKOS Concept Mapping object.

Options

The following options are supported by both the command line client and the API:

propagate

Enables propagation of mapping set slots. False by default.
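
As a rough illustration of what propagation does, the following sketch copies slots from a mapping set to each of its mappings. The function name propagateSlots and the shortened slot list are assumptions made for this example and not the library's implementation. In line with the behaviour noted under Limitations, values already present on a mapping are overwritten.

```javascript
// Hypothetical subset of propagatable slots, picked from the
// "Propagatable slots" table of this document.
const PROPAGATABLE = ["mapping_date", "mapping_provider", "mapping_tool", "subject_source", "object_source"]

// Copy propagatable slots from the mapping set to every mapping.
// Existing mapping values are overwritten; the mapping set itself is kept as is.
function propagateSlots(mappingSet) {
  const { mappings = [], ...metadata } = mappingSet
  const inherited = {}
  for (const slot of PROPAGATABLE) {
    if (slot in metadata) inherited[slot] = metadata[slot]
  }
  return { ...metadata, mappings: mappings.map(m => ({ ...m, ...inherited })) }
}

const result = propagateSlots({
  mapping_set_id: "https://example.org/mappings",
  mapping_tool: "example-tool",
  mappings: [
    { subject_id: "ex:a", predicate_id: "skos:exactMatch", object_id: "ex:b" },
    { subject_id: "ex:c", predicate_id: "skos:exactMatch", object_id: "ex:d", mapping_tool: "other-tool" },
  ],
})
console.log(result.mappings.map(m => m.mapping_tool)) // [ 'example-tool', 'example-tool' ]
```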

liberal

Enabling liberal parsing will

  • allow empty mappings block in SSSOM/TSV (but still read and validate the metadata block)
  • not require mapping set slots (neither mapping_set_id nor license) so the metadata block can be empty
  • not require mapping slot mapping_justification
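
The difference between strict and liberal parsing can be pictured with a small check for the slots named in the list above. The helper missingSlots is hypothetical and only covers these three slots; the library's actual validation is more extensive.

```javascript
// Slots that the list above names as required in strict mode only.
const STRICT_SET_SLOTS = ["mapping_set_id", "license"]
const STRICT_MAPPING_SLOTS = ["mapping_justification"]

// Hypothetical helper: list the slots whose absence would be reported.
// Other checks (for example on subject_id) are out of scope for this sketch.
function missingSlots(metadata, mapping, { liberal = false } = {}) {
  if (liberal) return []
  return [
    ...STRICT_SET_SLOTS.filter(slot => !(slot in metadata)),
    ...STRICT_MAPPING_SLOTS.filter(slot => !(slot in mapping)),
  ]
}

const mapping = { subject_id: "ex:a", predicate_id: "skos:exactMatch", object_id: "ex:b" }
console.log(missingSlots({}, mapping))                    // all three slots are reported
console.log(missingSlots({}, mapping, { liberal: true })) // nothing is reported
```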

curie

To allow all CURIE prefixes from Bioregistry without explicitly defining them in curie_map, download and convert the current list, for instance with the command line tools curl and jq (this requires a local copy of the file bioregistry.jq). Then reference the resulting file bioregistry.json with option --curie:

curl -sL https://w3id.org/biopragmatics/bioregistry.epm.json | \
jq -Sf bioregistry.jq > bioregistry.json
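
To show what a CURIE map is used for, here is a minimal sketch of CURIE expansion. It assumes the map is a plain object from prefix to namespace URI, as in curie_map; the function expandCurie and the example.org namespace are made up for this example.

```javascript
// Expand a CURIE such as "skos:exactMatch" with a prefix map.
// Returns undefined when there is no prefix or the prefix is unknown.
function expandCurie(curie, curieMap) {
  const pos = curie.indexOf(":")
  if (pos < 0) return undefined
  const prefix = curie.slice(0, pos)
  if (!Object.hasOwn(curieMap, prefix)) return undefined
  return curieMap[prefix] + curie.slice(pos + 1)
}

// Example map in the shape of an SSSOM curie_map (prefix -> namespace URI).
const curieMap = {
  skos: "http://www.w3.org/2004/02/skos/core#",
  ex: "https://example.org/",
}

console.log(expandCurie("skos:exactMatch", curieMap)) // http://www.w3.org/2004/02/skos/core#exactMatch
console.log(expandCurie("unknown:1", curieMap))       // undefined
```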

schemes

JSKOS Concept Schemes to detect when transforming to JSKOS.

mappings

Emit mappings only. Metadata is parsed and validated nevertheless.

metadata

Mapping set metadata file in JSON or YAML format for external metadata mode. It is passed as the second argument in the command line client or as a named option in the API. The API also accepts a parsed object.

Validation errors

Validation errors are objects with three fields:

  • message an error message
  • value an optional value that caused the error
  • position an optional object mapping locator types to error locations. The following locator types are used:
    • line: a line number (given as string)
    • jsonpointer: a JSON Pointer to the malformed YAML or JSON element
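
A consumer of these error objects might render them as follows. This is only a sketch: the helper formatError and the example errors are invented, and only the three fields listed above are assumed.

```javascript
// Hypothetical helper: render a validation error as one line of text.
function formatError({ message, value, position = {} }) {
  const where = Object.entries(position)
    .map(([type, location]) => `${type} ${location}`)
    .join(", ")
  const parts = [message]
  if (value !== undefined) parts.push(`(value: ${JSON.stringify(value)})`)
  if (where) parts.push(`at ${where}`)
  return parts.join(" ")
}

// Example error objects with made-up content, following the field list above.
const lineError = { message: "Invalid URI", value: "not a uri", position: { line: "7" } }
const pointerError = { message: "Missing slot", position: { jsonpointer: "/license" } }

console.log(formatError(lineError))    // Invalid URI (value: "not a uri") at line 7
console.log(formatError(pointerError)) // Missing slot at jsonpointer /license
```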

Formats

The input format and output format of the mappings, each given as a string, can be specified via the command line options from and to, and in the web interface. The following formats are supported so far:

| format | description | from | to | API |
|--------|-------------|------|----|-----|
| tsv | SSSOM/TSV | yes | - | yes |
| csv | SSSOM/CSV | yes | - | yes |
| json | SSSOM/JSON/JSON-LD | yes | yes | yes |
| ndjson | metadata and mappings on individual lines (SSSOM/JSON) | - | yes | - |
| jskos | JSKOS | - | yes | yes |
| ndjskos | metadata and mappings on individual lines (JSKOS) | - | yes | - |
| nq | NQuads of raw mappings | - | yes | - |
| nt | NTriples | - | yes (requires jsonld2rdf) | - |
| ttl | RDF/Turtle | - | yes (requires jsonld2rdf) | - |

If not specified, formats are guessed from the file name, with fallback to tsv (from) and ndjson (to).
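
The guessing rule can be sketched like this. The helper guessFormat is hypothetical and assumes that the last file extension must equal a format name; the client's actual rule may differ in detail.

```javascript
// Format names as listed for the command line options --from and --to.
const INPUT_FORMATS = ["csv", "tsv", "json"]
const OUTPUT_FORMATS = ["json", "ndjson", "jskos", "ndjskos", "nq", "nt", "ttl"]

// Hypothetical helper: use the last file extension as format name if it is
// a known format, otherwise fall back to the documented default.
function guessFormat(filename, direction) {
  const known = direction === "from" ? INPUT_FORMATS : OUTPUT_FORMATS
  const fallback = direction === "from" ? "tsv" : "ndjson"
  const extension = (filename || "").split(".").pop().toLowerCase()
  return known.includes(extension) ? extension : fallback
}

console.log(guessFormat("mappings.sssom.csv", "from")) // csv
console.log(guessFormat("mappings.txt", "from"))       // tsv
console.log(guessFormat("out.ttl", "to"))              // ttl
console.log(guessFormat("-", "to"))                    // ndjson
```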

Formats json, jskos, nt, and ttl require loading the full input into memory for processing; the other formats support streaming.
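
Line-based output such as ndjson lends itself to streaming because each record can be written as soon as it is available. The following sketch illustrates the layout; the generator toNdjsonLines is hypothetical, and putting the metadata on the first line is an assumption.

```javascript
// Hypothetical helper: yield the metadata first, then one line per mapping.
function* toNdjsonLines({ mappings = [], ...metadata }) {
  yield JSON.stringify(metadata)
  for (const mapping of mappings) {
    yield JSON.stringify(mapping)
  }
}

const mappingSet = {
  mapping_set_id: "https://example.org/mappings",
  license: "https://creativecommons.org/publicdomain/zero/1.0/",
  mappings: [
    { subject_id: "ex:a", predicate_id: "skos:exactMatch", object_id: "ex:b" },
    { subject_id: "ex:c", predicate_id: "skos:closeMatch", object_id: "ex:d" },
  ],
}

const lines = Array.from(toNdjsonLines(mappingSet))
console.log(lines.join("\n"))
```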

NQuads format (nq) is limited to the raw mapping statements: metadata and all slots other than subject_id, predicate_id, object_id, and the optional mapping_set_id are omitted. Combine with option -m, --mappings to omit the latter as well, resulting in NTriples format of raw mappings.
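
A raw mapping statement can be pictured as follows. The helper toStatement is hypothetical; it assumes that all identifiers are already expanded to absolute IRIs and that mapping_set_id takes the graph position of the quad.

```javascript
// Hypothetical helper: serialize one raw mapping as an N-Quads statement,
// or as an N-Triples statement when no mapping_set_id is given.
function toStatement({ subject_id, predicate_id, object_id }, mapping_set_id) {
  const terms = [subject_id, predicate_id, object_id]
  if (mapping_set_id) terms.push(mapping_set_id)
  return terms.map(iri => `<${iri}>`).join(" ") + " ."
}

const mapping = {
  subject_id: "https://example.org/a",
  predicate_id: "http://www.w3.org/2004/02/skos/core#exactMatch",
  object_id: "https://example.org/b",
}

console.log(toStatement(mapping, "https://example.org/mappings")) // quad with graph name
console.log(toStatement(mapping))                                 // triple only
```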

JSKOS

The JSKOS data format is used in terminology applications for controlled vocabularies and their mappings.

The following correspondence between SSSOM and JSKOS has not been fully implemented yet. Some JSKOS fields are only available as of version 0.7.0 of the JSKOS specification.

Common slots

| SSSOM slot | JSKOS field |
|------------|-------------|
| comment | note.und[] |
| creator_id | contributor[].uri |
| creator_label | contributor[].prefLabel.und |
| publication_date | published |
| see_also | ? |
| other | - |

Propagatable slots

| SSSOM slot | JSKOS field |
|------------|-------------|
| mapping_date | created |
| mapping_provider | publisher[].url |
| mapping_tool | tool[].prefLabel.und (0.7.0) |
| mapping_tool_version | tool[].version (0.7.0) |
| object_source | to.memberSet[].inScheme[].uri |
| object_source_version | to.memberSet[].inScheme[].version (0.7.0) |
| object_type | to.memberSet[].type (URI, limited list) |
| subject_source | from.memberSet[].inScheme |
| subject_source_version | from.memberSet[].inScheme[].version (0.7.0) |
| subject_type | from.memberSet[].type (URI, limited list) |
| predicate_type | - |
| object_match_field | - (see #152) |
| object_preprocessing | - (see #152) |
| subject_match_field | - (see #152) |
| subject_preprocessing | - (see #152) |
| similarity_measure | - (see #152) |

Mapping set slots

| SSSOM slot | JSKOS field |
|------------|-------------|
| curie_map | - |
| license | license.uri |
| mappings | mappings (of a registry or concordance) |
| mapping_set_id | uri |
| mapping_set_version | version (0.7.0) |
| mapping_set_source | source |
| mapping_set_title | prefLabel.und |
| mapping_set_description | definition |
| issue_tracker | issueTracker (0.7.0) |
| predicate_label | - |
| extension_definitions | - |
Mapping slots

| SSSOM slot | JSKOS field |
|------------|-------------|
| mapping_id | uri |
| subject_id | from.memberSet[].uri |
| subject_label | from.memberSet[].prefLabel |
| subject_category | - |
| predicate_id | type |
| predicate_label | - (implied by type) |
| object_id | to.memberSet[].uri |
| object_label | to.memberSet[].prefLabel |
| object_category | - |
| mapping_justification | justification (0.7.0) |
| author_id | creator[].uri |
| author_label | creator[].prefLabel |
| reviewer_id | annotations[].creator.id |
| reviewer_label | annotations[].creator.name |
| mapping_source | source |
| confidence | mappingRelevance |
| curation_rule | guidelines (0.7.0) |
| curation_rule_text | guidelines[].prefLabel (0.7.0) |
| issue_tracker_item | issue (0.7.0) |
| license | - (only for mapping sets) |
| predicate_modifier | - |
| mapping_cardinality | - |
| match_string | - (see #152) |
| similarity_score | - (see #152) |
Limitations

This library follows the SSSOM specification as closely as possible, but it does not aim to be a fully compliant implementation. The latter would also require compliance with LinkML, a specification much more complex than needed for SSSOM and not yet fully implemented in JavaScript. In particular:

  • All slots of type Uri must be absolute URIs as defined in RFC 3986
  • Literal Mappings are not supported
  • Non-standard slots are not supported:
    • mapping set slot extension_definitions is ignored
    • mapping set slot other is read and validated but not used
  • SSSOM/JSON, the JSON serialization of SSSOM, has not been specified yet, so it may differ from the JSON(-LD) format used in this library
  • Transformation to RDF lacks creator_label and author_label
  • Propagation silently overwrites existing mapping slots instead of raising an error
  • There is an additional non-SSSOM mapping slot mapping_id. Uniqueness is not checked.

Survey

The survey directory contains a survey of published SSSOM data with validation results. See the dev branch for the most recent update.

Maintainers

Contribute

Contributions are welcome! Please use the issue tracker for questions, bug reports, and feature requests.

License

MIT license

Collaborators

  • nichtich