Introducing Open Referral’s data transformation toolkit

See Open Knowledge’s blog post about this project here. Our thanks to the Frictionless Data Fund!

We’re excited to introduce a set of tools that make it easier to standardize resource directory data.

Community resource directory data (i.e., information about health, human, and social services available to people in need) is deceptively complex. In order to accurately represent the relationships between organizations, the services they provide, and the locations where those services are offered, Open Referral’s Human Service Data Specification (HSDS) calls for multiple tables, linked together, which can be challenging to work with. HSDS addresses this challenge by calling for a set of CSV files to be bundled together with a ‘data package’ that specifies the tables’ contents and relationships in a single machine-readable file.

Image: the ‘datapackage’ concept, explained. (See more at https://frictionlessdata.io/) For HSDS, the boxes at the left are ‘organizations,’ ‘services,’ ‘locations,’ and associated entities.
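To make the idea concrete, here is a minimal sketch of what such a descriptor might look like when built in Python. The resource names, paths, and field lists below are abbreviated and illustrative assumptions, not the full HSDS schema; the keys themselves (resources, schema, fields, primaryKey, foreignKeys) follow the Frictionless Data Package and Table Schema conventions.

```python
import json


def table(name, fields, foreign_keys=()):
    """Describe one CSV table as a Data Package resource.

    `foreign_keys` is a sequence of (local_field, target_resource) pairs;
    every target is assumed to be keyed on a column named "id".
    """
    schema = {
        # All fields are typed as strings here for brevity (an assumption).
        "fields": [{"name": field, "type": "string"} for field in fields],
        "primaryKey": "id",
    }
    if foreign_keys:
        schema["foreignKeys"] = [
            {"fields": field, "reference": {"resource": target, "fields": "id"}}
            for field, target in foreign_keys
        ]
    return {"name": name, "path": f"data/{name}.csv", "schema": schema}


# Hypothetical, heavily abbreviated descriptor: three linked tables.
descriptor = {
    "name": "example-hsds-directory",
    "resources": [
        table("organization", ["id", "name", "description"]),
        table("service", ["id", "organization_id", "name"],
              [("organization_id", "organization")]),
        table("location", ["id", "organization_id", "name"],
              [("organization_id", "organization")]),
    ],
}

# This JSON text is what would be saved as datapackage.json.
print(json.dumps(descriptor, indent=2))
```

The point of the sketch is that the relationships between tables live in one machine-readable file, so a consuming application does not have to guess how the CSVs join together.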

Few of the members of our community, however, were familiar with datapackages and how to create them. So we’ve now made this complex approach to resource data sharing easier to work with!

The Open Knowledge Foundation developed this standard format for JSON datapackages, and supports various initiatives in its communities to develop new tooling and functionality that use datapackages. With their support, we’ve now upgraded our specification and associated tools to make it easier to produce, share, and read standardized resource data in the HSDS format, datapackage and all.

This upgrade includes a suite of open source data management tools that facilitate standardized resource data production, transformation, validation, and publication. These modules are freely accessible in our GitHub repository:

  • The HSDS Transformer enables data administrators to transform any resource data source into an HSDS-compliant data package (assuming that the source includes fields covering the core elements of organizations, services, and locations).
  • The HSDS Validator enables data administrators to test the integrity of an HSDS dataset, and ensure compliance with our spec. The HSDS Validator offers adopters an easy tool with which to prepare data for loading into any application that recognizes the HSDS format.
  • HSDS Zip is the compressed, single-file form of a complete HSDS-compliant datapackage. This improves portability by enabling the transfer of human services data as one file rather than a set of multiple files. The ZIP should include a datapackage.json file that describes the data included and a directory named “data” that contains the HSDS-compliant CSVs.
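
As a rough illustration of the HSDS Zip layout described above, the following sketch uses only Python’s standard library to bundle a datapackage.json file and a “data” directory of CSVs into one archive, then checks that the layout is present. The function names, the sample table, and the tiny descriptor are hypothetical; this is not the toolkit’s own code, and the check covers only the file layout, not HSDS compliance.

```python
import io
import json
import zipfile


def build_hsds_zip(descriptor, tables):
    """Return ZIP bytes holding datapackage.json plus data/<name>.csv files.

    `tables` maps a table name to its CSV content as text.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("datapackage.json", json.dumps(descriptor, indent=2))
        for name, csv_text in tables.items():
            archive.writestr(f"data/{name}.csv", csv_text)
    return buffer.getvalue()


def check_hsds_zip(zip_bytes):
    """Return a list of layout problems; an empty list means the layout looks right."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = archive.namelist()
    if "datapackage.json" not in names:
        problems.append("missing datapackage.json")
    if not any(n.startswith("data/") and n.endswith(".csv") for n in names):
        problems.append("no CSV files under data/")
    return problems


# Hypothetical sample content, for illustration only.
sample_tables = {"organization": "id,name\norg-1,Example Food Bank\n"}
sample_descriptor = {
    "name": "example-hsds-directory",
    "resources": [{"name": "organization", "path": "data/organization.csv"}],
}

zip_bytes = build_hsds_zip(sample_descriptor, sample_tables)
print(check_hsds_zip(zip_bytes))
```

Building the archive in memory keeps the example free of temporary files; writing zip_bytes to disk would produce a single portable file that can be handed to another system.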

Together, these tools can make it easier to transform, validate, and load data from one resource information system to another – key components of a data supply chain.

We are eager to see how members of the Open Referral community make use of these tools. If you find them valuable, they may become critical components of your own workflow; if you find ways to improve them, we hope you’ll share those improvements back with the community.

We also have some ideas for where future development could go next — and we want to hear your feedback.

For example, would it be useful to develop a cloud-based API to manage (and automate) the transformation of nonstandardized human service directory data sources into HSDS output?

Let us know what you think in the GitHub repos, in our Slack team, or in the comments here, or reach out directly to discuss further.

Our thanks to the Open Knowledge Foundation for their support through the Frictionless Data Fund.

We also thank Chris Spiliotopoulos (@spilio) for his work on the first versions of the data validator, and Kin Lane (@apievangelist) for his work on the Human Service Data API protocols.
