Open Data Catalog v2.0.0

DKAN's Metastore is what you use to create, retrieve, update, and delete records describing your data. These records are what we refer to as "metadata."

As a data catalog, DKAN's main goal is to help you share a collection of dataset records. A dataset's metadata can follow virtually any schema, or format you want. What is important is that it points to the data you are trying to share, and that it gives useful contextual information. This usually includes when the data was released, how often it is updated and who published it, but can include details as precise as the geographic boundaries it applies to.

Some more details of DKAN's metastore:

  • The data assets themselves (usually in the form of local files or URLs) are referred to internally in DKAN as resources.
  • The structure and format of dataset metadata in DKAN are determined by a JSON schema. By default, DKAN provides and utilizes the DCAT-US metadata schema to store datasets, but custom schemas can be added to the codebase to override this.
  • In DCAT-US, resources are placed in a sub-schema of the parent dataset called a distribution.
Read the documentation on How to add a Dataset to get started adding information to the metastore.


Changing your dataset schema

Replacing the dataset schema in DKAN allows you to add fields and conform to additional (or completely different) specifications. As long as you provide a valid JSON schema, any information going into the metastore will be validated against it.

To change the schema being used, copy the schema directory from the DKAN repo and place it in the root of your Drupal installation. Then make any modifications necessary to the dataset.json file inside the collections directory.

Warning: The schema is actively used by the catalog to verify the validity of the data. Making changes to the schema after data is present in the catalog should be done with, care as non-backward-compatible changes to the schema could cause issues. Look at Drupal::metastore::SchemaRetriever::findSchemaDirectory() for context.