Data Package Identifiers are a simple way to identify a Data Package (and its location) using a string or small JSON object.
They exist because applications consistently need to identify a Data Package. For example, command line tools and libraries will frequently want to take a Data Package Identifier as an argument.
dpm (the Data Package Manager), for instance, has commands like:
    # gdp is a Data Package identifier
    dpm info gdp
    # https://github.com/datasets/gold-prices is a Data Package identifier
    dpm install https://github.com/datasets/gold-prices
The object structure looks like:
The object can be parsed from (and, less importantly, serialized to) a simple string. Such identifier strings will frequently be used, e.g. on the command line, to identify a Data Package.
An Identifier String is a single string (rather than a JSON object) that points to a Data Package. An Identifier String can be, in decreasing order of explicitness:
A URL that points directly to the datapackage.json (no resolution needed):
A URL that points directly to the Data Package (that is, the directory containing the datapackage.json):
A GitHub URL:
The name of a dataset in the Core Datasets registry:
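The four forms above can be told apart with a few checks on the string. The following is a minimal sketch in Python, not a reference implementation: the function name `parse_identifier`, the returned keys (`kind`, `original`, `name`, `url`, `datapackage_json_url`), and the `github_branch` default of `master` are all assumptions made for illustration. Registry names are returned unresolved, since the registry location is not given here.

```python
from urllib.parse import urlparse


def parse_identifier(identifier, github_branch="master"):
    """Classify an Identifier String, most explicit form first.

    All returned keys are illustrative assumptions, not spec-defined names.
    """
    text = identifier.strip()
    parsed = urlparse(text)

    # Anything that is not an http(s) URL is treated as a registry name.
    if parsed.scheme not in ("http", "https"):
        return {"kind": "name", "original": identifier, "name": text,
                "url": None, "datapackage_json_url": None}

    root = f"{parsed.scheme}://{parsed.netloc}"
    path = parsed.path.rstrip("/")
    segments = [s for s in path.split("/") if s]

    # Form 1: the URL already points at datapackage.json (no resolution needed).
    if segments and segments[-1] == "datapackage.json":
        return {"kind": "descriptor-url", "original": identifier,
                "name": segments[-2] if len(segments) > 1 else None,
                "url": root + "/" + "/".join(segments[:-1]),
                "datapackage_json_url": root + path}

    # Form 3: a GitHub repository URL of the shape github.com/<owner>/<repo>.
    # Assumption: the descriptor sits at the repository root on `github_branch`.
    if parsed.netloc == "github.com" and len(segments) == 2:
        owner, repo = segments
        raw = (f"https://raw.githubusercontent.com/{owner}/{repo}/"
               f"{github_branch}/datapackage.json")
        return {"kind": "github", "original": identifier, "name": repo,
                "url": root + path, "datapackage_json_url": raw}

    # Form 2: a URL for the directory that contains datapackage.json.
    return {"kind": "package-url", "original": identifier,
            "name": segments[-1] if segments else None,
            "url": root + path,
            "datapackage_json_url": root + path + "/datapackage.json"}
```

With this sketch, `parse_identifier("gdp")` is classified as a registry name, while `parse_identifier("https://github.com/datasets/gold-prices")` is classified as a GitHub URL with the name `gold-prices`. The GitHub check runs before the generic directory check because a GitHub repository URL would otherwise also match the directory form.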