Synchronization

class ckan_api_client.syncing.SynchronizationClient(base_url, api_key=None, **kw)
Synchronization client, providing functionality for importing collections of datasets into a Ckan instance.
Synchronization acts as follows:
- Ensure all the required organizations/groups are there; create a map between “source” ids and Ckan ids. Optionally update existing organizations/groups with new details.
- Find all the Ckan datasets matching the source_name
- Determine which datasets...
- ...need to be created
- ...need to be updated
- ...need to be deleted
- First, delete datasets to be deleted in order to free up names
- Then, create datasets that need to be created
- Lastly, update datasets using the configured merge strategy (see constructor arguments).
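The create/update/delete decision in the steps above amounts to a set comparison between the ids found in the source and the ids already known to Ckan. The following is a minimal sketch of that idea only, not the library's implementation; `plan_sync` and its arguments are hypothetical names introduced for illustration.

```python
def plan_sync(source_ids, ckan_ids):
    """Split ids into those to create, update, and delete.

    Hypothetical helper: `source_ids` are the ids present in the incoming
    data, `ckan_ids` are the source ids of datasets already stored in Ckan
    for the same source_name.
    """
    source_ids = set(source_ids)
    ckan_ids = set(ckan_ids)
    return {
        'create': sorted(source_ids - ckan_ids),  # only in the source
        'update': sorted(source_ids & ckan_ids),  # present on both sides
        'delete': sorted(ckan_ids - source_ids),  # only in Ckan
    }


plan = plan_sync(source_ids=['a', 'b', 'c'], ckan_ids=['b', 'c', 'd'])

# Apply in the documented order: delete first, then create, then update.
for action in ('delete', 'create', 'update'):
    print(action, plan[action])
```

Deleting before creating matters because a new dataset may want a name that is still held by a dataset scheduled for removal.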
__init__(base_url, api_key=None, **kw)
Parameters:
- base_url – Base URL of the Ckan instance, passed to high-level client
- api_key – API key to be used, passed to high-level client
- organization_merge_strategy –
One of:
- ‘create’ (default) if the organization doesn’t exist, create it. Otherwise, leave it alone.
- ‘update’ if the organization doesn’t exist, create it. Otherwise, update with new values.
- group_merge_strategy –
One of:
- ‘create’ (default) if the group doesn’t exist, create it. Otherwise, leave it alone.
- ‘update’ if the group doesn’t exist, create it. Otherwise, update with new values.
- dataset_preserve_names – if True (the default) will preserve old names of existing datasets.
- dataset_preserve_organization – if True (the default) will preserve old organizations of existing datasets.
- dataset_group_merge_strategy –
- ‘add’ add groups, keep old ones (default)
- ‘replace’ replace all existing groups
- ‘preserve’ leave groups alone
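To make the three dataset_group_merge_strategy values concrete, here is a small sketch of how each could combine a dataset's existing groups with the incoming ones. `merge_groups` is a hypothetical function written for this illustration; the library's internal logic may differ in details such as ordering. The `client_options` dict only collects the keyword arguments documented above with their stated defaults.

```python
def merge_groups(old_groups, new_groups, strategy='add'):
    """Illustrative only: combine group names according to a strategy."""
    if strategy == 'add':
        # Keep the old groups and append new ones not already present.
        merged = list(old_groups)
        for group in new_groups:
            if group not in merged:
                merged.append(group)
        return merged
    if strategy == 'replace':
        # Discard the old groups entirely.
        return list(new_groups)
    if strategy == 'preserve':
        # Ignore the incoming groups.
        return list(old_groups)
    raise ValueError('Unknown strategy: {0!r}'.format(strategy))


# Keyword arguments as they could be passed to the constructor via **kw;
# the values shown are the documented defaults.
client_options = {
    'organization_merge_strategy': 'create',
    'group_merge_strategy': 'create',
    'dataset_preserve_names': True,
    'dataset_preserve_organization': True,
    'dataset_group_merge_strategy': 'add',
}

for strategy in ('add', 'replace', 'preserve'):
    print(strategy, merge_groups(['a', 'b'], ['b', 'c'], strategy))
```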
sync(source_name, data)
Synchronize data from a source into Ckan.
- datasets are matched by _harvest_source
- groups and organizations are matched by name
Parameters:
- source_name – String identifying the source of the data. Used to build ids that will be used in further synchronizations.
- data – Data to be synchronized. Should be a dict (or dict-like) with top-level keys corresponding to the object type, mapping to dictionaries of {'id': <object>}.
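As an illustration of the data argument, the sketch below builds a dict of the described shape. The top-level keys ('dataset', 'group', 'organization'), the field names inside each object, and the source name are assumptions made for this example; the description above only states that keys correspond to object types and map ids to objects. The call to sync is left commented out because it needs a reachable Ckan instance and a valid API key.

```python
# Assumed object types and fields, for illustration only.
data = {
    'organization': {
        'org-1': {'name': 'org-1', 'title': 'Example Organization'},
    },
    'group': {
        'group-1': {'name': 'group-1', 'title': 'Example Group'},
    },
    'dataset': {
        'dataset-1': {
            'name': 'example-dataset',
            'title': 'Example Dataset',
            'owner_org': 'org-1',
            'groups': ['group-1'],
        },
    },
}

# Every top-level value maps a source id to the object itself.
for object_type, objects in sorted(data.items()):
    print(object_type, sorted(objects))

# Hypothetical usage (base URL, key, and source name are placeholders):
# from ckan_api_client.syncing import SynchronizationClient
# client = SynchronizationClient('http://127.0.0.1:5000', api_key='my-api-key')
# client.sync('example-source', data)
```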