Synchronization
- class ckan_api_client.syncing.SynchronizationClient(base_url, api_key=None, **kw)
Synchronization client, providing functionality for importing collections of datasets into a Ckan instance.
Synchronization acts as follows:
- Ensure all the required organizations/groups are there; create a map between “source” ids and Ckan ids. Optionally update existing organizations/groups with new details.
- Find all the Ckan datasets matching the source_name.
- Determine which datasets...
- ...need to be created
- ...need to be updated
- ...need to be deleted
- First, delete datasets to be deleted in order to free up names
- Then, create datasets that need to be created
- Lastly, update datasets using the configured merge strategy (see constructor arguments).
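The create/update/delete split described in the steps above can be sketched with plain set operations. This is only an illustration, not the library's implementation: `plan_sync` and its arguments are hypothetical names, and it assumes datasets can be compared by their source ids.

```python
def plan_sync(existing_ids, source_ids):
    """Split dataset ids into create / update / delete sets.

    existing_ids: source ids of datasets already in Ckan for this source
    source_ids:   source ids present in the incoming data
    (Both names are hypothetical; they are not part of ckan_api_client.)
    """
    existing = set(existing_ids)
    incoming = set(source_ids)
    return {
        "delete": existing - incoming,   # in Ckan, no longer in the source
        "create": incoming - existing,   # in the source, not yet in Ckan
        "update": existing & incoming,   # present on both sides
    }


plan = plan_sync(existing_ids=["a", "b", "c"], source_ids=["b", "c", "d"])

# Apply in the documented order: delete first (frees up names),
# then create, then update.
for action in ("delete", "create", "update"):
    print(action, sorted(plan[action]))
```

Deleting before creating matters because a dataset name freed by a deletion may be reused by a dataset created in the same run.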
- __init__(base_url, api_key=None, **kw)
Parameters: - base_url – Base URL of the Ckan instance, passed to high-level client
- api_key – API key to be used, passed to high-level client
- organization_merge_strategy –
One of:
- ‘create’ (default) if the organization doesn’t exist, create it. Otherwise, leave it alone.
- ‘update’ if the organization doesn’t exist, create it. Otherwise, update with new values.
- group_merge_strategy –
One of:
- ‘create’ (default) if the group doesn’t exist, create it. Otherwise, leave it alone.
- ‘update’ if the group doesn’t exist, create it. Otherwise, update with new values.
- dataset_preserve_names – if True (the default), preserve the old names of existing datasets.
- dataset_preserve_organization – if True (the default), preserve the old organizations of existing datasets.
- dataset_group_merge_strategy –
One of:
- ‘add’ add groups, keep old ones (default)
- ‘replace’ replace all existing groups
- ‘preserve’ leave groups alone
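As an illustration of the three dataset_group_merge_strategy values, the following sketch shows one way the resulting group list could be computed. The `merge_groups` helper is hypothetical and is not taken from the library; it assumes groups are identified by plain strings.

```python
def merge_groups(old_groups, new_groups, strategy="add"):
    """Return the groups a dataset would end up with (hypothetical helper).

    'add'      -> union of old and new groups (the documented default)
    'replace'  -> only the new groups
    'preserve' -> only the old groups
    """
    old, new = set(old_groups), set(new_groups)
    if strategy == "add":
        return sorted(old | new)
    if strategy == "replace":
        return sorted(new)
    if strategy == "preserve":
        return sorted(old)
    raise ValueError("unknown strategy: %r" % (strategy,))


old = ["health", "transport"]
new = ["transport", "environment"]
for strategy in ("add", "replace", "preserve"):
    print(strategy, merge_groups(old, new, strategy))
```

With the real client, the strategy would be chosen through the constructor keyword, for example `SynchronizationClient(base_url, api_key=api_key, dataset_group_merge_strategy='replace')`.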
- sync(source_name, data)
Synchronize data from a source into Ckan.
- datasets are matched by _harvest_source
- groups and organizations are matched by name
Parameters: - source_name – String identifying the source of the data. Used to build ids that will be used in further synchronizations.
- data – Data to be synchronized. Should be a dict (or dict-like) with top-level keys corresponding to the object type, each mapping to a dictionary of {'id': <object>}.
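A minimal sketch of what the data argument might look like. The top-level key names ('organization', 'group', 'dataset'), the ids, and the fields inside each object are assumptions made for illustration; only the overall shape, object type mapping to {'id': <object>}, comes from the description above. The `check_sync_data` helper is likewise hypothetical.

```python
# Assumed object-type keys; check the library source for the exact names.
data = {
    "organization": {
        "org-1": {"name": "org-1", "title": "Example Organization"},
    },
    "group": {
        "grp-1": {"name": "grp-1", "title": "Example Group"},
    },
    "dataset": {
        "ds-1": {
            "name": "example-dataset",
            "title": "Example Dataset",
            "owner_org": "org-1",   # refers to a source organization id
            "groups": ["grp-1"],    # refers to source group ids
        },
    },
}


def check_sync_data(data, object_types=("organization", "group", "dataset")):
    """Return a list of structural problems; empty if the shape looks fine."""
    problems = []
    for object_type in object_types:
        objects = data.get(object_type)
        if objects is None:
            problems.append("missing top-level key: %s" % object_type)
        elif not hasattr(objects, "items"):
            problems.append("%s must map ids to objects" % object_type)
    return problems


print(check_sync_data(data))
```

A structure like this would then be passed as `client.sync(source_name, data)`, where `source_name` identifies the source across runs.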