Update dataset with python CKAN lib


I need to automate regular updates of a dataset hosted on my CKAN instance. I want to do this with the ckanapi library, but I'm struggling to get it to work.

For now, the dataset will be stored on my desktop (in the example below, I used an existing dataset). I want to write a script that uploads it from my desktop to CKAN. I've tried the following, but it isn't working:

import ckanapi
ckan = ckanapi.RemoteCKAN('https://data.nsw.gov.au/data/', apikey='xxx', user_agent='xxx')
resource_dict = {
    'id': 'a89c3110-ad71-4a8a-bf0a-04729604683d',
    'package_id': 'e3240d3d-bb8f-43c2-9c7f-54fb7a7fd05f',
    'name':'test data',
    'url':'https://data.nsw.gov.au/data/dataset/c647a815-5eb7-4df6-8c88-f9c537a4f21e/resource/2f1ba0f3-8c21-4a86-acaf-444be4401a6d/download/covid-19-cases-by-notification-date-and-likely-source-of-infection.csv',
    'description':'covid data',
    'format':'CSV'
}
ckan.action.resource_update(**resource_dict)

It returns a CKANAPIError. I'd appreciate any help getting this to work.

There are 3 best solutions below

Use a simple requests.post(url=url, headers=headers, files=multipart_form), where:

url is your CKAN instance URL (the public URL, possibly including port 5000) + '/api/3/action/resource_create', and headers is just a dict like this:

headers={"Authorization":"api_key_from_admin_user"}

Then comes the important part, a dictionary like this:

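multipart_form = {
    # The file part: (filename, file object opened in binary mode)
    'upload': ('filename', open('file_path', 'rb')),
    # Plain form fields use (None, value)
    'description': (None, 'data_description'),
    'name': (None, 'data_name'),
    # ... other desired data properties
    'package_id': (None, 'package_uuid'),
}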

Remember to import the requests library first: import requests
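Putting those pieces together, a minimal sketch could look like the following. The helper names (build_multipart_form, create_resource) and the 60-second timeout are my own choices for illustration, not part of CKAN or requests, and the sketch assumes the instance accepts the API key in the Authorization header as described above.

```python
from pathlib import Path


def build_multipart_form(file_path, package_id, name, description):
    """Build the mapping passed to requests.post(files=...).

    Plain fields use the (None, value) form so that they are sent as
    ordinary form fields alongside the file part. The caller must close
    the file object stored under 'upload'.
    """
    file_path = Path(file_path)
    return {
        "upload": (file_path.name, file_path.open("rb")),
        "package_id": (None, package_id),
        "name": (None, name),
        "description": (None, description),
    }


def create_resource(base_url, api_key, file_path, package_id, name, description):
    """POST the file to the resource_create action and return the new resource."""
    # Imported here so build_multipart_form can be used without requests installed.
    import requests

    url = base_url.rstrip("/") + "/api/3/action/resource_create"
    form = build_multipart_form(file_path, package_id, name, description)
    try:
        response = requests.post(
            url,
            headers={"Authorization": api_key},
            files=form,
            timeout=60,  # arbitrary value chosen for this sketch
        )
    finally:
        # Always release the file handle, even if the request fails.
        form["upload"][1].close()

    # Raise on HTTP errors; on success CKAN wraps the payload in "result".
    response.raise_for_status()
    return response.json()["result"]
```

You would call it as create_resource('https://your-ckan-host', 'api_key_from_admin_user', 'file_path', 'package_uuid', 'data_name', 'data_description'), with every argument replaced by your own values.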

Try this for resource_dict:

resource_dict = {
    'id': 'a89c3110-ad71-4a8a-bf0a-04729604683d',
    'name': 'test data',
    'description': 'covid data',
    'upload': open('covid-19-cases-by-notification-date-and-likely-source-of-infection.csv', 'rb')
    }

I was having all sorts of problems myself until I realised that the 'upload' value cannot just be the filename itself; it has to be the file object returned by open().
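For a script that runs regularly, it may be worth closing the file afterwards and printing the reason behind the CKANAPIError. Here is a rough sketch along those lines. The helper update_resource_file and the main() wrapper are hypothetical names of mine. As far as I understand, resource_update replaces the resource's metadata, so pass any fields you want to keep.

```python
from pathlib import Path


def update_resource_file(ckan, resource_id, csv_path, **fields):
    """Upload a local file to an existing resource.

    `ckan` is expected to be a ckanapi.RemoteCKAN instance (or any object
    exposing the same action.resource_update shortcut). Extra keyword
    arguments such as name or description are passed through unchanged.
    """
    csv_path = Path(csv_path)
    # Open in binary mode; ckanapi uploads file-like values as multipart data.
    with csv_path.open("rb") as fh:
        return ckan.action.resource_update(id=resource_id, upload=fh, **fields)


def main():
    # Imported here so the helper above does not require ckanapi at import time.
    import ckanapi

    ckan = ckanapi.RemoteCKAN(
        "https://data.nsw.gov.au/data/", apikey="xxx", user_agent="xxx"
    )
    try:
        update_resource_file(
            ckan,
            "a89c3110-ad71-4a8a-bf0a-04729604683d",
            "covid-19-cases-by-notification-date-and-likely-source-of-infection.csv",
            name="test data",
            description="covid data",
        )
    except ckanapi.NotAuthorized:
        print("The API key is not allowed to update this resource.")
    except ckanapi.ValidationError as err:
        # error_dict maps field names to validation messages.
        print("CKAN rejected the data:", err.error_dict)
    except ckanapi.CKANAPIError as err:
        # Catch-all for other API failures; the message usually has the details.
        print("CKAN API error:", err)
```

Calling main() from a cron job or a scheduled task would then cover the "updated regularly" part of the question.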

In the CKAN API docs there is an update_data; see https://docs.ckan.org/en/ckan-2.7.3/api/