Error with Watson Discovery API's add_document method

I tried to use the Discovery API with the code below, but I get a strange error. It seems Discovery doesn't recognize my simple HTML and rejects it. I tried many simple HTML files, and none of them work:

import sys
import os
import json
from watson_developer_cloud import DiscoveryV1

discovery = DiscoveryV1(version="2018-10-15", url='https://gateway.watsonplatform.net/discovery/api', username=username, password=password)



with open((os.path.join(os.getcwd(), 'html_simple_file.html')), "r") as fileinfo:
    add_doc = discovery.add_document(Environment_Id,
                                     Collection_Id,
                                     file_info=fileinfo,
                                    file_content_type = "text/html")
print(json.dumps(add_doc, indent=2))

---------------------------------------------------------------------------
WatsonApiException                        Traceback (most recent call last)
in ()
     13                                      Collection_Id,
     14                                      file_info=fileinfo,
---> 15                                     file_content_type = "text/html")
     16 print(json.dumps(add_doc, indent=2))

~/anaconda3/lib/python3.7/site-packages/watson_developer_cloud/discovery_v1.py in add_document(self, environment_id, collection_id, file, metadata, file_content_type, filename, **kwargs)
   1246             params=params,
   1247             files=form_data,
-> 1248             accept_json=True)
   1249         return response
   1250

~/anaconda3/lib/python3.7/site-packages/watson_developer_cloud/watson_service.py in request(self, method, url, accept_json, headers, params, json, data, files, **kwargs)
    488             error_info = self._get_error_info(response)
    489             raise WatsonApiException(response.status_code, error_message,
--> 490                                      info=error_info, httpResponse=response)

WatsonApiException: Error: Invalid Content-Type. Expected 'multipart/form-data', got 'application/octet-stream', Code: 400 , X-dp-watson-tran-id: gateway01-30825332 , X-global-transaction-id: 7ecac92c5bf2f40901d65b74

I don't understand how it works. Thank you for any help!

There is 1 best solution below

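The issue is with the open call: you need to pass the full path to the .html file. Note also that the file handle is passed with the file= keyword, which matches the add_document signature shown in your traceback (file, metadata, file_content_type, filename), rather than file_info=. Here's the working code: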

import os
import json
from watson_developer_cloud import DiscoveryV1

discovery = DiscoveryV1(
    version="2018-10-15",
    username='{USERNAME}',
    password='{PASSWORD}',
    #iam_apikey='{apikey}',
    url='https://gateway.watsonplatform.net/discovery/api'
)

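# The directory is absolute, so os.path.join would discard os.getcwd(); it is omitted here.
with open(os.path.join('/Users/VMac/Downloads/', 'Ana.json.html')) as fileinfo:
    # Pass the open handle via the `file` keyword; get_result() returns the response body.
    add_doc = discovery.add_document('{DISCOVERY_ENVIRONMENT_ID}', '{DISCOVERY_COLLECTION_ID}', file=fileinfo).get_result()
print(json.dumps(add_doc, indent=2))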

Once you fill in all the required details in the code above, here's the output:

{
  "document_id": "6439d5f2-f273-4173-8d25-54dc934df",
  "status": "processing"
}
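
To illustrate why the keyword name can matter, here is a minimal sketch. The add_document function below is a hypothetical stand-in that only copies the parameter list printed in the traceback; it does not call the Discovery service, and whether this is exactly what triggered the 400 error is an assumption. It shows that a signature ending in **kwargs accepts file_info= without complaint while leaving file unset.

```python
import inspect


def add_document(environment_id, collection_id, file=None, metadata=None,
                 file_content_type=None, filename=None, **kwargs):
    """Hypothetical stand-in mirroring only the signature seen in the traceback."""
    # Report what bound to `file` and which keywords fell through into **kwargs.
    return {"file": file, "extra_kwargs": sorted(kwargs)}


# Wrong keyword: nothing binds to `file`, and `file_info` is silently accepted.
wrong = add_document("env", "coll", file_info="<handle>",
                     file_content_type="text/html")

# Right keyword: the handle binds to `file` and nothing is left over.
right = add_document("env", "coll", file="<handle>",
                     file_content_type="text/html")

print(wrong)
print(right)
# List the named parameters to check a keyword before using it.
print(list(inspect.signature(add_document).parameters))
```

Running inspect.signature against the installed SDK's real add_document method is a quick way to confirm which keyword your version expects.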

Refer to the API documentation for add_document for the full list of parameters.
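
Since the path passed to open matters, it may help to validate it before uploading. The sketch below uses only the standard library; resolve_upload_path is a hypothetical helper name, not part of the Watson SDK, and the file names are placeholders. It also shows that os.path.join discards earlier components once it meets an absolute one.

```python
import os
import tempfile


def resolve_upload_path(directory, name):
    """Hypothetical helper: build the path and fail early if the file is missing."""
    path = os.path.join(directory, name)
    if not os.path.isfile(path):
        raise FileNotFoundError(f"No such file to upload: {path}")
    return path


# Filesystem root ("/" on POSIX, the current drive root on Windows).
root = os.path.abspath(os.sep)

# An absolute component makes os.path.join drop everything before it,
# so os.getcwd() has no effect on the result here.
joined = os.path.join(os.getcwd(), root, "tmp", "page.html")

# Try the helper on a throwaway file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample = os.path.join(tmp_dir, "page.html")
    with open(sample, "w") as handle:
        handle.write("<html><body>Hello</body></html>")
    resolved = resolve_upload_path(tmp_dir, "page.html")
    print(resolved == sample)
```

The returned path can then be handed to open() before calling add_document, so a wrong path surfaces as a local FileNotFoundError instead of an API error.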