Error with Watson Discovery API's add_document method


I tried to use the Discovery API with the code below, but I get a strange error. Discovery doesn't seem to recognize my simple HTML file and rejects it. I have tried many simple HTML files, and none of them work:

import sys
import os
import json
from watson_developer_cloud import DiscoveryV1

discovery = DiscoveryV1(version="2018-10-15", url='https://gateway.watsonplatform.net/discovery/api', username=username, password=password)



with open((os.path.join(os.getcwd(), 'html_simple_file.html')), "r") as fileinfo:
    add_doc = discovery.add_document(Environment_Id,
                                     Collection_Id,
                                     file_info=fileinfo,
                                    file_content_type = "text/html")
print(json.dumps(add_doc, indent=2))

---------------------------------------------------------------------------
WatsonApiException                        Traceback (most recent call last)
in ()
     13                                      Collection_Id,
     14                                      file_info=fileinfo,
---> 15                                     file_content_type = "text/html")
     16 print(json.dumps(add_doc, indent=2))

~/anaconda3/lib/python3.7/site-packages/watson_developer_cloud/discovery_v1.py in add_document(self, environment_id, collection_id, file, metadata, file_content_type, filename, **kwargs)
   1246             params=params,
   1247             files=form_data,
-> 1248             accept_json=True)
   1249         return response
   1250

~/anaconda3/lib/python3.7/site-packages/watson_developer_cloud/watson_service.py in request(self, method, url, accept_json, headers, params, json, data, files, **kwargs)
    488                 error_info = self._get_error_info(response)
    489                 raise WatsonApiException(response.status_code, error_message,
--> 490                                          info=error_info, httpResponse=response)

WatsonApiException: Error: Invalid Content-Type. Expected 'multipart/form-data', got 'application/octet-stream', Code: 400 , X-dp-watson-tran-id: gateway01-30825332 , X-global-transaction-id: 7ecac92c5bf2f40901d65b74

I don't understand how it works. Thank you for any help!

Answered by Vidyasagar Machupalli

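The issue is with how the file is opened and passed: open the .html file by its path and hand the open file object to add_document through the file parameter. The question uses file_info, which does not appear in the add_document signature shown in the traceback. Here's the working code: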

import os
import json
from watson_developer_cloud import DiscoveryV1

discovery = DiscoveryV1(
    version="2018-10-15",
    username='{USERNAME}',
    password='{PASSWORD}',
    #iam_apikey='{apikey}',
    url='https://gateway.watsonplatform.net/discovery/api'
)

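# os.path.join discards os.getcwd() when a later component is absolute,
# so the absolute directory is joined with the file name directly.
with open(os.path.join('/Users/VMac/Downloads', 'Ana.json.html')) as fileinfo:
    # Pass the open handle as file=, the name used in the add_document signature.
    add_doc = discovery.add_document('{DISCOVERY_ENVIRONMENT_ID}', '{DISCOVERY_COLLECTION_ID}', file=fileinfo).get_result()
print(json.dumps(add_doc, indent=2))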

Once you fill in all the required details in the code above, the output looks like this:

{
  "document_id": "6439d5f2-f273-4173-8d25-54dc934df",
  "status": "processing"
}
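
Judging from the signature in the traceback, add_document names its file parameter file, so a call with file_info=fileinfo probably never binds the handle to it and the handle falls into **kwargs instead. The stand-in below is hypothetical: add_document_stub is not part of the SDK and only mirrors the parameter list shown in the traceback. It makes no network call and simply shows how Python routes the two spellings.

```python
def add_document_stub(environment_id, collection_id, file=None, metadata=None,
                      file_content_type=None, filename=None, **kwargs):
    """Hypothetical stand-in mirroring the parameter list in the traceback."""
    # Report which arguments Python bound where; nothing is sent anywhere.
    return {
        "file_attached": file is not None,
        "extra_kwargs": sorted(kwargs),
    }


handle = object()  # placeholder for an open file handle

# Spelling from the question: the handle lands in **kwargs, not in `file`.
wrong = add_document_stub("env", "coll", file_info=handle,
                          file_content_type="text/html")

# Spelling from the answer: the handle is bound to `file`.
right = add_document_stub("env", "coll", file=handle)

print(wrong)
print(right)
```

If the real method behaves like the stub in this respect, the request in the question goes out with no file attached, which would be consistent with the "Expected 'multipart/form-data'" error.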

Refer to the Watson Discovery API documentation for add_document for details.
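
To keep the keyword names in one place, the upload can be wrapped in a small helper. This is a minimal sketch, not part of the SDK: upload_html is a hypothetical name, the keyword arguments follow the add_document signature shown in the traceback, and the content type is guessed from the file extension with the standard mimetypes module.

```python
import mimetypes
import os


def upload_html(client, environment_id, collection_id, path):
    """Hypothetical helper: send one file through client.add_document."""
    # Guess the MIME type from the extension, e.g. text/html for .html files.
    content_type, _ = mimetypes.guess_type(path)
    with open(path, "rb") as handle:
        # Keyword names follow the add_document signature in the traceback.
        return client.add_document(
            environment_id,
            collection_id,
            file=handle,
            filename=os.path.basename(path),
            file_content_type=content_type,
        )
```

It would be called as upload_html(discovery, '{DISCOVERY_ENVIRONMENT_ID}', '{DISCOVERY_COLLECTION_ID}', 'html_simple_file.html'). Whether the returned value needs .get_result() before it can be passed to json.dumps depends on the SDK version in use.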