Watson-NLU giving downstream issue (500) in a loop, but not on the sentence individually


I am trying to extract the entities for each sentence in a corpus using Watson Natural Language Understanding.

(I can't produce fully reproducible code because the dataset I'm using is private.)

My script looks like this:

import json 
from watson_developer_cloud import NaturalLanguageUnderstandingV1
from watson_developer_cloud.natural_language_understanding_v1 import Features, EntitiesOptions

import pandas as pd 
from utils import * 

_DATA_PATH = "data/example_data.csv"
_IBM_NLU_USERNAME = "<username>"
_IBM_NLU_PASSWORD = "<password>"

X = [string1, string2, ... ]

nlu = NaturalLanguageUnderstandingV1(username=_IBM_NLU_USERNAME,
                                     password=_IBM_NLU_PASSWORD,
                                     version="2018-03-16")
def ibm_ner_recognition(sentence):
    """
    Input  -- sentence, string to conduct NER on
    Note   -- the only feature requested is always entities
    Return -- list of entity types found in the sentence
    """
    response = nlu.analyze(text=sentence,
                           features=Features(entities=EntitiesOptions()))
    output = json.loads(json.dumps(response))
    entities = []
    for result in output["entities"]:
        entities.append(result["type"])
    return entities

entities = []
for sent in X:
    sent_entities = ibm_ner_recognition(sent)
    entities.append(sent_entities)

This works fine up to roughly the 400th sentence in the corpus, and then raises the following error:

WatsonApiException                        Traceback (most recent call last)
    <ipython-input-57-8632adb20778> in <module>()
          5 for sent in X:
          6     print(sent)
    ----> 7     sent_entities = ibm_ner_recognition(sent)
          8     entities.append(sent_entities)
          9 end = t.default_timer()

    <ipython-input-13-d132b33efc76> in ibm_ner_recognition(sentence)
          9     """
         10     response = nlu.analyze(text=sentence,
    ---> 11                            features=Features(entities=EntitiesOptions()))
         12     output = json.loads(json.dumps(response))
         13     entities = []

    /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/watson_developer_cloud/natural_language_understanding_v1.py in analyze(self, features, text, html, url, clean, xpath, fallback_to_raw, return_analyzed_text, language, limit_text_characters, **kwargs)
        202             params=params,
        203             json=data,
    --> 204             accept_json=True)
        205         return response
        206 

    /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/watson_developer_cloud/watson_service.py in request(self, method, url, accept_json, headers, params, json, data, files, **kwargs)
        446             error_info = self._get_error_info(response)
        447             raise WatsonApiException(response.status_code, error_message,
    --> 448                                      info=error_info, httpResponse=response)

WatsonApiException: Error: Server Error cannot analyze: downstream issue, Code: 500 , X-dp-watson-tran-id: gateway01-474786453 , X-global-transaction-id: 7ecac92c5aff58601c4caa95

I traced the failure to the following string:

s = "Liz Saville Roberts."

I decided to run this string through ibm_ner_recognition on its own. There was no error and it successfully captured entities.

Problem summary

When looping through my corpus, Watson NLU returns a downstream error (code 500) for one particular sentence. However, Watson NLU analyzes that same sentence successfully when it is sent on its own, outside of the loop.

Things to note

  1. This is not the last sentence in my corpus.

  2. I haven't exceeded the usage allowance of my Watson payment plan.

Edit

I re-ran the loop. This time the sentence above worked and the loop failed at about the 3,500th string instead, so the error seems to be intermittent rather than tied to a specific input.
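
Since the failure looks transient, one workaround I am considering is to retry a failed call a few times before giving up. This is only a minimal sketch under that assumption: `with_retries` and its parameters (`max_attempts`, `base_delay`, `retry_on`) are hypothetical names of my own, not part of the Watson SDK, and I have not confirmed that retrying is the recommended way to handle this error.

```python
import time


def with_retries(func, *args, max_attempts=3, base_delay=1.0,
                 retry_on=(Exception,), **kwargs):
    """Call func(*args, **kwargs), retrying when it raises one of retry_on.

    Hypothetical helper, not part of the Watson SDK. Sleeps
    base_delay * 2**attempt seconds between attempts and re-raises the
    last exception once max_attempts calls have failed.
    """
    for attempt in range(max_attempts):
        try:
            return func(*args, **kwargs)
        except retry_on:
            # Out of attempts: let the original exception propagate
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff before the next attempt
            time.sleep(base_delay * 2 ** attempt)
```

In the loop above, the call would become `with_retries(ibm_ner_recognition, sent, retry_on=(WatsonApiException,))`, assuming `WatsonApiException` is imported from `watson_developer_cloud`. Restricting `retry_on` to that exception keeps unrelated bugs from being silently retried.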
