I'm using the Langchain library to make predictions with the AI21 Bedrock model. I have implemented the following code:
from langchain.chains import ConversationChain
from langchain.llms.bedrock import Bedrock
from langchain.memory import ConversationBufferMemory

# boto3_bedrock is a boto3 Bedrock runtime client created earlier
ai21_llm = Bedrock(model_id="ai21.j2-ultra-v1", client=boto3_bedrock)
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=ai21_llm, verbose=False, memory=memory
)

try:
    print(conversation.predict(input="write a paragraph about the wonders of wonder bread"))
except ValueError as error:
    if "AccessDeniedException" in str(error):
        print(f"\x1b[41m{error}"
              "\nTo troubleshoot this issue please refer to the following resources."
              "\nhttps://docs.aws.amazon.com/IAM/latest/UserGuide/troubleshoot_access-denied.html"
              "\nhttps://docs.aws.amazon.com/bedrock/latest/userguide/security-iam.html\x1b[0m\n")

        class StopExecution(ValueError):
            def _render_traceback_(self):
                pass

        raise StopExecution
    else:
        raise error
However, I'm encountering an issue where the output of conversation.predict() is truncated. For instance, the output I get is:
"Wonder Bread is a type of bread that is sold in stores. It is made from flour, water,"
I expected a complete paragraph, but the output cuts off mid-sentence. I've checked the Langchain memory documentation, but I didn't find anything to suggest that memory affects the output size.
How can I debug this issue to find out why the output is truncated? Are there any limitations with AI21 or Langchain that could be causing this? Any help would be appreciated.
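One debugging step I've considered is turning on botocore's standard debug logging, which prints the raw HTTP request bodies sent to Bedrock, so I can see exactly what parameters Langchain includes (nothing here is Langchain-specific, just the stdlib `logging` module):

```python
import logging

# botocore logs every HTTP request it makes (including the JSON body sent
# to Bedrock) at DEBUG level, so this reveals whether Langchain is sending
# a maxTokens value at all.
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("botocore").setLevel(logging.DEBUG)
```

With this enabled, the next `conversation.predict(...)` call should dump the outgoing request payload to the console.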
The following code does work, so I suspect the issue lies with Langchain:
import json
import botocore

# bedrock_runtime is a boto3 Bedrock runtime client created earlier
body = json.dumps({"prompt": prompt_data, "maxTokens": 200})
modelId = "ai21.j2-mid-v1"  # change this to use a different version from the model provider
accept = "application/json"
contentType = "application/json"

try:
    response = bedrock_runtime.invoke_model(
        body=body, modelId=modelId, accept=accept, contentType=contentType
    )
    response_body = json.loads(response.get("body").read())
    print(response_body.get("completions")[0].get("data").get("text"))
except botocore.exceptions.ClientError as error:
    if error.response['Error']['Code'] == 'AccessDeniedException':
        print(f"\x1b[41m{error.response['Error']['Message']}"
              "\nTo troubleshoot this issue please refer to the following resources."
              "\nhttps://docs.aws.amazon.com/IAM/latest/UserGuide/troubleshoot_access-denied.html"
              "\nhttps://docs.aws.amazon.com/bedrock/latest/userguide/security-iam.html\x1b[0m\n")
    else:
        raise error
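Comparing the two paths, the raw `invoke_model` call sets `maxTokens` explicitly while my Langchain code never does, so the model is presumably falling back to a small default completion length. A minimal sketch of what I mean (the commented-out `model_kwargs` fix is an assumption I haven't verified against my Langchain version):

```python
import json

# This is the body the working raw invoke_model call sends; the explicit
# maxTokens is the obvious difference from the Langchain path, which never
# sets it.
body = json.dumps({
    "prompt": "write a paragraph about the wonders of wonder bread",
    "maxTokens": 200,
})

# Hypothetical fix (assumption -- I have not confirmed that the Bedrock
# wrapper forwards model_kwargs into the request body):
# ai21_llm = Bedrock(
#     model_id="ai21.j2-ultra-v1",
#     client=boto3_bedrock,
#     model_kwargs={"maxTokens": 200},
# )

print(json.loads(body)["maxTokens"])
```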
This is just a stream of completions. If you want the full response as a string, use the following:
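A minimal sketch, assuming the same AI21 response shape as the `response_body` above (each completion's text lives under `completions[i]["data"]["text"]`, and joining them yields one string):

```python
def full_completion_text(response_body: dict) -> str:
    # Concatenate every completion's text field into a single string.
    return "".join(c["data"]["text"] for c in response_body["completions"])

# Illustrative response shape only -- not a real API call.
example = {
    "completions": [
        {"data": {"text": "Wonder Bread is a brand "}},
        {"data": {"text": "of sliced white bread."}},
    ]
}
print(full_completion_text(example))
# → Wonder Bread is a brand of sliced white bread.
```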