I'm quite new to LangChain, and I'm trying to build Jira ticket generation. Before using a tool to connect to my Jira instance (I plan to create custom tools), I want to get clean, well-structured output from my GPT4All model thanks to Pydantic parsing. I use mistral-7b-openorca.Q4_0.gguf.
Parsing Section :
from typing import Optional

from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field


class TechnicalSubtask(BaseModel):
    subtask_name: str = Field(description="Name of the technical subtask")
    subtask_description: str = Field(description="Description of the technical subtask")


class AcceptanceCriteria(BaseModel):
    AcceptanceCriteria_name: str = Field(description="Name of the acceptance criteria")
    AcceptanceCriteria_description: str = Field(description="Description of the acceptance criteria")


class US(BaseModel):
    project_key: str = Field(description="Name of the project")
    title: str = Field(description="Title of the US")
    parent_id: Optional[int] = Field(description="Number of the parent_id of the US")
    assignee: str = Field(description="Name of the person responsible")
    summary: str = Field(
        description="Full description of the project according to the Product Owner point of view"
    )
    TechnicalSubtasks: list[TechnicalSubtask]
    AcceptanceCriterias: list[AcceptanceCriteria]


class Scientist(BaseModel):
    name: str = Field(description="Name of a Scientist")
    discoveries: list = Field(description="Python list of discoveries")


us_parser = PydanticOutputParser(pydantic_object=US)
subtask_parser = PydanticOutputParser(pydantic_object=TechnicalSubtask)
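For reference, the parser ultimately just validates a JSON string against the `US` schema. Here is a stdlib-only sketch (with hypothetical field values) of the shape it expects, using plain `json` instead of `us_parser.parse`:

```python
import json

# Hypothetical complete JSON instance matching the US schema above.
raw = """
{
  "project_key": "PROJ",
  "title": "Integrate MySQL Database",
  "parent_id": 1,
  "assignee": "John Doe",
  "summary": "As a Product Owner, I want to integrate a MySQL database.",
  "TechnicalSubtasks": [
    {"subtask_name": "Connect", "subtask_description": "Open a connection."}
  ],
  "AcceptanceCriterias": [
    {"AcceptanceCriteria_name": "Connected",
     "AcceptanceCriteria_description": "The system reaches the database."}
  ]
}
"""

data = json.loads(raw)

# us_parser.parse() will reject the string if any required field is missing,
# which is exactly what happens when the generation is cut off mid-object.
required = {"project_key", "title", "assignee", "summary",
            "TechnicalSubtasks", "AcceptanceCriterias"}
assert required <= data.keys()
```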
Model section :
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.llms import GPT4All

callbacks = [StreamingStdOutCallbackHandler()]
model = GPT4All(
    model=local_path,
    callbacks=callbacks,
    max_tokens=1000,
    temp=1,
    verbose=False,
)
And finally my Prompt code :
from langchain.prompts import PromptTemplate

query_us = "I want to integrate a MySQL database to my system"
context_us = (
    "You are an AI assistant specializing in creating Jira tickets. "
    "Write the US related to the query "
)
template_us = "Answer the query : {query}\n{format_instructions}\n"
us_prompt = PromptTemplate(
    template=template_us,
    input_variables=["query"],
    partial_variables={"format_instructions": us_parser.get_format_instructions()},
)
prompt_us = us_prompt.format_prompt(query=context_us + query_us)
output = model(prompt_us.to_string())
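The template substitution itself is plain string formatting, so the final string sent to the model can be previewed without LangChain. A stdlib sketch, using a shortened stand-in for the real text of `us_parser.get_format_instructions()`:

```python
template_us = "Answer the query : {query}\n{format_instructions}\n"

context_us = ("You are an AI assistant specializing in creating Jira tickets. "
              "Write the US related to the query ")
query_us = "I want to integrate a MySQL database to my system"

# Stand-in for us_parser.get_format_instructions(); the real text is much
# longer and embeds the JSON schema of the US model.
format_instructions = "The output should be formatted as a JSON instance..."

final_prompt = template_us.format(query=context_us + query_us,
                                  format_instructions=format_instructions)

# The instructions come last, right before the model starts generating.
assert final_prompt.startswith("Answer the query : You are an AI assistant")
```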
Here is the output, which is incomplete:
Here is an example JSON instance that conforms to this schema:
{
  "project_key": "JIRA-1234567890",
  "title": "Integrate MySQL Database",
  "parent_id": 1,
  "assignee": "John Doe",
  "summary": "As a Product Owner, I want to integrate a MySQL database to my system.",
  "TechnicalSubtasks": [
    {
      "subtask_name": "Connect to the MySQL Database",
      "subtask_description": "Establish a connection with the MySQL database."
    },
    {
      "subtask_name": "Create Table Structure",
      "subtask_description": "Design and create table structures in the MySQL database."
    }
  ],
  "AcceptanceCriterias": [
    {
      "AcceptanceCriteria_name": "Database Connected Successfully",
      "AcceptanceCriteria_description": "The system can successfully connect to the MySQL database."
    },
    {
      "Accept
This gives me a pretty good result, but I don't know why the generation stops... It is very limiting, because I need to be able to increase the number of subtasks or acceptance criteria.
LangChain is quite new; I really hope some of you can find an answer.
I'm starting to think I should create a custom agent or a custom prompt, but the level of difficulty is not the same!
Let me know,
Peace!
I tried to play with the number of tokens and to change the stop argument in my us_prompt. But every time, the model stops after more or less 110 words...
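For what it's worth, the cut-off string is not even valid JSON, so the Pydantic parser cannot recover it either. A stdlib check, using a shortened stand-in for the real truncated output:

```python
import json

# Shortened stand-in for the generation above, cut off mid-object.
truncated = '{"project_key": "JIRA-1234567890", "TechnicalSubtasks": [{"subtask_name": "Connect'

try:
    json.loads(truncated)
    complete = True
except json.JSONDecodeError:
    complete = False

# A truncated generation never parses, so PydanticOutputParser fails too --
# the fix has to happen on the generation side, not the parsing side.
assert complete is False
```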