LangChain and GPT4All: my JSON generation of a Jira ticket stops in the middle. How do I control the end of generation?

I'm quite new to LangChain and I'm trying to generate Jira tickets. Before using a tool to connect to my Jira (I plan to create my own custom tools), I want to get well-structured output from my GPT4All model through Pydantic parsing. I use mistral-7b-openorca.Q4_0.gguf.

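Parsing section:

from typing import Optional

from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field

class TechnicalSubtask(BaseModel):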
    subtask_name: str = Field(description="Name of the technical subtask")
    subtask_description: str = Field(description="Description of the technical subtask")

class AcceptanceCriteria(BaseModel):
    AcceptanceCriteria_name: str = Field(description="Name of the acceptance criteria")
    AcceptanceCriteria_description: str = Field(description="Description of the acceptance criteria")

class US(BaseModel):
    project_key: str = Field(description="Name of the project")
    title: str = Field(description="Title of the US")
    # default=None keeps the field optional under both pydantic v1 and v2
    parent_id: Optional[int] = Field(default=None, description="Number of the parent_id of the US")
    assignee: str = Field(description="Name of the person responsible")
    summary: str = Field(
        description="Full description of the project according to the Product Owner point of view"
    )
    TechnicalSubtasks: list[TechnicalSubtask]
    AcceptanceCriterias: list[AcceptanceCriteria]
        

class Scientist(BaseModel):
    name: str = Field(description="Name of a Scientist")
    discoveries: list = Field(description="Python list of discoveries")

us_parser = PydanticOutputParser(pydantic_object=US)
subtask_parser = PydanticOutputParser(pydantic_object=TechnicalSubtask)

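Model section:

from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.llms import GPT4All

# Stream tokens to stdout as they are generated
callbacks = [StreamingStdOutCallbackHandler()]

# local_path is the path to the downloaded .gguf model file (defined elsewhere)
model = GPT4All(
    model=local_path,
    callbacks=callbacks,
    max_tokens=1000,
    temp=1,
    verbose=False,
)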

And finally, my prompt code:

from langchain.prompts import PromptTemplate

query_us = 'I want to integrate a MySQL database to my system'
context_us = 'You are an AI assistant specializing in creating Jira tickets. Write the US related to the query '
template_us = "Answer the query : {query}\n{format_instructions}\n"

us_prompt = PromptTemplate(
    template=template_us,
    input_variables=["query"],
    partial_variables={"format_instructions": us_parser.get_format_instructions()}
)

prompt_us = us_prompt.format_prompt(query=context_us+query_us)

output = model(prompt_us.to_string())

Here is the output, which is incomplete:

Here is an example JSON instance that conforms to this schema:
    {
      "project_key": "JIRA-1234567890",
      "title": "Integrate MySQL Database",
      "parent_id": 1,
      "assignee": "John Doe",
      "summary": "As a Product Owner, I want to integrate a MySQL database to my system.",
      "TechnicalSubtasks": [
        {
          "subtask_name": "Connect to the MySQL Database",
          "subtask_description": "Establish a connection with the MySQL database."
        },
        {
          "subtask_name": "Create Table Structure",
          "subtask_description": "Design and create table structures in the MySQL database."
        }
      ],
      "AcceptanceCriterias": [
        {
          "AcceptanceCriteria_name": "Database Connected Successfully",
          "AcceptanceCriteria_description": "The system can successfully connect to the MySQL database."
        },
        {
          "Accept

This gives me a pretty good result, but I don't know why the generation stops. It is very limiting because I need to be able to increase the number of subtasks or acceptance criteria.
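To tell a cut-off generation apart from a parsing problem, it can help to check whether the raw model output contains a complete JSON object before handing it to the Pydantic parser. The following is only a minimal sketch using the standard library; the helper name `extract_json_object` and the two sample strings are hypothetical and not part of LangChain or of my code above.

```python
import json


def extract_json_object(text: str):
    """Return the first complete JSON object found in `text`.

    Returns None when there is no opening brace, or when the JSON
    that follows it cannot be decoded (for example because the
    generation was cut off before the closing brace).
    """
    start = text.find("{")
    if start == -1:
        return None
    try:
        # raw_decode parses one JSON value and ignores trailing text
        obj, _end = json.JSONDecoder().raw_decode(text[start:])
    except json.JSONDecodeError:
        return None
    return obj


# Hypothetical sample outputs: one complete, one cut off mid-key
complete = 'Here is the ticket: {"title": "Integrate MySQL Database", "parent_id": 1}'
truncated = 'Here is the ticket: {"title": "Integrate MySQL Database", "Accept'

print(extract_json_object(complete))
print(extract_json_object(truncated))
```

If the helper returns None for an output like the one above, the text was most likely truncated by a generation limit rather than rejected by the schema.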

LangChain is quite new, so I really hope that some of you can find an answer.

I'm starting to think that I should create a custom agent or a custom prompt, but the level of difficulty is not the same!

Let me know,

Peace!

I tried changing the number of tokens and changing the stop argument in my us_prompt, but every time the model stops after roughly 110 words.
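One thing that may be worth checking, as an assumption and not a confirmed diagnosis: as far as I can tell, the LangChain `GPT4All` wrapper has a separate `n_predict` field for the maximum number of tokens to generate (default 256), while its `max_tokens` field is documented as the token context window. If that holds for the installed version, raising `max_tokens` alone would not lift the generation limit, and a cap of about 256 tokens would be consistent with JSON output stopping after roughly 110 words. The sketch below only builds the keyword arguments; the helper names, the placeholder path, and the values 1024 and 2048 are hypothetical.

```python
def build_gpt4all_kwargs(local_path: str,
                         n_predict: int = 1024,
                         context_window: int = 2048) -> dict:
    """Build keyword arguments for the LangChain GPT4All wrapper.

    Assumption: the installed wrapper exposes `n_predict` (maximum
    number of generated tokens) next to `max_tokens` (context window).
    """
    return {
        "model": local_path,
        "max_tokens": context_window,  # documented as the token context window
        "n_predict": n_predict,        # assumed to cap the generated tokens
    }


def load_model(local_path: str):
    # Imported lazily so the kwargs helper works without LangChain installed.
    # Newer releases expose the class from langchain_community.llms instead.
    from langchain.llms import GPT4All
    return GPT4All(**build_gpt4all_kwargs(local_path))


# Placeholder path, not a real file
kwargs = build_gpt4all_kwargs("path/to/model.gguf")
print(kwargs)
```

If the output then runs past the previous cut-off point, the limit was the generation cap and not the prompt or the parser.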
