GPT-3.5-turbo is cutting the response without hitting the token limit


I am trying to build an app using the OpenAI API. When calling the API, I pass two messages in the POST request. The first is sent with the system role and contains a list of exercises, such as:

const initialPrompt = '''
                    0001    3/4 Sit-up
                    0003    Air bike
                    0024    Barbell Bench Front Squat
                    0025    Barbell Bench Press
                    0026    Barbell Bench Squat
                    0027    Barbell Bent Over Row
                    0030    Barbell Close-Grip Bench Press
                    ''';

The second message is sent with the user role:

const userMessageString = """
                Give 3 days workout routine based on these exercises.
                Keep the ID of the exercises on the answer, 
                give me the answer in a complete json format as [days{[exercises{exerciseId, exerciseName, sets, reps}]}]
                """;

I want to get a JSON response using this method:

Future<String> sendOpenAIRequest() async {
  const apiUrl = 'https://api.openai.com/v1/chat/completions';

  final response = await http.post(
    Uri.parse(apiUrl),
    headers: {
      'Content-Type': 'application/json',
      'Authorization': 'Bearer $openaiApiKey',
    },
    body: json.encode({
      "model": "gpt-3.5-turbo",
      "messages": [
        {"role": "system", "content": initialPrompt},
        {"role": "user", "content": userMessageString}
      ],
      "max_tokens": 3000,
    }),
  );

  if (response.statusCode == 200) {
    final result = json.decode(response.body);
    print('usage >>> ${result['usage']}');
    return result['choices'][0]['message']['content'];
  } else {
    throw Exception('Request failed with status: ${response.statusCode}');
  }
}

However, the answer I get is cut off, even though the number of tokens used is far below max_tokens:

usage >>> {prompt_tokens: 143, completion_tokens: 324, total_tokens: 467}
{
  "days": [
    {
      "exercises": [
        {
          "exerciseId": "0026",
          "exerciseName": "Barbell Bench Squat",
          "sets": 3,
          "reps": 12
        },
        {
          "exerciseId": "0030",
          "exerciseName": "Barbell Close-Grip Bench Press",
          "sets": 3,
          "reps": 12
        }
      ]
    },
    {
      "exercises": [
        {
          "exerciseId": "0027",
          "exerciseName": "Barbell Bent Over Row",
          "sets": 3,
          "reps": 10
        },
        {
          "exerciseId": "0024",
          "exerciseName": "Barbell Bench Front Squat",
          "sets": 3,
          "reps": 10
        }
      ]
    },
    {
      "exercises": [
        {
          "exerciseId": "0025",
          "exerciseName": "Barbell Bench Press",
          "sets": 4,
          "reps": 8
        },
        {
          "exerciseId": "0003",
          "exerciseName": "Air bike",
          "sets": 4,
          "reps": 12
        },
        {
          "exercise

Accepted answer:

I've tried your prompt in the Playground and it seems fine.

Does it always get cut, or only sometimes? If only sometimes, it may be instability on OpenAI's side.

The API fails sometimes, so your app needs to check whether the returned JSON is invalid and retry the request if needed.
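That retry check can be sketched in Dart roughly as follows. `tryParseWorkout` and `requestWithRetry` are hypothetical helper names (not part of the question's code); the wrapper takes the request function as a parameter, so the question's existing `sendOpenAIRequest` can be passed in unchanged:

```dart
import 'dart:convert';

/// Tries to decode the model's reply as JSON.
/// Returns the decoded map, or null if the reply is truncated/invalid.
Map<String, dynamic>? tryParseWorkout(String reply) {
  try {
    return json.decode(reply) as Map<String, dynamic>;
  } on FormatException {
    // A cut-off reply (like the output above) fails to parse.
    return null;
  }
}

/// Hypothetical wrapper: calls [send] until the reply parses as JSON,
/// giving up after [maxAttempts] tries.
Future<Map<String, dynamic>> requestWithRetry(
  Future<String> Function() send, {
  int maxAttempts = 3,
}) async {
  for (var attempt = 0; attempt < maxAttempts; attempt++) {
    final parsed = tryParseWorkout(await send());
    if (parsed != null) return parsed;
  }
  throw Exception('No valid JSON after $maxAttempts attempts');
}
```

Usage would then be `final workout = await requestWithRetry(sendOpenAIRequest);`.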

Have you tried removing the max_tokens key? It is not really needed unless you want to force short answers, because for gpt-3.5-turbo the output length is already limited.
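Related to this: each choice in the chat completions response carries a finish_reason field that tells you why generation stopped ("stop" for a natural end, "length" when max_tokens cut it off). A minimal sketch of checking it on the raw response body (helper name is hypothetical):

```dart
import 'dart:convert';

/// Returns true when the first choice was cut off by the token limit,
/// based on the finish_reason field of the API response.
bool wasTruncated(String responseBody) {
  final result = json.decode(responseBody) as Map<String, dynamic>;
  final reason = result['choices'][0]['finish_reason'] as String?;
  // 'length' means max_tokens was hit; 'stop' means a natural end.
  return reason == 'length';
}
```

Logging this value alongside `usage` would show whether the cut came from the token limit or from something else (e.g. a transport issue), which in your case, with only 324 completion tokens out of 3000, is worth distinguishing.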