IAsyncEnumerable not streaming response when connected with Azure OpenAI GetChatCompletionsStreamingAsync


In simple cases, IAsyncEnumerable streams the response to the client chunk by chunk, without buffering. But when the same pattern is used with Azure OpenAI, the endpoint does not respond the moment the first chunk arrives; instead it buffers all the chunks and sends them back in a single go.

public async IAsyncEnumerable<ChatMessage> Completions(string prompt)
{
    var response = await openAIClient.GetChatCompletionsStreamingAsync(DeploymentId, BuildChatCompletionOptions(prompt));

    await foreach (var choice in response.Value.GetChoicesStreaming())
    {
        await foreach (var message in choice.GetMessageStreaming())
        {
            // Messages produced via the Azure data extensions arrive on AzureExtensionsContext.
            if (message.AzureExtensionsContext != null)
            {
                ChatMessage chatMessage = new ChatMessage
                {
                    Content = message.AzureExtensionsContext.Messages[0].Content
                };
                yield return chatMessage;
            }

            // Regular delta messages with non-empty content.
            if (message is { Content.Length: > 0 })
            {
                yield return message;
            }
        }
    }
}
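For completeness, BuildChatCompletionOptions is a small helper that turns the prompt into request options. A minimal sketch is below; the exact configuration is not shown above, so the MaxTokens value and the single user message are just placeholders (the real method may also wire up the Azure "on your data" extensions, which is why AzureExtensionsContext is checked in the loop):

    // Minimal sketch of the helper referenced above (simplified; values are placeholders).
    private ChatCompletionsOptions BuildChatCompletionOptions(string prompt) =>
        new ChatCompletionsOptions
        {
            Messages = { new ChatMessage(ChatRole.User, prompt) },
            MaxTokens = 800
        };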

This method is called, via an interface, from a controller:

[HttpPost]
public async IAsyncEnumerable<ChatMessage> PostAsync(string prompt)
{
    await foreach (var chatMessage in _OpenAIServiceGateway.Completions(prompt))
    {
        yield return chatMessage;
    }
}
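By a "simple case" above I mean an endpoint like the following, which does stream as expected: each item reaches the client as soon as it is yielded (the name and delay are illustrative):

    [HttpGet]
    public async IAsyncEnumerable<string> StreamNumbers()
    {
        for (var i = 0; i < 10; i++)
        {
            // Each chunk reaches the client immediately, roughly every 500 ms.
            await Task.Delay(500);
            yield return $"chunk {i}";
        }
    }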

Using .NET 7 and Azure.AI.OpenAI 1.0.0-beta.8.

What am I missing here?
