Real max tokens count for google vertex ai text-unicorn model


According to the official documentation, the maximum input token count for a PaLM 2 for Text (text-unicorn) model is 8192. However, I am getting a 'token limit error' when submitting prompts larger than roughly 7565 tokens (possibly slightly more). I have verified this by counting the prompt tokens before submitting them, using the official token counting API. Am I missing something here? What is the reason for this?

1 answer

Mel

It is indeed odd that there is a discrepancy between the official documentation and the actual token count. The only reason I can think of is buffering: the maximum input token count stated in the documentation may refer to the overall system capacity, while the actual usable limit appears to be lower.
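Until the documentation is clarified, one practical workaround is to validate prompts against the limit observed in practice rather than the documented maximum. Below is a minimal, hypothetical sketch of such a guard; the constants and the `fits_in_context` helper are assumptions for illustration, and in real use the token count would come from the official countTokens API rather than being hard-coded.

```python
# Documented maximum for PaLM 2 for Text (text-unicorn), per the docs.
DOCUMENTED_MAX_TOKENS = 8192
# Effective limit observed in the question above (approximate).
OBSERVED_MAX_TOKENS = 7565

def fits_in_context(token_count: int, safety_margin: int = 0) -> bool:
    """Return True if a prompt of `token_count` tokens should be accepted,
    judged against the conservative observed limit rather than the
    documented one."""
    return token_count + safety_margin <= OBSERVED_MAX_TOKENS

# In real use, `token_count` would be obtained from the Vertex AI
# token counting API before submitting the prompt.
print(fits_in_context(7000))  # within the observed limit
print(fits_in_context(8000))  # under the documented max, but rejected in practice
```

Checking against the observed limit (optionally with a safety margin) avoids relying on the documented figure until the discrepancy is resolved.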

You can submit a ticket in the Issue Tracker (under product and feature requests) so that the appropriate team can look into it and clarify the documentation.