GPT-J and GPT-Neo generate too long sentences

791 Views Asked by Astraport At 28 May 2022 at 19:38

I fine-tuned GPT-J and GPT-Neo models on my own texts and am trying to generate new text. Very often the generated sentences are very long (sometimes 300 characters each), although the sentences in the dataset are of normal length (usually 50-100 characters). I have tried many things, including adjusting the temperature and top_k, but about half of the results still contain long phrases, and I need more short ones.

What else can I try?

Here are long examples of generated results:

The support system that they have built has allowed us as users who are not code programmers or IT administrators some ability to create our own custom solutions without needing much programming experience ourselves from scratch!

All it requires are documents about your inventory process but I've found them helpful as they make sure you do everything right for maximum efficiency because their knowledge base keeps reminding me there's new ways i can be doing some things wrong since upgrading my license so even though its good at finding errors with documentation like an auditor may bring up later downline someone else might benefit if those files dont exist anymore after one year when upgrades renews automatically!


There is 1 solution below

Toakley On 06 December 2022 at 21:47

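With all GPT models you can specify the max_length parameter during generation. This caps the total sequence length in tokens, prompt included; it is an upper bound, so generation can still stop earlier if the model emits its end-of-sequence token. If you want to limit only the generated part, use max_new_tokens instead. Note that either setting cuts the output off rather than making the model write shorter sentences. You could also set num_return_sequences to a value above 1 and use a helper function to choose the shortest sequence.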

Example:

output = model.generate(input_ids, do_sample=True, top_k=50, max_length=100, top_p=0.95, num_return_sequences=1)
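The "choose the shortest sequence" idea could look roughly like the sketch below. The names pick_shortest and generate_candidates are made up for illustration, and the sketch assumes model and tokenizer are an already loaded Hugging Face causal LM and its tokenizer; the sampling values are the ones from the example above, not tuned recommendations.

```python
def pick_shortest(candidates, min_chars=1):
    """Return the shortest usable candidate, measured in characters."""
    # Drop candidates that are empty or whitespace-only after stripping.
    usable = [c.strip() for c in candidates if len(c.strip()) >= min_chars]
    if not usable:
        raise ValueError("no usable candidates")
    return min(usable, key=len)


def generate_candidates(model, tokenizer, prompt, n=5, max_new_tokens=60):
    """Sample n continuations of prompt and return them as decoded strings.

    Assumes `model` and `tokenizer` come from the transformers library
    (for example AutoModelForCausalLM and AutoTokenizer).
    """
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        do_sample=True,
        top_k=50,
        top_p=0.95,
        max_new_tokens=max_new_tokens,        # limits only the generated part
        num_return_sequences=n,
        pad_token_id=tokenizer.eos_token_id,  # GPT-style models have no pad token
    )
    # Each decoded string still starts with the prompt; since every candidate
    # shares it, the length comparison in pick_shortest is unaffected.
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Usage would be along the lines of pick_shortest(generate_candidates(model, tokenizer, "Some prompt")). Picking by character count is only one possible criterion; filtering on the length of the longest sentence in each candidate may match the original problem more closely.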

These large language models are trained on massive amounts of data, and fine-tuning them can take patience as they learn to adapt to what you're feeding them. Try different things: adjust your training data format, try different samples, use a pre-prompt during generation to guide the model, and so on. A model like GPT-J performs a mind-numbingly large number of calculations just to produce a single word, so it is hard to predict exactly what causes it to say one thing over another.
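As one possible reading of "adjust your training data format" and "use a pre-prompt", the sketch below writes each training sentence on its own line, ends it with the end-of-text token, and reuses the same prefix as the prompt at generation time. This is an assumption about what might help, not something confirmed for this dataset. The helper names and the "Review: " prefix are hypothetical; "<|endoftext|>" is the end-of-text token of the GPT-2 style tokenizer that GPT-Neo and GPT-J use.

```python
EOS = "<|endoftext|>"  # end-of-text token of the GPT-2 style tokenizer


def format_training_text(sentences, prefix="", eos=EOS):
    """Build training text with one example per line, each ending in EOS.

    Ending every example with EOS gives the model an explicit signal for
    where a sentence stops, so generation can end there too.
    """
    lines = []
    for sentence in sentences:
        cleaned = " ".join(sentence.split())  # collapse stray whitespace
        if cleaned:                           # skip empty entries
            lines.append(f"{prefix}{cleaned}{eos}")
    return "\n".join(lines)


def build_prompt(prefix="", start=""):
    """Build a generation prompt that reuses the training prefix."""
    return f"{prefix}{start}"


if __name__ == "__main__":
    sample = ["The tool saved us hours.", "Setup was  quick and painless."]
    print(format_training_text(sample, prefix="Review: "))
    print(build_prompt("Review: ", "The"))
```

With data formatted this way, passing eos_token_id=tokenizer.eos_token_id to model.generate lets sampling stop as soon as the model produces the end-of-text token.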
