Choosing your LLM model, context-window and algorithm

Chathive lets you change the LLM model, the context window, and the algorithm that determines how the model is used. This guide helps you decide which configuration is best for your use case.

Default configuration

Every new project in Chathive starts with these settings:

Setting	Default value
Model	GPT 4o-mini
Context size	8k (growth tier and higher) or 4k (free trial or basic tier)
Algorithm	Default

Although these are usually good defaults, you can greatly improve performance by choosing these settings carefully based on your goals.

How to choose an LLM model

Every LLM model has its strengths and weaknesses, so any choice involves a trade-off. Base your choice on these 4 factors:

Accuracy: Models with more parameters and a larger training dataset (like GPT-4o) tend to be a lot more accurate than less complex and smaller models (like GPT-4o-mini). If accuracy and reasoning are important, choose a more complex model.
Speed of generating responses: More complex models tend to require more compute and are as a result slower in generating text. GPT-4o-mini is very quick in comparison to GPT-4o.
Steerability: More complex models generally follow instructions better than less complex ones. If strict adherence to instructions is important, choose a more steerable model. Although more accurate models are typically more steerable, this is not always the case.
Price: More complex models also tend to be more expensive to run and therefore require more message credits.

Model comparison

To help you decide, we have put all models into a table for you to compare and make your choice.

Model	Credits required per question	Accuracy	Steerability	Speed
GPT 3.5-Turbo DEPRECATED	1 credit	Good	Average	Fast
GPT 4o-mini	1 credit	Great	Good	Very fast
GPT 4o	5 credits	Highest accuracy	Great	Fast
GPT 4 Turbo NOT RECOMMENDED	10 credits	High accuracy	Average	Moderate
GPT 4 NOT RECOMMENDED	30 credits	Very high accuracy	Average	Very slow

We generally recommend GPT-4o for most use cases that require a high level of accuracy, and GPT-4o-mini if you need quick replies or have a very large volume and need to cut costs.

Choosing your context window

This is a preview. This feature is currently under development and will be released soon.

Chathive allows you to choose how large a context window to use for each message. A larger context window makes room for more text from your sources in each message. It also allows longer questions from the user and longer replies from the AI assistant.

Larger context window for higher accuracy

We mostly recommend increasing the context window to increase the accuracy of your AI assistant. With a larger context window we can provide larger snippets of text from your sources with each request. This gives the model more input to work with, which tends to improve accuracy.

A larger context window improves accuracy in these ways:

More complete snippets from your sources: Smaller context windows only allow us to include small snippets of your training data as sources. Larger context windows allow us to include larger snippets, thus improving the chance that the full needed context is there for creating the answer.
Using more sources: Smaller context windows also limit the number of sources we can include, which increases the chance that the right sources are left out.
Long questions taking up space for sources: The longer the user's question, the less space we have for sources. If your questions are long, performance will drop sharply with smaller context sizes.
More conversation history: To fit as many sources as we can, we cut the conversation history short if we need that space for the sources. This results in the model knowing less of the past conversation when answering the question. Increasing context size will let the AI assistant retain more memory of the conversation.
Longer responses: Sometimes you need a long explanation for complex cases. More context allows the AI assistant to create longer responses.
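The trade-offs above can be illustrated with a rough token budget. This is only a sketch under stated assumptions: the function name, the round token numbers, and the exact allocation policy are hypothetical and not Chathive's actual implementation. The only behavior taken from this guide is that sources get priority and conversation history is cut when space runs out.

```python
def split_context_budget(context_tokens, question_tokens, reply_reserve,
                         wanted_source_tokens, history_tokens):
    """Hypothetical policy: question and reply come first, then sources, then history."""
    # Space left after the question and the room reserved for the reply.
    available = max(context_tokens - question_tokens - reply_reserve, 0)
    # Sources are filled first, up to what we would like to include.
    sources = min(wanted_source_tokens, available)
    # History only gets whatever is left over after the sources.
    history = min(history_tokens, available - sources)
    return {"sources": sources, "history": history}


# Same question, same reply reserve, two context sizes (illustrative numbers).
small = split_context_budget(4_000, 500, 1_000, 6_000, 2_000)
large = split_context_budget(16_000, 500, 1_000, 6_000, 2_000)
print(small)  # {'sources': 2500, 'history': 0}
print(large)  # {'sources': 6000, 'history': 2000}
```

In this toy example the small window truncates the sources and drops the history entirely, while the larger window fits both.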

Overview of available context windows

Model	4k	8k	16k	32k	64k	128k
GPT 3.5-turbo	✅	✅	✅
GPT-4	✅	✅	⚠️ *	⚠️ *
GPT-4-turbo	✅	✅	✅	✅	✅	✅
GPT-4o	✅	✅	✅	✅	✅	✅
GPT-4o-mini	✅	✅	✅	✅	✅	✅

⚠️ *: This symbol means the context window is available but at extreme cost. We recommend choosing a more efficient model.

For the pricing implications of these context windows, see our message credits documentation. As a simple rule of thumb, the cost scales with the context window: a window x times larger multiplies the message credit cost by x. For example, 8k doubles the message credit cost compared to 4k, 16k quadruples it, and so on.
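The rule of thumb can be written as a small calculation. This is a sketch, not the official pricing formula: it assumes that the credits listed in the model comparison table apply at the 4k context size, which this guide does not state explicitly, and the function name is hypothetical. Check the message credits documentation for actual prices.

```python
BASE_CONTEXT_K = 4  # assumption: listed credits apply at the 4k context size
CONTEXT_SIZES_K = (4, 8, 16, 32, 64, 128)  # sizes from the overview table


def credits_per_question(base_credits, context_k):
    """Rule-of-thumb estimate: cost grows linearly with the context window."""
    if context_k not in CONTEXT_SIZES_K:
        raise ValueError(f"unknown context size: {context_k}k")
    return base_credits * (context_k // BASE_CONTEXT_K)


print(credits_per_question(5, 8))   # 5 base credits at 8k  -> 10
print(credits_per_question(1, 16))  # 1 base credit at 16k -> 4
```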

Choosing your algorithm

The last step is to decide which algorithm to use. An algorithm determines how the model and context window are used to find sources and generate responses.

Currently, we offer 2 distinct algorithms:

Default: Uses the question to find sources and feeds those into the model.
Fusion algorithm: Rephrases the question a few times and uses all those rephrasings to search the database. This algorithm greatly improves the search results if the user uses different words or phrasings than are used in your AI database. It slightly increases the time it takes for the AI to start responding, as it first has to rephrase and only then can start responding.
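The idea behind the fusion algorithm can be sketched as follows. This guide does not describe how Chathive merges the results of the different rephrasings, so the sketch uses reciprocal rank fusion, a common merging technique, purely as an assumption. The `rephrase` and `search` callables, the toy index, and the document ids are all hypothetical stand-ins for an LLM rephraser and a real source search.

```python
from collections import defaultdict


def fused_search(question, rephrase, search, k=60, top_n=3):
    """Search with the question and its rephrasings, then merge the rankings.

    Merging uses reciprocal rank fusion (an assumption, not Chathive's
    documented method): each hit scores 1 / (k + rank) per query.
    """
    queries = [question] + rephrase(question)
    scores = defaultdict(float)
    for query in queries:
        for rank, doc_id in enumerate(search(query), start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by document id for determinism.
    ranked = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ranked[:top_n]


# Toy stand-ins: the user's wording ("change passcode") matches nothing,
# but the rephrasings match the wording used in the sources.
TOY_INDEX = {
    "how do I reset my password": ["kb-reset-password", "faq-login"],
    "forgot login credentials": ["kb-reset-password"],
}


def toy_rephrase(question):
    return ["how do I reset my password", "forgot login credentials"]


def toy_search(query):
    return TOY_INDEX.get(query, [])


results = fused_search("change passcode", toy_rephrase, toy_search)
print(results)  # ['kb-reset-password', 'faq-login']
```

The extra rephrasing step runs before any search can start, which matches the slight delay before the AI begins responding.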

You can also find the pricing implications in our message credits documentation.