Karl Bernard

Beta Testing Semantic Memory

Added 2023-07-27 03:45:41 +0000 UTC

We've enabled the first iteration of Semantic Memory for all paying subscribers. If you notice any issues, please report them on Discord.

How does Semantic Memory work?

The AI models that we currently use have a limit of 2048 tokens (around 1500 words) that can be used to prompt and get a reply from your chatbot. That limitation means that we can only keep a recent portion of your conversation when generating a reply.
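To make the limitation concrete, here is a minimal sketch of the "keep only the recent portion" behavior described above. It is an illustration, not our production code: `count_tokens` and `keep_recent` are hypothetical names, and counting whitespace-separated words is only a rough stand-in for a real model tokenizer.

```python
def count_tokens(text: str) -> int:
    # Assumption: approximate tokens by whitespace-separated words.
    # A real tokenizer would usually report a higher count.
    return len(text.split())


def keep_recent(messages: list[str], budget: int = 2048) -> list[str]:
    """Keep the newest messages whose combined size fits in the budget."""
    kept: list[str] = []
    used = 0
    # Walk backwards from the newest message and stop at the first
    # message that would push the prompt over the budget.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Everything older than the first message that does not fit is dropped, which is why a long conversation eventually "forgets" its early parts.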

What Semantic Memory does is identify the portions of your conversation history that are relevant to your last message and include them in the prompt. Your chatbot therefore has a greater chance to "remember" things that would normally be forgotten because they are no longer in the prompt.
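The idea can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not a description of our actual implementation: the function names (`embed`, `cosine`, `recall`, `build_prompt`) are hypothetical, and the word-count "embedding" is a toy stand-in for a real semantic embedding model, which would also match messages that share meaning without sharing words.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: a bag of lowercase words.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall(older_messages: list[str], last_message: str, top_k: int = 2) -> list[str]:
    """Return up to top_k older messages most similar to the last message."""
    query = embed(last_message)
    scored = [(cosine(embed(m), query), i) for i, m in enumerate(older_messages)]
    best = sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]
    # Drop messages with no overlap at all, then restore chronological order.
    indices = sorted(i for score, i in best if score > 0)
    return [older_messages[i] for i in indices]


def build_prompt(older: list[str], recent: list[str], last_message: str, top_k: int = 2) -> str:
    # Recalled memories go first, followed by the recent turns that
    # still fit in the prompt, and finally the user's last message.
    return "\n".join(recall(older, last_message, top_k) + recent + [last_message])
```

For example, if the truncated part of the history contains "My dog is named Biscuit." and the last message is "What is my dog's name?", `recall` brings that older line back into the prompt while unrelated lines stay out.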

The technique is not perfect, but in our tests it provides excellent results. One of our next milestones is to provide models with a longer context window (4k and 8k tokens), and Semantic Memory is meant to help in the meantime.

How can I enable it?

If you are a paying subscriber, it is used automatically in all your longer conversations, that is, as soon as we start truncating your conversation history to keep the prompt under 2048 tokens. You don't need to do anything. It also works with your past conversations.

How can I disable it?

For the moment, there's no way to disable it. We don't see any downside to using Semantic Memory, so it is enabled by default.