Using the sandbox

To use the RAG Sandbox, first create a vectorization experiment. Refer to the Experiments Quickstart for more information. When viewing the results of the experiment, you’ll see a button labeled “Open in RAG Sandbox”. Click it to start the Sandbox.

The Sandbox has quite a few inputs. Let’s look at each area while following the typical flow of a Generative AI (GenAI) application. Once you have set the inputs, click the “Submit” button to see how the LLM responds.

Vector search index

The options in this area are carried over from your experiment. Depending on which vectorization strategy you chose, there may be several search indexes to choose from. Toggle between different indexes while keeping the same question/prompt input to see differences in the LLM response. LLMs can respond quite differently to different combinations of embedding models and chunking configurations. Learn more about vectorization strategies.
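To make “chunking configurations” concrete, here is a minimal sketch of one common strategy: fixed-size chunks with overlap. The function and its numbers are illustrative only, not the Sandbox’s actual implementation.

```python
# Minimal sketch of one chunking configuration: fixed-size character
# chunks with overlap. Chunk size and overlap are illustrative only;
# the experiment's actual strategies may differ.
def chunk_text(text: str, size: int, overlap: int) -> list[str]:
    """Split text into chunks of `size` characters, each sharing
    `overlap` characters with the previous chunk."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk_text("The quick brown fox jumps over the lazy dog.", size=16, overlap=4)
```

Smaller chunks tend to produce more precise search hits, while larger chunks carry more surrounding context into the prompt, which is why different configurations can change the LLM’s answer.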

System behavior

This is a way to help the LLM understand its role when generating a completion. Depending on the type of data you have embedded, the role could be “You are a chef answering questions for a class” or “You are an English poet”. Learn more about role prompting↗.
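In chat-style LLM APIs, the system behavior text is typically sent as a “system” message ahead of the user’s question. The sketch below models that common message format; it is an assumption, not a description of the Sandbox’s internals.

```python
# Sketch: the system-behavior text becomes the first ("system") message,
# followed by the user's question. The message shape mirrors common
# chat-completion APIs and is an assumption, not the Sandbox's actual request.
def build_messages(system_behavior: str, user_question: str) -> list[dict]:
    return [
        {"role": "system", "content": system_behavior},
        {"role": "user", "content": user_question},
    ]

messages = build_messages(
    "You are a chef answering questions for a class",
    "How long should I proof the dough?",
)
```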


Question

This is where you provide the question that you want the LLM to answer (the LLM’s answer is called a completion). A few sample questions appear below this input; these were created while running the experiment. Learn more about the sample questions in the experiment results.

Your question should clearly describe the information you are seeking. Try to keep it short and to the point.


Prompt

A prompt is what you submit to an LLM for completion. It typically includes a question, direction on how to answer, and additional up-to-date context that the LLM might not know about. A premade prompt template, with placeholders and text used to create the final prompt, is loaded into your sandbox. You can edit it as needed.



This placeholder will be replaced with the value from the Question input when creating the final prompt.


This placeholder will be replaced with the search results from your vector data. Choose different vector indexes to change what search results are used.
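Putting the pieces together, the final prompt is built by substituting the question and the retrieved search results into the template. Here is a sketch using hypothetical placeholder names (`{question}` and `{context}`), not the Sandbox’s actual tokens.

```python
# Illustrative sketch of final-prompt assembly. The template wording and
# the {question}/{context} placeholder names are hypothetical, not the
# Sandbox's actual tokens.
TEMPLATE = """Use the context below to answer the question.

Context:
{context}

Question:
{question}
"""

def build_prompt(question: str, search_results: list[str]) -> str:
    """Replace the placeholders with the question and the vector search hits."""
    return TEMPLATE.format(
        context="\n".join(search_results),
        question=question,
    )

prompt = build_prompt(
    "What wine pairs with salmon?",
    ["Doc 1: Pinot Noir pairs well with salmon.",
     "Doc 2: Salmon is a fatty, flavorful fish."],
)
```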

Retrieved Context

This area will be filled with the results from the vector search. It is a read-only input that gives you insight into what context the LLM was given.

LLM Response

After clicking the “Submit” button, a final prompt is created from all the inputs and submitted to the LLM. Its response is shown in this read-only area.

You can choose from a list of supported LLMs to further experiment with your GenAI solution. The following LLMs are currently available.

Supported Large Language Models

GPT 3.5 Turbo

Powered by OpenAI↗. Learn more about OpenAI models here↗.


Powered by Groq↗.


Powered by Groq↗.


Powered by Groq↗.


Powered by Groq↗.

Other LLM configurations


Temperature

This sliding scale (from 0 to 2) controls how predictable or creative the LLM is allowed to be. Choose 0 (zero) to keep the LLM more limited (and predictable) in its responses, or slide toward 2 (two) to let it be more creative (and random).

Top P

This sliding scale (from 0 to 1) decides how many candidate words the LLM considers at each step: only the most likely words whose combined probability reaches the Top P value stay in the running. A higher value lets the LLM draw from a wider pool and produce a more diverse answer.

Read more↗ about Temperature, Top P, and how the two perform together.
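As a toy illustration of what Top P does, the following keeps only the most probable next words until their combined probability reaches the threshold. The word probabilities here are made up.

```python
# Toy illustration of Top P (nucleus sampling): keep the most probable
# words until their cumulative probability reaches the threshold.
# The probabilities below are invented for the example.
def top_p_filter(probs: dict[str, float], top_p: float) -> dict[str, float]:
    kept, total = {}, 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        total += p
        if total >= top_p:
            break
    return kept

probs = {"red": 0.5, "white": 0.3, "rose": 0.15, "orange": 0.05}
top_p_filter(probs, 0.75)  # keeps {'red': 0.5, 'white': 0.3}
```

Temperature then controls how randomly the LLM samples from whatever pool of words survives this cut, which is why the two settings are usually tuned together.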
