
Overview

What is prompt management?

Building LLM-powered applications is an iterative process. In each iteration, you aim to improve the application's performance by refining prompts, adjusting configurations, and evaluating outputs.

[Figure: Illustration of the LLMOps process]

A prompt management system gives you the tools to run this process systematically:

  • Versioning Prompts: Keeping track of different prompts you've tested.
  • Linking Prompt Variants to Experiments: Connecting each prompt variant to its evaluation metrics to understand the effect of changes and determine the best variant.
  • Publishing Prompts: Providing a way to publish the best prompt variants to production and maintain a history of changes in production systems.
  • Associating Prompts with Traces: Monitoring how changes in prompts affect production metrics.
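To make the four capabilities above concrete, here is a minimal in-memory sketch. It is not Agenta's API: `PromptRegistry`, its methods, the author names, and the scores are all hypothetical, chosen only to show how versions, evaluation results, deployments, and traces can reference each other.

```python
from dataclasses import dataclass, field


@dataclass
class PromptVersion:
    version: int
    template: str
    author: str
    scores: dict = field(default_factory=dict)  # metric name -> value


class PromptRegistry:
    """Hypothetical in-memory registry (illustration only, not Agenta's API)."""

    def __init__(self):
        self.versions = []     # append-only history of prompt versions
        self.deployments = {}  # environment name -> list of published version numbers

    def commit(self, template, author):
        # Versioning: every change becomes a new numbered version.
        version = PromptVersion(len(self.versions) + 1, template, author)
        self.versions.append(version)
        return version

    def record_score(self, version_number, metric, value):
        # Linking to experiments: attach an evaluation result to a version.
        self.versions[version_number - 1].scores[metric] = value

    def best(self, metric):
        scored = [v for v in self.versions if metric in v.scores]
        return max(scored, key=lambda v: v.scores[metric])

    def publish(self, environment, version_number):
        # Publishing: keep the full deployment history per environment.
        self.deployments.setdefault(environment, []).append(version_number)

    def current(self, environment):
        return self.versions[self.deployments[environment][-1] - 1]


registry = PromptRegistry()
registry.commit("Summarize: {text}", author="alice")
registry.commit("Summarize in two sentences: {text}", author="bob")

# Made-up scores, for illustration only.
registry.record_score(1, "accuracy", 0.71)
registry.record_score(2, "accuracy", 0.84)

winner = registry.best("accuracy")
registry.publish("production", winner.version)

# Associating with traces: each production trace records the version it used.
trace = {"output": "...", "prompt_version": registry.current("production").version}
```

Because each trace stores a version number, a question such as "which prompt produced this generation?" becomes a lookup in `registry.versions`.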

Capabilities in Agenta

Agenta lets you create prompts from both the web UI and code, version them, and publish them to endpoints tied to named environments. You can also run evaluations on these prompts from the web UI or from code, and Agenta links the evaluation results and observability data to the corresponding prompt version.

Why do I need a prompt management system?

A prompt management system enables everyone on the team, from product owners to subject matter experts, to collaborate on creating prompts. It also helps you answer the following questions:

  • Which prompts have we tried?
  • What were the outputs of these prompts?
  • How do the evaluation results of these prompts compare?
  • Which prompt was used for a specific generation in production?
  • What was the effect of publishing the new version of this prompt in production?
  • Who on the team made changes to a particular prompt in production?

Configuration management in Agenta

Agenta goes beyond prompt management to encompass the entire configuration of your LLM applications. If your LLM workflow is more complex than a single prompt (e.g., Retrieval-Augmented Generation (RAG) or a chain of prompts), you can version the whole configuration together.

Unlike a single prompt, the configuration of an LLM application can include parameters beyond prompt templates and models (with their parameters). For instance:

  • An LLM application using a chain of two prompts would have a configuration that includes the two prompts and their respective model parameters.
  • An application that includes a RAG pipeline would have a configuration that includes parameters such as top_k and embedding.

Example RAG configuration

{
  "top_k": 3,
  "embedding": "text-embedding-3-large",
  "prompt-query": "We have provided context information below. {context_str}. Given this information, please answer the question: {query_str}\n",
  "model-query": "openai/gpt-o1",
  "temperature-query": "1.0"
}
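
As a rough sketch of how an application might consume such a configuration, the snippet below builds a model request from it. The `build_request` helper and the sample chunks are assumptions made for illustration. The retriever and the model call are left out, and this is not Agenta's SDK.

```python
# The example configuration from above, as a Python dict.
config = {
    "top_k": 3,
    "embedding": "text-embedding-3-large",
    "prompt-query": (
        "We have provided context information below. {context_str}. "
        "Given this information, please answer the question: {query_str}\n"
    ),
    "model-query": "openai/gpt-o1",
    "temperature-query": "1.0",
}


def build_request(config, ranked_chunks, question):
    """Hypothetical helper: turn a configuration plus retrieved chunks into a request."""
    # Keep only the top_k chunks (assumed to be sorted by relevance already).
    context = "\n".join(ranked_chunks[: config["top_k"]])
    prompt = config["prompt-query"].format(context_str=context, query_str=question)
    return {
        "model": config["model-query"],
        # The example stores temperature as a string, so convert it before use.
        "temperature": float(config["temperature-query"]),
        "prompt": prompt,
    }


request = build_request(
    config,
    ["chunk A", "chunk B", "chunk C", "chunk D"],  # placeholder retrieval results
    "What is prompt management?",
)
```

Every value that shapes the request (retrieval depth, template, model, temperature) comes from the configuration, so each of them can be changed and versioned without touching the code.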

Agenta lets you version the entire configuration of an LLM application as a unit. This matters because the parts of a configuration depend on one another. For instance, in a chain of two prompts, the second prompt consumes the output of the first, so a change to the first often requires a matching change to the second. Versioning them together keeps the configuration consistent and traceable.
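
One way to picture versioning a configuration as a unit is to derive a single identifier from the whole configuration, so that a change to any part produces a new version. The sketch below uses a content hash for this. It is an illustrative technique, not a description of how Agenta stores versions, and the configuration keys and the `config_version` helper are hypothetical.

```python
import copy
import hashlib
import json


def config_version(config):
    """Return a short identifier derived from the full configuration."""
    # Canonical JSON (sorted keys) makes the hash independent of key order.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


# Hypothetical chain of two prompts that share one configuration.
chain_v1 = {
    "prompt-extract": "List the key facts in: {document}",
    "prompt-summarize": "Write a summary from these facts: {facts}",
    "model": "example-model",
    "temperature": 0.2,
}

# The first prompt now emits JSON, so the second prompt changes with it.
chain_v2 = copy.deepcopy(chain_v1)
chain_v2["prompt-extract"] = "List the key facts in {document} as JSON."
chain_v2["prompt-summarize"] = "Write a summary from this JSON list of facts: {facts}"

# Both prompts are stored under one version identifier, never separately.
history = {config_version(c): c for c in (chain_v1, chain_v2)}
```

Rolling back to the identifier of `chain_v1` restores both prompts at once, which avoids pairing the new first prompt with the old second one.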
