LlamaGym is an open-source Python framework for fine-tuning large language model (LLM) agents with online reinforcement learning. Developed by KhoomeiK, it handles the bookkeeping involved in training LLM-based agents: conversation context, episode batching, reward assignment, and proximal policy optimization (PPO) setup. This makes it useful to developers and researchers who want to train agents rather than build training infrastructure.

LlamaGym is currently a weekend project and is still evolving. Contributions to the project are welcome, and ongoing community involvement is expected to extend its capabilities.

Features

LlamaGym offers features that let developers focus on their models instead of the underlying training complexities. Below is a summary of the key features:

| Feature | Description |
| --- | --- |
| Simplified Fine-Tuning Process | Provides a standardized environment similar to OpenAI's Gym, abstracting away complexity so developers can focus on model training. |
| Easy Integration | Integrates into existing projects, enabling rapid experimentation with agent prompting and hyperparameters. |
| Customizable | An abstract Agent class can be subclassed to fit specific needs, making it suitable for both research and development. |

Use Cases

LlamaGym can be applied in various scenarios.

How to get started

To get started with LlamaGym, install it using pip:

pip install llamagym

After installation, you can create your own Agent class by defining the following three abstract methods:
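get_system_prompt, format_observation, and extract_action. Below is an example implementation for a Blackjack agent:

import torch
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead

from llamagym import Agent

class BlackjackAgent(Agent):
    def get_system_prompt(self) -> str:
        # System message that frames the agent's role
        return "You are an expert blackjack player."

    def format_observation(self, observation) -> str:
        # Turn the environment observation into a user message
        return f"Your current total is {observation[0]}"

    def extract_action(self, response: str):
        # Map the model's text reply to an action: 0 = stay, 1 = hit
        return 0 if "stay" in response else 1

# "Llama-2-7b" is the identifier as given; substitute a model you have access to
device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForCausalLMWithValueHead.from_pretrained("Llama-2-7b").to(device)
tokenizer = AutoTokenizer.from_pretrained("Llama-2-7b")
agent = BlackjackAgent(model, tokenizer, device)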
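
Once an agent exists, it is stepped through episodes of an environment. The sketch below illustrates that loop without a model or GPU, so it can be run anywhere. It is not LlamaGym's implementation: StubBlackjackAgent, ToyBlackjackEnv, and fake_llm are hypothetical stand-ins, and the method names act, assign_reward, and terminate_episode are assumed here to mirror the calls a LlamaGym agent exposes. The toy environment only imitates the reset/step return shape of a Gym-style environment.

```python
import random


class StubBlackjackAgent:
    """Hypothetical stand-in for an agent; a canned reply replaces the LLM."""

    def __init__(self):
        self.current_rewards = []  # rewards for the episode in progress
        self.episodes = []         # finished episodes (reward lists)

    def format_observation(self, observation) -> str:
        return f"Your current total is {observation[0]}"

    def extract_action(self, response: str) -> int:
        return 0 if "stay" in response else 1  # 0 = stay, 1 = hit

    def fake_llm(self, prompt: str) -> str:
        # Placeholder for model generation: stay on 17 or more, else hit.
        total = int(prompt.rsplit(" ", 1)[-1])
        return "I will stay." if total >= 17 else "I will hit."

    def act(self, observation) -> int:
        # observation -> prompt -> text reply -> discrete action
        prompt = self.format_observation(observation)
        return self.extract_action(self.fake_llm(prompt))

    def assign_reward(self, reward: float) -> None:
        self.current_rewards.append(reward)

    def terminate_episode(self) -> int:
        # A real agent would run a PPO update once enough episodes are batched;
        # this stub only stores the episode and reports how many it holds.
        self.episodes.append(self.current_rewards)
        self.current_rewards = []
        return len(self.episodes)


class ToyBlackjackEnv:
    """Simplified one-player game: draw until you stay or exceed 21."""

    def __init__(self, seed: int = 0):
        self.rng = random.Random(seed)
        self.total = 0

    def reset(self):
        self.total = self.rng.randint(4, 16)
        return (self.total,), {}

    def step(self, action: int):
        if action == 1:  # hit: draw a card worth 1 to 10
            self.total += self.rng.randint(1, 10)
            if self.total > 21:
                return (self.total,), -1.0, True, False, {}  # bust
            return (self.total,), 0.0, False, False, {}
        reward = 1.0 if self.total >= 17 else 0.0  # stay ends the episode
        return (self.total,), reward, True, False, {}


def run_episodes(agent, env, n_episodes: int):
    for _ in range(n_episodes):
        observation, info = env.reset()
        done = False
        while not done:
            action = agent.act(observation)
            observation, reward, terminated, truncated, info = env.step(action)
            agent.assign_reward(reward)  # credit the reward to the last action
            done = terminated or truncated
        agent.terminate_episode()
    return agent.episodes


episodes = run_episodes(StubBlackjackAgent(), ToyBlackjackEnv(seed=0), n_episodes=5)
```

Note that extract_action, as written in the example above, matches the lowercase substring "stay" and treats every other reply as a hit, so a reply such as "Stay" would be read as a hit. When adapting it, consider normalizing case or handling unparseable replies explicitly.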