Browser Sessions Overview

Browser sessions are the mechanism through which your O-mega agents interact with the live web. When you ask your agent to visit a website, fill out a form, extract data, or perform any web-based task, it happens through a browser session—a real browser running in the cloud that your agent controls.

This isn't web scraping in the traditional sense. Your agent operates a full browser with all the capabilities of a human user: JavaScript execution, cookie handling, form interactions, file downloads, and everything else a modern browser can do. Websites see a real browser session, not automated HTTP requests.

How Browser Sessions Work

When you ask your agent to perform a web task, O-mega handles the setup in the background: a remote browser session is created, your agent takes control, and work begins.

The process unfolds in several stages:

  1. A remote browser session is created in the cloud
  2. O-mega provides a live view URL so you can watch in real time if you wish
  3. Your agent executes each step while the system tracks progress
  4. Deliverables—data, files, screenshots—are extracted and saved along the way
  5. When the task is complete, results are returned to your conversation

Throughout this process, you have visibility into what's happening. You can watch your agent navigate, see what pages it visits, and observe actions as they occur.
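The stages above can be sketched as a simple sequence. This is an illustrative model only, not O-mega's API: the names Session and run_web_task, the step strings, and the example live view URL are all hypothetical and exist only to show the order in which things happen.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical stand-in for a remote browser session."""
    live_view_url: str
    completed_steps: list = field(default_factory=list)
    deliverables: list = field(default_factory=list)


def run_web_task(steps):
    """Walk through the five stages in order (simulated, no real browser)."""
    # 1. A remote browser session is created (simulated here).
    session = Session(live_view_url="https://example.invalid/live/session-1")
    # 2. The live view URL is available from this point on.
    print(f"Watch live: {session.live_view_url}")
    for step in steps:
        # 3. Each step runs while progress is tracked.
        session.completed_steps.append(step)
        # 4. Deliverables are saved along the way.
        session.deliverables.append(f"result of {step}")
    # 5. Results are returned to the conversation.
    return session.deliverables
```

Calling run_web_task(["open page", "extract table"]) in this sketch returns one placeholder deliverable per step, in the order the steps were given.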

Session Statuses

Every browser session moves through a defined lifecycle. Understanding these statuses helps you know what's happening at any moment.

A session can be in one of the following statuses:

  • Loading - The session is spinning up and preparing the browser environment
  • Active - Your agent is actively working in the browser, taking actions and navigating
  • Paused - The session is paused, either for manual takeover or while waiting for something
  • Finished - The task has been completed and the session is closed

You'll see the current status in your conversation interface whenever a browser session is in progress.
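One way to picture the lifecycle is as a small state machine. The four statuses below come from the list above; the allowed transitions between them are an assumption made for illustration, since this page does not spell out the exact rules, and the names SessionStatus and can_transition are hypothetical.

```python
from enum import Enum


class SessionStatus(Enum):
    LOADING = "Loading"
    ACTIVE = "Active"
    PAUSED = "Paused"
    FINISHED = "Finished"


# Assumed transitions: the documentation names the statuses but does not
# define which moves between them are permitted.
ALLOWED_TRANSITIONS = {
    SessionStatus.LOADING: {SessionStatus.ACTIVE},
    SessionStatus.ACTIVE: {SessionStatus.PAUSED, SessionStatus.FINISHED},
    SessionStatus.PAUSED: {SessionStatus.ACTIVE, SessionStatus.FINISHED},
    SessionStatus.FINISHED: set(),
}


def can_transition(current, target):
    """Return True if moving from current to target is allowed in this model."""
    return target in ALLOWED_TRANSITIONS[current]
```

In this model a Paused session can resume to Active (for example, after a manual takeover ends), while a Finished session cannot change status again.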

Live View

You can watch any browser session in real time. Every session has a live view URL that connects you directly to what your agent is seeing.

Through the live view, you can observe exactly what the agent sees on screen. You can watch it fill out forms, navigate between pages, and interact with elements. This transparency is valuable for understanding how your agent approaches tasks and for troubleshooting when something doesn't work as expected.

The live view is also the gateway to manual takeover—if you need to step in and help your agent with something, you'll do it through this same interface.

What Agents Can Do in Browser Sessions

Browser sessions give your agents comprehensive web capabilities. They can handle essentially any task a human could perform in a web browser.

Agent capabilities in browser sessions include:

  • Navigate to any website using URLs you provide or that they discover
  • Click buttons, links, and interactive elements on any page
  • Fill out forms, enter text, and submit data
  • Extract text, tables, and structured data from pages
  • Download files that are available on websites
  • Take screenshots for verification or as deliverables
  • Handle login flows using saved accounts from your agent's configuration
  • Navigate through multi-step processes that span multiple pages

This combination of capabilities means your agents can handle complex workflows that require real browser interaction—things that simple APIs or scraping tools can't accomplish.

Session Profiles

Browser sessions can use persistent profiles that carry state from one session to the next, including cookies, login sessions, and preferences.

The practical benefit is significant: your agent doesn't need to log in to websites every time it starts a new session. Once logged in, that authentication state is saved. Future sessions pick up where previous ones left off, skipping login flows and saving both time and credits.

Profiles are managed automatically based on your agent's connected accounts. When you add an account to an agent, the profile system ensures that login state persists appropriately.
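The effect of a persistent profile can be illustrated with a minimal sketch. Nothing here reflects how O-mega stores profiles internally: ProfileStore, start_session, the "auth_token" cookie name, and the placeholder token value are hypothetical, chosen only to show why a second session can skip the login flow.

```python
class ProfileStore:
    """Hypothetical in-memory store of browser state, keyed by profile name."""

    def __init__(self):
        self._cookies = {}

    def load(self, profile):
        # Return a copy so callers cannot mutate stored state by accident.
        return dict(self._cookies.get(profile, {}))

    def save(self, profile, cookies):
        self._cookies[profile] = dict(cookies)


def start_session(store, profile):
    """Start a simulated session; return True if a login was needed."""
    cookies = store.load(profile)
    needs_login = "auth_token" not in cookies
    if needs_login:
        # Placeholder for a real login flow.
        cookies["auth_token"] = "token-from-login"
    # Persist state so the next session with this profile can reuse it.
    store.save(profile, cookies)
    return needs_login
```

With a single store, the first start_session call for a given profile reports that a login was needed, and later calls for the same profile do not. A different profile name starts from a clean state.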

Related: Manual Browser Takeover | Browser Profiles