Manual Browser Takeover

Even the most capable AI agents occasionally encounter situations they can't handle alone. A website might require CAPTCHA verification that the agent can't solve, or two-factor authentication might need a code from your phone. Manual browser takeover lets you step in at any moment, take control of your agent's browser session, and handle these situations yourself.

This feature transforms potential blockers into minor pauses. Instead of a task failing because of an unexpected security check, you can briefly take over, clear the obstacle, and hand control back to your agent to continue its work.

When to Use Manual Takeover

Manual takeover is designed for moments when human intervention makes the difference between success and failure. It's your escape hatch for edge cases that are difficult or impossible for agents to handle autonomously.

Common scenarios for manual takeover include:

  • A website presents a CAPTCHA that your agent can't solve
  • Two-factor authentication requires a code from your authenticator app or phone
  • An unexpected popup or dialog appears that the agent doesn't recognize
  • You need to complete a step that requires information only you have access to
  • You want to guide your agent through a particularly complex or unfamiliar flow

The key insight is that takeover is a collaboration tool. You're not replacing your agent—you're helping it get past a specific obstacle so it can continue working.

How to Take Over

The takeover process is simple and designed to be as seamless as possible. You pause the session, do what needs to be done, and resume.

To take over a browser session:

  1. While viewing an active browser session, click the Pause button
  2. The session status changes to Paused
  3. Use the live view to interact with the browser directly—you can click, type, and navigate
  4. Complete whatever action is needed (for example, solve a CAPTCHA or enter a two-factor authentication code)
  5. Click Resume to hand control back to your agent

During takeover, you have full control of the browser. Your agent waits patiently for you to finish, maintaining its context and understanding of the task at hand.
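The pause and resume steps above can be pictured as a small two-state machine. The sketch below is an illustration only: `BrowserSession`, `Status`, and the method names are hypothetical and do not reflect the product's actual API. It also assumes, for modelling purposes, that the human acts only while the session is paused.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    PAUSED = "paused"


@dataclass
class BrowserSession:
    """Hypothetical model of a session; not the product's real API."""
    status: Status = Status.ACTIVE
    log: list[str] = field(default_factory=list)

    def pause(self) -> None:
        # Clicking Pause: the session status changes to Paused.
        self.status = Status.PAUSED

    def resume(self) -> None:
        # Clicking Resume: control goes back to the agent.
        self.status = Status.ACTIVE

    def agent_action(self, action: str) -> bool:
        # The agent takes no actions while the session is paused.
        if self.status is not Status.ACTIVE:
            return False
        self.log.append(f"agent: {action}")
        return True

    def human_action(self, action: str) -> bool:
        # Assumption of this sketch: the human acts only during takeover.
        if self.status is not Status.PAUSED:
            return False
        self.log.append(f"human: {action}")
        return True


session = BrowserSession()
session.agent_action("open login page")
session.pause()
blocked = session.agent_action("click submit")  # refused while paused
session.human_action("enter two-factor code")
session.resume()
session.agent_action("continue task")
```

Both parties write to the same `log` because, as described above, your actions go through the same browser session the agent was using.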

What Happens During Takeover

When you pause a session, the agent immediately stops taking any actions. It won't click, navigate, or interact with the page while you're in control.

The current page state is preserved exactly as it was when you paused. You can interact with the page freely—fill in fields, click buttons, navigate to different pages if needed. All your actions are performed through the same browser session the agent was using.

Behind the scenes, your agent's context is saved. It remembers what task it was working on, what steps it had completed, and what it was trying to accomplish. This context persists through the takeover so the agent can seamlessly continue when you resume.

Resuming the Session

After you've handled whatever required your intervention, resuming is a single click. The agent picks up from the new page state and continues toward its goal.

When you resume:

  • Your agent receives updated context about the current page state (which might have changed during your takeover)
  • It continues working on the original task from wherever it left off
  • If the situation changed significantly, it may ask clarifying questions before proceeding
  • All progress made before the pause is preserved

The transition is designed to be smooth. Your agent adapts to whatever state the browser is in after your intervention and figures out how to proceed from there.
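One way to think about resuming is as a comparison between the page the agent last saw and the page it finds after your intervention. The sketch below is a simplified assumption, not a description of how the agent actually decides: `AgentContext` and `next_move` are hypothetical names, and comparing URLs is only a crude stand-in for "the situation changed significantly".

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class AgentContext:
    """Hypothetical snapshot of what the agent remembers across a pause."""
    task: str
    completed_steps: list[str] = field(default_factory=list)
    last_seen_url: str = ""


def next_move(context: AgentContext, current_url: str) -> str:
    """Pick the agent's next move when control is handed back."""
    before = urlsplit(context.last_seen_url)
    after = urlsplit(current_url)
    # Refresh only the page state; the task and completed steps are untouched.
    context.last_seen_url = current_url
    if before.netloc != after.netloc:
        return "ask"        # different site: treat as a significant change
    if before != after:
        return "reobserve"  # same site, different page: re-read, then continue
    return "continue"       # nothing moved: carry on from where it left off


ctx = AgentContext(
    task="download last month's invoice",
    completed_steps=["opened login page"],
    last_seen_url="https://example.com/login",
)
first = next_move(ctx, "https://example.com/dashboard")
second = next_move(ctx, "https://example.com/dashboard")
third = next_move(ctx, "https://other.example/")
```

In this model the progress made before the pause (`completed_steps`) survives every call, which mirrors the last point in the list above.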

Sign-In Assistance

One of the most common uses for manual takeover is helping agents through authentication flows. Many websites have protections specifically designed to stop automated access, which can create challenges even for sophisticated browser automation.

A typical sign-in assistance workflow looks like this:

  1. Let your agent navigate to the login page and attempt to sign in
  2. If the site presents anti-bot protections, pause the session
  3. Enter your credentials manually if needed
  4. Complete any MFA or CAPTCHA challenges
  5. Resume so your agent can continue with the now-authenticated session

This approach combines the efficiency of automation with human judgment for the specific moments that need it. Your agent handles the bulk of the work; you step in only when genuinely necessary.
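The five workflow steps can also be traced in a few lines of code. This is a hedged sketch under stated assumptions: the `sign_in` function, the challenge labels in `HUMAN_CHALLENGES`, and the `human_solves` callback are all hypothetical, chosen only to show where the handoff sits in the flow.

```python
from typing import Callable

# Challenge labels this sketch treats as needing a person (assumed names).
HUMAN_CHALLENGES = {"captcha", "mfa"}


def sign_in(challenges: list[str], human_solves: Callable[[str], bool]) -> list[str]:
    """Walk the sign-in workflow and return a trace of who did what."""
    trace = ["agent: open login page and attempt sign-in"]
    blocking = [c for c in challenges if c in HUMAN_CHALLENGES]
    if not blocking:
        trace.append("agent: signed in without help")
        return trace
    trace.append("session: paused")
    for challenge in blocking:
        if not human_solves(challenge):
            # The session stays paused; there is nothing to resume into yet.
            trace.append(f"human: could not complete {challenge}")
            return trace
        trace.append(f"human: completed {challenge}")
    trace.append("session: resumed")
    trace.append("agent: continue with authenticated session")
    return trace


assisted = sign_in(["captcha", "mfa"], lambda challenge: True)
unassisted = sign_in([], lambda challenge: True)
```

The agent's steps bracket the human's: it starts the sign-in, and it picks the task back up only once every blocking challenge has been cleared.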

Related: Browser Sessions Overview | Connecting Accounts