Agentic Testing: Where Agents Fit in the E2E Testing Stack
Sergii Gorbachov, Staff Software Engineer

Agentic vs. traditional testing paths
Abstract
Agent-driven end-to-end (E2E) tests add a new exploratory layer to testing, but should they replace traditional deterministic tests? To find out how agentic testing could fit into our testing stack and yours, we ran more than 200 agentic E2E workflows using the Playwright MCP, the Playwright CLI, and agent-generated Playwright tests, all in test workspaces with non-production data.
1. From Journeys to Goals
Traditional end-to-end tests validate a specific journey through the UI.
click → click → type → assert
Agent-driven tests instead validate whether a goal can be achieved, often expressed as an instruction (e.g. “send a thread message”):
goal → agent adapts → verify result
This difference can be summarized simply:
Tests enforce journeys. Agents verify goals.
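To make the distinction concrete, here is a minimal sketch in TypeScript. It deliberately avoids any real Playwright or MCP calls: the `AppState` model, the action names, and the helpers `run`, `followsJourney`, and `goalAchieved` are all hypothetical stand-ins invented for illustration. A journey check compares the exact sequence of steps, while a goal check only inspects the resulting state.

```typescript
// Hypothetical in-memory model of a messaging UI (an assumption for
// illustration, not a real Playwright or application API).
type Action =
  | "clickThreadButton"
  | "useThreadShortcut"
  | "typeMessage"
  | "clickSend"
  | "pressEnter";

interface AppState {
  threadOpen: boolean;
  draft: string;
  threadMessages: string[];
}

const MESSAGE = "hello from the thread";

// Apply one UI action to the state; actions that are not valid yet are no-ops.
function step(state: AppState, action: Action): AppState {
  if (action === "clickThreadButton" || action === "useThreadShortcut") {
    // Two different routes that both open the thread pane.
    return { ...state, threadOpen: true };
  }
  if (action === "typeMessage") {
    return state.threadOpen ? { ...state, draft: MESSAGE } : state;
  }
  // Remaining actions ("clickSend", "pressEnter") both submit the draft.
  if (state.threadOpen && state.draft !== "") {
    return {
      ...state,
      draft: "",
      threadMessages: [...state.threadMessages, state.draft],
    };
  }
  return state;
}

function run(actions: Action[]): AppState {
  const initial: AppState = { threadOpen: false, draft: "", threadMessages: [] };
  return actions.reduce((state, action) => step(state, action), initial);
}

// Journey check: the exact scripted sequence must be followed.
const scriptedJourney: Action[] = ["clickThreadButton", "typeMessage", "clickSend"];

function followsJourney(actions: Action[]): boolean {
  return (
    actions.length === scriptedJourney.length &&
    actions.every((action, i) => action === scriptedJourney[i])
  );
}

// Goal check: only the outcome matters ("send a thread message").
function goalAchieved(state: AppState): boolean {
  return state.threadMessages.includes(MESSAGE);
}

// An agent might reach the same outcome by a different route.
const agentPath: Action[] = ["useThreadShortcut", "typeMessage", "pressEnter"];

console.log(followsJourney(agentPath));    // false: the path differs from the script
console.log(goalAchieved(run(agentPath))); // true: the message was sent
```

In this toy model the scripted journey passes both checks, whereas the agent's alternative path fails the journey check and still satisfies the goal check.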
Across our agentic test runs, the overall workflow remained consistent (e.g. login → search → result → clear), but the exact sequence of actions varied. In practice, agents took different paths to reach the same outcome: