Introducing Micro Agent: An (Actually Reliable) AI Coding Agent

I don't know about you, but I always love finding new ways to make my job easier. AI-assisted coding tools like GitHub Copilot and ChatGPT have shown a lot of promise in generating code from natural language descriptions.

However, if you've used these AI code generation tools, you've probably encountered a persistent issue: the code they produce often doesn't work correctly right out of the box. It may look plausible at first glance, but when you run it in VS Code or your preferred IDE, you find bugs, edge cases, or even references to non-existent APIs.

This can lead to a frustrating loop of trying the generated code, finding issues, going back to the AI for fixes, and repeating. The time spent debugging can negate the time saved by using AI tools in the first place.

That's the problem we built Micro Agent, a new open-source tool, to solve. Micro Agent aims to deliver the benefits of AI-assisted coding while mitigating the problems of unreliable code generation.

Diagram showing that when typical AI tools fail, it is the user's problem. With Micro Agent, failures are caught by a unit test, and iteration leads to eventual success in completing a task.

The key idea behind Micro Agent is to constrain the generative AI to a specific task and provide it with clear, deterministic feedback. Instead of generating code in an open-ended way, Micro Agent uses unit tests as guardrails.

Here's the typical Micro Agent workflow:

Describe your function: You provide a natural language description of the function you want to create.
AI generates tests: Based on your prompt, Micro Agent generates unit tests that specify the expected behavior of the function, including several input and output examples.
AI writes code: Micro Agent then attempts to write code in JavaScript, TypeScript, Python, or other languages that makes the tests pass, leveraging the power of large language models (LLMs).
Automatic iteration: If the tests fail, Micro Agent keeps iterating, editing the source code and re-running the tests until they all pass. This approach ensures the generated code meets the specified requirements.
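
The steps above can be sketched as a small feedback loop. This is a minimal illustration of the idea, not Micro Agent's actual implementation: the names iterateUntilGreen, writeCode, and runTests are hypothetical, the LLM call and test runner are replaced by stubs, the calls are kept synchronous for brevity, and the maxIterations cap is an assumption added so the loop always terminates.

```typescript
// Result of one test run: pass/fail plus the runner output fed back to the model.
interface TestResult {
  passed: boolean;
  output: string;
}

// Hypothetical hooks: writeCode stands in for an LLM call, runTests for a test runner.
interface AgentHooks {
  writeCode: (prompt: string, previousCode: string | null, feedback: string | null) => string;
  runTests: (code: string) => TestResult;
}

interface AgentOutcome {
  code: string;
  passed: boolean;
  iterations: number;
}

// Generate code, run the tests, and feed failures back until the tests pass
// or the (assumed) iteration cap is reached.
function iterateUntilGreen(prompt: string, hooks: AgentHooks, maxIterations = 10): AgentOutcome {
  let code: string | null = null;
  let feedback: string | null = null;
  for (let i = 1; i <= maxIterations; i++) {
    code = hooks.writeCode(prompt, code, feedback);
    const result = hooks.runTests(code);
    if (result.passed) {
      return { code, passed: true, iterations: i };
    }
    feedback = result.output; // deterministic feedback for the next attempt
  }
  return { code: code ?? "", passed: false, iterations: maxIterations };
}

// Usage with stubs: the first attempt is wrong, the second one is right.
const attempts = ["return a - b;", "return a + b;"];
let attemptIndex = 0;
const outcome = iterateUntilGreen("add two numbers", {
  writeCode: () => attempts[Math.min(attemptIndex++, attempts.length - 1)],
  runTests: (code) =>
    code === "return a + b;"
      ? { passed: true, output: "all tests passed" }
      : { passed: false, output: "expected 3, received -1" },
});
// With these stubs the loop stops on the second attempt.
console.log(outcome.passed, outcome.iterations);
```

The point of the design is that the stopping condition is decided by the test runner, not by the model's own judgment of whether its code looks right.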

The result is a function that is far more likely to work as intended than the output of typical AI coding tools, because it is backed by deterministic tests. By automating the iteration process, this AI coding assistant streamlines your development process and helps you create higher quality code with confidence.

Here’s a 30-second demo of Micro Agent generating tests and code for a TypeScript function that groups anagrams together from an array of strings:
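
As a rough illustration of what that task involves, here is a hand-written sketch of an anagram-grouping function together with a few input/output cases of the kind a generated test file might encode. It is not Micro Agent's actual output; the name groupAnagrams, the choice to keep groups in first-seen order, and the specific cases are assumptions made for this example.

```typescript
// Group words that are anagrams of each other, keeping first-seen order.
function groupAnagrams(words: string[]): string[][] {
  const groups = new Map<string, string[]>();
  for (const word of words) {
    // Anagrams share the same letters, so their sorted letters form the same key.
    const key = word.split("").sort().join("");
    const group = groups.get(key);
    if (group) {
      group.push(word);
    } else {
      groups.set(key, [word]);
    }
  }
  return Array.from(groups.values());
}

// Example input/output pairs acting as the specification.
const cases: Array<{ input: string[]; expected: string[][] }> = [
  {
    input: ["eat", "tea", "tan", "ate", "nat", "bat"],
    expected: [["eat", "tea", "ate"], ["tan", "nat"], ["bat"]],
  },
  { input: [], expected: [] },
  { input: ["a"], expected: [["a"]] },
];

// A failing case throws, which is the signal an agent would iterate on.
for (const { input, expected } of cases) {
  const actual = groupAnagrams(input);
  if (JSON.stringify(actual) !== JSON.stringify(expected)) {
    throw new Error(`groupAnagrams(${JSON.stringify(input)}) returned ${JSON.stringify(actual)}`);
  }
}
```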

While the concept of versatile AI coding agents that can automate any programming task is exciting, the reality often falls short. Tools like Auto-GPT and other general-purpose coding agents tend to go off the rails, compounding errors and leading to unexpected results.

Imagine your Roomba vacuum cleaner getting stuck under a table, spinning its wheels endlessly without making progress. Now magnify that a thousandfold, and you have an idea of what can go wrong with unconstrained AI coding agents.

Diagram showing how typical AI agents derail, while Micro Agent uses unit tests to stay on track

Micro Agent takes a different approach. By using unit tests as a guidance mechanism, it provides the AI with a clear definition of success. The agent iterates on the code until all test cases pass, ensuring that the generated code meets the specified requirements.

In our experience, LLMs are much more reliable at generating accurate tests in one shot than they are at generating implementation code, especially for non-trivial tasks. That observation became the big unlock for Micro Agent.

At Builder.io, we've been using Micro Agent extensively to generate complex code without having to spend time figuring it all out and iterating ourselves. In fact, we even use Micro Agent to build Micro Agent.

Here are a few examples of how we've used Micro Agent in building Micro Agent:

Generating an ASCII file tree: We used Micro Agent to create a function that generates an ASCII representation of a file tree. Check out the source code and test to see how Micro Agent helped us tackle this task.
Parsing code blocks from Markdown: Extracting code blocks from Markdown files is another task we've delegated to Micro Agent. Take a look at the source code and test to see how it works. As a side note, Micro Agent is particularly great at generating and fixing regular expressions.
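
To give a feel for the second task, here is a minimal sketch of regex-based extraction of fenced code blocks from Markdown. It is an independent illustration, not the code in the Micro Agent repository: the names extractCodeBlocks and CodeBlock are hypothetical, and it assumes blocks are delimited by three backticks at the start of a line with no nesting or indentation.

```typescript
interface CodeBlock {
  language: string; // text after the opening fence, empty if none was given
  code: string;
}

// Build the three-backtick fence programmatically so it does not clash
// with the fence that wraps this example.
const FENCE = "`".repeat(3);

function extractCodeBlocks(markdown: string): CodeBlock[] {
  // Opening fence plus optional language, a lazily matched body,
  // then a closing fence at the start of a line.
  const pattern = new RegExp(
    "^" + FENCE + "([^\\n]*)\\n([\\s\\S]*?)^" + FENCE + "[ \\t]*$",
    "gm"
  );
  const blocks: CodeBlock[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    blocks.push({
      language: match[1].trim(),
      code: match[2].replace(/\n$/, ""), // drop the newline before the closing fence
    });
  }
  return blocks;
}

// Usage: a document with one TypeScript block and one block without a language.
const sampleMarkdown = [
  "Intro text",
  FENCE + "ts",
  "const x = 1;",
  FENCE,
  "More text",
  FENCE,
  "plain",
  FENCE,
].join("\n");

console.log(extractCodeBlocks(sampleMarkdown));
```

Regular expressions like this are easy to get subtly wrong, which is part of why a test-driven loop suits them: each edge case becomes one more test the pattern has to pass.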

Here's also an example of getting Micro Agent to generate a simple HTML-to-AST parser (it succeeded in two iterations):

While Micro Agent is effective at generating precise code logic in JavaScript, TypeScript, Python, and other languages, it's not as well-suited for creating pixel-perfect user interfaces with HTML, CSS, React, Tailwind, etc. That's where Visual Copilot comes in.

Visual Copilot can take any design from Figma and convert it into production-ready code with AI-powered code generation that reuses your existing components, CSS variables, etc.

We're actively building an integration between Micro Agent and Visual Copilot so you can get the best of both worlds - auto-generated, test-driven code logic combined with pixel-perfect code straight from your designs. Some of what we've done can be used today, with a lot more coming soon.

Soon, Visual Copilot + Micro Agent will enable you to bring whole ideas to life as working applications, end to end, with confidence and precision - such as making Figma prototypes real with a click, turning prompts into fully designed and coded applications, and more - all within your existing codebase using your existing design systems and components.

To get started, install Micro Agent globally with:

npm install -g @builder.io/micro-agent

Next, set your OpenAI API key when prompted or manually with:

micro-agent config set OPENAI_KEY=<your token>

To start a new coding task, just run:

Micro Agent will prompt you to describe the function you want, generate tests, and start writing code in your preferred language to make the tests pass. Once all the tests are green, you'll have a fully functional, test-backed function ready to use.

To understand where Micro Agent fits within the broader ecosystem of AI coding tools, let's take a quick look at each type of tool, its strengths, and its specific use cases.


Inline code completion (GitHub Copilot)

Tools like GitHub Copilot provide inline code completion suggestions as you type. This can be very handy for quickly writing boilerplate code or filling in common patterns. However, these suggestions are localized and do not guarantee overall code correctness.

Conversational coding assistants (ChatGPT, GitHub Copilot Chat)

Conversational AI assistants like ChatGPT allow you to describe what you want to code in natural language and then generate code snippets. This is great for exploring ideas or getting unstuck, but the generated code often needs manual tweaking to work correctly in your specific context.

Design to code (Visual Copilot)

When you need pixel-perfect code generation from Figma designs that reuse your existing components, design systems, and style variables, Visual Copilot is a good go-to. But at times what the AI produces doesn’t perfectly match your needs (for instance, it may miss specific linting requirements or trigger TSC warnings).

Micro Agent can be plugged in here to nail that last mile of code, automatically and iteratively ensuring that all important checks pass.

Micro Agent doesn't aim to replace inline completion, conversational assistants, or design to code tools. Instead, it focuses on the specific use case of generating complete functions with high confidence by using tests as a guidance mechanism.

When you need a non-trivial piece of logic and want to be sure it works correctly without manual back-and-forth, that's where Micro Agent shines. It complements other AI coding tools by providing a more reliable solution for well-defined tasks.

Looking ahead, we believe AI agents will play an increasingly important role in software development. However, the general-purpose agents being built today are just too unreliable to be useful in day-to-day workflows.

The Micro Agent approach points to a different future—one where focused, specialized agents work in concert with developers to tackle specific programming tasks with high reliability. In the future, we think agents should be deeply integrated into our workflows, for instance right within our IDEs.

Imagine a future world where you could simply describe what you want to build at a high level, and multiple micro agents would spring into action—some writing tests, others implementing logic, and still others focusing on UI—all working together to bring your vision to life. Integrations like Visual Copilot would ensure pixel-perfect UIs true to the source material, while micro agents worked behind the scenes to make everything function seamlessly.

In this future, developers would spend less time on repetitive low-level tasks and more time on high-level problem solving and creativity. We would operate more like directors, constantly evaluating and guiding, rather than having to write every line of code ourselves (but still coding, too - AI can only do so much, now and in the future).

Of course, there’s still a lot of work to be done to make this vision a reality. But we think we already have an exciting first step in that direction. By focusing AI on narrowly defined tasks and providing tight feedback loops, we can harness its power in a more reliable and predictable way, today.

To us, Micro Agent is an exciting development in making AI-assisted coding more reliable and efficient. While we've had great success using it internally at Builder.io, it's still a young project and may not solve every coding challenge for everyone.

If you're interested in exploring Micro Agent or providing feedback, check out the GitHub repository. Feel free to tweet at me @Steve8708 or open an issue on the repository with your thoughts or any problems you encounter.

Can't wait to see what you build with Micro Agent!