Tactical Coding Assistants
Over the past 15 months I have been actively working with LLMs in my job as a software engineer. I have had many moments where I thought the singularity had arrived, only to watch the agent fall flat on its face moments later. I have become interested in figuring out when and why these new tools fail, in the hope that this helps me use them to their full potential while keeping control over the quality of what I commit. In this short article, I want to expand on a recent observation.
Generated using Gemini Imagen 4
In the book ‘A Philosophy of Software Design’ by John K. Ousterhout, a great analogy is made between two different modes of development: tactical and strategic programming.
Tactical programming: this is a shortsighted approach focused on completing the immediate task as quickly as possible. Here, the primary goal is to make a feature or bug fix work in the most direct way. You can imagine what will happen if you have a team of engineers that only works like this: technical debt will pile up fast.
Generated using Gemini Imagen 4
On the other side there is strategic programming, which involves a higher level of thinking. A strategic programmer is not finished when the feature is implemented. They take a step back and review the architecture to see whether any abstractions can be made. Does the current approach overcomplicate things? Can we refactor? The focus here is on design, and the engineer's main goal is to keep complexity at bay.
These two modes map neatly onto how I think about the current state of coding assistants. In my experience, some of the current (mid-2025) state-of-the-art coding assistants (Gemini 2.5 Pro, Claude 3.7) are extremely adept tactical programmers. Given a clear description of what you need implemented, they will go ahead and do it for you, with quite a high success rate in my experience. We humans are limited by many things, such as the speed at which we can physically press keys. For our tactical coding assistants, however, lines of code are no longer a limiting factor: generating 1000 lines is nearly as fast as generating a single one. This becomes problematic, because in my own experience ‘laziness’ is a big driver behind writing as few lines as possible and keeping designs DRY and KISS. Our AI friends are designed to predict the next token, and when let loose, chances are high that complexity sky-rockets, behavior gets duplicated everywhere, and the code is destined to become unreadable.
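The kind of duplication I mean is easy to picture. Here is a contrived sketch of my own (the function names are hypothetical, not real assistant output): an eager generator happily repeats near-identical blocks instead of extracting the shared logic.

```python
# Duplicated: each branch re-implements the same formatting logic.
def format_sale(product, amount):
    return f"[SALE] {product}: ${amount:.2f}"

def format_expense(category, amount):
    return f"[EXPENSE] {category}: ${amount:.2f}"

# DRY: one parameterised helper covers both (and any future label).
def format_line(label, name, amount):
    return f"[{label}] {name}: ${amount:.2f}"

print(format_line("SALE", "Laptop", 1200.0))  # [SALE] Laptop: $1200.00
```

Two functions are harmless; twenty of them, sprinkled across a codebase, are not.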
So, how do we manage an eager, tactical assistant by our side?
As Andrej Karpathy suggested in his recent talk, we must keep the AI on a short leash. We should not, under any circumstances, let it run wild. The human developer must remain the project’s strategist. You decide the goals, you manage the complexity, and you navigate the uncertainty that comes with any evolving software project. If the leash is too long, the AI will lead you down a dark path, creating a codebase that is increasingly difficult to change and impossible to understand.
Of course, there are moments when you can loosen the leash. Think of it as finding a nice, open field where your AI dog can run a bit more freely. These are the boilerplate tasks: scaffolding a dozen REST endpoints from a clear API contract, writing repetitive test suites, or migrating data formats. Even then, it is critical that you don’t blindly accept what’s generated. You must be the ultimate gatekeeper, reviewing every line before it enters your codebase.
This post does not aim to be a prompting guide (there are plenty of those around) nor do I claim to be an expert at good software design. Still, I want to strengthen my case here by providing you with an example.
For this example, Gemini 2.5 Pro is used.
Starting point: a simple report generator
In this example, we have a simple function that takes a list of record dictionaries and builds a string report. Our goal is to extend this code to support an extra record type.
def generate_report(data):
    report_lines = ["Sales and Expense Report", "========================"]
    total_sales = 0
    total_expenses = 0
    for record in data:
        record_type = record.get("type")
        if record_type == "sale":
            product = record.get("product", "N/A")
            amount = record.get("amount", 0)
            report_lines.append(f"[SALE] Product: {product}, Amount: ${amount:.2f}")
            total_sales += amount
        elif record_type == "expense":
            category = record.get("category", "Uncategorized")
            amount = record.get("amount", 0)
            report_lines.append(f"[EXPENSE] Category: {category}, Cost: ${amount:.2f}")
            total_expenses += amount
        else:
            report_lines.append("[UNKNOWN] Corrupted or unknown record type found.")
    report_lines.append("========================")
    report_lines.append(f"Total Sales: ${total_sales:.2f}")
    report_lines.append(f"Total Expenses: ${total_expenses:.2f}")
    net_income = total_sales - total_expenses
    report_lines.append(f"Net Income: ${net_income:.2f}")
    return "\n".join(report_lines)
Approach 1: The “No leash” prompt
Let’s start with a simple, direct request that focuses on the immediate requirement.
I need you to add a new record type for ‘refunds’. A record will look like {’type’: ‘refund’, ‘original_sale_id’ : ‘S12345’, ‘amount’: 99.99}
Resulting code:
def generate_report(data):
    # ... (existing setup, plus a new total_refunds = 0) ...
        elif record_type == "refund":
            original_sale_id = record.get("original_sale_id", "N/A")
            amount = record.get("amount", 0)
            report_lines.append(f"[REFUND] Original Sale ID: {original_sale_id}, Amount: ${amount:.2f}")
            total_refunds += amount
    # ... (existing sale/expense branches unchanged) ...
    report_lines.append("========================")
    report_lines.append(f"Total Sales: ${total_sales:.2f}")
    report_lines.append(f"Total Expenses: ${total_expenses:.2f}")
    report_lines.append(f"Total Refunds: ${total_refunds:.2f}")
    net_income = total_sales - total_expenses - total_refunds
    report_lines.append(f"Net Income: ${net_income:.2f}")
    return "\n".join(report_lines)
On the surface, this seems like a perfectly viable solution. But is it really maintainable? What if we get 10 more requests like this? This is a textbook example of a tactical implementation of our request.
Approach 2: The strategic (leash) prompt.
Let us try again. Apart from listing our initial requirement, we now provide some constraints on how the AI should implement the feature. We propose an abstraction, with an eye towards future maintainability.
Analyze @tactical_example.py I want to refactor this using the Strategy or Command pattern to make it extensible. Define a Record interface with a method process(report_context). The report_context will be a simple class that holds the running totals (sales, expenses, refunds, etc.). Implement SaleRecord, ExpenseRecord, and a new RefundRecord that conform to this interface. The main generate_report function should then simply iterate over the records and call the process method on each, without needing to know the specific type.
Resulting code:
from abc import ABC, abstractmethod

class ReportContext:
    def __init__(self):
        self.total_sales = 0
        self.total_expenses = 0
        self.total_refunds = 0
        self.report_lines = []

class Record(ABC):
    @abstractmethod
    def process(self, context: ReportContext):
        """Process the record and update the report context."""

# ... SaleRecord and ExpenseRecord implementations omitted for brevity ...

class RefundRecord(Record):
    """Represents a refund."""
    def __init__(self, reason: str, amount: float):
        self.reason = reason
        self.amount = amount

    def process(self, context: ReportContext):
        context.total_refunds += self.amount
        context.report_lines.append(f"[REFUND] Reason: {self.reason}, Amount: ${self.amount:.2f}")

def generate_report(records: list[Record]):
    report_title = "Sales and Expense Report"
    context = ReportContext()
    for record in records:
        record.process(context)
    header = [report_title, "=" * len(report_title)]
    footer = [
        "=" * len(report_title),
        f"Total Sales: ${context.total_sales:.2f}",
        f"Total Expenses: ${context.total_expenses:.2f}",
        f"Total Refunds: ${context.total_refunds:.2f}",
        f"Net Income: ${context.total_sales - context.total_expenses - context.total_refunds:.2f}",
    ]
    return "\n".join(header + context.report_lines + footer)

if __name__ == "__main__":
    data = [
        {"type": "sale", "product": "Laptop", "amount": 1200.00},
        {"type": "expense", "category": "Office Supplies", "amount": 75.50},
        {"type": "sale", "product": "Mouse", "amount": 25.00},
        {"type": "refund", "reason": "Defective Item", "amount": 1200.00},
        {"type": "expense", "category": "Software", "amount": 250.00},
    ]
    record_mapping = {
        "sale": SaleRecord,
        "expense": ExpenseRecord,
        "refund": RefundRecord,
    }
    financial_records = []
    for item in data:
        record_type = item.pop("type")
        if record_class := record_mapping.get(record_type):
            financial_records.append(record_class(**item))
    report = generate_report(financial_records)
    print(report)
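To see the payoff of the strategic version, here is a minimal sketch of what a future change looks like. I re-declare the context and interface so the snippet stands alone, and `DiscountRecord` is my own hypothetical addition, not assistant output: supporting a new record type is now a purely additive change, with no edits to `generate_report` at all.

```python
from abc import ABC, abstractmethod

class ReportContext:
    def __init__(self):
        self.total_sales = 0
        self.total_expenses = 0
        self.total_refunds = 0
        self.total_discounts = 0  # new running total for the new type
        self.report_lines = []

class Record(ABC):
    @abstractmethod
    def process(self, context: ReportContext):
        """Process the record and update the report context."""

class DiscountRecord(Record):
    """Hypothetical new record type: a discount applied to an order."""
    def __init__(self, code: str, amount: float):
        self.code = code
        self.amount = amount

    def process(self, context: ReportContext):
        context.total_discounts += self.amount
        context.report_lines.append(
            f"[DISCOUNT] Code: {self.code}, Amount: ${self.amount:.2f}"
        )

context = ReportContext()
DiscountRecord("SUMMER10", 10.0).process(context)
print(context.report_lines[0])  # [DISCOUNT] Code: SUMMER10, Amount: $10.00
```

The new class plugs into the existing report loop untouched; only the mapping that builds records needs one extra entry.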
Key takeaway
Whether you agree that this is a better implementation is not the point. The point I am trying to make is that the assistant, without a leash, simply appended to the existing structure without taking a step back to think about complexity. This path of least resistance is not always the right path, which is why we should be wary and give tightly constrained instructions.
Perhaps through some clever prompt engineering we could have landed at a cleaner solution, but in my experience there is no ideal prompt that will consistently yield strategic results.
Instead of a vague, high-level command like “build the user profile page,” you shorten the leash with a precise, tactical prompt:
- “Scaffold the component structure for a new endpoint that follows the contract defined in api-types.ts, but leave the core business logic implementation empty.”
- “Refactor this function to be pure. It should take its dependencies as arguments instead of accessing them from the outer scope.”
- “Generate three edge-case unit tests for the calculateDiscount function based on these business rules.”
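To make the second of those prompts concrete, here is a small before/after sketch of my own (the `tax_rate` setting and function names are hypothetical) showing what “take dependencies as arguments” buys you:

```python
# Before: the function reaches into outer scope for its dependency.
tax_rate = 0.21  # hypothetical module-level setting

def price_with_tax_impure(price):
    return price * (1 + tax_rate)  # hidden dependency on a global

# After: the dependency is an explicit argument, so the function is pure
# and trivial to test in isolation.
def price_with_tax(price, tax_rate):
    return price * (1 + tax_rate)

print(price_with_tax(100.0, 0.21))
```

A prompt at this level of precision leaves the assistant very little room to wander.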
Generated using Gemini Imagen 4
By using AI as a focused, tactical tool, we can harness its power without giving up our most important role: that of the strategic programmer who ensures the codebase remains clean, coherent, and maintainable for the long haul. The assistants can lay the bricks, but we must remain the architect.