Using ASTs to reliably merge LLM-generated code into existing codebases
1. A new approach: using ASTs
to reliably merge LLM-generated
code into existing codebases.
Line diffing approaches are sooo 2023
2. A little about me
I was really tired of manually copying
and pasting snippets from ChatGPT
or other tools into my code editor.
I tested other tools and they seemed to
be a little too eager to delete my code
or mangle my source files.
(mmiscool) on GitHub.
I make seagulls a bit cross.
3. The project
https://github.com/mmiscool/aiCoder
A tool for making changes to existing code files
reliably using LLMs.
4. The normal workflow today
You know you do it.
● Copy and paste the existing code file into a
ChatGPT conversation.
● Write instructions about the desired changes or
new code to be generated and submit them along
with the existing code.
● Manually copy code snippets one at a time from
the LLM response and paste them into the correct
location in the source code file.
5. Ideal workflow.
● Select target file.
● Describe desired modifications or new
functionality to be added.
● Extract snippets for merging from the LLM
response.
● AST merge tool:
  ● Test that the snippet is syntactically correct.
  ● Surgically replace duplicate nodes in the AST.
  ● Discard stubs if the existing function or
  method has code.
  ● Regenerate the code file with the
  modifications made.
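The snippet extraction and syntax check steps can be sketched roughly as follows. This is a minimal illustration in Python using the standard `ast` module, not the aiCoder implementation (which targets JavaScript). The function names `extract_snippets`, `is_valid_syntax`, and `mergeable_snippets` are hypothetical, and the sketch assumes the LLM wraps its code in triple-backtick fences.

```python
import ast
import re

# Build the fence marker programmatically; the language tag after it is optional.
FENCE = "`" * 3
FENCE_RE = re.compile(FENCE + r"[\w+-]*\n(.*?)" + FENCE, re.DOTALL)


def extract_snippets(response: str) -> list[str]:
    """Return the body of every fenced code block in an LLM response."""
    return [match.group(1) for match in FENCE_RE.finditer(response)]


def is_valid_syntax(snippet: str) -> bool:
    """True if the snippet parses; a snippet that fails is never merged."""
    try:
        ast.parse(snippet)
        return True
    except SyntaxError:
        return False


def mergeable_snippets(response: str) -> list[str]:
    """Keep only the snippets that are safe to hand to the merge step."""
    return [s for s in extract_snippets(response) if is_valid_syntax(s)]
```

Rejecting unparseable snippets before the merge means a malformed response can never reach the target file.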
6. File being edited.
Code snippet produced by LLM
7. Merging the code together using an AST
File being edited.
Code snippet produced by LLM
Classes and methods that are
duplicated need to be merged.
The existing method is replaced with
the new method in the snippet.
8. Results.
The existing file is modified to replace
only the method that changed.
9. Rules we are using when merging LLM snippets into existing files.
● Empty stub methods or functions in snippets only get added
to existing classes if they do not currently exist in the target.
● If the target contains a non-stub function or method, it is not
replaced with an empty one.
● The existing location of a class or function in the file is not disturbed.
Example implementation for JavaScript files:
src/intelligentMerge.js
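The stub rules above can be expressed as a small decision function. The sketch below is in Python with the standard `ast` module and is not taken from src/intelligentMerge.js. The names `is_stub`, `decide`, and `parse_func` are hypothetical, and the definition of a stub (a body containing only a docstring, `pass`, or `...`) is an assumption standing in for an empty JavaScript function body.

```python
import ast


def is_stub(func) -> bool:
    """Assumed definition: the body holds only a docstring, `pass`, or `...`."""
    for stmt in func.body:
        if isinstance(stmt, ast.Pass):
            continue
        if (isinstance(stmt, ast.Expr)
                and isinstance(stmt.value, ast.Constant)
                and (stmt.value.value is Ellipsis
                     or isinstance(stmt.value.value, str))):
            continue
        return False
    return True


def decide(existing, incoming) -> str:
    """Return 'add', 'keep', or 'replace' for one incoming function."""
    if existing is None:
        return "add"      # nothing with this name yet, so even a stub is added
    if is_stub(incoming):
        return "keep"     # an empty stub never overwrites what is already there
    return "replace"      # swapped in place, so its position does not move


def parse_func(src: str):
    """Helper for the examples: parse a single function definition."""
    return ast.parse(src).body[0]
```

For example, `decide(parse_func("def f():\n    return 1\n"), parse_func("def f():\n    pass\n"))` returns `"keep"`, which is the case that protects existing code from being emptied out.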
10. LLM instructions that coax out snippets that are easy to merge.
● Only provide the classes and methods being modified.
● Exclude existing code.
● Provide examples to the LLM that show the format of the
desired snippets.
● Instruct the LLM to avoid creating global variables and test or
example code.
Full markdown file with LLM instructions for snippet production:
src/prompts/snippet-production-prompt.md
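To give a feel for how such instructions could be assembled into a request, here is a small sketch in Python. The wording of `SNIPPET_RULES` and `EXAMPLE` is illustrative only and is not the content of the project's prompt file; `build_prompt` is a hypothetical helper.

```python
# Illustrative wording only; not the text of the project's prompt file.
SNIPPET_RULES = """\
You are editing an existing source file.
- Return only the classes and methods you are adding or changing.
- Do not repeat existing code that stays the same.
- Do not create global variables.
- Do not include test code or usage examples.
"""

# A format example shown to the model so the snippet shape is unambiguous.
EXAMPLE = """\
Example of the expected snippet format:

class Greeter:
    def hello(self):
        return "hello, world"
"""


def build_prompt(file_source: str, request: str) -> str:
    """Combine the rules, a format example, the current file, and the request."""
    return "\n".join([
        SNIPPET_RULES,
        EXAMPLE,
        "Current file:",
        file_source,
        "Requested change:",
        request,
    ])
```

Placing the rules and the format example ahead of the file contents is a design choice in this sketch, not something the slides prescribe.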
11. aiCoder Chat interface
● Chat interface to edit a selected file
● Auto apply mode: pops up a timeout to review
changes and reject them; otherwise it
automatically applies snippets from the
conversation.
● Buttons on each snippet to manually apply
code from that snippet to the code file.
12. aiCoder Tools
● List methods in current target file.
● Mark stubs as red.
● Clicking on green functions or methods
generates a new conversation about that
specific bit of code.
● Clicking a red method launches a chat to
automatically implement the method.
● Merge and format. You might have some
code that follows the rules for automatic
merging produced by an outside LLM.
Just paste this code at the bottom of the
code file and hit this button to make the
magic happen.
13. aiCoder Project Settings
● Change premade prompt templates for
the specific project you are working on
● This can tweak the style of JavaScript
produced.
● Provide refinement about how you want
snippets formatted or any other tweaks
needed to enhance the LLM's output
using prompt engineering.
14. aiCoder LLM Settings
● Select the LLM provider and specific
model you want to use.
● OpenAI and Anthropic both provide
the best-quality responses.
● Ollama provides local LLM support.
Best results for JavaScript and snippet
production compliance come from IBM
Granite 3.1-based models.
● Groq has a very small context window
and code generation quality is poor
with the older models they have
available.