Writing a DSL in Lua
DSLs, or domain specific languages, are programming languages that are designed to implement a set of features specific to a particular problem or field. An example could be Make, the build tool, which is a specially designed language for combining commands and files while managing dependencies.
A lot of modern programming languages have so much flexibility in their syntax that it’s possible to build libraries that expose their own mini-languages within the host language. The definition of DSL has broadened to include these kinds of libraries.
In this guide we'll build a DSL for generating HTML. It looks like this:
html {
  body {
    h1 "Welcome to my Lua site",
    a {
      href = "http://leafo.net",
      "Go home"
    }
  }
}
Before jumping in, here are some DSL building techniques:
Dropping the parentheses
One of the use cases for Lua, as described in its initial public release (1996), is that it makes a good configuration language. That’s still true today, and Lua lends itself well to building DSLs.
A unique part of Lua’s syntax is that parentheses are optional in some scenarios when calling functions. Terseness is important when building a DSL, and removing superfluous characters is a good way to achieve it.
When calling a function with a single argument that is either a table literal or a string literal, the parentheses are optional.
print "hello"
my_function { 1,2,3 }

print"hello"
my_function{ 1,2,3 }
This syntax has very high precedence, the same as if you had written the parentheses:
tonumber "1234" + 5
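To make the precedence concrete, here is a small sketch. The count function is a hypothetical helper that exists only for this illustration:

```lua
-- The call binds tighter than the addition, so this parses as
-- tonumber("1234") + 5 rather than tonumber("1234" + 5).
local result = tonumber "1234" + 5
print(result) --> 1239

-- The same rule applies to table-literal arguments.
-- count is a hypothetical helper, used only for this example.
local function count(t) return #t end
local total = count { 1, 2, 3 } * 2
print(total) --> 6
```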
Chaining
Parenthesis-less invocation can be chained as long as each expression from the left evaluates to a function (or a callable table). Here’s some example syntax for a hypothetical web routing framework:
match "/post-comment" {
  GET = function () end,
  POST = function () end
}
If it’s not immediately obvious what’s going on, writing the parentheses in will clear things up. Parenthesis-less calls are evaluated from left to right, so the above is equivalent to:
match("/post-comment")({ ... })
The pattern we would use to implement this syntax would look something like this:
local function match(path)
  print("match:", path)

  return function(params)
    print("params:", params)
  end
end
Using a recursive function constructor, it’s possible to make chaining work for chains of any length.
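Here is one way such a recursive constructor might look. This is only a sketch: collect and step are hypothetical names, and ending the chain with an empty call is a convention I picked for the example, not something Lua requires.

```lua
-- Hypothetical helper: every call in the chain stores its argument and
-- returns the same function again, so the chain can keep going.
-- Calling it with no argument ends the chain and returns the values.
local function collect(...)
  local items = {}

  local function step(value)
    if value == nil then
      return items
    end
    items[#items + 1] = value
    return step
  end

  return step(...)
end

local parts = collect "a" "b" { 1, 2 } ()
print(#parts) --> 3
```

Because step returns itself, the chain can be as long as you like. The trade-off is that something has to signal the end of the chain, which is the job of the trailing () here.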
Using function environments
When interacting with a Lua module you regularly have to bring any functions or values into scope using require. When working with a DSL, it’s nice to have all the functionality available without having to manually load anything.
One option would be to make all the functions and values global variables, but it’s not recommended as it might interfere with other libraries.
A function environment can be used to change how a function resolves global variable references within its scope. This can be used to automatically expose a DSL’s functionality without polluting the regular global scope.
For the sake of this guide I'll assume that setfenv exists in the version of Lua we're using. If you're using 5.2 or above you'll need to provide your own implementation: Implementing setfenv in Lua 5.2, 5.3, and above
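For reference, a minimal sketch of such a fallback is shown below. It is one common approach built on the debug library, not necessarily the one the linked guide uses, and it assumes the function actually references at least one global (otherwise there is no _ENV upvalue to rebind). The greet function is a hypothetical example.

```lua
-- Use the built-in setfenv when it exists (Lua 5.1, LuaJIT); otherwise
-- emulate it by rebinding the function's _ENV upvalue (Lua 5.2+).
local setfenv = setfenv or function(fn, env)
  local i = 1
  while true do
    local name = debug.getupvalue(fn, i)
    if name == "_ENV" then
      -- Point the _ENV upvalue at a fresh upvalue that holds env
      debug.upvaluejoin(fn, i, function() return env end, 1)
      break
    elseif not name then
      break -- fn references no globals, so there is nothing to rebind
    end
    i = i + 1
  end
  return fn
end

-- Hypothetical function that reads a global
local function greet()
  return greeting
end

setfenv(greet, { greeting = "hello from the env" })
print(greet()) --> hello from the env
```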
Here’s a function run_with_env that runs another function with a particular environment.
local function run_with_env(env, fn, ...)
  setfenv(fn, env)
  fn(...)
end
The environment passed will represent the DSL:
local dsl_env = {
  move = function(x,y)
    print("I moved to", x, y)
  end,

  speak = function(message)
    print("I said", message)
  end
}

run_with_env(dsl_env, function()
  move(10, 10)
  speak("I am hungry!")
end)
In this trivial example the benefits might not be obvious, but typically your DSL would be implemented in another module. Wherever you invoke it, you don't need to bring each function into scope manually; instead you activate the whole scope with run_with_env.
Function environments also let you dynamically generate methods on the fly. Using the __index metamethod implemented as a function, any value can be programmatically created. This is how the HTML builder DSL will be created.
Implementing the HTML builder
Our goal is to make the following syntax work:
html {
  body {
    h1 "Welcome to my Lua site",
    a {
      href = "http://leafo.net",
      "Go home"
    }
  }
}
Each HTML tag is represented by a Lua function that will return the HTML string representing that tag with the correct attribute and content if necessary.
Although it would be possible to write code to generate all the HTML tag builder functions ahead of time, a function __index metamethod will be used to generate them on the fly.
In order to run code in the context of our DSL, it must be packaged into a function. The render_html function will take that function and convert it to an HTML string:
render_html(function()
  return div {
    img { src = "http://leafo.net/hi" }
  }
end)
The img tag is self-closing: it has no separate close tag. HTML calls these “void elements”. These will be treated differently in the implementation.
render_html might be implemented like this:
local function render_html(fn)
  setfenv(fn, setmetatable({}, {
    __index = function(self, tag_name)
      return function(opts)
        return build_tag(tag_name, opts)
      end
    end
  }))

  return fn()
end
The build_tag function is where all the actual work is done. It takes the name of the tag, and the attributes and content as a single table.
The __index function in render_html could be optimized by caching the generated functions in the environment table.
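A minimal sketch of that caching idea follows. The build_tag here is a simplified stand-in so the example runs on its own, and the env and tag_fn names are my own. The important part is the rawset call.

```lua
-- Simplified stand-in for build_tag, only so this sketch is runnable
local function build_tag(tag_name, opts)
  return "<" .. tag_name .. ">" .. tostring(opts) .. "</" .. tag_name .. ">"
end

local env = setmetatable({}, {
  __index = function(self, tag_name)
    local tag_fn = function(opts)
      return build_tag(tag_name, opts)
    end
    -- Store the function in the table itself, so __index is not
    -- triggered again the next time this tag is looked up
    rawset(self, tag_name, tag_fn)
    return tag_fn
  end
})

print(env.div == env.div) --> true
print(env.p "hi") --> <p>hi</p>
```

Without the rawset, each lookup of env.div would build a brand new closure. With it, only the first lookup of a given tag pays that cost.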
The void elements, as mentioned above, are defined as a simple set:
local void_tags = {
  img = true,
}
The most efficient way to concatenate strings in regular Lua is to accumulate them into a table then call table.concat. Many calls to table.insert could be used to append to this buffer table, but I prefer the following function to allow multiple values to be appended at once:
local function append_all(buffer, ...)
  for i=1,select("#", ...) do
    table.insert(buffer, (select(i, ...)))
  end
end
append_all uses Lua’s built-in function select to avoid any extra allocations by querying the varargs object instead of creating a new table.
Now the implementation of build_tag:
local function build_tag(tag_name, opts)
  local buffer = {"<", tag_name}
  if type(opts) == "table" then
    for k,v in pairs(opts) do
      if type(k) ~= "number" then
        append_all(buffer, " ", k, '="', v, '"')
      end
    end
  end

  if void_tags[tag_name] then
    append_all(buffer, " />")
  else
    append_all(buffer, ">")
    if type(opts) == "table" then
      append_all(buffer, unpack(opts))
    else
      append_all(buffer, opts)
    end
    append_all(buffer, "</", tag_name, ">")
  end

  return table.concat(buffer)
end
There are a couple of interesting things here:
The opts argument can either be a string literal or a table. When it’s a table it takes advantage of the fact that Lua tables are both hash tables and arrays at the same time. The hash table portion holds the attributes of the HTML element, and the array portion holds the contents of the element.
Checking whether a key in a pairs iteration is numeric is a quick way to approximately separate the array-like elements from the attributes. It’s not perfect, but will work for this case.
for k,v in pairs(opts) do
  if type(k) ~= "number" then
    -- k is an attribute name and v is its value
  end
end
When the content of the tag is inserted into the buffer for the table-based opts, the following line is used:
append_all(buffer, unpack(opts))
Lua’s built-in function unpack converts the array values in a table to varargs. This fits perfectly into the append_all function defined above.
unpack is table.unpack in Lua 5.2 and above.
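If you want the same code to run on both old and new versions, a small compatibility line is one option. The sketch below repeats append_all from above so it can run on its own. The buffer contents are made up for the example.

```lua
-- Pick whichever name the running Lua version provides
local unpack = table.unpack or unpack

local function append_all(buffer, ...)
  for i = 1, select("#", ...) do
    table.insert(buffer, (select(i, ...)))
  end
end

local buffer = {}
append_all(buffer, unpack({ "a", "b", "c" }))
print(table.concat(buffer)) --> abc
```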
Closing
This simple implementation of an HTML builder should give you a good introduction to building your own DSLs in Lua.
The HTML builder provided performs no HTML escaping, so it’s not suitable for rendering untrusted input. If you're looking for a way to enhance the builder, try adding HTML escaping. For example:
local unsafe_text = [[<script type="text/javascript">alert('hacked!')</script>]]

render_html(function()
  return div(unsafe_text)
end)
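As a possible starting point, here is a sketch of an escaping helper. The escape_html function and the html_escapes table are hypothetical names that are not part of the builder above, and the set of escaped characters is my own assumption about what a basic version should cover.

```lua
-- Hypothetical helper, not part of the builder above
local html_escapes = {
  ["&"] = "&amp;",
  ["<"] = "&lt;",
  [">"] = "&gt;",
  ['"'] = "&quot;",
  ["'"] = "&#039;",
}

local function escape_html(text)
  -- The outer parentheses drop gsub's second return value (the match count)
  return (tostring(text):gsub("[&<>\"']", html_escapes))
end

print(escape_html("<b>Tom & Jerry</b>")) --> &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
```

Wiring this into build_tag takes some care. Attribute values can be escaped directly, but the content of a tag may already be HTML produced by a nested tag function, so the builder would need some way to tell trusted markup apart from plain text before escaping it.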