Syntax highlighting on the web

2022-05-31

How does syntax highlighting work?

In IDEs, syntax highlighting has traditionally been implemented with mode-based pattern matching. Each language "grammar" defines a set of scopes, regular expressions that match different kinds of tokens within each scope, and rules for including scopes inside other scopes. The capturing groups in those regular expressions are then associated with names from some taxonomy that themes interface with.
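
To make the scope idea concrete, here is a minimal sketch of a mode-based tokenizer in Python. The grammar, scope names, and token names are all made up for illustration and do not follow any real editor's format; it also names whole matches rather than individual capturing groups, which is a simplification of what real grammars do.

```python
import re

# Hypothetical toy grammar: each scope lists (regex, token name, action) rules.
# An action is None (stay in this scope), "push:<scope>", or "pop".
GRAMMAR = {
    "root": [
        (re.compile(r"\b(?:let|if|else|return)\b"), "keyword", None),
        (re.compile(r"\d+"), "constant.numeric", None),
        (re.compile(r'"'), "string.quote", "push:string"),
        (re.compile(r"[A-Za-z_]\w*"), "variable", None),
        (re.compile(r"[=+\-*/;]"), "operator", None),
    ],
    "string": [
        (re.compile(r"\\."), "constant.escape", None),
        (re.compile(r'"'), "string.quote", "pop"),
        (re.compile(r'[^"\\]+'), "string", None),
    ],
}


def tokenize(source):
    """Return (token name, text) pairs using a stack of active scopes."""
    stack = ["root"]
    pos = 0
    tokens = []
    while pos < len(source):
        # Try the rules of the innermost scope, in order, at the current position.
        for pattern, name, action in GRAMMAR[stack[-1]]:
            match = pattern.match(source, pos)
            if match:
                tokens.append((name, match.group()))
                pos = match.end()
                if action == "pop":
                    stack.pop()
                elif action:
                    stack.append(action.split(":", 1)[1])
                break
        else:
            # No rule matched: skip one character and carry on.
            pos += 1
    return tokens


for name, text in tokenize('let x = "a\\n";'):
    print(f"{name:18} {text}")
```

A theme would then be little more than a mapping from names such as `keyword` or `constant.escape` to colors.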

I put "grammar" in quotes because these grammars are very different from actual formal grammars (ABNF etc.). Code editing as we know it is really a stack of several mostly-independent features, each of which has different priorities and ends up involving a different version of "parsing". IDEs mostly want syntax highlighting to be fast and forgiving. We expect our tokens to be colored "correctly" even in invalid/intermediate states, and we expect highlighting to happen basically instantly. This means that lots of systems converged on loose regex-based approaches that can identify keywords, operators, and atoms without needing to parse the source into an actual AST.
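
The "forgiving" part is easy to see with a small experiment. The sketch below, using a made-up token regex rather than any real highlighter's rules, runs a loose regex tokenizer over a half-typed line of Python and compares that with asking the standard library's `ast.parse` to build a real syntax tree from the same text.

```python
import ast
import re

# Hypothetical, deliberately loose token patterns; not taken from a real grammar.
TOKEN_RE = re.compile(
    r"(?P<keyword>\b(?:def|return|if|else)\b)"
    r"|(?P<number>\d+)"
    r"|(?P<name>[A-Za-z_]\w*)"
    r"|(?P<operator>[-+*/=():])"
)


def loose_tokens(source):
    """Classify whatever the regex can find; anything unmatched is skipped."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(source)]


def parses(source):
    """Report whether the source is valid enough to build a real AST."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


half_typed = "def area(w, h: return w *"

print(parses(half_typed))        # the real parser rejects the incomplete line
print(loose_tokens(half_typed))  # the regex pass still labels every token it sees
```

The regex pass has no idea the line is broken, which is exactly why it can keep coloring `def` and `return` while the user is still typing.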

One downside to mode-based pattern matching is that it is fundamentally a very coarse version of parsing, and getting grammars to distinguish between things like function calls and variable names can be prohibitively complicated. Writing and maintaining these grammars is also a massive pain, because it requires thinking about the syntax of the language in unintuitive ways. This blog post is a good summary of just how much cognitive overhead there is in learning to write TextMate grammars, which are what VS Code uses. It doesn't help that the TextMate grammar fo...
