TL; DR: We’re announcing a new open source type checker for Hack, called Hakana.


Slack launched in 2014, built with a lot of love and also a lot of PHP code.

We started migrating to a different language called Hack in 2016. Hack was created by Facebook after they had struggled to scale their operations with PHP. It offered more type-safety than PHP, and it came with an interpreter (called HHVM) that could run PHP code faster than PHP’s own interpreter.

Much has changed in PHP-land since we switched. PHP is faster than it used to be, and it has borrowed a number of Hack features (such as constructor property promotion). A great deal of the PHP community has also embraced type checking — there are now some great third-party type checkers to choose from.

Sticking with Hack has given us access to additional runtime speed boosts, performance-enhancing language constructs like `async`, and a typechecker that’s more strict by default than PHP typecheckers. But we’ve missed out on features provided by PHP typecheckers, including the ability to customize type inference rules to find issues specific to our codebase and automated security vulnerability detection.

Slack has hundreds of developers writing Hack. We want to give them the best possible experience, so last year we started building a type checker that could fill those gaps.

We’ve dubbed that static analysis tool Hakana, and it’s now available on GitHub!

Hakana is based on Psalm, an open-source PHP static analysis tool I created, and it’s written in Rust. Hakana re-uses a Hack parser that’s bundled with the Hack interpreter.

A bonus of writing it in Rust: with a bit of prodding, Hakana can run just about anywhere. For example, it runs in your web browser via WASM.

How we use Hakana

At Slack we run Hakana in CI to enforce good code behavior in a range of areas. Here’s an incomplete list:

  • It prevents unused functions and unused private methods.
  • It prevents unused assignments inside closures.
  • It detects both impossible and redundant type-checks.
  • It warns us about potential SQL-injection attacks and cross-site scripting vulnerabilities (more on this below).
  • It prevents misuse of internal Slack APIs (via plugin hooks).

We also use Hakana to automate type-aware API migrations (again via plugin hooks) and to delete unused functions in bulk. Thanks to Rust, those whole-codebase migrations are relatively quick.

Security

PHP makes it really easy to make a dynamically-rendered website. PHP also makes it really easy to create an utterly insecure dynamically-rendered website.

Hack improves on this slightly, by supporting a system for generating HTML output called XHP. XHP is secure-by-default against cross-site scripting attacks, but it doesn’t stop you from leaking customer data, and Hack doesn’t prevent you from shooting yourself in the foot with a wide range of other security vulnerabilities.

For a host of reasons (including compliance obligations) Slack needed a tool that could discover those vulnerabilities. Psalm, the type checker that Hakana is based on, already does security analysis, so it was relatively simple to add security analysis to Hakana as well.

Hakana isn’t the first security analysis tool for Hack — for years, Facebook has been using an internal, closed-source tool called Zoncolan — but Hakana is the first that everyone can use.

Hakana works in much the same way as Zoncolan. It examines how data can flow between different functions in a codebase, and checks if attacker-controlled data can show up in places it shouldn’t.

To date, Hakana has found a number of exploitable vulnerabilities in production code at Slack (that were immediately fixed, and we checked our logs to ensure that the vulnerabilities had not actually ever been exploited).

Security in the type system

Hakana’s security analysis mode is a form of interprocedural analysis — it looks at the way data flows between functions. Hakana also supports detecting one type of vulnerability (SQL injection) via intraprocedural analysis, just by examining types at function boundaries.

To do this, Hakana borrows the concept of literal string types from Psalm. In Hack code we can define a type alias:

<<Hakana\SpecialTypes\LiteralString()>>
type db_query_string = string;

Though the official Hack typechecker just treats this as a string, Hakana treats it as a special type `literal-string`, a subtype of string that can only be concatenated or interpolated with other literal-strings. Passing a string into a function that expects a literal-string causes Hakana to emit an error:

function get_id_query(string $id): db_query_string {
    return "select * from users where id = '$id'";
    // Error: The type `string` is more general
    // than the declared return type `literal-string`
}

Extending Hakana

PHP static analysis tools are generally highly-customisable. Customisable static analysis is really useful for PHP, because its interpreter allows lots of tricks (e.g. magic methods) that can confound one-size-fits-all static analysis tools.

Most of those tricks don’t work in Hack code, so there’s far less of a need for customisable static analysis. Even so, we’ve found a lot of value in extending Hakana with plugins.

For example, we use a custom plugin to tell Hakana that a method call on our internal Result object, $some_result->is_ok(), is equivalent to the more verbose $some_result is ResultSuccess<_> typecheck.

We also use custom plugins to perform type-aware migrations at speed across the entire codebase.

Building a plugin system for Rust is not simple — our custom version of Hakana wraps the open-source core as a library, using its plugin hooks where necessary — but it’s an important feature for us.

Speed

Hakana was adapted from Psalm, a type checker written in PHP. While PHP is fast for an interpreted language, it can’t compete with a compiled tool. To analyze a codebase of Slack’s size — many millions of lines of code — we needed a tool that’s as fast as possible.

Hakana, written in Rust, runs about 5x faster than the PHP-based tool on which it’s modeled.

We haven’t spent much time tuning performance, but when analyzing the full Slack codebase (about 5 million lines of code) performance is on par with the official Hack typechecker. That’s good enough for us, for now.

Why open-source Hakana?

Hack sort of looks like PHP but with more types. In that respect it’s been compared to TypeScript. But unlike TypeScript, which compiles down to JavaScript, Hack requires its users to change a lot of their server infrastructure too.

Due to that high switching cost, today Hack is only used at a few companies. It’s possible that nobody else aside from Slack will have a reason to use Hakana.

Even so, there are a few reasons why we think it’s worth open-sourcing:

  • 🔍 The broader programming language community may have useful input — especially when it comes to security analysis.
  • 🤝 Psalm, another open-source tool, was the basis for Hakana. By open-sourcing our own tool we’re returning the favor.
  • 🏢 Though it wouldn’t be easy, companies with extremely large PHP codebases might consider forking Hakana and altering it to analyze PHP code.

Conclusion

Slack is an indispensable tool for millions around the world. Hack is now an integral part of Slack, and so we’ve created a tool that we hope will become indispensable to our Hack developers. We’re open-sourcing it today with the hope that others will find it useful and interesting.