firefox parser/html/java/README.txt (via) TIL (or TIR - Today I was Reminded) that the HTML5 Parser used by Firefox is maintained as Java code (commit history here) and converted to C++ using a custom translation script.
You can see that in action by checking out the ~8GB Firefox repository and running:
cd parser/html/java
make sync
make translate
Here's a terminal session where I did that, including the output of git diff showing the updated C++ files.
I did some digging and found that the code that does the translation work lives, weirdly, in the Nu Html Checker repository on GitHub, which powers the W3C's validator.w3.org/nu/ validation service!
Here's a snippet from htmlparser/cpptranslate/CppVisitor.java showing how a class declaration is converted into C++:
protected void startClassDeclaration() {
    printer.print("#define ");
    printer.print(className);
    printer.printLn("_cpp__");
    printer.printLn();
    for (int i = 0; i < Main.H_LIST.length; i++) {
        String klazz = Main.H_LIST[i];
        if (!klazz.equals(javaClassName)) {
            printer.print("#include \"");
            printer.print(cppTypes.classPrefix());
            printer.print(klazz);
            printer.printLn(".h\"");
        }
    }
    printer.printLn();
    printer.print("#include \"");
    printer.print(className);
    printer.printLn(".h\"");
    printer.printLn();
}
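To get a feel for what that method emits, here's a minimal, self-contained sketch that replays the same sequence of print calls into a StringBuilder. Everything named here is my own stand-in rather than the translator's real API: the PreambleSketch class, the preamble method, the "Demo" prefix and the two-entry header list are all made up for illustration, and I'm assuming className is simply the class prefix followed by the Java class name.

```java
import java.util.List;

public class PreambleSketch {
    // Hypothetical stand-in for startClassDeclaration(): returns the C++
    // preamble as a String instead of writing to the translator's printer.
    static String preamble(String classPrefix, String javaClassName, List<String> headerList) {
        // Assumption: the C++ class name is the prefix plus the Java class name.
        String className = classPrefix + javaClassName;
        StringBuilder out = new StringBuilder();

        // The "#define <className>_cpp__" line, followed by a blank line.
        out.append("#define ").append(className).append("_cpp__\n");
        out.append('\n');

        // One #include per known header, skipping the class's own header for now.
        for (String klazz : headerList) {
            if (!klazz.equals(javaClassName)) {
                out.append("#include \"").append(classPrefix).append(klazz).append(".h\"\n");
            }
        }

        // The class's own header comes last, surrounded by blank lines.
        out.append('\n');
        out.append("#include \"").append(className).append(".h\"\n");
        out.append('\n');
        return out.toString();
    }

    public static void main(String[] args) {
        // Illustrative inputs only; the real prefix and header list live in the translator.
        System.out.print(preamble("Demo", "Tokenizer", List.of("Tokenizer", "TreeBuilder")));
    }
}
```

With those made-up inputs the output is a #define line for DemoTokenizer_cpp__, an #include for DemoTreeBuilder.h, and finally an #include for DemoTokenizer.h, which matches the shape you'd expect at the top of a generated C++ file.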
Here's a fascinating blog post from John Resig explaining how validator author Henri Sivonen introduced the new parser into Firefox in 2009.