|
1 | 1 | # MathJax in Puppeteer |
2 | 2 |
|
3 | | -This example shows how to run MathJax within a headless Chrome instance using the [Puppeteer](https://developers.google.com/web/tools/puppeteer) library. Although MathJax provides a lightweight DOM implementation (called `LiteDOM`) for use in node applications, it is limited in its scope, and there are reasons you may want to work within an actual browser DOM. For example, when a character is used that is not in MathJax's fonts, MathJax can query the browser to attempt to determine the character's size, but this only works in an actual browser, not MathJax's `LiteDOM`. Similarly, if you have specified `mtextInheritFont: true` or have set the `mtextFont`, MathJax asks the browser to compute the size of the resulting text strings. So if you are processing expressions that contain characters not in MathJax's fonts, or are using inherited or explicit fonts for text-mode material, MathJax's `LiteDOM` will produce poorer quality results than in an actual browser. Using an actual browser DOM, made available by Puppeteer, is one solution to this problem. |
| 3 | +This example shows how to run MathJax within a headless Chrome |
| 4 | +instance using the |
| 5 | +[Puppeteer](https://developers.google.com/web/tools/puppeteer) |
| 6 | +library. Although MathJax provides a lightweight DOM implementation |
| 7 | +(called `LiteDOM`) for use in node applications, it is limited in its |
| 8 | +scope, and there are reasons you may want to work within an actual |
| 9 | +browser DOM. For example, when a character is used that is not in |
| 10 | +MathJax's fonts, MathJax can query the browser to attempt to determine |
| 11 | +the character's size, but this only works in an actual browser, not |
| 12 | +MathJax's `LiteDOM`. Similarly, if you have specified |
| 13 | +`mtextInheritFont: true` or have set the `mtextFont`, MathJax asks the |
| 14 | +browser to compute the size of the resulting text strings. So if you |
| 15 | +are processing expressions that contain characters not in MathJax's |
| 16 | +fonts, or are using inherited or explicit fonts for text-mode |
| 17 | +material, MathJax's `LiteDOM` will produce poorer quality results than |
| 18 | +in an actual browser. Using an actual browser DOM, made available by |
| 19 | +Puppeteer, is one solution to this problem. |
4 | 20 |
|
5 | 21 | ## The Example Code |
6 | 22 |
|
7 | | -There are two parts to this example, the first is a basic HTML file that contains |
8 | | - |
9 | | -``` html |
10 | | -<!DOCTYPE html> |
11 | | -<html> |
12 | | -<head> |
13 | | -<title>MathJax in Puppeteer</title> |
14 | | -</head> |
15 | | -<body> |
16 | | -</body> |
17 | | -</html> |
18 | | -``` |
19 | | - |
20 | | -that is loaded into Puppeteer via a `file:` URL. This is so that additional `file:` URLs can be used to load MathJax itself and any components that it needs to load (if a `data:` URL were used, Chrome's security model would not allow `file:` URL access, and MathJax could not be loaded). |
21 | | - |
22 | | -The main code is in the `tex2svg` file. It loads the required node packages, and processes command-line arguments (not shown here). Then the HTML file shown above is loaded: |
| 23 | +The [`typeset`](typeset) file is a general-purpose typesetting tool |
| 24 | +that can be used to typeset one or more expressions or a file using |
| 25 | +any of the three input formats that MathJax supports (TeX/LaTeX, |
| 26 | +MathML, or AsciiMath), and any of its output formats (CHTML, SVG, or |
| 27 | +MathML). It calls on the utility files in [`mjs/util`](../mjs/util) |
| 28 | +to do most of the work, just as the other examples do. There is also |
| 29 | +a [`Puppeteer.js`](Puppeteer.js) utility that implements the |
| 30 | +Puppeteer-specific code needed for the tool. There are two files that |
| 31 | +are used within the headless Chrome that is being run by the Puppeteer |
| 32 | +library: [`puppeteer.html`](puppeteer.html), which is a shell HTML |
| 33 | +file that is used to process individual expressions in the Chrome |
| 34 | +instance, and [`util.js`](util.js), which contains the portions of the |
| 35 | +utility files from `mjs/util` that are needed in Chrome. (Ideally, |
| 36 | +`typeset` would pass the needed commands from those utilities rather |
| 37 | +than duplicating them here, and that may be added in the future, but |
| 38 | +for now, this is sufficient to get the job done.) |
| 39 | + |
| 40 | +The key piece of code in `Puppeteer.js` that does the communication |
| 41 | +with the headless Chrome is the `typeset()` function: |
23 | 42 |
|
24 | | -``` javascript |
25 | | -const html = 'file://' + path.resolve(__dirname, 'puppeteer.html'); |
26 | 43 | ``` |
27 | | - |
28 | | -and the MathJax component file (`tex-svg-full`) and root directory are set up: |
29 | | - |
30 | | -``` javascript |
31 | | -const component = require.resolve('mathjax-full/es5/tex-svg-full.js'); |
32 | | -const root = path.dirname(component); |
| 44 | + async typeset(args, config, options, component, convert) { |
| 45 | + config ??= Puppeteer.configScript(args); |
| 46 | + options ??= Typeset.convertOptions(args); |
| 47 | + component ??= this.startup.pathname; |
| 48 | + convert ??= Puppeteer.convert; |
| 49 | +
|
| 50 | + const browser = await puppeteer.launch(); // launch the browser |
| 51 | + const page = await browser.newPage(); // and get a new page |
| 52 | + page.on('console', Puppeteer.report.bind(args)); // report messages from chrome |
| 53 | + await page.goto(args.file || this.html); // open the HTML page |
| 54 | + await page.addScriptTag({path: 'util.js'}); // load the util script |
| 55 | + await page.addScriptTag({content: config}); // configure MathJax |
| 56 | + await page.addScriptTag({path: component}); // load the MathJax component |
| 57 | + return page.evaluate(convert, options, args) // perform the conversion |
| 58 | + .then((output) => [output, null]) // and return its output |
| 59 | + .catch((err) => [null, err]) // passing on any errors |
| 60 | + .then(async ([result, err]) => { // error or not: |
| 61 | + const output = result; // make local copy |
| 62 | + await browser.close(); // close the browser |
| 63 | + if (err) throw err; // throw any error again |
| 64 | + return output; // return the output |
| 65 | + }); |
| 66 | + } |
33 | 67 | ``` |
34 | 68 |
|
35 | | -The user-supplied TeX expression is obtained from the command line, and whether the math is in display mode is determined |
| 69 | +This function takes one required argument and four optional ones: |
| 70 | +the list of command-line options (required), a MathJax configuration |
| 71 | +(as a string), options to pass to the conversion function, the URL for |
| 72 | +the MathJax component to load, and a conversion function to perform. |
| 73 | +The configuration script defaults to the one produced by |
| 74 | +`Puppeteer.configScript(args)`, the options default to |
| 75 | +`Typeset.convertOptions(args)`, the component defaults to MathJax's |
| 76 | +`startup` component, and the convert function to `Puppeteer.convert` |
| 77 | +(described below). |
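
The defaulting described above relies on the `??=` operator, which assigns only when the variable is `null` or `undefined`, so explicitly passed values (even falsy ones such as an empty string) are kept. A minimal sketch of that behavior follows; the `applyDefaults` helper and its default values are hypothetical placeholders, not part of this example's code:

```javascript
// Hypothetical helper illustrating the `??=` defaulting used by typeset();
// the default values here are placeholders, not the ones MathJax uses.
function applyDefaults(config, options) {
  config ??= 'MathJax = {}';     // used only when config is null or undefined
  options ??= {display: true};   // likewise for options
  return {config, options};
}

const a = applyDefaults();                      // both defaults apply
const b = applyDefaults('', {display: false});  // explicit values are kept
```

Had `||=` been used instead, the empty string passed in the second call would have been replaced by the default.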
| 78 | + |
| 79 | +The next steps launch the headless Chrome instance and set up a page |
| 80 | +within the browser. We attach an event handler to process any console |
| 81 | +messages from Chrome (e.g., error messages from MathJax). Next we |
| 82 | +open either the file specified by the `--file` command-line option, or |
| 83 | +the default `puppeteer.html` file in this directory, and then load the |
| 84 | +[`util.js`](util.js) script into the page. After that, we process the |
| 85 | +configuration script, and load the specified MathJax component. |
| 86 | + |
| 87 | +The `page.evaluate()` command does the real work by calling the |
| 88 | +`convert()` function, passing it the options and command-line |
| 89 | +arguments, and returning the output from the convert command. The |
| 90 | +first `then()` call puts the output into an array, while the `catch()` |
| 91 | +call traps any errors, returning them in the second part of the array. |
| 92 | +The final `then()` closes the browser and throws the error again, if |
| 93 | +there is one; otherwise it returns a copy of the output. The copy is |
| 94 | +made because `result` was tied to the browser, which is now closed; |
| 95 | +without copying it first, we would produce an error when trying to |
| 96 | +return the output. |
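
The `then()`/`catch()`/`then()` chain described above follows roughly the same control flow as a `try`/`finally` block: the cleanup always runs, and any error is rethrown afterwards. The sketch below shows that pattern in isolation. It assumes a hypothetical `makeFakeBrowser()` stand-in rather than a real Puppeteer browser, and `runAndClose()` is an illustrative name, not a function in `Puppeteer.js`:

```javascript
// Hypothetical stand-in for a browser object; not part of Puppeteer.
function makeFakeBrowser() {
  return {
    closed: false,
    async close() { this.closed = true; }
  };
}

// Run `work`, always close the browser afterwards, and let any error
// propagate to the caller once the browser has been closed.
async function runAndClose(browser, work) {
  try {
    return await work(browser);   // the output, if the work succeeds
  } finally {
    await browser.close();        // runs on success and on failure
  }
}
```

A caller would use it as `await runAndClose(browser, (b) => doConversion(b))`, where `doConversion` is whatever work needs the open browser.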
| 97 | + |
| 98 | +As an `async` function, `typeset()` returns a promise that resolves |
| 99 | +when the output is produced by Chrome. The `typeset` node application |
| 100 | +calls `Puppeteer.typeset()` and waits for the promise to resolve, then |
| 101 | +prints the result, catching any errors and printing those. |
| 102 | + |
| 103 | +The conversion function that runs in the Chrome instance is |
| 104 | +`Puppeteer.convert()`. It is passed the conversion options and the |
| 105 | +command-line arguments: |
36 | 106 |
|
37 | | -``` javascript |
38 | | -const math = argv._[0] || ''; |
39 | | -const display = {display: !argv.inline}; |
40 | 107 | ``` |
41 | | - |
42 | | -The MathJax configuration is created from the user-supplied values: |
43 | | - |
44 | | -``` javascript |
45 | | -const config = 'MathJax = ' + JSON.stringify({ |
46 | | - tex: { |
47 | | - packages: argv.packages.replace('\*', PACKAGES).split(/\s*,\s*/) |
48 | | - }, |
49 | | - svg: { |
50 | | - mtextFont: argv.textfont, |
51 | | - merrorFont: argv.textfont, |
52 | | - fontCache: (argv.fontCache ? 'local' : 'none') |
53 | | - }, |
54 | | - loader: { |
55 | | - paths: { |
56 | | - mathjax: `file://${root}` |
| 108 | + async convert(options, args) { |
| 109 | + window.args = args; // Make the arguments global (needed in some ready scripts) |
| 110 | + await MathJax.startup.promise; // Wait for MathJax to set up |
| 111 | + Util.startup(args); // Run the startup scripts |
| 112 | + if (args.file) { |
| 113 | + Util.removeScripts(); // Arrange to remove any scripts MathJax added |
57 | 114 | } |
| 115 | + // |
| 116 | + // Do the actual typesetting and conversion |
| 117 | + // |
| 118 | + return Util.typeset(args, Util[args.output], MathJax.startup.document, options); |
58 | 119 | }, |
59 | | - startup: { |
60 | | - typeset: false |
61 | | - } |
62 | | -}); |
63 | 120 | ``` |
64 | 121 |
|
65 | | -Note that this is a string, as it will be sent to Puppeteer to be executed. |
66 | | - |
67 | | -Finally, the main code to do the conversion: |
68 | | - |
69 | | -``` javascript |
70 | | -(async () => { |
71 | | - const browser = await puppeteer.launch(); // launch the browser |
72 | | - const page = await browser.newPage(); // and get a new page. |
73 | | - await page.goto(html); // open the shell HTML page |
74 | | - await page.addScriptTag({content: config}); // configure MathJax |
75 | | - await page.addScriptTag({path: component}); // load the MathJax conponent |
76 | | - return page.evaluate((math, display) => { // the following is performed in the browser... |
77 | | - return MathJax.startup.promise.then(() => { // wait for MathJax to be ready |
78 | | - return MathJax.tex2svgPromise(math, display).then((m) => { // convert TeX to svg |
79 | | - return m.firstChild.outerHTML.replace(/ /g, '\&#A0;') // then change to &#A0; |
80 | | - }); |
81 | | - }); |
82 | | - }, math, display).then((svg) => { // if successful: |
83 | | - console.log(svg); // output the resulting svg |
84 | | - return browser.close(); // close the browser |
85 | | - }).catch((e) => { // if there is an error: |
86 | | - browser.close(); // close the browser |
87 | | - throw e; // throw the error again (handled below) |
88 | | - }); |
89 | | -})().catch((e) => { // If the process produces an error |
90 | | - console.error(e.message); // reoport the error |
91 | | -}); |
92 | | -``` |
93 | | - |
94 | | -This first launches the browser and creates a page within it, then navigates that page to the HTML file using the `file:` URL set up above. Then we run the configuration script in the page in order to set up the `MathJax` variable, after which we load the MathJax component into the browser. |
95 | | - |
96 | | -The `page.evaluate()` command does the real work. It asks the browser to wait for the `MathJax.startup.promise` to be fulfilled (i.e., MathJax has loaded all its needed parts), and then uses `MathJax.tex2svgPromise()` to convert the TeX expression (passed to it) using the proper display mode (also passed to it) into an SVG DOM tree. That is then serialized (via `outerHTML`) and returned to the node program, where it is printed, and the browser is closed. If any error occurred during the process, the browser is closed, and the error message is printed. |
97 | | - |
98 | | -Note that if you have many expressions to process, you could leave the browser running and perform multiple calls to `Mathjax.tex2svgPromise()` to convert the expressions. That would avoid launching a separate Chrome instance for each expression, which would be rather inefficient. |
99 | | - |
| 122 | +It makes the `args` object a global (since some ready functions need |
| 123 | +that), and then waits for MathJax to start up. Then it calls the |
| 124 | +`startup()` function from the [`util.js`](util.js) file, which runs |
| 125 | +any function that would normally have been added to the |
| 126 | +`MathJax.startup.ready()` function. Then, if we are processing a |
| 127 | +file, we arrange for the MathJax scripts to be removed after the page |
| 128 | +is typeset (the `removeScripts()` function patches the MathDocument's |
| 129 | +`renderPromise()` method to record the scripts that were in place |
| 130 | +originally, then do the usual `renderPromise()`, and then remove any |
| 131 | +scripts that were not there originally). Finally, it calls the |
| 132 | +`Util.typeset()` function to do the actual processing of the |
| 133 | +expressions or the page. |
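
The patching idea behind `removeScripts()` can be sketched as wrapping an async method so that it records state beforehand and cleans up afterwards. This is not the code from `util.js`: the `patchRender()` name, the `doc` object, and the plain `scripts` array are hypothetical stand-ins for the MathDocument and the page's script elements, used so the sketch runs without a browser:

```javascript
// Hypothetical stand-ins: `doc` is any object with an async renderPromise(),
// and `scripts` is an array standing in for the page's script elements.
function patchRender(doc, scripts) {
  const original = doc.renderPromise.bind(doc);
  doc.renderPromise = async function (...args) {
    const before = new Set(scripts);          // scripts present before rendering
    const result = await original(...args);   // do the usual rendering
    for (let i = scripts.length - 1; i >= 0; i--) {
      if (!before.has(scripts[i])) {
        scripts.splice(i, 1);                 // drop scripts added during rendering
      }
    }
    return result;
  };
}
```

Iterating from the end of the array keeps the indices valid while entries are being removed.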
100 | 134 |
|
101 | 135 | ## Installation |
102 | 136 |
|
103 | | -In order to try out this example you must install its dependencies. Since the code relies on Puppeteer, that needs to be installed, so this directory contains a separate `package.json` file, and you should do the following: |
| 137 | +In order to try out this example you must install its dependencies. |
| 138 | +Since the code relies on Puppeteer, that needs to be installed, so |
| 139 | +this directory contains a separate `package.json` file, and you should |
| 140 | +do the following: |
104 | 141 |
|
105 | 142 | ``` bash |
106 | 143 | cd MathJax-demos-node/puppeteer |
107 | 144 | npm install |
108 | 145 | ``` |
109 | 146 |
|
110 | | -The `tex2svg` file should be an executable that you can run. On non-unix systems, you may need to call |
| 147 | +To run the example, use |
111 | 148 |
|
112 | | - node tex2svg 'tex-code' > file.svg |
| 149 | +``` |
| 150 | +node typeset -i <format> -o <format> [options] [expressions...] |
| 151 | +``` |
| 152 | + |
| 153 | +where `<format>` is one of the input or output formats, and |
| 154 | +`expressions` are zero or more expressions. If no expressions are |
| 155 | +given, then they are taken from standard input. Use |
| 156 | + |
| 157 | +``` |
| 158 | +node typeset --help |
| 159 | +``` |
113 | 160 |
|
114 | | -where `tex-code` is the TeX expression to typeset, and `file.svg` is the name of the file where you will store the SVG output. |
| 161 | +for details about other options. |