Commit 0c47b21

Update puppeteer README, and adjust jsdom and linkedom READMEs
1 parent 5f83042 commit 0c47b21

3 files changed

+139
-90
lines changed

jsdom/README.md

Lines changed: 2 additions & 1 deletion
@@ -26,7 +26,8 @@ they load the `adaptors/jsdom` adaptor rather than the
2626
`adaptors/liteDOM` adaptor. The jsdom adaptor requires that you load
2727
the JSDOM node module and pass that to the adaptor when it is created.
2828
These details are encapsulated in the [`Jsdom.js`](Jsdom.js) utility
29-
file.
29+
file. The rest of the work is done by the utility files in
30+
[`mjs/util`](../mjs/util).
3031

3132
## Installation
3233

linkedom/README.md

Lines changed: 2 additions & 1 deletion
@@ -26,7 +26,8 @@ they load the `adaptors/linkedom` adaptor rather than the
2626
`adaptors/liteDOM` adaptor. The linkedom adaptor requires that you load
2727
the LINKEDOM node module and pass that to the adaptor when it is created.
2828
These details are encapsulated in the [`Linkedom.js`](Linkedom.js) utility
29-
file.
29+
file. The rest of the work is done by the utility files in
30+
[`mjs/util`](../mjs/util).
3031

3132
## Installation
3233

puppeteer/README.md

Lines changed: 135 additions & 88 deletions
@@ -1,114 +1,161 @@
11
# MathJax in Puppeteer
22

3-
This example shows how to run MathJax within a headless Chrome instance using the [Puppeteer](https://developers.google.com/web/tools/puppeteer) library. Although MathJax provides a lightweight DOM implementation (called `LiteDOM`) for use in node applications, it is limited in its scope, and there are reasons you may want to work within an actual browser DOM. For example, when a character is used that is not in MathJax's fonts, MathJax can query the browser to attempt to determine the character's size, but this only works in an actual browser, not MathJax's `LiteDOM`. Similarly, if you have specified `mtextInheritFont: true` or have set the `mtextFont`, MathJax asks the browser to compute the size of the resulting text strings. So if you are processing expressions that contain characters not in MathJax's fonts, or are using inherited or explicit fonts for text-mode material, MathJax's `LiteDOM` will produce poorer quality results than in an actual browser. Using an actual browser DOM, made available by Puppeteer, is one solution to this problem.
3+
This example shows how to run MathJax within a headless Chrome
4+
instance using the
5+
[Puppeteer](https://developers.google.com/web/tools/puppeteer)
6+
library. Although MathJax provides a lightweight DOM implementation
7+
(called `LiteDOM`) for use in node applications, it is limited in its
8+
scope, and there are reasons you may want to work within an actual
9+
browser DOM. For example, when a character is used that is not in
10+
MathJax's fonts, MathJax can query the browser to attempt to determine
11+
the character's size, but this only works in an actual browser, not
12+
MathJax's `LiteDOM`. Similarly, if you have specified
13+
`mtextInheritFont: true` or have set the `mtextFont`, MathJax asks the
14+
browser to compute the size of the resulting text strings. So if you
15+
are processing expressions that contain characters not in MathJax's
16+
fonts, or are using inherited or explicit fonts for text-mode
17+
material, MathJax's `LiteDOM` will produce poorer quality results than
18+
in an actual browser. Using an actual browser DOM, made available by
19+
Puppeteer, is one solution to this problem.
420

521
## The Example Code
622

7-
There are two parts to this example, the first is a basic HTML file that contains
8-
9-
``` html
10-
<!DOCTYPE html>
11-
<html>
12-
<head>
13-
<title>MathJax in Puppeteer</title>
14-
</head>
15-
<body>
16-
</body>
17-
</html>
18-
```
19-
20-
that is loaded into Puppeteer via a `file:` URL. This is so that additional `file:` URLs can be used to load MathJax itself and any components that it needs to load (if a `data:` URL were used, Chrome's security model would not allow `file:` URL access, and MathJax could not be loaded).
21-
22-
The main code is in the `tex2svg` file. It loads the required node packages, and processes command-line arguments (not shown here). Then the HTML file shown above is loaded:
23+
The [`typeset`](typeset) file is a general-purpose typesetting tool
24+
that can be used to typeset one or more expressions or a file using
25+
any of the three input formats that MathJax supports (TeX/LaTeX,
26+
MathML, or AsciiMath), and any of its output formats (CHTML, SVG, or
27+
MathML). It calls on the utility files in [`mjs/util`](../mjs/util)
28+
to do most of the work, just as the other examples do. There is also
29+
a [`Puppeteer.js`](Puppeteer.js) utility that implements the
30+
puppeteer-specific code needed for the tool. There are two files that
31+
are used within the headless Chrome that is being run by the Puppeteer
32+
library: [`puppeteer.html`](puppeteer.html), which is a shell HTML
33+
file that is used to process individual expressions in the Chrome
34+
instance, and [`util.js`](util.js), which contains the portions of the
35+
utility files from `mjs/util` that are needed in Chrome. (Ideally,
36+
`typeset` would pass the needed commands from those utilities rather
37+
than duplicating them here, and that may be added in the future, but
38+
for now, this is sufficient to get the job done.)
39+
40+
The key piece of code in `Puppeteer.js` that does the communication
41+
with the headless Chrome is the `typeset()` function:
2342

24-
``` javascript
25-
const html = 'file://' + path.resolve(__dirname, 'puppeteer.html');
2643
```
27-
28-
and the MathJax component file (`tex-svg-full`) and root directory are set up:
29-
30-
``` javascript
31-
const component = require.resolve('mathjax-full/es5/tex-svg-full.js');
32-
const root = path.dirname(component);
44+
async typeset(args, config, options, component, convert) {
45+
config ??= Puppeteer.configScript(args);
46+
options ??= Typeset.convertOptions(args);
47+
component ??= this.startup.pathname;
48+
convert ??= Puppeteer.convert;
49+
50+
const browser = await puppeteer.launch(); // launch the browser
51+
const page = await browser.newPage(); // and get a new page
52+
page.on('console', Puppeteer.report.bind(args)); // report messages from chrome
53+
await page.goto(args.file || this.html); // open the HTML page
54+
await page.addScriptTag({path: 'util.js'}); // load the util script
55+
await page.addScriptTag({content: config}); // configure MathJax
56+
await page.addScriptTag({path: component}); // load the MathJax component
57+
return page.evaluate(convert, options, args) // perform the conversion
58+
.then((output) => [output, null]) // and return its output
59+
.catch((err) => [null, err]) // passing on any errors
60+
.then(async ([result, err]) => { // error or not:
61+
const output = result; // make local copy
62+
await browser.close(); // close the browser
63+
if (err) throw err; // throw any error again
64+
return output; // return the output
65+
});
66+
}
3367
```
3468

35-
The user-supplied TeX expression is obtained from the command line, and whether the math is in display mode is determined
69+
This function takes one required argument and four optional ones:
70+
the list of command-line options (required), a MathJax configuration
71+
(as a string), options to pass to the conversion function, the URL for
72+
the MathJax component to load, and a conversion function to perform.
73+
The configuration script defaults to the one produced by
74+
`Puppeteer.configScript(args)`, the options default to
75+
`Typeset.convertOptions(args)`, the component defaults to MathJax's
76+
`startup` component, and the convert function to `Puppeteer.convert`
77+
(described below).
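
The defaulting relies on the logical nullish assignment operator (`??=`), which assigns only when the left-hand side is `null` or `undefined`. The following is a minimal sketch of that behavior; `withDefaults()` and the placeholder values are hypothetical and stand in for the real `Puppeteer.configScript()` and `Typeset.convertOptions()` calls.

```javascript
// Hypothetical stand-in for the defaulting at the top of typeset().
function withDefaults(args, config, options) {
  // Placeholder configuration string (not the one the real utility produces).
  config ??= 'MathJax = ' + JSON.stringify({startup: {typeset: false}});
  // Placeholder conversion options.
  options ??= {display: !args.inline};
  return {config, options};
}

// Nothing passed: both defaults are used.
const defaults = withDefaults({inline: true});
// Explicit values are kept as given.
const custom = withDefaults({inline: true}, 'MathJax = {}', {display: true});

console.log(defaults.options.display); // false, since args.inline is true
console.log(custom.config);            // the caller's own configuration string
```

Note that `??=` leaves an empty string in place, since only `null` and `undefined` trigger the default.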
78+
79+
The next steps launch the headless Chrome instance and set up a page
80+
within the browser. We attach an event handler to process any console
81+
messages from Chrome (e.g., error messages from MathJax). Next we
82+
open either the file specified by the `--file` command-line option, or
83+
the default `puppeteer.html` file in this directory, and then load the
84+
[`util.js`](util.js) script into the page. After that, we process the
85+
configuration script, and load the specified MathJax component.
86+
87+
The `page.evaluate()` command does the real work by calling the
88+
`convert()` function, passing it the options and command-line
89+
arguments, and returning the output from the convert command. The
90+
first `then()` call puts the output into an array, while the `catch()`
91+
call traps any errors, returning them in the second part of the array.
92+
The final `then()` closes the browser and throws the error again, if
93+
there is one; otherwise it returns a copy of the output (because the
94+
`result` was tied to the browser, which is now closed; if we didn't
95+
copy it first, we would produce an error when trying to return the
96+
output).
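
The `then()`/`catch()`/`then()` chain can be tried out without launching Chrome. This is a sketch under stated assumptions: `fakeBrowser()` and `runAndClose()` are hypothetical names, and the fake browser only records whether `close()` was called.

```javascript
// Hypothetical stand-in for a browser: it only records whether it was closed.
function fakeBrowser() {
  return {closed: false, async close() { this.closed = true; }};
}

// Run `work`, always close `browser`, then return the output or rethrow the error.
function runAndClose(browser, work) {
  return Promise.resolve()
    .then(work)                           // perform the work
    .then((output) => [output, null])     // success: [output, null]
    .catch((err) => [null, err])          // failure: [null, err]
    .then(async ([output, err]) => {      // error or not:
      await browser.close();              //   close the browser
      if (err) throw err;                 //   rethrow any error
      return output;                      //   otherwise return the output
    });
}

const browser = fakeBrowser();
runAndClose(browser, async () => '<svg></svg>')
  .then((output) => console.log(output, 'closed:', browser.closed));
```

An `async` function using `try { ... } finally { await browser.close(); }` gives the same guarantee that the browser is closed on both paths.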
97+
98+
As an `async` function, `typeset()` returns a promise that resolves
99+
when the output is produced by Chrome. The `typeset` node application
100+
calls `Puppeteer.typeset()` and waits for the promise to resolve, then
101+
prints the result, catching any errors and printing those.
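
The calling side can be sketched as follows. This is not the code from the `typeset` application: `fakeTypeset()` is a hypothetical stand-in for `Puppeteer.typeset()`, and `run()` and its injectable logging functions are assumptions made so the example runs on its own.

```javascript
// Hypothetical stand-in for Puppeteer.typeset(): resolves with output or rejects.
async function fakeTypeset(args) {
  if (args.fail) throw new Error('conversion failed');
  return `typeset: ${args.expressions.join(', ')}`;
}

// Wait for the promise, then print the result or the error message.
function run(args, log = console.log, logError = console.error) {
  return fakeTypeset(args)
    .then((output) => log(output))
    .catch((err) => logError(err.message));
}

run({expressions: ['x^2', 'y_1']}); // prints the (fake) output
run({fail: true});                  // prints the error message to stderr
```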
102+
103+
The conversion function that runs in the Chrome instance is
104+
`Puppeteer.convert()`. It is passed the conversion options and the
105+
command-line arguments:
36106

37-
``` javascript
38-
const math = argv._[0] || '';
39-
const display = {display: !argv.inline};
40107
```
41-
42-
The MathJax configuration is created from the user-supplied values:
43-
44-
``` javascript
45-
const config = 'MathJax = ' + JSON.stringify({
46-
tex: {
47-
packages: argv.packages.replace('\*', PACKAGES).split(/\s*,\s*/)
48-
},
49-
svg: {
50-
mtextFont: argv.textfont,
51-
merrorFont: argv.textfont,
52-
fontCache: (argv.fontCache ? 'local' : 'none')
53-
},
54-
loader: {
55-
paths: {
56-
mathjax: `file://${root}`
108+
async convert(options, args) {
109+
window.args = args; // Make the arguments global (needed in some ready scripts)
110+
await MathJax.startup.promise; // Wait for MathJax to set up
111+
Util.startup(args); // Run the startup scripts
112+
if (args.file) {
113+
Util.removeScripts(); // Arrange to remove any scripts MathJax added
57114
}
115+
//
116+
// Do the actual typesetting and conversion
117+
//
118+
return Util.typeset(args, Util[args.output], MathJax.startup.document, options);
58119
},
59-
startup: {
60-
typeset: false
61-
}
62-
});
63120
```
64121

65-
Note that this is a string, as it will be sent to Puppeteer to be executed.
66-
67-
Finally, the main code to do the conversion:
68-
69-
``` javascript
70-
(async () => {
71-
const browser = await puppeteer.launch(); // launch the browser
72-
const page = await browser.newPage(); // and get a new page.
73-
await page.goto(html); // open the shell HTML page
74-
await page.addScriptTag({content: config}); // configure MathJax
75-
await page.addScriptTag({path: component}); // load the MathJax conponent
76-
return page.evaluate((math, display) => { // the following is performed in the browser...
77-
return MathJax.startup.promise.then(() => { // wait for MathJax to be ready
78-
return MathJax.tex2svgPromise(math, display).then((m) => { // convert TeX to svg
79-
return m.firstChild.outerHTML.replace(/&nbsp;/g, '\&#A0;') // then change &nbsp; to &#A0;
80-
});
81-
});
82-
}, math, display).then((svg) => { // if successful:
83-
console.log(svg); // output the resulting svg
84-
return browser.close(); // close the browser
85-
}).catch((e) => { // if there is an error:
86-
browser.close(); // close the browser
87-
throw e; // throw the error again (handled below)
88-
});
89-
})().catch((e) => { // If the process produces an error
90-
console.error(e.message); // reoport the error
91-
});
92-
```
93-
94-
This first launches the browser and creates a page within it, then navigates that page to the HTML file using the `file:` URL set up above. Then we run the configuration script in the page in order to set up the `MathJax` variable, after which we load the MathJax component into the browser.
95-
96-
The `page.evaluate()` command does the real work. It asks the browser to wait for the `MathJax.startup.promise` to be fulfilled (i.e., MathJax has loaded all its needed parts), and then uses `MathJax.tex2svgPromise()` to convert the TeX expression (passed to it) using the proper display mode (also passed to it) into an SVG DOM tree. That is then serialized (via `outerHTML`) and returned to the node program, where it is printed, and the browser is closed. If any error occurred during the process, the browser is closed, and the error message is printed.
97-
98-
Note that if you have many expressions to process, you could leave the browser running and perform multiple calls to `Mathjax.tex2svgPromise()` to convert the expressions. That would avoid launching a separate Chrome instance for each expression, which would be rather inefficient.
99-
122+
It makes the `args` object a global (since some ready functions need
123+
that), and then waits for MathJax to start up. Then it calls the
124+
`startup()` function from the [`util.js`](util.js) file, which runs
125+
any function that would normally have been added to the
126+
`MathJax.startup.ready()` function. Then, if we are processing a
127+
file, we arrange for the MathJax scripts to be removed after the page
128+
is typeset (the `removeScripts()` function patches the MathDocument's
129+
`renderPromise()` method to record the scripts that were in place
130+
originally, then does the usual `renderPromise()`, and then removes any
131+
scripts that were not there originally). Finally, it calls the
132+
`Util.typeset()` function to do the actual processing of the
133+
expressions or the page.
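
The patching technique described for `removeScripts()` can be illustrated outside a browser. The sketch below is not the implementation in `util.js`: `makeDoc()` and `removeAddedScripts()` are hypothetical, and a plain array of names stands in for the page's script elements.

```javascript
// Hypothetical stand-in for a MathDocument; `scripts` plays the role of the
// page's <script> elements.
function makeDoc() {
  return {
    scripts: ['user.js'],
    async renderPromise() {
      this.scripts.push('added-by-mathjax.js'); // pretend rendering injected a script
      return this;
    }
  };
}

// Wrap renderPromise() so that scripts added during rendering are removed afterward.
function removeAddedScripts(doc) {
  const original = doc.renderPromise;
  doc.renderPromise = async function (...args) {
    const before = new Set(this.scripts);                      // record the original scripts
    const result = await original.apply(this, args);           // do the usual rendering
    this.scripts = this.scripts.filter((s) => before.has(s));  // drop anything new
    return result;
  };
}

const doc = makeDoc();
removeAddedScripts(doc);
doc.renderPromise().then(() => console.log(doc.scripts)); // only 'user.js' remains
```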
100134

101135
## Installation
102136

103-
In order to try out this example you must install its dependencies. Since the code relies on Puppeteer, that needs to be installed, so this directory contains a separate `package.json` file, and you should do the following:
137+
In order to try out this example you must install its dependencies.
138+
Since the code relies on Puppeteer, that needs to be installed, so
139+
this directory contains a separate `package.json` file, and you should
140+
do the following:
104141

105142
``` bash
106143
cd MathJax-demos-node/puppeteer
107144
npm install
108145
```
109146

110-
The `tex2svg` file should be an executable that you can run. On non-unix systems, you may need to call
147+
To run the example, use
111148

112-
node tex2svg 'tex-code' > file.svg
149+
```
150+
node typeset -i <format> -o <format> [options] [expressions...]
151+
```
152+
153+
where `<format>` is one of the input or output formats, and
154+
`expressions` are zero or more expressions. If no expressions are
155+
given, then they are taken from standard input. Use
156+
157+
```
158+
node typeset --help
159+
```
113160

114-
where `tex-code` is the TeX expression to typeset, and `file.svg` is the name of the file where you will store the SVG output.
161+
for details about other options.
