SVG Sandboxing#567
Conversation
Test Results 8 files ± 0 1 146 suites +32 12m 54s ⏱️ -43s For more details on these failures, see this check. Results for commit a069b29. ± Comparison against base commit 825af6e. This pull request removes 1 and adds 116 tests. Note that renamed tests count towards both.♻️ This comment has been updated with latest results. |
|
Please keep the cool css theming via svg 🙏 |
| updateSVGSkin (skinId, svgData, rotationCenter) { | ||
| if (this._allSkins[skinId] instanceof SVGSkin) { | ||
| this._allSkins[skinId].setSVG(svgData, rotationCenter); | ||
| return; | ||
| return this._allSkins[skinId].setSVG(svgData, rotationCenter); | ||
| } | ||
|
|
||
| const newSkin = new SVGSkin(skinId, this); | ||
| newSkin.setSVG(svgData, rotationCenter); | ||
| this._reskin(skinId, newSkin); | ||
| return newSkin.setSVG(svgData, rotationCenter); | ||
| } |
There was a problem hiding this comment.
Now that createSVGSkin and updateSvgSkin return a promise, if anyone tries to call getSkinSize without having awaited the create/update call, the dimensions would not be set yet, right? I think one such instance is the updateSvg method in virtual-machine.js. We should probably await the update there as well (and anywhere else where it might make sense). Also, looking at the implementation notes, the original plan was to pre-parse the SVG with DOMParser to extract viewBox synchronously (keeping _size available immediately for getSkinSize callers), is there a reason that changed?
| /** | ||
| * Tear down the iframe and reject any in-flight calls. | ||
| * After `destroy()`, the next `send()` lazily creates a fresh iframe. | ||
| */ | ||
| destroy () { |
There was a problem hiding this comment.
Since destroy() is currently never called and the iframe would stay alive/in memory indefinitely, I wonder if it would make sense for the sandbox to manage its own lifecycle and destroy itself once it becomes idle. For example, maybe the sandbox could start/reset an inactivity timeout whenever a message is sent, and destroy the iframe if no new messages arrive within some window (e.g. 10 seconds). That would still keep the sandbox alive during initial project load or bulk SVG imports, while avoiding a permanently alive iframe. I'm mostly thinking out loud here, but it seems like something along these lines could make sense and would avoid coupling the sandbox to the project loading lifecycle?
I'm not sure if there are any concerns performance-wise, but I think it should be relatively cheap, considering that we'd still reuse the iframe during the peak project loading?
| if (REMOVE_ELEMENTS.has(node.name)) { | ||
| parentNode.children = parentNode.children.filter( | ||
| child => child !== node | ||
| ); | ||
| return; | ||
| } |
There was a problem hiding this comment.
Would the current logic remove tags like <svg:script> or <svg:style> (or other namespaced tags), which I believe are also valid SVG elements? It seems like they would currently survive the canonicalization step, meaning their CSS rules (including external url() references) could persist.
This should be fine for now since sanitizeSvgText is still used, and as far as I can tell DOMPurify strips those tags (and leaves the content as a text node, which is still not ideal). But if we're planning to deprecate that at some point, or if we want canonicalize-svg to handle sanitization more comprehensively, it might be worth addressing here as well.
| if (val && !isInternalRef(val.replace(/\s/g, ''))) { | ||
| delete node.attributes[attr]; | ||
| } |
There was a problem hiding this comment.
One thing I noticed is that, as far as I can tell, the current canonicalization allows data: URIs on all elements, whereas the old sanitization logic only allowed them on image tags. I'm not sure whether this could actually be abused in practice, especially considering the current pipeline, but it still might be worth mentioning.
There was a problem hiding this comment.
Why did this snapshot change, are there any changes in the current sanitization logic that lead to it being changed? It doesn't seem so. Could this be a MacOS vs. Linux difference on how the snapshots are generated, and the cause of the failing tests?
There was a problem hiding this comment.
This test was failing for me even before the current changes. It could be an OS difference, now that you mentioned it - will take a look.
| // ── Element / attribute classification ───────────────────────────────────── | ||
|
|
||
| /** Elements removed entirely (children discarded). */ | ||
| const REMOVE_ELEMENTS = new Set([ |
There was a problem hiding this comment.
I approve of the work that has been done so far on the PR!
The main concern is to not be overly restrictive or not restrictive enough when it comes to removing elements and attributes here. You seem to have taken care of all cases that I could think of. The most important thing here has to be to double-check with a handful of examples right before deploying the code that we catch all of the SVGs that have been known to cause trouble so far and also not restrict the friendly ones (which, by looking at the current code, shouldn't be a thing).
Minor nitpick: consider the <discard> element, which might remove some other elements in the SVG (doesn't seem like the end of the world but might consider it while we're at it)
Resolves
https://scratchfoundation.atlassian.net/browse/UEPR-231
Proposed Changes
canonicalizeSvgTextcanonicalizeSvgText- at the point ofloadVector_Reason for Changes
Currently we attempt to sanitize SVGs, but the approach is piecemeal. The biggest security issue in the current state is that we load SVGs directly into the DOM, which is an inherently unsafe operation.
Test Coverage
Added tests for sandboxing, canonicalization and measuring SVGs in a sandboxed environment.