Transit reader produces Clojure keywords that cannot be round-tripped to EDN

Minimal reproduction:

``` clojure
(-> "[\"^ \",\"~:foo}\",\"bar\"]"
    (.getBytes "UTF-8")
    (java.io.ByteArrayInputStream.)
    (cognitect.transit/reader :json)
    cognitect.transit/read
    pr-str
    clojure.edn/read-string)
```

The above results in a `RuntimeException` from `read-string` ("Map literal must contain an even number of forms"), due to the `}` at the end of the `:foo}` in the transit string (which, to be clear, I've constructed by hand, rather than outputting from Transit).  If you remove the `read-string` and replace it with `keys first`, you can see that Transit has _not_ identified the keyword as invalid, and has produced a keyword `:foo}`.  This issue is me asking for Transit instead to report that the input was malformed or is not representable as a valid Clojure data structure.

I understand that for performance reasons Clojure doesn't validate keywords when interning, and I'm not proposing to bikeshed that decision.  However, for Transit, that approach has the unfortunate consequence that you have to re-validate data that Transit has read from the potentially-untrusted network before you can assume that its keywords are safe to use in everyday Clojure code.

Specifically, we have a Clojure web app with a ClojureScript front-end, and we use Transit to communicate between them.  Sometimes we'll need to take some of the data received from the front-end and write it out somewhere else (to be read back in future, or to communicate it to a separate process, etc).  The natural way to do that is via `pr-str` and `clojure.edn/read`.  This issue introduces a new point of failure in our application -- reading the EDN back in might fail, even though it was written out via `pr-str`, because Transit might have given us a keyword which is not printable/readable in Clojure.

For additional context, this issue was discovered by an external pen-test, attempting an XSS attack (by manually replaying a modified request), so the answer for us can't just be "well then don't have your front end send data across the wire that you're not going to be capable of reading correctly".  To give ourselves some level of confidence that our app isn't going to break when someone tries XSS, we're now having to write additional code to wrap around our calls to `transit/read` to ensure that the keywords it has read are valid keywords, etc.  That seems like it would be better handled when reading the input in the first place.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Transit reader produces Clojure keywords that cannot be round-tripped to EDN #32

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Transit reader produces Clojure keywords that cannot be round-tripped to EDN #32

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions