❌ 2 Tests Failed:
e1e4240 to 414e408
  setRows: Dispatch<SetStateAction<TableMapRow[]>>;
}

const MONGODB_DOC_COLUMN_PREFIX = 'doc.';
On the other source types, excluded columns are defined on the source side. Would it be difficult to do the same here and add doc. (or another flattened column) in Go code?
Also, it seems the ssn and similar fields would still end up in the raw table. If we expose this in e2e ClickPipes, wouldn't the expectation be that it never lands in CH in the open?
Makes sense. I was thinking about how we'd want to extend this to other DBs, in which case we'd need to pass info from the client about which column the field exclusion is for. But it feels a bit premature to worry about that given JSON is not currently supported for PG/MySQL, and even if we do support JSON for these sources, JSON field exclusion seems like a niche enough use case that we could defer it for some time.
Re: sensitive data in the raw table. The two workarounds I originally had in mind, which we could document and which don't require changes on our end, were: (1) the user explicitly adds a TTL to the raw table; (2) lock down permissions on the raw table so it's only accessible by admins. That said, I think it might not be too tricky to just do exclusion properly, and in a way that is reusable by flattened mode. Will explore this a bit more.
Mongo column exclusion seems like a mirror of the existing "excluded columns" feature, so from the product point of view, making it work just for Mongo is plenty good, as the other sources already have it.
Add ClickHouse JSON SKIP support for MongoDB field exclusion. This is useful for filtering out sensitive fields in MongoDB documents.
Reuses TableMapping.Exclude to implement support for ClickHouse JSON SKIP column hints. When users specify fields to exclude via the TableMapping.Exclude configuration, these fields are translated to SKIP clauses in the JSON column type definition, as long as the column has the <col>. prefix and is of type JSON. Applied to Mongo's doc field only for now, but it can be expanded to other JSON fields later on, such as pg/mysql JSON support.

Testing: e2e tests verify that schema and data are properly excluded; also tested locally that resync preserves column exclusion info.
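
To make the translation described above concrete, here is a minimal sketch in Go of how exclude entries could be turned into a JSON column type with SKIP clauses. This is not the PR's actual code: the function name jsonTypeWithSkips and its signature are hypothetical. It also assumes that exclude entries arrive as plain strings like doc.ssn and that field names are simple identifiers needing no quoting.

```go
package main

import (
	"sort"
	"strings"
)

// jsonTypeWithSkips is a hypothetical helper (not taken from the PR).
// Given a JSON column name and a list of exclude entries, it keeps only the
// entries carrying the "<column>." prefix, strips that prefix, and renders
// them as SKIP clauses in a ClickHouse JSON type definition.
// Assumption: paths are simple dotted identifiers that need no quoting.
func jsonTypeWithSkips(column string, excluded []string) string {
	prefix := column + "."
	var paths []string
	for _, entry := range excluded {
		if !strings.HasPrefix(entry, prefix) {
			continue // entry targets a different column (or a whole column)
		}
		if path := strings.TrimPrefix(entry, prefix); path != "" {
			paths = append(paths, path)
		}
	}
	if len(paths) == 0 {
		return "JSON" // nothing to skip, plain JSON type
	}
	sort.Strings(paths) // deterministic output regardless of input order
	clauses := make([]string, len(paths))
	for i, p := range paths {
		clauses[i] = "SKIP " + p
	}
	return "JSON(" + strings.Join(clauses, ", ") + ")"
}
```

Under these assumptions, jsonTypeWithSkips("doc", []string{"doc.ssn", "doc.address.zip", "id"}) yields JSON(SKIP address.zip, SKIP ssn), while a list with no doc.-prefixed entries yields plain JSON. Sorting the paths keeps the generated type stable, so a resync does not see a spurious schema difference.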