fix(gemini): strip data URI prefix from PDF base64 data#14134
fix(gemini): strip data URI prefix from PDF base64 data#14134404-Page-Found wants to merge 3 commits intoCherryHQ:mainfrom
Conversation
kangfenmao
left a comment
There was a problem hiding this comment.
Review
分析
这个修复的意图是好的,但从代码分析来看,这段防御性代码 永远不会被触发。
查看 FileStorage.ts:785-791:
public base64File = async (_: Electron.IpcMainInvokeEvent, id: string) => {
const filePath = path.join(this.storageDir, id)
const buffer = await fs.promises.readFile(filePath)
const base64 = buffer.toString('base64') // ← 纯 base64,没有 data URI 前缀
const mime = `application/${path.extname(filePath).slice(1)}`
return { data: base64, mime }
}buffer.toString('base64') 永远不会返回 data:...;base64, 前缀。PR 描述中也确认了这一点:
I've verified that
base64Filereturns raw base64 in FileStorage.ts
因此这段 strip 逻辑是死代码,不会解决 #14097 的根本问题。
需要补充测试
如果确认此修改确实解决了问题,请提供以下服务商的实际测试结果:
- Vertex AI — PDF 上传 + 发送消息
- Gemini(官方 API) — PDF 上传 + 发送消息
- CherryIN — PDF 上传 + 发送消息
建议
- 请进一步调查 #14097 的真正根因——如果
base64File返回的确实是纯 base64,那问题可能出在其他环节 - 如果仍要保留防御性检查,建议加一个
logger.warn记录异常情况,方便追踪问题来源
|
Summary of changes:\n\n- Added defensive stripping and a warning in main: src/main/services/FileStorage.ts\n- Added defensive warning in renderer: src/renderer/src/aiCore/prepareParams/fileProcessor.ts\n- Added unit test: src/renderer/src/aiCore/prepareParams/tests/fileProcessor.test.ts\n\nVerification done locally:\n- Ran linter & formatter (passed)\n- Ran renderer unit tests (passed)\n- Ran main unit tests: some unrelated failures in src/main/services/agents/services/cherryclaw/tests/prompt.test.ts (appears unrelated to these changes)\n\nNotes for reviewer:\n- The strip logic is now executed in the main process where base64 originates, so it will no longer be dead code. Any unexpected occurrences will be visible in main logs via the new warning message.\n- If you still observe the original failure (#14097), please capture the main process logs and share the logged warning (if present) so we can trace where the data URI was introduced.\n\nIf you want, I can also open a follow-up PR to add an integration test that covers the end-to-end flow (write a file containing a data URI and verify the IPC returns stripped base64). |
What this PR does
Before this PR:
Uploading a PDF file and sending it to Gemini models failed with a "Base64 decoding failed" error because the payload incorrectly included the data:application/pdf;base64, prefix.
After this PR:
The application correctly strips the Data URI prefix from the base64 data for PDF files, ensuring only the raw base64 string is sent to the API.
Fixes #14097
Why we need it and why it was done in this way
Certain API proxies or specific model versions are stricter about the inline_data.data format and do not handle the Data URI prefix. Stripping it ensures compatibility across different providers while remaining compliant with the expected standard of raw base64 for these fields.
The following tradeoffs were made:
The following alternatives were considered:
Breaking changes
None.
Special notes for your reviewer
I've verified that �ase64File returns raw base64 in FileStorage.ts, but added this defensive check in ileProcessor.ts to ensure that any data passed to the AI SDK is clean.
Checklist
/gh-pr-review, gh pr diff, or GitHub UI) before requesting review from othersRelease note