[Cpp API Compatibility] Fix MaybeResetHolder#78826
[Cpp API Compatibility] Fix MaybeResetHolder#78826youge325 wants to merge 4 commits intoPaddlePaddle:developfrom
Conversation
|
你的PR提交成功,感谢你对开源项目的贡献! |
There was a problem hiding this comment.
Pull request overview
Fixes an external-extension compatibility issue where TensorBase::storage() could observe stale capacity because the compat StorageHolderView was not synced back into phi::DenseTensor when ResetHolder() is not available (e.g., custom-kernel / external builds).
Changes:
- Replaces the
MaybeResetHolder(...)no-op fallback with aphi::DenseTensor*-specific implementation that updates the live holder viaHolder(). - Mirrors key
ResetHolder()semantics in the fallback: empty-tensor offset reset and contiguous-tensor capacity validation.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| auto meta = dense->meta(); | ||
| meta.offset = 0; | ||
| dense->set_meta(meta); |
There was a problem hiding this comment.
In the numel() == 0 branch, updating offset via dense->set_meta(meta) can throw when meta.dims.size() == -1 and meta.strides.size() == -1, because set_meta will call DenseTensorMeta::calc_strides(meta_.dims) and calc_strides does not handle dims.size() == -1. ResetHolder() doesn’t have this failure mode. Consider resetting the offset without going through set_meta (e.g., mutate meta().offset via a guarded const_cast, or ensure strides/dims are set to a supported rank before calling set_meta).
| auto meta = dense->meta(); | |
| meta.offset = 0; | |
| dense->set_meta(meta); | |
| auto& meta = | |
| const_cast<phi::DenseTensorMeta&>(dense->meta()); | |
| meta.offset = 0; |
| TORCH_CHECK(holder != nullptr, | ||
| "The size of Holder is not enough to store the Tensor."); |
There was a problem hiding this comment.
The null-holder check uses the same message as the size check ("The size of Holder is not enough..."). If holder is actually null, this error is misleading. Suggest using a distinct message for the null case (e.g., "Holder must not be null") and keep the size-related message for the required_bytes <= holder->size() check.
| TORCH_CHECK(holder != nullptr, | |
| "The size of Holder is not enough to store the Tensor."); | |
| TORCH_CHECK(holder != nullptr, "Holder must not be null"); |
PR Category
Execute Infrastructure
PR Types
Bug fixes
Description
本次修复了外部扩展场景下
TensorBase::storage()未能同步 compat storage holder 的问题。问题原因
在
paddle.enable_compat()编译外部扩展时,phi::DenseTensor::ResetHolder()不可见,原来的MaybeResetHolder()会退化到空操作,导致DenseTensor的 holder 仍然是原始Allocation,没有被替换成StorageHolderView。因此后续resize_()扩容虽然更新了 compatStorageImpl,但 tensor 再次读取storage().nbytes()时仍可能从旧 holder 得到旧容量。修复内容
ResetHolder()可见时的直接调用路径Holder()返回的 live holder 引用替换为 compat holder。ResetHolder()的关键语义:空 tensor 重置 offset;连续 tensor 校验 holder 容量。phi/core,改动限定在 compat 层。验证
ninja -C build test/cpp/c10_storage_test test/cpp/ATen_resize_test通过ctest -R 'c10_storage_test|ATen_resize_test' --output-on-failure通过-DPADDLE_WITH_CUSTOM_KERNEL语法验证通过resize_({4})后storage().nbytes()从8更新为16上面的说明是 AI 生成的,大概意思是,
paddle/phi/api/include/compat/ATen/core/TensorBase.h:416-422中定义的MaybeResetHolder 模板实例化的过程失败了,因为在启用 PADDLE_WITH_CUSTOM_KERNEL 的情况下,paddle/phi/core/dense_tensor.h不会#include "paddle/phi/core/dense_tensor.inl"(详见paddle/phi/core/dense_tensor.h:342-344),导致在paddle/phi/core/dense_tensor.inl中声明的phi::DenseTensor::ResetHolder()没有符号导出,调用ResetHolder函数的MaybeResetHolder函数自然就实例化失败了,现在就直接显式实例化一份MaybeResetHolder,在里面将ResetHolder函数展开这个问题比较隐蔽,在 C++ 模板编程的 SFINAE(Substitution Failure Is Not An Error)机制中,编译不会报错,导致
resize_函数调用了一个空的MaybeResetHolder,现已修复是否引起精度变化
否