[release/3.3] fix tinyformat #78843

Open
risemeup1111 wants to merge 1 commit into PaddlePaddle:release/3.3 from risemeup1111:cherry-pick/78793/release/3.3

@risemeup1111

PR Category

Execute Infrastructure

PR Types

Improvements

Description

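Remove the assert in tinyformat. When the number of % specifiers in a PADDLE_ENFORCE message does not match the number of arguments, the process crashes on the assert and never reports the more useful PADDLE_ENFORCE information.
This has happened in production. What remains unexplained is that assert should only take effect in DEBUG builds; we have not yet found why it is active in the released package.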
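As a diagnostic aid for the open question above, here is a minimal sketch that reports whether `assert` is compiled in for a given translation unit. It relies only on the standard rule that `assert` is disabled when `NDEBUG` is defined at the point `<cassert>` is included. The helper names `AssertIsActive`, `Touch`, and `AssertEvaluatedItsArgument` are hypothetical and are not part of Paddle or tinyformat; the sketch does not explain how the release package was actually built.

```cpp
#include <cassert>

// Hypothetical helper: true when NDEBUG is not defined for this
// translation unit, i.e. assert() expands to a real check.
constexpr bool AssertIsActive() {
#ifdef NDEBUG
  return false;
#else
  return true;
#endif
}

// Counter used to observe whether assert() evaluates its argument.
inline int& EvaluationCount() {
  static int count = 0;
  return count;
}

// Hypothetical probe with a visible side effect; always returns true
// so the assert below never fires.
inline bool Touch() {
  ++EvaluationCount();
  return true;
}

// Returns true when assert() evaluated its argument exactly once.
// With NDEBUG defined, assert() expands to nothing and Touch() is never called.
inline bool AssertEvaluatedItsArgument() {
  EvaluationCount() = 0;
  assert(Touch());
  return EvaluationCount() == 1;
}
```

Compiling a probe like this with the same flags as the suspect translation unit would show whether `NDEBUG` reaches it. A header that does `#undef NDEBUG` before including `<cassert>` is one possible cause to look for, though that is only a guess here.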

The problem can be reproduced as follows:

#define TINYFORMAT_ERROR(reason) do { std::cerr << reason ; abort(); } while(0)

With that reproduction in place, this PR was verified to effectively disable the assert crash.
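The override pattern used for the reproduction can be sketched in isolation. The example below is not tinyformat and not the change in this PR: `MINI_FORMAT_ERROR`, `mini::CountSpecs`, and `mini::CheckArgCount` are hypothetical names, and the hook throws `std::runtime_error` instead of calling `abort()` so that a caller (playing the role of PADDLE_ENFORCE) still gets a chance to report its own message.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>

// User side: define the hook BEFORE the formatter code is seen.
// Assumption: throwing is used here in place of the abort() shown in the
// reproduction, so the mismatch can be caught and reported by the caller.
#define MINI_FORMAT_ERROR(reason) throw std::runtime_error(reason)

// "Library" side: fall back to an assert only if the user defined nothing.
// This branch is skipped in this file because the hook is defined above.
#ifndef MINI_FORMAT_ERROR
#define MINI_FORMAT_ERROR(reason) assert(0 && "format error")
#endif

namespace mini {

// Counts conversion specifiers in fmt, treating "%%" as a literal percent.
inline int CountSpecs(const std::string& fmt) {
  int count = 0;
  for (std::size_t i = 0; i < fmt.size(); ++i) {
    if (fmt[i] != '%') continue;
    if (i + 1 < fmt.size() && fmt[i + 1] == '%') {
      ++i;  // skip the second '%' of an escaped "%%"
      continue;
    }
    ++count;
  }
  return count;
}

// Invokes the error hook when the specifier count and the argument count differ.
template <typename... Args>
void CheckArgCount(const std::string& fmt, const Args&...) {
  const int specs = CountSpecs(fmt);
  const int args = static_cast<int>(sizeof...(Args));
  if (specs != args) {
    std::ostringstream msg;
    msg << "format \"" << fmt << "\" has " << specs
        << " specifier(s) but " << args << " argument(s) were given";
    MINI_FORMAT_ERROR(msg.str());
  }
}

}  // namespace mini
```

Calling `mini::CheckArgCount("%d %s", 1)` throws, while `mini::CheckArgCount("%d %s", 1, "x")` returns normally. The design point is the same one the description makes: a mismatch should surface as a reportable error rather than end the process inside the formatter.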

However, in the H-card Coverage job, three unit tests, including test_dygraph_sharding_stage3_for_eager, crash in PrepareStridedOut:

2026-04-27T12:46:24.0700284Z     70	--------------------------------------
2026-04-27T12:46:24.0700604Z     71	C++ Traceback (most recent call last):
2026-04-27T12:46:24.0700906Z     72	--------------------------------------
2026-04-27T12:46:24.0701655Z     73	0   egr::Backward(std::vector<paddle::Tensor, std::allocator<paddle::Tensor> > const&, std::vector<paddle::Tensor, std::allocator<paddle::Tensor> > const&, bool, std::string)
2026-04-27T12:46:24.0703059Z     74	1   egr::RunBackward(std::vector<paddle::Tensor, std::allocator<paddle::Tensor> > const&, std::vector<paddle::Tensor, std::allocator<paddle::Tensor> > const&, bool, bool, std::vector<paddle::Tensor, std::allocator<paddle::Tensor> > const&, bool, std::vector<paddle::Tensor, std::allocator<paddle::Tensor> > const&, std::string)
2026-04-27T12:46:24.0704342Z     75	2   MatmulGradNode::operator()(paddle::small_vector<std::vector<paddle::Tensor, std::allocator<paddle::Tensor> >, 15u>&, bool, bool)
2026-04-27T12:46:24.0705187Z     76	3   paddle::experimental::matmul_grad(paddle::Tensor const&, paddle::Tensor const&, paddle::Tensor const&, bool, bool, paddle::Tensor*, paddle::Tensor*)
2026-04-27T12:46:24.0706307Z     77	4   void phi::MatmulGradStrideKernel<float, phi::GPUContext>(phi::GPUContext const&, phi::DenseTensor const&, phi::DenseTensor const&, phi::DenseTensor const&, bool, bool, phi::DenseTensor*, phi::DenseTensor*)
2026-04-27T12:46:24.0707137Z     78	5   phi::PrepareStridedOut(phi::DenseTensor*)
2026-04-27T12:46:24.0707545Z     79	6   phi::DenseTensorMeta::DenseTensorMeta(phi::DenseTensorMeta const&)
2026-04-27T12:46:24.0707920Z     80	
2026-04-27T12:46:24.0708110Z     81	----------------------
2026-04-27T12:46:24.0708368Z     82	Error Message Summary:
2026-04-27T12:46:24.0708612Z     83	----------------------
2026-04-27T12:46:24.0708985Z     84	FatalError: `Segmentation fault` is detected by the operating system.
2026-04-27T12:46:24.0709534Z     85	  [TimeInfo: *** Aborted at 1777293968 (unix time) try "date -d @1777293968" if you are using GNU date ***]
2026-04-27T12:46:24.0710126Z     86	  [SignalInfo: *** SIGSEGV (@0x10) received by PID 7988 (TID 0x7f9905499740) from PID 16 ***]

After renaming, the problem disappeared. Perhaps the wrong function was being linked? This cannot be explained yet.

Does this cause a precision change?


Cherry-pick of #78793 (authored by @wanghuancoder) to release/3.3.

devPR: #78793

* fix tinyformat

* for test
@paddle-bot

paddle-bot Bot commented Apr 29, 2026

Your PR has been submitted. Thanks for your contribution to this open-source project!
Please wait for the CI results first. See the Paddle CI Manual for details.

@risemeup1111 risemeup1111 mentioned this pull request Apr 29, 2026
@paddle-bot paddle-bot Bot added the contributor External developers label Apr 29, 2026