[CPU] MOE_via_BatchGatherMatmul_extended_for_gpt_oss_awq #33335
chenhu-wang wants to merge 3 commits into openvinotoolkit:master
Force-pushed from 1cd3ef2 to 7a01e2e.
Force-pushed from aa49aa4 to 75ebe2d.
v-Golubev left a comment
Could you please also update src/plugins/intel_cpu/tests/unit/transformations/moe_matmuls_fusion_test.cpp?
If it is applicable only to MoE2GeMM, let's throw an exception when with_gate_mul==true and the MoE type is MoE3GeMM.
The check is added, thanks!
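The requested guard could look roughly like the sketch below. It is only an illustration: `MoEType` and `validate_moe_config` are hypothetical names, not the identifiers used in the PR, and the exception type is an assumption.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the MoE kinds mentioned in the review.
enum class MoEType { MoE2GeMM, MoE3GeMM };

// Rejects the one combination the review asks to forbid:
// with_gate_mul == true together with MoE3GeMM.
inline void validate_moe_config(MoEType type, bool with_gate_mul) {
    if (with_gate_mul && type == MoEType::MoE3GeMM) {
        throw std::invalid_argument("with_gate_mul is supported only for MoE2GeMM");
    }
}
```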
First 2 and last 2 params are identical except for with_gate_mul value, right? Maybe we can move with_gate_mul to MoeTestParams then? In this case, we will not need 2 different moe_params_* vectors
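A minimal sketch of that idea follows. Apart from `with_gate_mul`, the field names, the values, and the `make_moe_params` helper are hypothetical; the real `MoeTestParams` in the test file has its own layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout: only with_gate_mul comes from the review comment.
struct MoeTestParams {
    std::size_t num_experts;  // illustrative field
    std::size_t hidden_size;  // illustrative field
    bool with_gate_mul;       // moved into the params struct
};

// Builds a single vector that covers both with_gate_mul values
// for every base configuration, instead of two separate vectors.
inline std::vector<MoeTestParams> make_moe_params() {
    const std::vector<MoeTestParams> base = {{4, 64, false}, {8, 128, false}};
    std::vector<MoeTestParams> all;
    for (const auto& p : base) {
        for (bool gate : {false, true}) {
            all.push_back({p.num_experts, p.hidden_size, gate});
        }
    }
    return all;
}
```

With the flag inside the struct, one parameter vector can feed the whole test suite.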
Let's use pattern::shape_matches instead of a custom predicate:
- auto mul1_const = pattern::wrap_type<ov::op::v0::Constant>(mul1_const_predicate);
+ auto mul1_const = pattern::wrap_type<ov::op::v0::Constant>(pattern::shape_matches("[?, 1, ?]"));
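To illustrate what such a predicate checks, here is a standalone sketch. It assumes "[?, 1, ?]" means a rank-3 shape whose middle dimension is 1 and whose outer dimensions are unconstrained. The `shape_matches_sketch` function and the `kAny` wildcard are hypothetical and are not OpenVINO's implementation or notation parser.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// -1 plays the role of the "?" wildcard in this sketch.
constexpr std::int64_t kAny = -1;

// Returns true when shape has the same rank as pattern and every
// non-wildcard pattern entry equals the corresponding dimension.
inline bool shape_matches_sketch(const std::vector<std::int64_t>& shape,
                                 const std::vector<std::int64_t>& pattern) {
    if (shape.size() != pattern.size()) {
        return false;
    }
    for (std::size_t i = 0; i < shape.size(); ++i) {
        if (pattern[i] != kAny && pattern[i] != shape[i]) {
            return false;
        }
    }
    return true;
}
```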
Force-pushed from ce59ca1 to 247ed46.
Extended, thanks!
NNCF should avoid inserting "Multiply" for MoE patterns; this is feedback for the NNCF team.