-
Notifications
You must be signed in to change notification settings - Fork 12
Description
In Array.bosatsu:
def set_in_range_ok(_: Unit) -> Bool:
match set_Array(a5, 1, 8):
case Some(arr): to_List_Array(arr) matches [0, 8, 2, 3, 4]
case None: False
get compiled to:
static BValue ___bsts_c_Bosatsu_l_Collection_l_Array_l_set__in__range__ok() {
BValue __bsts_a_3 = ___bsts_g_Bosatsu_l_Collection_l_Array_l_index__in__range__Array(___bsts_g_Bosatsu_l_Collection_l_Array_l_a5(),
bsts_integer_from_int(1));
BValue __bsts_a_5 = get_variant_value(__bsts_a_3) == 1 ?
alloc_enum1(1,
___bsts_g_Bosatsu_l_Collection_l_Array_l_set__or__self__Array(___bsts_g_Bosatsu_l_Collection_l_Array_l_a5(),
bsts_integer_from_int(1),
bsts_integer_from_int(8))) :
alloc_enum0(0);
if (get_variant(__bsts_a_5) == 1) {
BValue __bsts_b_arr0 = get_enum_index(__bsts_a_5, 0);
BValue __bsts_a_4 = ___bsts_g_Bosatsu_l_Collection_l_Array_l_to__List__Array(__bsts_b_arr0);
return get_variant(__bsts_a_4) == 1 ?
(bsts_integer_equals(get_enum_index(__bsts_a_4, 0), bsts_integer_from_int(0)) ?
(get_variant(get_enum_index(__bsts_a_4, 1)) == 1 ?
(bsts_integer_equals(get_enum_index(get_enum_index(__bsts_a_4, 1), 0),
bsts_integer_from_int(8)) ?
(get_variant(get_enum_index(get_enum_index(__bsts_a_4, 1), 1)) == 1 ?
(bsts_integer_equals(get_enum_index(get_enum_index(get_enum_index(__bsts_a_4,
1),
1),
0),
bsts_integer_from_int(2)) ?
(get_variant(get_enum_index(get_enum_index(get_enum_index(__bsts_a_4,
1),
1),
1)) == 1 ?
(bsts_integer_equals(get_enum_index(get_enum_index(get_enum_index(get_enum_index(__bsts_a_4,
1),
1),
1),
0),
bsts_integer_from_int(3)) ?
(get_variant(get_enum_index(get_enum_index(get_enum_index(get_enum_index(__bsts_a_4,
1),
1),
1),
1)) == 1 ?
(bsts_integer_equals(get_enum_index(get_enum_index(get_enum_index(get_enum_index(get_enum_index(__bsts_a_4,
1),
1),
1),
1),
0),
bsts_integer_from_int(4)) ?
(get_variant(get_enum_index(get_enum_index(get_enum_index(get_enum_index(get_enum_index(__bsts_a_4,
1),
1),
1),
1),
1)) == 0 ?
alloc_boxed_pure_fn1(__bsts_t_lambda509) :
alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
alloc_boxed_pure_fn1(__bsts_t_lambda510);
}
else {
return alloc_boxed_pure_fn1(__bsts_t_lambda510);
}
}
So, the match on the specific list is fully inlined as an unrolled loop. But we do quadratic work to do the match: we don't save the part of the list we've already matched. It isn't saving the work of getting towards the tail. It isn't clear if this is an issue with Matchless compilation, or in the codegen from Matchless to C or some combination.
Secondly, look at the repeated alloc_boxed_pure_fn1(__bsts_t_lambda510)) at the end. This is certainly in the ClangGen code. We should only allocate lambda/closure wrappers once per scope. Since all values are immutable and garbarge collected we can go further: at each scope we should only allocate values one time (this is related to the above problem, but I suspect that one is really a problem with Matchless not doing this, and not that ClangGen should have).
Thirdly, (and this should be fixed after the previous two have been fix because they may hide this) note the original function was def set_in_range_ok(_: Unit) -> Bool for some reason. It's a little weird it wasn't written as set_in_range_ok: Bool but anyway, it was defined as a Unit -> Bool but somehow, for the compilation we are instead setting it to a lambda value and pushing the Unit -> x down both branches. This may be a TypedExprNormalization rule, but I can't see that it makes a lot of sense. The only case where that seems like it would be a win is if both branches we returning fn(x) on their tail where x is the argument. Then we could have just not done the lambda apply. But pushing "lambda-ness" down into branches without removing applies from all branches it seems like a de-optimization.