TypedNormalization inlining creating big functions

In Array.bosatsu:

```
def set_in_range_ok(_: Unit) -> Bool:
  match set_Array(a5, 1, 8):
    case Some(arr): to_List_Array(arr) matches [0, 8, 2, 3, 4]
    case None: False
```
get compiled to:
```
static BValue ___bsts_c_Bosatsu_l_Collection_l_Array_l_set__in__range__ok() {
    BValue __bsts_a_3 = ___bsts_g_Bosatsu_l_Collection_l_Array_l_index__in__range__Array(___bsts_g_Bosatsu_l_Collection_l_Array_l_a5(),
        bsts_integer_from_int(1));
    BValue __bsts_a_5 = get_variant_value(__bsts_a_3) == 1 ?
        alloc_enum1(1,
            ___bsts_g_Bosatsu_l_Collection_l_Array_l_set__or__self__Array(___bsts_g_Bosatsu_l_Collection_l_Array_l_a5(),
                bsts_integer_from_int(1),
                bsts_integer_from_int(8))) :
        alloc_enum0(0);
    if (get_variant(__bsts_a_5) == 1) {
        BValue __bsts_b_arr0 = get_enum_index(__bsts_a_5, 0);
        BValue __bsts_a_4 = ___bsts_g_Bosatsu_l_Collection_l_Array_l_to__List__Array(__bsts_b_arr0);
        return get_variant(__bsts_a_4) == 1 ?
            (bsts_integer_equals(get_enum_index(__bsts_a_4, 0), bsts_integer_from_int(0)) ?
                (get_variant(get_enum_index(__bsts_a_4, 1)) == 1 ?
                    (bsts_integer_equals(get_enum_index(get_enum_index(__bsts_a_4, 1), 0),
                        bsts_integer_from_int(8)) ?
                        (get_variant(get_enum_index(get_enum_index(__bsts_a_4, 1), 1)) == 1 ?
                            (bsts_integer_equals(get_enum_index(get_enum_index(get_enum_index(__bsts_a_4,
                                            1),
                                        1),
                                    0),
                                bsts_integer_from_int(2)) ?
                                (get_variant(get_enum_index(get_enum_index(get_enum_index(__bsts_a_4,
                                                1),
                                            1),
                                        1)) == 1 ?
                                    (bsts_integer_equals(get_enum_index(get_enum_index(get_enum_index(get_enum_index(__bsts_a_4,
                                                        1),
                                                    1),
                                                1),
                                            0),
                                        bsts_integer_from_int(3)) ?
                                        (get_variant(get_enum_index(get_enum_index(get_enum_index(get_enum_index(__bsts_a_4,
                                                            1),
                                                        1),
                                                    1),
                                                1)) == 1 ?
                                            (bsts_integer_equals(get_enum_index(get_enum_index(get_enum_index(get_enum_index(get_enum_index(__bsts_a_4,
                                                                    1),
                                                                1),
                                                            1),
                                                        1),
                                                    0),
                                                bsts_integer_from_int(4)) ?
                                                (get_variant(get_enum_index(get_enum_index(get_enum_index(get_enum_index(get_enum_index(__bsts_a_4,
                                                                        1),
                                                                    1),
                                                                1),
                                                            1),
                                                        1)) == 0 ?
                                                    alloc_boxed_pure_fn1(__bsts_t_lambda509) :
                                                    alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
                                                alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
                                            alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
                                        alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
                                    alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
                                alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
                            alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
                        alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
                    alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
                alloc_boxed_pure_fn1(__bsts_t_lambda510)) :
            alloc_boxed_pure_fn1(__bsts_t_lambda510);
    }
    else {
        return alloc_boxed_pure_fn1(__bsts_t_lambda510);
    }
}
```

So,  the match on the specific list is fully inlined as an unrolled loop. But we do quadratic work to do the match: we don't save the part of the list we've already matched. It isn't saving the work of getting towards the tail. It isn't clear if this is an issue with Matchless compilation, or in the codegen from Matchless to C or some combination.

Secondly, look at the repeated `alloc_boxed_pure_fn1(__bsts_t_lambda510))` at the end. This is certainly in the ClangGen code. We should only allocate lambda/closure wrappers once per scope. Since all values are immutable and garbarge collected we can go further: at each scope we should only allocate values one time (this is related to the above problem, but I suspect that one is really a problem with Matchless not doing this, and not that ClangGen should have).

Thirdly, (and this should be fixed after the previous two have been fix because they may hide this) note the original function was `def set_in_range_ok(_: Unit) -> Bool` for some reason. It's a little weird it wasn't written as `set_in_range_ok: Bool` but anyway, it was defined as a `Unit -> Bool` but somehow, for the compilation we are instead setting it to a lambda value and pushing the `Unit -> x` down both branches. This may be a TypedExprNormalization rule, but I can't see that it makes a lot of sense. The only case where that seems like it would be a win is if both branches we returning `fn(x)` on their tail where x is the argument. Then we could have just not done the lambda apply. But pushing "lambda-ness" down into branches without removing applies from all branches it seems like a de-optimization.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TypedNormalization inlining creating big functions #1603

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

TypedNormalization inlining creating big functions #1603

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions