Expanding the internal iteration API #99

tcbrindle · 2023-07-13T17:09:48Z

Currently internal iteration is achieved by a sequence customisation point for_each_while(predicate) -> cursor_t, which instructs the sequence to iterate over all of its elements until such time as the predicate returns false.

Unfortunately there are some cases where we can't use internal iteration even though we'd like to. Algorithms like fold_first (#98), min and max need to inspect the initial element before iterating over the rest of the sequence; more generally, any time we take a slice of a sequence (such as with the elements of chunk, chunk_by, slide etc) then we can't implement for_each_while() on the slice in terms of an equivalent call on the base sequence, and so have to fall back to external iteration.

A solution to this would be to replace the existing for_each_while() customisation point with a new pair of functions taking extra cursor arguments to indicate where iteration should begin and end:

auto iterate_while(auto& seq, auto&& pred, cursor_t from) -> cursor_t;
auto iterate_while(auto& seq, auto&& pred, cursor_t from, cursor_t to) -> cursor_t;

The slightly odd parameter order is to allow some sequence impls to provide just a single overload, with the to parameter defaulted to flux::last(). Implementations wouldn't need to provide both of these functions: in particular, most single-pass sequences aren't going to be able to provide the second one.

With this approach, algorithms like fold_first could examine the first element before proceeding with a call to iterate_while(); and iterate_while() on slices could be implemented in terms of calls to the equivalent function on their parent sequence. The existing flux::for_each_while(seq, pred) wrapper function could be implemented as a call to iterate_while(seq, pred, first(seq)).

This would add (yet) another customisation point that authors would have to worry about when writing a sequence implementation, which I'm somewhat reluctant to do, but I think the codegen benefits of internal iteration are great enough that it's justified in this case.

@brycelelbach @brevzin @jnytra any thoughts?

The text was updated successfully, but these errors were encountered:

brevzin · 2023-07-14T04:55:28Z

Yeah I ran into the same problem when I was trying to do this, and instead came up with a janky semi-stateful internal iteration solution? Not amazing.

So I guess the implementation of min would look something like... this?

template <sequence Seq>
auto min(Seq& seq) -> Optional<value_t<Seq>> {
    auto cursor = first(seq);
    if (is_last(seq, cursor)) {
        return {};
    }

    Optional<value_t<Seq>> best = read_at(seq, cursor);
    inc(seq, cursor);
    iterate_while(seq,
        [&](auto&& elem){ *best = std::min(*best, elem); return true; },
        cursor);
    return best;
}

tcbrindle · 2023-07-14T11:13:09Z

I think min() itself wouldn't need to be changed, as it just calls fold_first(). But we could implement that as something like:

template <sequence Seq, typename Func, typename V = value_t<Seq>>
auto fold_first(Seq&& seq, Func func) -> flux::optional<V>
{
    auto cur = first(seq);

    if (is_last(seq, cur)) {
        return std::nullopt;
    }

    V init(read_at(seq, cur));
    inc(seq, cur);

    iterate_while(seq, [&](auto&& elem) {
        init = std::invoke(func, std::move(init), FLUX_FWD(elem));
        return true;
    }, std::move(cur));

    return optional<V>(std::in_place, std::move(init));
}

(Which is pretty much exactly what you just wrote....)

brevzin · 2023-07-15T15:33:37Z

I think this seems like a pretty cool idea. Little awkward having the lambda not last I guess, so maybe do iterate_while(seq, cur, f) and iterate_while(seq, cur, last, f)? Definitely curious to see how this pans out!

jnytra · 2023-07-17T05:44:19Z

I think too many customization points can confuse users. I prefer a simpler API, even if it means a small performance loss. But if the compiler can optimize much better, it would be a pity not to use it. Maybe we can do some micro benchmarks to prove the speedup. I also agree with @brevzin that we should use lambda as the last argument.

tcbrindle · 2023-07-18T21:33:04Z

I think too many customization points can confuse users.

I agree in general. In this case though I think the argument in favour of adding one more -- in particular, that it would enable internal iteration of slices -- it persuasive enough that it's worth investigating.

I also agree with @brevzin that we should use lambda as the last argument.

As noted above, the odd parameter order is to allow some implementations to combine both overloads into a single function by defaulting the last parameter. For example, the implementation for C arrays could be

template <typename T, index_t N>
struct sequence_traits<T[N]> {
     // ...

    static constexpr auto iterate_while(auto& self, auto&& pred, index_t from, index_t to = N) -> index_t
    {
         for (; from < to; ++from) {
            if (!std::invoke(pred, self[from])) {
                break;
            }
         }

        return from;
    }
};

As per feedback on #99

tcbrindle added the enhancement New feature or request label Jul 13, 2023

tcbrindle mentioned this issue Jul 13, 2023

fold_first doesn't use internal iteration #98

Closed

tcbrindle mentioned this issue Jul 14, 2023

Update the internal iteration protocol #100

Closed

28 tasks

tcbrindle added a commit that referenced this issue Jul 27, 2023

iterate_while: put predicate argument last

d91a8c8

As per feedback on #99

tcbrindle added a commit that referenced this issue Jul 27, 2023

iterate_while: put predicate argument last

c1283bb

As per feedback on #99

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Expanding the internal iteration API #99

Expanding the internal iteration API #99

tcbrindle commented Jul 13, 2023

brevzin commented Jul 14, 2023 •

edited

Loading

tcbrindle commented Jul 14, 2023 •

edited

Loading

brevzin commented Jul 15, 2023

jnytra commented Jul 17, 2023

tcbrindle commented Jul 18, 2023

Expanding the internal iteration API #99

Expanding the internal iteration API #99

Comments

tcbrindle commented Jul 13, 2023

brevzin commented Jul 14, 2023 • edited Loading

tcbrindle commented Jul 14, 2023 • edited Loading

brevzin commented Jul 15, 2023

jnytra commented Jul 17, 2023

tcbrindle commented Jul 18, 2023

brevzin commented Jul 14, 2023 •

edited

Loading

tcbrindle commented Jul 14, 2023 •

edited

Loading