> Our code heavily relies on vsetvl* instructions setting vl to VLMAX when avl is greater than or equal to VLMAX. Although this seems to be true on our target, this is actually not mandated by RVV 1.0. This makes this code quite brittle and should be addressed.
This is a single additional min instruction: `vsetvli(min(avl, vlmax))`
Or you can just always use VLMAX; which of the two works better depends on the context.
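To illustrate why the extra min is enough, here is a minimal sketch in plain C that models the RVV 1.0 constraints on vl in software. It is not intrinsic code: `vl_is_legal` and `clamp_avl` are hypothetical helper names introduced only for this example, and the rules are transcribed from the "Constraints on setting vl" section of the spec.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical software model of the RVV 1.0 rules for vl.
 * Returns true if an implementation is allowed to return `vl`
 * when vsetvl* is executed with the given AVL and VLMAX. */
static bool vl_is_legal(size_t avl, size_t vlmax, size_t vl)
{
    if (avl <= vlmax)
        return vl == avl;                          /* rule 1 */
    if (avl < 2 * vlmax)
        return vl >= (avl + 1) / 2 && vl <= vlmax; /* rule 2: ceil(AVL/2) <= vl <= VLMAX */
    return vl == vlmax;                            /* rule 3 */
}

/* The suggested fix: clamp AVL before handing it to vsetvl*,
 * so that only rule 1 can ever apply. */
static size_t clamp_avl(size_t avl, size_t vlmax)
{
    return avl < vlmax ? avl : vlmax;
}
```

With VLMAX = 4 and AVL = 5, the model accepts both vl = 3 and vl = 4. After clamping, AVL becomes 4 and the only legal result is vl = 4.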
> Using a VLEN specific approach is somehow cheating
I think the best approach is to have a general solution, with specialization where beneficial. Branching on VLEN should be extremely cheap, because it's always predicted.
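A rough sketch of what I mean by "general solution with specialization", assuming the VLEN in bits is already known at runtime (on real hardware it can be derived from the `vlenb` CSR, which holds VLEN/8). The kernel names and the signature are hypothetical, and the bodies are scalar stand-ins rather than real vector code.

```c
#include <stddef.h>

/* Hypothetical kernel signature, for illustration only. */
typedef void (*kernel_fn)(const int *src, int *dst, size_t n);

/* VLEN-agnostic fallback (scalar stand-in for the general vector code). */
static void kernel_generic(const int *src, int *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] + 1;
}

/* Stand-in for a version tuned for VLEN=128; it must produce the same result. */
static void kernel_vlen128(const int *src, int *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] + 1;
}

/* Select once per call; the branch outcome never changes on a given machine. */
static kernel_fn pick_kernel(size_t vlen_bits)
{
    return vlen_bits == 128 ? kernel_vlen128 : kernel_generic;
}
```

The selection could also be done once at startup and cached in a function pointer, which trades the branch for an indirect call.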
Thanks for the suggestion @camel-cdr.
I actually want to set vl to VLMAX, but RVV 1.0 allows some leniency when setting vl:
https://github.com/riscvarchive/riscv-v-spec/blob/master/v-spec.adoc#63-constraints-on-setting-vl
Although, looking at it more closely, I think I can use clause 3, `vl = VLMAX if AVL ≥ (2 * VLMAX)`, to make sure I get the value I expect.
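As a sanity check of that reading, here is a small plain-C sketch that counts how many vl values the spec permits for a given AVL. `count_legal_vl` is a hypothetical helper written for this comment, not part of any library; a result of 1 means the outcome is fully determined.

```c
#include <stddef.h>

/* Hypothetical helper: number of distinct vl values an RVV 1.0
 * implementation may return for the given AVL and VLMAX. */
static size_t count_legal_vl(size_t avl, size_t vlmax)
{
    if (avl <= vlmax)
        return 1;                    /* rule 1: vl = AVL */
    if (avl >= 2 * vlmax)
        return 1;                    /* rule 3: vl = VLMAX */
    /* rule 2: ceil(AVL / 2) <= vl <= VLMAX */
    size_t lowest = (avl + 1) / 2;
    return vlmax - lowest + 1;
}
```

For VLMAX = 4, AVL = 5 or 6 leaves two possible values, while any AVL of 8 or more leaves exactly one, so requesting at least 2 * VLMAX elements pins vl to VLMAX.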
>> Using a VLEN specific approach is somehow cheating
> I think the best approach is to have a general solution, with specialization where beneficial. Branching on VLEN should be extremely cheap, because it's always predicted.
I agree, although I would need to evaluate the impact on code size (which I completely disregarded in this work). I might try it to optimize the code across several implementations (I only have a VLEN=128 implementation readily available at the moment).