2 Comments

Great article, here are some rdcycle measurements from a C908:

scalar baseline (due to -Os): https://pastebin.com/Jejx6CQW

autovec baseline (due to -Ofast): https://pastebin.com/gQB76kgy

Edit: I experimented with larger LMUL, but that didn't seem to affect performance much.

Edit2: the output logs say "... instruction(s)", but the numbers were actually measured in cycles with rdcycle.


Thank you for running those @camel-cdr

https://github.com/nibrunie/rvv-examples/pull/4 should improve the labelling of the performance metrics ("cycle" vs "instruction").
