Discussion about this post

camel-cdr

Great article, here are some rdcycle measurements from a C908:

scalar baseline (due to -Os): https://pastebin.com/Jejx6CQW

autovec baseline (due to -Ofast): https://pastebin.com/gQB76kgy

Edit: I experimented with larger LMUL, but that didn't seem to affect performance much.

Edit2: the output logs say "... instruction(s)", but the values are actually cycle counts measured with rdcycle.
