Re: Strange: Rosetta faster than M1

Gerriet M. Denkmann

On 20 Sep 2022, at 19:42, Alex Zavatone via <zav@...> wrote:

It might seem like a primitive approach, but logging with time stamps should be able to highlight where the suckyness is. Run a log that displays the time delta from the last logging statement so that you are only looking at the deltas. Then run each version and see where the slowness is. That should tell you, right?
I did this:
typedef uint32_t limb;
typedef uint64_t bigLimb;

const uint len = 50000;
const int shiftLimb = sizeof(limb) * 8;

limb *someArray = malloc( len * sizeof(limb) );
bigLimb someBig = 0;

for (bigLimb factor = 1; factor < len; factor++ ) {
    for (uint idx = 0 ; idx < len ; idx++) {
        someBig += factor * someArray[idx] ;
        someArray[idx] = (limb)(someBig);   // keep the low limb
        someBig >>= shiftLimb;              // carry the high limb forward
    }
}

and ran it in Release mode (-Os = Fastest, Smallest).
(In Debug mode (-O0), Rosetta time = M1 time.)

                                 Rosetta   M1      Rosetta time / M1 time
with "someBig >>= shiftLimb":    1.8       3.35    0.54
without the shift:               1.32      0.924   1.43
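
For anyone who wants to reproduce numbers like these, here is a minimal sketch of a timing harness. It is not the code that produced the table. The names loop_with_shift, loop_without_shift and time_variant are made up for illustration, the array is filled with known values instead of being left uninitialized, and clock() reports CPU time, which may differ from whatever timer was used originally.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

typedef uint32_t limb;
typedef uint64_t bigLimb;

/* Variant with the shift: only the carry (high limb) survives each step.
   len must be at least 1. */
static bigLimb loop_with_shift(limb *a, uint32_t len)
{
    const int shiftLimb = sizeof(limb) * 8;
    bigLimb big = 0;
    for (bigLimb factor = 1; factor < len; factor++) {
        for (uint32_t idx = 0; idx < len; idx++) {
            big += factor * a[idx];
            a[idx] = (limb)big;
            big >>= shiftLimb;
        }
    }
    return big + a[len - 1];   /* checksum keeps the work observable */
}

/* Variant without the shift: the accumulator keeps growing and wraps. */
static bigLimb loop_without_shift(limb *a, uint32_t len)
{
    bigLimb big = 0;
    for (bigLimb factor = 1; factor < len; factor++) {
        for (uint32_t idx = 0; idx < len; idx++) {
            big += factor * a[idx];
            a[idx] = (limb)big;
        }
    }
    return big + a[len - 1];
}

/* Hypothetical helper: runs one variant on a freshly filled array and
   returns the CPU time in seconds; the checksum goes to *checksum. */
static double time_variant(bigLimb (*fn)(limb *, uint32_t),
                           uint32_t len, bigLimb *checksum)
{
    limb *a = malloc(len * sizeof(limb));
    assert(a != NULL);
    for (uint32_t i = 0; i < len; i++)
        a[i] = i + 1;          /* deterministic input (an assumption) */

    clock_t start = clock();
    *checksum = fn(a, len);
    clock_t stop = clock();

    free(a);
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling time_variant(loop_with_shift, 50000, &sum) and time_variant(loop_without_shift, 50000, &sum) from main, once in an x86_64 build run under Rosetta and once in a native arm64 build, should give the four numbers needed for a table like the one above. Printing the checksum helps make sure the compiler does not drop the loops.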

So it seems that the x86_64 build running under Rosetta handles the shift way better than the native Apple Silicon build.

Which kind of looks like a bug.

