February 12th, 2012

ronin

A log2(n) integer modulo algorithm for CPUs lacking hardware divide.


Collapse )

I created this algorithm as a result of reading a post on Reddit that showed some shockingly poor performance for intrinsic software "%" operator in some C compilers for microcontrollers which don't have hardware divide.

How does the algorithm work? Well, the mathematical definition of modulo is:

y % x = y - (x * int(y/x))

The trick this algorithm employs is to do a kind of weirdo binary search to find (x * int(y/x)), without ever directly computing int(y/x), or using division.

Collapse )

Is it possible to do this faster? I strongly suspect so. As the Wikipedia link given below notes, it is possible to reduce (y % x) to (y & (x-1)) when x is a power of 2. Seems to me that trick should be exploitable to bring the time to compute modulo down even farther. But in any event, the above unsophisticated and unoptimized algorithm should still be a very large improvement over modulo implemented via a software divide routine... as some people's compilers are probably doing now.


See also:

http://en.wikipedia.org/wiki/Modulo_operation#Performance_issues

http://graphics.stanford.edu/~seander/bithacks.html#ModulusDivisionEasy

http://stackoverflow.com/questions/2566010/fastest-way-to-calculate-a-128-bit-integer-modulo-a-64-bit-integer
  • Current Music
    The Klein Four Group - Finite Simple Group (of Order Two)