Search for a command to run...
The Montgomery multiplication algorithm is used to perform multiplication coupled with modular reduction without the need to employ a division operation. Serial Montgomery implementations may operate at a bit, digit, or word level. An established classification scheme for serial implementations considers two dimensions: whether the multiplication and reduction computations are separated or integrated, and whether operand or product words are prioritized for scanning. Presented here is an augmented version of the taxonomy, which adds a third dimension. The new dimension characterizes the degree of parallelism in performing low-level (bit or digit) computations. Introducing a small degree of bit or digit level parallelism to a formerly serial approach can enhance performance, both through the typical benefit of parallel computation, and by opportunistically avoiding unnecessary computations. This can be achieved for a modest incremental cost in increased area in lieu of the larger cost of realizing a fully parallel solution. In this way, a new region on the latency-area curve becomes available for tradeoff considerations. The novel Rescheduled Montgomery Multiplier is presented as an experimental realization of the augmented taxonomy.