\documentclass[12pt,a4paper]{article}
\title{
Montgomery Algorithm for Modular Multiplication
}
\author{Professor Dr. D. J. Guan
\thanks{
Department of Computer Science,
National Sun Yat-Sen University,
Kaohsiung, Taiwan 80424
({\tt guan@cse.nsysu.edu.tw}).
}}
\markboth{Lecture Notes}{Montgomery Algorithm}
\pagestyle{myheadings}
\begin{document}
\maketitle
The Montgomery algorithm for modular multiplication is considered to
be the fastest algorithm to compute $x y \bmod n$ in computers when the
values of $x$, $y$, and $n$ are large.
In this lecture note, we shall describe the Montgomery algorithm for
modular multiplication~\cite{Mpl85,BDK98}.
This is one of the notes for the algorithms which my students have
difficulty to fully understand.
Suppose we want to compute $x y\bmod n$ in a computer.
We first choose a positive integer, $r$, greater than $n$ and relative
prime to $n$.
The value of $r$ is usually $2^m$ for some positive integer $m$.
This is because multiplication, division and modulo by $r$ can be done
by shifting or logical operations in computers.
Since $\gcd(r,n)=1$,
there are two numbers $r^{-1}$ and $n'$ with $0 $q:=(x\bmod r)n'\bmod r$;\\
\> $a:=(x+qn)/r$;\\
\>{\bf if} $a>n$ {\bf then} $a:=a-n$;\\
\>{\bf return} $a$;\\
{\bf end}
\end{tabbing}
\caption{Algorithm for the computation of $a=x r^{-1}\bmod n$.}
\end{figure}
The reason why the above algorithm works is explained as follows.
First,
$$x r^{-1} = x r r^{-1}/r = x (n n'+1)/r$$
Note that, for any integer $l$,
$$((x n' + l r)n + x)/r\bmod n=(x n' n + l r n + x)/r\bmod n = (x n' n + x)/r\bmod n$$
Therefore, instead of computing $q=x n'$, we can compute $q=x n'\bmod r$.
Now, suppose $0\le x < rn$.
The value of $a=(x+qn)/r$ will be less than $2n$.
Therefore, computing $a\bmod n$ can be done very easily by
subtracting $n$ from $a$ if $a>n$.
Suppose that a number $x$, $0\le x\le n$, is mapped to $xr$.
It is easy to verify that
\begin{enumerate}
\item $x+y$ is mapped to $(x+y)r=xr+yr$,
\item $x y$ is mapped to $(x y)r=(xr)(yr)r^{-1}$,
\end{enumerate}
Therefore, if the numbers $x$ and $y$ are represented by $xr$ and $yr$
in a computer, then the multiplication algorithm is ${\it reduce}(xy)$.
The addition algorithm is unchanged since
$xr^{-1}+yr^{-1}\equiv zr^{-1}\bmod n$
if and only if $x+y\equiv z\bmod n$.
The algorithms for subtraction, negation, equality test, inequality
test, multiplication by an integer, and the greatest common divisor
with $n$ are also unchanged.
When the numbers $X$, $Y$, and $N$ are large,
we can also apply the above algorithm to compute $X Y \bmod N$
in an efficient way~\cite{BDK98}.
Suppose that
$$X=\sum_{k=0}^{m-1}{x_k \beta^k},\quad
Y=\sum_{k=0}^{m-1}{y_k \beta^k},\quad
N=\sum_{k=0}^{m-1}{n_k \beta^k},$$
and we want to compute $A=X Y\beta^{-m}$.
Let $A=\sum_{k=0}^{m-1}{a_k \beta^k}$.
In the algorithm {\it montgomery}, we use the notation $(\alpha)_\beta^{-1}$
to denote the inverse of $\alpha$ in $Z_\beta$.
\begin{figure}[ht]
\begin{tabbing}
\quad\=\quad\=\quad\=\quad\=\quad\=\kill
{\bf function} {\it montgomery} $(X, Y)$\\
{\bf begin}\\
\> $A:=0$;\\
\>{\bf for} $k:=0$ {\bf to} $m-1$ {\bf do begin}\\
\>\> $q:=(a_0+x_k y_0)(\beta-n_0)_\beta^{-1}\bmod \beta$;\\
\>\> $A:=A + x_k Y + q N$;\\
\>\> $A:=A/\beta$;\\
\>{\bf end};\\
%\>\>{\bf if} $A>N$ {\bf then} $A:=A-N$;\\
\>{\bf return} $A$;\\
{\bf end}
\end{tabbing}
\caption{Algorithm for the computation of $A=X Y \beta^{-m}\bmod N$.}
\end{figure}
The reason why the above algorithm works is explained as follows.
The algorithm computes the answer $A$ incrementally.
At each step, $x_k$ is used to do the computation.
The computation is similar to the first algorithm.
That is, compute $A=A + x_k Y + q N$ and then $A=A/\beta$.
The rationale behind this computation is to find a proper value of $q$ so that,
at the $k$-th iteration,
the value of $A + x_k Y + q N$ is a multiple of $\beta$.
and that this value is bounded by $(R+\beta^k)N$.
As explained above, the value of $q$ is
$$q=(A + x_k Y)N'\bmod \beta,$$
where $N'$ is the inverse of $N$ in $Z_{\beta^m}$.
Therefore,
$$q=(A + x_k Y)N'\bmod \beta=(a_0+x_k y_0)(\beta-n_0)_\beta^{-1}\bmod \beta,$$
In the above equation, we use the fact that
$$R R^{-1}-N N'\equiv 1\pmod{\beta^m}.$$
Thus, $-N N'\bmod\beta=1$, which implies that
$$N'\bmod\beta=(-n)_\beta^{-1}=(\beta-n_0)_\beta^{-1}.$$
Suppose that $\beta=2^k$ for some positive integer $k$,
and that the value of $n_0=\beta-1$.
Then the value of $(\beta-n_0)_\beta^{-1}=1$.
In this case, $q=(a_0+x_k y_0)\bmod\beta$, which is the value of the
last digit of $A + x_k Y$.
Therefore, we can save the computation of the value of $q$ at each
iteration.
\begin{thebibliography}{1}
\bibitem{BDK98}
Jean-Claude Bajard, Laurent-Stephane Didier, and Peter Komerup.
\newblock An {RNS} {Montgomery} modular multiplication algorithm.
\newblock {\em IEEE Transaction on Computers}, 47(7):766--776, July 1998.
\bibitem{Mpl85}
Peter~L. Montgomery.
\newblock Modular multiplication without trial division.
\newblock {\em Mathematics of Computation}, 44(170):519--521, April 1985.
\end{thebibliography}
\end{document}