Categories
mathematics Number Theory

Number Theory Primer : Proving Properties of Natural Numbers Using Proof by Induction

authored by Premmi and Beguène

Previous Topic: An Axiomatic Study of Natural Numbers – Peano’s Axioms

Introduction

Building upon the foundational principles of Peano’s Axioms, we now turn our attention to a critical tool in number theory: proof by induction. How can we rigorously establish that a property holds true for all natural numbers? The answer lies in this powerful technique of proof by induction, a cornerstone for proving properties within the natural number system. We will explore both the set-based and predicate-based approaches to proof by induction, providing a clear understanding of its mechanics. We will then illustrate its use by proving a key lemma directly derived from Peano’s Axioms. Finally, we’ll demonstrate how this proof enables a rigorous definition of addition, laying the groundwork for further exploration of natural number arithmetic.

Proof by Induction

Suppose we want to prove that the natural numbers satisfy some property. Since a property is expressed as a predicate, we can prove that the natural numbers satisfy some property by showing that the predicate P(n) representing this property is true for all n \in \mathbb{N}_0.

Set-Based Induction

The formal way to proceed would be to first define the set T to be those natural numbers which satisfy P(n) and then use the Axiom of Induction to verify whether T = \mathbb{N}_0.

The set T defined using set-builder notation is of the form

T = \big\{n \in \mathbb{N}_0 \,|\, P(n)\big\}

where P(n) is a predicate known to be true or false for each n \in \mathbb{N}_0. We then

  1. show that 0 \in T,

  2. show that n \in T \implies S(n) \in T \text{ for all } n \in \mathbb{N}_0 and finally,

  3. invoke the Axiom of Induction to deduce that T = \mathbb{N}_0.

This method of proving that natural numbers satisfy some property using the Axiom of Induction is called Proof by Induction.

Predicate-Based Induction

The less cumbersome, but just as valid, way of proving the same would be to use the Principle of Mathematical Induction and show that the predicate P(n) holds for all n \in \mathbb{N}_0. We do this by

(i) showing that P(0) is true,


(ii) showing that if P(n) is true then P(S(n)) is true \text{for all } n \in \mathbb{N}_0,


(iii) invoking the Principle of Mathematical Induction to deduce that P(n) is true \text{for all } n \in \mathbb{N}_0.

In practice, the Principle of Mathematical Induction (\text{Axiom } 9) may not be invoked explicitly. The proof may be phrased entirely in terms of a predicate P(n), and, when steps (i) and (ii) are established, the conclusion “P(n) \text{ is true for all } n" is said to be established “by induction”. This should always be interpreted as an implicit use of Axiom 9, which is referred to as the Principle of Mathematical Induction for this very reason. During the course of such a proof, the assumption that P(n) is true is called the induction hypothesis and the proof that P(n) \implies P(S(n)) is called the induction step.

In our subsequent discussions, where we will demonstrate proofs of various properties of natural numbers, we will utilize the set-based approach to induction, thus making the definition of set T explicit. That is, we will use the Axiom of Induction, as explained earlier, to prove that the set T consisting of the natural numbers satisfying some property is the set of all natural numbers, \mathbb{N}_0; this proves that every natural number has the said property.

Example of Proof by Induction

Let us explain through an example how to use proof by induction to show that natural numbers have a certain property.

We have seen during our discussion of Peano’s Axioms that a natural number is either 0 or a successor of some other natural number. We will now use proof by induction to re-establish this fact, using the Axiom of Induction as our foundation. This lemma will be useful when we define the operations of addition and multiplication from the Peano Axioms.

Lemma

If n \in \mathbb{N}_0 and n \neq 0 then there exists a unique m \in \mathbb{N}_0 such that n = S(m).

Proof. First we will prove for any n \in \mathbb{N}_0 where n \neq 0, the existence of m \in \mathbb{N}_0 such that n = S(m) and then we will prove that such an m is unique.

  1. Existence:

    Let T be the subset of \mathbb{N}_0 defined as T = \{n \in \mathbb{N}_0 \,|\, n = 0 \text{ or } n = S(m) \text{ for some } m \in \mathbb{N}_0\}.

    We will first prove that T = \mathbb{N}_0 using the Axiom of Induction, which will imply the existence part of the lemma.

    • Base Case \textbf{(}\mathbf{0 \in T}\textbf{):} By the definition of T, 0 \in T.

    • Inductive Step:

      • Inductive Hypothesis: Let l \in T.

      • To prove: We need to show that S(l) \in T.

      • Argument:

        • Since T \subset \mathbb{N}_0 and l \in T, l \in \mathbb{N}_0. Therefore, by the definition of the successor function (\text{Peano's Axiom 6}), \, l \text{'s successor } S(l) is also a natural number.

        • By the definition of the set T, any natural number that is the successor of a natural number is in T.

        • Since S(l) is a successor of l, therefore, S(l) \in T.

    • Conclusion by Induction: By the Axiom of Induction, T = \mathbb{N}_0.

  2. Uniqueness

    • Assumption: Suppose there exist m_1, m_2 \in \mathbb{N}_0 such that n = S(m_1) \text{ and } n = S(m_2).

    • Deduction:
      • S(m_1) = S(m_2).

      • By Peano’s Axiom 8 (the successor function is injective), m_1 = m_2.

    • Conclusion: Therefore, m is unique.


Application of the Lemma: Defining Addition Step-by-Step

Suppose we want to define the addition of two natural numbers, say l, n \in \mathbb{N}_0. The lemma we just proved states that for any n \in \mathbb{N}_0, either n = 0 or n = S(m) for some m \in \mathbb{N}_0. This lemma is important because it allows us to define addition by considering these two cases for the second number, n.

Specifically, since every natural number is either 0 or a successor of another natural number, defining addition for the following two cases ensures that the operation is completely defined for all pairs of natural numbers l \text{ and } n:

  • Case \mathbf{1}: Adding Zero (Base Case)
    • l + 0

  • Case \mathbf{2}: Adding a Successor
    • l + S(m)

The lemma’s guarantee that m is unique when n = S(m) ensures that our definition of l + S(m) will be unambiguous and consistent. The uniqueness of m guarantees that for any given n, there is only one possible value for l + n or equivalently, l + S(m), thus ensuring that addition is unambiguous. Furthermore, this uniqueness ensures that the rules we establish for addition do not lead to contradictions or conflicting results in any application of addition, maintaining the consistency of our definition. By defining these two cases, we establish a step-by-step definition that encompasses all possible additions of natural numbers. The details of how these additions are actually defined will be our focus in the next section.

This approach demonstrates how a fundamental lemma about the structure of natural numbers allows us to rigorously and completely define more complex operations such as addition.

Building Arithmetic and Number Systems from Peano’s Axioms

With these Peano Axioms as a base, we will in the next section rigorously develop all the usual properties of arithmetic and in the subsequent sections build up the other number systems including integer, rational, real and complex numbers.

To be continued…

Categories
mathematics Number Theory

Number Theory Primer : An Axiomatic Study Of Natural Numbers – Peano’s Axioms

authored by Premmi and Beguène

Previous Topic: An Axiomatic Study of Numbers

Introduction

Thinking of numbers intuitively brings to mind the simplest and most fundamental set of numbers, namely the set of natural numbers. These are the numbers we first encounter when learning to count discrete objects like cars, books, pens, etc. If we associate natural numbers such as 0, 1, 2, 3, etc., with this basic act of enumeration, then with what corresponding concepts do we relate numbers like -4, \sqrt{3} \text{ and } \frac{22}{7}?

To reason rigorously about the diverse kinds of numbers encountered in mathematics, we require a precise mathematical framework for their definition. We will construct this framework by first defining the natural numbers axiomatically within a chosen logical system, namely second-order logic, due to its greater expressive power. Unlike first-order logic, which is limited to making statements about individual numbers, second-order logic allows us to make more general statements by also enabling quantification over sets of numbers and the properties that these numbers can have. This richer expressive capability of second-order logic is essential for formulating a comprehensive axiom of induction (which we will state precisely in a subsequent section). This axiom plays a key role in characterizing the set of natural numbers within the Peano axiomatic system. Consequently, the use of second-order logic provides a more direct way to capture the fundamental nature of the natural numbers through our axiomatic system.

This axiomatic system of natural numbers, constructed by assuming some fundamental statements about natural numbers to be true without proof, serves as a foundation from which all other properties and theorems about natural numbers can be logically deduced. By adopting this axiomatic approach, independent of the intuitive notion of counting, we aim to formulate the natural numbers based on a minimal set of core assumptions. This rigorously defined foundation will then equip us to construct the subsequent number systems: integers, rational numbers, real numbers, and complex numbers, all derived from these foundational natural numbers.

The hallmark of a powerful axiomatic system is its ability to assume a minimal set of fundamental axioms while enabling the derivation of a maximal body of knowledge. To create an efficient axiomatization for the natural numbers, we must distill these numbers to their most essential properties. Intuitively, we understand various aspects of the natural numbers, such as their existence and the basic properties of the binary operations addition, multiplication, and the ‘less than’ order relation. How few of these concepts can we take as axioms, from which we can deduce everything else that we need to know about the natural numbers? It turns out that remarkably little is required for the axiomatization of the natural numbers—neither addition, nor multiplication, nor the ‘less than’ relation need to be taken as axioms; these will all be constructed from our fundamental axioms.

The standard axiomatization of the natural numbers, known as the Peano Axioms, was originally formulated by the Italian mathematician, Giuseppe Peano. In 1889, Peano’s seminal work, “Arithmetices principia, nova methodo exposita” (The principles of arithmetic, presented by a new method) laid the groundwork for the rigorous development of the real number system based on a set of axioms for the natural numbers. His presentation featured nine axioms, four of which established the properties of the equality relation “=” with regard to natural numbers and the remaining five axioms provided a complete and rigorous definition of the natural numbers.

To truly appreciate the intellectual feat of Peano, it’s worth noting that utilizing only his fundamental axioms we gain the deductive power to define and prove the operations on natural numbers together with all their established properties. Furthermore, these axioms provide the essential scaffolding that facilitates the rigorous construction of the integer, rational, real, and complex number systems.

Before we discuss Peano’s Axioms in detail, it is a useful exercise to explore an alternative way to describe natural numbers, distinct from the usual intuitive notion of counting. Such an investigation would help us to independently arrive at Peano’s axioms.

Intuition behind the Axiomatic Definition of Natural Numbers

How do we model the concept of natural numbers denoted by 0, 1, 2, 3 and so on without relying on the notion of counting?

The way we count, starting from 0 and progressing sequentially to 1, then 2, and so on, reveals the core characteristic of natural numbers: a sequential structure where each subsequent number follows uniquely from the previous, beginning with 0. This is what we aim to formalize in our axiomatic definition.

Recognizing that the natural numbers constitute a collection with a sequential structure, which mathematical object is best suited to represent such a collection?

A set, with its capacity to contain elements and have relationships defined among them, provides the ideal structure for representing such a collection. Therefore, we will adopt a set-theoretic approach to define the natural numbers. This involves systematically enumerating the fundamental properties the set of natural numbers must have and meticulously describing the various relationships among its members, thereby, leading to the set \{0, 1, 2, 3, \ldots\}.

Having articulated our approach, we now proceed with the concrete steps of defining the set of natural numbers.

Let us denote the set of natural numbers as \mathcal{N}. First, we recognize that 0 should be a part of this set. Next, we want 0 to lead us to 1, 1 to 2 and we should be able to continue this way, naming each successive number as far as we wish. This is illustrated below.

0 \rightarrow1 \rightarrow 2 \rightarrow 3 \rightarrow \ldots

From the above diagram we can see that to model this relationship we need a “next” operation that given a natural number, produces the next natural number in the sequence.

The above diagram can also be viewed as shown below.

\begin{equation*} 
\begin{split}
0 &\rightarrow 1 \\
1 &\rightarrow 2 \\
2 &\rightarrow 3 \\
\vdots &\quad\,\,\,\, \vdots \\
\end{split}
\end{equation*} 

We can see from the diagram that an input of 0 maps to an output of 1, an input of 1 maps to 2 and so on. Therefore, we can model the “next” operation as a function S : \mathcal{N} \rightarrow \mathcal{N} that takes a natural number as input and produces a natural number as output. Here, the letter S stands for ‘successor’ and we have S(0) = 1, S(1) = 2 and so forth. We will refer to this function S as the successor function since it establishes a succession within the set of natural numbers, \mathcal{N}. This function S is illustrated below.

From the diagram above, we can observe that the successor function S has the following properties, which intuitively align with our understanding of counting:

  1. Not Surjective: There is no natural number in \mathcal{N} that, when given as input to the successor function S, results in 0 as output. This implies that not every element of the codomain of S is the image of at least one element from its domain. Therefore, we can conclude that S is not surjective because S(n) \neq 0 \text{ for any } n \in \mathcal{N}.

    Intuitively, this makes sense because we start counting from 0; therefore0 does not follow any other natural number in the standard counting sequence.

  2. Injective: Different inputs to S yield different outputs. This means that every element of the codomain of S is the image of at most one element from its domain; that is, S(m) = S(n) \text{ implies that } m = n \text{ for any } n, m \in \mathcal{N}. Hence, S is injective.

    This reflects the idea that when we count each number has a unique ‘next’ number; 1 follows 0 and no other number immediately follows 0 other than 1. Consequently, this ensures that the natural sequential order of the natural numbers is maintained where each number starting from 0 is followed by a unique number that is different from all the previous numbers, thus preventing the sequence from branching, merging or forming cycles in unexpected ways.

From the diagram, we can also see that a natural number is either 0 or can be obtained from 0 by applying the successor function to 0 a finite number of times. This highlights the fundamental way we generate all natural numbers; we start at 0 and keep going to the ‘next’ number. This process should give us all the natural numbers and nothing else. This implies that \mathcal{N} is the minimal non-empty set that contains 0 and admits a successor function satisfying conditions (1) \text{ and } (2) i.e., S is not surjective and is injective.

Therefore, for the Peano Axioms to accurately describe the set of natural numbers, they must define a set that contains 0 and supports a successor function as described above.

So far we have developed an intuitive understanding of how to define natural numbers axiomatically. Now we will turn our attention to how Giuseppe Peano originally formulated these axioms and the historical backdrop against which he developed his axioms, a period marked by a growing emphasis on logical rigor and the desire to provide firm foundations for mathematics.

Original Formulation of Peano’s Axioms: A Historical Context

Before we delve into the modern version of Peano’s Axioms, it is interesting to understand the intellectual landscape that gave rise to them. The late 19th century was a pivotal era in mathematics, marked by a burgeoning quest for logical rigor. This quest drove mathematicians to formalize arithmetic, transcending the limitations of purely intuitive apprehension to erect a more logically sound edifice.

Hermann Grassmann’s work in the 1860s demonstrated that many familiar truths of arithmetic could be built upon a more elementary foundation consisting of the successor function (the idea that each natural number has a unique “next” number) and mathematical induction (a method of proving statements about all natural numbers by first showing it’s true for the starting number (usually 0), and then showing that if it’s true for any number, it must also be true for the next one). Following this groundbreaking work, Charles Sanders Peirce provided an early axiomatization of natural-number arithmetic in 1881, and Richard Dedekind proposed another influential axiomatization in his 1888 publication, “Was sind und was sollen die Zahlen?” (What are numbers and what are they good for?), notably including the number 1 and the successor function as fundamental concepts.

Building upon these preceding efforts, Giuseppe Peano published his own streamlined and influential collection of axioms for the natural numbers in 1889 in his book Arithmetices principia, nova methodo exposita (The principles of arithmetic, presented by a new method). While Peano’s work drew inspiration from Dedekind’s, Peano’s formulation is now the accepted standard for the axiomatic definition of natural numbers.

Two years later, in 1891, Peano further refined his axiomatic system of natural numbers in his article ‘Sul concetto di numero’ (‘On the Concept of Number’), published in Rivista di Matematica, a journal he himself founded that same year. In this influential article, Peano reduced his list of axioms from nine to five. This simplification arose from the recognition that the fundamental properties of the equality relation ‘=’ such as reflexivity, symmetry, transitivity, and substitutivity are often considered integral to the underlying logical framework within which mathematical reasoning unfolds and are not specific to the definition of the natural numbers themselves, thereby exemplifying the inherent drive towards elegance and minimality in mathematical formalization. Therefore, for a more focused set of axioms uniquely characterizing the natural numbers, Peano chose to omit these axioms of equality.

Peano’s endeavor to formalize the natural numbers, while seemingly straightforward today, was a significant step in addressing the limitations of informal mathematical reasoning. Without a rigorous axiomatic system (a set of fundamental statements, called axioms, that are assumed to be true and from which all other statements are logically derived), proofs could be predicated on intuition that might not always hold, especially when constructing more intricate mathematical structures.

Peano was quite deliberate in using specific symbols for logical connectives and quantifiers that were distinct from the mathematical symbols representing numbers, operations, and sets. This practice of maintaining a clear separation between mathematical and logical symbols was innovative for his time and built upon earlier work by figures like Gottlob Frege (though Peano was initially unaware of Frege’s work). This constituted a move towards a more formal and unambiguous language for mathematics. Before this, mathematical writing often commingled these concepts more informally within prose.

While perusing Peano’s Axioms it is worth keeping in mind that during Peano’s time, the concept of set was still nascent; it was Peano who introduced the symbol \in in 1889 to denote “is an element of”. Being aware of Peano’s original formulation of these axioms helps us appreciate the subsequent trajectory of mathematical development and underscores that our pursuit of more refined mathematical notation and more powerful abstraction remains ongoing. 😃

The culmination of these foundational inquiries, as articulated in Peano’s seminal 1891 article “Sul concetto di numero” comprises the following five axioms which can be found within the pages 387-408 of the book “Historia Mathematica 1”, published in the year 1974:

We can see how the archaic notation obfuscates these axioms.

The interpretations of these axioms as stated in the same book “Historia Mathematica 1” are as follows:

For improved readability, these five axioms are presented below using LaTeX notation, accompanied by interpretations that elucidate their formal meaning within the axiomatic system:

(1) \quad\hspace{0.6em} 1 \hspace{0.7em} \varepsilon \hspace{0.7em} \mathit{N}

Interpretations:

  • From “Historia Mathematica 1”: One is a number.

  • Formal Interpretation: The element ‘one’ belongs to the collection of natural numbers, denoted by \mathit{N}.

(2) \quad\hspace{0.2em} + \hspace{0.35em} \varepsilon \hspace{0.6em} \mathit{N} \setminus \!\mathit{N} \quad

Interpretations:

  • From “Historia Mathematica 1”: The sign + placed after a number produces a number.

  • Formal Interpretation: The successor function, denoted by the postfixed symbol ‘+\text{'}, when applied to an element of the collection of natural numbers (\mathit{N}), yields an element that is also a member of the collection of natural numbers (\mathit{N}).

(3) \quad\hspace{0.5em} a, \hspace{0.7em} b \hspace{0.8em} \varepsilon \hspace{0.7em} \mathit{N} \hspace{0.7em} . \hspace{0.7em} a+ \hspace{0.35em} = \hspace{0.35em} b+ \hspace {0.7em}: \hspace{0.35em} \supset \hspace{0.35em}.\hspace{0.7em} a \hspace{0.35em}=\hspace{0.35em} b \quad

Interpretations:

  • From “Historia Mathematica 1”: If a and b are two numbers, and if their successors are equal, then they are also equal.

  • Formal Interpretation: For any two elements a and b that belong to the collection of natural numbers (\mathit{N}), if the successor of a\, (a+) is equal to the successor of b\, (b+), then a must be equal to b.

(4) \quad\hspace{0.5em} 1 \hspace{0.2em}- \hspace{0.2em} \varepsilon \hspace{0.6em} \mathit{N}+ \quad

Interpretations:

  • From “Historia Mathematica 1”: One is not the successor of any number.

  • Formal Interpretation: The element ‘one’ is not in the collection of successors of the elements in the collection of natural numbers (\mathit{N}).

(5) \quad\hspace{0.5em} s \hspace{0.5em} \varepsilon \hspace{0.5em} \mathit{K} \hspace{0.7em} . \hspace{0.7em} 1 \hspace{0.5em} \varepsilon \hspace{0.5em} s \hspace{0.7em} . \hspace{0.7em} s+ \hspace{0.35em} \supset \hspace{0.35em} s \hspace {0.7em}: \hspace{0.35em} \supset \hspace{0.35em}.\hspace{0.7em} \mathit{N} \hspace{0.35em} \supset \hspace{0.35em} s \quad

Interpretations:

  • From “Historia Mathematica 1”: If s is a class containing one, and if the class made up of the successors of s is contained in s, then every number is contained in the class s.

  • Formal Interpretation: For any class s belonging to the domain of classes (\mathit{K}), if ‘one’ is an element of s, and if the collection of successors of the elements in s\, (s+) is contained in s (meaning every element in s+ is also in s), then the collection of natural numbers (\mathit{N}) is contained in the class s (meaning every element in \mathit{N} is also in s).

Since axiom (5), as stated, can seem a bit convoluted, let’s break down its key components to facilitate understanding.

  1. s is a class belonging to the domain of classes (\mathit{K}) (s \,\varepsilon\, \mathit{K}): In Peano’s time, the distinction between “set” and “class” was not as rigorously defined as it became later with the development of axiomatic set theory (like ZFC). However, in this context, “class” generally refers to a collection of mathematical objects that share a certain property. We can think of it as being very similar to the modern idea of a set.

    In summary, “s \,\varepsilon\, \mathit{K}" tells us that s is a collection (a class, similar to a set) of entities within Peano’s mathematical framework, and \mathit{K} is the universe of all such collections he was considering. So, this part just establishes that “s" is a valid collection within Peano’s system. In the context of axiom (5),\,s is a class that we are examining to see if it contains all the natural numbers.

  2. s is a class containing one (1 \,\varepsilon\, s): We begin with the knowledge that the number 1 possesses the property defined by the class s.

  3. The class made up of the successors of s\, (s+ = \{x\!+ |\, x \in s\}) is contained in s\, (s+ \subset s): This part states that if we take all the elements currently in the class s and consider their successors, those successors are also guaranteed to be within the class s.

    Let’s trace this informally:

    \star\, We start with 1 \in s.

    \star\, Since 1 \in s, its successor 1+ = 2 must be an element of s+ (the class of successors of s)

    \star\, Because s+ \subset s, it follows that 2 is also an element of s.

    \star\, Now, since 2 \in s, its successor 2+ = 3 must be an element of s+.

    \star\, Again, because s+ \subset s, it follows that 3 is also an element of s.

    \star\, This chain of reasoning continues indefinitely. If any natural number x is in s, its successor x+ will be in s+, and therefore also in s due to the containment s+ \subset s.

  4. Then every number is contained in the class s\, (\mathit{N} \subset s): This is the conclusion. If the class s contains 1 \hspace{2em}and has the property that the successor of any element in s is also in s, then the class s must contain all the natural numbers.

This axiom might appear somewhat complex, but it plays a crucial role in defining the collection of natural numbers, \mathit{N}. While the first four axioms guarantee that \mathit{N} contains at least the standard natural numbers \{1, 2, 3, \ldots\} (generated by starting with 1 and repeatedly applying the successor function), these axioms alone do not preclude the existence of additional ‘extra’ elements beyond this standard sequence, a possibility we will explore further when discussing Peano’s fifth axiom.. It is the power of axiom (5) that excludes such a possibility. This axiom is key to ensuring that the collection of natural numbers \mathit{N} contains only the standard sequence generated by starting with 1 and repeatedly applying the successor function, i.e., \mathit{N} = \{1, 2, 3, \ldots\}. By requiring that any class s containing 1 and closed under the successor operation (i.e., if x \in s, then x+ \in s) must contain all natural numbers, axiom (5) effectively excludes any ‘extra’ elements that are not reachable through this fundamental iterative process. This is because the class s = \{1, 2, 3, \ldots\} inherently contains 1 and is closed under the successor operation, thus always satisfying the axiom’s premise. This leads to the axiom’s conclusion that \mathit{N} must be contained in s, which is only possible when \mathit{N} = \{1, 2, 3, \ldots\}.

In the Peano Axioms published in 1889 \text{ and } 1891, the sequence of natural numbers began with 1, and the class of natural numbers was denoted by \mathit{N}. However, in 1898 these axioms were modified so that the sequence began with 0 and the class was denoted by \mathit{N_{\!0}}.

Peano’s decision to include 0 as the starting natural number likely stemmed from a growing recognition of its utility in emerging areas of mathematics, such as the elegant representation of the empty set’s cardinality as 0 within set theory, and in providing a more consistent basis for arithmetic and logical systems he was developing. Although specific examples of instances that directly motivated this shift might not be explicitly documented in readily available historical texts, this seemingly small change had a significant impact. It ultimately simplified and unified concepts across various mathematical fields, including algebra, leading to its widespread adoption as the standard.

The set of five Peano Axioms was increased to six in 1901 with the addition of the axiom – \mathit{N_{\!0}} \in \text{Cls}, i.e., the natural numbers form a class. With the addition of this last axiom, the axioms have received their final form, which are listed below.

The final form of the six axioms of Peano presented in LaTeX along with their interpretations are as follows:

(0) \quad\hspace{0.6em} \mathit{N_0} \hspace{0.7em} \varepsilon \hspace{0.7em} \textit{Cls} \quad

Formal Interpretation: The natural numbers form a class.

(1) \quad\hspace{0.6em} \mathit{0} \hspace{0.7em} \varepsilon \hspace{0.7em} \mathit{N_0} \quad

Formal Interpretation: Zero belongs to the class of natural numbers.

(2) \quad\hspace{0.5em} a \hspace{0.8em} \varepsilon \hspace{0.7em} \mathit{N_0} \hspace{0.9em} . \hspace{0.35em} \supset \hspace{0.35em} . \hspace{0.9em} a+ \hspace {0.5em}\varepsilon \hspace{0.7em} \mathit{N_0} \quad

Formal Interpretation: If a belongs to the class of natural numbers, then its successor (a+) also belongs to the class of natural numbers.

(3) \quad\hspace{0.5em} s \hspace{0.5em} \varepsilon \hspace{0.5em} \textit{Cls} \hspace{0.7em} . \hspace{0.7em} \mathit{0} \hspace{0.7em} \varepsilon \hspace{0.7em} s \hspace{0.7em} : \hspace{0.7em} x \hspace{0.7em} \varepsilon \hspace{0.7em} s \hspace {0.7em}. \hspace{0.35em} \supset_{\normalsize \,x} \hspace{0.35em}.\hspace{0.7em} x+\!\! \hspace{0.7em} \varepsilon \hspace{0.7em} s \hspace {0.7em}: \hspace{0.35em} \supset \hspace{0.35em}.\hspace{0.7em}\mathit{N_0} \hspace{0.35em} \supset \hspace{0.35em} s \quad

Formal Interpretation: If s is a class containing zero, and if for every element x in s, its successor (x+) is also in s, then the class of natural numbers \mathit{N_0}​ is contained in the class s (meaning every element in \mathit{N_0} is also in s).

(4) \quad\hspace{0.5em} a, \hspace{0.7em} b \hspace{0.8em} \varepsilon \hspace{0.7em} \mathit{N_0} \hspace{0.7em} . \hspace{0.7em} a \hspace{0.35em}+ \hspace{0.35em} 1 \hspace{0.35em} = \hspace{0.35em} b \hspace{0.35em} + 1 \hspace {0.7em}. \hspace{0.35em} \supset \hspace{0.35em}.\hspace{0.7em} a \hspace{0.35em}=\hspace{0.35em} b \quad

Formal Interpretation: If a and b belong to the class of natural numbers and the successor of a\, (a + 1) is equal to the successor of b\, (b + 1), then a and b must be equal.

(5) \quad\hspace{0.5em} a \hspace{0.8em} \varepsilon \hspace{0.7em} \mathit{N_0} \hspace{0.9em} . \hspace{0.35em} \supset \hspace{0.35em} . \hspace{0.9em} a \hspace {0.5em} + \hspace{0.5em} 1 \hspace{0.5em} - \hspace{0.5em} = \hspace{0.5em} \mathit{0}\quad

Formal Interpretation: If a belongs to the class of natural numbers then its successor a + 1 is not equal to 0 (meaning zero is not the successor of any natural number).

We can see that the key changes between the original five axioms and the final six are as follows:

  • the explicit statement that the natural numbers form a class which reflects a move towards greater formalization within Peano’s system

  • the inclusion of 0 as the starting natural number

  • consequently, the axiom 3 (originally listed as axiom 5) now begins with the condition that 0 belongs to the class s

With this historical backdrop in mind, let us now examine the axioms themselves and their role in the formal definition of natural numbers.

The Axiomatization of Natural Numbers

We will first define the notion of equality as it pertains to natural numbers, and then we will formulate the Peano Axioms such that it provides an axiomatic definition of natural numbers.

As discussed in the historical context section, the concept of equality extends beyond natural numbers to mathematical objects in general and pertains more to the realm of logic, which specifies the conditions under which two mathematical objects can be considered equal. This likely explains Peano’s decision to omit the formal axioms of equality from his later formulation concerning only the natural numbers.

Although Giuseppe Peano omitted the four axioms related to the equality relation from his later formulation of the axioms regarding natural numbers, we will still discuss these axioms for the sake of completeness.

In our discussion of Peano’s Axioms, we will adopt some conventions from its final form. Therefore, we will start with 0 instead of 1 in our axiomatic system and denote the set of natural numbers by \mathbb{N_0}, where the subscript 0 reminds us that 0 is included.

The Notion of Equality

Before we define the set of natural numbers \mathbb{N_0} axiomatically, we will formalize the notion of equality through the four axioms of Peano, which establish the properties that the equality relation, denoted by =, must satisfy.

Suppose there exists a set \mathbb{N_0} that satisfies Axioms 1 t0 9 listed below.

Firstly, every natural number should be equal to itself; this is called the reflexivity axiom.

Axiom \mathbf{1}. For every x \in \mathbb{N_0}, x = x.

Secondly, if one natural number equals a second one, then the second one should equal the first one. This is known as the symmetry axiom.

Axiom \mathbf{2}. For every x, y \in \mathbb{N_0}, \text{ if } x = y, \text{ then } y = x.

Thirdly, if one natural number is equal to a second, and that second natural number is equal to a third, then the first and third are equal to each other. This is called the transitivity axiom.

Axiom \mathbf{3}. For every x, y, z \in \mathbb{N_0}, \text{ if } x = y \text{ and } y = z, \text{ then } x = z.

These three properties—reflexivity (pertaining to a single object), symmetry (pertaining to two objects), and transitivity (pertaining to three objects)—are fundamental characteristics of the equality relation as it applies to any type of mathematical objects. As we have already discussed during our study of Set Theory, equality is an example of an equivalence relation, which is a type of homogeneous binary relation that satisfies these three properties.

Since the equality relation is defined generically for all mathematical objects and not just for natural numbers, we must make explicit the assumption that if two mathematical objects are equal (i.e., they satisfy the equality relation) and one of them is a natural number, then the other must also be a natural number. The next axiom makes this assumption explicit, namely, a natural number can only be equal to another natural number.

Fourthly, if any mathematical object is equal to a natural number, then that mathematical object is itself a natural number. This is called the closure of equality axiom.

Axiom \mathbf{4}. For all x \text{ and } y, \text{ if } x \in \mathbb{N_0} \text{ and } x = y, \text{ then } y \in \mathbb{N_0}.

That is, the set of natural numbers is closed under equality.

The Peano Axioms

We will now discuss the five main Peano axioms that define the natural numbers. Peano aimed to formulate these axioms such that the fewest possible axioms could generate all the natural numbers. Therefore, we will construct the Peano axioms by checking, after stating each axiom, whether the axioms stated thus far can unambiguously result in the set of natural numbers that we know of. That is, we will continue constructing the Peano axioms until these axioms, when taken together, incontrovertibly result in \mathbb{N}_0 = \{0, 1, 2, \ldots\}.

This method of constructing the Peano axioms leads to the insight that the entire set of natural numbers can be generated by asserting the existence of at least one natural number and then defining a function called the successor function. This function takes a natural number as input and outputs another natural number, resulting in the construction of all remaining natural numbers.

Let us now proceed with the construction of the Peano Axioms.

Since we start counting from 0, it is unsurprising that 0 is the most obvious element to axiomatically include in the set of natural numbers.

Fifthly, 0 is a natural number.

Axiom \mathbf{5}. 0 \in \mathbb{N_0}.

Thus far we are only guaranteed the existence of a single natural number, 0.

From 0 we should be able to generate the other natural numbers; that is starting with 0 we should be able to reach 1, from 1 reach 2 and so on, akin to counting. We can model this progression from one natural number to the next using a function that takes a natural number as input and produces another natural number as output. This function is called the successor function since it establishes a succession within the set of natural numbers and is written as S : \mathbb{N_0} \rightarrow \mathbb{N_0}. The next axiom simply states that there is a function S whose domain and codomain are the set of natural numbers, \mathbb{N}_0.

Sixthly, every natural number has a successor which is also a natural number.

Axiom \mathbf{6}. If x \in \mathbb{N}_0, then S(x) \in \mathbb{N}_0.

That is, the set of natural numbers, \mathbb{N}_0, is closed under the successor operation, S.

As this axiom implies, we will refer to S(x) as the successor of \mathit{x}.

Till now we have only established that the set of natural numbers contains 0 and its successor, S(0), where the function S takes natural numbers as input and outputs natural numbers. However, we are still quite far from having the set of natural numbers that we know of, since we could have \mathbb{N}_0 = \{0\} and define S(0) = 0, which would still satisfy all of the above axioms. In this case, \mathbb{N}_0 = \{0\}, but we want \mathbb{N}_0 = \{0, 1, 2, 3, \ldots\}.

To achieve this, we need to ensure that the successor function S does not output 0 for any natural number input, because if it did, it could lead to cycles or a finite set, thus, preventing us from generating the infinite sequence of distinct natural numbers we expect. Our next axiom will guarantee this by forbidding 0 from being the successor of any natural number, including itself.

Seventhly, 0 is not the successor of any natural number.

Axiom \mathbf{7}. For every natural number x \in \mathbb{N}_0, S(x) \neq 0.

That means that there is no natural number whose successor is 0. Consequently, the preimage of 0 under S defined on the set of natural numbers is an empty set.

As a consequence of this axiom, we know that S(0) \neq 0. Therefore, S(0) must equal some other natural number, which we can denote by 1. Hence, we can define 1 as S(0) = 1.

Based on axioms 5, 6 \text{ and } 7 we are guaranteed the existence of at least two natural numbers, 0 \text{ and } 1, but not necessarily others.

For example, we could define \mathbb{N}_0 = \{0, 1\}, where S(0) = 1 \text{ and } S(1) = 1. In this case, both natural numbers 0 \text{ and } 1 have the same successor, which is 1.

If S(0) = 1 \text{ and } S(1) = 1, then the two natural numbers 0 \text{ and } 1 have the same successor, 1. This set of natural numbers together with the successor function S defined on it would still satisfy all the above axioms. However, if we stop here, the axioms constructed thus far do not guarantee the existence of the rest of the natural numbers that we know of, namely, 2, 3, 4 \ldots.

Therefore, our next axiom should ensure that different natural numbers have different successors. This means that every natural number is the successor of at most one natural number (since 0 is not the successor of any natural number) which implies that S must be an injective function.

Eighthly, no two natural numbers have the same successor unless they are equal.

Axiom \mathbf{8}. For all x, y \in \mathbb{N}_0, if S(x) = S(y), then x = y.

This axiom leads to some important consequences. It excludes the possibility of defining \mathbb{N}_0 to be just \{0, 1\}. We already have S(0) = 1 and since S is an injective function, we cannot have S(1) = 1. Axiom 7 excludes the possibility that S(1) = 0. Thus S(1) must be some other natural number, which we denote as 2. Therefore, we can define 2 = S(1).

By a similar argument, S(2) cannot be 0, 1 \text{ or } 2. Hence, it must be some other natural number, which we denote as 3. Continuing this way, we see that \mathbb{N}_0 must contain all the natural numbers that we know of.

So far we have established that \mathbb{N}_0 must include 0, its successor 1 = S(0), its successor’s successor 2 = S(1) and so on. Thus \mathbb{N}_0 must include 0, S(0), S(S(0)), S(S(S(0))), \ldots. In order to avoid so many nested applications of S we use the numerals 1, 2, 3 to denote S(0), S(S(0)) \text{ and } S(S(S(0))), respectively.

These first eight axioms have resulted in the definition of \mathbb{N}_0 to include all the natural numbers that we know of.

Therefore, so far we only know that

\{0, 1, 2, \ldots\} \subset \mathbb{N}_0

At this point, it is interesting to ask whether our axiomatic definition of \mathbb{N}_0 precludes the inclusion of additional elements.

In order to answer this question, let us consider a version of \mathbb{N}_0 that satisfies all the above axioms but is not the usual set of natural numbers that we know of. That is,

\mathbb{N}_0 = \{0, 1, 2, 3, \ldots\} \cup \{!, -\}

As can be seen, this version of \mathbb{N}_0 contains all the natural numbers and also includes two other symbols, ! \text{ and } -.

We will next define the successor function on this set. For the subset \{0, 1, 2, 3, \ldots\} of \mathbb{N}_0, we define S as we have described above i.e., S(0) = 1, S(1) = 2, S(2) = 3 and so on. For the subset \{!, -\} of \mathbb{N}_0, we define S(!) = - \text{ and } S(-) = !.

This version of \mathbb{N}_0 with this successor function satisfies all the axioms, but it has more elements than what we want our set of natural numbers to have. This is shown below.

Similarly, there could be other versions of \mathbb{N}_0 with different successor functions, each of which satisfies all of the above axioms from 5 \text{ through } 8 but could also have elements other than natural numbers. This is illustrated below.

Based on axioms 5 \text{ to } 8, the set \mathbb{N}_0 of natural numbers satisfies the following conditions:

  • 0 \in \mathbb{N}_0; and

  • if x \in \mathbb{N}_0,\text{then } S(x) \in \mathbb{N}_0, where S(x) denotes the successor of x.

This way of defining a set, where a base clause specifies the basic element of the set and an inductive clause details how to generate additional elements, is called an inductive definition of the set, and such a set is referred to as an inductive set.

Therefore, the axioms 5 \text{ to } 8 only ensure that \{0, 1, 2, \ldots\} \subset \mathbb{N}_0, where \mathbb{N}_0 is any set such that 0 \in \mathbb{N}_0 and if x \in \mathbb{N}_0,\text{then } S(x) \in \mathbb{N}_0.

However, as discussed earlier and shown in the diagram above, this definition of \mathbb{N}_0 does not exclude elements other than natural numbers from being contained in the set. But we want only the natural numbers to be included in the set \mathbb{N}_0. To ensure this, the set \mathbb{N}_0 should be the minimal set that satisfies axioms 5 \text{ through } 8. This means that \mathbb{N}_0 is the intersection of all sets that satisfy these axioms. In other words, \mathbb{N}_0 contains only those elements common to all such sets.

Since axioms 5 \text{ to } 8 only ensure that \{0, 1, 2, \ldots\} \subset \mathbb{N}_0, in order to arrive at the set of natural numbers that we know of, axiom 9 should declare that \{0, 1, 2, \ldots\} = \mathbb{N}_0. But such a declaration would fail to capture the properties of the set \{0, 1, 2, \ldots\} as characterized by the axioms 5 \text{ to } 8. Therefore, axiom 9 \text{'s} description of this set should encompass its properties. The set \{0, 1, 2, \ldots\} \subset \mathbb{N}_0 can be unambiguously described as the subset of the set of natural numbers \mathbb{N}_0 that contains 0 and is closed under the successor function (meaning that if a natural number is in the subset, its successor is also in the subset). Therefore, axiom 9 should declare that any subset of \mathbb{N}_0, say T, that contains 0 and is closed under the successor function must equal the entire set of natural numbers, i.e., T = \mathbb{N}_0. The set \{0, 1, 2, \ldots\} always satisfies the conditions imposed on the set T and therefore, satisfies the premise of the axiom; but only when \mathbb{N}_0 = \{0, 1, 2, \ldots\} is the conclusion of the axiom, namely, T = \mathbb{N}_0, satisfied. Thus, axiom 9 effectively enforces the minimality of \mathbb{N}_0 and excludes any extraneous elements.

Let us peruse some examples to further illuminate the subtly of how axiom 9 achieves minimality.

Suppose we have \mathbb{N}_0 = \{0, 1, 2, 3, \ldots\} \cup \{!, -\} with the successor function as defined earlier. Then both T = \{0, 1, 2, 3, \ldots\} \cup \{!, -\} \text{ and } T = \{0, 1, 2, \ldots\} are subsets of \mathbb{N}_0. Both the sets contain 0 and are closed under the successor function. Therefore, both the sets satisfy the premise of axiom 9 but T = \{0, 1, 2, \ldots\} does not satisfy the conclusion of the axiom since T \neq \mathbb{N}_0. Therefore, \mathbb{N}_0 = \{0, 1, 2, 3, \ldots\} \cup \{!, -\} is not possible.

Through this example we have shown that if a larger set with extra elements other than the standard natural numbers were to be considered as \mathbb{N}_0​, then the subset T of standard naturals within \mathbb{N}_0 would satisfy the induction premise i.e., the premise of axiom 9 and T would be forced to be equal to this supposedly larger \mathbb{N}_0​, which is a contradiction.

Now suppose \mathbb{N}_0 = \{0, 1, 2, 3, \ldots\}. Then T = \{0, 1, 2, \ldots\} is the only set that satisfies the premise of axiom 9 and also its conclusion.

In essence, the axiom 9 defines the extent of the natural numbers. It says that the natural numbers are precisely those elements that can be reached by starting at 0 and repeatedly applying the successor function. Any set satisfying all five axioms cannot contain elements outside of this standard sequence.

This is illustrated below.

Axiom 9 uses the inductive definition of a set in its construction and is therefore referred to as the Axiom of Induction.

We will now state our ninth and final axiom. 😇

Axiom \mathbf{9} (Axiom of Induction). For any set T \subset \mathbb{N}_0, if:

  • 0 \in T; and if

  • x \in T \implies S(x) \in T for all x \in \mathbb{N}_0,

then T = \mathbb{N}_0.

As we have already discussed, the axioms 5 \text{ through } 8 ensure that \{0, 1, 2, \dots\} \subset \mathbb{N}_0.

We can see that T = \{0, 1, 2, \dots\} satisfies the premise of Axiom 9 since 0 \in T \text{ and } x \in T \implies S(x) \in T \text{ for all } x \in \mathbb{N}_0. Therefore, by the conclusion of Axiom 9, T = \mathbb{N}_0, which is only possible when \mathbb{N}_0 = \{0, 1, 2, \dots\}.

Thus we finally have the set of natural numbers that we know of, namely, \mathbb{N}_0 = \{0, 1, 2, \dots\}.

A Crucial Note on the Order of Logic

The Axiom of Induction as formulated above is a second-order axiom because it makes a statement about any subset T \text{ of } \mathbb{N}_0​. This quantification over all possible subsets is a hallmark of second-order logic and is key to the power of this axiom in uniquely characterizing the set of natural numbers by excluding any extraneous elements, as our illustration above has shown.

In contrast, first-order Peano Arithmetic, operates within first-order logic, where we are restricted to quantifying only over the natural numbers themselves and not over sets of natural numbers. This restriction is a consequence of the fundamental design of first-order logic, which allows quantification solely over individual elements of the domain, explicitly excluding quantification over sets of elements and thereby giving rise to desirable meta-logical properties like completeness, meaning that if a statement is logically valid, it can be proven within the system. Second-order logic, while being more expressive (like the ability to fully characterize \mathbb{N}_0​), typically lacks this property of completeness. We will delve into these meta-logical properties in more detail later.

To capture the idea of induction within this first-order framework, we use an axiom schema. This schema provides a separate axiom for each formula \phi(x) in the language of first-order arithmetic. Essentially, for every property of a natural number x that can be expressed by a first-order formula \phi(x), we have an axiom of the form:

(\phi(0) \land \forall x \in \mathbb{N}_0(\phi(x) \implies \phi(S(x)))) \implies \forall n \in \mathbb{N}_0 \, \phi(n)

Therefore, this schema provides an infinite number of axioms, one for each such property. While aiming for the same inductive power, the inductive principle in first-order Peano Arithmetic is limited to properties formally expressible within its language, which is insufficient to fully express all the properties of standard natural numbers due to the constraints of first-order logic. Consequently, axioms of first-order Peano Arithmetic do not guarantee that \mathbb{N}_0​​ is limited to the standard natural numbers \{0, 1, 2, \ldots\}, allowing for the possibility of additional elements that also satisfy all the axioms. The second-order formulation, by ranging over all possible subsets, avoids this limitation.

We will delve into the specifics of the language of first-order arithmetic and the implications of this axiom schema later in our discussion. For now, it’s important to note this key difference in how induction is formalized in first-order logic compared to the second-order formulation we are currently examining.

Alternate Formulations of Axiom of Induction: Set-Based and Predicate-Based Perspectives

For the sake of completeness, we will discuss two alternate yet equivalent ways to formulate Peano’s Ninth Axiom, namely the Axiom of Induction. Their equivalence stems from the fundamental and often intertwined relationship between sets and their characteristic predicates, a connection that underpins much of mathematical logic.

As a cornerstone of number theory, the Axiom of Induction, which we have already seen in its set-based form as Axiom 9 (where the conclusion implies a set T, initially a subset of \mathbb{N}_0​, must indeed be equal to \mathbb{N}_0​​), can also be expressed through the lens of properties or predicates. The set-based formulation can also be presented with the conclusion that \mathbb{N}_0​ \subset T, which, given the premise that T \subset \mathbb{N}_0​, leads to the same result as T = \mathbb{N}_0​. This leads to two primary, though ultimately equivalent, formulations: the set-based formulation and the predicate-based formulation. Each offers a distinct yet complementary perspective on the same underlying logical structure.

Set-Based Axiom of Induction: Focusing on Subsets

Axiom \mathbf{9} (Axiom of Induction). For any set T \subset \mathbb{N}_0, if:

  • 0 \in T; and if

  • x \in T \implies S(x) \in T for all x \in \mathbb{N}_0,

then \mathbb{N}_0 \subset T.

In simpler terms, if any subset of natural numbers contains 0 and is closed under the successor function (meaning that if a natural number is in the subset, its successor is also in the subset), then that subset must contain all natural numbers.

As discussed above, Axioms 5 \text{ through } 8 ensure that \{0, 1, 2, \ldots\} \subset \mathbb{N}_0. Additionally, we observe that the set \{0, 1, 2, \ldots\} satisfies the following two conditions of Axiom 9:

  • It contains 0; and

  • Whenever it contains an element x, it also contains its successor, namely, S(x).

Therefore, by Axiom 9, it follows that \mathbb{N}_0 \subset \{0, 1, 2, \ldots\}.

Since Axioms 5 \text{ through } 8 ensure that \{0, 1, 2, \ldots\} \subset \mathbb{N}_0 and by Axiom 9 we have show that \mathbb{N}_0 \subset \{0, 1, 2, \ldots\}, it follows that \mathbb{N}_0 = \{0, 1, 2, \ldots\}.

This perspective emphasizes the structure of the subset itself. We are concerned with the elements that are members of the subset and how they relate to each other through the successor function.

Predicate-Based Axiom of Induction: Focusing on Properties

Peano’s axioms five through eight collectively define a superset of natural numbers, specifically \{0, 1, 2, \ldots\} \subset \mathbb{N}_0. To ensure that this set \mathbb{N}_0 includes only natural numbers, Peano’s ninth axiom can also be formulated as the principle of mathematical induction over natural numbers. This formulation is equivalent to the axiom of induction and serves the same purpose of removing unwanted elements from the superset \mathbb{N}_0, ensuring that it contains only natural numbers.

The predicate-based form of the Axiom of Induction shifts the focus from subsets to properties (expressed as predicates) that natural numbers may or may not possess.

The reformulation of Axiom 9 in terms of predicates results in the Principle of Mathematical Induction which is stated as follows.

Axiom \mathbf{9} (Principle of Mathematical Induction). For any predicate P(n), where n is a natural number, if:

  • P(0) is true, and

  • for every natural number n, P(n) being true implies that P(S(n)) is also true,

then P(n) is true for every natural number n.

Here, we are concerned with a property (represented by the predicate P(n)) that natural numbers might possess. If 0 has the property, and if natural number n having the property implies that its successor S(n) also has that property, then all natural numbers must have that property.

This perspective emphasizes the properties of individual natural numbers. We are concerned with whether a given natural number has a specific property.

Equivalence and Connection

The set-based and predicate-based forms are logically equivalent, meaning they express the same fundamental principle. This equivalence is rooted in the relationship between subsets and predicates, as established by the Axiom of Separation.

  • From Subset to Predicate:
    • Given T \subset \mathbb{N}_0, it implies from the Axiom of Separation that there exists a predicate P(n) such that P(n) is true if and only if n \in T.
  • From Predicate to Subset:
    • Given a predicate P(n) and a set \mathbb{N}_0, we can define a subset T = \{n \in \mathbb{N}_0 \,|\, P(n) \text{ is true}\}, by the Axiom of Separation.

Using this correspondence, we can see how the two types of formulations of axiom 9 map onto each other:

  • \text{``}T \subset \mathbb{N}_0 \text{"} (Set Formulation – Considering a subset of natural numbers) is equivalent to \text{``}P(n), where n \in \mathbb{N}_0 \text{"} (Predicate Formulation – Considering a property of natural numbers).

  • \text{``}0 \in T \text{"} is equivalent to \text{``}P(0) \text{ is true"}.

  • \text{``}x \in T \implies S(x) \in T for all x \in \mathbb{N}_0 \text{"} is equivalent to \text{``}P(n) being true implies that P(S(n)) is also true for every natural number n\text{"}.

  • \text{``}\mathbb{N}_0 \subset T \text{"} is equivalent to \text{``}P(n) is true for every natural number n \text{"}.

    It should be noted that if \mathbb{N}_0 \subset T, then the Axiom of Separation guarantees the existence of a predicate P(n), where n \in T, such that P(n) is true if and only if n \in \mathbb{N}_0. Since n \in T \text{ and } T \subset \mathbb{N}_0, n is a natural number, i.e., n \in \mathbb{N}_0 is always true. Therefore, the statement becomes \text{``}P(n) is true for every natural number n \text{"}.

Thus, the subset and predicate perspectives are simply two ways of expressing the same fundamental idea. The set-based form emphasizes the elements within a collection, while the predicate-based form emphasizes the properties of individual elements. The Axiom of Separation is the bridge that allows us to move seamlessly between these perspectives.

Why Both Forms Are Useful

Both forms of the Axiom of Induction are valuable tools in mathematical proofs. The set-based form is often used in set theory and related areas, where the focus is on the set of natural numbers constructed based on some property, while the predicate-based form is commonly used in number theory and other branches of mathematics, where properties of numbers are the main focus. Essentially, they are two sides of the same coin, and which one to use depends on the context of the problem and the preference of the mathematician.

The predicate-based form of the Axiom of Induction has historical precedence over the set-based form. Principles resembling mathematical induction can be traced back to ancient Greek mathematicians like Euclid, who used methods akin to induction in some of his proofs, although not in a fully formalized way. Later, mathematicians like Blaise Pascal in the 17th century and Pierre de Fermat employed inductive-like arguments more explicitly. However, the first clear and rigorous formulation of the Principle of Mathematical Induction as we know it today is often attributed to Augustus De Morgan in the mid-19th century. His work explicitly stated the base case and the inductive step in a manner very similar to the predicate-based form we use now. The set-based formulation arose later with the formalization of set theory in the late 19th and early 20th centuries, becoming particularly prominent in the context of how the natural numbers are understood based on the fundamental axioms defined by Giuseppe Peano. His axiomatic system provided a characterization of these numbers that could be realized as a specific set within set theory, thus leading to the set-based formulation of induction.

The Mathematical Structure Defined by the Peano Axioms

The Peano axioms define a specific mathematical structure (\mathbb{N}_0, S, 0), where \mathbb{N}_0 is the set of natural numbers, S is the successor function, and 0 is a distinguished element. The axioms themselves specify the fundamental relations and properties that give this structure its unique characteristics.

Method of Definition of Natural Numbers using Peano’s Axioms

It should be noted that Peano’s Axioms only describe how to construct the set of natural numbers and do not define what natural numbers are intrinsically. The axioms essentially provide an inductive definition of the set \mathbb{N}_0​: 0 is the initial natural number, and for every natural number n, its successor S(n) is also a natural number. Every natural number is identified by its unique generation through the repeated application of the successor function starting from 0. For example, the natural number 2 is represented by S(S(0)) and is obtained by starting from 0 and then applying the successor function once and then again to 0. The fact that nothing else is a natural number beyond what can be generated in this way is ensured by the Axiom of Induction, which guarantees the minimality of the set \mathbb{N}_0​ satisfying these properties.

Existence of the Set of Natural Numbers satisfying Peano’s Axioms

How do we establish the existence of a set, an element of that set, and a function from the set to itself, that satisfy Peano’s Axioms? These axioms themselves are insufficient to prove this existence. Consequently, there are two approaches to resolving this matter.

One common approach in mathematics is to take something as axiomatic and then use it as the basis upon which we prove all our other results. Hence, such an approach requires us to be satisfied with taking the existence of a set satisfying Peano’s Axioms axiomatically. This axiom is called the existence axiom for natural numbers and guarantees that there exists a set with the properties that Peano’s axioms ascribe to it.

The statement of this axiom is as follows:

Existence Axiom for Natural Numbers: There exists a set \mathbb{N}_0 satisfying Axioms 1 \text{ through } 9.

Alternatively, if we use the Zermelo-Fraenkel Axioms as our foundation for set theory, we can prove that something satisfying Peano’s Axioms exists, so we don’t need to assume it separately. We’ll show how this is done later. This approach is often preferred as it seeks to build mathematics from a more minimal set of fundamental axioms. Nevertheless, studying Peano’s Axioms as an independent axiomatic system, including assuming the existence of a set satisfying Peano’s Axioms, allows us to explore the properties of this axiomatic system and its consequences in isolation from a specific foundational embedding.

Proving Properties of Natural Numbers

Having established the existence of the natural numbers and their fundamental properties as defined by Peano’s Axioms, we will next explore how to rigorously prove that the natural numbers satisfy a given property using the Axiom of Induction.

Next Topic: Proving Properties of Natural Numbers Using Proof by Induction

Categories
Logic mathematics

Formulas and Free Variables

authored by Premmi and Beguène

We will now briefly discuss the terms formula and free variables that we encountered during our discussion of the substitution axiom which is one of the two Axioms of Equality.

We can think of language as being comprised of a set of symbols or letters called the alphabet of the language, from which we form words by constructing sequences of letters taken from the alphabet.

Similarly, in mathematical logic, a formal language consists of words whose letters are taken from the set of formal symbols referred to as an alphabet and are well-formed according to a specific set of rules called formal grammar.

A symbol or string (sequence) of symbols may comprise a well-formed formula if it is consistent with the formation rules of the language. Formation rules describe which strings formed from the alphabet of a formal language are syntactically valid within that language. These rules address only the location and manipulation of strings; they do not describe anything else about a language, such as its semantics (i.e., what the strings mean).

Therefore, a well-formed formula, abbreviated WFF or wff, often referred to simply as a formula, is a finite sequence of symbols from a given alphabet that belongs to the formal language for which it makes sense to ask “Is this formula true?”, once any free variables in the formula have been instantiated. 

A free variable is a symbol that specifies places in a formula where the symbol will later be replaced by some value. It should be noted that free variables appear only in the first-order formulas. A first-order formula is a formula in first-order logic—a system of logic more expressive and hence more powerful than proportional logic—used to express statements about objects and their relationships in a structured way, which we will discuss in detail later.

Example of a First-Order Formula

The following is an example of a first-order formula.

\forall x \ \, \,  \exists y \ R(x, y)

This is interpreted in mathematical parlance as “for all x, there exists a y, such that x \text{ and} y are related by the relation R.” This formula isn’t true or false unless we assign values to x, y \text{ and the relation} R.

A first-order formula isn’t true or false on its own. Before we can get a truth value we have to give an interpretation.Turning a first-order formula into a statement that is either true or false is called giving an interpretation for the formula. 

Interpretation

An interpretation of a first-order formula consists of a set X, called the domain of the interpretation, along with a relation on X for each relation symbol in the formula.

Once we’ve given an interpretation, i.e., substituted any variables in the formula with values from the domain and any relation with that defined on the domain, we can evaluate whether the formula is true or false in that interpretation.

Here are some interpretations of the first-order formula:

\forall x \ \, \,  \exists y \ R(x, y)

In the ensuing discussion, let’s denote \mathbb{N} as the set of all natural numbers \{0, 1, 2, \dots\}.

First Interpretation:

  • Domain: \mathbb{N}
  • Relation: R is <

The interpreted formula becomes:

\forall x \in \mathbb{N} \ \, \, \exists y \in \mathbb{N} \ x < y

This interpreted formula is true. For every natural number x there does exist a natural number y with x < y, e.g. y could be the natural number x + 1.

Second Interpretation:

  • Domain: \mathbb{N}
  • Relation: R is >

The interpreted formula becomes:

\forall x \in \mathbb{N} \ \, \, \exists y \in \mathbb{N} \ x > y

In this case, the interpreted formula is false. It is not true that for every natural number x there exists a natural number y such that x > y. For example, x could be 0 in which case no natural number y satisfies x > y.

Categories
Logic mathematics

Equality

authored by Premmi and Beguène

Introduction

In mathematics, the relation “equals” is a foundational concept that can be applied to any two mathematical objects to indicate that these objects represent the same mathematical entity. Since the concept of equality pertains to all mathematical objects, it is universal across mathematics and transcends its specific branches, situating itself firmly within the realm of logic. The equality relation is denoted by the symbol “=”.

Examples of Equality Relation
  • For numbers:
    \hspace*{4cm} 4 + 3 = 7
  • For sets:
    \hspace*{4cm}\{1, 2, 3\} = \{3, 1, 2\}
  • For functions:
    \hspace*{4cm}f(x) = x^3 = g(x), where g(x) is also defined as x^3
  • For matrices:
    \hspace*{4cm}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I_2 \, (\text{the } 2 \times 2 \text{ identity matrix})
  • For geometric objects:
    \hspace*{4cm}\text{The set of all points } (x, y) \in \mathbb{R} \times \mathbb{R} \text{ such that } x² + y² = 1 \, (\text{a circle with radius } 1)

In general, we write a = b to denote that a \text{ and } b represent the same mathematical object. This notation is universal across all areas of mathematics, from basic arithmetic to abstract algebra.

Axioms of Equality

The Axioms of Equality refer to a set of foundational principles that govern the relationship of equality in mathematics. These axioms include two key properties that describe how equality operates among mathematical objects.

In logic, equality is described through the following axioms:

  • Law of Identity: This axiom states that any mathematical object is equal to itself, expressed as:
    \hspace{6cm}\forall a \,(a = a)
  • Substitution Axiom: This axiom states that if two mathematical objects are equal, then any property of one must also be a property of the other. This implies that one can be substituted for the other in any mathematical formula without changing the truth of that formula once any free variables in the formula have been instantiated. This axiom is also sometimes referred to as Leibniz’s law.

    In mathematical terms this law can be expressed as follows:

    For every a \text{ and } b and any formula \phi(x), where x is a free variable, if a = b, then \phi(a) \text{ implies } \phi(b).

    Symbolically, this can be represented as:
    \hspace{6cm}\forall a, b \,(a = b) \implies \big[\phi(a) \Rightarrow \phi(b)\big]

    For example, for all real numbers a \text{ and } b, if a = b, then a \geq 0 \text{ implies } b \geq 0 \, (\text{here } \phi(x) \text{ is } x \geq 0).

It should be noted that these two axioms do not define equality, they only state what properties that objects that are related by equality must satisfy. However, these two axioms are usually sufficient for deducing most properties of equality that mathematicians care about.

We can deduce from these two axioms some more properties of the equality relation. We will discuss these next.

Derivations of Properties of Equality Relation

  • Reflexivity of Equality: For any element a in a set S with a relation R induced by equality (xRy \Leftrightarrow x = y), it holds that \forall a \in S\, (aRa).

    That is, for any element a in the set S, the relation R asserts that a is related to itself, which is equivalent to saying that a is equal to itself.

    Proof. Given some set S with a relation R induced by equality, let us assume that a \in S. Then, by the Law of Identity, a = a and consequently, aRa.

  • Symmetry of Equality: For any elements a \text{ and } b in a set S with a relation R induced by equality (xRy \Leftrightarrow x = y), it holds that \forall a, b \in S\, (aRb \implies bRa).

    That is, for any elements a \text{ and } b in the set S, the relation R asserts that if a is equal to b (i.e., aRb), then it follows that b is equal to a (i.e., bRa).

    Proof. Given some set S with a relation R induced by equality, let us assume that there are elements a, b \in S such that aRb. Let us consider the formula \phi(x) : xRa.

    By the Substitution Axiom, we have:
    \hspace{6cm}(a = b) \implies \big[\phi(a) \Rightarrow \phi(b)\big]
    Thus, we obtain:
    \hspace{6cm}(a = b) \implies (aRa \Rightarrow bRa)

    Since by assumption, a = b and by Reflexivity, aRa, it follows that bRa.

    Therefore, we have shown that if aRb, then bRa, demonstrating the symmetry of equality.

  • Transitivity of Equality: For any elements a, b \text{ and } c in a set S with a relation R induced by equality (xRy \Leftrightarrow x = y), it holds that \forall a, b, c \in S\, \big[(aRb \land bRc) \implies aRc\big].

    That is, for any elements a, b \text{ and } c in the set S, the relation R asserts that if a is equal to b (i.e., aRb) and b is equal to c (i.e., bRc), then it follows that a is equal to c (i.e., aRc).

    Proof. Given some set S with a relation R induced by equality, let us assume that there are elements a, b, c \in S such that aRb \text{ and } bRc. Then let us consider the formula \phi(x) : xRc.

    By the Substitution Axiom,
    \hspace{5cm}(b = a) \implies (bRc \Rightarrow aRc).

    Since by assumption, a = b, it follows from Symmetry that b = a. Additionally, since by assumption bRc, it follows that aRc.

    Therefore, we have shown that if aRb \text{ and } bRc, then aRc, demonstrating the transitivity of equality.

  • Function Application: For any elements a \text{ and } b from the domain of a function f it holds that if a = b then it implies that f(a) = f(b).

    Proof. Given a function f, let us assume that there are elements a \text{ and } b from its domain such that a = b. We will consider the formula \phi(x) : f(a) = f(x).

    By the Substitution Axiom, we have:
    \hspace{6cm}(a = b) \implies \big[(f(a) = f(a)) \Rightarrow (f(a) = f(b))\big].

    Since, by assumption a = b and by the Law of Identity f(a) = f(a), it follows that f(a) = f(b).

    Therefore, we have shown that for any elements a \text{ and } b from the domain of a function f, if a = b then it must be the case that f(a) = f(b).

These properties are sometimes included in the axioms of equality, but it is not necessary to include them since they can be derived from the two axioms of equality as shown above.

Categories
mathematics Number Theory

Number Theory Primer : An Axiomatic Study Of Numbers

authored by Premmi and Beguène

Introduction

Number theory involves an extensive study of the properties of integers. Before embarking on this study, it is only natural to ask the question: “What is a number?” since integers are just one type of number.

Intuitively, we often associate numbers with counting objects like books and pencils or measuring quantities such as height, weight and distance.

Counting generally begins with 0, which represents the absence of an object, and then proceeds to 1, 2, 3, \ldots, which are the numbers used to count objects. For example, we denote that we don’t have any books by saying “0 books”, we have one book by saying “1 book”, we have two books by saying “2 books”, and so forth.

Thus far, we have denoted the numbers used in counting as 0, 1, 2, 3 and so on. These numbers, called natural numbers, form a highly non-trivial set because we cannot list every single one of them since they go on forever. Therefore, we would need a different approach to describe them.

There is also another problem with such a concrete representation of numbers. While we can think of numbers like 1, 2 \text{ and } 3 as representing the count of certain objects, how do we reason about numbers like -3, \frac{4}{3} \text{ and } \sqrt{2} ?

To reason about the variety of numbers encountered in mathematics, we need a unified framework that describes numbers abstractly, as mathematical objects satisfying specific properties and having certain operations defined on them, regardless of their concrete representations. Such a framework is known as the axiomatic representation of numbers.

For example, under this framework natural numbers could have various concrete representations like \{0, 1, 2, 3, \ldots \}, \{O, I, II, III, IV, \ldots \} or \{\text{One, Two, Three,} \ldots \} and yet, all these representations would be considered equivalent if they all have the same properties and operations defined on them.

Our axiomatic study of numbers will start with a small list of axioms (an initial set of true statements that cannot be proven) pertaining to natural numbers. These axioms are based on Peano’s Axioms, which were formulated by the Italian mathematician Giuseppe Peano in 1889. A natural number is defined as any mathematical object that satisfies these axioms.

Next, with these axioms as foundation, we will define arithmetic operations such as addition and multiplication on natural numbers and prove all the properties resulting from these operations, for instance, commutativity and associativity, among others.

We will conclude our discussion on natural numbers by introducing the notion of ordering on them.

With the axiomatic definition of natural numbers as our foundation, we will proceed to define integers, rational numbers and finally, real numbers. Similar to our discussion on natural numbers, we will also define operations for each of these number systems.

By the end of our discussion, we will develop a deep appreciation for how the entire edifice of numbers, starting with natural numbers as a base and leading up to other number systems including real and complex numbers, can be constructed from just five axioms pertaining to natural numbers.

Let us now proceed with the axiomatic definition of natural numbers.

Next Topic: An Axiomatic Study of Natural Numbers – Peano’s Axioms

Categories
mathematics Number Theory

Number Theory Primer : The Diophantine Equation ax + by = c

authored by Premmi and Beguène

Previous TopicThe Euclidean Algorithm

Introduction

A Diophantine equation is any equation in one or more unknowns that is to be solved in the integers.

The simplest type of Diophantine equation that we shall consider is the linear Diophantine equation in two unknowns:

ax + by = c

where a, b, c are given integers and a, b are not both zero.

A solution of this equation is a pair of integers x_0, y_0 that, when substituted into the equation, satisfy it; that is, we ask that ax_0 + by_0 = c.

Diophantine equations frequently arise when solving certain types of problems, for example, while solving linear congruences, which we will encounter later during our study of modular arithmetic.

Motivation

Let us motivate a discussion on Diophantine equations through a problem posed by Euler in 1770.

“Divide 100 into two summands such that one is divisible by 7 and the other by 11.”

Framing this problem in terms of an equation leads us to the following Diophantine equation,

7x + 11y = 100

where 7x is divisible by 7 and 11y by 11 and consequently, x \text{ and } y are positive integers.

In order to find a solution to this problem we need to know whether this equation is solvable. Also, if the equation has a solution, we would like to how many solutions the equation has and what they are.

We will next discuss how to arrive at answers to these questions.

Intuition

The equation 7x + 11y = 100 is a specific instance of the Diophantine equation of the form ax + by = c, wherea = 7, b = 11 \text{ and } c = 100.

As usual, we will find a solution to this problem by relating it to something we already know. We have already encountered an equation of this form during our study of the greatest common divisor of two integers. We have seen that given two integers a \text{ and } b not both of which are zero, there exist integers x^\prime \text{ and } y^\prime such that ax^\prime + by^\prime = d, where d = \text{gcd}(a, b).

Suppose c is some multiple of d i.e., c = dr, where r is an integer.

If we multiply the equation ax^\prime + by^\prime = d with r, we get

arx^\prime + bry^\prime = dr

Since c = dr,

a(rx^\prime) + b(ry^\prime) = c

We can see that x = rx^\prime \text{ and } y = ry^\prime is a solution of the equation ax + by = c.

Therefore, we can see that a solution to the equation exists whenever d \,|\, c.

We will formalize this observation as a theorem and also derive all possible solutions of this equation, where such solutions exist.

Theorem

The linear Diophantine equation ax + by = c has a solution if and only if d \,|\, c, where d = \text{gcd}(a, b).

First we will prove the following conditional statement.

“If the linear Diophantine equation ax + by = c has a solution, then d \,|\, c, where d = \text{gcd}(a, b).”

Since d = \text{gcd}(a, b), by definition, d \,|\, a \text{ and } d \,|\, b. Therefore, from the definition of the divisibility of integers we know that there exist integers r \text{ and } s for which a = dr\text{ and } b = ds.

If a solution of ax + by = c exists, so that ax_0 + by_0 = c for suitable x_0 \text{ and } y_0, then

c = ax_0 + by_0 = drx_0 + dsy_0 = d(rx_0 + sy_0)

which implies that d \,|\, c.

Now lets prove the converse stated below.

“If d \,|\, c, where d = \text{gcd}(a, b), then there exists a solution to the linear Diophantine equation ax + by = c.”

Since d \,|\, c, from the definition of the divisibility of integers c = dt, for some integer t.

Also, since d = \text{gcd}(a, b), from this theorem we know that there exist integers x_0 \text{ and } y_0 such that d = ax_0 + by_0.

When we multiply this relation by t, we get

dt = (ax_0 + by_0)t = a(tx_0) + b(ty_0)

Since c = dt,

a(tx_0) + b(ty_0) = c

Hence, the Diophantine equation ax + by = c has x = tx_0 \text{ and } y = ty_0 as a particular solution.

How do we find all the other solutions to the Diophantine equation ax + by = c ?

Let us suppose that a solution x_0, y_0 of the given equation is known.

If x^\prime, y^\prime is any other solution, then

ax_0 + by_0 = c = ax^\prime + by^\prime

which is equivalent to,

a(x_0 - x^\prime) = b(y^\prime - y_0)

Substituting a = dr \text{ and } b = ds in the above equation, we get

dr(x_0 - x^\prime) = ds(y^\prime - y_0)

Cancelling the common factor d, we get

r(x_0 - x^\prime) = s(y^\prime - y_0)

From the definition of the divisibility of integers, the above equation implies that r \,|\, s(y^\prime - y_0).

Also, the equations a = dr \text{ and } b = ds can be rewritten as r = \dfrac{a}{d} \text{ and } s = \dfrac{b}{d}.

From corollary 1 of a theorem we have already proved, we know that \text{gcd}\bigg(\dfrac{a}{d}, \dfrac{b}{d}\bigg) = 1. Therefore, \text{gcd}(r, s) = 1.

From Euclid’s Lemma we know that r \,|\, (y^\prime - y_0) since r \,|\, s(y^\prime - y_0) and \text{gcd}(r, s) = 1.

From the definition of the divisibility of integers, this implies that y^\prime - y_0 = rt for some integer t.

Substituting the value of y^\prime - y_0 in the above equation, we get

r(x_0 - x^\prime) = srt

which is equivalent to

x_0 - x^\prime = st

This leads us to the formulas

\begin{equation*} 
\begin{split}
x^\prime = x_0 - st &= x_0 - \bigg(\frac{b}{d}\bigg)t \\
y^\prime = y_0 + rt &= y_0 + \bigg(\frac{a}{d}\bigg)t \\
\end{split}
\end{equation*}

It is easy to see that these values satisfy the Diophantine equation, regardless of the choice of the integer t; for

\begin{equation*} 
\begin{split}
ax^\prime + by^\prime &= a \Bigg[x_0 - \bigg(\frac{b} {d}\bigg)t\Bigg] + b\Bigg[y_0 + \bigg(\frac{a}{d}\bigg)t\Bigg]\\
&= (ax_0 + by_0) + \bigg(\frac{ab}{d} - \frac{ab}{d}\bigg)t \\
&= c + 0 \cdot t \\
&= c \\
\end{split}
\end{equation*}

Thus, there are an infinite number of solutions of the given equation, one for each value of t.

Summary

The linear Diophantine equation ax + by = c has a solution if and only if d \,|\, c, where d = \text{gcd}(a, b). If x_0, y_0 is any particular solution of this equation, then all other solutions are given by

x = x_0 - \bigg(\frac{b}{d}\bigg)t \quad \quad\quad y = y_0 + \bigg(\frac{a}{d}\bigg)t

where t is an arbitrary integer.

Example of an application of the theorem

We are now poised to solve the question posed by Euler in 1770, namely, dividing 100 into two summands such that one is divisible by 7 and the other by 11.

The solution to this problem can be framed in terms of a Diophantine equation as given below.

7x + 11y = 100

where x \text{ and } y are some positive integers.

First let us check whether this equation is solvable. From the theorem we have just proved, we know that this equation has a solution if \text{gcd}(7, 11) divides 100.

In this case, since 7 \text{ and } 11 are prime numbers, finding their greatest common divisor is quite straightforward; we know that their only common divisor is 1 and hence, \text{gcd}(7, 11) = 1.

However, we will still use Euclidean Algorithm to find \text{gcd}(7, 11) since we will need the steps in this algorithm to express \text{gcd}(7, 11) as a linear combination of 7 \text{ and } 11, which in turn will help us find the values of x \text{ and } y.

Since \text{gcd}(7, 11) = \text{gcd}(11, 7) and the Euclidean Algorithm requires the first operand to be greater than or equal to the second we will evaluate, \text{gcd}(11, 7).

Applying the Euclidean Algorithm to the evaluation of \text{gcd}(11, 7), we find that

\begin{equation*} 
\begin{split}
11 &= 2 \times 7 - 3\\
7 &= 2 \times 3 + 1\\
3 &= 3 \times 1 + 0\\
\end{split}
\end{equation*}

and therefore, \text{gcd}(11, 7) = 1. Since 1 \,|\, 100, a solution to this equation exists.

To obtain 1 as a linear combination of 11 \text{ and } 7, we work backward through the previous calculations, as follows:

\begin{equation*} 
\begin{split}
1 &= 7 - (2 \times 3)\\
&= 7 - [2 \times (2 \times 7 - 11)]\\
&= 7 - (2 \times 2 \times 7 ) + 2 \times 11 \\
&= 7 (1 - 4) + 2 \times 11 \\
&= 7 (-3) + 11 \times 2\\
\end{split}
\end{equation*}

Therefore,

7 (-3) + 11 \times 2 = 1

Multiplying this equation by 100, we get

\begin{equation*} 
\begin{split}
100[7 (-3) + 11 \times 2] &= 100\\
\end{split}
\end{equation*}

which is equivalent to,

7(-300) + 11 \times 200 = 100

Hence, x = -300 \text{ and } y = 200 is one solution to the Diophantine equation in question.

All other solutions can be expressed by

\begin{equation*} 
\begin{split}
x &= -300 - \bigg(\frac{11}{1}\bigg)t = -300 - 11t \\
y &= 200 + \bigg(\frac{7}{1}\bigg)t = 200 + 7t\\
\end{split}
\end{equation*}

for some integer t.

Since we want to express 100 as a sum of two positive integers, we want the solution in positive integers.

From the above relations we can see that for x \text{ and } y to be positive, t must be chosen to satisfy simultaneously the following inequalities:

-300 - 11t > 0 \quad \quad \quad 200 + 7t > 0

Multiplying the inequality -300 - 11t > 0 with -1, we get

300 + 11t < 0

Adding -300 to both sides of the inequality results in

11t < -300

Dividing both sides of the inequality by 11, we get

t < \frac{-300}{11}

which is equivalent to,

t < -27\frac{3}{11}

Similarly, adding -200 to both sides of the inequality 200 + 7t > 0, we get

7t > -200

Dividing both sides of the inequality by 7, we get

t > \frac{-200}{7}

which is equivalent to,

t > -28\frac{4}{7}

Therefore,

-27\frac{3}{11} > t > -28\frac{4}{7}

Since t must be an integer, we are forced to conclude that t = -28.

Substituting this value of t in the following equations, we get

\begin{equation*} 
\begin{split}
x &= -300 - 11t =-300 - (11 \times -28) = -300 + 308 = 8  \\
y &= 200 + \bigg(\frac{7}{1}\bigg)t = 200 + 7t = 200 + (7 \times -28) = 200 - 196 = 4\\
\end{split}
\end{equation*}

Thus, our Diophantine equation has a unique positive solution x = 8, y = 4 corresponding to the value t = -28.

Therefore, 7x = 7 \times 8 = 56 \text{ and } 11y = 11 \times 4 = 44 are the two numbers that sum to 100 such that 56 is divisible by 7 and 44 by 11.

Diophantine equation ax + by = c, where a and b are relatively prime

As we have seen in the previous example, it might be helpful to record the form that the theorem we have just discussed takes when the coefficients are relatively prime.

Corollary

If \text{gcd}(a, b) = 1 and if x_0, y_0 is a particular solution of the linear Diophantine equation ax + by = c, then all solutions are given by

x = x_0 - bt \quad \quad \quad y = y_0 + at

for integral values of t.

Categories
mathematics Number Theory

Number Theory Primer : The Euclidean Algorithm

authored by Premmi and Beguène

Previous Topic The Greatest Common Divisor

Introduction

We have already seen a situation that necessitated the calculation of the greatest common divisor of two integers. We calculated the greatest common divisor of the two integers by listing all their positive divisors and choosing the largest one common to each. This method of calculation becomes cumbersome for large numbers. The Euclidean Algorithm, as we shall see shortly, through repeated application of the Division Algorithm provides a more efficient process to calculate the greatest common divisor of two integers.

Intuition

Suppose we want to find the greatest common divisor of 36 and 24 i.e., gcd(36, 24). One way to do this is to list the positive divisors of 36 and 24 and choose the largest divisor common to both as shown below.

The positive divisors of each of 36 \text{ and } 24 are listed below.

\begin{equation*} 
\begin{split}
\text{Positive divisors of } 36 &: 1, 2, 3, 4, 6, 9, 12, 18, 36 \\
\text{Positive divisors of } 24 &: 1, 2, 3, 4, 6, 8, 12, 24 \\
\end{split}
\end{equation*}

We can see that 12 is the greatest common divisor of both 36 \text{ and } 24. Hence, gcd(36, 24) = 12.

Now suppose we have to find the greatest common divisor of 9 and 6 i.e., gcd(9, 6).

The positive divisors of each of 9 \text{ and } 6 are listed below.

\begin{equation*} 
\begin{split}
\text{Positive divisors of } 9 &: 1, 3, 9 \\
\text{Positive divisors of } 6 &: 1, 2, 3, 6 \\
\end{split}
\end{equation*}

We can see that this list of positive divisors for 9 \text{ and } 6 is smaller than the corresponding list for 36 \text{ and } 24.

Similarly, if we are asked to find the greatest common divisor of 12480 \text{ and } 4032, the list of positive divisors of both these numbers will be long and cumbersome to make.

To summarize, since larger numbers have more number of potential divisors than smaller numbers this method of finding all the positive divisors of both the numbers and choosing the largest one common to both might be infeasible when large numbers are involved.

So how do we circumvent this problem? What if we could reduce the problem of finding the greatest common divisor of large numbers to an equivalent albeit easier problem of finding the greatest common divisor of small numbers which is easier to calculate?

We will now explore this method in detail.

Suppose we have to find gcd(36, 24). Let us assume that d = \text{gcd}(36, 24). Since by definition, d \,|\,36 \text{ and } d \,|\,24, part (g) of the Theorem we have already proved allows us to conclude that d divides any linear combination of 36 \text{ and } 24.

By applying the Division Algorithm to 36 \text{ and } 24 we get,

36 = 1 \times 24 + 12

which can be rewritten as

36 - 1 \times 24 = 12

Since d divides any linear combination of 36 \text{ and } 24, d divides 36 - 1 \times 24 i.e., d \,|\, 12. Therefore, d is also the positive common divisor of 24 \text{ and } 12.

Now if c is some arbitrary positive common divisor of 24 \text{ and } 12, then part (g) of the Theorem we have already proved allows us to conclude that c divides any linear combination of 24 \text{ and } 12 i.e., c \,|\, (1 \times 24 + 1 \times 12). Therefore, c divides 36. That is, c is also the positive common divisor of both 36 \text{ and } 24. Since d is the greatest common divisor of 36 \text{ and } 24, it follows that c \leq d. Therefore, d is the greatest common divisor of 24 \text{ and } 12, i.e., \text{gcd}(24, 12) = d.

By similar reasoning, applying the Division Algorithm to 24 \text{ and } 12 we get,

24 = 2 \times 12 + 0

Therefore, d is the greatest common divisor of 12 \text{ and } 0, i.e., \text{gcd}(12, 0) = d.

Hence, gcd(36, 24) = \text{gcd}(24, 12) = \text{gcd}(12, 0) = d = 12.

We have reduced the problem of finding the greatest common divisor of 36 \text{ and } 24 to an equivalent and much easier problem of find the greatest common divisor of 12 \text{ and } 0 i.e., \text{gcd}(36, 24) = \text{gcd}(12, 0).

It should be noted that a zero remainder must eventually appear at some point because the repeated application of the Division Algorithm results in a decreasing sequence of remainders that cannot contain more than 24 integers since we started the division with 24.

To recap, as shown below, to find the greatest common divisor of two integers, namely, 36 \text{ and }24, we repeatedly apply the Division Algorithm till a zero remainder appears and then choose the last nonzero remainder that appears in these equations, namely, the integer 12, as the greatest common divisor of 36 \text{ and }24. This method is much easier than finding all the divisors of 36 \text{ and }24 and choosing the largest divisor common to both.

\begin{equation*} 
\begin{split}
36 &= 1 \times 24 + 12 \\
24 &= 2 \times 12 + 0 \\
\end{split}
\end{equation*} 

This method of finding the greatest common divisor of two integers by repeated application of the Division Algorithm till a zero remainder appears is called the Euclidean Algorithm.

In order to fully appreciate the Euclidean Algorithm let us use it to calculate the greatest common divisor of two large integers, namely, 12480 \text{ and } 4032.

\begin{equation*} 
\begin{split}
12480 &= 3 \times 4032 + 384 \\
4032 &= 10 \times 384 + 192 \\
384 &= 2 \times 192 + 0\\
\end{split}
\end{equation*}

Working down the displayed system of equations, we obtain

\text{gcd}(12480, 4032) = \text{gcd}(4032, 384) = \text{gcd}(384, 192) = \text{gcd}(192, 0) = 192

since

\begin{equation*} 
\begin{split}
\text{gcd}(12480, 4032) &= \text{gcd}(3 \times 4032 + 384, 4032) = \text{gcd}(4032, 384), \\
\text{gcd}(4032, 384) &= \text{gcd}(10 \times 384 + 192, 384) = \text{gcd}(384, 192) \,\& \\
\text{gcd}(384, 192) &= \text{gcd}(2 \times 192 + 0, 192) = \text{gcd}(192, 0) \\

\end{split}
\end{equation*}

We can see that with just three applications of the Division Algorithm we were able to calculate the greatest common divisor of 12480 \text{ and } 4032. This calculation would have been very cumbersome if we had to list each of their divisors and then pick the largest divisor common to both.

We have already seen that the greatest common divisor of two integers can be expressed as a linear combination of the two integers i.e., given integers a \text{ and } b not both zero, there exist integers x \text{ and } y such that gcd(a, b) = ax + by. We did not provide a practical method to find x \text{ and } y which we will remedy now.

In order to represent 192 as a linear combination of the integers 12480 \text{ and } 4032, we will use the equations displayed above as follows.

From the last but one equation, we get

192 = 4032 - 10 \times 384

From the first equation, we get

384 = 12480 - 3 \times 4032

Combining the above two equations we get,

\begin{equation*} 
\begin{split}
192 &= 4032 - 10 \times (12480 - 3 \times 4032) \\
&= (-10) \times 12480 + 31 \times 4032 \\
\end{split}
\end{equation*}

Therefore,

192 = \text{gcd}(12480, 4032) = 12480 x + 4032y

where x = -10 \text{ and } y = 31. It should be noted that this is not the only way to express 192 as a linear combination of the integers 12480 \text{ and } 4032; among other possibilities, we could add and subtract 12480 \cdot 4032 to get

\begin{equation*} 
\begin{split}
192 &= (-10 + 4032) \times 12480 + (31 - 12480) \times 4032 \\
&= 4022 \times 12480 + (-12449) \times 4032 \\
\end{split}
\end{equation*}

From our discussion thus far we have seen that in order to calculate the greatest common divisor of two integers using the Euclidean Algorithm we repeatedly apply the Division Algorithm till a zero remainder appears. This implies that the most efficient way to calculate the greatest common divisor of two integers using the Euclidean Algorithm is to make the zero remainder emerge with as few applications of the Division Algorithm as possible.

We will illustrate this point using an example. Suppose we have to find the greatest common divisor of 24 \text{ and } 14. We have shown two ways of doing this. In the first way as we can see the remainders are all positive integers, while in the second the remainders are positive as well as negative. We can see that the second way leads to fewer steps and hence is more efficient.

\begin{equation*} 
\begin{split}
24 &= 1 \times 14 + 10 \\
14 &= 1 \times 10 + 4 \\
10 &= 2 \times 4 + 2 \\
4 &= 2 \times 2 + 0\\
\end{split}
\end{equation*}
\begin{equation*} 
\begin{split}
24 &= 2 \times 14 - 4 \\
14 &= 3 \times 4 + 2 \\
4 &= 2 \times 2 + 0\\
\end{split}
\end{equation*}

Lets illustrate this point with an example. Suppose we divide 19 by 10. We can express 19 as

19 = 1 \times 10 + 9 \\

or as

19 = 2 \times 10 - 1

When we divide 19 by 10 we can get any one of the integers between 0 \text{ to } 9 as remainders. In the first case we got a remainder that is greater than \frac{10}{2} = 5 and in the second case we got a remainder that is less than 5. Therefore, we can reduce the number of steps in the Euclidean Algorithm whenever we are able to choose remainders, r, such that |r| < \frac{b}{2}, where b is the divisor. This might not always be possible, especially when |r| = \frac{b}{2}.

As we can see in the second case, this happened when 14 was divided by 4 and resulted in the remainder of 2 which equals \frac{4}{2}. We cannot optimize this particular step for a smaller absolute remainder and consequently, lesser number of subsequent steps in the Euclidean Algorithm since

14 = 3 \times 4 + 2

and

14 = 4 \times 4 - 2

result in the same absolute remainder of 2.

We will now formalize all these observations we have made thus far.

Lemma

If a = bq + r, as in the Division Algorithm, then \text{gcd}(a, b) = \text{gcd}(b, r).

Suppose d = \text{gcd}(a, b). Then it follows from the definition of \text{gcd}(a, b), that d \,|\, a \text{ and } d \,|\, b. Part (g) of the Theorem we have already proved lets us conclude that d divides any linear combination of a \text{ and } b i.e., d \,|\, (a - bq) or equivalently, d \,|\, r. Therefore, d is a positive common divisor of both b \text{ and }r.

On the other hand, if c is an arbitrary common divisor of b \text{ and }r, then part (g) of the Theorem we have already proved lets us conclude that c divides any linear combination of b \text{ and } r i.e., c \,|\, (bq + r) or equivalently, c \,|\, a. Hence, c is a positive common divisor of both a \text{ and }b. Since, \text{gcd}(a, b) = d, it implies that c \leq d.

Therefore, it follows from the definition of \text{gcd}(b, r) that d = \text{gcd}(b, r).

The Euclidean Algorithm

The Euclidean Algorithm may be described as follows:

Let a \text{ and } b be two integers whose greatest common divisor is desired.

We know from the definition of the divisibility of integers that, if a positive integer d divides a, then a = dr, for some integer r. Also we can see that -a = d(-r). That is, if d divides a, then d also divides -a.

Therefore, \text{gcd}(a, b) = \text{gcd}(|a|, |b|) and there is no harm is assuming that a \geq b > 0.

The first step of the Euclidean Algorithm is to apply the Division Algorithm to integers a \text{ and } b to get

a = q_1b + r_1 \quad \quad \quad 0 \leq r_1 < b

If it happens that r_1 = 0, then b \,|\, a and \text{gcd}(a, b) = b.

When r_1 \neq 0, we divide b by r_1 to produce integers q_2 \text{ and } r_2 satisfying

b = q_2r_1 + r_2 \quad \quad \quad 0 \leq r_2 < r_1

If r_2 = 0, then we stop; otherwise, we proceed as before to obtain

r_1 = q_3r_2 + r_3 \quad \quad \quad 0 \leq r_3 < r_2

This division process continues until some zero remainder appears, say, at the (n + 1)th stage where r_{n-1} is divided by r_n; a zero remainder occurs sooner or later because the decreasing sequence b > r_1 > r_2 > \cdots \geq 0 cannot contain more than b integers.

The result is the following system of equations:

\begin{equation*} 
\begin{split}
a &= q_1b + r_1 \quad \quad \quad 0 < r_1 < b\\
b &= q_2r_1 + r_2 \quad \quad \quad 0 < r_2 < r_1\\
r_1 &= q_3r_2 + r_3 \quad \quad \quad 0 < r_3 < r_2\\
&\,\,\,\vdots\\
r_{n-2} &= q_nr_{n-1} + r_n \quad \quad \quad 0 < r_n < r_{n-1}\\
r_{n-1} &= q_{n+1}r_{n} + 0 \\
\end{split}
\end{equation*}

Using the result of the lemma we have proved previously, we simply work down the displayed system of equations, obtaining

\text{gcd}(a, b) = \text{gcd}(b, r_1) = \text{gcd}(r_1, r_2) = \cdots = \text{gcd}(r_{n-1}, r_n) =  \text{gcd}(r_n, 0) = r_n
Expressing gcd(a,b) as a linear combination of a and b

Theorem asserts that \text{gcd}(a, b) can be expressed as a linear combination of a \text{ and } b i.e., in the form ax + by, but the proof of the theorem does not provide a method to determine the integers x \text{ and } y. We will now demonstrate how the Euclidean Algorithm can be used to determine x \text{ and } y.

Starting with the last but one equation arising from the algorithm, we rewrite

r_{n-2} = q_nr_{n-1} + r_n

as

r_n = r_{n-2} - q_nr_{n-1}

Now we solve the preceding (last but second) equation in the algorithm for r_{n-1} and substitute in the above equation to get,

\begin{equation*} 
\begin{split}
r_n &= r_{n-2} - q_n(r_{n-3} - q_{n-1}r_{n-2}) \\
&= (1 + q_nq_{n-1})r_{n-2} + (-q_n)r_{n-3} \\
\end{split} 
\end{equation*}

This represents r_n as a linear combination of r_{n-2} and r_{n-3}. Continuing backward through the system of equations, we successfully eliminate the remainders r_{n-1}, r_{n-2}, \ldots, r_2, r_1 until a stage is reached when r_n = \text{gcd}(a, b) is expressed as a linear combination of a \text{ and } b.

Some interesting observations

The number of steps required in the Euclidean Algorithm is at most five times the number of digits in the smaller integer. This was proved by the French mathematician Gabriel Lamé (1795-1870).

Another observation of interest is that for each n > 0, it is possible to find integers a_n \text{ and } b_n such that exactly n divisions are required to compute \text{gcd}(a_n, b_n) by the Euclidean Algorithm. We shall prove this fact later.

As we have already discussed, the number of steps in the Euclidean Algorithm can be reduced if it is possible to select remainders r_{k+1} such that |r_{k+1}| < \dfrac{r_k}{2}.

An important consequence of Euclidean Algorithm

The following theorem is an important consequence of the Euclidean Algorithm.

Theorem

If k > 0, where k is an integer, \text{gcd}(ka, kb) = k\,\text{gcd}(a, b).

If each of the equations appearing in the Euclidean Algorithm for a \text{ and } b is multiplied by k, we obtain,

\begin{equation*} 
\begin{split}
ak &= q_1(bk) + r_1k \quad \quad \quad 0 < r_1k < bk\\
bk &= q_2(r_1k) + r_2k \quad \quad \quad 0 < r_2k < r_1k\\
r_1k &= q_3(r_2k) + r_3k \quad \quad \quad 0 < r_3k < r_2k\\
&\,\,\,\vdots\\
r_{n-2}k &= q_n(r_{n-1}k) + r_nk \quad \quad \quad 0 < r_nk < r_{n-1}k\\
r_{n-1}k &= q_{n+1}(r_{n}k) + 0 \\
\end{split}
\end{equation*}

But this is clearly the Euclidean Algorithm applied to the integers ak \text{ and } bk, so that their greatest common divisor is the last nonzero remainder r_nk; that is

\text{gcd}(ka, kb) = r_nk = k \, \text{gcd}(a, b)

as stated in the theorem.

An Alternate Proof

The following is an alternate proof of the theorem we have just proved.

From this Theorem we can conclude that \text{gcd}(ak, bk) is the smallest positive integer of the form (ak)x + (bk)y, which, in turn, is equal to k times the smaller positive integer of the form ax + by; this latter value is equal to k \, \text{gcd}(a, b).

Application of the Theorem

By way of illustrating this theorem, we see that

\text{gcd}(36, 24) = 3 \, \text{gcd}(12, 8) = 3 \, \cdot 2\, \text{gcd}(6, 4) = 3 \, \cdot 2\, \cdot 2\, \text{gcd}(3, 2) = 12 \, \cdot 1 = 12

Corollary

For any integer k \neq 0, \text{gcd}(ka, kb) = |\,k\,|\,\text{gcd}(a, b).

It suffices to consider the case in which k < 0. Then -k = |\,k\,| > 0.

Also, we know from the definition of the divisibility of integers that, if a positive integer d divides ak, then ak = dr, for some integer r. Also we can see that -ak = d(-r). That is, if d divides ak, then d also divides -ak.

Therefore,

\begin{equation*} 
\begin{split}
\text{gcd}(ak, bk) &= \text{gcd}(-ak, -bk)\\
&= \text{gcd}(a\,|\,k\,| , b\,|\,k\,| )\\
\end{split}
\end{equation*}

By the theorem we have just proved,

\text{gcd}(a\,|\,k\,| , b\,|\,k\,| ) = |\,k\,|\,\text{gcd}(a, b)

Hence,

\text{gcd}(ak, bk) = |\,k\,|\,\text{gcd}(a, b)

Least Common Multiple

There is a concept parallel to that of the greatest common divisor of two integers, known as their least common multiple. For the sake of completion, we shall explore this concept in detail in this section, though we shall not have much occasion to make use of it.

Intuition

Suppose you are throwing a party for kids. You want to give each kid who attends the party a box of chocolates. You are not certain about the exact number of kids who will turn up for the party. Depending on the number of kids who turn up for the party you would like to give each of them a box containing either 18 \text{ or } 42 pieces of chocolate; if less number of kids turn up each kid will get a box with 42 pieces of chocolates otherwise each kid will get a box with 18 pieces of chocolates.What is the minimum number of pieces of chocolates that needs to be bought such that they can be divided up into boxes of either 18 \text{ or } 42 pieces of chocolates with no chocolates leftover in either case?

Suppose the minimum number of pieces of chocolates that needs to be bought is x. Then x should divide both 18 \text{ and } 42 or equivalently, x is a common multiple of both 18 \text{ and } 42. Since x is the smallest positive integer that is divisible by both 18 \text{ and } 42 or equivalently, the smallest possible positive multiple of both 18 \text{ and } 42 it is called the least common multiple of integers 18 \text{ and } 42.

How do we find x? One way is to list the positive multiples of each of 18 \text{ and } 42 till we get a multiple that is common to both as illustrated below.

\begin{equation*} 
\begin{split}
\text{Positive multiples of } 18 &: 18, 36, 54, 72, 90, 108, 126 \\
\text{Positive multiples of } 42 &: 42, 84, 126 \\
\end{split}
\end{equation*}

We can see that since 126 is common to both the lists, x = 126 is the smallest positive integer that is divisible by both 18 \text{ and } 42. Hence, the minimum number of pieces of chocolates that needs to be bought is 126.

We could also use an alternate and more concise method to find the least common multiple of two integers. Since 42 > 18, the least common multiple of 18 \text{ and } 42 is the smallest positive multiple of 42 that is also a multiple of 18. Therefore, we test each multiple of 42 to see whether it is also a multiple of 18 as shown below.

We test 1 \cdot 42 = 42. Since 18 \nmid 42, 42 is not a common multiple of 18 \text{ and } 42.

Next, we test 2 \cdot 42 = 84. Since 18 \nmid 84, 84 is not a common multiple of 18 \text{ and } 42.

We test 3 \cdot 42 = 126. Since 18 \,|\, 126 and the only two smaller positive multiples of 42 are not multiples of 18, the least common multiple of 18 \text{ and } 42 is 126.

Proof of existence of Least Common Multiple

An integer c is said to be a common multiple of two nonzero integers a \text{ and } b whenever a \,|\, c \text{ and } b \,|\, c.

Evidently, zero is a common multiple of a \text{ and } b. There also exist common multiples that are not trivial because the products ab \text{ and} -\!(ab) are both common multiples of a \text{ and } b, one of which is always positive. Therefore, the set of positive common multiples of a \text{ and } b is nonempty. Consequently, by the Well-Ordering Principle, the set of positive common multiples of a \text{ and } b must contain a smallest integer; we call it the least common multiple of a \text{ and } b.

Definition

The least common multiple of two nonzero integers a \text{ and } b, denoted by \text{lcm}(a, b), is the positive integer m satisfying the following conditions:

(a) a \,|\, m \text{ and } b \,|\, m

(b) For any integer c, if a \,|\, c \text{ and } b \,|\, c, with c > 0, then m \leq c

Example

As an example, the positive common multiple of the integers -9 \text{ and } 30 are 90, 180, 270, \ldots; hence, \text{lcm}(-9, 30) = 90.

Remark

The following remark is clear from our discussion thus far: given nonzero integers a \text{ and } b, \text{lcm}(a, b) always exists and \text{lcm}(a, b) \leq |ab|.

LCM of large integers

We had discussed a method to calculate the least common multiple of two nonzero integers. But does this method scale to large integers?

Now suppose instead of finding the least common multiple of 18 \text{ and } 42 we have to find the least common multiple of 3054 \text{ and } 12378. As we have already discussed, we can test each multiple of 12378 to see whether it is divisible by 3054. The first such multiple we encounter is the least common multiple of 3054 \text{ and } 12378. Let us proceed to find \text{lcm}(3054, 12378) in this way.

We test 1 \cdot 12378 = 12378. Since 3054 \nmid 12378, 12378 is not a common multiple of 3054 \text{ and } 12378.

We test 2 \cdot 12378 = 24756. Since 3054 \nmid 24756, 24756 is not a common multiple of 3054 \text{ and } 12378.

We proceed by testing every subsequent multiple i.e., the 3\text{rd}, 4\text{th}, 5\text{th}, etc., for divisibility by 3054.

Proceeding this way, we arrive at the 509\text{th} multiple. We test 509 \cdot 12378 = 6300402. Since 3054 \,|\, 6300402 and all the other smaller positive multiples of 12378 are not multiples of 3054, the least common multiple of 3054 \text{ and } 12378 is 6300402.

We can see that this method of finding the least common multiple of two nonzero integers becomes cumbersome when the integers are large. We had to iterate through 509 multiples of 12378 till we found a multiple that is common to both 3054 \text{ and } 12378.

Is there a more efficient way to find the least common multiple of two nonzero integers? Number Theory is one of those rare mathematical disciplines where one oftentimes solves problems by relying to a large extent upon trial and error, in combination with some curiosity, intuition and ingenuity. Often patient, plodding experimentation precedes rigorous proof. We will tackle this problem of coming up with an efficient way to find the least common multiple of two nonzero integers using this philosophy. We will try and connect some of ideas we have encountered so far to solve this problem.

We have seen before that given nonzero integers a \text{ and } b, \text{lcm}(a, b) \leq |ab|. Therefore, if a \text{ and } b are positive integers then,

\text{lcm}(a, b) \leq ab

If \text{lcm}(a, b) = ab, then it is easy to find the least common multiple of a \text{ and } b since we have to just multiply a \text{ with } b. In which situation does \text{lcm}(a, b) = ab? Lets try some examples.

\begin{equation*} 
\begin{split}
\text{lcm}(2, 3) &= 6\\
\text{lcm}(5, 6) &= 30\\
\text{lcm}(15, 22) &= 330\\
\end{split}
\end{equation*}

We can see that whenever integers a \text{ and } b are relatively prime i.e., \text{gcd}(a, b) = 1, then \text{lcm}(a, b) = ab. Therefore, we can see that whenever integers a \text{ and } b are relatively prime,

\text{gcd}(a, b)\,\text{lcm}(a, b) = ab

The next question to ask is whether this relationship always holds irrespective of whether a \text{ and } b are relatively prime. Lets try to check this using some examples.

Let us verify whether \text{gcd}(18, 42)\,\text{lcm}(18, 42) = 756 is true.

We will find \text{gcd}(18, 42) using Euclidean Algorithm.

\begin{equation*} 
\begin{split}
42 &= 2 \times 18 + 6 \\
18 &= 3 \times 6 + 0 \\
\end{split}
\end{equation*}

We can see that \text{gcd}(18, 42) = 6. We have already calculated \text{lcm}(18, 42) = 126. Since 6 \times 126 = 756, the relationship namely, \text{gcd}(18, 42)\,\text{lcm}(18, 42) = 18 \times 42 = 756 holds good in this case.

In order to verify whether this relationship is true for any two positive integers we will postulate this relationship as a theorem and try to prove it.

Relationship between gcd and lcm of positive integers

The following theorem establishes a relationship between the greatest common divisor and the least common multiple of two positive integers.

Theorem

For positive integers a \text{ and } b, \text{gcd}(a, b)\, \text{lcm}(a, b) = ab.

Proof. Suppose d = \text{gcd}(a, b). By definition, d \,|\, a \text{ and } d \,|\, b. We know from the definition of the divisibility of integers that, if a positive integer d divides a, then a = dr, for some integer r. Similarly, b = ds, for some integer s.

Consequently, r = \dfrac{a}{d} \text{ and } s = \dfrac{b}{d}. Suppose there is a positive integer m such that m = \dfrac{ab}{d}. Then m = as = rb i.e., a \,|\, m \text{ and } b \,|\, m. Therefore, by definition, m is the common multiple of a \text{ and } b.

Now let c be any positive integer that is a common multiple of a \text{ and } b i.e., c = au = bv for integers u \text{ and } v.

From this theorem, we know that since d = \text{gcd}(a, b), there exist integers x \text{ and } y such that d = ax + by.

Consequently,

\frac{c}{m} = \frac{cd}{ab} = \frac{c(ax + by)}{ab} = \Big(\frac{c}{b}\Big)x + \Big(\frac{c}{a}\Big)y = vx + uy

Since the expression vx + uy is the sum of the product of two integers, it is also an integer. Therefore, m \,|\, c. Consequently, from part (f) of the theorem we have already proved, m \leq c.

Hence, in accordance with this definition of the least common multiple of two nonzero integers, m = \text{lcm}(a, b). That is,

\text{lcm}(a, b) = \frac{ab}{d} = \frac{ab}{\text{gcd}(a, b)}

Rewriting the above equation, we get

\text{gcd}(a, b)\, \text{lcm}(a, b) = ab

which is what we started out to prove.

We have already discussed the corollary of this theorem which we will now formally state.

Corollary. For any choice of positive integers a \text{ and } b, \text{lcm}(a, b) = ab if and only if \text{gcd}(a, b) = 1.

It should be noted that for nonzero integers a \text{ and } b, \text{gcd}(a, b)\, \text{lcm}(a, b) = |ab|. The proof is the same as what we have already proved except that m = \dfrac{|ab|}{d}.

Calculation of lcm of integers using the relationship between gcd and lcm of integers

The main purpose of the theorem we have just proved is to make the calculation of the least common multiple of two integers dependent on the value of their greatest common divisor, which in turn can be calculated from the Euclidean Algorithm.

Earlier we had calculated the least common multiple of 3054 \text{ and } 12378 by iterating through each multiple of 12378 and checking whether it was divisible by 3054. We found the calculation tiresome since we had to iterate till the 509th multiple of 12378 to get the least common multiple of both 3054 \text{ and } 12378. We will now use this theorem we have just proved to recalculate \text{lcm}(3054, 12378) and thereby demonstrate the efficiency of this method over the previous one.

Since \text{lcm}(3054, 12378) = \dfrac{3054 \times 12378}{\text{gcd}(3054, 12378)}, we will first calculate \text{gcd}(3054, 12378) using the Euclidean Algorithm.

The appropriate applications of the Division Algorithm produce the following equations:

\begin{equation*} 
\begin{split}
12378 &= 4 \times 3054 + 162 \\
3054 &= 19 \times 162 - 24  \\
162 &= 7 \times 24 - 6  \\
24 &= (-4)(-6) + 0
\end{split}
\end{equation*}

From our previous discussion we know that the absolute value of the last nonzero remainder appearing in these equations, namely, integer 6, is the great common divisor of 3054 \text{ and } 12378.

Therefore, \text{gcd}(3054, 12378) = 6 and consequently,

\text{lcm}(3054, 12378) = \frac{3054 \times 12378}{6} = 6300402

Greatest Common Divisor of more than Two Integers

The notion of greatest common divisor can be extended to more than two integers in an obvious way.

For example, in the case of three integers a, b \text{ and } c, not all zero, \text{gcd}(a, b, c) is defined to be the positive integer d having the following properties:

(a) d is a divisor of each of a, b \text{ and } c

(b) If e divides the integers a, b \text{ and } c, then e \leq d

Calculating the Greatest Common Divisor of Three Integers

How do we calculate the greatest common divisor of three integers? We know how to calculate the greatest common divisor of two integers using Euclidean Algorithm. We could use the Euclidean Algorithm to calculate the greatest common divisor of three integers if we could transform the problem of calculating the greatest common divisor of three integers into an equivalent one of calculating the greatest common divisor of two integers.

The following theorem will help us achieve this.

Theorem

For integers a, b \text{ and } c, no two of which are zero and d = \text{gcd}(a, b, c)

d = \text{gcd}(\text{gcd}(a, b), c) = \text{gcd}(a, \text{gcd}(b, c)) = \text{gcd}(\text{gcd}(a, c), b) 

Since d = \text{gcd}(a, b, c), by definition, d \,|\, a, d \,|\, b \text{ and } d \,|\, c.

Since d \,|\, a \text{ and } d \,|\, b, by part (g) of the theorem we have already proved, d \,|\, ax + by, for arbitrary integers x \text{ and } y.

From this theorem we know that there exist integers x \text{ and } y such that \text{gcd}(a, b) = ax + by.

Therefore, d \,|\, \text{gcd}(a, b) and consequently, d is a positive common divisor of \text{gcd}(a, b) \text{ and } c .

Suppose e = \text{gcd}(\text{gcd}(a, b), c), where e is some positive integer.

Since e is the greatest common divisor of \text{gcd}(a, b) \text{ and } c, by definition, d \leq e.

Let f = \text{gcd}(a, b), where f is some positive integer.

By definition, e \,|\, f \text{ and } e \,|\, c.

By the definition of divisibility of integers, if e \,|\, f, then there exists an integer r such that f = er.

Since f = \text{gcd}(a, b), by definition, f \,|\, a \text{ and } f \,|\, b. Therefore, by the definition of divisibility of integers, a = fs \text{ and } b = ft, for some integers s \text{ and } t.

Therefore, a = fs = e(rs) \text{ and } b = ft = e(rt) and consequently, e \,|\, a \text{ and } e \,|\, b.

Since e \,|\, a, e \,|\, b \text{ and } e \,|\, c and d = \text{gcd}(a, b, c), by definition, e \leq d.

Since d \leq e and e \leq d, it follows that d = e; that is,

d = \text{gcd}(\text{gcd}(a, b), c)

Similarly, we can prove that d = \text{gcd}(a, \text{gcd}(b, c)) = \text{gcd}(\text{gcd}(a, c), b).

Example

Let us calculate \text{gcd}(39, 42, 54).

From the theorem we have just proved it follows that \text{gcd}(39, 42, 54) = \text{gcd}(\text{gcd}(39, 42), 54).

We will first calculate \text{gcd}(39, 42).

Applying the Euclidean Algorithm to the evaluation of \text{gcd}(39, 42), we find that

\begin{equation*} 
\begin{split}
42 &= 1 \times 39 + 3\\
39 &= 13 \times 3 + 0\\
\end{split}
\end{equation*}

and therefore, \text{gcd}(39, 42) = 3.

Incorporating this result in our calculation, we get

\text{gcd}(39, 42, 54) = \text{gcd}(\text{gcd}(39, 42), 54) = \text{gcd}(3, 54)

Once again, we will apply the Euclidean Algorithm to evaluate \text{gcd}(3, 54).

\begin{equation*} 
\begin{split}
54 &= 18 \times 3 + 0\\
\end{split}
\end{equation*}

Therefore, \text{gcd}(3, 54) = 3.

Hence,

\text{gcd}(39, 42, 54) = \text{gcd}(\text{gcd}(39, 42), 54) = \text{gcd}(3, 54) = 3

It should be noted the we can extend the definition of the greatest common divisor of two integers to any number of integers by adopting the method we used to define the greatest common divisor of three integers.

Next Topic The Diophantine Equation

Categories
mathematics Number Theory

Number Theory Primer : The Greatest Common Divisor

authored by Premmi and Beguène

Previous Topic : The Division Algorithm

It is worthwhile to study in detail the case wherein the remainder in the Division Algorithm turns out to be zero. Such a study leads to some very interesting insights that form the cornerstone of number theory.

Divisibility of Integers

Definition

An integer b is said to be divisible by an integer a \neq 0, symbolically written as a \,|\, b, if there exists some integer c such that b = ac.

We write a \nmid b to indicate that b is not divisible by a.

Examples

For example, -24 is divisible by 3, because -24 = 3(-8). However, 10 is not divisible by 7 because there is no integer c that makes the statement 10 = 7c true.

There are a multitude of ways for expressing the divisibility relation a \,|\, b. We could say that a is a divisor of b, that a is a factor of b or that b is a multiple of a.

It should be noted that whenever the notation a \,|\, b is used, it is understood that a is a nonzero integer.

If a is a divisor of b, then there exists an integer c such that b = ac. This implies that b = (-a)(-c), which means that b is also divisible by -a. Hence, the divisors of an integer always occur in pairs.

To find all the divisors of a given integer, it will suffice to deduce the positive integers and adjoin to them the corresponding negative integers. For this reason, we will usually limit ourselves to a consideration of only the positive divisors.

We list below some immediate consequences of the above definition.

Theorem

For integers a, b \text{ and } c the following hold:

(a) a \,|\, 0, 1 \,|\, a, a \,|\, a

(b) a \,|\, 1 if and only if a = \pm 1

(c) If a \,|\, b \text{ and } c \,|\, d, then ac \,|\, bd

(d) If a \,|\, b \text{ and } b \,|\, c, then a \,|\, c

(e) a \,|\, b \text{ and } b \,|\, a if and only if a =\pm b

(f) If a \,|\, b \text{ and } b \neq 0, then |a| \leq |b|

(g) If a \,|\, b \text{ and } a \,|\, c, then a \,|\, (bx + cy) for arbitary integers x \text{ and } y

Proof

We will now prove assertions (a) to (g).

(a) a \,|\, 0, 1 \,|\, a, a \,|\, a

\quad Integer b = 0 is divisible by integer a because there exists an integer c = 0 such that b = ac.

\quad We can verify this by substituting the respective values for b \text{ and } c which results in the equality, 0 = a(0).

\quad Similarly, a is divisible by 1 because a = 1 \cdot a.

\quad a is divisible by a because a = a \cdot 1.

(b) a \,|\, 1 if and only if a = \pm 1

\quad Let us prove the first part of the assertion, namely, “If a \,|\, 1, then a =\pm 1.”

\quad If a \,|\, 1, then there exists an integer b such that 1 = ab.

\quad ab = 1 implies that either a = 1 \text{ and } b = 1 or a = -1 \text{ and } b = -1.

\quad This proves the assertion that if a \,|\, 1, then a =\pm 1.

\quad Let us prove the second part of the assertion, namely, “If a =\pm 1, then a \,|\, 1 .”

\quad If a = 1 it follows that 1 = ab where b = 1. Hence, 1 is divisible by a.

\quad Similarly, if a = -1 it follows that 1 = ab where b = -1. Hence, 1 is divisible by a.

\quad Therefore, a \,|\, 1 if and only if a = \pm 1.

(c) If a \,|\, b \text{ and } c \,|\, d, then ac \,|\, bd

\quad If a \,|\, b, then there exists some integer e such that

b = ae

\quad Similarly, since c \,|\, d,

d = cf

\quad for some integer f.

\quad Therefore,

bd = aecf = acef = acg

\quad where g = ef.

\quad Since the product of two integers is also an integer, bd \text{ and } ac are also integers.

\quad Hence, integer bd is divisible by integer ac because there exists an integer g such that bd = acg.

\quad Thus we have proved the assertion that if a \,|\, b \text{ and } c \,|\, d, then ac \,|\, bd.

(d) If a \,|\, b \text{ and } b \,|\, c, then a \,|\, c

\quad If a \,|\, b, then there exists some integer d such that

b = ad

\quad Similarly, if b \,|\, c, then there exists some integer e such that

c = be

\quad Multiplying both the equations together, we get

bc = adbe

\quad Dividing both sides of the equation by b, results in

c = ade = af

\quad where f = de.

\quad The integer c is divisible by integer a because there exists an integer f such that c = af.

\quad This proves the assertion that if a \,|\, b \text{ and } b \,|\, c, then a \,|\, c.

(e) a \,|\, b \text{ and } b \,|\, a if and only if a =\pm b

\quad Let us prove the first part of the assertion, namely, “If a \,|\, b \text{ and } b \,|\, a, then a =\pm b.”

\quad If a \,|\, b \text{ and } b \,|\, a, then there are integers c \text{ and } d such that

b = ac

\quad and

a = bd

\quad Multiplying both the equations, we get

ba = acbd

\quad Rearranging the terms of the above equation, results in

ab = abcd

\quad Dividing both sides of the equation by ab,

1 = cd

\quad Since c \text{ and } d are both integers, cd = 1 only when c = d = 1 or c = d = -1.

\quad Since b = ac, when c = 1, \, b = a \implies a = b and when c = -1, \, b = -a \implies a = -b.

\quad Therefore, we have proved that if a \,|\, b \text{ and } b \,|\, a then a =\pm b.

\quad Let us now prove the second part of the assertion, namely, “If a =\pm b, then a \,|\, b \text{ and } b \,|\, a.”

\quad Let us consider the case when a = b.

a = b \implies b = a \implies b = a \cdot 1

\quad b is divisible by a because there exists an integer c = 1 such that b = ac.

\quad Also,

a = b \implies a = b \cdot 1

\quad a is divisible by b because there exists an integer c = 1 such that a = bc.

\quad Therefore, when a = b we have proved that a \,|\, b \text{ and } b \,|\, a.

\quad Next, let us consider the case when a = -b.

a = -b \implies b = -a \implies b = a \cdot (-1)

\quad b is divisible by a because there exists an integer c = -1 such that b = ac.

\quad Similarly,

a = -b \implies a = b \cdot (-1)

\quad a is divisible by b because there exists an integer c = -1 such that a = bc.

\quad Therefore, when a = -b we have proved that a \,|\, b \text{ and } b \,|\, a.

\quad Hence, we have proved that a \,|\, b \text{ and } b \,|\, a if and only if a =\pm b.

(f) If a \,|\, b \text{ and } b \neq 0, then |a| \leq |b|

\quad If a \,|\, b, then there exists an integer c such that b = ac.

\quad Also, b \neq 0 implies that c \neq 0.

\quad Upon taking absolute values, we get

|b| = |ac| = |a|\cdot|c|

\quad Since c \neq 0, it implies that |c| \geq 1.

\quad Hence,

|b| = |a|\cdot|c| \geq |a|

\quad Therefore, we have proved the assertion that if a \,|\, b \text{ and } b \neq 0, then |a| \leq |b|.

(g) If a \,|\, b \text{ and } a \,|\, c, then a \,|\, (bx + cy) for arbitary integers x \text{ and } y

\quad The relation a \,|\, b \text{ and } a \,|\, c implies that b = ad \text{ and } c = ae, for some integers d \text{ and } e.

\quad For arbitrary integers x \text{ and } y,

bx + cy = adx + aey = a(dx + ey) = af

\quad where f = dx + ey.

\quad Since f is an integer, a \,|\, (bx + cy); thus, the assertion is proved.

It is worth noting that this assertion extends by induction to sums of more than two terms.

That is, if a \,|\, b_k \text{ for } k = 1, 2, \ldots, n, then

a \,|\, (b_1x_1 + b_2x_2 + \cdots + b_nx_n) 

for all integers x_1, x_2, \ldots, x_n.

Let us prove this assertion using proof by induction.

We will first establish the basis for induction i.e., prove the assertion for k = 1.

If a \,|\, b_1, then a \,|\, b_1x_1 for arbitary integer x_1.

If a \,|\, b_1, then there exists an integer c such that b_1 = ac. Therefore,

b_1x_1 = acx_1 = a(cx_1)

where cx_1 is an integer. Hence, a \,|\, b_1x_1.

Next let us prove the induction step by using the inductive hypothesis which is stated below.

We will assume that if a \,|\, b_k \text{ for } k = 1, 2, \ldots, m, then

a \,|\, (b_1x_1 + b_2x_2 + \cdots + b_mx_m) 

for all integers x_1, x_2, \ldots, x_m, where m is some fixed but unspecified positive integer .

Under the assumption that this assertion is true, we will prove that the assertion “If a \,|\, b_k \text{ for } k = 1, 2, \ldots, m+1, then

a \,|\, (b_1x_1 + b_2x_2 + \cdots + b_{m+1}x_{m+1}) 

for all integers x_1, x_2, \ldots, x_{m+1}” must be true.

Since a \,|\, (b_1x_1 + b_2x_2 + \cdots + b_mx_m) it implies that there exists an integer d such that

b_1x_1 + b_2x_2 + \cdots + b_mx_m = ad

Adding b_{m+1}x_{m+1} to both sides of the equation we get,

b_1x_1 + b_2x_2 + \cdots + b_mx_m + b_{m+1}x_{m+1} = ad + b_{m+1}x_{m+1}

Since b_{m+1} is divisible by a, there exists an integer e such that

b_{m+1} = ae

Substituting for b_{m+1} on the right hand side of the equation, we get

b_1x_1 + b_2x_2 + \cdots + b_mx_m + b_{m+1}x_{m+1} = ad + aex_{m+1} = a(d + ex_{m+1})

Since d + ex_{m+1} is an integer it implies that b_1x_1 + b_2x_2 + \cdots + b_mx_m + b_{m+1}x_{m+1} is divisible by a. This proves the induction step.

Our reasoning shows that whenever the assertion is true for any integer m it is also true for integer m+1. We have already proved that the assertion is true for 1. Consequently, it is also true for 1+1 = 2 and so on ad infinitum. Hence the assertion, namely,

“If a \,|\, b_k \text{ for } k = 1, 2, \ldots, n, then

a \,|\, (b_1x_1 + b_2x_2 + \cdots + b_nx_n) 

for all integers x_1, x_2, \ldots, x_n” is true for every positive integer n.

Common Divisors

Intuition

Annie is organizing her wardrobe for storage. She has 24 pants and 36 shirts which she needs to store in boxes such that each box has the same type (either all pants or all shirts) and number of clothing items. Also, each box must contain at least 2 pieces of clothing. What are the possible number of clothing items in each box?

Since each box has either pants or shirts and the number of clothing items in each box is the same and greater than 1, we need to find the positive common divisors of 24 \text{ and } 36 that are greater than 1.

Let us now list the positive divisors of each of 24 \text{ and } 36.

\begin{equation*} 
\begin{split}
\text{Positive divisors of } 24 &: 1, 2, 3, 4, 6, 8, 12, 24 \\
\text{Positive divisors of } 36 &: 1, 2, 3, 4, 6, 9, 12, 18, 36 \\
\end{split}
\end{equation*}

Since there are at least 2 items of clothing per box, the possible numbers of clothing items in each box are 2, 3, 4, 6 \text{ and } 12.

Definition

If a \text{ and } b are arbitrary integers, then an integer d is said to be a common divisor of a \text{ and } b if both d \,|\, a \text{ and } d \,|\, b.

Greatest Common Divisor

Intuition

Now suppose Annie has to use the least number of boxes to store her clothes, while still ensuring that each box contains the same type of clothing and the same number of clothing items. How would be she go about this task?

In order to use the minimum number of boxes, Annie has to store the maximum number of clothing items in each box while still ensuring that each box contains the same type of clothing and the same number of clothing items. That is, Annie should find the largest integer that divides both 24 \text{ and } 36, as this would equal the number of clothing items to place in each box. We have already calculated in the example above that 12 is the largest integer that divides both 24 \text{ and } 36. So Annie will place 12 items of clothing in each box. She will use 24 / 12 = 2 boxes to store her 24 pants and 36 / 12 = 3 boxes to store her 36 shirts. Hence, she will need a minimum of 5 boxes to store her clothes. The integer 12 is called the greatest common divisor of the integers 24 \text{ and } 36.

Let a \text{ and } b be arbitrary integers.

Since 1 is a divisor of every integer, 1 is a common divisor of a \text{ and } b. Hence, the set of positive common divisors of a \text{ and } b is nonempty.

Also, since every integer divides 0, if a = b = 0, then every integer serves as a common divisor of both a \text{ and } b. In this case, the set of positive common divisors of a \text{ and } b is infinite.

However, when at least one of a \text{ or } b is different from zero, there are only a finite number of positive common divisors. Among these, there is a largest one, called the greatest common divisor of a \text{ and } b.

Definition

Let a \text{ and } b be given integers, with at least one of them different from zero. The greatest common divisor of a \text{ and } b, denoted by gcd(a, b), is the positive integer d satisfying the following conditions:

(a) d \,|\, a \text{ and } d \,|\, b

(b) If c \,|\, a \text{ and } c \,|\, b, then c \leq d

Examples

Example 1

Let us find the greatest common divisor of -12 \text{ and } 30.

We begin by listing the positive divisors of each of -12 \text{ and } 30.

\begin{equation*} 
\begin{split}
\text{Positive divisors of } -12 &: 1, 2, 3, 4, 6, 12 \\
\text{Positive divisors of } \quad \,30 &: 1, 2, 3, 5, 6, 10, 15, 30 \\
\end{split}
\end{equation*}

Therefore, the positive common divisors of -12 \text{ and } 30 are 1, 2, 3 \text{ and } 6. Since 6 is the largest of these integers, it follows that gcd(-12, 30) = 6.

Example 2

Marie and Annie ate some mango gummies for fun. 😃 Marie ate 40 gummies and Annie ate 24. If each packet of gummies from which Marie and Annie ate their respective gummies from contain the same number of gummies, what is the largest possible number of gummies in a packet?

\text{Number of gummies eaten by a person} = \text{Number of gummies in each packet} \times \text{Number of packets}

Hence, the number of gummies in a packet should be a divisor of both 40 \text{ and } 24, i.e., the number of gummies eaten by Marie and Annie respectively. Therefore, the largest possible number of gummies in a packet is gcd(40, 24) = 8.

Note: The gcd(40, 24) can be calculated by following the procedure shown in example 1.

We will next prove that gcd(a, b) can be represented as a linear combination of a \text{ and } b. (The linear combination of integers a \text{ and } b is an expression of the form ax + by, where x \text{ and } y are integers.)

For example,

\text{gcd}(-12, 30) = 6 = (-12) \cdot 2 + 30 \cdot 1

and

\text{gcd}(40, 24) = 8 = 40 \cdot (-1) + 24 \cdot 2

Theorem

Given integers a \text{ and } b, with at least one of them different from zero, there exist integers x \text{ and } y such that

\text{gcd}(a, b) = ax + by

We prove this theorem by showing that the following assertions hold good.

(i) There exists a positive integer, d, such that d = ax + by

(ii) gcd(a, b) = d

We prove assertion (ii) by showing that

(a) d is a positive common divisor of integers a \text{ and } b

(b) there exists no other positive common divisor of integers a \text{ and } b greater than d

Let us prove assertion (i), namely, the existence of a positive integer ax + by.

In order to prove that a positive integer d = ax + by exists, we construct a set of all positive linear combinations of a \text{ and } b i.e., a set of all positive integers of the form au + bv, where u \text{ and } v are integers and prove that this set is not empty. Then from the Well-Ordering Principle we know that such a nonempty set of positive integers will have a least element or equivalently, a smallest integer. We will next prove that this smallest integer d = \text{gcd}(a, b).

We begin by proving that the set, S, of all positive linear combinations of a \text{ and } b, written in set notation as,

S = \{au + bv \,|\,  au + bv > 0; u, v \text{ are integers} \}

is nonempty.

In order to prove that S is nonempty it suffices to find just one value for each of u \text{ and } v such that au + bv is positive.

If a \neq 0, then the integer |a| = au + b \cdot 0 lies in S, where we choose u = 1 \text{ or } u = -1 depending on whether a is positive or negative. Hence, S is nonempty.

Therefore, by virtue of the Well-Ordering Principle the set S must contain a smallest element d. Thus, by the very definition of S, there exist integers x \text{ and } y for which d = ax + by. We claim that d = \text{gcd}(a, b).

In order to prove this claim, we will first show using the Division Algorithm that d divides a \text{ and } b i.e., d is the positive common divisor of a \text{ and } b. Then we will show that any arbitrary positive common divisor of a \text{ and } b cannot be greater than d.

From the Division Algorithm, we know that we can obtain integers q \text{ and } r such that a = qd + r, where 0 \leq r < d. Then r can be written in the form

r = a - qd

Substituting d = ax + by in the above equation, we get

\begin{equation*} 
\begin{split}
r &= a - q(ax + by) \\
&= a(1 - qx) + b(-qy) \\
\end{split}
\end{equation*}

If r were positive, then this representation would imply that r is a member of S, contradicting the claim that d is the least integer in S since r < d. Therefore, r = 0 and so a = qd, or equivalently, d \,|\, a. By similar reasoning, d \,|\, b which implies that d is a common divisor of a \text{ and } b.

Now if c is an arbitrary positive common divisor of a \text{ and } b, then part (g) of the Theorem stated above, allows us to conclude that c \,|\, (ax + by) i.e., c \,|\, d. From part (f) of the same theorem, we know that if c \,|\, d, then c = |c| \leq |d| = d i.e., d is greater than every other positive common divisor of a \text{ and } b.

We have proved that d is a common divisor of a \text{ and } b and is greater than every other positive common divisor of a \text{ and } b. Therefore, d = \text{gcd}(a, b).

Since d = ax + by and d = \text{gcd}(a, b) we can conclude that \text{gcd}(a, b) = ax + by.

Alternate Definition of the Greatest Common Divisor of Two Integers

The above theorem, which expresses the greatest common divisor of two integers as a linear combination of them, leads to an alternate definition of the greatest common divisor of two integers, which is given below.

Theorem

Let a\text{ and } b be integers, not both zero. For a positive integer d, d = \text{gcd}(a, b) if and only if

(a) d \,|\, a \text{ and } d \,|\, b

(b) Whenever c \,|\, a \text{ and } c \,|\, b, then c \,|\, d

First we will suppose that d = \text{gcd}(a, b) and prove that conditions (a) and (b) hold.

Since d = \text{gcd}(a, b), by definition, d \,|\, a \text{ and } d \,|\, b. Therefore, condition (a) holds.

From part (g) of the theorem we have already proved we know that if c \,|\, a \text{ and } c \,|\, b, then c \,|\, (ax + by) for arbitrary integers x \text{ and } y.

Also, in light of the theorem we just proved above, we can express d as d = ax + by for some integers x \text{ and } y.

Thus c \,|\, d and condition (b) holds.

Next, let us prove the converse i.e., if there exists a positive integer d such that conditions (a) and (b) hold, then d = \text{gcd}(a, b).

From part (b), we know that if c is a positive common divisor of a \text{ and } b, then c \,|\, d. If c \,|\, d, part (f) of the theorem we have already proved lets us conclude that c \leq d. Also, from (a) we know that d is the positive common divisor of a \text{ and } b. Hence, by definition, d = \text{gcd}(a, b).

Relatively Prime Integers

It might happen that 1 \text{ and } -1 are the only common divisors of a given pair to integers a \text{ and } b and consequently, \text{gcd}(a, b) = 1.

For example, \text{gcd}(3, 7) = \text{gcd}(-15, 22) = \text{gcd}(-41, -42) = 1.

This situation occurs offer enough to merit a definition.

Definition

Two integers a \text{ and } b, not both of which are zero, are said to be relatively prime whenever \text{gcd}(a, b) = 1.

The following theorem expresses relatively prime integers as linear combinations.

Theorem

Let a \text{ and } b be integers, not both zero. Then a \text{ and } b are relatively prime if and only if there exist integers x \text{ and } y such that 1 = ax + by.

We prove this theorem by showing that the following assertions are true :

  1. If integers a \text{ and } b, not both of which are zero, are relatively prime then there exist integers x \text{ and } y such that 1 = ax + by
  2. If a \text{ and } b are integers, not both of which are zero, and there exist integers x \text{ and } y such that 1 = ax + by then a \text{ and } b are relatively prime.

If a \text{ and } b are relatively prime, then \text{gcd}(a, b) = 1. The Theorem stated above guarantees the existence of integers x \text{ and } y such that \text{gcd}(a, b) = ax + by. Since \text{gcd}(a, b) = 1, it implies that 1 = ax + by.

As for the converse, i.e., assertion (2), suppose that 1 = ax + by for some choice of x \text{ and } y and that d = \text{gcd}(a, b). Since d \,|\, a \text{ and } d \,|\, b, part (g) of the Theorem stated above yields d \,|\, (ax + by). Since ax + by = 1, d \,|\, 1. Since d is a positive integer, part (b) of the Theorem stated above lets us conclude that d = 1. Since d = \text{gcd}(a, b) and d = 1, it follows that \text{gcd}(a, b) = 1. Therefore, a \text{ and } b are relatively prime.

As an aside, let us consider an alternate way of proving the converse.

Suppose there exists integers x \text{ and } y such that 1 = ax + by. The Theorem stated above guarantees the existence of integers x \text{ and } y such that \text{gcd}(a, b) = ax + by. Therefore, \text{gcd}(a, b) = 1, which implies that a \text{ and } b are relatively prime.

Is this proof correct? Unfortunately, not. The fallacy in this argument is the assumption that the integers x \text{ and } y in 1 = ax + by is the same as the integers x \text{ and } y in \text{gcd}(a, b) = ax + by. However, this is incorrect since \text{gcd}(a, b) could equal ax^\prime + by^\prime such that x \neq x^\prime \text{ and } y \neq y^\prime.

The point of this discussion is that while writing proofs one should always guard against making such fallacious arguments.

Application

We will now apply this theorem to prove an interesting observation made by Pythagoras.

Pythagoras. The number \sqrt{2} is irrational.

Before proving this assertion, we will elucidate a bit about its origin and significance.

An irrational number is a number that cannot be expressed as a ratio of two integers. That is, any number that cannot be written in the form \dfrac{p}{q}, where p is an integer and q is a non-zero integer is an irrational number.

Interestingly, the first irrational number to be discovered was \sqrt{2}. The Pythagoreans were an ancient Greek philosophical university and religious brotherhood which originated in the sixth century B.C. They stumbled upon \sqrt{2} as the length of a diagonal of a square with side lengths 1. They held the doctrine that all is number and viewed numbers, particularly whole numbers i.e., integers as fundamental to understanding the universe. Since they believed that everything could be explained by relationship between integers, when it was proven that the number \sqrt{2} did not fit this doctrine, they killed the man who proved it. 😱 The Greeks then rationalized this discrepancy by supposing that \sqrt{2} was an anomaly of the square. However, when they discovered \sqrt{3}, they ultimately had to confront the existence of irrational numbers, which complicated their understanding of the universe.

Now that we know about the macabre history of \sqrt{2} lets prove that it is irrational.

Proof. Suppose \sqrt{2} is not irrational. Then it can be expressed as a ratio of two integers i.e.,

\sqrt{2} = \frac{a}{b}

with \text{gcd}(a, b) = 1.

Consequently, it follows from the theorem we have just proved that there must exist integers r \text{ and } s satisfying ar + bs = 1.

As a result,

\sqrt{2} = \sqrt{2} \cdot 1 = \sqrt{2} (ar + bs)

Since \sqrt{2} = \dfrac{a}{b}, it follows that

a = \sqrt{2} b \quad \quad \text{ and } \quad \quad b = \frac{a}{\sqrt{2}}

Therefore,

\sqrt{2} = \sqrt{2} (ar + bs) = \sqrt{2} \bigg(\sqrt{2} br + \frac{a}{\sqrt{2}}s\bigg) = 2br + as

Since 2br + as is the sum of products of integers, it is an integer. Hence, this representation of \sqrt{2} leads us to conclude that \sqrt{2} is an integer, which is an obvious impossibility.

Hence, our supposition that \sqrt{2} is not irrational leads to a contradiction. Since \sqrt{2} can only be either irrational or not irrational and \sqrt{2} being not irrational leads to a contradiction, it must be the case that \sqrt{2} is irrational.

This theorem we have just proved leads to an observation stated below.

Corollary 1

If a \text{ and } b are integers, not both of which are zero, and \text{gcd}(a, b) = d, then \text{gcd}(a/d, b/d) = 1.

The Theorem stated above guarantees the existence of integers x \text{ and } y such that \text{gcd}(a, b) = ax + by. Since \text{gcd}(a, b) = d, it follows that

d = ax + by

Dividing each side of this equation by d, we get

1 = \Big(\frac{a}{d}\Big)x + \Big(\frac{b}{d}\Big)y

Since d is the greatest common divisor of a \text{ and } b, d \,|\, a \text{ and } d \,|\, b. Therefore, \dfrac{a}{d} \text{ and } \dfrac{b}{d} are integers. Hence by the theorem stated above, \dfrac{a}{d} \text{ and } \dfrac{b}{d} are relatively prime i.e., \text{gcd}(a/d, b/d) = 1.

For example, \text{gcd}(21, -49) = 7 and \text{gcd}(21/7, -49/7) = \text{gcd}(3, -7) = 1 which concurs with the statement of the corollary.

If a \,|\, c \text{ and } b \,|\, c, does it imply that ab \,|\, c ?

In order to prove or disprove this assertion, we will first try to find a counterexample that disproves this claim. If we are unable to do so, then we will try and prove the assertion.

For example, when a = 4, b = 6 \text{ and } c = 12, 4 \,|\, 12 \text{ and } 6 \,|\, 12 but 4 \cdot 6 \nmid 12.

However, when a = 2, b = 3 \text{ and } c = 12, 2 \,|\, 12 \text{ and } 3 \,|\, 12 and 2 \cdot 3 \,|\, 12.

We can see that when a \text{ and } b are relatively prime, then the assertion is true. This leads us to corollary 2.

Corollary 2

If a \,|\, c \text{ and } b \,|\, c, with \text{gcd}(a, b) = 1, then ab \,|\, c.

Since a \,|\, c \text{ and } b \,|\, c, there exist integers r \text{ and } s such that c = ar = bs.

The relation \text{gcd}(a, b) = 1 implies that

1 = ax + by

for some choice of integers x \text{ and } y.

Multiplying the above equation by c, we get

c = 1 \cdot c= (ax + by)c = acx + bcy

Making the appropriate substitutions on the right hand side of the equation yields,

c = a(bs)x + b(ar)y = ab(sx + ry)

which as a divisibility statement can be written as ab \,|\, c.

Our next theorem which also relates to integers a \text{ and } b being relatively prime is of fundamental importance.

Theorem

Euclid’s Lemma

If a \,|\, bc, with \text{gcd}(a, b) = 1, then a \,|\, c.

Theorem states that if \text{gcd}(a, b) = 1, then there exist integers x \text{ and } y such that

1 = ax + by

Multiplying the above equation by c, we get

c =  1 \cdot c = (ax + by)c = acx + bcy

Since a \,|\, ac and a \,|\, bc, from part (g) of the Theorem stated above we can infer that a \,|\, (acx + bcy).

Since c = acx + bcy, a \,|\, c.

Next Topic : The Euclidean Algorithm

Categories
mathematics Number Theory

Number Theory Primer : The Division Algorithm

authored by Premmi and Beguène

Previous Topic : Well-Ordering Principle

Division Algorithm

The Division Algorithm establishes a relationship between two integers by asserting that an integer a can be divided by a positive integer b in such a way that the remainder is lesser than b. The usefulness of the Division Algorithm lies in its ability to allow us to prove assertions about all the integers by considering only a finite number of cases.

Intuition

Consider integers of the form 3q. These integers are shown in red on the number line below. We can see that these multiples of 3 are evenly spaced on the number line i.e., there are exactly two integers between each consecutive pair of multiples of 3.

As shown below, by adding 0, 1 \text{ or } 2 to each of these multiples of 3 we can cover every integer on the number line.

We can see from the number line shown above that from each integer on the number line there is only one multiple of 3 we can get by subtracting 0, 1 \text{ or } 2. Expressed in symbols, from each integer a we can subtract exactly one possible integer r, \text{ where } 0 \leq r \leq 2, to get only one possible multiple of 3. That is,

a - r = 3q

Rearranging this equation, we get

a = 3q + r

That is, there is only one possible ordered pair of integers (q, r) such that 0 \leq r \leq 2 \text{ and }a = 3q + r.

For every integer a, there is exactly one pair of integers (q, r) such that 0 \leq r \leq 2 \text{ and }a = 3q + r.

This is a very important idea in number theory which will help us prove many facts about integers.

The exact statement of this idea is the theorem below.

Theorem

The Division Algorithm states that for any integer a and a positive integer b there exists exactly one pair of integers q \text{ and } r such that

a = qb + r

where 0 \leq r < b.

The integers q \text{ and } r are called, respectively, the quotient and remainder in the division of a by b. We call a, the dividend and b, the divisor.

The Division Algorithm is also sometimes called the Division Theorem.

We can also think of r as the smallest integer we can subtract from a to get a multiple of b.

Proof

In order to prove this theorem we will show that

  1. there exists a nonnegative integer r such that r = a - qb \text{ and } 0 \leq r < b
  2. q \text{ and } r are unique i.e., there exists exactly one pair of integers (q, r) such that 0 \leq r < b \text{ and }a = qb + r

Let us first prove the existence of a nonnegative integer r such that r = a - qb \text{ and } 0 \leq r < b.

We will first prove that the nonnegative integer r = a - qb exists by constructing a set of non-negative integers of the form a - xb, where x is an integer and proving that this set is nonempty. Then from the Well-Ordering Principle we know that such a nonempty set of nonnegative integers will have a least element or equivalently, a smallest integer. We will next prove that this smallest integer is r and it is less than b i.e., 0 \leq r < b..

We begin by proving that the set

S = \{a-xb \,|\,  x \text{ is an integer and } a-xb \geq 0\}

is nonempty.

In order to prove that S is nonempty it suffices to find just one value of x such that a - xb is nonnegative.

Since b \geq 1, it follows that |a| b \geq |a|.

Therefore,

a - (-|a|b) = a + |a|b \geq a + |a| \geq 0

Hence, we have shown that for x = -|a|, a - xb \in S. Since S is a nonempty set of nonnegative integers, by the well-ordering principle the set S contains a least element i.e., a smallest integer. Let us call this integer r.

Since r \in S, by definition of S there exists an integer q such that

r = a - qb 

where 0 \leq r.

We will now prove that r < b.

If this were not the case then, r \geq b and

a -qb \geq b

Adding -b to both sides of the equation, we get

a - qb -b \geq 0

Simplifying the above equation, we get

a - (q+1)b \geq 0

The integer a - (q+1)b has the right form to belong to S. But,

a - (q+1)b = a - qb -b = r - b < r

which contradicts our choice of r as the smallest member of S.

Since our assumption that r \geq b leads to a contradiction, it must be the case that r < b.

So far we have proved the existence of a nonnegative integer r such that r = a - qb \text{ and } 0 \leq r < b.

We will next prove that there exists exactly one pair of integers (q, r) such that 0 \leq r < b \text{ and }a = qb + r.

Suppose that integer a has two representations in the desired form, namely,

a = qb + r = q'b + r'

where 0 \leq r < b and 0 \leq r' < b.

Rearranging the terms of the equation,

qb + r = q'b + r'

we get,

r' - r = b(q-q')

Since the absolute value of a product is equal to the product of the absolute values,

|r' - r| = b|q-q'|

Multiplying the equation 0 \leq r < b with -1, we get

0 \geq -r > -b \implies -b < -r \leq 0

Adding the two equations,

\begin{equation*} 
\begin{split}
0  & \leq r'  < b \,\text{ and }\\
-b & < -r  \leq 0 \\
\end{split}
\end{equation*} 

we get,

-b < r'-r < b

Therefore,

|r'-r| < b

Since,

\begin{equation*} 
\begin{split}
|r' - r| &= b|q-q'| \, \text{ and }\\
|r'-r| &< b \\
\end{split}
\end{equation*} 

we get,

b|q-q'| < b \implies |q-q'| < 1

Therefore,

0 \leq |q-q'| < 1

Since |q-q'| is a nonnegative integer, the only possibility is that |q-q'| = 0 and hence, q = q' which in turn results in r = r'. This concludes our proof.

Loosening the restriction on b

Since a = qb + r, when b < 0 i.e., when b is negative we can get the same a by negating q. Therefore, we can obtain a more general version of the Division Algorithm by loosening the restriction that b must be positive and only requiring that it be nonzero i.e. b \neq 0.

Corollary

If a \text{ and } b are integers, with b \neq 0, then there exists exactly one pair of integers q \text{ and } r such that

a = qb + r

where 0 \leq r < |b|.

Proof

By the Division Algorithm proved above, if a \text{ and } b are integers, with b > 0, then there exists exactly one pair of integers q' \text{ and } r such that

a = q'b + r

where 0 \leq r < b.

Since b = |b|, it follows that a = q'b + r where 0 \leq r < |b|.

When b < 0, |b| = -b > 0.

Substituting b \text{ with } |b| in the above equation, we get

a = q'|b| + r

where 0 \leq r < |b|.

Let q'' = -q'. This implies that q' = -q''.

Substituting q' = -q'' \text{ and } |b| = -b in the above equation, we get

a = (-q'')(-b) + r

where 0 \leq r < |b|.

Therefore, when b < 0, a = q''b + r where 0 \leq r < |b|.

We will summarize what we have proved so far.

If a \text{ and } b are integers, with b > 0, then there exists exactly one pair of integers q' \text{ and } r such that

a = q'b + r

where 0 \leq r < |b|.

If a \text{ and } b are integers, with b < 0, then there exists exactly one pair of integers q'' \text{ and } r such that

a = q''b + r

where 0 \leq r < |b|.

Combining the above two statements we get the following statement.

If a \text{ and } b are integers, with b \neq 0, then there exists exactly one pair of integers q \text{ and } r such that

a = qb + r

where 0 \leq r < |b|.

This concludes our proof.

Applications of the Division Algorithm

The applications of the Division Algorithm is much more interesting than the algorithm itself as it helps us prove many assertions about integers.

We will see some numerical (American style 😏) as well as abstract (French style 😰) examples that illustrate the usefulness of the Division Algorithm.

Example 1

Suppose we are asked to find the greatest three-digit number that leaves a remainder of 6 when divided by 11. How would we go about finding such a number?

From the Division Algorithm we know that for an integer a and another integer b > 0 there exists unique integers q \text{ and } r such that a = qb + r \text{ and } 0 \leq r < b.

Here a \leq 999, b = 11 \text{ and } r = 6.

Therefore, a = 11q + 6 \leq 999.

Hence, q \leq \dfrac{993}{11} \implies q \leq 90\dfrac{3}{11}.

In order to get the largest possible a we need the largest possible q. Therefore, q = 90 and consequently, a = 11 \times 90 + 6 = 996.

Therefore the greatest three-digit number that leaves a remainder of 6 when divided by 11 is 996.

Next let us consider a more sophisticated example of an application of the Division Algorithm.

Example 2

Suppose we are asked to prove that 3n^2-1 (where n is an integer) can never be a perfect square. How would we go about proving this assertion?

Since the expression 3n^2-1 is a multiple of 3 from which 1 has been subtracted, from the Division Algorithm we know that we can express any integer as multiple of 3 plus some remainder ranging from 0 \text{ to } 2. We will square integers of this form and prove that they can never be of the form 3n^2-1.

From the Division Algorithm we know that when b = 3, the possible remainders are r = 0, r = 1 \text{ and } r = 2. When r = 0, the integer a has the form 3q, when r = 1, the integer a has the form 3q + 1 and when r = 2, the integer a has the form 3q + 2.

Hence, by the Division Algorithm, any integer is representable as one of the three forms, namely, 3q, 3q + 1 \text{ and } 3q+2. Therefore, by squaring each of these possible forms of integer a, we get all possible forms the square of an integer may assume.

  1. When a = 3q

    a^2 = (3q)^2 = 9q^2 = 3 \cdot 3q^2 = 3k, where k = 3q^2 is an integer.

  2. When a = 3q + 1

    a^2 = (3q +1)^2 = 9q^2 + 6q + 1 = 3(3q^2 + 2q) + 1 = 3k + 1, where k = 3q^2 + 2q is an integer.

  3. When a = 3q + 2

    a^2 = (3q +2)^2 = 9q^2 + 12q + 4 = 3(3q^2 + 4q) + 3 + 1 = 3(3q^2 + 4q + 1) + 1= 3k + 1, where k = 3q^2 + 4q + 1 is an integer.

Hence, the square of any integer is either of the form 3k \text{ or } 3k + 1.

We have to prove that the square of an integer cannot be of the form 3n^2-1. Suppose the square of an integer is of this form. Then, if a^2 = 3k \implies 3n^2-1 = 3k\implies 3(n^2 - k) = 1 \implies n^2 - k = \dfrac{1}{3}. This is impossible since an integer subtracted from another integer cannot result in a fraction.

Similarly, if a^2 = 3k + 1 \implies 3n^2-1 = 3k + 1 \implies 3(n^2-k) = 2 \implies n^2 - k = \dfrac{2}{3}, which is impossible.

Therefore, our assumption that 3n^2-1 is a perfect square is incorrect and hence we have proved that 3n^2-1 can never be a perfect square.

Next Topic : The Greatest Common Divisor

Categories
mathematics Number Theory Set Theory

Number Theory Primer : Well-Ordering Principle

authored by Premmi and Beguène

Well-Ordering Principle

Every nonempty set S of nonnegative integers contains a least element, i.e., there is some integer a in S such that a \leq b for all b\text{'s} belonging to S.

Since we cannot prove the well-ordering principle by using properties that the nonnegative integers satisfy under the operations of addition and multiplication, we will consider the well-ordering principle as an axiom. Surprisingly enough, the well-ordering principle is logically equivalent to the principle of mathematical induction. This means that we if consider one as an axiom, you can use that to prove the other. We will prove this logical equivalence at a later time.

We will use the well-ordering principle in number theory to prove many important assertions with regard to integers. For example, the Division Algorithm uses the well-ordering principle to prove the existence of a nonnegative integer remainder greater than or equal to zero when an integer is divided by another positive integer.

Next Topic : Division Algorithm