Buchberger Algorithm Formalization

3 Gröbner Bases

3.1 Polynomial Reductions

Theorem 15 Division Algorithm for Multivariate Polynomials
#

Let \(P\) be a subset of \(K[X]\) and \(f\in K[X]\). Then there exists a normal form \(g\in K[X]\) of \(f\) modulo \(P\) and a family \(\mathcal{F}=\{ q_{p}\} _{p\in P}\) of elements of \(K[X]\) with

\[ f=\sum _{p\in P}q_{p}\, p + g \quad \text{and}\quad \max \bigl\{ \operatorname {LT}(q_{p}p)\mid p\in P,\; q_{p}p\neq 0\bigr\} \le \operatorname {LT}(f). \]

If \(P\) is finite, the ground field is computable, and the term order on \(T\) is decidable, then \(g\) and \(\{ q_{p}\} _{p\in P}\) can be computed from \(f\) and \(P\).

Proof

3.2 Gröbner Bases-Existence and Uniqueness

Definition 16 Initial Ideal
#

Let \(I\subseteq k[x_1,\dots ,x_n]\) be an ideal other than \(\{ 0\} \), and fix a monomial ordering on \(k[x_1,\dots ,x_n]\). Then:

  1. We denote by

    \[ \mathrm{LT}(I) = \{ \, c\, x^\alpha \mid \exists \, f\in I\setminus \{ 0\} \text{ with }\mathrm{LT}(f)=c\, x^\alpha \} . \]
  2. We denote by \(\langle \mathrm{LT}(I)\rangle \) the ideal generated by the elements of \(\mathrm{LT}(I)\).

Let \(I \subseteq k[x_1, \ldots , x_n]\) be an ideal different from \(\{ 0 \} \).

  1. \(\langle \operatorname {LT}(I)\rangle \) is a monomial ideal.

  2. There are \(g_1, \ldots , g_t \in I\) such that \(\langle \operatorname {LT}(I)\rangle = \langle \operatorname {LT}(g_1), \ldots , \operatorname {LT}(g_t)\rangle \).

Proof
  1. The leading monomials \(\text{LM}(g)\) of elements \(g \in I \setminus \{ 0\} \) generate the monomial ideal \(\langle \text{LM}(g) \mid g \in I \setminus \{ 0\} \rangle \). Since \(\text{LM}(g)\) and \(\text{LT}(g)\) differ by a nonzero constant, this ideal equals \(\langle \text{LT}(g) \mid g \in I \setminus \{ 0\} \rangle = \langle \text{LT}(I) \rangle \). Thus, \(\langle \text{LT}(I) \rangle \) is a monomial ideal.

  2. Since \(\langle \text{LT}(I) \rangle \) is generated by the monomials \(\text{LM}(g)\) for \(g \in I \setminus \{ 0\} \), Dickson’s Lemma tells us that \(\langle \text{LT}(I) \rangle = \langle \text{LM}(g_1), \dots , \text{LM}(g_t) \rangle \) for finitely many \(g_1, \dots , g_t \in I\). Since \(\text{LM}(g_i)\) differs from \(\text{LT}(g_i)\) by a nonzero constant, it follows that \(\langle \text{LT}(I) \rangle = \langle \text{LT}(g_1), \dots , \text{LT}(g_t) \rangle \). This completes the proof.

Definition 18
#

Fix a monomial order \({\gt}\) on the polynomial ring \(k[x_1, \dots , x_n]\). A finite subset \(G = \{ g_1, \dots , g_t\} \) of an ideal \(I \subseteq k[x_1, \dots , x_n]\) different from \(\{ 0\} \) is said to be a Gröbner basis (or standard basis) for \(I\) if the ideal generated by the leading terms of the elements in \(G\) is equal to the ideal generated by the leading terms of all elements in \(I\). That is,

\[ \langle \operatorname {LT}(g_1), \dots , \operatorname {LT}(g_t) \rangle = \langle \operatorname {LT}(I) \rangle , \]

where \(\operatorname {LT}(I) = \{ \operatorname {LT}(f) \mid f \in I \setminus \{ 0\} \} \). Using the convention that \(\langle \emptyset \rangle = \{ 0\} \), we define the empty set \(\emptyset \) to be the Gröbner basis of the zero ideal \(\{ 0\} \).

Proposition 19

Let \(I \subseteq k[x_1,\dots ,x_n]\) be an ideal and let \(G = \{ g_1,\dots ,g_t\} \) be a Gröbner basis for \(I\). Then given \(f \in k[x_1,\dots ,x_n]\) there is a unique \(r \in k[x_1,\dots ,x_n]\) with the following two properties:

  1. No term of \(r\) is divisible by any of \(\mathrm{LT}(g_1),\dots ,\mathrm{LT}(g_t)\).

  2. There is \(g\in I\) such that \(f = g + r\).

In particular, \(r\) is the remainder on division of \(f\) by \(G\) no matter how the elements of \(G\) are listed when using the division algorithm.

Proof

The division algorithm gives \(f = q_1 g_1 + \cdots + q_t g_t + r\), where \(r\) satisfies (i). We can also satisfy (ii) by setting \(g = q_1 g_1 + \cdots + q_t g_t \in I\). This proves the existence of \(r\). To prove uniqueness, suppose \(f = g+r = g'+r'\) satisfy (i) and (ii). Then \(r-r' = g' - g \in I\), so that if \(r \ne r'\), then \(\text{LT}(r-r') \in \langle \text{LT}(I) \rangle = \langle \text{LT}(g_1), \dots , \text{LT}(g_t) \rangle \). By Lemma 12, it follows that \(\text{LT}(r-r')\) is divisible by some \(\text{LT}(g_i)\). This is impossible since no term of \(r, r'\) is divisible by one of \(\text{LT}(g_1), \dots , \text{LT}(g_t)\). Thus \(r-r'\) must be zero, and uniqueness is proved. The final part of the proposition follows from the uniqueness of \(r\).

Corollary 20 Ideal Membership Problem

Let \(G = \{ g_1, \dots , g_t\} \) be a Gröbner basis for an ideal \(I \subseteq k[x_1, \dots , x_n]\) (with respect to a given monomial order \({\gt}\)) and let \(f \in k[x_1, \dots , x_n]\). Then \(f \in I\) if and only if the remainder on division of \(f\) by \(G\) is zero.

\[ f \in I \iff \operatorname {rem}(f, G) = 0. \]
Proof

If the remainder is zero, then we have already observed that \(f \in I\). Conversely, given \(f \in I\), then \(f = f + 0\) satisfies the two conditions of Proposition 19. It follows that \(0\) is the remainder of \(f\) on division by \(G\).

Definition 21
#

We will write \(\overline{f}^F\) for the remainder(normalform) on division of \(f\) by the ordered \(s\)-tuple

\[ F \; =\; (f_1,\dots ,f_s). \]

If \(F\) is a Gröbner basis for the ideal \(\langle f_1,\dots ,f_s\rangle \), then by Proposition 1 we may regard \(F\) as a set (without any particular order).

Definition 22
#

Let \(f,g\in k[x_1,\dots ,x_n]\) be nonzero polynomials.

  1. If \(\operatorname {multideg}(f)=\alpha \) and \(\operatorname {multideg}(g)=\beta \), then let

    \[ \gamma = (\gamma _1,\dots ,\gamma _n), \quad \gamma _i = \max (\alpha _i,\beta _i) \quad \text{for each }i. \]

    We call \(x^\gamma \) the least common multiple of \(\operatorname {LM}(f)\) and \(\operatorname {LM}(g)\), written

    \[ x^\gamma \; =\; \operatorname {lcm}\bigl(\operatorname {LM}(f),\, \operatorname {LM}(g)\bigr). \]
  2. The \(S\)-polynomial of \(f\) and \(g\) is the combination

    \[ S(f,g) \; =\; \frac{x^\gamma }{\mathrm{LT}(f)}\, f \; -\; \frac{x^\gamma }{\mathrm{LT}(g)}\, g. \]

Suppose we have a sum \(\sum _{i=1}^s p_i,\) where \(\operatorname {multideg}(p_i)=\delta \in \mathbb Z_{\ge 0}^n\) for all \(i\). If \(\operatorname {multideg}\Bigl(\sum _{i=1}^s p_i\Bigr){\lt}\delta ,\) then \(\sum _{i=1}^s p_i\) is a linear combination, with coefficients in \(k\), of the \(S\)-polynomials \(S(p_j,p_l)\quad \text{for }1\le j,\, l\le s\). Furthermore, each \(S(p_j,p_l)\) has multidegree \({\lt}\delta \).

Proof

Let \(d_i = \text{LC}(p_i)\), so that \(d_i x^\delta \) is the leading term of \(p_i\). Since the sum \(\sum _{i=1}^s p_i\) has strictly smaller multidegree, it follows easily that \(\sum _{i=1}^s d_i = 0\).

Next observe that since \(p_i\) and \(p_j\) have the same leading monomial, their S-polynomial reduces to

\begin{equation} \tag {1} S(p_i, p_j) = \frac{1}{d_i}p_i - \frac{1}{d_j}p_j. \end{equation}
1

It follows that

\begin{equation} \tag {2} \begin{aligned} \sum _{i=1}^{s-1} d_i S(p_i, p_s) & = d_1\left(\frac{1}{d_1}p_1 - \frac{1}{d_s}p_s\right) + d_2\left(\frac{1}{d_2}p_2 - \frac{1}{d_s}p_s\right) + \cdots \\ & = p_1 + p_2 + \cdots + p_{s-1} - \frac{1}{d_s}(d_1 + \cdots + d_{s-1})p_s. \end{aligned} \end{equation}
2

However, \(\sum _{i=1}^s d_i = 0\) implies \(d_1 + \cdots + d_{s-1} = -d_s\), so that (2) reduces to

\[ \sum _{i=1}^{s-1} d_i S(p_i, p_s) = p_1 + \cdots + p_{s-1} + p_s. \]

Thus, \(\sum _{i=1}^s p_i\) is a sum of S-polynomials of the desired form, and equation (1) makes it easy to see that \(S(p_i, p_j)\) has multidegree \({\lt}\delta \). The lemma is proved.

Suppose that \(p_i = c_i x^{\alpha ^{(i)}} g_i\) and \(p_j = c_j x^{\alpha ^{(j)}} g_j\) have the same multidegree \(\delta \). Then

\[ S(p_i, p_j) = x^{\delta - \gamma _{ij}} S(g_i, g_j), \]

where \(x^{\gamma _{ij}} = \operatorname {lcm}(\operatorname {LM}(g_i), \operatorname {LM}(g_j))\).

Proof

By hypothesis, \(\operatorname {multideg}(p_i) = \operatorname {multideg}(p_j) = \delta \), which implies \(\delta = \alpha ^{(i)} + \operatorname {multideg}(g_i)\) and \(\delta = \alpha ^{(j)} + \operatorname {multideg}(g_j)\). The leading terms are \(\operatorname {LT}(p_i) = c_i \operatorname {LC}(g_i) x^\delta \) and \(\operatorname {LT}(p_j) = c_j \operatorname {LC}(g_j) x^\delta \). Let \(\gamma _{ij}\) be the exponent of \(\operatorname {lcm}(\operatorname {LM}(g_i), \operatorname {LM}(g_j))\).

On the one hand, the left-hand side simplifies to:

\begin{align*} S(p_i, p_j) & = \frac{x^\delta }{\operatorname {LT}(p_i)} p_i - \frac{x^\delta }{\operatorname {LT}(p_j)} p_j \\ & = \frac{x^\delta }{c_i \operatorname {LC}(g_i) x^\delta } (c_i x^{\alpha ^{(i)}} g_i) - \frac{x^\delta }{c_j \operatorname {LC}(g_j) x^\delta } (c_j x^{\alpha ^{(j)}} g_j) \\ & = \frac{1}{\operatorname {LC}(g_i)} x^{\alpha ^{(i)}} g_i - \frac{1}{\operatorname {LC}(g_j)} x^{\alpha ^{(j)}} g_j. \end{align*}

On the other hand, the right-hand side expands to:

\begin{align*} x^{\delta - \gamma _{ij}} S(g_i, g_j) & = x^{\delta - \gamma _{ij}} \left( \frac{x^{\gamma _{ij}}}{\operatorname {LT}(g_i)} g_i - \frac{x^{\gamma _{ij}}}{\operatorname {LT}(g_j)} g_j \right) \\ & = \frac{x^{\delta }}{\operatorname {LC}(g_i)x^{\operatorname {multideg}(g_i)}} g_i - \frac{x^{\delta }}{\operatorname {LC}(g_j)x^{\operatorname {multideg}(g_j)}} g_j \\ & = \frac{1}{\operatorname {LC}(g_i)} x^{\delta - \operatorname {multideg}(g_i)} g_i - \frac{1}{\operatorname {LC}(g_j)} x^{\delta - \operatorname {multideg}(g_j)} g_j \\ & = \frac{1}{\operatorname {LC}(g_i)} x^{\alpha ^{(i)}} g_i - \frac{1}{\operatorname {LC}(g_j)} x^{\alpha ^{(j)}} g_j. \end{align*}

The two sides are equal, which completes the proof.

Theorem 25 Buchberger’s Criterion

Let \(I\) be a polynomial ideal in \(k[x_1, \dots , x_n]\). Then a basis \(G = \{ g_1, \dots , g_t\} \) of \(I\) is a Gröbner basis for \(I\) (with respect to a given monomial order \({\gt}\)) if and only if for all pairs \(i \neq j\), the remainder on division of the \(S\)-polynomial \(S(g_i, g_j)\) by \(G\) (listed in some order) is zero.

\[ \forall i \neq j, \quad \operatorname {rem}(S(g_i, g_j), G) = 0. \]
Proof

\(\Rightarrow \): If \(G\) is a Gröbner basis, then since \(S(g_i, g_j) \in I\), the remainder on division by \(G\) is zero by Corollary 20.

\(\Leftarrow \): Let \(f \in I\) be nonzero. We will show that \(\text{LT}(f) \in \langle \text{LT}(g_1), \dots , \text{LT}(g_t)\rangle \) as follows. Write

\[ f = \sum _{i=1}^t h_i g_i, \quad h_i \in k[x_1, \dots , x_n]. \]

From Lemma 10, it follows that

\begin{equation} \tag {3} \text{multideg}(f) \le \max (\text{multideg}(h_i g_i) \mid h_i g_i \ne 0). \end{equation}
3

The strategy of the proof is to pick the most efficient representation of \(f\), meaning that among all expressions \(f = \sum _{i=1}^t h_i g_i\), we pick one for which

\[ \delta = \max (\text{multideg}(h_i g_i) \mid h_i g_i \ne 0) \]

is minimal. The minimal \(\delta \) exists by the well-ordering property of our monomial ordering. By (3), it follows that \(\text{multideg}(f) \le \delta \).

If equality occurs, then \(\text{multideg}(f) = \text{multideg}(h_i g_i)\) for some \(i\). This easily implies that \(\text{LT}(f)\) is divisible by \(\text{LT}(g_i)\). Then \(\text{LT}(f) \in \langle \text{LT}(g_1), \dots , \text{LT}(g_t)\rangle \), which is what we want to prove.

It remains to consider the case when the minimal \(\delta \) satisfies \(\text{multideg}(f) {\lt} \delta \). We will use \(S(g_i, g_j) \overline{\hspace{1.5em}}^G 0\) for \(i \ne j\) to find a new expression for \(f\) that decreases \(\delta \). This will contradict the minimality of \(\delta \) and complete the proof.

Given an expression \(f = \sum _{i=1}^t h_i g_i\) with minimal \(\delta \), we begin by isolating the part of the sum where multidegree \(\delta \) occurs:

\begin{align} \tag {4} f & = \sum _{\text{multideg}(h_i g_i) = \delta } h_i g_i + \sum _{\text{multideg}(h_i g_i) {\lt} \delta } h_i g_i \\ & = \sum _{\text{multideg}(h_i g_i) = \delta } \text{LT}(h_i) g_i + \sum _{\text{multideg}(h_i g_i) = \delta } (h_i - \text{LT}(h_i)) g_i + \sum _{\text{multideg}(h_i g_i) {\lt} \delta } h_i g_i. \nonumber \end{align}

The monomials appearing in the second and third sums on the second line all have multidegree \({\lt}\delta \). Then \(\text{multideg}(f) {\lt} \delta \) means that the first sum on the second line also has multidegree \({\lt}\delta \).

The key to decreasing \(\delta \) is to rewrite the first sum in two stages: use Lemma 23 to rewrite the first sum in terms of S-polynomials, and then use \(S(g_i, g_j) \overline{\hspace{1.5em}}^G 0\) to rewrite the S-polynomials without cancellation.

To express the first sum on the second line of (4) using S-polynomials, note that

\begin{equation} \tag {5} \sum _{\text{multideg}(h_i g_i) = \delta } \text{LT}(h_i) g_i \end{equation}
5

satisfies the hypothesis of Lemma 23 since each \(p_i = \text{LT}(h_i) g_i\) has multidegree \(\delta \) and the sum has multidegree \({\lt}\delta \). Hence, the first sum is a linear combination with coefficients in \(k\) of the S-polynomials \(S(p_i, p_j)\). In Exercise 24, you will verify that

\[ S(p_i, p_j) = x^{\delta - \gamma _{ij}} S(g_i, g_j), \]

where \(x^{\gamma _{ij}} = \text{lcm}(\text{LM}(g_i), \text{LM}(g_j))\). It follows that the first sum (5) is a linear combination of \(S(g_i, g_j)\) for certain pairs \((i, j)\).

Consider one of these S-polynomials \(S(g_i, g_j)\). Since \(S(g_i, g_j) \overline{\hspace{1.5em}}^G 0\), the division algorithm (Theorem 3 of §3) gives an expression

\begin{equation} \tag {6} S(g_i, g_j) = \sum _{l=1}^t A_l g_l, \end{equation}
6

where \(A_l \in k[x_1, \dots , x_n]\) and

\begin{equation} \tag {7} \text{multideg}(A_l g_l) \le \text{multideg}(S(g_i, g_j)) \end{equation}
7

when \(A_l g_l \ne 0\). Now multiply each side of (6) by \(x^{\delta - \gamma _{ij}}\) to obtain

\begin{equation} \tag {8} x^{\delta - \gamma _{ij}} S(g_i, g_j) = \sum _{l=1}^t B_l g_l, \end{equation}
8

where \(B_l = x^{\delta - \gamma _{ij}} A_l\). Then (7) implies that when \(B_l g_l \ne 0\), we have

\begin{equation} \tag {9} \text{multideg}(B_l g_l) \le \text{multideg}(x^{\delta - \gamma _{ij}} S(g_i, g_j)) {\lt} \delta \end{equation}
9

since \(\text{LT}(S(g_i, g_j)) {\lt} \text{lcm}(\text{LM}(g_i), \text{LM}(g_j)) = x^{\gamma _{ij}}\) by Exercise 7.

It follows that the first sum (5) is a linear combination of certain \(x^{\delta - \gamma _{ij}} S(g_i, g_j)\), each of which satisfies (8) and (9). Hence we can write the first sum as

\begin{equation} \tag {10} \sum _{\text{multideg}(h_i g_i) = \delta } \text{LT}(h_i) g_i = \sum _{l=1}^t \tilde{B}_l g_l \end{equation}
10

with the property that when \(\tilde{B}_l g_l \ne 0\), we have

\begin{equation} \tag {11} \text{multideg}(\tilde{B}_l g_l) {\lt} \delta . \end{equation}
11

Substituting (10) into the second line of (4) gives an expression for \(f\) as a polynomial combination of the \(g_i\)’s where all terms have multidegree \({\lt}\delta \). This contradicts the minimality of \(\delta \) and completes the proof.

Definition 26
#

Fix a monomial order and let \(G = \left\{ g_1, \ldots , g_t\right\} \subseteq k[x_1, \ldots , x_n]\). Given \(f \in k[x_1, \ldots , x_n]\), we say that \(f\) reduces to zero modulo \(G\), written \(f \xrightarrow {G} 0\), if \(f\) has a standard representation

\[ f = A_1g_1 + \cdots A_tg_t,\ A_i \in k[x_1, \ldots , x_n], \]

which means that whenever \(A_ig_i \neq 0\), we have

\[ \operatorname {deg}(f) \geq \operatorname {deg}(A_ig_i). \]
Theorem 27 Buchberger’s Algorithm
#

Let \(I = \langle f_1, \ldots , f_s \rangle \ne \{ 0\} \) be a polynomial ideal. Then a Gröbner basis for \(I\) can be constructed in a finite number of steps by the following algorithm:

  • Input : \(F = (f_1, \ldots , f_s)\)

  • Output : A Gröbner basis \(G = (g_1, \ldots , g_t)\) for \(I\), with \(F \subseteq G\)

  • \(G := F\)

  • REPEAT

    • \(G' := G\)

    • FOR each pair \(\{ p, q\} \), \(p \ne q\) in \(G'\) DO

      • \(r := \overline{S(p, q)}^{G'}\)

      • IF \(r \ne 0\) THEN \(G := G \cup \{ r\} \)

  • UNTIL \(G = G'\)

  • RETURN \(G\)

Proof

We begin with some frequently used notation. If \(G = \{ g_1, \ldots , g_t \} \), then \(\langle G \rangle \) and \(\langle \operatorname {LT}(G) \rangle \) will denote the following ideals:

\[ \langle G \rangle = \langle g_1, \ldots , g_t \rangle , \qquad \langle \operatorname {LT}(G) \rangle = \langle \operatorname {LT}(g_1), \ldots , \operatorname {LT}(g_t) \rangle . \]

Turning to the proof of the theorem, we first show that \(G \subseteq I\) holds at every stage of the algorithm. This is true initially, and whenever we enlarge \(G\), we do so by adding the remainder \(r = \overline{S(p,q)}^{G'}\) for \(p,q \in G' \subseteq G\). Thus, if \(G \subseteq I\), then \(p,q \in I\) and, hence, \(S(p,q) \in I\), and since we are dividing by \(G' \subseteq I\), we get \(G \cup \{ r\} \subseteq I\). We also note that \(G\) contains the given basis \(F\) of \(I\), so that \(G\) is actually a basis of \(I\).

The algorithm terminates when \(G = G'\), which means that \(r = \overline{S(p,q)}^{G'} = 0\) for all \(p,q \in G\). Hence \(G\) is a Gröbner basis of \(\langle G \rangle = I\) by Theorem 6 of §6.

It remains to prove that the algorithm terminates. We need to consider what happens after each pass through the main loop. The set \(G\) consists of \(G'\) (the old \(G\)) together with the nonzero remainders of \(S\)-polynomials of elements of \(G'\). Then

\begin{equation} \langle \operatorname {LT}(G') \rangle \subseteq \langle \operatorname {LT}(G) \rangle . \end{equation}
12

Since \(G' \subseteq G\). Furthermore, if \(G' \ne G\), we claim that \(\langle \operatorname {LT}(G') \rangle \) is strictly smaller than \(\langle \operatorname {LT}(G) \rangle \). To see this, suppose that a nonzero remainder \(r\) of an \(S\)-polynomial has been adjoined to \(G\). Since \(r\) is a remainder on division by \(G'\), \(\operatorname {LT}(r)\) is not divisible by the leading terms of elements of \(G'\), and thus \(\operatorname {LT}(r) \notin \langle \operatorname {LT}(G') \rangle \) by Lemma 2 of §4. Yet \(\operatorname {LT}(r) \in \langle \operatorname {LT}(G) \rangle \), which proves our claim.

By (1), the ideals \(\langle \operatorname {LT}(G') \rangle \) from successive iterations of the loop form an ascending chain of ideals in \(k[x_1, \ldots , x_n]\). Thus, the ACC (Theorem 7 of §5) implies that after a finite number of iterations the chain will stabilize, so that \(\langle \operatorname {LT}(G') \rangle = \langle \operatorname {LT}(G) \rangle \) must happen eventually. By the previous paragraph, this implies that \(G' = G\), so that the algorithm must terminate after a finite number of steps.

Lemma 28

Let \(G\) be a Gröbner basis of \(I \subseteq k[x_1, \ldots , x_n]\). Let \(p \in G\) be a polynomial such that \(\operatorname {LT}(p) \in \langle \operatorname {LT}(G \setminus \{ p\} ) \rangle .\) Then \(G \setminus \{ p\} \) is also a Gröbner basis for \(I\).

Proof

We know that \(\langle \operatorname {LT}(G)\rangle = \langle \operatorname {LT}(I)\rangle \). If \(\operatorname {LT}(p) \in \langle \operatorname {LT}(G \setminus \{ p\} )\rangle \), then we have \(\langle \operatorname {LT}(G \setminus \{ p\} )\rangle = \langle \operatorname {LT}(G)\rangle \). By definition, it follows that \(G \setminus \{ p\} \) is also a Gröbner basis for \(I\).