Previous: Short Biographies Next: Conclusions
Home: Table of Contents
5. Computer Algebra Systems
5.1 Introduction - What is a Computer Algebra System?
A Computer Algebra system is a type of software package that is used in manipulation of mathematical formulae. The primary goal of a Computer Algebra system is to automate tedious and sometimes difficult algebraic manipulation tasks. The principal difference between a Computer Algebra system and a traditional calculator is the ability to deal with equations symbolically rather than numerically. The specific uses and capabilities of these systems vary greatly from one system to another, yet the purpose remains the same: manipulation of symbolic equations. Computer Algebra systems often include facilities for graphing equations and provide a programming language for the user to define his/her own procedures.
Computer Algebra systems have not only changed how mathematics is taught at many universities, but have provided a flexible tool for mathematicians worldwide. Examples of popular systems include Maple, Mathematica, and MathCAD. Computer Algebra systems can be used to simplify rational functions, factor polynomials, find the solutions to a system of equation, and various other manipulations. In Calculus, they can be used to find the limit of, symbolically integrate, and differentiate arbitrary equations.
Attempting to expand the equation
using the binomial theorem by hand would be a daunting task, nearly impossible to do without error. However, with the aid of Maple, this equation can be expanded in less than two seconds. Differentiating the result term-by-term can then be performed in milliseconds. The usefulness of such a system is obvious: not only does it act as a time saving device, but problems which simply were not reasonable to perform by hand can be performed in seconds.
Leibniz and Newton developed calculus in terms of algorithmic processes. Computer Algebra systems can now take these methods and remove the human from the process. However, in studying Calculus and even simple algebraic operations, it would seem that computers would be extraordinarily inept at performing such tasks. After all, most of us consider there to be a great deal of problem solving involved in the mathematics taught in grade school and beyond. How is it that a computer, a mindless composition of binary digits, is able to perform such complex tasks? It would seem that the computer would be unsuitable for such tasks, but the success of popular Algebra software packages show that this is not the case. On the contrary, Computer Algebra systems often know how to perform more operations on equations than the user!
Rather than discuss the many ways that Computer Algebra systems have altered the education and use of Calculus, we were most intrigued by how these systems actually worked. Our approach was to begin by researching the theories and issues involved in creating a Computer Algebra system. Coinciding with our research, we began writing our own Computer Algebra system in C++. The rest of this section is dedicated to a summary of our research and the specifics of the implementation we chose.
5.2 Data Structures
In order for a computer program to even begin manipulating a symbolic equation, it first must store that equation somewhere in memory. At the heart of any Computer Algebra system is a data structure (or combination of data structures) responsible for describing a mathematical equation. Equations can exist in several variables, contain references to other functions, and can themselves be rational functions. There is no perfect solution to a data structure representation of an equation. One representation might be efficient for certain mathematical operations, but poor for others. Another representation might be inefficient in time and space complexity, but easy to program. Such tradeoffs need to be considered when choosing a representation; there is no absolute answer to the problem.
5.2.2 Polynomials in one variable - Coefficients
In order to begin a discussion of the issues involved in storing a symbolic equation, polynomials of single-variables will be considered. That is, equations that are only of the form where is an integer or fractional coefficient. Even in storing such a simple equation, there are numerous issues involved in choosing a data structure.
The coefficient itself can not be stored as a simple data type. The amount of storage for an integer in most languages is typically 16 or 32-bits. In a 32-bit representation of an integer, only 232 or approximately 4.3 billion different numbers can be represented. Though this might seem large, for many mathematical operations (such as the simplification described in the Introduction), numbers of much larger size must be possible to represent. Therefore, a numeric data type that allows for expansive growth in representation (e.g. a data type that grows dynamically with the size of the number) needs to be used. For fractional coefficients, simply storing the numerator and denominator separately in two such data types is adequate.
5.2.3 Polynomials in one variable - Terms
Having dealt with the issue of storing coefficients, the more significant problem of how to actually store each term must be dealt with. The first issue to contend with is that of finding a canonical form. That is, consider that a user types in the following single-variable equations:
In both instances, a human can easily expand and re-order the equations to determine equivalence. However, for a computer, this is far from a trivial task. The Computer Algebra system must represent these equations in a canonical form, one in which only one representation exists for each equation. That is, in the above examples, the Computer Algebra system would simplify both pairs of equation into the same representation in internal memory.
For a single-variable polynomial, once fully simplified (more on this later), finding a canonical form is not difficult. In the internal representation, simply sort the terms by degree of their exponent, and each version of the equation will correspond to the same representation. The next step is to determine a way to store each term in the computer's memory. One approach would be to create an array of the size equal to the largest exponent in the equation. An array is simply a collection of items of the same type, the size of which does not change after instantiation. For example, consider a user enters the equation . The program would create an array of 6 elements. At each element in the array, it would store the coefficient of the corresponding term (and place a 0 in all the unused terms):
This representation is known as dense because it stores each of the terms, independent of whether the coefficient of the term is 0. An alternative representation would be to create a list or similar data structure that at each node stores the coefficient and the exponent of the term. This way, only the terms that actually are used require storage in memory:
At each node in the list, the first number in the pair represents the coefficient and the second the exponent of each term. This representation is sparse: only the terms which have non-zero coefficients are stored in memory.
Typically, a sparse representation is preferable to a dense one, but there are advantages and disadvantages to both. In a dense representation, there often can be large amounts of computer memory wasted. For example, the equation would require an array of 2001 elements just to store two coefficients. Comparatively, the sparse version would require only two nodes in a list. Additionally, from the programmer perspective, it is difficult to make changes to a dense representation. For example, if one were to add a new term to a polynomial that was not included in the range of the original array, the array would have to be completely recreated (in most languages, the size of an array can not be modified without completely recreating it). The sparse representation provides much more flexibility, as adding a new term is simply a matter of adding a new node to the list in the correct place (to maintain a canonical form). The choice between a sparse and dense representation is completely dependent on the task for which it will be used. In some cases, a system will shift between representations in order to optimize for specific algorithms (Davenport 59-70).
5.2.4 Polynomials in one variable - Recursive definition
A Computer Algebra system supporting only equations of the form would be quite inflexible. One enhancement, while still remaining in the realm of polynomials of one variable, is to allow equations with recursive definitions. That is, polynomials where the coefficient to a term or numerous terms can be a polynomial itself. Some examples of such equations would be , , and . The sparse representation can be easily modified to support equations of such a form. Instead of simply storing pairs of coefficients and their exponents, the coefficients themselves can also point to lists of polynomials. In order to find the canonical form, the polynomial is simplified and all recursive polynomial coefficients are removed from the representation. In order to support rational functions, all that is needed is to store two polynomials: the numerator and denominator. Of course, simplification of a rational function into canonical form introduces a host of new issues, but these issues will be discussed later.
5.2.5 Multivariate Polynomials
The sparse representation can be extended into multivariate polynomials without too much effort in terms of representation. Rather than storing simply a coefficient and degree at each node in the list, the degree will be replaced with a list of variables and their respective degrees. The difficult issue that arises when switching to a multivariate representation is finding a canonical form. One approach that is commonly used is to first sort the terms lexicographically and then by degree. As an example, consider the equation
would be represented in the following manner after sorting
The selection of sorting is irrelevant (whether first by lexicographic order and then degree or vice versa) as long as the choice is consistent (Davenport 71-74).
5.2.6 The Syntax Tree - Our Data Structure Implementation
In the spirit of sparse representation, we chose to use a syntax tree as the internal data structure for our symbolic calculator. A syntax tree is a kind of tree, which in turn is a kind of linked data structure. Briefly, a linked data structure is an object which contains references, or links, to other like objects. A simple example is a linked list, where each element contains the data for it list entry and a link to the next list element. A tree is a linked structure that starts with a single "root" node. One or more "child" nodes are referenced from the root node, and each of these child nodes may in turn have children of their own. This linking pattern produces a branching data structure, as seen in the following diagram; hence the name "tree".
Trees are acyclic, which means that nodes cannot be linked in a loop. Each node has exactly one "parent" node, that is, one node of which it is a child. The exception is the root node, which has no parent. Nodes with no children are called terminal nodes, or "leaf" nodes. A syntax tree is a type of tree where each non-terminal node represents an operator or function, and its children represent its operands or arguments. In our program, mathematical operators such as addition and multiplication, or mathematical functions such as sin or log are represented by these non-terminal nodes. Leaf nodes represent the terminal symbols of an expression, such as numbers, constants or variables. The structure of a syntax tree represents syntactic information about the data it contains.
In our case, the syntax tree represents the syntactic steps, or order of operations involved in evaluating a symbolic expression. For example, the simple expression a + b* c could be represented by the following syntax tree:
The root of this tree is the addition operation, and the children are its operands. The hierarchy of operators and arguments establishes a clear precedence of operations. The syntax tree for the expression is shown below:
These two syntax trees are different, as are the expressions they represent. Syntax trees offer a clear and unambiguous way to store a wide variety of expressions.
5.2.7 The Syntax Tree - Advantages
The major advantage of the syntax tree is that it is flexible. It can represent a wide variety of different expressions that cannot be easily captured in the previously discussed data structures. Take, for example, an expression as simple and common as . The polynomial representations are limited to just that - simple polynomials. Perhaps the sparse multivariate polynomial could be extended to think of e as a kind of "special" variable - namely a constant. Furthermore, and more difficult, the representation would have to be extended to allow for non-integer exponents. Even then, what happens when the exponent is more complicated than a single non-terminal symbol? What if the exponent is itself a polynomial expression, replete with its own exponents, and so on and so forth? The problem quickly grows out of any attempts of reasonable management. This doesn't even touch on the matter of functions within expressions, as in . Granted, the representation was designed with polynomials in mind, but we wanted something more general. All of these expressions are easily represented in abstract syntax trees:
5.2.8 The Syntax Tree - Disadvantages
The generality that makes the syntax tree so appealing is also its biggest problem. While it's possible to represent numerous different expressions with a syntax tree, it's also possible to represent a single expression a number of different ways. For example, the expression can be represented in a number of different ways:
All are mathematically equivalent, but to a computer, they look nothing alike. As mentioned before, it's important to define a canonical form. Changing an expression to canonical form can be a difficult task in itself, but due to the wide variety of expressions syntax trees can represent, it's hard to define exactly what canonical form should be. Certain types of expressions tend to fit some forms better than others. A polynomial fits nicely in a form that orders terms by their degree. What happens when the exponents are more complicated? Both and could be said to have the same "degree" - that is, a polynomial in y of degree 2. Defining a general algorithm for ordering becomes complicated. Consider, also, the example of x as compared to 1 * x or x1. The trees of these expressions are as follows:
While it may be unlikely for a user to enter x1 instead of x, it is not hard to imagine that such an expression may be obtained in the process of manipulating the expression symbolically. For example, might be simplified to x1. The procedure to convert expressions to canonical form must take many factors into account.
5.3.1 General Issues in Simplification
A very common task that any computer algebra system must perform is to simplify an expression. Simplifying expressions makes other tasks much easier, especially comparing expressions entered in different forms to see if they are equivalent. The system has to know how to add terms that should be added and how to add exponents together when the multiplicands have the same base. Computer algebra systems must always orders all the terms to arrive at a canonical form. There must be a consistent order that everything is sorted by. If exponents can be polynomials, such as , then those exponents also need to be sorted. This could go on indefinitely, with each additional power being another complex polynomial. Sorting will need to take place on several levels to make it consistent. The uppermost exponents must be simplified first so that the lower level ones will sort properly. One can not simply sort from the lowest level first.
Systems must also decide whether or not to perform some simplifications. Identities for operations must be taken into account, so adding zero and multiplying by one will be dealt with properly. Either strip out all such identities or add them in to every operation, which would probably require more work and produce a more cluttered display.
There is also the decision whether or not to expand polynomials. Expanding may be trivial, but expanding would require an enormous amount of memory and time, not to mention far too much display space on the screen to be impractical. The factored form is much more compact. If integers are raised to a power, the limitations of the computerís number system may hinder expansion. This again shows why it is important to choose a number representation system that allows for arbitrarily large values. It may be necessary to expand simple polynomials of a very high order into large polynomials of a very high order in order to simplify further, but a lone value should probably be left as it is for simplicity in display. Expanding might be appropriate when there are several polynomials, and terms will cancel out when expanded, but no further simplification can be done otherwise.
The systems might also want to apply trigonometric or other identities to expressions for simplification of functions. For example, and are equivalent if the trig identity is applied. There are many other ways to manipulate functions, especially trigonometric ones, that could hinder further simplification. Sometimes the only way to simplify further is to apply these identities, which makes knowing when to use identities difficult. The natural logarithm function also has identities, which, besides being used to simplify expressions with the natural logarithm function, can also be used to simplify some other expressions. The system must know when to apply these identities and when to leave the functions as they are.
Simplifying is not only used when an expression is first entered in by the user, but, in particular, differentiating an equation will produce an expression that will need to be simplified for it to look like what the user expects to see. It is important that the computer algebra system be able to represent everything that may happen when expressions are simplified and expanded, but it must also decide whether or not to simplify certain operations depending on the circumstances.
5.3.2 The Steps of Simplification - Our Approach
In order to break up the complex task of simplification and reduction to a canonical form, we created a number of small algorithms that performed very simple, specific operations on syntax trees. By calling these simple procedures in order, and repeatedly, we are able to simplify many equivalent representations into a single deterministic form. In the following sections, we will describe the steps taken. Some of the steps seem to move away from simplification instead of towards it - these are intermediate steps that make later simplification easier.
5.3.3 Transforming Negatives
In this step, all negative operators (unary negative and subtraction) are transformed to terminal constants with negative values. For example, x becomes 1 * x and a - b becomes a + (-1 * b). The trees of these expressions are as follows:
This step, while extremely simple, has a number of advantages in terms of defining a canonical form and simplifying later operations. For the canonical benefit, the above pairs of expressions, and others like them can be determined to be equivalent, obviously. The real benefit of this operation is that it significantly reduces the syntactic complexity of the expression trees. Two elements, negation and subtraction, are removed from the set of operators that have to be dealt with in later stages.
Subtraction is replaced by addition, a commutative operator, which allows greater flexibility in ordering. A subtraction node must have exactly two children, and their order cannot be reversed. An addition node, on the other hand, can have any number of children, and they can appear in any order.
5.3.4 Leveling Operators
When the expressions a * b * c and a + b + c are parsed by our calculator, the following syntax trees result:
This is due to the fact that our parser assumes that all operators, with the exception of negation, are binary - that is, they have two operands. Since the parser is designed to read expressions written in infix notation, this is a valid assumption. But there's no reason that commutative operators such as addition and multiplication have to be binary in another notation. For example, a + b + c could be written as (+ a b c) in prefix notation. The same is true of our syntax tree. For example, the expressions a + b + c + d and (a + b) + (c + d) would be parsed as follows:
However, after the simplification step of leveling operators, both expressions are represented as:
This operation trims unnecessary complexity from the syntax trees and resolves problems of associativity in canonical form.
5.3.5 Simplifying Rational Expressions
Recall the various syntax trees for the expression illustrated in section 5.2.8. This simplification step will transform a syntax tree so that a division node cannot be the immediate child of either a division node or a multiplication node. The end result is that any expression formed of multiplicative operators (multiply and divide) will be transformed so that there is a single division node at the top of the tree, with only multiplication operators below it. This simplification takes three specific cases into account in order to form a general procedure for other cases. The first case is the event when a division node (D1) has another division node (D2) as its numerator. In order to simplify this situation, the numerator of D1 must become the numerator of D2, and the denominator of D1 must become the product of the denominators of D1 and D2. This transformation can be seen in the following diagram:
The second case is very similar to the first. In this case, the second division node (D2) occurs in the denominator of the first (D1). The numerator of D1 becomes the product of the numerator of D1 and the denominator of D2. The denominator of D1 becomes the numerator of D2, as seen in the following illustration:
The final case is when a child of a multiplication node (M) is a division node (D). It doesn't matter how many children the multiplication node has, or how many of those children are division nodes. Only the first division node is considered in this simplification. This situation is a little more complicated than the previous two since the operation of the top node must be changed and its children moved, rather than just reshuffling some node links. To simplify this case, M is replaced by a division node whose numerator is the product of the numerator of D and the children of M (with the exception of D itself), and whose denominator is the denominator of D. This transformation is shown below:
From repeated application of these three cases, more complicated expressions can be reduced:
5.3.6 Collecting Like Terms
The first step involved in collecting like terms is to explicitly represent any coefficients or exponents a term may possess. For example, x + 2x becomes 1x + 2x and becomes . The main reason for doing this transformation is to make all the children of an addition or multiplication node share a common form - that is, all the children are either multiplication nodes or power nodes, respectively. In order to collect like terms below a multiplication node, one compares the base of each child power node (Pi) with the bases of the remaining children (Pi+n). In the event that two bases are equal, the exponent of Pi becomes the sum of the exponents of Pi and Pi+n, and Pi+n is removed:
A similar operation would take place for collecting like terms under an addition node, but we have not actually implemented it in our calculator.
5.3.7 Folding Constants
Once terms have been collected together, unnecessary constants can be collected or removed. A constant, in this sense, is a real number in the expression, such as the three in . The k is a mathematical constant, but for purposes of symbolic manipulation, it is treated as a variable. When multiple constants occur below an addition or multiplication node, they can be combined (added or multiplied as the case may be) into a single constant. When both children of a power node are constants, it can optionally be replaced with a single number, although it is not always wise to do so. The number is usually expressed as such because a 901 digit number is unwieldy. In our calculator, we fold a constant power term if the result is less than 1000, an arbitrary choice. Furthermore, a power term with a base of zero can be folded to zero, unless the exponent is also zero. In that case, our calculator simply leaves alone since it has no provisions for indeterminate forms. Power nodes with a base of one can be reduced to one, and power nodes with exponents of one or zero can be reduced to the base alone or one, respectively, with the previously mentioned exception of . If a multiplication node contains a one, that child can be eliminated; if it contains a zero, the whole multiplication node can be replaced with zero. Also, if a multiplication or addition node is left with a single child in the course of these reductions, the node can be eliminated and replaced with its sole child.
5.3.8 Canonical Order
All of these simplifications are fine and wonderful, but what's the use if they can't even determine that a + b and b + a are equivalent? That's why it's important to define a canonical ordering of terms, as discussed in section 5.2.2 and 5.3.1. In order to arrange our syntax trees in canonical order, all the children of a commutative node are sorted with a simple ordering function. The children are sorted first by their node type. In our calculator, there is a different node type for each operator, and one for each of the following: variables, functions, and constants. After node type, the children are sorted lexicographically. This ordering scheme doesn't always order expressions the way one would expect to see it written, but it works well with syntax trees and is consistent - which is the important part.
5.3.9 Full Simplification
Sometimes a single iteration of the simplification steps is not enough to reduce an equation as much as it should be. To compensate for this, we keep iterating through these simplifications until the syntax tree ceases to change.
5.4 Advanced Operations
After canonically representing an equation in memory, the Computer Algebra system can demonstrate its true power. The advanced operations that a system is capable of performing are what separate one system from another. Advanced operations include factorization, differentiation, integration, and finding the limit of a function.
Mathematical operations that are defined in terms of algorithmic processes are rather painlessly integrated into Computer Algebra systems. Assuming that an appropriate representation is chosen for describing an equation, any algorithmic manipulation can be fairly easily translated into a Computer Algebra system. Differentiation is one such operation that is defined algorithmically in a very general way and is therefore particularly well suited to a computational definition.
Differentiation essentially consists of four basic rules (Davenport 165):
The algorithm must only know two additional pieces of information. First, the algorithm must be informed that , which enables the computation of the derivative of any function that does not contain references to other functions. Second, to be a completely flexible at differentiation, the algorithm must be aware of the derivatives of functions (e.g. sin, cos, ln, etc.). The differentiation of functions can easily be accomplished by storing a table of derivatives.
The ability of a computer to perform differentiation is thus demystified. Because we, as human problem solvers, compute derivatives in a very algorithmic way, it is easy for a computer to emulate such behavior. Artificial Intelligence is the attempt at algorithmically modeling a human's ability to think. However, when there is no obvious algorithm that exists, modeling such behavior becomes extremely difficult (and, at this point, all attempts are nothing more than an approximation). The same is true in Computer Algebra systems: some mathematical computations are not clearly performed algorithmically.
Integration is an example of an operation which, at first, appears to have no algorithmic definition. The only general rule that appears to be useable is that . However, even this rule turns out to be unusable in certain cases. Consider attempting to find ; breaking it up at its addition will not yield a solution. The two respective parts have no integral, yet the integral of the combination yields (Davenport 167).
Integration appears to be a compendium of different techniques such as integration by parts, integration by substitution, and simply consulting a table of known integrals. Which integration problems require which technique can not be generally defined. The first attempts at computer integration, then, took an obvious brute force approach. That is, try all possible known techniques until an answer is found.
Ultimately though, a full theory of integration in terms of an algorithmic process that computers can perform was developed. This theory is beyond the scope of this paper, but a summary of the theory is developed in Davenport, Siret, and Tournier pp. 167-186.
5.4.3 Differentiation - Our Implementation
The only advanced operation we chose to implement is differentiation in one variable. In order to differentiate a syntax tree, one must take a top-down approach. The differentiation procedure starts with the root node and tries to differentiate it based on what type of node it is. For example, if the node is an addition node, the derivative of the node is an addition node whose children are the derivatives of each of the original node's children. Differentiation is an inherently recursive procedure, and syntax trees are well suited to recursive evaluation. Below is a list of how each type of node is differentiated:
Previous: Short Biographies Next: Conclusions
Home: Table of Contents