THE PERRON-FROBENIUS THEOREM

THE PERRON-FROBENIUS THEOREM

INTRODUCTION

The projects in this collection are concerned with models from many different areas; that is part of their purpose, to show that linear algebra is a broadly applicable branch of mathematics. If one reviews them as a whole, they do have a couple of common mathematical characteristics: eigenvalues are very useful and the matrices studied are almost all nonnegative (all entries in them 0 or greater). To be a bit more precise about the former, it is often only the largest, or dominant, eigenvalue that we need to know.

This makes life much easier. Computing all the eigenvalues of a matrix may be a difficult task. For a 10x10 matrix from a population model, Maple often has trouble computing all of them. Yet we only need one of the 10 and the methods shown in Project #5 work reliably.

The mathematical question that all of this raises is: how do we know if the dominant eigenvalue of a matrix is positive? Another question is: could there ever be positive and negative eigenvalues of the same size? If so, the behavior of the associated system could be quite different. The answer to this question is yes, as is shown in problem 3 of Project #5

Both of these questions are answered by the Perron-Frobenius Theorem for nonnegative matrices. The results of the theorem depend upon what kind of nonnegative matrix one has. The first kind we look at are called irreducible.

DEFINITION An nxn nonnegative matrix A is said to be irreducible if there is no permutation of coordinates such that

where P is an nxn permutation matrix (each row and each column have exactly one 1 entry and all others 0), A₁₁ is rxr, and A₂₂ is (n-r)x(n-r). This is not a particularly inspiring definition in view of the fact it only tells us what is NOT irreducible. About the only thing we know for sure is that a matrix with all positive entries is irreducible.

To clarify the notion of irreducibility, we examine it in three different contexts:

1. Markov Chains. Suppose A is the transition matrix of a Markov chain. Then it is nonnegative and suppose further it is set up so that

a_ij = the probability of going from state j to state i

in one transition

If one looks at, say, the fourth row of A, then one sees the probabilities of going from state 4 to the various other states (as well as remaining in state 4). Any entry that is zero indicates that one cannot go to that state from state 4, at least in one step.

Now if A is reducible, for instance,

then we can see that it is possible to go from states 1,2 or 3 to any states but only from states 4 or 5 to themselves. This is, of course, the traditional definition of "absorbing" states in a Markov Chain. The above mentioned permutation matrix P simply amounts to labeling the states so the absorbing ones are last.

2. Graphs. If we considered directed graphs then each has associated with it a nonnegative matrix with all entries 0 or 1 with

a_ij = 1 if there is an arc from vertex i to vertex j.

If the associated matrix is irreducible then one can get from any vertex to any other vertex (perhaps in several steps) whereas if it is reducible then (like the Markov Chain case), there are vertices from which one cannot travel to all other vertices.

The former case, in the realm of graph theory, is called a "strongly connected" graph.

3. Dynamical Systems. Suppose, as in the population case we have a system of the form

x(k+1) = A x(k) k=0,1,...

and that A is reducible as in the definition above. Then in the manner of partitioned matrices, one can rewrite this system as

Y(k+1) = A₁₁ Y(K) + A₁₂ Z(k)

Z(k+1) = A₂₂ Z(k)

where Y has the first r components of X and Z has the last n-r so together they comprise the original vector X. (The reader who has either not worked with partitioned matrices or is rusty on the subject is urged to sketch the details out at the component level and verify it; the result should follow in a straightforward manner.) While this may not seem helpful at first glance, it means the solution for Z may be obtained first, with no reference to the system governing Y, and then the solution for Y obtained from the (nonhomogeneous) system that treats Z as known. In the early language of such a problem, the original system has been "reduced" to two simpler systems. To physicists, it has been partially decoupled.

Saying, then, that the matrix A is irreducible means that the system cannot be reduced; it must be treated as a whole in studying its behavior.

While the above discussion may clarify the notion of an irreducible matrix, it is not much help in verifying that a given matrix actually is irreducible. For example, it is not blatantly obvious that of the two following matrices

the latter is irreducible while the former is not. If we went strictly by the definition, we would have to keep trying permutations and looking for the critical zero submatrix to appear. But for an nxn matrix there are, of course, n! possible such matrices P making for a great deal of work ( if A is 10 x 10 then there are over 3 million possibilities!).

The following theorem is thus quite direct and helpful.

THEOREM. A is a nonnegative irreducible nxn matrix if and only if

(I_n + A)^n-1 > 0. (see [12], p.6 for details and proof)

We note that the power in the above expression contains the same n as in the size of the matrix. Since computer software is readily available to compute matrix powers, the above may be readily checked. Note the result is also an nxn matrix, and if any entry is zero, then the contrapositive of the theorem says A is reducible. Referring to the two matrices above, we find that by direct computation

At this point, it seems appropriate to finally state the

PERRON-FROBENIUS THEOREM FOR IRREDUCIBLE MATRICES

If A is nxn, nonnegative, irreducible, then

1. one of its eigenvalues is positive and greater than or equal to (in absolute value) all other eigenvalues

2. there is a positive eigenvector corresponding to that eigenvalue

and 3. that eigenvalue is a simple root of the characteristic equation of A.

Such an eigenvalue is called the "dominant eigenvalue" of A and we assume hereafter we have numbered the eigenvalues so it is the first one. We should point out that other eigenvalues may be positive, negative or complex (and if they are complex then by "absolute value" we mean modulus, or distance in the complex plane from the origin). Complex eigenvalues are a real possibility as only symmetric matrices are guaranteed to not have them, and very few of the matrices we have been discussing, in application, will be symmetric with the notable exception of undirected graphs. Part 3 of the theorem also merits brief comment. One ramification of it is that the dominant eigenvalue cannot be a multiple root. One will not be left with the classic situation of having more roots than corresponding linearly independent eigenvectors and hence having to worry about or calculate generalized eigenvectors and/or work with Jordan blocks. The same may not be said for the other eigenvalues of A but in the models here, they do not concern us.

Primitive Matrices and the Perron-Frobenius Theorem

Irreducible matrices are not the only nonnegative matrices of possible interest in the models we have looked at. Suppose we have a dynamical system of the form

x(k+1)=Ax(k)

where

(this matrix, while containing many suspicious looking zeroes, is indeed irreducible. The easiest way to see this is to construct the associated graph for it and check that you can get from any vertex to any other vertex.)We calculate that its dominant eigenvalue is 1.653 and that an associated eigenvector is (.29,.48,.29,.35,.5,.48)^t so based upon the above discussion, we believe the long term behavior of the system to be of the form:

x(k) = c₁(1.653)^k(.29,.48,.29,.35,.5,.48)^t (c₁ determined from initial conditions)

Simulation of the system is, of course, quite easy as one just needs to keep multiplying the current solution by A to get the solution one time period later. However, in this case, doing so does not seem to validate the predicted behavior and in fact does not even seem to show any sort of limit at all! (the reader is encouraged to fire up some software, pick an initial condition and see this behavior).

So what went wrong? If one calculates all of the eigenvalues for the matrix, they turn out to be: 1.653,0,0,+\- .856i and -1.653. The latter is where the limit problem arises; since we are taking integral powers of the eigenvalues, we get an algebraic "flip-flop" effect:

x(k) = c₁ (1.653)^k e₁ + ... + c₆ (-1.653)^k e₆

(It is, by the way, a known result that an irreducible matrix cannot have two independent eigenvectors that are both nonnegative; see [16], chapter 2. Thus e₆ in the above expansion has components of mixed sign.)

Thus 1.653 did not turn out to be as neatly "dominant" as we would have liked. If we look back at the statement of the Perron-Frobenius Theorem, we see it guaranteed a positive eigenvalue (with positive eigenvector) with absolute value greater than or equal to that of any other eigenvalue. In the example just considered, the equality was met.

So the question comes up: what stronger condition than irreducibility should one impose so that a nonnegative matrix has a truly dominant eigenvalue; strictly greater in absolute value than any other eigenvalue? The answer is that the matrix needs to be "primitive". While there are several possible definitions of "primitive", most of which have a graphical context in terms of cycles, we will state a more general, algebraic definition as the models we may wish to look at are from a diverse group.

DEFINITION An nxn nonnegative matrix A is primitive iff A^k > 0 for some power k.

We note the strict inequality; all n² entries of A^khave to be positive for some power. Such a condition, again considering the availability of computer software and ease of use, is easy to check. If one experiments with the 6x6 from the last example, one never finds a power where all 36 entries are positive. The question might come up: how many powers

of A does one have to look at before concluding it is not primitive? If A is primitive then the power which should have all positive entries is less than or equal to n² -2n +2 (this is due to Wielandt in 1950, see [17]). Also, it can be easily shown that if A is primitive than A is irreducible. Thus the class of primitive matrices has as a subset the class of irreducible matrices. Finally, primitive matrices indeed have the desired property in terms of a dominant eigenvalue:

PERRON-FROBENIUS THEOREM FOR PRIMITIVE MATRICES

If A is an nxn nonnegative primitive matrix then

1. one of its eigenvalues is positive and greater than (in absolute value) all other eigenvalues

2. there is a positive eigenvector corresponding to that eigenvalue

3. that eigenvalue is a simple root of the characteristic equation of A.

OTHER APPLICATIONS

In addition to the various projects, some other applications which involve the Perron Frobenious Theorem desire mention:

Application #1: Ranking of Football Teams.

James P. Keener has developed several models of schemes for ranking college football teams which may be found in [6].

Application #2: Graphs.

In general, it should be remarked that graph theory and nonnegative matrices have a very strong relationship and that the Perron-Frobenius Theorem is often a powerful tool in graph theory. The interested reader is referred to, for example, the excellent books by Minc and Varga for an in depth discussion.

As stated above, a graph (directed or not) has associated with it a nonnegative, "adjacency" matrix whose entries are 0s and 1s. A fundamental result about lengths of cycles in the graph may be obtained by determining whether the matrix is primitive or not. The very elegant result which occurs with the help of the Perron-Frobenius Theorem is this:

* if the matrix is primitive (hence a dominant eigenvalue with absolute value strictly greater than that of all other eigenvalues) then the greatest common divisor (gcd) of the lengths of all cycles is 1

* if the matrix is irreducible but not primitive then the greatest common divisor of the lengths of all cycles is the same as the number of eigenvalues with magnitude the same as the dominant eigenvalue (and including it).

It is common to refer to graphs with matrices which are irreducible but not primitive, naturally, as imprimitive and aforementioned gcd. as the index of the graph. It should also be mentioned that the collection of such eigenvalues lie equally spaced in the complex plane on a circle of radius equaling the dominant eigenvalue.

The interested reader is encouraged to examine the following pair of graphs in light of this result:

In the case of the first graph, the eigenvalues are 1,i,-i, and -1 while in the second graph they are 1.221, -.248 +/-1.034i, and -.724, consistent with the gcd of paths for graph 1 being 4 and the gcd of paths for graph 2 being 1.

Conclusion

The Perron-Frobenius Theorem has proven to be a consistently powerful result for examining certain nonnegative matrices arising in discrete models. It has been shown that careful consideration need be given to what hypothesis is used; depending on whether one has an irreducible or primitive matrix. In applications, knowledge of the dominant eigenvalue and eigenvector is very helpful and also attainable while knowledge of the rest of the "spectrum" is both unnecessary and computationally extensive.

Acknowledgment

The author wishes to thank Dr. Kenneth Lane of KDLLabs, Satellite Beach, Florida, for many inspiring insights and conversations concerning the power and richness of the Perron-Frobenius Theorem.

REFERENCES

1. Berman, A. and Plemmons R. 1979. Nonnegative Matrices in the Mathematical Sciences. New York: Academic Press.

2. Chartrand, G. 1977. Graphs as Mathematical Models. Boston: Prindle, Weber and Schmidt.

3. Gould, Peter. 1967. The Geographic Interpretation of Eigenvalues.Transactions of the Institute of British Geographers 42: 53-85.

4. Goulet, J. 1982. Mathematical Systems Analysis - A Course. The UMAP Journal 3(4):395-406.

6. Keener, James P., 1993. The Perron-Frobenius Theorem and the Ranking of Football Teams. SIAM Review 35(1): 80-93.

7. Kemeny, John and Snell, Laurie. 1960. Finite Markov Chains. New York: Van Nostrand Reinhold.

8. Lane,Kenneth D.. 1983. Computing the Index of a Graph.Congressus Numerantium, 40,143-154

9. Luenberger, D.G. 1979. Dynamic Systems. New York: John Wiley.

11. Maki, D.P. and Thompson, M. 1973. Mathematical Models and Applications. Englewood Cliffs, New Jersey: Prentice-Hall.

12. Minc, Henryk.1988. Nonnegative Matrices. New York: John Wiley and Sons.

14. Straffin, Philip D. 1980. Linear Algebra in Geography: Eigenvectors of Networks. Mathematics Magazine 53(5): 269-276.

16. Varga, Richard S. 1962. Matrix Iterative Analysis. Englewood Cliffs N.J.: Prentice-Hall.

17. Wielandt,H. 1950. "Unzerlegbare nicht negativen Matrizen" Math. Z. 52, 642-648.