When an expression raised to the square or any higher power vanishes, it may be called nilpotent; but when, raised to a square or higher power, it gives itself as the result, it may be called idempotent.1
To be mathematically vague, an object is nilpotent if some power of it 'is' zero. Clearly, such vagueness won't do, and can only be cleared up by a mathematical framework within which objects, powers and zero all have meaning. For the non-mathematician, a friendly environment might be the integers. Then zero is a very familiar concept, and power has its 'usual' meaning of repeated multiplication. So, a nilpotent integer is one which, when multiplied by itself a magical number of times (its order of nilpotency), becomes zero.
With a little thought it becomes clear that the only such integer is zero itself- all the others grow in magnitude during the multiplication process, so cannot end up at zero. How exotic, then, does the mathematics have to become before the nilpotent definition becomes interesting? There are many mathematical playgrounds out there within which nilpotency is meaningful, and I'm certainly no master of them all. So we'll settle for a tour of a couple of the more accessible (for, at least, an undergraduate) areas in which interesting behaviour can be found.
Nilpotency in modular mathematics
We observed that the problem with the integers was that multiplication always made things bigger (in absolute value- for a negative integer the sign alternates between positive and negative, but the values get further from zero at each step). So a solution might be to throw in numbers which, when multiplied by, give smaller results: values less than 1, such as fractions. (That is, we can start working in ℚ, the set of rational numbers.) This turns out not to help- whilst (1/2)^(n+1) is half the size of (1/2)^n, there is still no value k such that (1/2)^k is zero. The powers approach but never attain their limit of zero. So adding in more numbers to play with wasn't useful in getting an integer-like playground with non-zero nilpotent values.
Instead, therefore, we'll throw most of the integers away. A structure that allows this is modular mathematics- where all calculations are done modulo some value n- meaning that n is identified with zero, and at any intermediate step of a calculation, multiples of n can be safely discarded. Those with more than a passing interest in cryptography will probably have come across it; otherwise, you may have encountered it as "clock arithmetic"- where you add and multiply on a clock face, resetting to zero every time you reach 12 (so 17:00 hours is 17-12 = 5 o'clock in the afternoon).
Why can this give nilpotent elements? Well, some experimentation will show that when working in the integers modulo n (an algebraic structure known as a ring; choosing a prime number for n gives a "better" structure, a field), multiplication wraps around and so can give smaller values. For instance, working modulo 7, we can start taking powers of two: 2*2 = 4, 2*2*2 = 8 = 8-7 = 1, which is less than two. Two then locks into this cycle (2-->4-->1-->2) so isn't nilpotent. Any multiple of 7 is trivially nilpotent, since modulo 7 it is already zero: 7^1 = 7 = 7-7 = 0. Can we get rings where values other than n and its multiples are nilpotent? That's possible too, if we pick an n which is a power of some smaller number (more generally, any n divisible by a square greater than 1 will do). As an example, two is nilpotent in the integers modulo 8, since 2^3 = 8 = 8-8 = 0 in that setting.
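Since the order of nilpotency of any element modulo n can never exceed n itself, a brute-force search is easy to write. Here's a minimal sketch in Python (the function name is my own choice):

def nilpotents(n):
    # An element a of the integers modulo n is nilpotent if a^k = 0 mod n
    # for some k >= 1; checking the exponent k = n is always enough.
    return [a for a in range(n) if pow(a, n, n) == 0]

print(nilpotents(7))   # [0]          - 7 is prime, so only zero is nilpotent
print(nilpotents(8))   # [0, 2, 4, 6] - every even number eventually vanishes mod 8
print(nilpotents(12))  # [0, 6]       - 6*6 = 36 = 3*12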
What's happened here is that we've made the definition of nilpotent go further by giving ourselves more zeros to play with. Before moving on to break the power bit instead of the zero bit, I'll take a quick detour into some related algebra for those more versed in the notions and notation. Feel free to skip this bit.
A ring is said to be reduced if it has no non-zero nilpotent elements: that is, there exist no non-zero f and natural number m such that f^m = 0. If the ring R is of the form R = k[X1,...,Xn]/I for some ideal I, then R is reduced if and only if I is a radical ideal (that is, I is its own radical). In particular, k[V], the co-ordinate ring, is reduced for any affine variety V.
Nilpotent Operators
A mathematical operator/function/map is a rule which takes values in a space and assigns to them values in a new space. If the operation actually always assigns a value from the original space, then it makes sense to think of applying the function multiple times. So if T:V-->V is a map sending v-->T(v) for all v in V, then we can think of the map T^2:V-->V as sending v-->T(T(v)). In this way, we can build up T^n for any natural number n. If we take maps as our objects, and this layering process as our 'power', then we are in a position to think about nilpotent operators. (The natural candidate for a 'zero' here is the zero map, which cheerfully sends anything you throw at it to zero.)
In analysis, as ariels observes, the nilpotent operators on finite dimensional spaces V reduce dimension. Since infinite dimensional functional analysis is not something I easily grasp, we'll ignore that (even though other interesting things can apparently happen) and restrict our attention to finite dimension (which can be arbitrarily large if you're feeling cheated!).
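A classic concrete example of such an operator (my own illustration, not part of the structure theory below) is differentiation on the space of polynomials of degree less than three: each application lowers the degree by one, so three applications always give zero. A quick numpy sketch:

import numpy as np

# d/dx on polynomials of degree < 3 (the space spanned by 1, x, x^2).
p = np.polynomial.Polynomial([5.0, 3.0, 2.0])   # p(x) = 5 + 3x + 2x^2
print(p.deriv(1))   # 3 + 4x
print(p.deriv(2))   # 4
print(p.deriv(3))   # 0 - the differentiation operator is nilpotent of order 3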
A structure theorem for nilpotent operators
If β: W-->W is a nilpotent operator, then there is a basis w1, ..., wn of W for which β(wi) is either wi-1 (the previous basis vector) or 0; and in particular β(w1) is zero.
Proof of the above was considered beyond the scope of my undergrad studies- the complication lies in showing that every nilpotent operator can be put in this form. Clearly, any operator with this structure will be nilpotent. If this isn't obvious because the terminology is meaningless to you, here's what that part of the theorem is saying. Any object in our space can be built up using some, but not necessarily all, of a collection of building blocks (the basis w1...wn). So the input to β and its output can both be thought of as such a sum. Let's simplify further, and suppose that we're working in the familiar three dimensions and have a basis w1, w2, w3. We start with a point t, which we write as aw1 + bw2 + cw3 for some constants. When we apply β, our rule simply erases some coefficients and shifts the others. For instance, for the rule w3-->w2-->w1-->0 the result is bw1 + cw2. Note that this occupies only 2 dimensions, compared to the three of the point t. Applying β to that object gives us cw1. A final application erases our last coefficient and takes us to zero: β was nilpotent of order 3.
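That shifting of coefficients is simple enough to sketch in a few lines of Python (a toy of my own, with the coefficients of w1, w2, w3 stored as a list):

def beta(coeffs):
    # The rule w3-->w2-->w1-->0: each coefficient moves down one basis
    # vector, and the w1 coefficient falls off the end and is erased.
    return coeffs[1:] + [0]

t = [5, 7, 2]               # t = 5w1 + 7w2 + 2w3, so a=5, b=7, c=2
print(beta(t))              # [7, 2, 0], i.e. bw1 + cw2
print(beta(beta(t)))        # [2, 0, 0], i.e. cw1
print(beta(beta(beta(t))))  # [0, 0, 0]: nilpotent of order 3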
There is a beautiful and powerful connection between linear operators and matrices, which can help to illustrate this dimension reducing activity. Essentially, the structure theorem above says that any nilpotent operator, when considered in matrix form for an appropriate basis, will consist of a series of blocks, where any given block has ones on the superdiagonal and zeros everywhere else- in particular, it is upper triangular with the lead diagonal containing only zeros. The matrix itself is block diagonal. Some useful properties are obtained (and are checked in the sketch after this list)-
- The smaller the blocks, the quicker the map erases dimension. So the order of nilpotency for β is the size of the largest block.
- The form of β (that is, the chains of wi involved) is determined by the sizes of the blocks, but permutation of the basis permutes the blocks in the matrix, so these values serve only as an unordered partition of n (the dimension of W).
- The number of blocks is the geometric multiplicity of 0 as an eigenvalue, equivalently, dim(ker β).
- The collection of block sizes is an invariant of β, so matrices with this structure but different block sizes are not similar (they are not representations of the same map in different bases).
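Here's a numpy sketch (the helper names are my own) that builds the block-diagonal matrix for a given partition and confirms the first and third properties:

import numpy as np

def shift_block(size):
    # A size x size block: ones on the superdiagonal, zeros elsewhere.
    return np.eye(size, k=1)

def nilpotent_matrix(partition):
    # Block-diagonal nilpotent matrix for an unordered partition of n.
    n = sum(partition)
    M = np.zeros((n, n))
    i = 0
    for s in partition:
        M[i:i+s, i:i+s] = shift_block(s)
        i += s
    return M

M = nilpotent_matrix([2, 1])               # a partition of n = 3
print(np.linalg.matrix_power(M, 2).any())  # False: M^2 = 0, so the order is the largest block, 2
print(3 - np.linalg.matrix_rank(M))        # dim(ker M) = 2 = the number of blocks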
Matrix examples in 3 dimensions
The simplest nilpotent matrix in 3 dimensions is the zero matrix:
0 0 0
0 0 0
0 0 0
Which with regard to our structure theorem is the matrix of the zero map, which sends wi to 0 for all i. Here the block sizes are all 1, and a 1X1 block with zero lead diagonal is just a zero block.
Thinking of 3 as 1+2, we can construct another 3X3 nilpotent matrix by either of
0 1 0        0 0 0
0 0 0   or   0 1 0
0 0 0        0 0 0
The one on the left corresponds to w2-->w1-->0, w3-->0 (chains of length 2 and 1); the one on the right corresponds to w3-->w2-->0, w1-->0.
Finally, our original structure theorem example with the chain w3-->w2-->w1-->0 takes on the matrix form
0 1 0
0 0 1
0 0 0
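As a quick check (my own numpy sketch), the powers of this last matrix perform exactly the coefficient-shifting we traced through earlier:

import numpy as np

N = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]])      # the chain w3-->w2-->w1-->0
t = np.array([5, 7, 2])        # t = 5w1 + 7w2 + 2w3
print(N @ t)                   # [7 2 0]: bw1 + cw2
print(N @ N @ t)               # [2 0 0]: cw1
print(np.linalg.matrix_power(N, 3))   # the zero matrix: order 3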
More on nilpotent matrices
Setting aside all of this theory of linear maps (although in fact it'll turn out to be equivalent), we can apply our nilpotent definition to matrices as objects in a space of matrices, using matrix multiplication for powers and the zero matrix as our zero (this is another ring, but unlike the integers modulo n it is not a commutative one). Then a non-zero matrix A (that is, one with at least one entry which isn't zero) is said to be nilpotent of order k (where k ∈ N) if A^k = 0 and A^n ≠ 0 for all n < k.
Then we can observe, by properties of the determinant, that a nilpotent matrix is singular- that is, it has no inverse. However, I-A has the curious inverse I + A + A^2 + ... + A^(k-1): multiply the two and all the interior terms cancel, leaving I - A^k = I, as A^k is zero.
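This truncated geometric series is easy to verify numerically- a sketch of mine, reusing the order-3 matrix from above:

import numpy as np

A = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]])      # nilpotent of order k = 3
I = np.eye(3)
inv = I + A + np.linalg.matrix_power(A, 2)   # I + A + ... + A^(k-1)
print((I - A) @ inv)           # the identity matrix, so this really is the inverse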
Knowing that a matrix is nilpotent is handy for calculating the matrix exponential function, a process which generally involves an infinite sum (or lengthy calculation of fundamental matrices). As the summands are powers of the matrix, once the kth term is reached all the subsequent terms are zero and hence have no effect on the sum. Thus you can find the matrix exponential by the finite sum of M^n/n! where n ranges from 0 to k-1.
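For instance (a sketch of my own- the scipy import is only there to double-check the answer against a general-purpose routine):

import numpy as np
from math import factorial
from scipy.linalg import expm   # general-purpose routine, used only as a check

M = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]])       # nilpotent of order k = 3

# exp(M) as the finite sum of M^n/n! for n from 0 to k-1.
eM = sum(np.linalg.matrix_power(M, n) / factorial(n) for n in range(3))
print(eM)
print(np.allclose(eM, expm(M)))  # True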
A nilpotent matrix may alternatively be defined as one for which all eigenvalues are zero.
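This characterisation catches nilpotent matrices which don't wear their block structure on their sleeve- for example (my own choice of matrix):

import numpy as np

A = np.array([[2., -1.],
              [4., -2.]])       # not triangular, but A @ A is the zero matrix
print(A @ A)                    # the zero matrix: nilpotent of order 2
print(np.linalg.eigvals(A))     # both eigenvalues are (numerically) zero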
References
1- from "Linear Associative Algebra", Benjamin Peirce, 1870. Source: Earliest Known Uses of Some of the Words of Mathematics - http://members.aol.com/jeff570/mathword.html.
MA20012 (Algebra II) and MA30188 (Algebraic Curves) lecture notes from the University of Bath Mathematics degree.