A number sequence puzzle is given as a finite sequence of numbers; the challenge is to find the next number in the most obvious extension of the given sequence.

For example, given the puzzle 2, 5, 10, 17, we can see it as the start of the following sequence of numbers: 2, 5, 10, 17, 17, 17, 17, 17, 17, .... But that is "not right": there is no good reason to follow up with more numbers 17. So it's not a good answer. But we can find a simple function that generates every number in the sequence: f(n) = n*n + 1. According to this function, the next number is 26. And if this is the simplest function to describe the given sequence, 26 is the best answer.

If only polynomial functions are used, a completely general method exists to solve these puzzles: Lagrange interpolation. Given any finite sequence of numbers, it always produces the least complex polynomial that fits the sequence; this polynomial is unique and it determines a unique next number. It is not always an integer, but this can be fixed by using a different interpolation formula.

In general, a solution does not have to be a polynomial; it may be some other computable function. For example, you probably want to follow up 0, 1, 0, 1, 0, 1, 0, 1 with 0, but that is not what Lagrange interpolation will produce. So the aim is more generally to find the 'simplest' function that generates another number. To this end, we can define 'simplest' as 'having the lowest Kolmogorov complexity'. The Kolmogorov complexity of a function is the amount of code required to program it; with a Turing machine, to be exact. This doesn't guarantee a unique result, but that's not the point - the point is to give a reasonable measure against which proposed solutions can be compared.

Why use a Turing machine to define complexity, you may ask. The answer is

  • we have to take some machine model, and the Turing machine is pretty standard
  • the theory shows that the choice of machine model doesn't really make much of a difference on the whole, although it matters when comparing individual functions
Still, it is meaningful to ask if the machine model can be 'fine-tuned' to be a closer approximation to the human intuition regarding these puzzles. A machine with random memory access seems more reasonable, for instance.

PS Suvrat makes the point that many sequence puzzles draw on real world knowledge. For example, 3, 3, 5, 4, 4, 3 can be followed by 5 because the sequence represents the string lengths of the English numerals. The approach above implicitly assumes that no such knowledge is required.

So how would you go about doing this "Lagrange interpolation" by hand on a quiz?

Well, suppose we're given a sequence 2,5,10,17. We'll write down the numbers in one row, then below them their differences, and then proceed writing successive differences:

 2  5  10 17 ? ...
   3  5  7  ?
     2  2  ?
Eventually we get a constant row (note that each row is one number shorter, so obviously we might get a length=1 row, which is constant; this probably doesn't count as a "solution").

Now extend the table: the constant row remains constant, the other entries are then forced:

 2  5  10 17 26 37 ...
   3  5  7  9 11 ...
     2  2 2  2 ...
Note that, amazingly enough, the method of successive differences has found the numbers of the sequence n2+1.

This method works for any polynomial sequence.

If, instead of differences, we also allow ourselves to pick quotients, we can "discover" rules for a wider class of sequences (of course, since every row is still shorter, we can now discover more rules for any given sequence fragment...). For instance, the sequence 1,2,4,8,16,32,... is probably best "explained" by taking a quotient at the first step, and getting a constant row of 2's.

The extended method can additionally find any sequence that is polynomial in n and an, for any constant a. But nn, for instance, is still beyond the scope of this method.

Regarding Ariel's WU above, thats what you would need to do to interpolate anyway-calculate successive differences!.

But my primary point is about rp's WU above. There are two problems. The first is that it is only very very rarely that this 'generating function' happens to be a polynomial. The second point is about the definition of complexity above. I dont think that this it's valid to use a Turing Machine criterion to decide simplicity because thats not the way the human brain works. Here's an example. Consider the sequence:
3,5,7,11,13,17...
Next number? Well 19 obviously. However, that would probably be the very last number tried by a computer, using the above definition of complexity. Generating prime numbers is very very difficult indeed, so it would be much simpler for a computer to fit a 6th order polynomial(which would fit perfectly).

Other number sequence puzzles might depend on extra 'wordly' knowledge. For example, each number might represent a character of the alphabet. As long as the sequence involves simple things like adding or multiplying the preceeding numbers the above definition of simplicity might work. However, the human brain is far too complex, so there are lots of things which would be obvious to a human, but very difficult to guess for a machine.

My primary point is that (at least at our current level of computational theory) it is impossible to develop any relationship between what humans perceive as simple or difficult and what machines perceive as simple or difficult(Isnt that the problem everyone's trying to crack anyway?).

Okay here's a footnote after some discussion with rp. The basic point above is that 'obvousness' or lack of it is a social construct. For example consider 3,1,4-5,9. What do you fit into the dash. Well 1 obviously but the reason its 1 and not something else is that the digits of pi are 'socially common'. There's nothing computationally simple involved here, but the answer is obvious because the society we live in uses pi such a lot . The same thing holds for prime numbers. We are taught about them and classify them as a seperate block in our minds, which is why the answer is obvious.
It is this very important intangible factor that I'm talking about. This is something which cannot be duplicated mechanically...

Not all number sequence puzzles can be solved with a simple function. Sometimes you come across a sequence that you are expected to be familiar with already, in a disguised form. A very simple example:

31, 28, 31, 30, 31, ?

The answer, clearly, is 30, the number of days in the month of June. The sequence you are expected to recognise is the months of the year, 'disguised' by representing each month by the number of days it has.

This is a good thing to have in your repertoire of 'things to try' when faced with these puzzles. Here's an example of a harder one:

3, 3, 5, 4, 4, ?

Look away now if you don't want to know the answer which is below in very small writing:

3. They're the number of letters in the words 'one', 'two', 'three' and so on, and there are three letters in the word 'six'.

What is the next number in the sequence

1, 1, 1, 1
?

If we let the function f be

f(n) = (41/24)(n-1)(n-2)(n-3)(n-4) + 1

then we have

f(1) = 1
f(2) = 1
f(3) = 1
f(4) = 1
f(5) = 42

Therefore the next number might as well be 42. The same trick can be pulled off with any given integer sequence. In fact, it can be any number you like. This is not particularly hard to prove.

If we have been given a sequence of numbers

a1, a2, a3, a4

and we wish to find the value of a5, then we are effectively being asked to find the the function f(x) such that

f(1) = a1
f(2) = a2
f(3) = a3
f(4) = a4

and then to evaluate f(5). Now, as we all know, if we are given any two points we can find a straight line through them. Given any three points we can find a parabola

f(x) = b2x2 + b1x + b0

through them. Correspondingly, given any n points on a graph, we can find a polynomial of degree n-1:

f(x) = b0 + b1x + ... + bn-1xn-1

which will go through of all of them. (Well, you can't if two points are directly above each other, but this is obviously impossible in this case.) This is called Lagrange interpolation. So suppose we know

f(1) = a1
f(2) = a2
...
f(n-1) = an-1 (*)

and we want an to be 42. To find f(x), we have to solve the set of simultaneous equations;

f(1) = b0 + b1 + ... + bn-1 = a1
f(2) = b0 + 2b1 + ... + 2n-1bn-1 = a2
...
f(n) = b0 + nb1 + ... + nn-1bn-1 = an = 42

We have n equations with n unknowns (being the bk). The solution will yield a polynomial f(x) which satisfies (*) and additionally has f(n) = an, for any given an. So for any integer sequence, you can arbitrarily pick your own next number and find a polynomial that proves you correct.

This is a very pedantic point to make, but you have to be a pedant to be a mathematician, because making unjustified assumptions while answering questions can be catastrophic. However, often the answer in questions like these is either fairly obvious or easy to find, so while this technique is an entertaining party trick (if one is at a particularly dull party), it's usually quicker to just find the next integer by traditional methods.

Log in or register to write something here or to contact authors.