I recently stumbled upon some notes (in Russian) of a public lecture given by Vladimir Arnold in 2006. In this lecture Arnold defines a notion of complexity for finite binary strings.
Consider a set of binary strings of length n. Let us first define the Ducci map acting on this set. The result of this operator acting on a string a1a2…an is a string of length n such that its i-th character is |ai − a(i+1)| for i < n, and the n-th character is |an − a1|. We can view this as a difference operator in the field F2, and we consider strings wrapped around. Or we can say that strings are periodic and infinite in both directions.
Let’s consider as an example the action of the Ducci map on strings of  length 6. Since the Ducci map respects cyclic permutation as well as  reflection, I will only check strings up to cyclic permutation and  reflection. If I denote the Ducci map as D, then the Ducci operator is  determined by its action on the following 13 strings, which represent  all 64 strings up to cyclic permutation and reflection: D(000000) =  000000, D(000001) = 000011, D(000011) = 000101, D(000101) = 001111,  D(000111) = 001001, D(001001) = 011011, D(001011) = 011101, D(001111) =  010001, D(010101) = 111111, D(010111) = 111101, D(011011) = 101101,  D(011111) = 100001, D(111111) = 000000.
Now suppose we take a string and apply the Ducci map several times.  Because of the pigeonhole principle, this procedure is eventually  periodic. On strings of length 6, there are 4 cycles. One cycle of  length 1 consists of the string 000000. One cycle of length 3 consists  of the strings 011011, 101101 and 110110. Finally, there are two cycles  of length 6: the first one is 000101, 001111, 010001, 110011, 010100,  111100, and the second one is shifted by one character.
We can represent the strings as vertices and the Ducci map as a  collection of directed edges between vertices. All 64 vertices  corresponding to strings of length 6 generate a graph with 4 connected  components, each of which contains a unique cycle.
The Ducci map is similar to a differential operator. Hence, sequences  that end up at the point 000000 are similar to polynomials. Arnold  decided that polynomials should have lower complexity than other  functions. I do not completely agree with that decision; I don’t have a  good explanation for it. In any case, he proposes the following notion  of complexity for such strings.
Strings that end up at cycles of longer length should be considered more  complex than strings that end up at cycles with shorter length. Within  the connected component, the strings that are further away from the  cycle should have greater complexity. Thus the string 000000 has the  lowest complexity, followed by the string 111111, as D(111111) = 000000.  Next in increasing complexity are the strings 010101 and 101010. At  this point the strings that represent polynomials are exhausted and the  next more complex strings would be the three strings that form a cycle  of length three: 011011, 101101 and 110110. If we assign 000000 a  complexity of 1, then we can assign a number representing complexity to  any other string. For example, the string 111111 would have complexity  2, and strings 010101 and 101010 would have complexity 3.
I am not completely satisfied with Arnold’s notion of complexity. First,  as I mentioned before, I think that some high-degree polynomials are so  much uglier than other functions that there is no reason to consider  them having lower complexity. Second, I want to give a definition of  complexity for periodic strings. There is a slight difference between  periodic strings and finite strings that are wrapped around. Indeed, the  string 110 of length 3 and the string 110110 of length 6 correspond to  the same periodic string, but as finite strings it might make sense to  think of string 110110 as more complex than string 110. As I want to  define complexity for periodic strings, I want the complexity of the  periodic strings corresponding to 110 and 110110 to be the same. So this  is my definition of complexity for periodic strings: let’s call the  complexity of the string the number of edges we need to traverse in the  Ducci graph until we get to a string we saw before. For example, let us  start with string 011010. Arrows represent the Ducci map: 011010 →  101110 → 110011 → 010100 → 111100 → 000101 → 001111 → 010001 → 110011.  We saw 110011 before, so the number of edges, and thus the complexity,  is 8.
The table below describes the complexity of the binary strings of length  6. The first column shows one string in a class up to a rotation or  reflections. The second column shows the number of strings in a class.  The next column provides the Ducci map of the given string, followed by  the length of the cycle. The last two columns show Arnold’s complexity  and my complexity.
| String s | 
# of Strings | 
D(s) | 
Length of the end cycle | 
Arnold’s complexity | 
My complexity | 
| 000000 | 
1 | 
000000 | 
1 | 
1 | 
1 | 
| 000001 | 
6 | 
000011 | 
6 | 
9 | 
8 | 
| 000011 | 
6 | 
000101 | 
6 | 
8 | 
7 | 
| 000101 | 
6 | 
001111 | 
6 | 
7 | 
6 | 
| 000111 | 
6 | 
001001 | 
3 | 
6 | 
5 | 
| 001001 | 
3 | 
011011 | 
3 | 
5 | 
4 | 
| 001011 | 
12 | 
011101 | 
6 | 
9 | 
8 | 
| 001111 | 
6 | 
010001 | 
6 | 
7 | 
6 | 
| 010101 | 
2 | 
111111 | 
1 | 
3 | 
3 | 
| 010111 | 
6 | 
111001 | 
6 | 
8 | 
7 | 
| 011011 | 
3 | 
101101 | 
3 | 
4 | 
3 | 
| 011111 | 
6 | 
100001 | 
6 | 
9 | 
8 | 
| 111111 | 
1 | 
000000 | 
1 | 
2 | 
2 | 
As you can see, for examples of length six my complexity doesn’t differ  much from Arnold’s complexity, but for longer strings the difference  will be more significant. Also, I am pleased to see that the sequence  011010, the one that I called The Random Sequence in one of my previous essays, has the highest complexity.
I know that my definition of complexity is only for periodic sequences.  For example, the binary expansion of pi will have a very high  complexity, though it can be represented by one Greek letter. But for  periodic strings it always gives a number that can be used as a measure  of complexity.
Share:



