## Papaya Words and Numbers, jointly with Sergei Bernstein

Here is a strange puzzle that was inspired by the palindrome problem. Suppose you have a sequence of words in some alphabet with the initial term *a* and all the other terms *b*: *a, b, b, b, b*, etc. Suppose that concatenating the first several terms of this sequence always produces a palindrome, as long as you take at least two terms. So *ab, abb, abbb*, and so on are all palindromes. We call a word *b* a *“papaya”* word if there exists a word *a* such that *a* and *b* generate a sequence with this palindrome property. Can you describe papayas?

**Theorem.** The word *b* is a papaya word if and only if *b* is a substring of Reverse*(b)*Reverse*(b)*.

**Proof.** After we have added *b* so many times that the initial part *a* is much shorter than half of the concatenated string, the middle part of the concatenated string consists of several copies of the word *b*. The middle of the reversed string consists of several copies of Reverse*(b)*. As the string is a palindrome, these two middles coincide, so the word *b* must be a substring of Reverse*(b)*Reverse*(b)*. On the other hand, suppose *b* is a substring of Reverse*(b)*Reverse*(b)*. Then Reverse*(b)*Reverse*(b)* is of the form *xby*, where the lengths of *x* and *y* add up to the length of *b*. Hence Reverse*(b) = xy* and *b = yx*, which forces *x* and *y* to be palindromes, and we can choose *a = x*: the strings *xyx, xyxyx*, and so on are all palindromes.
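The substring test is easy to try out. Below is a minimal Python sketch, not part of the original argument: the function names `is_papaya` and `starter` are made up for illustration, and `starter` takes the part of Reverse*(b)*Reverse*(b)* that precedes the first occurrence of *b* as the candidate for *a* (allowing the empty word when *b* is itself a palindrome, which is an assumption about the puzzle's rules).

```python
def is_palindrome(s: str) -> bool:
    return s == s[::-1]


def is_papaya(b: str) -> bool:
    """Substring test from the theorem: b occurs inside Reverse(b)Reverse(b)."""
    r = b[::-1]
    return b in r + r


def starter(b: str):
    """Return a word a such that a+b, a+b+b, ... are palindromes, or None.

    Writes Reverse(b)Reverse(b) as x + b + y and returns x, the part in
    front of the first occurrence of b (possibly the empty word).
    """
    r = b[::-1]
    i = (r + r).find(b)
    return None if i < 0 else (r + r)[:i]


for b in ["abac", "aab", "abc"]:
    a = starter(b)
    if a is None:
        print(b, "is not a papaya word")
    else:
        ok = all(is_palindrome(a + b * k) for k in range(1, 6))
        print(b, "-> a =", repr(a), "| first five concatenations palindromic:", ok)
```

The loop only checks the first five concatenations, so it illustrates the theorem rather than proving it.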

**Theorem.** Papaya words are either palindromes or concatenations of two palindromes.

**Proof.** Suppose our word is a concatenation *cd* of two palindromes *c* and *d*. Then the reverse of it is *dc* and its double is *dcdc*. The word *cd* is a substring of *dcdc*, thus according to the first theorem, *cd* is a papaya word. Let’s do the other direction. Suppose the word *b* is a substring of Reverse*(b)*Reverse*(b)*. Then Reverse*(b)*Reverse*(b)* is of the form *xby*. Comparing lengths, Reverse*(b) = xy*, so *xyxy = xby* and *b = yx*. On the other hand, the reverse of *yx* is Reverse*(x)*Reverse*(y)*, so *xy* = Reverse*(x)*Reverse*(y)*. From here, Reverse*(x) = x* and Reverse*(y) = y*. If *x* or *y* has zero length, then our word is a palindrome. QED.
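As a sanity check, the two descriptions of papaya words can be compared by brute force on short words. This is only a sketch: the helper names, the three-letter alphabet, and the length bound of 8 are arbitrary choices made here, not part of the proof.

```python
from itertools import product


def is_palindrome(s: str) -> bool:
    return s == s[::-1]


def in_doubled_reverse(b: str) -> bool:
    """First theorem: b is a substring of Reverse(b)Reverse(b)."""
    r = b[::-1]
    return b in r + r


def is_two_palindromes(b: str) -> bool:
    """Second theorem: b = c + d with c, d palindromes (d may be empty)."""
    return any(
        is_palindrome(b[:k]) and is_palindrome(b[k:])
        for k in range(1, len(b) + 1)
    )


def criteria_agree(max_len: int = 8, alphabet: str = "abc") -> bool:
    """Compare both criteria on every word up to max_len over the alphabet."""
    for n in range(1, max_len + 1):
        for letters in product(alphabet, repeat=n):
            word = "".join(letters)
            if in_doubled_reverse(word) != is_two_palindromes(word):
                return False
    return True


print(criteria_agree())
```

Agreement on this finite range is consistent with the theorem but, of course, does not replace the proof.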

Hey, do you already know why we call these words papayas?

Just for fun we would like to study the structure of papaya words. Two words have the same pattern if one becomes the other after renaming letters, so we list one representative per pattern. Any one-character or two-character word is a papaya word, so the patterns are: a, aa, ab. For three-character words there are four patterns: aaa, aab, aba, abb. For four-character words there are 10 patterns: aaaa, aaab, aaba, abaa, aabb, abab, abba, abbb, abac, abcb. In this manner we get the sequence of the number of *n*-character papaya patterns: 1, 2, 4, 10, 21, 50, etc., which is sequence A165137 in the OEIS. This sequence depends on the number of letters in your alphabet, but the first *n* terms are the same for all alphabets that have at least *n* letters.
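The counts above can be reproduced with a short enumeration. The sketch below is one possible way to do it, with hypothetical function names: it generates each pattern once as a word whose first letter is "a" and in which every new letter is the next unused one, then applies the substring test from the first theorem.

```python
def patterns(n: int):
    """Yield every length-n pattern exactly once.

    A pattern is written with the first letter 'a', and each new letter is
    the next unused letter of the alphabet (a restricted growth string).
    """
    def extend(prefix: str, used: int):
        if len(prefix) == n:
            yield prefix
            return
        for i in range(used + 1):
            yield from extend(prefix + chr(ord("a") + i), max(used, i + 1))

    yield from extend("", 0)


def is_papaya(b: str) -> bool:
    r = b[::-1]
    return b in r + r


def papaya_patterns(n: int):
    return [p for p in patterns(n) if is_papaya(p)]


print([len(papaya_patterns(n)) for n in range(1, 7)])  # [1, 2, 4, 10, 21, 50]
print(papaya_patterns(4))
```

Since the number of patterns grows like the Bell numbers, this brute-force approach is only practical for small *n*.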

Let us assume that we are working with an infinite alphabet. The complementary sequence would be the number of patterns for non-papaya words. The total number of patterns is described by sequence A000110 — Bell numbers: the number of word structures of length *n* using an infinite alphabet. So the beginning of this complementary sequence A165610 is: 0, 0, 1, 5, 31, 153, etc. The list of corresponding patterns is abc, aabc, abbc, abcc, abca, abcd, etc.
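The arithmetic behind the complementary sequence is a plain subtraction. The following sketch computes the Bell numbers with the Bell triangle (a standard method, chosen here for brevity) and subtracts the papaya counts quoted in the text. The function name `bell_numbers` is an illustrative choice.

```python
def bell_numbers(count: int):
    """Return B(1), ..., B(count) using the Bell triangle."""
    row, result = [1], []
    for _ in range(count):
        result.append(row[-1])       # last entry of a row is a Bell number
        nxt = [row[-1]]              # next row starts with that entry
        for value in row:
            nxt.append(nxt[-1] + value)
        row = nxt
    return result


papaya_counts = [1, 2, 4, 10, 21, 50]       # values quoted in the text
bell = bell_numbers(len(papaya_counts))
non_papaya = [b - p for b, p in zip(bell, papaya_counts)]

print(bell)        # [1, 2, 5, 15, 52, 203]
print(non_papaya)  # [0, 0, 1, 5, 31, 153]
```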

Historically, we first invented the corresponding sequence for numbers, not for words. We call a number a *papaya* number if it is a palindrome or a concatenation of two palindromes. If we use numbers instead of words in the problem, we need to look carefully at what happens when we encounter leading zeros. Let’s take the papaya number 2200100, and see if we can find a number *a*, such that appending 2200100 repeatedly to a sequence starting with *a* will always generate a palindrome. The number *a* must be 00100. But this is not a number. We have two choices: to say that we are working with strings of digits, or to allow several numbers to start the sequence before we add *b* repeatedly and before getting to palindromes. In the latter case our sequence can start 0, 0, 100, 2200100, 2200100, and so on.
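Both choices can be illustrated on the example 2200100. In the sketch below, the variable names are hypothetical. The first part treats everything as digit strings, and the second part starts the sequence with the numbers 0, 0, 100 and concatenates their decimal representations.

```python
def is_palindrome(s: str) -> bool:
    return s == s[::-1]


b = "2200100"
a = "00100"

# Choice 1: work with strings of digits, so a may keep its leading zeros.
print(all(is_palindrome(a + b * k) for k in range(1, 6)))

# Converting a to an integer drops the leading zeros, which is the problem.
print(int(a))  # 100

# Choice 2: start with several numbers, here 0, 0, 100, then repeat b.
terms = [0, 0, 100] + [2200100] * 3
digits = [str(t) for t in terms]
prefixes = ["".join(digits[:k]) for k in range(4, len(digits) + 1)]
print(prefixes[0])
print(all(is_palindrome(p) for p in prefixes))
```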

As we mentioned before, the number of patterns of papaya numbers will start the same as the number of patterns of papaya words. Later the sequence of patterns of numbers A165136 will be smaller than the corresponding sequence for words. As the sequence of Bell numbers is much more famous than the sequence A164864 of patterns of numbers, we expect the papaya patterns sequence corresponding to the infinite alphabet to be more interesting and important than the sequence of papaya patterns for numbers.

Though papaya numbers might be less important than papaya words with an infinite alphabet, they have an advantage in that we can generate more sequences with them. For example, we can calculate the number of positive papaya numbers with *n* digits, as in the sequence A165135: 9, 90, 252, 1872, 4464, 29250, etc. And we can also calculate the sequence A165611 of *n*-digit non-papaya numbers: 0, 0, 648, 7128, 85536, etc.
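These counts can be checked by direct enumeration for small *n*. The sketch below assumes that a papaya number is tested on its decimal string, so the second palindrome may begin with zeros (for example, 100 splits as 1 and 00). The function names are illustrative.

```python
def is_papaya(s: str) -> bool:
    """Substring test: s occurs inside Reverse(s)Reverse(s)."""
    r = s[::-1]
    return s in r + r


def count_papaya_numbers(n: int) -> int:
    """Count positive n-digit numbers whose decimal string is papaya."""
    return sum(is_papaya(str(m)) for m in range(10 ** (n - 1), 10 ** n))


sizes = range(1, 6)
papaya = [count_papaya_numbers(n) for n in sizes]
non_papaya = [9 * 10 ** (n - 1) - c for n, c in zip(sizes, papaya)]

print(papaya)      # [9, 90, 252, 1872, 4464]
print(non_papaya)  # [0, 0, 648, 7128, 85536]
```

The loop visits every *n*-digit number, so going much beyond six or seven digits this way gets slow.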

## impartial james:

I’ve come across an interesting related problem. Initially, a chalkboard has the single letter words “A” and “B” written on it. The allowed moves are to concatenate one of the strings on the board to the other, on either the left or right. In other words, if at any point the two words on the board are (x,y), then the allowed moves are to (xy,y), (yx,y), (x,xy), and (x,yx). The goal is to prove that at any point, both words on the board are papaya words. I thought you might find it interesting, though I do not know the solution.

3 February 2018, 2:14 pm

A, B

AB, B

ABB, B

ABB, BABB

ABB, BABBABB

BABBABBABB, BABBABB
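The claim in this comment can at least be tested empirically. The sketch below is an experiment, not a proof: it plays random games from the board (A, B) using the four allowed moves and looks for a board containing a non-papaya word. The function names, the fixed random seed, the number of games, and the game length are arbitrary choices made here.

```python
import random


def is_papaya(s: str) -> bool:
    r = s[::-1]
    return s in r + r


def random_play(steps: int, rng: random.Random):
    """Play one random game; return the first bad board, or None."""
    x, y = "A", "B"
    for _ in range(steps):
        # The four allowed moves from the board (x, y).
        x, y = rng.choice([(x + y, y), (y + x, y), (x, x + y), (x, y + x)])
        if not (is_papaya(x) and is_papaya(y)):
            return (x, y)
    return None


rng = random.Random(0)
results = [random_play(12, rng) for _ in range(1000)]
print(all(board is None for board in results))

# The boards listed in the comment, checked the same way.
example = [("A", "B"), ("AB", "B"), ("ABB", "B"), ("ABB", "BABB"),
           ("ABB", "BABBABB"), ("BABBABBABB", "BABBABB")]
print(all(is_papaya(x) and is_papaya(y) for x, y in example))
```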