People sometimes ask me where my brain goes when I check out - and the answer is that I like to work on small, well defined problems and attack them using different approaches. This could be tiny single-pass neural networks, small social media networks, small embedded device drivers, arbitrary encryption algorithms (creating and cracking), and sometimes super permutations.
Super permutations is a simple problem to explain, but a hard problem to solve. Essentially you take all the permutations of a string and try to generate a minimal large string containing all permutations, where overlaps are allowed.
For example, the permutations of a symbol set of length 3 (i.e. {1, 2, 3}
):
{1, 2, 3}
{1, 3, 2}
{2, 1, 3}
{2, 3, 1}
{3, 1, 2}
{3, 2, 1}
Depending on the algorithm you use, the minimal string length is: 123121321
. If you look carefully, you will find each permutations represented in the set.
There are some generally open questions about the problem space:
You can compute lengths of 1, 2, 3 near instantly with brute force, 4 and 5 near instantly with tricks, and length 6 is still being computed. As it currently stands, unless something changes, it’s predicted that a minimal solution for super permutations of length 7 will not be found.
Super permutations are interesting because they are simple. In a short time you could explain the problem to anybody, even a child - and yet the problem is hard to solve. Insight into solving the problem could literally come from anywhere - and I think this is what really makes the problem interesting.
In reality, it’s really not clear what problems a solution could help to solve. Perhaps it could reveal deep insight into solving the Travelling Salesman Problem, or perhaps it reveals some ideas about better compression methods.
One real life problem space I have heard is a garage door that accepts a code, but it checks for the opening key using a sliding window. Any security based on sliding windows could potentially be susceptible to an attack generated by an efficient solution to this problem.
Here I am going to detail some insights I’ve had before now, in the hope that they may help somebody else.
These are a special case of super permutation, where they read the same left-right and right-left. It’s suspected that these may not always produce the most efficient super permutations, but in theory should be the easiest to crack, given you only have to find half of the super permutation string.
I did generate a method for producing super permutation palindromes from 1-5 in a split second, by studying the patterns in the numbers generated. Unfortunately this shows multiple solutions for 5 (although recoverable) and fails for length 6 or more.
As echoed by others, for every palindrome found so far, it seems to be the case that there are no repetitions in the strings produced. You might think “this is obvious”, but actually it really is not. It’s not yet been proven that this holds up past lengths of 6.
Almost exclusively for palindromes, as you are building up the strings it allows you to say two things:
These allow you to drastically reduce the search space.
These are some approaches explored so far…
For this method we are checking every combination possible, starting from the smallest via string size predicted by the lower bound, up until the upper bound (which should not be met). This is of course very slow, so we make a few optimizations:
To give an idea of speed: For length 4 we can process approximately 3 million proposals a second using 8 cores on a i7-6700HQ.
Another method is to take all the combinations of the permutations and performing compression on the strings. By compression, I mean that we remove substrings that are already represented.
Whilst in theory this could be faster, in reality there may be issues as you have to pick one of two or more repeats to keep. The consequence is that it would affect subsequent compressions, meaning that you also need to test all combinations of compressions too.
The problem can be represented as the travelling salesman problem - by far the current most popular method is to use TSP solvers after representing the problem in the appropriate way. Whilst this has yielded minimal strings of length 7, this method does not seem feasible for 8 or higher and also doesn’t really add greater insight into the problem space.
This method is near instant for values less than 6 and fails at 6 or more symbols. This is a problem I am currently working on solving for palindromes, as it seems like the most viable approach to getting near minimal strings. At least for strings up to 5 in length, strings are generated near instantly.
These are just a series of unsupported hunches about the problem space: