This week I was in a job interview and one of my interview questions turned into a pretty interesting puzzle. I was asked the following question:
Assume a character encoding a = 1, b = 2, c = 3, … , z = 26. Given a string of numbers (e.g. '123456'
), figure out how many different ways that string could be decoded.
I didn’t arrive a complete solution in the few minutes I had during the interview but I kept thinking about it afterwards because I couldn’t shake the feeling there was really a math problem here, not just a programming problem.
The difficult part of this challenge is figuring out how many decodings there are when there are overlapping possible groupings of numbers. For example, the string '111'
can be decoded as ['1', '1', '1']
, ['11', '1']
, or ['1', '11']
. I never figured out a formula for calculating the number of decodings but by manually figuring out the number of decodings for the strings '1'
, '11'
, '111'
, '1111'
, '11111'
, '111111'
, etc. I noticed that the number of decodings follow the Fibonacci sequence starting with 1, 2, 3, 5, 8…! Unfortunately I don’t have a good explanation for why that is the case. (Though it makes gut level sense in the way the Fibonacci sequence is a sum of everything that has come before.) Please leave a comment if you can explain this!
So, my strategy for solving this problem follows these basic steps:
- Break the given string down into substrings such that each substring is one digit,
'10'
or '20'
, or cannot be broken down further because it has an unknown number of possible decodings. These substrings can be considered independently for the purposes of this problem.
- For example,
'956'
becomes ['9', '5', '6']
and '121956'
becomes ['1219', '5', '6']
.
- For each substring figure out the number of decodings. For single digits,
'10'
, and '20'
this is just 1. For other substrings this is the Nth value in the Fibonacci series 1, 2, 3, 5, 8… where N is the length of the substring.
- Multiply together all of the number of decodings for the individual substrings to get the number of of decodings for the original string of numbers.
You can see my code for this solution here: http://nbviewer.ipython.org/3831343/ (raw notebook).