In my previous post, I introduced you to the Advent of Code programming challenge, and explained how I’m going through the days, solving both parts of each puzzle. Continuing on, today we have day 10. In this challenge, Santa’s elves are playing look-and-say. Never heard of it before? Well, let me explain it to you.
1
How many of what do you see there? There is one number 1. To represent this, you would have 11.
Now repeat this. What you have now is two ones, represented as 21.
When you repeat this again, you have one two and one one, which is represented as 1211.
Repeated again, you end up with 111221.
Then 312211.
Then 13112221.
Do you get it? Okay then. Part one of the puzzle is, given a specific input string, what is the length of the 40th result? The input string provided to me is 1113222113. The first time processing this number, I need to get 3113322113.
Analyzing these requirements, I need to parse through the string, counting the number of times a number is in the string until the number changes. This process needs to repeat until the end of the string. Since counting is an aggregate function, if I can make this string into a table of the numbers, I should be able to perform the COUNT function against this, and to group by the particular group that the number is in. To convert this string into a table of the numbers, I’ll again use the virtual tally table:
This produces the following results, where N is the position in the string and Item is the number at that position:
Now what I need to do is to group by the sequentially identical numbers, then count them. Essentially, I need to create data islands for each sequentially identical number. Most data islands are used to identify sequentially incrementing numbers, but this is not what is happening here. To accomplish this challenge, I tried to use a simple grouping, but I just wasn’t able to come up with a way to do this. Therefore, I’m going to go back to the quirky update, which I talked about in a previous post. As a reminder, there are undocumented rules to follow when using this format, and a complete explanation of how this works and those rules can be found at //www.sqlservercentral.com/articles/T-SQL/68467/.
I’m going to start off by creating a table that will hold the above results:
And then insert this insert line between lines 8 and 9 of the “Extract numbers into table” code block above:
Now that I have this in a table that I can work with, and a column to assign the grouping to, let’s now perform the quirky update:
This will now produce the following results:
Here we can see that each Item is grouped by a unique Grp value. As the Item changes, the Grp changes. If the Item changes back to one of the Items already used, the Grp value still increments to the next number, instead of using the previous value.
At this point, we just need to get the Item and the count for each grp, and then put them back into a string to be displayed:
Please read my article at //www.sqlservercentral.com/articles/comma+separated+list/71700/ for how this is building the string in what I believe is the fastest possible way in SQL to create a list from a record set.
This code is returning the string “3113322113”. This is exactly what we are looking for. Now we just need to repeat this 40 times, where each time this calls upon the previous result. This sounds like a good job for a recursive common table expression (cte), so let me start off by putting all of the above code into a multi-statement table-valued function to make it easier to call from the cte:
Now I can call this with a recursive cte. Remember that we are looking for the length of the string, so I’ll get that also:
In a recursive cte, the first part of the cte (lines 4-5) gathers an initial result set, and provides the “anchor” member. It is then combined with a “recursive” member through the UNION ALL operator. The recursive member must reference the anchor member, which can be seen above where the recursive member is selecting from the anchor member. Additionally, I’m provided a column to show the level of recursion that is occurring; in the anchor member it is initialized to 1, and in the recursive member it is incremented. Finally, I just select out the information from this cte. I use the maxrecursion option here to prevent this code from running any longer than necessary. And the results that I get from this code for level 40 is a final length of 252594. Wow, that quickly built up a pretty long string.
For Part 2, it specifies to expand this to get the length at the 50th level. All that I need to do is to change the last two lines (13-14) by replacing the 40 with 50. When I run this, at level 46 I get a length of 1,239,364. If you look in the “Extract numbers into table” module, the virtual tally table will only generate numbers up to 1,000,000, which means that this isn’t enough to handle the string. I expanded this virtual tally table to handle 100 million rows by changing line 7 to:
And when this is run, I can get my final answer of over 3.5 million. And that is a long string. I wonder if Santa’s elves can do all of this in their heads?