Advent of Code 2018 – Day 5
As I explained in a recent post, I’m participating in this year’s Advent of Code challenge, with the twist of doing the challenges in T-SQL. In case you don’t know what “Advent of Code” is, Eric Wastl (t) created it for the purpose of, as Eric describes it:
Advent of Code is a series of small programming puzzles for a variety of skill levels. They are self-contained and are just as appropriate for an expert who wants to stay sharp as they are for a beginner who is just learning to code. Each puzzle calls upon different skills and has two parts that build on a theme.
The full explanation of “Advent of Code” is at //adventofcode.com/2018/about. You can see my other posts in the “Advent of Code” category.
2018 Advent of Code recap
We started off with someone going back in time and altering Santa’s past. You went back in time to correct it, but the device to fix the problems hadn’t been calibrated before sending you on your way. You calibrated the device on Day 1. On Day 2, we located the boxes in the warehouse where the new (500 years ago) Santa suit material is. We sliced and diced the material on Day 3, finding the best piece to use. We figured out the best time to sneak into the Santa suit lab on Day 4.
Day 5 – Part 1
In Day 5, we find ourselves working with the polymers of the new Santa suit. A polymer (the input file), consists of units, represented by upper and lower case letters. Adjacent units of the same letter, but of different polarity (case), cancel each other out. This may lead to other units that can then cancel each other out. The goal is to reduce the polymer to as small as possible, and report back the reduced size.
Tasks:
- Perform a case-sensitive search/replace for each letter of the alphabet. The search is for a pair of the same letter, where one is upper case, and the other is lower case.
- Recursively perform this operation until the string can no longer be reduced.
In my opinion, the key part to this is that the operation needs to be performed recursively. I can think of only two ways to recursively perform an operation in SQL Server:
- A recursive common table expression (cte).
- Using a WHILE loop.
I don’t like using either of these mechanisms in SQL Server – they both perform operations in a “Row-By-Agonizing-Row” method, instead of a more set-based approach. However, set-based recursion usually performs extremely poorly. So, I’m going to use a while loop.
Day 5 – Part 1 – Recursive replaces
The code for this operation is relatively short. As you may have noticed from previous days, I like to do the code is small pieces where they can be explained. Today, I’ll break from that and show all of the code, and then explain it.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 |
DECLARE @data VARCHAR(MAX), @Letters CHAR(26)= 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', @Size INTEGER, @NewSize INTEGER = 0, @Loop INTEGER = 0; SELECT @data = FileData FROM OPENROWSET(BULK 'D:\AdventOfCode\2018\Input\Day05.txt', SINGLE_CLOB) dt(FileData); SET @Size = LEN(@data); RAISERROR('Loop %d; Size %d', 10, 1, @Loop, @Size) WITH NOWAIT; WHILE @Size <> @NewSize BEGIN SET @NewSize = @Size; SELECT @data = REPLACE(@data COLLATE SQL_Latin1_General_CP1_CS_AS, ca.UpperLtr + ca.LowerLtr, '') FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26)) dt(N) CROSS APPLY (VALUES (UPPER(SUBSTRING(@Letters, dt.N, 1)), LOWER(SUBSTRING(@Letters, dt.N, 1)))) ca(UpperLtr, LowerLtr) SELECT @data = REPLACE(@data COLLATE SQL_Latin1_General_CP1_CS_AS, ca.LowerLtr + ca.UpperLtr, '') FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26)) dt(N) CROSS APPLY (VALUES (UPPER(SUBSTRING(@Letters, dt.N, 1)), LOWER(SUBSTRING(@Letters, dt.N, 1)))) ca(UpperLtr, LowerLtr) SET @Size = LEN(@data); SET @Loop += 1; RAISERROR('Loop %d; Size %d', 10, 1, @Loop, @Size) WITH NOWAIT; END SELECT @Loop AS Loops, @Size AS Size; |
After declaring and initializing some variables, load the file into a variable. Since this example doesn’t have rows of data, I don’t need to split the data into rows. Store the size of the data so that we can break out of the WHILE loop.
Entering the WHILE loop, we do two case-sensitive replaces. By default, SQL Server is case-insensitive. To do a case-sensitive replace, we specify a case-sensitive collation to work with. In the collation used (“SQL_Latin1_General_CP1_CS_AS”), the “_CS_” identifies that this is a case-sensitive collation.
In the SELECT statements that do the replace, I’m using a table value constructor to create a set of numbers. I know that I’ll only be working with 26 letters, so I create a table with those 26 numbers. Next, the CROSS APPLY converts the numbers into upper-case and lower-case letters. Then the REPLACE function searches for these characters. The first replace looks for Upper\Lower, and the second looks for Lower\Upper.
After performing the replaces, we get the new size of the string. If it’s the same size as what we started the loop with, then the WHILE loop ends. Finally, we return the ending size of the string. When working with recursion, you MUST have a break, or it will run forever.
Day 5 – Part 2
In part 2, we need to identify which unit type could be removed from the polymer to reduce the polymer to the lowest size. This entails removing all occurrences of a unit (upper and lower case) from the polymer, and then running the process to see which unit produces the smallest polymer.
I added another WHILE loop to cycle through each letter. The new WHILE loop goes through each letter, removing all occurrences of it, before starting the recursive replace. At the end of the new loop, store the polymer size. After processing all letters, we see which letter (when removed) produces the smallest polymer.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 |
DECLARE @data VARCHAR(MAX), @Letters CHAR(26)= 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', @Size INTEGER, @NewSize INTEGER = 0, @Loop INTEGER = 0, @Letter CHAR(1) = 'A', @data2 VARCHAR(MAX); SELECT @data2 = FileData FROM OPENROWSET(BULK 'D:\AdventOfCode\2018\Input\Day05.txt', SINGLE_CLOB) dt(FileData); DECLARE @Results TABLE (Letter CHAR(1) PRIMARY KEY, Size INTEGER); WHILE ASCII(@Letter) <= ASCII('Z') BEGIN SET @data = REPLACE(@data2, @Letter, ''); SET @NewSize = 0 SET @Size = LEN(@data); SET @Loop = 0; RAISERROR('Letter: %s; Loop %d; Size %d', 10, 1, @Letter, @Loop, @Size) WITH NOWAIT; WHILE @Size <> @NewSize BEGIN SET @NewSize = @Size; SELECT @data = REPLACE(@data COLLATE SQL_Latin1_General_CP1_CS_AS, ca.UpperLtr + ca.LowerLtr, '') FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26)) dt(N) CROSS APPLY (VALUES (UPPER(SUBSTRING(@Letters, dt.N, 1)), LOWER(SUBSTRING(@Letters, dt.N, 1)))) ca(UpperLtr, LowerLtr) SELECT @data = REPLACE(@data COLLATE SQL_Latin1_General_CP1_CS_AS, ca.LowerLtr + ca.UpperLtr, '') FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26)) dt(N) CROSS APPLY (VALUES (UPPER(SUBSTRING(@Letters, dt.N, 1)), LOWER(SUBSTRING(@Letters, dt.N, 1)))) ca(UpperLtr, LowerLtr) SET @Size = LEN(@data); SET @Loop += 1; RAISERROR('Letter: %s; Loop %d; Size %d', 10, 1, @Letter, @Loop, @Size) WITH NOWAIT; END INSERT INTO @Results (Letter, Size) VALUES (@Letter, @Size); SET @Letter = CHAR(ASCII(@Letter)+1); END SELECT TOP (1) * FROM @Results ORDER BY Size; |
Summary
And here we have a T-SQL solution for Day 5 of the Advent of Code challenge. The key tasks that we can learn from today are:
- Loading a file.
- Case-sensitive and case-insensitive string operations.
- Utilizing a table value constructor.
- While loops.