Advent of Code 2018 – Day 4
As I explained in a recent post, I’m participating in this year’s Advent of Code challenge, with the twist of doing the challenges in T-SQL. In case you don’t know what “Advent of Code” is, Eric Wastl (t) created it for the purpose of, as Eric describes it:
Advent of Code is a series of small programming puzzles for a variety of skill levels. They are self-contained and are just as appropriate for an expert who wants to stay sharp as they are for a beginner who is just learning to code. Each puzzle calls upon different skills and has two parts that build on a theme.
The full explanation of “Advent of Code” is at //adventofcode.com/2018/about. You can see my other posts in the “Advent of Code” category.
2018 Advent of Code recap
We started of with someone going back in time and altering Santa’s past. You went back in time to correct it, but the device to fix the problems hadn’t been calibrated before sending you on your way. You calibrated the device on Day 1. On day 2, we located the boxes in the warehouse where the new (500 years ago) Santa suit material is. We sliced and diced the material on Day 3, finding the best piece to use.
Day 4, Part 1 – Loading the file
You need to sneak inside the Santa suit lab, but there is a guard outside it. Across the hall is a supply closet where you can observe things, and you see evidence of someone else observing the lab from there. Someone has written all over the walls their observation of the guard post. The observation covers only the midnight hour shift. It has the time (in YYYY-MM-DD hh:mm format), what guard went on shift, and when they fall asleep and wake up. The input file has an unsorted list of records. For this puzzle, the time recorded as “wakes up” means that the entire minute was awake. That minute is not counted as asleep.
For the first part of today’s puzzle, we need to figure out which guard spent the most time sleeping, and which minute of the hour was the guard asleep the most. The answer is the GuardID times the minute determined.
As we’ve done in the past, we start by loading the file and splitting it into rows.
1 2 3 4 5 |
SELECT ca2.Item FROM OPENROWSET(BULK 'D:\AdventOfCode\2018\Input\Day04.txt', SINGLE_CLOB) dt(FileData) CROSS APPLY dbo.Split(dt.FileData, CHAR(13)) ca1 -- split file into rows CROSS APPLY (VALUES (REPLACE(ca1.Item, CHAR(10), ''))) ca2(Item) -- remove LF (CHAR(10) ORDER BY ca2.Item |
Day 4, Part 1 – Extracting data
A few things that we note about this file:
- Only the record that marks the beginning of the shift has the guard number.
- A guard may fall asleep more than once during the shift.
- The guard can start the shift before or after midnight.
- The time, guard ID#, and when the guard falls asleep and wakes up is extracted from the data.
All of these issues are handled during the processing of the file. To handle #3, I’m going to create an extra column with the date of the guard shift. This is the timestamp of the records, converting to just the date. We increment the date by one day to get the proper day of the shift when the hour is 23. The code for handling #3 and #4 is:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 |
WITH cte AS ( SELECT --ca2.Item, ca5.itemTimeStamp, ca5.GuardID, ca5.ActionItem, ca5.dtTimeStamp, ca6.GuardDate FROM OPENROWSET(BULK 'D:\AdventOfCode\2018\Input\Day04.txt', SINGLE_CLOB) dt(FileData) CROSS APPLY dbo.Split(dt.FileData, CHAR(13)) ca1 -- split file into rows CROSS APPLY (VALUES (REPLACE(ca1.Item, CHAR(10), ''))) ca2(Item) -- remove LF (CHAR(10) -- Find the position in the string for the end of the timestamp and start of the guard number CROSS APPLY (VALUES (CHARINDEX(']', ca2.Item), -- end of timestamp CHARINDEX('#', ca2.Item) -- start of guard number )) ca3(posTimeEnd, posGuard) -- Calculate a start of the action CROSS APPLY (VALUES (CASE WHEN ca3.posGuard = 0 THEN ca3.posTimeEnd + 2 ELSE CHARINDEX(' ', ca2.Item, ca3.posGuard)+1 END)) ca4(posEnd) -- Calculate the timestamp, guard number, and action item -- Include the timestamp converted to a datetime2 datatype CROSS APPLY (VALUES (SUBSTRING(ca2.Item, 2, 16), -- timestamp CASE WHEN ca3.posGuard > 0 THEN SUBSTRING(ca2.Item, ca3.posGuard+1, CHARINDEX(' ', ca2.Item, ca3.posGuard) - ca3.posGuard - 1) ELSE NULL END, -- guard number SUBSTRING(ca2.Item, ca4.posEnd, 8000), -- action item CONVERT(DATETIME2, SUBSTRING(ca2.Item, 2, 16)) -- timestamp converted to SQL DATETIME2 )) ca5(itemTimeStamp, GuardID, ActionItem, dtTimeStamp) -- Calculate the date standing guard. If the guard checked in early, add a day to this date. CROSS APPLY (VALUES (CASE WHEN DATEPART(HOUR, ca5.dtTimeStamp) = 23 THEN DATEADD(DAY, 1, CONVERT(DATE, ca5.dtTimeStamp)) ELSE CONVERT(DATE, ca5.dtTimeStamp) END)) ca6(GuardDate) ) SELECT * FROM cte |
Day 4, Part 1 – Determining the guard for the other records
Next, we need to get the guard ID # for the records that don’t have it. Since there is only one guard on watch at a time, we can compare the record with the guard ID to the ones without based on the shift date.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 |
WITH cte AS ( SELECT --ca2.Item, ca5.itemTimeStamp, ca5.GuardID, ca5.ActionItem, ca5.dtTimeStamp, ca6.GuardDate FROM OPENROWSET(BULK 'D:\AdventOfCode\2018\Input\Day04.txt', SINGLE_CLOB) dt(FileData) CROSS APPLY dbo.Split(dt.FileData, CHAR(13)) ca1 -- split file into rows CROSS APPLY (VALUES (REPLACE(ca1.Item, CHAR(10), ''))) ca2(Item) -- remove LF (CHAR(10) -- Find the position in the string for the end of the timestamp and start of the guard number CROSS APPLY (VALUES (CHARINDEX(']', ca2.Item), -- end of timestamp CHARINDEX('#', ca2.Item) -- start of guard number )) ca3(posTimeEnd, posGuard) -- Calculate a start of the action CROSS APPLY (VALUES (CASE WHEN ca3.posGuard = 0 THEN ca3.posTimeEnd + 2 ELSE CHARINDEX(' ', ca2.Item, ca3.posGuard)+1 END)) ca4(posEnd) -- Calculate the timestamp, guard number, and action item -- Include the timestamp converted to a datetime2 datatype CROSS APPLY (VALUES (SUBSTRING(ca2.Item, 2, 16), -- timestamp CASE WHEN ca3.posGuard > 0 THEN SUBSTRING(ca2.Item, ca3.posGuard+1, CHARINDEX(' ', ca2.Item, ca3.posGuard) - ca3.posGuard - 1) ELSE NULL END, -- guard number SUBSTRING(ca2.Item, ca4.posEnd, 8000), -- action item CONVERT(DATETIME2, SUBSTRING(ca2.Item, 2, 16)) -- timestamp converted to SQL DATETIME2 )) ca5(itemTimeStamp, GuardID, ActionItem, dtTimeStamp) -- Calculate the date standing guard. If the guard checked in early, add a day to this date. CROSS APPLY (VALUES (CASE WHEN DATEPART(HOUR, ca5.dtTimeStamp) = 23 THEN DATEADD(DAY, 1, CONVERT(DATE, ca5.dtTimeStamp)) ELSE CONVERT(DATE, ca5.dtTimeStamp) END)) ca6(GuardDate) ), cte2 AS ( SELECT t1.itemTimeStamp, t2.GuardID, t1.ActionItem, t1.dtTimeStamp, t2.GuardDate FROM cte t1 JOIN cte t2 ON t1.GuardDate = t2.GuardDate WHERE t2.GuardID IS NOT NULL ) SELECT * FROM cte2 |
In cte2, cte is joined to itself based on the GuardDate. The second instance of cte is only looking at the records with the GuardID. This allows us to get the GuardID for all of the records.
Day 4, Part 1 – Calculating how long a guard was sleeping
Next, we need to calculate how long a guard was asleep. With the input file ordered by the time, we look at the rows when the guard wakes up, and then using the LAG function to look at the previous row (when the guard went to sleep). While we’re doing this, we will also get the minute the guard went to sleep, and the last minute the guard was asleep.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 |
WITH cte AS ( SELECT --ca2.Item, ca5.itemTimeStamp, ca5.GuardID, ca5.ActionItem, ca5.dtTimeStamp, ca6.GuardDate FROM OPENROWSET(BULK 'D:\AdventOfCode\2018\Input\Day04.txt', SINGLE_CLOB) dt(FileData) CROSS APPLY dbo.Split(dt.FileData, CHAR(13)) ca1 -- split file into rows CROSS APPLY (VALUES (REPLACE(ca1.Item, CHAR(10), ''))) ca2(Item) -- remove LF (CHAR(10) -- Find the position in the string for the end of the timestamp and start of the guard number CROSS APPLY (VALUES (CHARINDEX(']', ca2.Item), -- end of timestamp CHARINDEX('#', ca2.Item) -- start of guard number )) ca3(posTimeEnd, posGuard) -- Calculate a start of the action CROSS APPLY (VALUES (CASE WHEN ca3.posGuard = 0 THEN ca3.posTimeEnd + 2 ELSE CHARINDEX(' ', ca2.Item, ca3.posGuard)+1 END)) ca4(posEnd) -- Calculate the timestamp, guard number, and action item -- Include the timestamp converted to a datetime2 datatype CROSS APPLY (VALUES (SUBSTRING(ca2.Item, 2, 16), -- timestamp CASE WHEN ca3.posGuard > 0 THEN SUBSTRING(ca2.Item, ca3.posGuard+1, CHARINDEX(' ', ca2.Item, ca3.posGuard) - ca3.posGuard - 1) ELSE NULL END, -- guard number SUBSTRING(ca2.Item, ca4.posEnd, 8000), -- action item CONVERT(DATETIME2, SUBSTRING(ca2.Item, 2, 16)) -- timestamp converted to SQL DATETIME2 )) ca5(itemTimeStamp, GuardID, ActionItem, dtTimeStamp) -- Calculate the date standing guard. If the guard checked in early, add a day to this date. CROSS APPLY (VALUES (CASE WHEN DATEPART(HOUR, ca5.dtTimeStamp) = 23 THEN DATEADD(DAY, 1, CONVERT(DATE, ca5.dtTimeStamp)) ELSE CONVERT(DATE, ca5.dtTimeStamp) END)) ca6(GuardDate) ), cte2 AS ( SELECT t1.itemTimeStamp, t2.GuardID, t1.ActionItem, t1.dtTimeStamp, t2.GuardDate FROM cte t1 JOIN cte t2 ON t1.GuardDate = t2.GuardDate WHERE t2.GuardID IS NOT NULL ), cte3 AS ( -- By ordering on dtTimeStamp, when the action for this row is "wakes up", -- the prior row will be "falls asleep". Use the LAG function to get that time. -- Then determine the number of minutes asleep, and the minutes that the -- guard went to sleep and when the guard stopped being asleep (the prior minute -- for when "wakes up"). SELECT *, CASE WHEN ActionItem = 'wakes up' THEN DATEDIFF(MINUTE, LAG(dtTimeStamp, 1) OVER( ORDER BY dtTimeStamp), dtTimeStamp) ELSE NULL END AS MinutesAsleep, CASE WHEN ActionItem = 'wakes up' THEN DATEPART(MINUTE, LAG(dtTimeStamp, 1) OVER( ORDER BY dtTimeStamp)) ELSE NULL END AS SleepStart, CASE WHEN ActionItem = 'wakes up' THEN DATEPART(MINUTE, dtTimeStamp)-1 ELSE NULL END AS SleepEnd FROM cte2 ) SELECT * FROM cte3 |
Day 4, Part 1 – Calculating the answer
It becomes a simple matter of adding up the MinutesAsleep for each guard to see which one was asleep the most:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 |
WITH cte AS ( SELECT --ca2.Item, ca5.itemTimeStamp, ca5.GuardID, ca5.ActionItem, ca5.dtTimeStamp, ca6.GuardDate FROM OPENROWSET(BULK 'D:\AdventOfCode\2018\Input\Day04.txt', SINGLE_CLOB) dt(FileData) CROSS APPLY dbo.Split(dt.FileData, CHAR(13)) ca1 -- split file into rows CROSS APPLY (VALUES (REPLACE(ca1.Item, CHAR(10), ''))) ca2(Item) -- remove LF (CHAR(10) -- Find the position in the string for the end of the timestamp and start of the guard number CROSS APPLY (VALUES (CHARINDEX(']', ca2.Item), -- end of timestamp CHARINDEX('#', ca2.Item) -- start of guard number )) ca3(posTimeEnd, posGuard) -- Calculate a start of the action CROSS APPLY (VALUES (CASE WHEN ca3.posGuard = 0 THEN ca3.posTimeEnd + 2 ELSE CHARINDEX(' ', ca2.Item, ca3.posGuard)+1 END)) ca4(posEnd) -- Calculate the timestamp, guard number, and action item -- Include the timestamp converted to a datetime2 datatype CROSS APPLY (VALUES (SUBSTRING(ca2.Item, 2, 16), -- timestamp CASE WHEN ca3.posGuard > 0 THEN SUBSTRING(ca2.Item, ca3.posGuard+1, CHARINDEX(' ', ca2.Item, ca3.posGuard) - ca3.posGuard - 1) ELSE NULL END, -- guard number SUBSTRING(ca2.Item, ca4.posEnd, 8000), -- action item CONVERT(DATETIME2, SUBSTRING(ca2.Item, 2, 16)) -- timestamp converted to SQL DATETIME2 )) ca5(itemTimeStamp, GuardID, ActionItem, dtTimeStamp) -- Calculate the date standing guard. If the guard checked in early, add a day to this date. CROSS APPLY (VALUES (CASE WHEN DATEPART(HOUR, ca5.dtTimeStamp) = 23 THEN DATEADD(DAY, 1, CONVERT(DATE, ca5.dtTimeStamp)) ELSE CONVERT(DATE, ca5.dtTimeStamp) END)) ca6(GuardDate) ), cte2 AS ( SELECT t1.itemTimeStamp, t2.GuardID, t1.ActionItem, t1.dtTimeStamp, t2.GuardDate FROM cte t1 JOIN cte t2 ON t1.GuardDate = t2.GuardDate WHERE t2.GuardID IS NOT NULL ), cte3 AS ( -- By ordering on dtTimeStamp, when the action for this row is "wakes up", -- the prior row will be "falls asleep". Use the LAG function to get that time. -- Then determine the number of minutes asleep, and the minutes that the -- guard went to sleep and when the guard stopped being asleep (the prior minute -- for when "wakes up"). SELECT *, CASE WHEN ActionItem = 'wakes up' THEN DATEDIFF(MINUTE, LAG(dtTimeStamp, 1) OVER( ORDER BY dtTimeStamp), dtTimeStamp) ELSE NULL END AS MinutesAsleep, CASE WHEN ActionItem = 'wakes up' THEN DATEPART(MINUTE, LAG(dtTimeStamp, 1) OVER( ORDER BY dtTimeStamp)) ELSE NULL END AS SleepStart, CASE WHEN ActionItem = 'wakes up' THEN DATEPART(MINUTE, dtTimeStamp)-1 ELSE NULL END AS SleepEnd FROM cte2 ), cte4 AS ( -- The guard with the most time asleep is determined by adding up the total time asleep SELECT TOP (1) GuardID, SUM(MinutesAsleep) AS TotalSleepTime FROM cte3 WHERE cte3.ActionItem = 'wakes up' GROUP BY GuardID ORDER BY SUM(MinutesAsleep) DESC ) SELECT * FROM cte4 |
However, we still need to get which minute that guard was most frequently asleep. Using a virtual tally table, we get all the minutes between the SleepStart and SleepEnd minutes from cte3. Join that to cte4 to only get the data for the sleepiest guard. I covered the virtual tally table in Day 2.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 |
-- Virtual tally table that gets at least 60 numbers (for minutes in an hour) WITH Tens (N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1), Hundreds(N) AS (SELECT 1 FROM Tens t1, Tens t2), Tally (N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM Hundreds) ,cte AS ( SELECT --ca2.Item, ca5.itemTimeStamp, ca5.GuardID, ca5.ActionItem, ca5.dtTimeStamp, ca6.GuardDate FROM OPENROWSET(BULK 'D:\AdventOfCode\2018\Input\Day04.txt', SINGLE_CLOB) dt(FileData) CROSS APPLY dbo.Split(dt.FileData, CHAR(13)) ca1 -- split file into rows CROSS APPLY (VALUES (REPLACE(ca1.Item, CHAR(10), ''))) ca2(Item) -- remove LF (CHAR(10) -- Find the position in the string for the end of the timestamp and start of the guard number CROSS APPLY (VALUES (CHARINDEX(']', ca2.Item), -- end of timestamp CHARINDEX('#', ca2.Item) -- start of guard number )) ca3(posTimeEnd, posGuard) -- Calculate a start of the action CROSS APPLY (VALUES (CASE WHEN ca3.posGuard = 0 THEN ca3.posTimeEnd + 2 ELSE CHARINDEX(' ', ca2.Item, ca3.posGuard)+1 END)) ca4(posEnd) -- Calculate the timestamp, guard number, and action item -- Include the timestamp converted to a datetime2 datatype CROSS APPLY (VALUES (SUBSTRING(ca2.Item, 2, 16), -- timestamp CASE WHEN ca3.posGuard > 0 THEN SUBSTRING(ca2.Item, ca3.posGuard+1, CHARINDEX(' ', ca2.Item, ca3.posGuard) - ca3.posGuard - 1) ELSE NULL END, -- guard number SUBSTRING(ca2.Item, ca4.posEnd, 8000), -- action item CONVERT(DATETIME2, SUBSTRING(ca2.Item, 2, 16)) -- timestamp converted to SQL DATETIME2 )) ca5(itemTimeStamp, GuardID, ActionItem, dtTimeStamp) -- Calculate the date standing guard. If the guard checked in early, add a day to this date. CROSS APPLY (VALUES (CASE WHEN DATEPART(HOUR, ca5.dtTimeStamp) = 23 THEN DATEADD(DAY, 1, CONVERT(DATE, ca5.dtTimeStamp)) ELSE CONVERT(DATE, ca5.dtTimeStamp) END)) ca6(GuardDate) ), cte2 AS ( SELECT t1.itemTimeStamp, t2.GuardID, t1.ActionItem, t1.dtTimeStamp, t2.GuardDate FROM cte t1 JOIN cte t2 ON t1.GuardDate = t2.GuardDate WHERE t2.GuardID IS NOT NULL ), cte3 AS ( -- By ordering on dtTimeStamp, when the action for this row is "wakes up", -- the prior row will be "falls asleep". Use the LAG function to get that time. -- Then determine the number of minutes asleep, and the minutes that the -- guard went to sleep and when the guard stopped being asleep (the prior minute -- for when "wakes up"). SELECT *, CASE WHEN ActionItem = 'wakes up' THEN DATEDIFF(MINUTE, LAG(dtTimeStamp, 1) OVER( ORDER BY dtTimeStamp), dtTimeStamp) ELSE NULL END AS MinutesAsleep, CASE WHEN ActionItem = 'wakes up' THEN DATEPART(MINUTE, LAG(dtTimeStamp, 1) OVER( ORDER BY dtTimeStamp)) ELSE NULL END AS SleepStart, CASE WHEN ActionItem = 'wakes up' THEN DATEPART(MINUTE, dtTimeStamp)-1 ELSE NULL END AS SleepEnd FROM cte2 ), cte4 AS ( -- The guard with the most time asleep is determined by adding up the total time asleep SELECT TOP (1) GuardID, SUM(MinutesAsleep) AS TotalSleepTime FROM cte3 WHERE cte3.ActionItem = 'wakes up' GROUP BY GuardID ORDER BY SUM(MinutesAsleep) DESC ) -- Now we just get, for the guard that slept the most, which minute was the most sleepy time. SELECT TOP (1) cte3.GuardID, Tally.N AS [Minute], COUNT(*) AS Counter, cte3.GuardID * Tally.N AS Answer FROM cte3 JOIN cte4 ON cte3.GuardID = cte4.GuardID -- guard with most time asleep JOIN Tally ON Tally.N BETWEEN cte3.SleepStart AND cte3.SleepEnd -- each minute guard was asleep WHERE cte3.ActionItem = 'wakes up' GROUP BY cte3.GuardID, Tally.N ORDER BY Counter DESC; |
Day 4, Part 2 – Calculating the answer
With Part 2 unlocked, we have another scenario to check. Instead of getting the sleepiest minute for the sleepiest guard, this time we want to get which guard is most often asleep at the same minute. The calculation for the answer is the same: GuardID times the minute.
In looking over the requirements and the code we already have written, we see that we have all the parts that we need. The only thing that we need to change is to remove (or comment out) the line “JOIN cte4…”. Technically, we could remove the entire cte4 CTE, but the query will run just fine with it in place.
1 2 3 4 5 6 7 8 9 |
-- Now we just get which minute was the most sleepy time by any guard. SELECT TOP (1) cte3.GuardID, Tally.N AS [Minute], COUNT(*) AS Counter, cte3.GuardID * Tally.N AS Answer FROM cte3 --JOIN cte4 ON cte3.GuardID = cte4.GuardID -- guard with most time asleep JOIN Tally ON Tally.N BETWEEN cte3.SleepStart AND cte3.SleepEnd -- each minute guard was asleep WHERE cte3.ActionItem = 'wakes up' GROUP BY cte3.GuardID, Tally.N ORDER BY Counter DESC; |
Summary
And here we have a T-SQL solution for Day 4 of the Advent of Code challenge. The key tasks that we can learn from today are:
- Loading a file.
- Split a string on a delimiter.
- Assigning a sequential number to a set of rows in a specific order.
- Construction of a virtual TALLY table.
- Using a TALLY table to extract each character from a string.
- Getting values from earlier rows.
- Use of the GROUP BY and HAVING clauses while performing an aggregation.