Advent of Code, Day 5

2016-01-08 Wayne Sheffield No comments

In my previous post, I introduced you to the Advent of Code programming challenge, and explained how I’m going through the days, solving both parts of each puzzle. Continuing on, today we have day 5. Since you can only see this if you are up to this level in the challenge, let me explain it to you. In part one of this puzzle, Santa needs to figure out how many of the strings in his text file are “nice” strings. The text file to use can be found here. The string is a “nice” string if it contains at least 3 vowels, has any letter that appears twice consecutively, and it does not contain the strings “ab”, “cd”, “pq” or “xy”.

Once again, this list exceeds the limitations of my favorite string splitter function, so I’m going to use the alternate manner I used earlier to load it:

DECLARE @InputText VARCHAR(MAX);
SELECT  @InputText = REPLACE(FileText, CHAR(10), '')
FROM    OPENROWSET(BULK 'D:\AdventOfCode\Day05.txt', SINGLE_CLOB) UselessAlias(FileText)

DECLARE @TotalLength INTEGER = LEN(@InputText),
        @LengthNoCR  INTEGER = LEN(REPLACE(@InputText, CHAR(13), ''));
DECLARE @TotalRows   INTEGER = @TotalLength - @LengthNoCR;
DECLARE @SizePerRow  INTEGER = @TotalLength / @TotalRows;
DECLARE @CRPosition  INTEGER = CHARINDEX(CHAR(13), @InputText, @TotalLength / 2);

DECLARE @InputText VARCHAR(MAX);

SELECT @InputText = REPLACE(FileText, CHAR(10), '')

FROM OPENROWSET(BULK 'D:\AdventOfCode\Day05.txt', SINGLE_CLOB) UselessAlias(FileText)

DECLARE @TotalLength INTEGER = LEN(@InputText),

@LengthNoCR INTEGER = LEN(REPLACE(@InputText, CHAR(13), ''));

DECLARE @TotalRows INTEGER = @TotalLength - @LengthNoCR;

DECLARE @SizePerRow INTEGER = @TotalLength / @TotalRows;

DECLARE @CRPosition INTEGER = CHARINDEX(CHAR(13), @InputText, @TotalLength / 2);

The list (in the @InputText variable) actually contains 1000 lines. I’m declaring a few variables to determine how many rows there are, and the number of characters per row. In the next phase, I’m once again using a virtual tally table to split this string. I then check each line for whether it contains any of those conditions by comparing the length of the string to the string after the characters to search for have been removed, and finally I get a count of the number that meet the nice criteria.

WITH Tens    (N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
                     SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                     SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),
     Hundreds(N) AS (SELECT 1 FROM Tens t1, Tens t2),
     Millions(N) AS (SELECT 1 FROM Hundreds t1, Hundreds t2, Hundreds t3),
     Tally   (N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM Millions)
, cte AS 
(
    SELECT  TOP (@TotalRows) 
            N, ca.LineText,
            -- if difference in length between the original string and after all vowels have been removed is >= 3, then it has enough vowels.
            CASE WHEN ca2.LineTextLen-3 >= LEN(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(ca.LineText, 'a', ''), 'e', ''), 'i', ''), 'o', ''), 'u', '')) THEN 1 ELSE 0 END AS Has3Vowels,
            -- if there is a difference in length between the original string and after the non-allowed strings have been removed, then it has the non-allowed strings.
            CASE WHEN ca2.LineTextLen > LEN(REPLACE(REPLACE(REPLACE(REPLACE(ca.LineText, 'ab', ''), 'cd', ''), 'pq', ''), 'xy', '')) THEN 1 ELSE 0 END AS HasNotAllowedStrings,
            -- if there is a difference in length between the original string and after pairs of letters have been removed, then there is at least one pair of letters present.
            CASE WHEN ca2.LineTextLen > LEN(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(ca.LineText, 'aa', ''), 'bb', ''), 'cc', ''), 'dd', ''), 'ee', ''), 'ff', ''), 'gg', ''), 'hh', ''), 'ii', ''), 'jj', ''), 'kk', ''), 'll', ''), 'mm', ''), 'nn', ''), 'oo', ''), 'pp', ''), 'qq', ''), 'rr', ''), 'ss', ''), 'tt', ''), 'uu', ''), 'vv', ''), 'ww', ''), 'xx', ''), 'yy', ''), 'zz', '')) THEN 1 ELSE 0 END AS Has2Consecutive
    FROM    Tally
    -- split the string into the lenghts for each item
    CROSS APPLY (SELECT SUBSTRING(@InputText, (N*@SizePerRow)-@SizePerRow+1, @SizePerRow)) ca(LineText)
    -- and get the length of it, so that the calculation is only performed once instead of three times.
    CROSS APPLY (SELECT LEN(ca.LineText)) ca2(LineTextLen)
)
SELECT  COUNT(*) AS IsNice
FROM    cte
WHERE   Has3Vowels = 1 
AND     HasNotAllowedStrings = 0 
AND     cte.Has2Consecutive = 1;

WITH Tens (N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),

Hundreds(N) AS (SELECT 1 FROM Tens t1, Tens t2),

Millions(N) AS (SELECT 1 FROM Hundreds t1, Hundreds t2, Hundreds t3),

Tally (N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM Millions)

, cte AS

(

SELECT TOP (@TotalRows)

N, ca.LineText,

-- if difference in length between the original string and after all vowels have been removed is >= 3, then it has enough vowels.

CASE WHEN ca2.LineTextLen-3 >= LEN(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(ca.LineText, 'a', ''), 'e', ''), 'i', ''), 'o', ''), 'u', '')) THEN 1 ELSE 0 END AS Has3Vowels,

-- if there is a difference in length between the original string and after the non-allowed strings have been removed, then it has the non-allowed strings.

CASE WHEN ca2.LineTextLen > LEN(REPLACE(REPLACE(REPLACE(REPLACE(ca.LineText, 'ab', ''), 'cd', ''), 'pq', ''), 'xy', '')) THEN 1 ELSE 0 END AS HasNotAllowedStrings,

-- if there is a difference in length between the original string and after pairs of letters have been removed, then there is at least one pair of letters present.

CASE WHEN ca2.LineTextLen > LEN(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(ca.LineText, 'aa', ''), 'bb', ''), 'cc', ''), 'dd', ''), 'ee', ''), 'ff', ''), 'gg', ''), 'hh', ''), 'ii', ''), 'jj', ''), 'kk', ''), 'll', ''), 'mm', ''), 'nn', ''), 'oo', ''), 'pp', ''), 'qq', ''), 'rr', ''), 'ss', ''), 'tt', ''), 'uu', ''), 'vv', ''), 'ww', ''), 'xx', ''), 'yy', ''), 'zz', '')) THEN 1 ELSE 0 END AS Has2Consecutive

FROM Tally

-- split the string into the lenghts for each item

CROSS APPLY (SELECT SUBSTRING(@InputText, (N*@SizePerRow)-@SizePerRow+1, @SizePerRow)) ca(LineText)

-- and get the length of it, so that the calculation is only performed once instead of three times.

CROSS APPLY (SELECT LEN(ca.LineText)) ca2(LineTextLen)

)

SELECT COUNT(*) AS IsNice

FROM cte

WHERE Has3Vowels = 1

AND HasNotAllowedStrings = 0

AND cte.Has2Consecutive = 1;

After looking at this list, Santa realizes that the nice criteria isn’t valid, so some new criteria comes out. A “nice” string now must contain a pair of letters that appears in a row twice without overlapping, and a letter that repeats in the string with exactly one letter between those two repeating letters. In SQL, this is just pattern matching, and these can be performed with the LIKE operator. The hard part here is getting all of the valid pattern matching strings. I do this with a virtual tally table to get all of the letters of the alphabet, and then performing a set of queries to get all of these combinations (including the necessary wildcard characters to make the complete pattern to match). They are inserted into a table along with a number to identify which set of matching criteria that they belong to:

-- make a table to hold all of the allowed combinations
DECLARE @Nice TABLE (SetNbr INTEGER, Allowed VARCHAR(10));

WITH Tens    (N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
                     SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                     SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),
     Hundreds(N) AS (SELECT 1 FROM Tens t1, Tens t2),
     Millions(N) AS (SELECT 1 FROM Hundreds t1, Hundreds t2, Hundreds t3),
     Tally   (N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM Millions)
, Letters AS 
(
    SELECT TOP 26 CHAR(96+N) AS Letter
    FROM    Tally
)
-- fill the table with pattern matches. 
-- first, use "%A_A%" - there is one character between the letter and itself (the "_" specifies one character, and the "%" on both ends allows this pattern to be anywhere in the string
INSERT INTO @Nice (SetNbr, Allowed)
SELECT  1, '%' + Letter + '_' + Letter + '%'
FROM    Letters
UNION ALL
-- and then cross join the letters with itself to get all combinations of letters. Surround this with "%" to allow there to be any # of characters on either side, and between the pair combinations.
SELECT  2, '%' + a.Letter + b.Letter + '%' + a.Letter + b.Letter + '%'
FROM    Letters a CROSS JOIN Letters b;

-- make a table to hold all of the allowed combinations

DECLARE @Nice TABLE (SetNbr INTEGER, Allowed VARCHAR(10));

WITH Tens (N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),

Hundreds(N) AS (SELECT 1 FROM Tens t1, Tens t2),

Millions(N) AS (SELECT 1 FROM Hundreds t1, Hundreds t2, Hundreds t3),

Tally (N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM Millions)

, Letters AS

(

SELECT TOP 26 CHAR(96+N) AS Letter

FROM Tally

)

-- fill the table with pattern matches.

-- first, use "%A_A%" - there is one character between the letter and itself (the "_" specifies one character, and the "%" on both ends allows this pattern to be anywhere in the string

INSERT INTO @Nice (SetNbr, Allowed)

SELECT 1, '%' + Letter + '_' + Letter + '%'

FROM Letters

UNION ALL

-- and then cross join the letters with itself to get all combinations of letters. Surround this with "%" to allow there to be any # of characters on either side, and between the pair combinations.

SELECT 2, '%' + a.Letter + b.Letter + '%' + a.Letter + b.Letter + '%'

FROM Letters a CROSS JOIN Letters b;

At this point, all that is needed is to join the tables of lines to this table twice (once for each matching condition) using the LIKE operator, and to get a count of the number of distinct strings that are nice:

WITH Tens    (N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
                     SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                     SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),
     Hundreds(N) AS (SELECT 1 FROM Tens t1, Tens t2),
     Millions(N) AS (SELECT 1 FROM Hundreds t1, Hundreds t2, Hundreds t3),
     Tally   (N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM Millions)
, cte AS 
(
    SELECT  TOP (@TotalRows) 
            N, ca.LineText
    FROM    Tally
    -- split the string into the lenghts for each item
    CROSS APPLY (SELECT SUBSTRING(@InputText, (N*@SizePerRow)-@SizePerRow+1, @SizePerRow)) ca(LineText)
    -- and get the length of it, so that the calculation is only performed once instead of three times.
    CROSS APPLY (SELECT LEN(ca.LineText)) ca2(LineTextLen)
)
-- get a count of the distinct LineText
SELECT  COUNT (DISTINCT cte.LineText) AS IsNice
FROM    cte
-- where there is a match on one of the pattern combinations in set 1
JOIN    @Nice N1 ON cte.LineText LIKE N1.Allowed AND N1.SetNbr = 1
-- and in set 2
JOIN    @Nice N2 ON cte.LineText LIKE N2.Allowed AND N2.SetNbr = 2;

WITH Tens (N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),

Hundreds(N) AS (SELECT 1 FROM Tens t1, Tens t2),

Millions(N) AS (SELECT 1 FROM Hundreds t1, Hundreds t2, Hundreds t3),

Tally (N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM Millions)

, cte AS

(

SELECT TOP (@TotalRows)

N, ca.LineText

FROM Tally

-- split the string into the lenghts for each item

CROSS APPLY (SELECT SUBSTRING(@InputText, (N*@SizePerRow)-@SizePerRow+1, @SizePerRow)) ca(LineText)

-- and get the length of it, so that the calculation is only performed once instead of three times.

CROSS APPLY (SELECT LEN(ca.LineText)) ca2(LineTextLen)

)

-- get a count of the distinct LineText

SELECT COUNT (DISTINCT cte.LineText) AS IsNice

FROM cte

-- where there is a match on one of the pattern combinations in set 1

JOIN @Nice N1 ON cte.LineText LIKE N1.Allowed AND N1.SetNbr = 1

-- and in set 2

JOIN @Nice N2 ON cte.LineText LIKE N2.Allowed AND N2.SetNbr = 2;

The distinct is in there since the matching patterns can produce multiple rows for a single line.

Advent of Code, SQL none

Wayne Sheffield

Recent Posts

Top Posts

Archives

Categories

Blogroll

Advent of Code, Day 5

Like this:

Related

Leave a Reply to let me know how you liked this postCancel reply

Wayne Sheffield

Recent Posts

Top Posts

Archives

Categories

Tags

Blogroll

Advent of Code, Day 5

Share this:

Like this:

Related

Leave a Reply to let me know how you liked this postCancel reply