Next,  Advent of Code 2018 – Day 6

As I explained in a recent post, I’m participating in this year’s Advent of Code challenge, with the twist of doing the challenges in T-SQL. In case you don’t know what “Advent of Code” is, Eric Wastl (t) created it for the purpose of, as Eric describes it:

Advent of Code is a series of small programming puzzles for a variety of skill levels. They are self-contained and are just as appropriate for an expert who wants to stay sharp as they are for a beginner who is just learning to code. Each puzzle calls upon different skills and has two parts that build on a theme.

The full explanation of “Advent of Code” is at //adventofcode.com/2018/about. You can see my other posts in the “Advent of Code” category.

2018 Advent of Code recap

We started off with someone going back in time and altering Santa’s past. You went back in time to correct it, but the device to fix the problems hadn’t been calibrated before sending you on your way. You calibrated the device on Day 1. On Day 2, we located the boxes in the warehouse where the new (500 years ago) Santa suit material is. We sliced and diced the material on Day 3, finding the best piece to use. We figured out the best time to sneak into the Santa suit lab on Day 4. On Day 5, we modified the polymers in the Santa suit material.

Day 6 – Part 1

We all know that the shortest distance between two points is a straight line. What if you’re a taxi driver, and you need to deliver someone 6 blocks north and 7 blocks east? A straight line would probably take you through several buildings. In this case, your shortest distance is by grid. In Day 6, this is exactly what we are up against. After fixing the Santa suit, your device takes you another 500 years in the past. Along the way, the device detects chronal interference and gives you a list of grid points.

Thinking that those points are your danger zones, you need to find the largest area within your grid without any of those grid points. You will calculate the distance of all points on the grid from the grid points supplied, using the Manhattan distance calculation. We don’t consider any point that is the same distance from more than one of the supplied points. Additionally, we don’t consider any supplied point that extends indefinitely in any direction.

Side note: I created two different ways of accomplishing this. The procedural (while loop based) method was consistently twice as fast as the set-based method. I’ll be discussing the procedural method, but I will include the set-based method also.

The import file is a list of comma-separated values. The values represented the X-axis and the Y-axis coordinates on the grid. This makes it easy to process the import file:

Notice that I’m directing this output into a temporary table.

Day 6, Part 1 – Building the grid

Next, I need to build the grid. I use the MAX function to get the largest X and Y coordinate. Following that, I perform two CROSS JOINs to the physical tally table (built in Day 3) to get the proper number of values for both the X-axis and the Y-axis.

This code creates and populates another temporary table (#XYZ). The Z column is for the value being stored in the point referenced by the X-axis and Y-axis:

Day 6 Grid

Day 6 Grid

For the next step, I determined the distance from each grid point supplied to the current grid point. I’m storing two bits of data in the Z column for the points that weren’t supplied: the shortest distance, and the grid point that was the closest. I use an asterisk for the grid location if there are more than one grid point that is the shortest distance for that grid location. I separate these two values with a dash. The distance calculation is the absolute value difference between the X-axis and Y-axis points of the supplied location, and the current location. Adding the two values together gets the total distance. A cursor is used to process this – I did mention that this was a procedural method.

Day 6, Part 1 – Determining the number of grid points in the largest region

This final step is to process the grid to determine the largest region.

The where clause excludes the supplied grid locations that will go to infinity, and those where there are multiple locations for the shortest distance. After extracting the supplied grid location that is the shortest distance to the grid points, I group on this value to get the number of grid points that are the closest to the supplied location.

Day 6, Part 1 – the Set based solution

The Set-based solution for the cursor and the result is:

Day 6, Part 2

For Part 1, we assumed that the supplied locations were for the dangerous spots, and we were trying to get as far away from them as possible. In Part 2, we are assuming that those locations are for the safest locations, so we want to find the region nearest the most number of supplied locations. For this puzzle, I calculate the distance between each supplied location on the grid, and the current grid point, and get those locations where the total is less than 10,000. Finally, a simple set-based solution, based on the #XYZ table already created and populated, gets the result that we need:

In this query, we get the total distance from all supplied grid locations for every grid point. The #XYZ table (t1) gets all grid points, and the #XYZ (t2) gets just the supplied locations. Next, group by the X-axis and Y-axis, and get just those grid points where this total is < 10,000. Finally, count the number of grid points that are in this region.

Summary

And here we have a T-SQL solution for Day 5 of the Advent of Code challenge. The key tasks that we can learn from today are:

  • Loading a file.
  • Splitting a string on a delimiter.
  • Extracting data from a string.
  • While loops.
  • Using a physical tally table.