Wayne Sheffield

My blog about SQL Server


I’m happy to announce that I’ll be going on a Central Texas SQL Tour in mid-August.

First up is the San Antonio SQL Saturday – I’ve been selected to be a speaker. This is really special because this is San Antonio’s first ever SQL Saturday!!! I’ll be presenting what is probably my favorite presentation – “Table Variables & Temp Tables – What You Need to Know!” This session usually runs about 75 minutes, so I’ve arranged to have the session immediately prior to lunch to help stagger the lunch flow. If you are anywhere in the area, come on out and see this presentation – I’ll guarantee that you will learn something, or double your registration fee back! You can register for this wonderful, free event at this link. From that link, you can also see the schedule and gaze in awe and wonder at all of the wonderful free sessions that you can attend all day. The SQL Saturday is on August 13th, 2016, and it is at Texas A&M University, One University Way, San Antonio, Texas, 78224 (map).


The following week, I’ll be at two user group meetings. The first is CACTUSS North on Tuesday, August 16th in Austin, Texas. The meeting starts at 6pm, and I’ll be doing an encore presentation of “Table Variables & Temp Tables – What You Need to Know!”, so if you couldn’t see it at the SQL Saturday, be sure to come out and see it here. You can sign up to attend this user group meeting here; the meeting is at the Microsoft offices at 10900 Stonelake Blvd., Suite 225, Austin, TX, 78759 (map).


The next night, Wednesday, August 17th, I’ll be back in San Antonio, presenting at their SALSSA user group meeting. The meeting starts at 6pm, and I’ll be doing a different presentation – “SSMS Tips and Tricks – Using What You’ve Already Got”. They meet at the New Horizons Computer Learning Center, located at 2727 NW Loop 410, San Antonio, TX, 78230 (map).


You can read the abstracts for both of these presentations here. And yes, I’m sure I’ll be at the Riverwalk in San Antonio sometime during my free time. I love the Riverwalk, and my favorite restaurant is nearby.


This post is re-published from my original post on SQL Solutions Group. I hope that you enjoy it.

When investigating a performance issue, the desired end result is already known… you need to make the queries run faster. It’s been my experience that most performance problems involve optimizing the query that is being run—sometimes the query needs a re-write to be more efficient, sometimes the tables being queried need a new (or modified) index, and sometimes even the underlying database schema might need modifying. Before starting down any of these routes though, the first thing that I do is to check the configuration settings that make a difference.

Enter SQL Server’s “Go Fast” button

In a CPU universe far, far away, there existed a particular series known as the 80486 processor, commonly called just the 486. One particular issue with this series of CPUs was that they were too fast for many games of that era (games that were coded to run off of CPU ticks). If you can get hold of one of these games and try to run it on a modern system, I wish you luck in even being able to press a single key before the game ends! But I digress… in order to counteract this issue, the processors had a feature where they could be slowed down. There were different ways that the CPU could be slowed down, but to interact with the computer user, there was a button on the case, known as the Turbo button, that would cycle the system between the low and high speeds of the CPU. Accidentally leaving this button at the low speed would slow down everything on that computer; simply switching the system back to high speed would fix your performance problem.

So what does this mean for SQL Server?

In a CPU universe very, very close to you, today’s CPUs have a similar feature known as CPU Throttling. This feature, otherwise known as Dynamic Frequency Scaling, adjusts the CPU frequency on the fly. There are several benefits from doing this: less heat is generated, less power is consumed, and it can even make the systems quieter by also reducing the fan speed necessary to cool the computer. In a large server farm, reducing the heat means that cooling costs are also reduced. By operating the systems in a reduced state, the lower cooling costs coupled with the lower power requirements of the computers themselves can add up to quite a savings. The dynamic portion of this feature is based upon system monitoring of CPU usage: when usage is low, the CPU frequency is reduced, typically by 50% or more.

For most systems, the computer’s BIOS will monitor certain communication from the operating system, so the operating system can also send instructions for setting the CPU Throttling. In the Windows operating system, this is performed through the “Power Plan” setting. Since Windows Server 2008, the default power plan is “Balanced”, which allows Windows to reduce the CPU speed in an effort to balance power and performance.

How does this affect SQL Server?

SQL Server is one application that typically doesn’t use enough CPU to make CPU Throttling disengage and run the CPUs at full power. This manifests itself in SQL Server as queries simply taking longer to run. You might even have a new server on which the application runs slower than on the server it is replacing. The solution is to simply disable the CPU Throttling feature so that the CPUs run at full speed constantly.

Detecting if CPU Throttling is engaged

CPU Throttling is implemented at both the hardware and software levels. The Windows operating system can control CPU Throttling, and this can also be controlled from the BIOS. On some systems, the BIOS will ignore signals sent from Windows, so we will cover how to test and set both methods.

To see if your CPUs are running at rated speed, there are two methods. The first is the third-party tool CPU-Z, available at http://www.cpuid.com/softwares/cpu-z.html. In this screen shot (from the above link), you can see that the CPU is rated to run at 3.2GHz, but is actually running at almost 1.2GHz. This system is having its CPU throttled.

Additionally, you can check out the WMI performance counters from the WIN32_Processor class. This DOS command returns the current and max clock speeds every second for 30 seconds; if the CurrentClockSpeed and the MaxClockSpeed differ significantly, then the CPU is being throttled.
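A close equivalent using the wmic utility (which queries the Win32_Processor class) looks like this – run it at a cmd prompt, and double the % signs if you put it in a batch file:

```cmd
REM Poll the current vs. rated clock speed once a second, 30 times
for /L %i in (1,1,30) do @(wmic cpu get CurrentClockSpeed,MaxClockSpeed & timeout /t 1 /nobreak >nul)
```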

In Windows, CPU Throttling is controlled from the Power Settings control panel applet. On a server running SQL Server, the desired setting is the “High Performance” power plan. You can check and set this from either the Power Settings control panel applet, or from the command line (starting with Windows Server 2008 R2).

To start the Power Settings control panel applet, press the Windows key + R key combination. In the Run dialog box, enter powercfg.cpl. In the applet, you are looking for:

[Screenshot: the Power Options window, with the “High Performance” power plan selected]

From a command prompt, you can simply run this DOS command:
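A command along these lines shows the active plan (powercfg is built in starting with Windows Server 2008 R2):

```cmd
REM Display the currently active power scheme
powercfg /getactivescheme
```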

Disabling CPU Throttling

To change the power plan, in the control panel applet simply select the High Performance power plan (as shown above). From the DOS command prompt:
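Something along these lines does it (the GUID is the well-known identifier of the built-in High Performance scheme):

```cmd
REM Activate the built-in "High Performance" power plan
powercfg /setactive 8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c
```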

Changing the power plan takes effect immediately, so you should immediately see your CPUs running at full speed. This means that you can change the power plan on your SQL Server without requiring an outage. If, however, this does not return your CPUs to full speed, then the BIOS is overriding the Windows setting, and you will need to reboot the server to go into the BIOS and disable it there, then reboot again once it is changed. Different BIOS manufacturers call CPU Throttling / Dynamic Frequency Scaling by different names, so you will need to check your server manufacturer’s web site to determine what it is called. Since this requires reboots, it will cause an outage, and it needs to be a scheduled maintenance action.

What kind of difference does this make?

This isn’t the fix for all queries, but it should help out pretty dramatically. For a new server that is replacing an old one, it should return the application to its prior performance, if not surpass it.

Recently, I had the opportunity to test out the performance impact of changing the power plan from Balanced to High Performance. Prior to changing the power plan, a set of queries was running in 7 hours, 45 minutes. After changing just the power plan to High Performance, this set of queries is now running in 3 hours, 55 minutes. That’s almost ½ the time, from this simple setting change.

Just remember… SQL Server is not “green”. It needs full power.

And now you know about SQL Server’s hidden “Go Fast” button.

Recently (okay, it went live a month ago, and I’m super-late in my blog post!), I was the guest on the podcast that Carlos Chacon of SQLDataPartners publishes.

In this podcast, we talk about Table Variables and Temporary Tables (if you’ve been to one of my presentations on this topic, then you know what’s coming!). We also talk about super-hero powers and other general chit-chat. You can hear the podcast at this link (MP3). The “Show Notes” are available here.

I hope that you go out and listen to it, and that you enjoy it as much as I enjoyed harassing Carlos during the show!

TSQL Tuesday
It’s been a while since I’ve written a T-SQL Tuesday post, mainly because I just wasn’t inspired by the topic. And truth be told, this one was turning out to be that way also. But then today, I was wishing that SQL Server would track something that it currently doesn’t, and I realized that I had a potential post here. Since this is a couple of days past the deadline, I reached out to Chris Yates (the host for this month) (b/t), and he told me to just bring it on, so here goes…

The theme this month is SQL Gifts… at least that’s what I’m calling it. Chris says:

With that said let’s get this party started and have some fun with it this month shall we. Treat yourself to a birthday gift and come up with a present for yourself SQL related – no limitations:

  • Is there something you’ve been wanting to do for a long time in SQL, but haven’t done it yet?
  • Perhaps there is a feature you would like to see added into SQL that just isn’t there yet – what is it?
  • Maybe you are a consultant and know something many of us don’t as you parachute in and out of shops that a reader can benefit from?
  • Is there a co-worker struggling with something SQL related; here is your chance to share the present (knowledge).

Gift wrap your post for yourself and be innovative with what you write about.

Since I was wishing that SQL Server would do something that it doesn’t, it fits right in.

Determining Unused Objects

Have you ever been trying to clean up a database, and you can’t tell what is being used and what isn’t? My wish is for SQL Server to track the time that objects were last used.

Yes, we have sys.dm_exec_procedure_stats to tell us information about procedures that are in the plan cache. And we have sys.dm_db_index_usage_stats for index-related information (and index 0/1 will tell us about the table itself). But both of these can have data disappear. Procedure stats vanish when the plan is removed from the plan cache, and when SQL Server restarts. Index usage stats are removed when the database goes offline (including a SQL Server restart), when doing a CREATE INDEX with DROP_EXISTING, and SQL Server 2012 introduced a bug where they were reset by an ALTER INDEX REBUILD. To work around both of these views, we need some kind of process that periodically queries them and updates a physical table to track this data.
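As an illustration, a minimal sketch of such a process for procedures (the dbo.ObjectUsageHistory table and the idea of running this from a periodic job are my own, not a built-in feature):

```sql
-- Hypothetical tracking table; run the MERGE from a periodic (e.g. hourly) job
-- so the data survives plan-cache evictions and SQL Server restarts.
IF OBJECT_ID('dbo.ObjectUsageHistory') IS NULL
    CREATE TABLE dbo.ObjectUsageHistory (
        database_id         INT      NOT NULL,
        object_id           INT      NOT NULL,
        last_execution_time DATETIME NOT NULL,
        PRIMARY KEY (database_id, object_id));

MERGE dbo.ObjectUsageHistory AS tgt
USING (SELECT database_id, object_id,
              last_execution_time = MAX(last_execution_time)
       FROM   sys.dm_exec_procedure_stats
       GROUP BY database_id, object_id) AS src
   ON tgt.database_id = src.database_id
  AND tgt.object_id   = src.object_id
WHEN MATCHED AND src.last_execution_time > tgt.last_execution_time THEN
    UPDATE SET last_execution_time = src.last_execution_time
WHEN NOT MATCHED THEN
    INSERT (database_id, object_id, last_execution_time)
    VALUES (src.database_id, src.object_id, src.last_execution_time);
```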

These two views don’t take care of other types of objects – views, functions, triggers, etc. To track these, we need a custom process (for example, an Extended Event session or a server-side trace) that captures all database activity. We then need to periodically query this process to determine the objects being used, and again update a physical table to track them. If you’ve ever tried to get the objects being used from a server-side trace, then you know just what a pain in the butt this can be.

It sure would be nice if SQL Server would track this all for us. It would make it so much easier to determine what is being used in the database, and what isn’t.

What I’d like to see is one or two columns added to sys.objects: either last_used_time_user / last_used_time_system, or just last_used_time (for both). Then for that object, when it is used, it updates these datetime columns. This will handle tables, views, procedures, functions, triggers, etc., and pretty much give me exactly what I’m looking for. This data needs to not be reset, just updated with the last time that it was used. And it needs to persist across restarts of the database / instance.

Doing this will add overhead to the system (probably less than if we were to create our own process to do this), so it makes sense to me to have this controlled by an alter database command – something like ALTER DATABASE TRACK_OBJECT_USAGE = ON.

Should this idea be extended?

Sometimes we want to know whether a column is being used… is it safe to drop a column? When running a query, the system verifies that all of the columns referenced do exist, so this could potentially be an extension of that process. We could extend this idea to add these columns to sys.columns also. I imagine that this would create a bit of overhead, so this could also be controlled by another alter database command – something like ALTER DATABASE TRACK_COLUMN_USAGE = ON.

These columns can be added to sys.indexes also, where they will persist after a restart. This would be somewhat similar to the last_user(system)_scan/seek/update/lookup columns in the sys.dm_db_index_usage_stats view, but they won’t disappear unless the index is completely dropped. Since this data is already being tracked for sys.dm_db_index_usage_stats, I don’t know if it would really add that much of an overhead to also track it in sys.indexes – but if so, just toggle this tracking with another alter database command – say ALTER DATABASE TRACK_INDEX_USAGE = ON.

I’ve created a connect item for these suggestions. Go on out and up-vote it if you like this idea.

Caution!

Whether this is implemented or not, however you determine objects that can be potentially removed, proceed cautiously. Ensure that you have considered end-of-period (month/quarter/year) processing where objects might be used just during these times. And always have a plan for bringing these objects back. In the case of tables / columns, you don’t want to lose any data. And in the case of T-SQL coded objects, ensure that you can re-create those if necessary. You don’t want this to be why your company let you go…


One of the nightmare scenarios that a SQL Saturday organizer faces is selecting a speaker that is already going to another event on the same day. The speaker can only be at one event… and the site doesn’t really help us out with letting us know about these over-committed speakers.

So, I wrote this little PowerShell script to compare your selected speakers with the speakers selected at other SQL Saturday events that are on the same day. Now, by “selected speaker”, I mean those speakers that have been approved in the “Manage Sessions” section of the admin area of your event (those sessions don’t need to be on the schedule yet). Those sessions are then visible in the XML feed that is generated. This script will load the speakers for your event into an array, and then get the sessions for all of the other SQL Saturdays. If the other event is the same day as yours, it then compares the speakers. Finally, it will output those speakers that are selected in your event that are also selected in another event (aka the over-committed speakers) and the event that they are speaking at.

This script is based upon the script that Steve Jones wrote; however, this script loads the information directly from the SQLSaturday.com web site each time that it runs. It could have used saved files (like Steve did), but this way it is a bit easier to pick up changed speakers when either event changes the speakers that are presenting.

This script does not handle duplicate names (from a speaker presenting more than one session) at any of the events, so you may end up with duplicate output lines for a speaker.

Update: script updated on 2/9/2016 to handle > 1 duplicate speaker.

If anyone knows how to get a feed of the submitted sessions, I’ll create another script for comparing the sites for speakers that have submitted to multiple locations on the same day. I’d rather not even get to the selected phase if it can be avoided. If you know how to do this, just leave me a comment below please (or contact me directly).


In my previous post, I introduced you to the Advent of Code programming challenge, and explained how I’m going through the days, solving both parts of each puzzle. Continuing on, today we have day 21. In this challenge, you received a present – a role-playing game. In this game, you (the player) and your enemy (the boss) take turns attacking each other. You each start off with 100 “Hit Points”, and you inflict damage based upon the weapons that you have. You are shielded by the armor that you have. You must purchase one weapon, and you may purchase one armor. You may purchase a ring for each hand. The store has just one of each item. The list of items in the store, along with their cost, how much damage they inflict, and how much protection they provide (as armor), is at the above link. The first person whose hit point score drops to zero (or less) loses. The damage inflicted per attack is the attacker’s total damage minus the defender’s total armor, and this damage is subtracted from the defender’s hit points. Each attack diminishes the defender’s hit points by at least one. The input file for this puzzle provides you with the boss’s stats. For Part 1 of this puzzle, you need to find the least amount of gold that you can spend and still win.

When the input file is examined, you see that it has three lines, one each for hit points, damage and armor. Each line lists the item, and then the number of units for that item, with a colon between them. Therefore, we can use the string splitter routine to separate the lines, and then separate the items from their values by splitting again on the colon. The values returned are stored into variables for future use. This code is:
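A minimal sketch of that code (the boss stats here are sample values, and I’m assuming a DelimitedSplit8K-style splitter is available):

```sql
-- Parse the three "item: value" lines into variables (sample boss stats)
DECLARE @Input VARCHAR(8000) = 'Hit Points: 104' + CHAR(13)
                             + 'Damage: 8'       + CHAR(13)
                             + 'Armor: 1';
DECLARE @BossHitPoints INT, @BossDamage INT, @BossArmor INT;

SELECT @BossHitPoints = MAX(CASE WHEN x.ItemName = 'Hit Points' THEN x.ItemValue END),
       @BossDamage    = MAX(CASE WHEN x.ItemName = 'Damage'     THEN x.ItemValue END),
       @BossArmor     = MAX(CASE WHEN x.ItemName = 'Armor'      THEN x.ItemValue END)
FROM   dbo.DelimitedSplit8K(@Input, CHAR(13)) s
CROSS APPLY (SELECT ColonPos = CHARINDEX(':', s.Item)) c
CROSS APPLY (SELECT ItemName  = LEFT(s.Item, c.ColonPos - 1),
                    ItemValue = CAST(LTRIM(SUBSTRING(s.Item, c.ColonPos + 1, 10)) AS INT)) x;
```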

The next section of code builds a common table expression (cte) for each of the weapons, armor and rings that are available. Note that for the armor and rings, I’ve added rows with 0 points to handle not purchasing those items. These ctes are as follows (note that this will not run until all of the remaining sections are put together):
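A sketch of those ctes, with the store’s item list from the puzzle (the “None” rows are the zero-point entries mentioned above):

```sql
WITH Weapons (ItemName, Cost, Damage, Armor) AS (
    SELECT * FROM (VALUES ('Dagger',      8, 4, 0),
                          ('Shortsword', 10, 5, 0),
                          ('Warhammer',  25, 6, 0),
                          ('Longsword',  40, 7, 0),
                          ('Greataxe',   74, 8, 0)) w (ItemName, Cost, Damage, Armor)
), Armors (ItemName, Cost, Damage, Armor) AS (
    SELECT * FROM (VALUES ('None',        0, 0, 0),   -- armor is optional
                          ('Leather',    13, 0, 1),
                          ('Chainmail',  31, 0, 2),
                          ('Splintmail', 53, 0, 3),
                          ('Bandedmail', 75, 0, 4),
                          ('Platemail', 102, 0, 5)) a (ItemName, Cost, Damage, Armor)
), Rings (ItemName, Cost, Damage, Armor) AS (
    SELECT * FROM (VALUES ('None 1',      0, 0, 0),   -- two zero-point rows allow
                          ('None 2',      0, 0, 0),   -- zero, one, or two rings
                          ('Damage +1',  25, 1, 0),
                          ('Damage +2',  50, 2, 0),
                          ('Damage +3', 100, 3, 0),
                          ('Defense +1', 20, 0, 1),
                          ('Defense +2', 40, 0, 2),
                          ('Defense +3', 80, 0, 3)) r (ItemName, Cost, Damage, Armor)
)
```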

In the next cte, a list of all possible equipment combinations with their costs, damage and armor scores is created. The actual item names are not necessary, just the values. This is accomplished by performing a CROSS JOIN between the three store-item ctes. The rings cte is cross joined an additional time to handle a ring on each hand. Additionally, four CROSS APPLYs are performed. The first sums up the totals of the cost, damage and armor scores for the Player. The second calculates the damage per attack inflicted on both the Player and the Boss for that battle. The third adjusts the minimum damage to 1 in case it is less than 1. The fourth calculates the number of plays that the Boss and Player have for this battle by dividing the number of hit points (100) by the damage inflicted per play. Since this may return a fraction, and that fraction keeps the player alive for the next play, it is rounded up to the next integer value with the CEILING function. Additionally, the number of hits is handled as a numeric with a decimal position because in SQL Server, when an integer is divided by another integer, the result is an integer with the fractional part truncated. A SELECT DISTINCT is performed because we only need to run the battle once for each condition:
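Continuing the sketch (the cte and column names here are mine, not the original listing’s):

```sql
, Battles AS (
    SELECT DISTINCT totals.Cost, plays.BossPlays, plays.PlayerPlays
    FROM   Weapons w
    CROSS JOIN Armors a
    CROSS JOIN Rings  r1
    CROSS JOIN Rings  r2
    -- 1: total cost / damage / armor for this loadout
    CROSS APPLY (SELECT Cost   = w.Cost   + a.Cost   + r1.Cost   + r2.Cost,
                        Damage = w.Damage + a.Damage + r1.Damage + r2.Damage,
                        Armor  = w.Armor  + a.Armor  + r1.Armor  + r2.Armor) totals
    -- 2: raw damage per attack, in each direction
    CROSS APPLY (SELECT ToBoss   = totals.Damage - @BossArmor,
                        ToPlayer = @BossDamage - totals.Armor) dmg
    -- 3: every attack does at least 1 point of damage
    CROSS APPLY (SELECT ToBoss   = CASE WHEN dmg.ToBoss   < 1 THEN 1 ELSE dmg.ToBoss   END,
                        ToPlayer = CASE WHEN dmg.ToPlayer < 1 THEN 1 ELSE dmg.ToPlayer END) adj
    -- 4: plays needed to eliminate each side (decimal division, rounded up)
    CROSS APPLY (SELECT BossPlays   = CEILING(@BossHitPoints / (adj.ToBoss * 1.0)),
                        PlayerPlays = CEILING(100 / (adj.ToPlayer * 1.0))) plays
    WHERE  r1.ItemName < r2.ItemName   -- each unordered ring pair only once
)
```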

Finally, the code returns the cost column from the first row where the number of Boss Plays is less than or equal to the number of Player Plays (since the player plays first, if they both would eliminate their opponent on that play, then the player eliminated the boss first):
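In the sketch, that’s the final SELECT:

```sql
SELECT TOP (1) Cost       -- cheapest loadout where the boss falls first
FROM   Battles
WHERE  BossPlays <= PlayerPlays
ORDER BY Cost;
```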

And now we have the solution for Part 1. For Part 2, the challenge is to find the highest cost that the player can spend and still lose to the boss. All of the hard work has been done, and this requires only a minor change to the final query. Just replace the code starting at line 67 with this code:
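Against the sketch above (where the line numbers differ from the original listing), the equivalent change is to swap the final SELECT for:

```sql
SELECT TOP (1) Cost       -- priciest loadout where the player still loses
FROM   Battles
WHERE  BossPlays > PlayerPlays
ORDER BY Cost DESC;
```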

Well, there we go, Day 21 has been solved. Come on back tomorrow to see how to solve Day 22.



In my previous post, I introduced you to the Advent of Code programming challenge, and explained how I’m going through the days, solving both parts of each puzzle. Continuing on, today we have day 20. In this challenge, it seems that Santa’s elves have too much time on their hands. To keep them busy, he is having them deliver presents. By hand. Door to door.

Santa sends the elves down a street. But this isn’t an ordinary street… this street is infinitely long. The houses on this street are sequentially numbered, starting with one. The elves (of which, it seems, there is also an infinite supply) are also numbered, starting with one. Each elf visits every house whose number is a multiple of the elf’s number. In other words, elf #1 stops at every house. Elf #2 stops at every even-numbered house. Elf #3 stops at every third house. And so on. At each stop, the elf leaves 10 times his number of presents. So elf 1 leaves 10 presents, elf 2 leaves 20 presents, and so on.

For Part 1 of this puzzle, what is the lowest numbered house to receive at least a certain number of gifts? For me, I need to find out the lowest numbered house with 36 million gifts (and I’m left wondering just how long it would take to unwrap all of those presents).

To solve this problem, we need to determine which elves would be stopping at a particular house. We then need to sum up the elf # times 10 for all of those elves, and see if the total is at least the specified number. To determine whether an elf stops at a house, the elf’s number needs to divide evenly into the house number. In other words, we are looking for all of the factors of a number. In SQL Server, we can determine whether one number divides evenly into another with the modulo operator (%): it returns the remainder, so we just need to look for results where the remainder is zero.

My first attempt to solve this was a set-based approach. In this approach, I use a virtual tally table, which is cross joined to itself (once for the sequential house numbers, and once for the elf numbers). The code checks whether the elf number goes to the house number with the modulo operator, and it reduces the work by ensuring the elf number is less than or equal to the house number.
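A sketch of that attempt (with a sample target of 36 million presents):

```sql
-- Set-based attempt (a sketch, not the original): houses x elves in one query.
-- Fair warning: this is the version that ran for 12+ hours without finishing.
WITH T0 AS (SELECT n FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) t (n)),
     Tally AS (SELECT TOP (1000000)
                      N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
               FROM   T0 a, T0 b, T0 c, T0 d, T0 e, T0 f)   -- 1,000,000 rows
SELECT TOP (1) House = Houses.N
FROM   Tally AS Houses
JOIN   Tally AS Elves
  ON   Elves.N <= Houses.N            -- an elf never reaches a house below its number
 AND   Houses.N % Elves.N = 0         -- elf N stops at every Nth house
GROUP BY Houses.N
HAVING SUM(Elves.N * 10) >= 36000000  -- sample target number of presents
ORDER BY Houses.N;
```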

I did mention that this was my first attempt. That would be because this code locked up my laptop overnight for over 12 hours, and still didn’t complete. (Once I was able to determine the actual answer, I came back to this and ran the code for just 1,000 houses that included the correct house – it still took almost 90 seconds.) Obviously, this is not a desired solution. So let’s look into an alternative.

In testing out this code, I found that this runs fine for an individual house. So I next decided to iterate through the houses, and to perform the determination of which elves go to that house in a set-based manner. The resulting code is:
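Here’s a sketch of that iterative version (not the original listing, so the line numbers quoted below refer to the original post’s code):

```sql
-- Iterate the houses; compute each house's presents set-based via a virtual tally table
DECLARE @Target   BIGINT = 36000000,   -- sample target
        @HouseNbr BIGINT = 0,
        @Presents BIGINT = 0,
        @Msg      VARCHAR(100);

WHILE @Presents < @Target
BEGIN
    SET @HouseNbr += 1;

    WITH T0 AS (SELECT n FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) t (n)),
         Tally AS (SELECT TOP (@HouseNbr)
                          N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
                   FROM   T0 a, T0 b, T0 c, T0 d, T0 e, T0 f, T0 g)  -- up to 10,000,000 elves
    SELECT @Presents = SUM(N * 10)     -- elf N leaves 10 * N presents
    FROM   Tally
    WHERE  @HouseNbr % N = 0;          -- only elves that divide the house number evenly

    IF @HouseNbr % 10000 = 0           -- progress message, printed immediately
    BEGIN
        SET @Msg = CONCAT('House ', @HouseNbr, ': ', @Presents, ' presents');
        RAISERROR(@Msg, 0, 1) WITH NOWAIT;
    END;
END;

SELECT FirstHouse = @HouseNbr, Presents = @Presents;
```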

In this newer incarnation of the code, the @HouseNbr variable is used to iterate through the house numbers. To keep track of the progress, I print out messages (with the RAISERROR function). At this point, patience is a virtue as you need to wait for the iteration to find the first house number that receives the specified number of presents. But at least while you wait, the laptop isn’t locked up and you can continue other work while waiting. And once the code returns the answer, you can enter it and move on to Part 2.

In Part 2, the elves will only stop at 50 houses each; however, they will leave 11 presents at each house. With these new criteria, what is the new house number with this minimum number of presents?

Well, this is a pretty straightforward change. Since each elf is now dropping off 11 presents instead of 10, change the 10 to 11 in line 13. And since each elf only goes to 50 houses, just add a new predicate where the Elf # * 50 is greater than or equal to the house number. Replace lines 13-16 of the above code with these 5 lines of code:
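Against my sketch above, the equivalent change is to swap the SELECT inside the loop for:

```sql
    SELECT @Presents = SUM(N * 11)     -- each elf now leaves 11 * N presents
    FROM   Tally
    WHERE  @HouseNbr % N = 0
      AND  N * 50 >= @HouseNbr;        -- elf N only visits its first 50 houses
```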

And there is my solution for day 20. I would love to find a better way to solve this – if you know of a way, please share it in the remarks.


In my previous post, I introduced you to the Advent of Code programming challenge, and explained how I’m going through the days, solving both parts of each puzzle. Continuing on, today we have day 19. In this challenge, Rudolph is sick and needs some medicine. Since Rudolph’s biology is different from other reindeer, he needs some specialized medicine. Therefore, Santa has ensured that the North Pole has a specialized Red-Nosed Reindeer nuclear fusion/fission plant, which is capable of constructing any Red-Nosed Reindeer molecule that you might need. This plant works by starting with an input molecule and then doing a series of replacements, one per step, until it has the right molecule.

Prior to using the machine, it needs to be calibrated. This involves determining the number of molecules that can be generated in one step from a given starting point. For instance, assume that the machine can make the following replacements:

H => HO
H => OH
O => HH

If you are starting with the molecule HOH, then the following replacements can be made:

  1. Replace the first H with HO to get HOOH.
  2. Replace the second H with HO to get HOHO.
  3. Replace the first H with OH to get OHOH.
  4. Replace the second H with OH to get HOOH.
  5. Replace the O with HH to get HHHH.

Today’s puzzle has an input file that lists all of the replacements that the machine can make, followed by the medicine molecule used to calibrate the machine. For Part 1, we need to determine how many unique molecule combinations can be made by doing just one replacement of all of the combinations. In the above example, there are 5 replacement molecules, but there are only 4 distinct molecules created (HOOH can be created two different ways).

The first challenge we run into when loading this file is that it contains both the replacements and the medicine molecule, and these need to be separated. After splitting the file on the carriage returns, the code looks to see if the line contains the “=>” string. If so, then this line is a replacement line; otherwise it is the medicine molecule that we need. Since we will need to extract from each replacement line the string to be replaced and the string to replace it with, the code gets the position of the “=>” for further use. This information is stored in a temporary table. Finally, the line without the “=>” is kept as the medicine molecule:
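A sketch of that step (sample input shown inline; I’m assuming a DelimitedSplit8K-style splitter):

```sql
-- Separate the replacement rules from the medicine molecule
IF OBJECT_ID('tempdb..#Replacements') IS NOT NULL DROP TABLE #Replacements;
CREATE TABLE #Replacements (ReplaceThis VARCHAR(20), ReplaceWith VARCHAR(20));

DECLARE @Input    VARCHAR(8000) = 'H => HO' + CHAR(13) + 'H => OH' + CHAR(13)
                                + 'O => HH' + CHAR(13) + 'HOH',   -- sample input
        @Molecule VARCHAR(8000);

INSERT INTO #Replacements (ReplaceThis, ReplaceWith)
SELECT ReplaceThis = LEFT(s.Item, x.ArrowPos - 2),             -- text before " =>"
       ReplaceWith = SUBSTRING(s.Item, x.ArrowPos + 3, 8000)   -- text after "=> "
FROM   dbo.DelimitedSplit8K(@Input, CHAR(13)) s
CROSS APPLY (SELECT ArrowPos = CHARINDEX('=>', s.Item)) x
WHERE  x.ArrowPos > 0;

SELECT @Molecule = s.Item                                      -- the line without "=>"
FROM   dbo.DelimitedSplit8K(@Input, CHAR(13)) s
WHERE  CHARINDEX('=>', s.Item) = 0;
```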

In the next step, we need to step through each replacement line, and replace just one occurrence of the characters to be replaced with the replacement characters. If I were to use the REPLACE function, then all occurrences of the characters to be replaced would be replaced in that line. Since we need to do just one at a time, the code needs to walk through the medicine molecule string to find the characters to be replaced, and then use the STUFF function to replace those characters with the replacement characters. I’ll be using the CHARINDEX function to find the next occurrence of the characters to be replaced, using the optional third parameter to specify the starting position within the string in order to find the next occurrence. Additionally, this is a case-sensitive string search, so I need to use the COLLATE option to specify that the medicine molecule string is of a collation that is case sensitive. After each replacement, the new medicine molecule string is stored into a table, and when all of the replacements have been performed, the code then will get the distinct count of all of these medicine molecules, using the DISTINCT keyword in the COUNT function. Since we need to work on each input line one at a time, and then to do the replacements one at a time upon the medicine molecule, I will be using a cursor to go line by line, and a nested WHILE loop to go through the medicine molecule. The cursor shreds the line into the Characters To Be Replaced and the Replacement Characters:
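Continuing the sketch, the cursor and nested WHILE loop look like this (Latin1_General_BIN is one choice of case-sensitive collation):

```sql
IF OBJECT_ID('tempdb..#Molecules') IS NOT NULL DROP TABLE #Molecules;
CREATE TABLE #Molecules (Molecule VARCHAR(8000) COLLATE Latin1_General_BIN);

DECLARE @ReplaceThis VARCHAR(20), @ReplaceWith VARCHAR(20), @Pos INT;

DECLARE cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT ReplaceThis, ReplaceWith FROM #Replacements;
OPEN cur;
FETCH NEXT FROM cur INTO @ReplaceThis, @ReplaceWith;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- find the first occurrence, then walk the molecule occurrence by occurrence
    SET @Pos = CHARINDEX(@ReplaceThis COLLATE Latin1_General_BIN,
                         @Molecule COLLATE Latin1_General_BIN);
    WHILE @Pos > 0
    BEGIN
        -- STUFF replaces just this one occurrence
        INSERT INTO #Molecules (Molecule)
        SELECT STUFF(@Molecule, @Pos, LEN(@ReplaceThis), @ReplaceWith);

        SET @Pos = CHARINDEX(@ReplaceThis COLLATE Latin1_General_BIN,
                             @Molecule COLLATE Latin1_General_BIN, @Pos + 1);
    END;
    FETCH NEXT FROM cur INTO @ReplaceThis, @ReplaceWith;
END;
CLOSE cur; DEALLOCATE cur;

-- Part 1 answer: distinct molecules produced by a single replacement
SELECT DistinctMolecules = COUNT(DISTINCT Molecule) FROM #Molecules;
```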

Part 1 has solved the issue for the calibration of the machine. For Part 2, we need to build the molecule. To build the molecule, we start with “e”, and run through the steps until the desired molecule has been created. For Part 2, starting with “e”, what is the minimum number of steps required to build the supplied medicine molecule using the same replacement list? At this point, I haven’t figured out how to solve this part, but I intend to come back to it and finish this up when time permits.

Well, I did get Part 1 finished. Come back tomorrow for the solutions for Day 20, and I’ll update this post when Part 2 is finished.


In my previous post, I introduced you to the Advent of Code programming challenge, and explained how I’m going through the days, solving both parts of each puzzle. Continuing on, today we have day 18. In this challenge, the fire department has cracked down on how many lights can be in your display. So the million light display that you used previously needs to be modified to have no more than 10,000 lights. You are working with them on a 100×100 grid. And Santa sends you instructions on how best to work with them.

With the limit on the number of lights, you are going to need to move to an animated display. You have an input file with the initial state of all of the lights. In this file, a “#” means the light is on, and a “.” means that the light is off. The rules for turning a light on or off are:

  1. If the light is currently on, it will only stay on if 2 or 3 of its neighbors are also on. Otherwise, it is turned off.
  2. If the light is currently off, it will only turn on if exactly 3 of its neighbors are on. Otherwise, it stays off.

The status of the lights being compared to is the status at the start of the step; all lights are updated based on this initial status.

In this puzzle, a “neighbor” is one of the eight “cells” that surround a particular cell. If the cell is on the top, bottom, left or right edge, there will be fewer than eight, and those “missing” neighbors are considered to be turned off.

For Part 1, we need to determine just how many lights are on after 100 steps have been performed.

The first step is going to be just like all the other steps with an input file – the input file needs to be loaded into a temporary table. The file is larger than 8000 characters, so it uses the staged approach to extract the individual rows:
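A sketch of the staged load (the file path is hypothetical):

```sql
IF OBJECT_ID('tempdb..#Grid') IS NOT NULL DROP TABLE #Grid;
CREATE TABLE #Grid (RowNbr INT IDENTITY(1,1) PRIMARY KEY, RowText VARCHAR(100));

DECLARE @InputText VARCHAR(MAX), @Chunk VARCHAR(8000), @BreakAt INT;
SELECT @InputText = REPLACE(BulkColumn, CHAR(10), '')     -- keep only CR line breaks
FROM   OPENROWSET(BULK 'C:\AoC\Day18.txt', SINGLE_CLOB) AS f;

WHILE LEN(@InputText) > 1
BEGIN
    SET @Chunk   = LEFT(@InputText, 8000);
    -- cut the chunk back to the end of its last complete line
    SET @BreakAt = CASE WHEN LEN(@Chunk) < 8000 THEN LEN(@Chunk)
                        ELSE 8000 - CHARINDEX(CHAR(13), REVERSE(@Chunk)) END;

    INSERT INTO #Grid (RowText)
    SELECT Item
    FROM   dbo.DelimitedSplit8K(LEFT(@Chunk, @BreakAt), CHAR(13))
    WHERE  Item > '';                                     -- drop empty fragments

    SET @InputText = STUFF(@InputText, 1, @BreakAt, '');
END;
```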

At this point, we have one hundred rows, and each row has the starting status of its one hundred lights. We need to extract the status of the individual lights. To accomplish this, let’s use a virtual tally table to get the character at each position (using the SUBSTRING function), and store the row, position and status in a second temporary table. Here I’m converting the status of “#” or “.” to 1 or 0:
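A sketch of the “Extract individual light status” step:

```sql
IF OBJECT_ID('tempdb..#Lights') IS NOT NULL DROP TABLE #Lights;
CREATE TABLE #Lights (RowNbr INT, Pos INT, Status TINYINT,
                      PRIMARY KEY (RowNbr, Pos));

WITH T0 AS (SELECT n FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) t (n)),
     Tally AS (SELECT TOP (100) N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
               FROM   T0 a, T0 b)
INSERT INTO #Lights (RowNbr, Pos, Status)
SELECT g.RowNbr,
       t.N,
       CASE SUBSTRING(g.RowText, t.N, 1) WHEN '#' THEN 1 ELSE 0 END  -- '#' = on
FROM   #Grid g
CROSS JOIN Tally t;
```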

Since we need to compute the new status of every light based upon the state of the lights at the start of a step, we can use a single update statement – within one statement, SQL Server reads the pre-update values, which is exactly what we need. However, since we need to perform 100 steps, we will have to perform this update 100 times. This is a combination of set-based and iterative code. I’ll accomplish the iterative part by using a while loop based upon the value of a counter. To determine the new status of an individual light, the query needs to get the status of the lights for the rows prior to and after the current row, and for the light positions prior to and after the current position. Since we are looking for a count of the number of neighbor lights that are on, we can utilize a sub-query to get the neighbors’ light statuses (remember that these are now 1 or 0) and sum them up. Next the query determines what the new light status should be, and updates the temporary table with this information:
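A sketch of the “Update grid 100 times” block (cte holds the neighbor sums, cte2 the new status):

```sql
-- The CROSS APPLY sums a 3x3 block that includes the current cell itself,
-- which is why the CASE tests for 3 or 4 (explained below).
DECLARE @Step INT = 0;

WHILE @Step < 100
BEGIN
    WITH cte AS (
        SELECT l.RowNbr, l.Pos, l.Status, n.Neighbors
        FROM   #Lights l
        CROSS APPLY (SELECT Neighbors = SUM(l2.Status)
                     FROM   #Lights l2
                     WHERE  l2.RowNbr BETWEEN l.RowNbr - 1 AND l.RowNbr + 1
                       AND  l2.Pos    BETWEEN l.Pos    - 1 AND l.Pos    + 1) n
    ), cte2 AS (
        SELECT RowNbr, Pos, Status,
               NewStatus = CASE WHEN Status = 1 AND Neighbors IN (3, 4) THEN 1
                                WHEN Status = 0 AND Neighbors = 3       THEN 1
                                ELSE 0 END
        FROM   cte
    )
    UPDATE cte2 SET Status = NewStatus;   -- all rows update off the pre-update state

    SET @Step += 1;
END;

-- Part 1 answer: lights on after 100 steps
SELECT LightsOn = SUM(Status) FROM #Lights;
```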

If the light’s current status is on (1), then the sum of all of the cells within the neighboring rows and positions will include this cell itself; therefore the neighbors count looks for 3 or 4 instead of the specified 2 or 3 in order to compensate for this cell being included. If the current light is off, then we just look for whether 3 neighbors are on. And here we have the T-SQL solution for Part 1 of this puzzle.

For Part 2, you notice that the four corner lights are permanently on. What is the number of lights on after 100 steps now?

To solve this, we first need to set the four corner lights to being on initially. After the “Extract individual light status” code block, update those four rows to being on:
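A sketch of that update:

```sql
-- Part 2 setup: the four corner lights are stuck on
UPDATE #Lights
SET    Status = 1
WHERE  RowNbr IN (1, 100)
  AND  Pos    IN (1, 100);
```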

Next we need to alter the CASE statement in cte2. Replace line 68 and insert a new line 69 in the “Update grid 100 times” code block with these new lines:
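In my sketch, the CASE statement in cte2 becomes:

```sql
               NewStatus = CASE WHEN RowNbr IN (1, 100) AND Pos IN (1, 100) THEN 1  -- corners stay on
                                WHEN Status = 1 AND Neighbors IN (3, 4)     THEN 1
                                WHEN Status = 0 AND Neighbors = 3           THEN 1
                                ELSE 0 END
```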

The rest of the code remains the same. And here we have the solution for part 2 of day 18.

At this point, the puzzle is solved. But is this the most efficient way? That sub-query in the CROSS APPLY part of cte is being run for each row… that is, it’s being run 10,000 times. Let’s modify this query to use window functions instead – specifically the LAG and LEAD offset functions. In this new code, the LAG and LEAD functions get the eight neighboring cells. A little bit of logic is needed to determine if this cell is on an edge, and if so a 0 is used; otherwise the function is called to get the value from the offset row. The statuses of these eight neighboring cells are then added together to get the new Neighbors value. cte2 is also modified slightly – the query no longer needs to compensate for the current position being included in the group being summed, and the case statement is changed slightly for determining when a light is turned on. The actual update statement is the same. The modified query is:
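A sketch of that version (when the grid is ordered by row then position, offsets of 1, 99, 100 and 101 reach the eight neighboring cells):

```sql
DECLARE @Step INT = 0;

WHILE @Step < 100
BEGIN
    WITH cte AS (
        SELECT RowNbr, Pos, Status,
               Neighbors =   -- edge cells contribute 0 instead of wrapping around
                 CASE WHEN Pos > 1                    THEN LAG(Status, 1, 0)    OVER (ORDER BY RowNbr, Pos) ELSE 0 END
               + CASE WHEN Pos < 100                  THEN LEAD(Status, 1, 0)   OVER (ORDER BY RowNbr, Pos) ELSE 0 END
               + CASE WHEN RowNbr > 1 AND Pos > 1     THEN LAG(Status, 101, 0)  OVER (ORDER BY RowNbr, Pos) ELSE 0 END
               + CASE WHEN RowNbr > 1                 THEN LAG(Status, 100, 0)  OVER (ORDER BY RowNbr, Pos) ELSE 0 END
               + CASE WHEN RowNbr > 1 AND Pos < 100   THEN LAG(Status, 99, 0)   OVER (ORDER BY RowNbr, Pos) ELSE 0 END
               + CASE WHEN RowNbr < 100 AND Pos > 1   THEN LEAD(Status, 99, 0)  OVER (ORDER BY RowNbr, Pos) ELSE 0 END
               + CASE WHEN RowNbr < 100               THEN LEAD(Status, 100, 0) OVER (ORDER BY RowNbr, Pos) ELSE 0 END
               + CASE WHEN RowNbr < 100 AND Pos < 100 THEN LEAD(Status, 101, 0) OVER (ORDER BY RowNbr, Pos) ELSE 0 END
        FROM   #Lights
    ), cte2 AS (
        SELECT Status,
               NewStatus = CASE WHEN Status = 1 AND Neighbors IN (2, 3) THEN 1  -- no self-count to compensate for
                                WHEN Status = 0 AND Neighbors = 3       THEN 1
                                ELSE 0 END
        FROM   cte
    )
    UPDATE cte2 SET Status = NewStatus;

    SET @Step += 1;
END;
```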

Performance-wise, this updated code runs two-thirds faster. That’s a pretty nice improvement.

Come on back tomorrow for a T-SQL Solution for Day 19.

In my previous post, I introduced you to the Advent of Code programming challenge, and explained how I’m going through the days, solving both parts of each puzzle. Continuing on, today we have day 16. In this challenge, you have received a wonderful gift from your Aunt Sue, and you want to send her a thank-you card. There’s just one small problem… you have 500 Aunt Sues, and you don’t know which one sent you the gift.

The gift turns out to be your very own “My First Crime Scene Analysis Machine” (MFCSAM), and it can detect what specific compounds are in a sample. According to the instructions, it can detect:

  • children, by human DNA age analysis.
  • cats. It doesn’t differentiate individual breeds.
  • Several seemingly random breeds of dog: samoyeds, pomeranians, akitas, and vizslas.
  • goldfish. No other kinds of fish.
  • trees, all in one group.
  • cars, presumably by exhaust or gasoline or something.
  • perfumes, which is handy, since many of your Aunts Sue wear a few kinds.

In order to determine which Aunt Sue sent you the gift, you analyze the wrapping from the gift, and the machine tells you the following:
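From the puzzle, the ticker tape the machine prints out reads:

children: 3
cats: 7
samoyeds: 2
pomeranians: 3
akitas: 0
vizslas: 0
goldfish: 5
trees: 3
cars: 2
perfumes: 1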

Well, you know a bit about your Aunts, so you start making a list of what you do know. However, there are items that you just don’t know about. They’re not missing, you just don’t know the value.

For part 1 of this puzzle, you need to return the number of the Aunt Sue that sent you the gift. The input file is in the format:

Sue 1: goldfish: 6, trees: 9, akitas: 0

You can see how there are some items that the MFCSAM returned that are not listed in the input file. As previously mentioned, these are not missing – you just don’t know what that Aunt Sue has for that item.

The first part of the solution is to load the input file, and to get for each Aunt the quantity of each item. This solution starts off by loading the input file into a variable:
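A sketch of that load (the file path is hypothetical):

```sql
-- Read the whole input file into a variable
DECLARE @InputText VARCHAR(MAX);
SELECT @InputText = BulkColumn
FROM   OPENROWSET(BULK 'C:\AoC\Day16.txt', SINGLE_CLOB) AS f;
```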

Since the delimited string splitter only works on strings up to 8000 characters, you check the length and find that it is over 20,000 characters. So you decide to load the input file in steps, processing up to 8000 characters at a time. First off, create a temporary table to hold these lines. Next, take the first 8000 characters of @InputText, and determine where the last CHAR(13) is. Get the string from @InputText up to this point. Then the string is split on CHAR(13) to separate it into rows, and each row is examined to determine which Sue # this is, and which items we know about for her, by utilizing the CHARINDEX and SUBSTRING functions. This data is inserted into the temporary table, and the text that was just analyzed is removed from the beginning of @InputText. This process continues as long as @InputText has more than 1 character (since we never process that final CHAR(13), @InputText will end up with just that in it):
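A sketch of that staged parse, continuing from the block above (I’m storing one row per known item per aunt, and again assuming a DelimitedSplit8K-style splitter):

```sql
IF OBJECT_ID('tempdb..#Aunts') IS NOT NULL DROP TABLE #Aunts;
CREATE TABLE #Aunts (SueNbr INT, ItemName VARCHAR(20), Quantity INT);

DECLARE @Chunk VARCHAR(8000), @BreakAt INT;
SET @InputText = REPLACE(@InputText, CHAR(10), '');   -- keep only CR line breaks

WHILE LEN(@InputText) > 1
BEGIN
    SET @Chunk   = LEFT(@InputText, 8000);
    -- cut the chunk back to the end of its last complete line
    SET @BreakAt = CASE WHEN LEN(@Chunk) < 8000 THEN LEN(@Chunk)
                        ELSE 8000 - CHARINDEX(CHAR(13), REVERSE(@Chunk)) END;

    INSERT INTO #Aunts (SueNbr, ItemName, Quantity)
    SELECT SueNbr   = CAST(SUBSTRING(line.Item, 5, CHARINDEX(':', line.Item) - 5) AS INT),
           ItemName = LTRIM(LEFT(pair.Item, CHARINDEX(':', pair.Item) - 1)),
           Quantity = CAST(LTRIM(SUBSTRING(pair.Item, CHARINDEX(':', pair.Item) + 1, 10)) AS INT)
    FROM   dbo.DelimitedSplit8K(LEFT(@Chunk, @BreakAt), CHAR(13)) line
    CROSS APPLY (SELECT ItemList = SUBSTRING(line.Item, CHARINDEX(':', line.Item) + 1, 8000)) x
    CROSS APPLY dbo.DelimitedSplit8K(x.ItemList, ',') pair   -- "item: qty" pairs
    WHERE  line.Item > '';

    SET @InputText = STUFF(@InputText, 1, @BreakAt, '');
END;
```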

At this point, we just need to analyze the data to see which Aunt Sue matches all of the criteria that the MFCSAM returned. However, since we don’t know all of the information for each Aunt, we need to assume that if the data isn’t there, it matches. This is accomplished with the ISNULL function: if the item is NULL, it returns the same value that the MFCSAM reported. The code for this is:
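A sketch of that query, cross-tabbing the items per aunt and testing them against the ticker tape readings:

```sql
SELECT a.SueNbr
FROM  (SELECT SueNbr,
              children    = MAX(CASE WHEN ItemName = 'children'    THEN Quantity END),
              cats        = MAX(CASE WHEN ItemName = 'cats'        THEN Quantity END),
              samoyeds    = MAX(CASE WHEN ItemName = 'samoyeds'    THEN Quantity END),
              pomeranians = MAX(CASE WHEN ItemName = 'pomeranians' THEN Quantity END),
              akitas      = MAX(CASE WHEN ItemName = 'akitas'      THEN Quantity END),
              vizslas     = MAX(CASE WHEN ItemName = 'vizslas'     THEN Quantity END),
              goldfish    = MAX(CASE WHEN ItemName = 'goldfish'    THEN Quantity END),
              trees       = MAX(CASE WHEN ItemName = 'trees'       THEN Quantity END),
              cars        = MAX(CASE WHEN ItemName = 'cars'        THEN Quantity END),
              perfumes    = MAX(CASE WHEN ItemName = 'perfumes'    THEN Quantity END)
       FROM   #Aunts
       GROUP BY SueNbr) a
WHERE  ISNULL(a.children, 3)    = 3   -- an unknown item counts as a match
  AND  ISNULL(a.cats, 7)        = 7
  AND  ISNULL(a.samoyeds, 2)    = 2
  AND  ISNULL(a.pomeranians, 3) = 3
  AND  ISNULL(a.akitas, 0)      = 0
  AND  ISNULL(a.vizslas, 0)     = 0
  AND  ISNULL(a.goldfish, 5)    = 5
  AND  ISNULL(a.trees, 3)       = 3
  AND  ISNULL(a.cars, 2)        = 2
  AND  ISNULL(a.perfumes, 1)    = 1;
```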

Great – now we know which Aunt sent us the gift. However, before sending the thank-you card, you notice in the instructions for the MFCSAM that for some of the items, the MFCSAM returns a range and not a specific value. Specifically, the cats and trees readings indicate that there are more than this number, and the pomeranians and goldfish readings indicate that there are fewer than this number. This is a relatively simple change to the code… for the cats and trees, change the equality operator to a greater than, and change the value used in the ISNULL function to be higher than the number being tested for. Likewise, for the pomeranians and goldfish, change the operator to a less than, and change the value used in the ISNULL function to be lower than the number being tested for:
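Against the sketch above, the Part 2 WHERE clause becomes:

```sql
WHERE  ISNULL(a.children, 3)     = 3
  AND  ISNULL(a.cats, 999)       > 7   -- more cats than the reading
  AND  ISNULL(a.samoyeds, 2)     = 2
  AND  ISNULL(a.pomeranians, -1) < 3   -- fewer pomeranians than the reading
  AND  ISNULL(a.akitas, 0)       = 0
  AND  ISNULL(a.vizslas, 0)      = 0
  AND  ISNULL(a.goldfish, -1)    < 5   -- fewer goldfish
  AND  ISNULL(a.trees, 999)      > 3   -- more trees
  AND  ISNULL(a.cars, 2)         = 2
  AND  ISNULL(a.perfumes, 1)     = 1;
```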

And here is my T-SQL solution for Advent of Code, Day 16. Come on back tomorrow and see my next solution.