October 18, 2005 12:00 AM

Solve Sudoku Puzzles with T-SQL

Practice logic and logic programming at the same time

SQL Server Pro

InstantDoc ID #47752

Downloads
47752.zip

In many of my articles, I mention that SQL involves a lot of logic and that you can improve your SQL querying capabilities by solving pure logic puzzles. You've probably already heard of the logic puzzle called Sudoku, which is popular these days.

Sudoku is a Japanese term that refers to a number that's singular or unique. However, the concept behind Sudoku puzzles is by no means unique. In 1783, Swiss mathematician Leonhard Euler invented Latin squares as a variation of magic squares. In Latin squares, you populate a square grid with numbers or symbols that can't repeat in the same row or column. In the 1970s, a form of the puzzle called number place surfaced in America. In the late 1980s, the Sudoku puzzle surfaced in Japan.

The Sudoku puzzle has more restrictions than Latin squares. As Figure 1 shows, in a classic Sudoku puzzle, you get a 9×9 squared grid that's subdivided into 3×3 squared boxes. Some of the cells are already populated with numbers that range from 1 through 9. The goal is to populate all cells with numbers in the range 1 through 9 so that no number will repeat in the same row, column, or 3×3 box. Note that in a correctly formed Sudoku puzzle, there can be only one correct solution. Figure 2 shows the solution to the puzzle in Figure 1.

Sudoku puzzles have varying levels of difficulty, but all require you to apply logical deduction to solve them. The variations in complexity have to do with varying levels of logical rules and deductions that you need to apply to solve the puzzle. I thought that it would be interesting to handle Sudoku puzzles with T-SQL, using some of the new features in SQL Server 2005. That way, you'll have a chance to practice both your logic skills and learn how to use the T-SQL enhancements. The main T-SQL enhancements that I'll use are the PIVOT and UNPIVOT operators, and Common Table Expressions (CTEs). You can find details about the PIVOT and UNPIVOT operators in the Web-exclusive T-SQL 2005 articles "UNPIVOT" (InstantDoc ID 43589), "Dynamic Pivoting" (InstantDoc ID 43140), and "Pivot (or Unpivot) Your Data" (InstantDoc ID 42901). You can learn about CTEs in the T-SQL Black Belt articles "Cycling with CTEs (June 2004, InstantDoc ID 42452) and "Get in the Loop with CTEs" (May 2004, InstantDoc ID 42072).

Preparing the Sudoku Puzzle's Input
I like to use a normalized table to store input for a Sudoku puzzle. The code in Listing 1 creates a normalized table named SudokuInput and populates it with the sample data in Figure 1. This code loads the table with only those rows that represent populated cells.

Some programmers prefer to store the input in a denormalized form, where each row in the table represents a row in the input puzzle. The code in Listing 2 creates a denormalized table named SudokuInputD and populates it with the sample data in Figure 1.

Note that with the new PIVOT and UNPIVOT operators, you can easily generate one form from the other. The code at callout A in Listing 3 shows how to generate a denormalized form of the input from the normalized SudokuInput table. The code at callout B shows how to generate a normalized form of the input from the denormalized SudokuInputD table. In my code, I like to use the normalized form because I find it much more convenient for set manipulation.

Solving Sudoku Puzzles with T-SQL
There are two main approaches that you can take in programming a solution to Sudoku puzzles. One is a brute-force approach in which you try different permutations and check for validity or invalidity. Such a solution can be very slow and doesn't involve much sophisticated logic, so I won't discuss that approach here. The other approach is to identify logical rules that you use when solving a Sudoku puzzle and program those rules, which is my goal in this article. Some Sudoku puzzles can be hard to solve and require you to apply several interesting logical rules that have set-based applications.

I recommend that you create a table-valued function that returns the Sudoku solution as a normalized table. You need to accommodate cases in which the rules that you apply aren't sufficient to come up with the final solution, unless you're sure that your solution applies to any given Sudoku puzzle. If you suspect that your rule set isn't guaranteed to generate the final solution for any given Sudoku puzzle, you can take one of two approaches in terms of the content of the nonfinal cells in the returned table:

You can return only final cells (one unique value was identified).
You can return final cells and return nonfinal cells with all the possible values that remain in those cells after elimination. In your output, you can return the nonfinal cells as arrays or formatted as comma separated lists of values.

I prefer the latter approach in which I use comma separated lists of values for the output.

In my solution, I use the Nums auxiliary table, which I've used many times in past T-SQL Black Belt articles. The code in Listing 3 creates and populates the Nums table, which simply contains a sequence of integers from 1 and on. The code in Listing 4 creates a table-valued function named fn_sudoku, which at this point, doesn't yet apply any rules.

As Listing 4 shows, the fn_sudoku function declares the @rc variable and the @Sudoku Unique and @SudokuNonUnique table variables. The @rc variable keeps track of the number of rows affected by applying the different rules. The @SudokuUnique variable holds cells for which a final unique value is identified. The @SudokuNoUnique variable holds nonfinal cells (i.e., cells with multiple values).

The code at callout A in Listing 4 loads all values from the input Sudoku puzzle into @SudokuUnique. Because the rules might refer to the box in which a cell resides, I included a box attribute besides the row and col attributes. The box number is a function of the given row and column. The formula to calculate the box number is

(row-1)/3*3 + (col-1)/3 + 1

For example, a cell in row 5, col 4 resides in box 5. When you try to figure out this example in your head, keep in mind that T-SQL applies integer division when dividing two integers, which means that you get an integer quotient (fraction truncated).

At callout B in Listing 4, the fn_sudoku function loads into @SudokuNonUnique nine rows with the values 1 through 9 for each of the unoccupied cells in the input Sudoku puzzle. The code achieves this by cross joining three instances of Nums, where the n column in each is in the range 1 through 9. One instance represents rows (R), one instance represents columns (C), and one instance represents values (V). The code filters only cells (row/col combination) that don't exist in SudokuInput.

First
Previous
1
2
Next
Last

Solve Sudoku Puzzles with T-SQL

Related Content: