Long Refactoring: First New Unit Test

The heart of the code is the call to solve a single puzzle; the function tree_to_solution_string. We want to write a test that runs just this function. Getting there is not so easy.

This method is buried deep within the class SudokuSolver. But creating one of these essentially runs the entire code. In fact, we can redo our original test to run the code as a python call as opposed to a subprocess call. Lets’ start with that.

In order to create a SudokuSolver, we need to import the code. Add the following line to the top of the test file:

from treesudoku import tree_sudoku

But when we run tox, we get an error:

______________________________________________________ ERROR collecting treesudoku/test_tree_soduku.py _______________________________________________________
ImportError while importing test module '/home/ayoung/Documents/CodePlatoon/tree_sudoku/treesudoku/test_tree_soduku.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
.tox/py37/lib64/python3.7/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
treesudoku/test_tree_soduku.py:3: in <module>
    from treesudoku import tree_sudoku
E   ModuleNotFoundError: No module named 'treesudoku'

Create a blank file in the treesudoku directory name __init__.py.

Here’s my first implementation of the test.

def test_sudoku_solver():
    solver = tree_sudoku.SudokuSolver()
    assert(False)

Note that when I assert(False) aside from making the test fail, it also produces all the output from running the program. I want to start by getting rid of all that output, as I am going to pull it out of the objects. I can do that by pulling the display code out of the object. The output happens at the very end of tree_to_solution_string. here is the first transform

diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index 8db59ce..60b9c24 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -23,7 +23,10 @@ class SudokuSolver:
         self.solved_board_strings = []
         for key, value in self.boards_dict.items():
             print(f"Board: {key}")
-            self.solved_board_strings.append([self.tree_to_solution_string(value)])
+            return_string = self.tree_to_solution_string(value)
+            self.solved_board_strings.append([return_string])
+            self.print_board(self.strings_to_board_dict([return_string])['0'])
+
 
     def tree_to_solution_string(self, original_board):
         index = 0
@@ -62,7 +65,6 @@ class SudokuSolver:
         while(current_node.next_node):
             current_node = current_node.next_node
             return_string += str(current_node.value)
-        self.print_board(self.strings_to_board_dict([return_string])['0'])
         return return_string
 
     def import_csv(self):

Aside from breaking up the long line, this code moves the print statement from the called function to the loop. However, it still uses the return value from the function instead of the key from the dictionary. What is the difference? Use the debugger to look:

self.print_board(self.strings_to_board_dict([return_string])['0'])
(Pdb) print (return_string)
483921657967345821251876493548132976729564138136798245372689514814253769695417382
(Pdb) print (value)
[['0', '0', '3', '0', '2', '0', '6', '0', '0'], ['9', '0', '0', '3', '0', '5', '0', '0', '1'], ['0', '0', '1', '8', '0', '6', '4', '0', '0'], ['0', '0', '8', '1', '0', '2', '9', '0', '0'], ['7', '0', '0', '0', '0', '0', '0', '0', '8'], ['0', '0', '6', '7', '0', '8', '2', '0', '0'], ['0', '0', '2', '6', '0', '9', '5', '0', '0'], ['8', '0', '0', '2', '0', '3', '0', '0', '9'], ['0', '0', '5', '0', '1', '0', '3', '0', '0']]

One is the solved string. The other is the original puzzle in Array form. We don’t seem to have a solution set. This seems to call for a dictionary much like the original one. Let’s create that here, and use it for the display. That member variable self.solved_board_strings = [] is not being used outside of this function, so we can convert that to a dictionary.

diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index 8db59ce..a0e9c29 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -20,10 +20,13 @@ class SudokuSolver:
         self.boards_dict = self.strings_to_board_dict(self.board_strings)
         self.box_index_table = self.fill_box_index_table()
         self.board_index_table = self.fill_board_index_table()
-        self.solved_board_strings = []
+        self.solved_board_strings = dict()
         for key, value in self.boards_dict.items():
+            return_string = self.tree_to_solution_string(value)
+            self.solved_board_strings[key] = return_string
+        for key, solution in self.solved_board_strings.items():
             print(f"Board: {key}")
-            self.solved_board_strings.append([self.tree_to_solution_string(value)])
+            self.print_board(self.strings_to_board_dict([solution])['0'])
 
     def tree_to_solution_string(self, original_board):
         index = 0
@@ -62,7 +65,6 @@ class SudokuSolver:
         while(current_node.next_node):
             current_node = current_node.next_node
             return_string += str(current_node.value)
-        self.print_board(self.strings_to_board_dict([return_string])['0'])
         return return_string
 
     def import_csv(self):

Note that we have introduced a minor inefficiency by going through the loop twice. Compared to the time it takes to solve the puzzle this is irrelvant. We will do this a lot; add a small inefficiency to tease things apart. But, once we have refactored, we find we can more than make up for that by optimizing the expensive parts of the separated code.

Commit to git before continuing.

Lets move the display logic out of this class.

diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index a0e9c29..400450c 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -24,9 +24,6 @@ class SudokuSolver:
         for key, value in self.boards_dict.items():
             return_string = self.tree_to_solution_string(value)
             self.solved_board_strings[key] = return_string
-        for key, solution in self.solved_board_strings.items():
-            print(f"Board: {key}")
-            self.print_board(self.strings_to_board_dict([solution])['0'])
 
     def tree_to_solution_string(self, original_board):
         index = 0
@@ -183,6 +180,10 @@ class Tree_Node:
         return self.value
 
 start = time.time()
-x = SudokuSolver()
+solver = SudokuSolver()
+for key, solution in solver.solved_board_strings.items():
+    print(f"Board: {key}")
+    solver.print_board(solver.strings_to_board_dict([solution])['0'])
+
 end = time.time()
 print(f"Finished in {end - start} seconds")

Now we can redo our unit test like this:

diff --git a/treesudoku/test_tree_soduku.py b/treesudoku/test_tree_soduku.py
index 67b2287..9969c94 100644
--- a/treesudoku/test_tree_soduku.py
+++ b/treesudoku/test_tree_soduku.py
@@ -2,30 +2,14 @@ import subprocess
 
 from treesudoku import tree_sudoku
 
-check_data ="""
-Board:0483921657967345821251876493548132976729564138136798245372689514814253769695417382
-Board:1245981376169273584837564219976125438513498627482736951391657842728349165654812793
-Board:2462831957795426183381795426173984265659312748248567319926178534834259671517643892"""
-
-def test_end_to_end():
-    print ("Running Test")
-    p = subprocess.run(["python3", "treesudoku/tree_sudoku.py"], capture_output=True)
-    output = p.stdout.decode("utf-8").split("\n")
-    output = "".join(output[:-2])
-    output = output.replace("-","").replace("|","")
-    output = output.replace(" ","").replace("\n","")
-    output = output.replace("Board","\nBoard")
-    print("comparing output ")
-    assert(len(check_data) == len(output))
-    assert(check_data == output)
-    print("OK")
-
+puzzles = {
+    "0": "483921657967345821251876493548132976729564138136798245372689514814253769695417382",
+    "1": "245981376169273584837564219976125438513498627482736951391657842728349165654812793",
+    "2": "462831957795426183381795426173984265659312748248567319926178534834259671517643892"
+}
 
 
 def test_sudoku_solver():
     solver = tree_sudoku.SudokuSolver()
-    assert(False)
-
-test_end_to_end()
-
-
+    for key, solution in solver.solved_board_strings.items():
+        assert solver.solved_board_strings[key] == puzzles[key]

Of course, I did this in two stages: first I got the test to run, then I removed the old test. I can no longer run the tests using the CLI, now only via tox.

With this level of testing in place, it is time to attack the code and restructure. First up is reworking the code that reads the file.

Adam Young's Web Log

The Notebook of a Programmer Climber Musician Ex-Soldier Woodworker and a few other things

Long Refactoring: First New Unit Test

Leave a Reply