Long Refactoring: Introduce Iterator

In a previous article, I had to shorten a bunch of lines that had a row and column value used as indexes to the board array. This repeated pattern is a call-to-action.

We want to encapsulate the logic for referring to a particular place on the board, and for advancing through the board. This is the responsibility of the Iterator pattern.

Spoiler Alert: we don’t get all the way there in this article.

Lets take a look at one specific piece of code, the code that sets a value:

curr_board_filling_node = curr_board_filling_node.next_node
                row = int(curr_board_filling_node.board_spot[0])
                col = int(curr_board_filling_node.board_spot[1])
                test_board[row][col] = curr_board_filling_node.value

curr_board_filling_node is acting like an iterator already, except that it does not encapsulate the set function of the assignment in the last line. If the curr_board_filling_node had a pointer back to the board, the code to set the value would be

    def set(self, value):
        row = int(self.board_spot[0])
        col = int(self.board_spot[1])
        self.board[row][col] = value

However, the nodes do not currently hold on to a reference to the table. Could we make them? Right now, they act like flyweights, taking the table as external state. We can keep that pattern up for now. Also, we note that whenever we are setting a value, it is from the node in question. That means we can write the method like this:

    def write(self, board):
        row = int(self.board_spot[0])
        col = int(self.board_spot[1])
        board[row][col] = self.value

Note that I call this “write” as opposed to “set” for clarity. Set implies that you are setting a value on the object. Here, we are implying that we are writing a value into the parameter that has been passed in.

Here is the entirety of the code change. Again, it is minimal, to be legible. Already the main function ist starting to be more understandable.

diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index dc5a96b..cd8a002 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -42,14 +42,10 @@ class SudokuSolver:
         while index < MAX:
             curr_board_filling_node = head_node
             test_board = copy.deepcopy(original_board)
-            row = int(curr_board_filling_node.board_spot[0])
-            col = int(curr_board_filling_node.board_spot[1])
-            test_board[row][col] = curr_board_filling_node.value
+            curr_board_filling_node.write(test_board)
             while curr_board_filling_node.next_node:
                 curr_board_filling_node = curr_board_filling_node.next_node
-                row = int(curr_board_filling_node.board_spot[0])
-                col = int(curr_board_filling_node.board_spot[1])
-                test_board[row][col] = curr_board_filling_node.value
+                curr_board_filling_node.write(test_board)
 
             curr_row = int(curr_node.board_spot[0])
             curr_col = int(curr_node.board_spot[1])
@@ -216,6 +212,11 @@ class Tree_Node:
     def __str__(self):
         return self.value
 
+    def write(self, board):
+        row = int(self.board_spot[0])
+        col = int(self.board_spot[1])
+        board[row][col] = self.value
+
 start = time.time()
 solver = SudokuSolver(import_csv())
 for key, solution in solver.solved_board_strings.items():

How about this spot?

          curr_row = int(curr_node.board_spot[0])
            curr_col = int(curr_node.board_spot[1])
            if self.box_index.value_valid(test_board, curr_row, curr_col):

Again, we can move the logic to the Node object. Probably just as clean to move the logic to the BoxIndex. This one alone doesn’t clean up much, but every bit helps. We might need to do some re-balancing of these classes later.

diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index cd8a002..9290c2f 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -47,9 +47,7 @@ class SudokuSolver:
                 curr_board_filling_node = curr_board_filling_node.next_node
                 curr_board_filling_node.write(test_board)
 
-            curr_row = int(curr_node.board_spot[0])
-            curr_col = int(curr_node.board_spot[1])
-            if self.box_index.value_valid(test_board, curr_row, curr_col):
+            if self.box_index.is_value_valid(test_board, curr_node):
                 index += 1
                 if index &gt;= MAX:
                     continue
@@ -118,6 +116,11 @@ class BoxIndex:
     def __init__(self):
         self.table = self.fill_box_index_table()
 
+    def is_value_valid(self, board, node):
+        row = int(node.board_spot[0])
+        col = int(node.board_spot[1])
+        return self.value_valid(board, row, col)
+
     def value_valid(self, board, row_index, column_index):
         row = ['1', '2', '3', '4', '5', '6', '7', '8', '9']
         column = ['1', '2', '3', '4', '5', '6', '7', '8', '9']

OK, what about this spot?

                row = int(new_node.board_spot[0])
                col = int(new_node.board_spot[1])
                if test_board[row][col] != '0':
                    new_node.value = test_board[row][col]
                    new_node.possible_values = []

I have to be honest, I am not sure what to call this. If the test board has had a value placed in it, then the node that represents the table has no possible values assigned. It is reducing the possibilities because the current board space is marked as solved. And that last line is the name of the new function. check_solved(). Note that I forgot to check in to git after the last one, and so this diff is bigger…and note how that makes things harder to review. But not impossible: there are two chunks of code removed from the algorithm, and two functions added later.

index cd8a002..17926c3 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -47,20 +47,13 @@ class SudokuSolver:
                 curr_board_filling_node = curr_board_filling_node.next_node
                 curr_board_filling_node.write(test_board)
 
-            curr_row = int(curr_node.board_spot[0])
-            curr_col = int(curr_node.board_spot[1])
-            if self.box_index.value_valid(test_board, curr_row, curr_col):
+            if self.box_index.is_value_valid(test_board, curr_node):
                 index += 1
                 if index &gt;= MAX:
                     continue
                 new_node = Tree_Node(
                     index, self.board_index.table[index], curr_node)
-
-                row = int(new_node.board_spot[0])
-                col = int(new_node.board_spot[1])
-                if test_board[row][col] != '0':
-                    new_node.value = test_board[row][col]
-                    new_node.possible_values = []
+                new_node.check_solved(test_board)
                 curr_node.next_node = new_node
                 curr_node = new_node
             else:
@@ -118,6 +111,11 @@ class BoxIndex:
     def __init__(self):
         self.table = self.fill_box_index_table()
 
+    def is_value_valid(self, board, node):
+        row = int(node.board_spot[0])
+        col = int(node.board_spot[1])
+        return self.value_valid(board, row, col)
+
     def value_valid(self, board, row_index, column_index):
         row = ['1', '2', '3', '4', '5', '6', '7', '8', '9']
         column = ['1', '2', '3', '4', '5', '6', '7', '8', '9']
@@ -217,6 +215,14 @@ class Tree_Node:
         col = int(self.board_spot[1])
         board[row][col] = self.value
 
+    def check_solved(self, board):
+        row = int(self.board_spot[0])
+        col = int(self.board_spot[1])
+        if board[row][col] != '0':
+            self.value = board[row][col]
+            self.possible_values = []
+
+
 start = time.time()
 solver = SudokuSolver(import_csv())
 for key, solution in solver.solved_board_strings.items():

The code is written with linked list modifications interspersed with the logic of solving the problem. The steps we have taken so far mitigate some of the problem. However, a proper iterator solution would extract that logic from this function altogether. The node object really is not the iterator. The node object should be the iterated object.

Here is another minor refactoring that removes duplicate code. It also clarifies the intention.

diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index 17926c3..459619b 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -58,13 +58,12 @@ class SudokuSolver:
                 curr_node = new_node
             else:
                 if len(curr_node.possible_values) == 0:
+                    # backtrack
                     while len(curr_node.possible_values) == 0:
                         curr_node = curr_node.last_node
                         curr_node.next_node = None
                         index -= 1
-                    curr_node.next()
-                else:
-                    curr_node.next()
+                curr_node.next()
         return self.build_solution_string(head_node)
 
     def build_solution_string(self, head_node):

We could statically allocate all of the nodes we would ever need at the beginning (MAX) and iterate through a list. This would remove the need for the index variable as well. It also seems that the index member variable on the node is not used. Lets remove it.

index 17926c3..3778782 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -37,7 +37,7 @@ class SudokuSolver:
 
     def tree_to_solution_string(self, original_board):
         index = 0
-        head_node = Tree_Node(index, self.board_index.table[index])
+        head_node = Tree_Node(self.board_index.table[index])
         curr_node = head_node
         while index < MAX:
             curr_board_filling_node = head_node
@@ -52,19 +52,18 @@ class SudokuSolver:
                 if index >= MAX:
                     continue
                 new_node = Tree_Node(
-                    index, self.board_index.table[index], curr_node)
+                    self.board_index.table[index], curr_node)
                 new_node.check_solved(test_board)
                 curr_node.next_node = new_node
                 curr_node = new_node
             else:
                 if len(curr_node.possible_values) == 0:
+                    # backtrack
                     while len(curr_node.possible_values) == 0:
                         curr_node = curr_node.last_node
                         curr_node.next_node = None
                         index -= 1
-                    curr_node.next()
-                else:
-                    curr_node.next()
+                curr_node.next()
         return self.build_solution_string(head_node)
 
     def build_solution_string(self, head_node):
@@ -196,9 +195,8 @@ class BoardIndexTable:
 
 
 class Tree_Node:
-    def __init__(self, index, board_spot, last_node=None, next_node=None):
+    def __init__(self, board_spot, last_node=None, next_node=None):
         self.possible_values = ['1', '2', '3', '4', '5', '6', '7', '8']
-        self.index = index
         self.board_spot = board_spot
         self.next_node = next_node
         self.last_node = last_node

Lets take a closer look at our board_index. This is only used in the creation of new nodes. If we could pre-create all our nodes, we would not need this table. Take a mental note that we can inline that later on.

When we remove a node from the list, what are we doing? We are indicating “this is the furthers we’ve progressed thus far” and we ensure that when we progress past that point, the nodes have been re-initialized. Something feels wasteful about removing these nodes. When we backtrack, we could set the node in the fresh state and reuse them. If only we didn’t use the None as the way to indicate how far to roll forward. Going to be getting rid of that in the future.


Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.