Congratulations, you got your code to run! You are done! Ship it!. Just don’t expect to be able to read it next month. You need to maintain it. You need to add new features. It is a mess.
Give yourself a well deserved break. Then come back to it.
I have a piece of code written for that very common job interview question: write a Sudoku solver. Jarrett, A Code Platoon student I am mentoring, wrote it for his project, and he got it to work. I think it is an ideal candidate for a refactoring story. Over the next few posts, I am going to work through how I would tackle this.
It works. At 180 lines, it is complex enough that the mind cannot hold on to the entire structure at once. Here is the code in its current form.
In order to run the code, you need some sample data. I’ll post that afterwards
Here is tree_sudoku.py
#turn each csv row into a board #find what values can go into what spot #create a tree trying to put in each value #if value can not be possible, end stem, go back up the tree #retrun the branch when tree is 81 layers deep, the board is filled import csv import re import copy import time class SudokuSolver: def __init__(self): self.board_strings = self.import_csv() self.boards_dict = self.strings_to_board_dict(self.board_strings) self.box_index_table = self.fill_box_index_table() self.board_index_table = self.fill_board_index_table() self.solved_board_strings = [] for key, value in self.boards_dict.items(): print(f"Board: {key}") self.solved_board_strings.append([self.tree_to_solution_string(value)]) def tree_to_solution_string(self, original_board): index = 0 head_node = Tree_Node(index, self.board_index_table[index]) current_node = head_node while index < 81: current_board_filling_node = head_node test_board = copy.deepcopy(original_board) test_board[int(current_board_filling_node.board_spot[0])][int(current_board_filling_node.board_spot[1])] = current_board_filling_node.value while current_board_filling_node.next_node: current_board_filling_node = current_board_filling_node.next_node test_board[int(current_board_filling_node.board_spot[0])][int(current_board_filling_node.board_spot[1])] = current_board_filling_node.value if self.value_valid(test_board, int(current_node.board_spot[0]), int(current_node.board_spot[1])): index += 1 if index >= 81: continue new_node = Tree_Node(index, self.board_index_table[index], current_node) if test_board[int(new_node.board_spot[0])][int(new_node.board_spot[1])] != '0': new_node.value = test_board[int(new_node.board_spot[0])][int(new_node.board_spot[1])] new_node.possible_values = [] current_node.next_node = new_node current_node = new_node else: if len(current_node.possible_values) == 0: while len(current_node.possible_values) == 0: current_node = current_node.last_node current_node.next_node = None index -= 1 current_node.next() else: current_node.next() return_string = '' current_node = head_node return_string += str(current_node.value) while(current_node.next_node): current_node = current_node.next_node return_string += str(current_node.value) self.print_board(self.strings_to_board_dict([return_string])['0']) return return_string def import_csv(self): list_of_boards = [] with open('sample_sudoku_board_inputs.csv', 'r') as file: reader = csv.reader(file) for row in reader: list_of_boards.append(str(*row)) return list_of_boards def strings_to_board_dict(self, board_strings): return_dict = {} for index, board in enumerate(board_strings): rows = re.findall(r"\d{9}", board) board_list = [] for row in rows: row_list = [] row_list[:0] = row board_list.append(row_list) return_dict[str(index)] = board_list return return_dict def print_board(self, board): for index1, row in enumerate(board): if index1 == 0 or index1 == 3 or index1 == 6: print('-' * 21) for index, char in enumerate(row): print(char, '', end='') if index == 2 or index == 5: print('| ', end = '') print('') if index1 == 8: print('-' * 21) def value_valid(self, board, row_index, column_index): row = ['1', '2', '3', '4', '5', '6', '7', '8', '9'] column = ['1', '2', '3', '4', '5', '6', '7', '8', '9'] square = ['1', '2', '3', '4', '5', '6', '7', '8', '9'] for number in board[row_index]: if number == '0': continue if number in row: row.remove(number) else: return False for a_row in range(9): number = board[a_row][column_index] if number == '0': continue if number in column: column.remove(number) else: return False box_indexes = self.box_index_table[self.find_box_of_index(str(row_index) + str(column_index))] for index in box_indexes: row = int(index[0]) column = int(index[1]) number = board[row][column] if number == '0': continue if number in square: square.remove(number) else: return False return True def find_box_of_index(self, index): box = 'box not found' for each_box in self.box_index_table: if index in self.box_index_table[each_box]: box = each_box break return box def fill_box_index_table(self): boxes = {} box_center = [1, 1] box_number = 0 for _row_of_boxes in range(3): for _each_box in range(3): box_list = [] for i in range(-1, 2): box_list.append(str(box_center[0] + i) + str(box_center[1] - 1)) box_list.append(str(box_center[0] + i) + str(box_center[1])) box_list.append(str(box_center[0] + i) + str(box_center[1] + 1)) boxes[box_number] = box_list box_number += 1 box_center[1] += 3 box_center[0] += 3 box_center[1] -= 9 return boxes def fill_board_index_table(self): return_list = [] for row in range(9): for column in range(9): return_list.append(str(row) + str(column)) return return_list class Tree_Node: def __init__(self, index, board_spot, last_node=None, next_node=None): self.possible_values = ['1', '2', '3', '4', '5', '6', '7', '8'] self.index = index self.board_spot = board_spot self.next_node = next_node self.last_node = last_node self.value = '9' def next(self): self.value = self.possible_values.pop() def __str__(self): return self.value start = time.time() x = SudokuSolver() end = time.time() |
Here is the sample data you can use to run it. I cut it down to just 3 puzzles. To convert to a sudoku puzzle, break it up into 9 lines of 9 characters each. A 0 indicates a blank spot on the puzzle.
The full file test file has 50 puzzles in it. It takes the code a while to run against. I am going to cut it down to 3, but I want to hold on to the original data as well. I’ll do that using a symlink.
Here is sample_sudoku_board_inputs.csv.short. I can then symlink sample_sudoku_board_inputs.csv to this file.
003020600900305001001806400008102900700000008006708200002609500800203009005010300 200080300060070084030500209000105408000000000402706000301007040720040060004010003 000000907000420180000705026100904000050000040000507009920108000034059000507000000 |
ln -s sample_sudoku_board_inputs.csv.short sample_sudoku_board_inputs.csv |
Lets make sure we can run it;
$ python3 tree_sudoku.py Board: 0 --------------------- 4 8 3 | 9 2 1 | 6 5 7 9 6 7 | 3 4 5 | 8 2 1 2 5 1 | 8 7 6 | 4 9 3 --------------------- 5 4 8 | 1 3 2 | 9 7 6 7 2 9 | 5 6 4 | 1 3 8 1 3 6 | 7 9 8 | 2 4 5 --------------------- 3 7 2 | 6 8 9 | 5 1 4 8 1 4 | 2 5 3 | 7 6 9 6 9 5 | 4 1 7 | 3 8 2 --------------------- Board: 1 --------------------- 2 4 5 | 9 8 1 | 3 7 6 1 6 9 | 2 7 3 | 5 8 4 8 3 7 | 5 6 4 | 2 1 9 --------------------- 9 7 6 | 1 2 5 | 4 3 8 5 1 3 | 4 9 8 | 6 2 7 4 8 2 | 7 3 6 | 9 5 1 --------------------- 3 9 1 | 6 5 7 | 8 4 2 7 2 8 | 3 4 9 | 1 6 5 6 5 4 | 8 1 2 | 7 9 3 --------------------- Board: 2 --------------------- 4 6 2 | 8 3 1 | 9 5 7 7 9 5 | 4 2 6 | 1 8 3 3 8 1 | 7 9 5 | 4 2 6 --------------------- 1 7 3 | 9 8 4 | 2 6 5 6 5 9 | 3 1 2 | 7 4 8 2 4 8 | 5 6 7 | 3 1 9 --------------------- 9 2 6 | 1 7 8 | 5 3 4 8 3 4 | 2 5 9 | 6 7 1 5 1 7 | 6 4 3 | 8 9 2 --------------------- Finished in 4.777865409851074 seconds |
Last thing before we really get started is to check this all in to git.
$ git status fatal: not a git repository (or any parent up to mount point /) Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set). [ayoung@ayoungP40 tree_sudoku]$ git init Initialized empty Git repository in /home/ayoung/Documents/CodePlatoon/tree_sudoku/.git/ [ayoung@ayoungP40 tree_sudoku]$ git add . [ayoung@ayoungP40 tree_sudoku]$ git commit -m "Initial checking of functioning code and sample data" [master (root-commit) 5eb15e4] Initial checking of functioning code and sample data 5 files changed, 478 insertions(+) create mode 120000 sample_sudoku_board_inputs.csv create mode 100644 sample_sudoku_board_inputs.csv.full create mode 100644 sample_sudoku_board_inputs.csv.short create mode 100644 tree_sudoku.py create mode 100644 tree_sudoku.py.comments |
Next up; we’ll take a look through the code and write a test.
EDIT: I have posted all of the steps of the refactoring to github.