Refactoring With Tests in Python: a Practical Example


This post contains a step-by-step example of a refactoring session guided by tests. When dealing with untested or legacy code, refactoring is dangerous; tests help us do it the right way, minimizing the number of bugs we introduce and possibly avoiding them completely.

Refactoring is not easy. It requires a double effort: first to understand code that others wrote, or that we wrote in the past, and then to move parts of it around, simplify it, in one word improve it. It is by no means something for the faint-hearted. Like programming, refactoring has its rules and best practices, but it can be described as a mixture of technique, intuition, experience, and risk. Programming, after all, is craftsmanship.

The starting point

The simple use case I will use for this post is that of a service API that we can access and that produces data in JSON format, namely a list of elements like the one shown here

```json
{
  "age": 20,
  "surname": "Frazier",
  "name": "John",
  "salary": "£28943"
}
```

Once we convert this to a Python data structure we obtain a list of dictionaries, where 'age' is an integer and the remaining fields are strings.

Someone then wrote a class that computes some statistics on the input data. This class, called DataStats, provides a single method stats, whose inputs are the data returned by the service (in JSON format) and two integers called iage and isalary. Those, according to the short documentation of the class, are the initial age and the initial salary used to compute the average yearly increase of the salary on the whole dataset.

The code is the following

```python
import math
import json


class DataStats:

    def stats(self, data, iage, isalary):
        # iage and isalary are the starting age and salary used to
        # compute the average yearly increase of salary.

        # Compute average yearly increase
        average_age_increase = math.floor(
            sum([e['age'] for e in data])/len(data)) - iage
        average_salary_increase = math.floor(
            sum([int(e['salary'][1:]) for e in data])/len(data)) - isalary

        yearly_avg_increase = math.floor(
            average_salary_increase/average_age_increase)

        # Compute max salary
        salaries = [int(e['salary'][1:]) for e in data]
        threshold = '£' + str(max(salaries))

        max_salary = [e for e in data if e['salary'] == threshold]

        # Compute min salary
        salaries = [int(d['salary'][1:]) for d in data]
        min_salary = [e for e in data if e['salary'] ==
                      '£{}'.format(str(min(salaries)))]

        return json.dumps({
            'avg_age': math.floor(sum([e['age'] for e in data])/len(data)),
            'avg_salary': math.floor(sum(
                [int(e['salary'][1:]) for e in data])/len(data)),
            'avg_yearly_increase': yearly_avg_increase,
            'max_salary': max_salary,
            'min_salary': min_salary
        })
```

The goal

It is fairly easy, even for the untrained eye, to spot some issues in the previous class. A list of the most striking ones is

- The class exposes a single method and has no __init__, thus the same functionality could be provided by a single function.
- The stats() method is too big and performs too many tasks. This makes debugging very difficult, as there is a single inextricable piece of code that does everything.
- There is a lot of code duplication, or at least several lines that are very similar. Most notably the two operations '£' + str(max(salaries)) and '£{}'.format(str(min(salaries))), the two different lines starting with salaries =, and the several list comprehensions.

So, since we are going to use this code in some part of our Amazing New Project™, we would like to fix these issues. The class, …
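Before changing anything in a class like this, one common way to start is a test that simply records what the code returns today, so that every later refactoring step can be checked against it. The following is a minimal sketch of that idea, not necessarily the approach the original article goes on to take. It rests on several assumptions: the class is copied from the post with the `+` in the max-salary line assumed to have been lost in extraction, the Jane Doe and Richard Roe records are hypothetical, the starting values `20` and `20000` are arbitrary, and the name `test_stats_pins_current_behaviour` is made up for illustration.

```python
import json
import math


class DataStats:
    # Copy of the class from the post. The only change is the explicit
    # '+' in the threshold line (assumed lost from the original text).
    def stats(self, data, iage, isalary):
        average_age_increase = math.floor(
            sum([e['age'] for e in data]) / len(data)) - iage
        average_salary_increase = math.floor(
            sum([int(e['salary'][1:]) for e in data]) / len(data)) - isalary
        yearly_avg_increase = math.floor(
            average_salary_increase / average_age_increase)

        salaries = [int(e['salary'][1:]) for e in data]
        threshold = '£' + str(max(salaries))
        max_salary = [e for e in data if e['salary'] == threshold]

        salaries = [int(d['salary'][1:]) for d in data]
        min_salary = [e for e in data if e['salary'] ==
                      '£{}'.format(str(min(salaries)))]

        return json.dumps({
            'avg_age': math.floor(sum([e['age'] for e in data]) / len(data)),
            'avg_salary': math.floor(sum(
                [int(e['salary'][1:]) for e in data]) / len(data)),
            'avg_yearly_increase': yearly_avg_increase,
            'max_salary': max_salary,
            'min_salary': min_salary
        })


# The first record is the one shown in the post; the other two are
# hypothetical and exist only to give the statistics something to do.
SAMPLE_DATA = [
    {"age": 20, "surname": "Frazier", "name": "John", "salary": "£28943"},
    {"age": 40, "surname": "Doe", "name": "Jane", "salary": "£30000"},
    {"age": 30, "surname": "Roe", "name": "Richard", "salary": "£10000"},
]


def test_stats_pins_current_behaviour():
    # stats() returns a JSON string, so decode it before comparing.
    result = json.loads(DataStats().stats(SAMPLE_DATA, 20, 20000))

    assert result == {
        "avg_age": 30,               # floor(90 / 3)
        "avg_salary": 22981,         # floor(68943 / 3)
        "avg_yearly_increase": 298,  # floor((22981 - 20000) / (30 - 20))
        "max_salary": [SAMPLE_DATA[1]],
        "min_salary": [SAMPLE_DATA[2]],
    }
```

The test can be run with pytest or by calling the function directly. Because it compares the decoded output rather than the raw JSON string, it stays valid if a later refactoring changes key order or whitespace in the serialized result, while still failing if any computed value changes.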

This article first appeared on hackernoon.com.
