Testing in Python with pytest: from the basics to advanced techniques
I'm not going to convince you that testing is necessary: if you're reading this article, you already know that. Instead, I'll go straight to the point and show you how to work with pytest to write unit tests, integration tests, functional tests, and more. I'll focus on techniques and tools you can use in your Python projects, though the same patterns apply to other languages.
Testing libraries in Python
Let's start with a list of essential libraries for testing in Python:
- pytest: The main testing framework in Python. Supports unit, integration, and functional tests, among others. Highly extensible with plugins.
- pytest-env: Manages environment variables in tests. Useful for configuring database connection strings, API keys, etc.
- pytest-xdist: Runs tests in parallel across multiple CPU cores. Useful for speeding up the test suite.
- pytest-asyncio: Runs tests for code that uses asyncio.
- respx: Simulates HTTP requests made with the httpx library (compatible with async code).
- time-machine: An alternative to freezegun. Faster and works better with async code.
- faker: Generates fake data (names, emails, addresses, phone numbers, etc.) for tests.
- polyfactory: A modern factory library with great support for Pydantic, dataclasses, and attrs.
- syrupy: Snapshot testing for pytest.
- dirty-equals: Flexible equality assertions like IsUUID(), IsDatetime(), IsPartialDict(). Ideal for verifying parts of responses.
- hypothesis: The reference library for property-based testing. Automatically generates test cases from specifications.
We'll use many of these, and they complement each other, but everything relies on pytest as the main testing framework.
Anatomy of a test
Let's start with the basics: how to write a test.
You need to create a separate test file, usually with the test_ prefix or the _test.py suffix. Inside that file, you define test functions that must also follow the naming convention (starting with test_).
Each test function should contain one or more assertions that verify the expected behavior of the code you're testing.
For example, inside a file called test_sum.py, we could have the following test for a sum function:
def test_sum():
    assert sum([1, 2, 3]) == 6
Use a descriptive function name for each test, and split the test into 3 parts: given (data setup), when (execution of the code under test) and then (result verification).
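TROPIC_LATITUDE = 23.43691  # latitude of the tropic lines, in degrees

def is_tropic(latitude, longitude):
    # A point is in the tropics when its latitude lies between both tropic lines;
    # the longitude does not affect the result
    return -TROPIC_LATITUDE <= latitude <= TROPIC_LATITUDE

def test_is_tropic():
    # Given
    latitude = 0
    longitude = 0
    # When
    result = is_tropic(latitude, longitude)
    # Then
    assert result  # expected: True

def test_is_not_tropic():
    # Given
    latitude = 45
    longitude = 45
    # When
    result = is_tropic(latitude, longitude)
    # Then
    assert not result  # expected: False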
Factory fixtures or creating reusable test objects
Suppose that to test a function we need to create a complex object, such as a library that gives me the current weather depending on a location. Instead of creating that object every time we need it in our tests, we can use a pytest fixture to create a reusable test object.
This is how we'd do it without a fixture:
def test_get_current_weather_new_york():
    # Given
    weather_service = WeatherService()
    # When
    weather = weather_service.get_current_weather(location="New York")
    # Then
    assert weather.temperature > 0

def test_get_current_weather_london():
    # Given
    weather_service = WeatherService()
    # When
    weather = weather_service.get_current_weather(location="London")
    # Then
    assert weather.temperature > 0
Every time a test runs, the instantiation code is repeated, which violates the DRY principle and makes maintenance more expensive.
The DRY (Don't Repeat Yourself) principle tells us not to repeat code, but to abstract it into functions, classes, or fixtures so it's easier to maintain and reuse.
And this is how we'd do it using a fixture:
import pytest

@pytest.fixture
def weather_service():
    return WeatherService()

def test_get_current_weather_new_york(weather_service):
    # Given
    # No need to create the WeatherService object, we already have it as an argument
    # When
    weather = weather_service.get_current_weather(location="New York")
    # Then
    assert weather.temperature > 0

def test_get_current_weather_london(weather_service):
    # Given
    # No need to create the WeatherService object, we already have it as an argument
    # When
    weather = weather_service.get_current_weather(location="London")
    # Then
    assert weather.temperature > 0
You can also use fixtures to create test objects with specific data, read test files (CSV, JSON, etc.), configuration constants, etc.
If you work with Pydantic models or dataclasses, polyfactory lets you generate test instances without having to fill in every field by hand:
from polyfactory.factories.pydantic_factory import ModelFactory
from pydantic import BaseModel

class WeatherData(BaseModel):
    location: str
    temperature_celsius: float
    conditions: str
    humidity: int

class WeatherDataFactory(ModelFactory):
    __model__ = WeatherData

def test_weather_factory():
    weather = WeatherDataFactory.build()
    assert isinstance(weather.temperature_celsius, float)
    weather_list = WeatherDataFactory.batch(size=5)
    assert len(weather_list) == 5
build() generates an instance with valid random values. You can override the fields you care about and let the library fill in the rest.
Test-driven Development Triangulation
How can I overcome the blank page fear when writing tests? Triangulation is a technique that gets you past that initial block. But don't confuse it with traditional TDD!
TDD (Test-Driven Development) is a widely used development practice that inverts the usual order of work: you write a failing test first, and only then add the code needed to make it pass.
With the triangulation approach, instead of writing a failing test first, you write three passing tests. Then you refactor the code to eliminate duplication between them.
Let's understand it with a practical example. Suppose we want to implement a function that checks whether a string is an anagram of another (2 words that have the same letters but in a different order).
First we write a very simple passing test:
def test_anagram_simple():
    assert is_anagram("amor", "roma") #=> True

# First implementation: very simple, with a specific case
def is_anagram(s1, s2):
    return sorted(s1) == sorted(s2)
Then we write another passing test with a different case:
def test_anagram_with_spaces():
    assert is_anagram("amor", "a rom")
    assert is_anagram("a rom", "amor")
    assert is_anagram("maor", "a m o r")

# Second implementation: we expand with the first exception, but it's still not general
def is_anagram(s1, s2):
    return sorted(s1.replace(" ", "")) == sorted(s2.replace(" ", ""))
And finally, a third passing test with uppercase letters:
def test_anagram_with_uppercase():
    assert is_anagram("Amor", "Roma")
    assert is_anagram("a Rom", "Amor")
    assert is_anagram("maOr", "a m o r")
    assert is_anagram("Amor", "ROMA")

# Third implementation: now we have a general function
def is_anagram(s1, s2):
    return sorted(s1.replace(" ", "").lower()) == sorted(s2.replace(" ", "").lower())
From there, we can refactor the code to eliminate duplication between tests, or write more tests to cover additional cases (such as accents, special characters, etc.).
However, when we start having a large list of tests for the same function, it will help us to use parametrize to avoid duplication between them.
Parametrizing tests to run the same test with different data sets
Let's rescue the previous example, the last test we wrote, to parametrize it and add many more cases.
With pytest.mark.parametrize we can run the same test with different data sets.
First we define the parametrize decorator with the different test cases we want to run.
import pytest
@pytest.mark.parametrize(
    "string_1, string_2",
    [
        pytest.param("amor", "roma", id="simple"),
        pytest.param("amor", "a rom", id="with_spaces"),
        pytest.param("Amor", "Roma", id="with_uppercase"),
        pytest.param("a Rom", "Amor", id="with_spaces_and_uppercase"),
        pytest.param("maOr", "a m o r", id="with_a_lot_of_spaces_and_uppercase"),
        pytest.param("Amor", "ROMA", id="with_uppercase_2"),
        pytest.param("amór", "romá", id="with_accents"),
        pytest.param("#amor", "rom#a", id="with_special_characters"),
    ]
)
Each pytest.param entry contains the arguments that will be passed to the test function. The id parameter is optional, but very useful for identifying each test case in the test report.
A decorator by itself is not very practical. Now we define the test function that will run with each pair of strings:
def test_check_anagram(string_1, string_2):
    assert is_anagram(string_1, string_2)
pytest will take care of running this test function with each pair of strings defined in the parameter list, without doing anything else.
All together it would look something like this:
import unicodedata

import pytest

@pytest.mark.parametrize(
    "string_1, string_2",
    [
        pytest.param("amor", "roma", id="simple"),
        pytest.param("amor", "a rom", id="with_spaces"),
        pytest.param("Amor", "Roma", id="with_uppercase"),
        pytest.param("a Rom", "Amor", id="with_spaces_and_uppercase"),
        pytest.param("maOr", "a m o r", id="with_a_lot_of_spaces_and_uppercase"),
        pytest.param("Amor", "ROMA", id="with_uppercase_2"),
        pytest.param("amór", "romá", id="with_accents"),
        pytest.param("#amor", "rom#a", id="with_special_characters"),
    ]
)
def test_check_anagram(string_1, string_2):
    assert is_anagram(string_1, string_2)

def normalize(text):
    # Drop spaces, lowercase, and strip accents; without this step the
    # "with_accents" case fails, because "ó" and "o" are different characters
    decomposed = unicodedata.normalize("NFD", text.replace(" ", "").lower())
    return sorted(char for char in decomposed if not unicodedata.combining(char))

def is_anagram(s1, s2):
    return normalize(s1) == normalize(s2)
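All the cases above assert that a pair is an anagram, so a function that always returned True would pass them too. As a sketch, the same decorator can carry an expected result so that negative cases are covered as well. The `expected` column, the test name, and the extra pairs are illustrative assumptions, not part of the original example:

```python
import pytest


def is_anagram(s1, s2):
    # Implementation that ignores spaces and letter case
    return sorted(s1.replace(" ", "").lower()) == sorted(s2.replace(" ", "").lower())


@pytest.mark.parametrize(
    "string_1, string_2, expected",
    [
        pytest.param("amor", "roma", True, id="anagram"),
        pytest.param("Amor", "a rom", True, id="anagram_with_spaces_and_uppercase"),
        pytest.param("amor", "roman", False, id="different_length"),
        pytest.param("amor", "rome", False, id="different_letters"),
        pytest.param("", "", True, id="both_empty"),
    ],
)
def test_is_anagram_expected(string_1, string_2, expected):
    # "is" makes sure we get a real boolean, not just a truthy value
    assert is_anagram(string_1, string_2) is expected
```

Whether two empty strings should count as anagrams is a design decision; the case is listed mainly to force that decision to be explicit.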
Test abstractions/protocols/contracts, not concrete implementations
A very common problem when writing tests is that we tend to test concrete implementations rather than abstractions or protocols.
For example, suppose we want to use an external resource like an API that returns climate information (api.open-meteo.com).
A common, and wrong, pattern would be to include calls to that API directly in our tests:
import requests

def test_get_current_weather():
    # Given
    response = requests.get("https://api.open-meteo.com/v1/forecast?latitude=35&longitude=139&hourly=temperature_2m")
    data = response.json()
    # When
    temperature = data["hourly"]["temperature_2m"][0]
    # Then
    assert temperature > 0
What happens if the API is unavailable? Or if the response format changes? Or if the weather changes and the temperature is less than or equal to 0? These tests would be fragile and unreliable. Moreover, any small change in the API would force us to change all the tests that depend on it.
Instead, use an abstraction or protocol to interact with that API, and then mock or stub that abstraction in your tests.
import pytest
import requests
from typing import Protocol

# We define the interface (Protocol) that implementations must fulfill
class WeatherGatewayInterface(Protocol):
    def get_temperature(self, latitude: float, longitude: float) -> float | None:
        pass

# Real implementation using Open-Meteo
class OpenMeteoGateway:
    def get_temperature(self, latitude: float, longitude: float) -> float | None:
        url = "https://api.open-meteo.com/v1/forecast"
        params = {
            "latitude": latitude,
            "longitude": longitude,
            "current_weather": "true"
        }
        try:
            response = requests.get(url, params=params, timeout=5)
            response.raise_for_status()
            data = response.json()
            return data["current_weather"]["temperature"]
        except Exception:
            # Following clean architecture, we handle exceptions by returning None
            # so the use case can structure the error
            return None

@pytest.fixture
def weather_gateway():
    return OpenMeteoGateway()

def test_get_current_weather(weather_gateway):
    # Given
    latitude = 35
    longitude = 139
    # When
    temperature = weather_gateway.get_temperature(latitude, longitude)
    # Then
    assert temperature is not None  # the gateway returns None when the call fails
    assert temperature > 0
A protocol is an interface that defines a set of methods and properties that a class must implement.
So we'll follow a decoupled design where we have a WeatherGatewayInterface that defines the contract that implementations must fulfill, and a concrete implementation OpenMeteoGateway that interacts with the Open-Meteo API. In our tests we can round things out by creating a fixture that returns an instance of OpenMeteoGateway.
How does OpenMeteoGateway know it must comply with WeatherGatewayInterface? It doesn't, and that's not necessary. Protocols rely on structural typing, the static counterpart of duck typing: as long as OpenMeteoGateway has a get_temperature method with the same signature as the one defined in WeatherGatewayInterface, type checkers will consider that it fulfills the protocol.
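To make the structural matching concrete, here is a minimal sketch in which a test double satisfies the protocol without inheriting from it. `StubGateway` and `describe_temperature` are hypothetical names introduced only for this illustration:

```python
from __future__ import annotations

from typing import Protocol


class WeatherGatewayInterface(Protocol):
    def get_temperature(self, latitude: float, longitude: float) -> float | None:
        ...


class StubGateway:
    """Hypothetical test double: no inheritance, just a matching method."""

    def __init__(self, temperature: float | None):
        self._temperature = temperature

    def get_temperature(self, latitude: float, longitude: float) -> float | None:
        return self._temperature


def describe_temperature(
    gateway: WeatherGatewayInterface, latitude: float, longitude: float
) -> str:
    # Hypothetical caller that only knows about the protocol
    temperature = gateway.get_temperature(latitude, longitude)
    if temperature is None:
        return "unavailable"
    return f"{temperature:.1f}°C"


def test_describe_temperature_with_stub():
    assert describe_temperature(StubGateway(21.5), 35, 139) == "21.5°C"


def test_describe_temperature_when_gateway_fails():
    assert describe_temperature(StubGateway(None), 35, 139) == "unavailable"
```

At runtime nothing checks the protocol; a static type checker such as mypy or pyright is what would flag a `StubGateway` whose method signature drifts away from the interface.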
We've already solved the first step: testing an abstraction rather than a concrete implementation. Now let's see how we can automatically generate test data to cover more cases, and later avoid depending on the real API for testing.
Generating test data
With the previous tests I depend on the inputs I've defined myself, which means my tests will be full of biases. To avoid that, I can use data generation tools.
We have 2 paths: generate random values for specific fields, or describe the shape of the input and let a library generate the test cases from that specification.
Faker
To generate random test data such as names, emails, addresses, phone numbers, etc., we can use the faker library.
from faker import Faker
fake = Faker()
name = fake.name()
email = fake.email()
password = fake.password()
Each time we call fake.name(), fake.email(), etc., we get a different value.
Suppose we want to test some random locations for our is_tropic function. We could use faker to generate random latitudes and longitudes within certain ranges:
from faker import Faker

fake = Faker()

def test_random_locations():
    # faker returns Decimal values, so we convert them to float
    latitude = float(fake.latitude())
    longitude = float(fake.longitude())
    result = is_tropic(latitude, longitude)
    # Here we could make assertions based on the generated latitude and longitude
    if -23.43691 <= latitude <= 23.43691:
        assert result #= True
    else:
        assert not result #= False
We'll never be certain that randomly generated data covers all possible cases, but it's better than the hardcoded data we define ourselves.
Hypothesis for automatically generating test cases based on specifications
With the previous code we only validate a single random case. We could include a loop with several iterations to generate multiple random cases. However, we have a much more powerful tool for that.
With hypothesis we can define specifications for the test data we want to generate, and the library will take care of creating test cases automatically based on those specifications. It goes beyond random data generation: it deliberately tries awkward values (zero, negatives, empty collections, very large or very small numbers) and the boundaries of the ranges we declare, and when a test fails it shrinks the input to the simplest example that still fails.
from hypothesis import given, strategies as st

@given(
    latitude=st.floats(min_value=-90, max_value=90),
    longitude=st.floats(min_value=-180, max_value=180)
)
def test_random_locations(latitude, longitude):
    result = is_tropic(latitude, longitude)
    if -23.43691 <= latitude <= 23.43691:
        assert result #= True
    else:
        assert not result #= False
Although it may seem similar to faker at first glance, hypothesis runs the test many times behind the scenes (100 generated examples per test by default), each with a different combination of latitude and longitude. It does the heavy lifting for us.
Wrappers
Suppose I want to test the current weather in New York, and I know it's sunny. To do that I create the get_current_weather function, which in turn calls datetime.now() to get the current date and time.
from datetime import datetime

def get_current_weather(location):
    current_time = datetime.now()
    weather = weather_service.get_weather(location, current_time)
    return weather

def test_get_current_weather():
    weather = get_current_weather("New York")
    assert weather.status == "sunny"
So far so good. You run it and it works. What about tomorrow?
Sometimes we need to test functionality that depends on variable elements such as:
- Random numbers
- Dates and times
- Hashing
- Access to external resources (such as databases or APIs)
Anything that can make tests non-deterministic or hard to reproduce. And all of them, of course, make our tests not always work the way we'd like.
To deal with this, we can use a wrapper that lets us fix those variable values during test execution.
A wrapper is a function that wraps another function to modify its behavior in some way. It's used when we want to interact with a function differently from how we'd normally do it.
Let's create one for datetime.now(). Suppose we know that on June 1, 2024 at 12:00 PM in New York it was sunny. That's very valuable information we can use for testing. What we'll do is create a wrapper that lets us fix the current date and time during test execution, so that whenever we call datetime.now() it returns that specific date and time.
from datetime import datetime

class FixedTimeProvider:
    """Test implementation - returns a fixed time."""

    def __init__(self, fixed_time: datetime):
        self._fixed_time = fixed_time

    def now(self) -> datetime:
        return self._fixed_time
We also need to modify WeatherService to receive a time_provider as an argument in its constructor, and use that time provider instead of calling datetime.now() directly:
class WeatherService:
    """Weather service that depends on TimeProvider for timestamp control."""

    def __init__(self, time_provider: TimeProvider):
        self._time_provider = time_provider

    def get_current_weather(self, city: str) -> Weather:
        current_time = self._time_provider.now()
        return Weather(
            status="sunny",
            temperature=25.0,
            checked_at=current_time
        )
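The TimeProvider type used in the constructor is not shown in the article. A minimal sketch of what it could look like, assuming a Protocol with a single now() method and a hypothetical `SystemTimeProvider` for production use:

```python
from datetime import datetime, timezone
from typing import Protocol


class TimeProvider(Protocol):
    """Contract: anything with a now() method that returns a datetime."""

    def now(self) -> datetime:
        ...


class SystemTimeProvider:
    """Hypothetical production implementation backed by the real clock."""

    def now(self) -> datetime:
        # Timezone-aware, so it compares cleanly with aware datetimes in tests
        return datetime.now(timezone.utc)


class FixedTimeProvider:
    """Test implementation: always returns the datetime it was given."""

    def __init__(self, fixed_time: datetime):
        self._fixed_time = fixed_time

    def now(self) -> datetime:
        return self._fixed_time
```

Production code would build the service with `SystemTimeProvider()`, while tests pass a `FixedTimeProvider`; the service itself never needs to know which one it received.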
With these pieces in place we can start building our tests in a more controlled and reliable way.
We create a fixture that returns an instance of FixedTimeProvider with the date and time we want to fix:
from datetime import datetime, timezone

import pytest

from time_provider import FixedTimeProvider
from weather_service import WeatherService

@pytest.fixture
def fixed_time():
    return datetime(2024, 6, 1, 12, 0, 0, tzinfo=timezone.utc)

@pytest.fixture
def time_provider(fixed_time):
    return FixedTimeProvider(fixed_time)

@pytest.fixture
def weather_service(time_provider):
    return WeatherService(time_provider)
- The fixed_time fixture returns a datetime object with the date and time we want to fix.
- The time_provider fixture returns an instance of FixedTimeProvider with the fixed date and time.
- The weather_service fixture returns an instance of WeatherService that uses the time_provider to get the current date and time.
We've created an instance of WeatherService that depends on a time_provider to get the current date and time. WeatherService no longer calls datetime.now() directly: it calls self._time_provider.now(), which in our tests is FixedTimeProvider.now() and returns the fixed date and time we've defined.
To use it in our tests, we simply request the weather_service fixture (and fixed_time, when we need the expected value) as arguments to the test function.
def test_get_current_weather_sets_checked_at(weather_service, fixed_time):
    # When
    weather = weather_service.get_current_weather("New York")
    # Then
    assert weather.status == "sunny"
    assert weather.checked_at == fixed_time # Clean and simple!

def test_weather_timestamp_is_controlled(weather_service):
    # When
    weather1 = weather_service.get_current_weather("New York")
    weather2 = weather_service.get_current_weather("London")
    # Then - Both have the same timestamp (as expected in tests)
    assert weather1.checked_at == weather2.checked_at
This design pattern is known as Dependency Injection (DI), and it's very useful for making our code easier to test and maintain. By injecting dependencies instead of creating them directly inside our functions or classes, we can easily change the behavior.
You also have a library called time-machine that achieves the same goal by patching the clock functions themselves, so the code under test doesn't need to change, and it also supports async code. You can use it like this:
import datetime as dt
from zoneinfo import ZoneInfo

import time_machine

hill_valley_tz = ZoneInfo("America/Los_Angeles")

@time_machine.travel(dt.datetime(1985, 10, 26, 1, 24, tzinfo=hill_valley_tz))
def test_delorean():
    assert dt.date.today().isoformat() == "1985-10-26"
But it adds one more dependency to your project. Depending on your project's complexity, your own wrapper or time-machine may be more convenient.
Now let's look at another example with a wrapper for random numbers. Suppose we have a function that generates a random number between 1 and 10, and we want to test it deterministically.
We start by creating a wrapper that lets us fix the random number during test execution:
class FixedRandomProvider:
    """Test implementation - returns a fixed number."""

    def __init__(self, fixed_number: int):
        self._fixed_number = fixed_number

    def randint(self, a: int, b: int) -> int:
        return self._fixed_number
Now it's time to configure the fixtures following the same pattern as before. We create a fixture that returns an instance of FixedRandomProvider with the random number we want to fix:
import pytest

from random_provider import FixedRandomProvider
from dice_game import DiceGame

@pytest.fixture
def dice_game_winning():
    """Game with a fixed winning roll (5)."""
    random_provider = FixedRandomProvider(5)
    return DiceGame(random_provider)

@pytest.fixture
def dice_game_losing():
    """Game with a fixed losing roll (2)."""
    random_provider = FixedRandomProvider(2)
    return DiceGame(random_provider)

def test_roll_dice_winning(dice_game_winning):
    # When
    result = dice_game_winning.roll_dice()
    # Then
    assert result.value == 5
    assert result.is_winning is True

def test_roll_dice_losing(dice_game_losing):
    # When
    result = dice_game_losing.roll_dice()
    # Then
    assert result.value == 2
    assert result.is_winning is False
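DiceGame itself is not shown in the article. The following is only a sketch of a class that would satisfy those tests: `RandomProvider`, `SystemRandomProvider`, `RollResult`, and the winning threshold of 4 are assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import Protocol


class RandomProvider(Protocol):
    def randint(self, a: int, b: int) -> int:
        ...


class SystemRandomProvider:
    """Hypothetical production implementation using the standard library."""

    def randint(self, a: int, b: int) -> int:
        return random.randint(a, b)


@dataclass(frozen=True)
class RollResult:
    value: int
    is_winning: bool


class DiceGame:
    WINNING_THRESHOLD = 4  # assumption: a roll of 4 or more wins

    def __init__(self, random_provider: RandomProvider):
        self._random_provider = random_provider

    def roll_dice(self) -> RollResult:
        # The game asks the injected provider, never the random module directly
        value = self._random_provider.randint(1, 10)
        return RollResult(value=value, is_winning=value >= self.WINNING_THRESHOLD)
```

Because the only source of randomness is the injected provider, a fixed provider makes every roll predictable, while production code can pass `SystemRandomProvider()`.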
Another option is to use the dirty-equals library, which lets us make flexible equality assertions.
from dirty_equals import IsUUID, IsDatetime, IsPositiveInt

def test_weather_response_shape(weather_provider_hot):
    recommender = OutdoorActivityRecommender(weather_provider_hot)
    result = recommender.recommend(40.4168, -3.7038)
    assert result["id"] == IsUUID()
    assert result["created_at"] == IsDatetime()
    assert result["temperature_celsius"] == IsPositiveInt()
The assertion passes if the value satisfies the type condition, not if it matches exactly.
Mocking or stubbing to simulate the behavior of external dependencies without actually accessing them
In a previous example we implemented OpenMeteoGateway, which talks to the Open-Meteo API, and called it directly from a test. Hitting a real external dependency in tests is a bad idea for several reasons:
- It's slow, since every time the test runs, a call is made whose latency depends on the network and the external server.
- It's fragile, since if the API changes or becomes unavailable, our tests will fail.
- It's expensive, since some external APIs have usage limits or costs associated with calls.
- It's unreliable: we can't guarantee that responses will always be the same.
Among many other problems. So instead of interacting with the real API, we can use mocking or stubbing to simulate the behavior of that external dependency without actually accessing it.
Mocking isn't only for external APIs. We can also use it to simulate the behavior of databases, real-time services, message queues, etc.
Reading data
Let's assume we have a function that gets the current weather at a given location, and we want to test it without making real calls to the Open-Meteo API.
We'll start by defining an abstraction or protocol that defines the contract that implementations interacting with the Open-Meteo API must fulfill:
from dataclasses import dataclass
from typing import Protocol

@dataclass
class WeatherData:
    """Weather information from external system."""
    location: str
    temperature_celsius: float
    conditions: str
    humidity: int

class WeatherProvider(Protocol):
    def get_current_weather(self, latitude: float, longitude: float) -> WeatherData:
        pass
Now we can create a concrete implementation of that interface that interacts with the Open-Meteo API, and another test implementation that returns fixed data for our tests.
Production will have the real implementation:
class OpenMeteoWeatherProvider:
    def get_current_weather(self, latitude: float, longitude: float) -> WeatherData:
        # Logic to interact with the Open-Meteo API
        pass
But we'll use the test implementation that returns fixed data for our tests:
class FixedWeatherProvider:
    def get_current_weather(self, latitude: float, longitude: float) -> WeatherData:
        # Logic to return fixed test data
        pass
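The body of FixedWeatherProvider is left as a placeholder above. One possible way to fill it in by hand is sketched below; the constructor argument and the `requested` list are assumptions added for illustration:

```python
from dataclasses import dataclass


@dataclass
class WeatherData:
    location: str
    temperature_celsius: float
    conditions: str
    humidity: int


class FixedWeatherProvider:
    """Hand-written test double that always returns the same weather."""

    def __init__(self, weather: WeatherData):
        self._weather = weather
        # Coordinates received so far, so a test can check what was asked for
        self.requested: list[tuple[float, float]] = []

    def get_current_weather(self, latitude: float, longitude: float) -> WeatherData:
        self.requested.append((latitude, longitude))
        return self._weather
```

A test would build it with the data it needs, for example `FixedWeatherProvider(WeatherData("Madrid", 30.0, "Clear", 40))`, and pass it wherever a WeatherProvider is expected.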
Python provides the unittest.mock library that lets us create simulated objects (mocks) that mimic the behavior of real objects, and inside it is the create_autospec function. It will help us create a mock that respects the signature of the function or method we're simulating.
For example, if we want to create a mock of WeatherProvider, we can use create_autospec to make sure the mock has the same signature as the WeatherProvider interface:
provider = create_autospec(WeatherProvider)
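The provider variable is now a mock of WeatherProvider. Any call to provider.get_current_weather must match the signature of the method defined in WeatherProvider: a call with missing or extra arguments raises a TypeError, and accessing an attribute that WeatherProvider doesn't define raises an AttributeError. Return types are not checked, so it's up to us to configure a realistic return value.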
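A short sketch of what create_autospec enforces in practice. The return annotation is simplified to `dict` to keep the example self-contained, and `call_is_accepted` is a hypothetical helper written only for this demonstration:

```python
from typing import Protocol
from unittest.mock import create_autospec


class WeatherProvider(Protocol):
    def get_current_weather(self, latitude: float, longitude: float) -> dict:
        ...


def call_is_accepted(mock_method, *args) -> bool:
    """Return True if the autospecced method accepts these arguments."""
    try:
        mock_method(*args)
    except TypeError:
        return False
    return True


def test_autospec_checks_the_signature():
    provider = create_autospec(WeatherProvider)

    # Matches (latitude, longitude): accepted and recorded
    assert call_is_accepted(provider.get_current_weather, 40.4168, -3.7038)
    provider.get_current_weather.assert_called_once_with(40.4168, -3.7038)

    # Missing or extra arguments are rejected before the call is recorded
    assert not call_is_accepted(provider.get_current_weather, 40.4168)
    assert not call_is_accepted(provider.get_current_weather, 40.4168, -3.7038, "extra")

    # Attributes that the protocol does not define do not exist on the mock
    assert not hasattr(provider, "get_forecast")
```

In a real suite, `pytest.raises(TypeError)` would be the more idiomatic way to assert on the rejected calls.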
We can configure the behavior of the mock's methods using return_value. For example, we can make provider.get_current_weather return a WeatherData object with specific test data:
provider.get_current_weather.return_value = WeatherData(
    location="Madrid",
    temperature_celsius=30.0,
    conditions="Clear",
    humidity=40,
)
Now we can configure the fixtures where we'll include mocks of WeatherProvider that return specific test data for each test case:
from unittest.mock import create_autospec

import pytest

@pytest.fixture
def weather_provider_hot():
    """Mocked provider returning hot weather."""
    provider = create_autospec(WeatherProvider)
    provider.get_current_weather.return_value = WeatherData(
        location="Madrid",
        temperature_celsius=30.0,
        conditions="Clear",
        humidity=40,
    )
    return provider

@pytest.fixture
def weather_provider_mild():
    """Mocked provider returning mild weather."""
    provider = create_autospec(WeatherProvider)
    provider.get_current_weather.return_value = WeatherData(
        location="Barcelona",
        temperature_celsius=20.0,
        conditions="Partly cloudy",
        humidity=60,
    )
    return provider
We're missing one last piece. We have nothing to test. The WeatherProvider mock is just a simulated object that returns test data, but we don't have any function or class that uses that provider to do something with that data. For that, let's create an OutdoorActivityRecommender class that uses the WeatherProvider to recommend outdoor activities based on the weather:
class OutdoorActivityRecommender:
    """Recommends outdoor activities based on weather."""

    def __init__(self, weather_provider: WeatherProvider):
        self._weather_provider = weather_provider

    def recommend(self, latitude: float, longitude: float) -> ActivityRecommendation:
        weather = self._weather_provider.get_current_weather(latitude, longitude)
        if weather.temperature_celsius > 25:
            return ActivityRecommendation(
                activity="Swimming",
                reason=f"Perfect! It's {weather.temperature_celsius}°C",
            )
        elif weather.temperature_celsius > 15:
            return ActivityRecommendation(
                activity="Hiking",
                reason=f"Great weather at {weather.temperature_celsius}°C",
            )
        elif weather.temperature_celsius > 0:
            return ActivityRecommendation(
                activity="Skiing",
                reason=f"Snow weather! {weather.temperature_celsius}°C",
            )
        else:
            return ActivityRecommendation(
                activity="Stay indoors",
                reason=f"Very cold at {weather.temperature_celsius}°C",
            )
Now we can simulate the behavior of the Open-Meteo API in our tests.
def test_recommends_swimming_when_hot(weather_provider_hot):
    # Given
    recommender = OutdoorActivityRecommender(weather_provider_hot)
    # When
    result = recommender.recommend(40.4168, -3.7038)
    # Then
    assert result.activity == "Swimming"
    assert "30.0°C" in result.reason

def test_recommends_hiking_when_mild(weather_provider_mild):
    # Given
    recommender = OutdoorActivityRecommender(weather_provider_mild)
    # When
    result = recommender.recommend(41.3851, 2.1734)
    # Then
    assert result.activity == "Hiking"
    assert "20.0°C" in result.reason
The production flow will be:
- OutdoorActivityRecommender receives a WeatherProvider as a dependency, specifically OpenMeteoWeatherProvider.
- OpenMeteoWeatherProvider uses the get_current_weather function to get the weather data.
- get_current_weather makes a call to the Open-Meteo API to get the current weather at the given location.
- The Open-Meteo API returns a response depending on the latitude and longitude provided.
Now our mock replaces the real WeatherProvider, returning fixed test data when get_current_weather is called.
This design pattern is portable to other programming languages and testing frameworks. The idea is always the same: abstract external dependencies behind an interface or protocol, and then use mocks or stubs to simulate their behavior in tests.
To simplify the process with HTTP requests, you can use the respx library, which intercepts the requests made with httpx and returns the responses you define, with support for async code. You can use it like this:
import httpx
import respx
from httpx import Response

@respx.mock
def test_get_current_weather():
    # Given
    my_route = respx.get("https://api.open-meteo.com/v1/forecast").mock(
        return_value=Response(200, json={
            "current_weather": {
                "temperature": 30.0,
                "conditions": "Clear",
                "humidity": 40
            }
        })
    )
    # When
    response = httpx.get("https://api.open-meteo.com/v1/forecast?latitude=40.4168&longitude=-3.7038")
    data = response.json()
    # Then
    assert data["current_weather"]["temperature"] == 30.0
    assert data["current_weather"]["conditions"] == "Clear"
    assert data["current_weather"]["humidity"] == 40
Don't be afraid to work with other protocols like WebSockets, gRPC, AMQP, etc.
Writing/modifying data
Testing write or modification operations requires a different approach from reads: it's not enough to verify that the function doesn't fail, we need to confirm that the data was saved correctly and with the expected values.
Following the same decoupled design, we define the repository protocol:
class WeatherRepositoryInterface(Protocol):
    def save(self, weather: WeatherData) -> None:
        pass

    def update(self, weather_id: int, weather: WeatherData) -> None:
        pass
Instead of connecting to a real database, we create a stub (fake implementation) that internally records the calls it receives:
class FakeWeatherRepository:
    def __init__(self):
        self.saved: list[WeatherData] = []
        self.updated: list[tuple[int, WeatherData]] = []

    def save(self, weather: WeatherData) -> None:
        self.saved.append(weather)

    def update(self, weather_id: int, weather: WeatherData) -> None:
        self.updated.append((weather_id, weather))
Compared with a mock from create_autospec, the stub keeps the written data in plain lists, so assertions read like ordinary checks on ordinary objects instead of going through the mock's call-recording API.
We configure the fixtures with the stub and the service that uses it:
@pytest.fixture
def weather_repository():
    return FakeWeatherRepository()

@pytest.fixture
def weather_saver(weather_repository):
    return WeatherSaver(weather_repository)
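WeatherSaver is not defined in the article. A minimal sketch, assuming it is a thin service that simply delegates to the injected repository (a real one would probably validate or transform the data first):

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class WeatherData:
    location: str
    temperature_celsius: float
    conditions: str
    humidity: int


class WeatherRepositoryInterface(Protocol):
    def save(self, weather: WeatherData) -> None:
        ...

    def update(self, weather_id: int, weather: WeatherData) -> None:
        ...


class WeatherSaver:
    """Hypothetical service that delegates persistence to an injected repository."""

    def __init__(self, repository: WeatherRepositoryInterface):
        self._repository = repository

    def save(self, weather: WeatherData) -> None:
        self._repository.save(weather)

    def update(self, weather_id: int, weather: WeatherData) -> None:
        self._repository.update(weather_id, weather)
```

Since the repository arrives through the constructor, the same class works with a real database repository in production and with a stub or mock in tests.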
And now we can test both writes and updates:
def test_save_weather(weather_saver, weather_repository):
    # Given
    weather = WeatherData(
        location="Madrid",
        temperature_celsius=30.0,
        conditions="Clear",
        humidity=40,
    )
    # When
    weather_saver.save(weather)
    # Then
    assert len(weather_repository.saved) == 1
    assert weather_repository.saved[0].location == "Madrid"
    assert weather_repository.saved[0].temperature_celsius == 30.0

def test_update_weather(weather_saver, weather_repository):
    # Given
    weather = WeatherData(
        location="Madrid",
        temperature_celsius=35.0,
        conditions="Hot",
        humidity=20,
    )
    # When
    weather_saver.update(weather_id=1, weather=weather)
    # Then
    assert len(weather_repository.updated) == 1
    weather_id, updated_weather = weather_repository.updated[0]
    assert weather_id == 1
    assert updated_weather.temperature_celsius == 35.0
If you prefer not to write the stub by hand, you can use create_autospec and verify that the mock was called with the correct arguments:
from unittest.mock import create_autospec

@pytest.fixture
def weather_repository():
    return create_autospec(WeatherRepositoryInterface)

def test_save_weather_with_mock(weather_repository):
    # Given
    saver = WeatherSaver(weather_repository)
    weather = WeatherData(
        location="Madrid",
        temperature_celsius=30.0,
        conditions="Clear",
        humidity=40,
    )
    # When
    saver.save(weather)
    # Then
    weather_repository.save.assert_called_once_with(weather)
Use the stub when you need to inspect the exact content of the written data. Use create_autospec when you only need to confirm that the method was called with the correct arguments.
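One more point in favor of create_autospec is that the mock copies the signatures of the class it imitates, so a call that doesn't match the real interface fails at once. The sketch below illustrates this with a hypothetical RepositorySketch class standing in for WeatherRepositoryInterface and a plain dict in place of WeatherData. Passing instance=True is a choice made here so the mock behaves like an instance rather than like the class.

```python
from unittest.mock import create_autospec


class RepositorySketch:
    """Hypothetical stand-in for WeatherRepositoryInterface."""

    def save(self, weather: dict) -> None:
        raise NotImplementedError


# instance=True makes the mock imitate an instance of the class
repository = create_autospec(RepositorySketch, instance=True)

# A call that matches the signature is recorded and can be verified
repository.save({"location": "Madrid"})
repository.save.assert_called_once_with({"location": "Madrid"})

# A call that does not match the real signature is rejected
signature_error = None
try:
    repository.save()  # missing the required 'weather' argument
except TypeError as error:
    signature_error = error

# Attributes the real class does not define are not available on the mock
has_delete = hasattr(repository, "delete")
```

A hand-written stub gives no such guarantee: if the interface changes, the stub keeps accepting the old calls until someone updates it.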
Snapshot testing to compare a function's output with a previously saved version
In the previous strategy, we created fake objects (mocks or stubs) to simulate the behavior of external dependencies. It works well for small, controlled cases, but when the output of the function we want to test is large or complex, writing assertions for every field can be tedious.
Let's learn how to compare a function's output with a previously saved version, which is known as snapshot testing.
The strategy is simple:
- We run the function we want to test.
- We save its output to a snapshot file.
- On subsequent runs, we compare the current output with the saved snapshot.
If the output changes, the test fails and alerts us that something is different. If the change is intentional, we repeat the first two steps to update the snapshot. If it is not, the failure has just caught a regression.
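To make the mechanics concrete, here is a hand-rolled sketch of those steps using only the standard library. The assert_matches_snapshot helper, the get_weather function, and the JSON file format are all assumptions made for illustration, not how any particular library stores its snapshots.

```python
import json
import tempfile
from pathlib import Path


def assert_matches_snapshot(value: dict, snapshot_file: Path, update: bool = False) -> None:
    """Compare value with the stored snapshot, or (re)write it when update=True."""
    serialized = json.dumps(value, indent=2, sort_keys=True)
    if update:
        snapshot_file.write_text(serialized)
        return
    if not snapshot_file.exists():
        raise AssertionError(f"no snapshot at {snapshot_file}; rerun with update=True")
    assert snapshot_file.read_text() == serialized, "output differs from snapshot"


def get_weather() -> dict:
    # Hypothetical function under test
    return {"location": "Madrid", "temperature_celsius": 22.5}


snapshot_path = Path(tempfile.mkdtemp()) / "weather.json"

# Steps 1 and 2: run the function and save its output
assert_matches_snapshot(get_weather(), snapshot_path, update=True)

# Step 3: later runs compare against the saved file and pass while nothing changes
assert_matches_snapshot(get_weather(), snapshot_path)
```

In practice you would not write this helper yourself. A library takes care of file naming, serialization, and the update flag.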
The reference library for this with pytest is syrupy.
Let's look at different examples of snapshot testing for various use cases.
API response
Suppose we have an endpoint that returns the current weather for a location. The test would look something like this:
import pytest
from fastapi.testclient import TestClient
from app import app
@pytest.fixture
def client():
    return TestClient(app)
def test_get_weather_response(client):
    response = client.get("/weather/Madrid")
    assert response.status_code == 200
    assert response.json()['country'] == "Spain"
To include snapshot testing, we add syrupy's snapshot fixture as an argument to the test function and import SnapshotAssertion from syrupy.assertion to annotate it:
# test_api.py
import pytest
from fastapi.testclient import TestClient
from syrupy.assertion import SnapshotAssertion  # New
from app import app
@pytest.fixture
def client():
    return TestClient(app)
def test_get_weather_response(client, snapshot: SnapshotAssertion):  # New
    response = client.get("/weather/Madrid")
    assert response.status_code == 200
    assert response.json() == snapshot  # New
    assert response.json()['country'] == "Spain"
The first time you run the test it will fail because no snapshot exists yet. To create it:
pytest --snapshot-update
Syrupy will generate a __snapshots__/test_api.ambr file with the serialized output:
# __snapshots__/test_api.ambr
# serializer version: 1
# name: test_get_weather_response
  dict({
    'conditions': 'Partly cloudy',
    'country': 'Spain',
    'forecast': list([
      dict({
        'conditions': 'Sunny',
        'day': 'Monday',
        'temperature_celsius': 24.0,
      }),
      ...
    ]),
    'humidity': 65,
    'location': 'Madrid',
    'temperature_celsius': 22.5,
    'wind_speed_kmh': 15.0,
  })
# ---
From that point on, any change in the response will cause the test to fail. If the change is intentional (adding a field, for example), you update the snapshot:
# Update all snapshots
pytest --snapshot-update
# Update only the snapshot for a specific test
pytest test_api.py::test_get_weather_response --snapshot-update
Only update the snapshot when you're sure the change is correct. Doing it blindly removes the protection against regressions.
An important requirement: variable values like dates or IDs must be fixed before doing snapshot testing. Otherwise the snapshot will change on every run. For that you already have the wrappers and dependency injection we covered earlier.
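As a reminder of what that looks like, here is a minimal sketch in which the timestamp comes from an injected clock. The build_weather_report function and the fixed_clock helper are hypothetical names used only for this illustration.

```python
from datetime import datetime, timezone
from typing import Callable


def build_weather_report(location: str, now: Callable[[], datetime]) -> dict:
    # Hypothetical function: the timestamp comes from the injected clock,
    # not from a direct call to datetime.now()
    return {"location": location, "generated_at": now().isoformat()}


def fixed_clock() -> datetime:
    # Always returns the same instant, so the output never varies between runs
    return datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc)


first = build_weather_report("Madrid", now=fixed_clock)
second = build_weather_report("Madrid", now=fixed_clock)
```

Because both calls produce identical dictionaries, a snapshot of this output stays stable. With the real clock, the generated_at field would differ on every run and the test would fail each time.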
CSV
Snapshot testing is also useful for testing CSV exports. Suppose we have a function that generates a CSV with the historical temperature data:
# weather_export.py
import csv
import io
def export_weather_to_csv(records: list[dict]) -> str:
    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=["date", "location", "temperature_celsius", "conditions"])
    writer.writeheader()
    writer.writerows(records)
    return output.getvalue()
The test with snapshot:
from syrupy.assertion import SnapshotAssertion
from weather_export import export_weather_to_csv
def test_export_weather_to_csv(snapshot: SnapshotAssertion):
    # Given
    records = [
        {"date": "2024-06-01", "location": "Madrid", "temperature_celsius": 30.0, "conditions": "Sunny"},
        {"date": "2024-06-02", "location": "Madrid", "temperature_celsius": 28.5, "conditions": "Partly cloudy"},
        {"date": "2024-06-03", "location": "Madrid", "temperature_celsius": 25.0, "conditions": "Rainy"},
    ]
    # When
    result = export_weather_to_csv(records)
    # Then
    assert result == snapshot
Syrupy will save the exact CSV content. If someone changes the column order, the separator, or the value format, the test will catch it.
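To see what "exact content" means here, the sketch below repeats the export function and prints the raw string for a single record. Note that Python's csv module ends each row with "\r\n" by default, so the line endings are also part of what gets compared.

```python
import csv
import io


def export_weather_to_csv(records: list[dict]) -> str:
    # Same function as in weather_export.py, repeated so the example runs on its own
    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=["date", "location", "temperature_celsius", "conditions"])
    writer.writeheader()
    writer.writerows(records)
    return output.getvalue()


result = export_weather_to_csv(
    [{"date": "2024-06-01", "location": "Madrid", "temperature_celsius": 30.0, "conditions": "Sunny"}]
)

# repr() makes the separators and line endings visible
print(repr(result))
# 'date,location,temperature_celsius,conditions\r\n2024-06-01,Madrid,30.0,Sunny\r\n'
```

Any difference in that string, down to a single character, makes the comparison with the snapshot fail.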
Final notes
Throughout the article we've seen several techniques and tools that you can combine depending on what you need to test:
- Fixtures to reuse test objects and keep the code DRY.
- Triangulation to get past the initial blank-page block and build the implementation incrementally.
- Parametrization to cover many cases with a single test.
- Abstractions and protocols to decouple the code from its external dependencies, making it possible to replace them in tests.
- Data generation with faker for realistic data and hypothesis to find edge cases you wouldn't have thought of.
- Wrappers and dependency injection to control non-deterministic elements like dates, random numbers, or external access.
- Mocks and stubs to simulate the behavior of external dependencies without actually accessing them.
- Snapshot testing to protect the shape of responses or exports without writing one assertion per field.
None of these techniques excludes the others. A test can use a mock for the API, a wrapper for the date, and finish with a snapshot. The key is to apply each tool where it reduces friction, not where it adds it.
One last tip: a test suite that takes 10 minutes to run is a suite nobody runs. Keep tests fast, isolated, and deterministic, and they'll be an ally rather than a burden. If the suite still grows, pytest-xdist runs it in parallel by distributing tests across available cores:
# Use all available cores
pytest -n auto
# Use a specific number of workers
pytest -n 4
With large suites, the time savings can be considerable without changing a single line of the tests themselves.
This work is licensed under an Attribution-NonCommercial-NoDerivatives 4.0 International license.