Hypothesis for Property-Based Tests in Python

Preamble

Example-driven tests—whether pytest parametrization or hand-picked tables—only guard the cases you thought to write. Hypothesis turns that around: you state a property that should hold for many inputs, and the library searches for counterexamples. When it finds one, shrinking reduces the failure to a minimal case you can actually debug.
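
As a minimal sketch of what "stating a property" looks like, the test below uses Hypothesis's `given` decorator and its `strategies` module. The `dedupe` helper is hypothetical, invented here only so there is something to test.

```python
from hypothesis import given, strategies as st


def dedupe(items: list[int]) -> list[int]:
    """Hypothetical helper: drop repeated values, keeping first occurrences."""
    seen: set[int] = set()
    out: list[int] = []
    for item in items:
        if item not in seen:
            seen.add(item)
            out.append(item)
    return out


@given(st.lists(st.integers()))
def test_dedupe_properties(items):
    result = dedupe(items)
    # No value appears twice in the output.
    assert len(result) == len(set(result))
    # Nothing is invented and nothing is lost.
    assert set(result) == set(items)
    # Running it again changes nothing (idempotence).
    assert dedupe(result) == result
```

None of the three assertions names a specific input; Hypothesis supplies the lists, including the empty one.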

I use Hypothesis beside normal tests, not instead of them. Unit examples document intent; properties document invariants the architecture must obey.

Where properties shine

Parsers and serializers
Round-trip laws (decode(encode(x)) ≈ x) catch surprising Unicode, empty strings, and boundary lengths.
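
A round-trip law can be sketched with the standard library's `json` module standing in for whatever codec you actually own. The `json_values` strategy is my own construction for this example. It excludes NaN and infinities on purpose: NaN never compares equal to itself, which is one reason the law is written with ≈ rather than ==.

```python
import json

from hypothesis import given, strategies as st

# Leaves are JSON scalars; containers are built recursively from them.
json_values = st.recursive(
    st.none()
    | st.booleans()
    | st.integers()
    | st.floats(allow_nan=False, allow_infinity=False)
    | st.text(),
    lambda children: st.lists(children) | st.dictionaries(st.text(), children),
    max_leaves=20,
)


@given(json_values)
def test_json_round_trip(value):
    # decode(encode(x)) should give back an equal value.
    assert json.loads(json.dumps(value)) == value
```

For a codec of your own, swap `json.dumps` and `json.loads` for your encode and decode functions and narrow the strategy to the values the format claims to support.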

Pure functions with laws
Sorting, merging intervals, monetary calculations—anything with algebraic structure is a natural fit.
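
To make "algebraic structure" concrete, here is a sketch for interval merging. `merge_intervals` is a hypothetical implementation written for this example, and the three laws checked (disjoint output, idempotence, coverage) are one reasonable choice rather than a complete specification.

```python
from hypothesis import given, strategies as st


def merge_intervals(intervals):
    """Hypothetical example: merge closed intervals that overlap or share an endpoint."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Build valid (start, end) pairs by ordering two bounded integers.
interval_lists = st.lists(
    st.tuples(st.integers(-50, 50), st.integers(-50, 50)).map(
        lambda pair: (min(pair), max(pair))
    )
)


@given(interval_lists)
def test_merge_laws(items):
    merged = merge_intervals(items)
    # Output is sorted and strictly disjoint.
    for (_, prev_end), (next_start, _) in zip(merged, merged[1:]):
        assert prev_end < next_start
    # Merging an already merged list changes nothing.
    assert merge_intervals(merged) == merged
    # Every input interval sits inside some merged interval.
    for start, end in items:
        assert any(lo <= start and end <= hi for lo, hi in merged)
```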

Stateful systems
Hypothesis can model sequences of operations (append, pop, balance) and check each step against a reference implementation. This is closer to integration testing than to unit testing, and the setup is worth it when bugs are expensive.
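
A small sketch of the stateful style, using `RuleBasedStateMachine` from `hypothesis.stateful`. The `Stack` class is a hypothetical system under test that tracks its size by hand, and a plain Python list serves as the reference implementation.

```python
from hypothesis import strategies as st
from hypothesis.stateful import RuleBasedStateMachine, invariant, precondition, rule


class Stack:
    """Hypothetical system under test: a stack with a hand-maintained size."""

    def __init__(self):
        self._items = []
        self._size = 0

    def append(self, value):
        self._items.append(value)
        self._size += 1

    def pop(self):
        self._size -= 1
        return self._items.pop()

    def __len__(self):
        return self._size


class StackMachine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.actual = Stack()
        self.model = []  # reference implementation

    @rule(value=st.integers())
    def append(self, value):
        self.actual.append(value)
        self.model.append(value)

    @precondition(lambda self: len(self.model) > 0)
    @rule()
    def pop(self):
        # Both implementations must return the same element.
        assert self.actual.pop() == self.model.pop()

    @invariant()
    def sizes_agree(self):
        assert len(self.actual) == len(self.model)


# pytest and unittest collect this generated TestCase class.
TestStack = StackMachine.TestCase
```

Hypothesis chooses the order and arguments of the rules. The precondition keeps it from popping an empty stack, and the invariant runs after every step.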

Generators and custom strategies

Stock strategies cover many builtins; composite strategies encode “valid user” shapes that mirror production constraints better than text(). The goal is not maximal randomness—it is representative diversity.
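
Here is one way a “valid user” strategy might look, built with `st.composite`. The `User` dataclass, its fields, the age bounds, and `display_label` are all hypothetical; real production constraints would replace them.

```python
import string
from dataclasses import dataclass

from hypothesis import given, strategies as st


@dataclass(frozen=True)
class User:
    """Hypothetical domain object for illustration."""

    name: str
    email: str
    age: int


@st.composite
def valid_users(draw):
    # Each draw mirrors an assumed production rule instead of using raw text().
    name = draw(st.text(alphabet=string.ascii_lowercase, min_size=1, max_size=20))
    local = draw(
        st.text(alphabet=string.ascii_lowercase + string.digits, min_size=1, max_size=12)
    )
    domain = draw(st.sampled_from(["example.com", "example.org"]))
    age = draw(st.integers(min_value=18, max_value=120))
    return User(name=name.title(), email=f"{local}@{domain}", age=age)


def display_label(user: User) -> str:
    """Hypothetical function under test."""
    return f"{user.name} <{user.email}>"


@given(valid_users())
def test_label_contains_identity(user):
    label = display_label(user)
    assert label.startswith(user.name)
    assert label.endswith(f"<{user.email}>")
    assert 18 <= user.age <= 120
```

The strategy stays narrow on purpose: every generated user passes validation, so failures point at the code under test and not at input rejection.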

Shrinking as a teaching tool

A 200-character failing string that shrinks to "[" teaches faster than a wall of hex. I read shrunk failures like compiler errors: fix the smallest law break first.
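
Shrinking can be observed directly with `hypothesis.find`, which returns the smallest example it can find that satisfies a predicate. In this sketch `close_index` is a hypothetical function with a planted bug, and the four-character alphabet is an assumption that keeps the search small.

```python
from hypothesis import find, strategies as st


def close_index(source: str) -> int:
    """Hypothetical, deliberately buggy: assumes every '[' is eventually closed."""
    start = source.find("[")
    if start == -1:
        return -1
    # Planted bug: str.index raises ValueError when no ']' follows.
    return source.index("]", start)


def crashes(source: str) -> bool:
    try:
        close_index(source)
    except ValueError:
        return True
    return False


# Search strings of up to 200 characters, then shrink the first failure found.
shrunk = find(st.text(alphabet="[]ab", max_size=200), crashes)
print(repr(shrunk))
```

Whatever long string first triggers the crash, the value that comes back is the one-character string "[", which is the smallest input that still breaks the function.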

Conclusion

Property tests document laws, not single scenarios. They pair with the mock discipline from “Mockito and unittest.mock: Boundaries in Tests” and with 2024’s architecture tests: keep I/O at the edge, keep invariants in the middle.