Python—a language well-suited for fast prototyping in the AI field—has become increasingly popular these days. Creating a well-designed Python package at the right time can make your work more impactful and more likely to be adopted.
I have been learning Python package development since October 2024. In that time, I have published my own work, IAAR-Shanghai/UHGEval, as a Python package, participated in the early architecture design of MemTensor/MemOS, and studied and contributed to gaogaotiantian/dowhen in depth. These experiences have given me some insights into Python package design, which I document in this post.
Among the projects I’ve worked on, gaogaotiantian/dowhen stands out for its elegant and truly Pythonic design. I highly encourage you to explore it in depth; I also provide an extended introduction in the Appendix.
Related Work
There are already many articles and tutorials online about building and designing Python packages. For example:
- (Must-read) Packaging Python Projects, the official tutorial from the Python Packaging Authority (PyPA), which concisely introduces the process of building and publishing Python packages [1].
- (Must-read) Designing Pythonic library APIs, an article on how to design Pythonic APIs [2].
- Python Packaging Best Practices, an article explaining packaging principles [3], including sdist, wheel, and an introduction to frontend/backend tools.
- Structuring Your Project, a slightly outdated article on Python package code structure [4].
These articles are either outdated [4], lack a discussion of design philosophy [1][3], or are not comprehensive enough [1][2]. This post aims to fill those gaps.
API Design
The API is the deliverable of your package—the core product.
- Only the package’s API is exposed to users.
- All other, non-API code does not need to be explained to users, and users should not have to read it when using the API (only developers need to read it).
API design is therefore extremely important. Ben Hoyt offers many practical tips in his article [2], and his writing is quite engaging. I highly recommend reading it. Here are his takeaways:
- Good API design is very important to users.
- When creating a library, start with a good base and iterate.
- Try to follow PEP 8 and grok PEP 20. This is the way.
- The standard library isn’t always the best example to follow.
- Expose a clean API; file structure is an implementation detail.
- Flat is better than nested.
- Design your library to be used as `import lib ... lib.Thing()` rather than `from lib import LibThing ... LibThing()`.
- Avoid global configuration; use good defaults and let the user override them.
- Avoid global state; use a class instead.
- Names should be as short as they can be while still being clear.
- Function names should be verbs and classes nouns, but don’t get hung up on this.
- Being `_private` is fine; `__extra_privacy` is unnecessary.
- If an error occurs, raise a custom exception; use built-in exceptions if appropriate.
- Only break backwards compatibility if you’re overhauling your API.
- Keyword arguments and dynamic typing are great for backwards compatibility.
- Use type annotations at least for your public API; your users will thank you.
- Use `@dataclass` for classes which are (mostly) data.
- Python’s expressiveness is boundless; don’t use too much of it!
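Several of these takeaways fit together in one small module. The sketch below imagines a hypothetical `lib.py`; the names `Thing`, `Record`, and `LibError` are invented for illustration and do not come from Hoyt’s article. It shows a class instead of global state, a keyword-only option with a default, a dataclass for plain data, and a custom exception.

```python
# Hypothetical contents of lib.py; every name here is illustrative.
from dataclasses import dataclass


class LibError(Exception):
    """Custom exception so callers can catch this library's failures."""


@dataclass
class Record:
    """Mostly data, so a dataclass is enough."""

    key: str
    value: str


class Thing:
    """Holds its own settings instead of relying on global configuration."""

    def __init__(self, *, separator: str = "=") -> None:
        # Keyword-only with a sensible default: more options can be added
        # later without breaking existing callers.
        self._separator = separator

    def parse(self, text: str) -> Record:
        key, found, value = text.partition(self._separator)
        if not found:
            raise LibError(f"expected '{self._separator}' in {text!r}")
        return Record(key=key.strip(), value=value.strip())
```

With this layout, a user would write `import lib` and then call `lib.Thing().parse("a = 1")`.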
In addition to these suggestions, this post briefly discusses how to achieve a stable and simple API.
APIs Should Remain Stable and Simple
Code should be continuously refactored to adapt to new situations. However, at least within each major version (e.g., v0.*.*, v1.*.*), the API should remain stable. This means that before developing a package, you should already have a clear idea of the core functionality your package will provide!
If the core functionality is logically complete, then the API should be stable and simple. For example, CRUD (Create, Read, Update, Delete) is a typical logically complete set of functions; similarly, Einstein Operations are a logically complete set of mathematical operations, and arogozhnikov/einops provides only three core APIs (`rearrange`, `reduce`, and `repeat`) [5]! Good API design often draws inspiration from logic and mathematics.
Video 1: Einstein Operation Introduction
Having fewer APIs brings other benefits: it’s easier for users to remember, easier for developers to write documentation (since it’s clear which functions/classes need detailed explanations, examples, and use cases), and easier to write tests, as these APIs are expected to be more stable.
How to Achieve a Stable and Simple API?
Here are two Pythonic approaches (I think).
Achieving simplicity through parameter polymorphism. gaogaotiantian/dowhen exposes only seven names to users: `['bp', 'clear_all', 'do', 'get_source_hash', 'goto', 'when', 'DISABLE']`, with `bp`, `goto`, `do`, and `when` being the core ones. It’s very clear what users should use. `dowhen` mainly achieves this through parameter polymorphism. For example, the `when` function for creating triggers accepts identifiers of type `IdentifierType = int | str | re.Pattern | Literal["<start>", "<return>"] | None`. Since `identifiers` is a variadic parameter, it implicitly supports logical AND and OR relationships. As a result, this function supports a very wide range of functionality. See the Appendix for details on its implementation.
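To make the idea of parameter polymorphism concrete, here is a minimal sketch of one function that accepts line numbers, string prefixes, and compiled patterns through a single variadic parameter. It is not `dowhen`’s implementation: `find_lines` and `_matches` are hypothetical names, and the convention that several arguments mean OR while a tuple means AND is an assumption of this sketch.

```python
import re


def _matches(identifier, lineno, text):
    """Return True if one identifier selects the given source line."""
    if isinstance(identifier, tuple):
        # Assumption of this sketch: a tuple means all parts must match (AND).
        return all(_matches(part, lineno, text) for part in identifier)
    if isinstance(identifier, int):
        return lineno == identifier
    if isinstance(identifier, str):
        return text.strip().startswith(identifier)
    if isinstance(identifier, re.Pattern):
        return identifier.search(text) is not None
    raise TypeError(f"unsupported identifier: {identifier!r}")


def find_lines(source, *identifiers):
    """Return 1-based numbers of lines selected by any identifier (OR)."""
    return [
        lineno
        for lineno, text in enumerate(source.splitlines(), start=1)
        if any(_matches(identifier, lineno, text) for identifier in identifiers)
    ]
```

One function name covers what would otherwise be several (`find_by_number`, `find_by_prefix`, `find_by_regex`), which is the point of the technique: the type of each argument selects the behavior.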
Achieving simplicity through the factory pattern. huggingface/transformers also has simple and memorable APIs, but relies more on the factory pattern, such as `AutoModelForCausalLM`, `AutoModelForSequenceClassification`, and even the abstract factory `pipeline`. The user-facing API parameters are not particularly complex, with the most important one being `model_name_or_path`, so you don’t need to worry much about parameter interactions. Overall, it’s easy for users to use and remember.
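The factory idea can be sketched in a few lines. The example below is not how transformers is implemented; `AutoParser`, `JsonParser`, `CsvParser`, and `register` are hypothetical names. It only illustrates the pattern of one memorable entry point that picks a concrete class from a single argument.

```python
import json

# Maps a file suffix to the class that handles it.
_PARSERS = {}


def register(suffix):
    """Class decorator that records which class handles a file suffix."""
    def decorator(cls):
        _PARSERS[suffix] = cls
        return cls
    return decorator


@register(".json")
class JsonParser:
    def parse(self, text):
        return json.loads(text)


@register(".csv")
class CsvParser:
    def parse(self, text):
        return [line.split(",") for line in text.splitlines()]


class AutoParser:
    """Single entry point; users never pick the concrete class themselves."""

    @staticmethod
    def from_path(path):
        for suffix, parser_cls in _PARSERS.items():
            if path.endswith(suffix):
                return parser_cls()
        known = ", ".join(sorted(_PARSERS))
        raise ValueError(f"no parser for {path!r}; known suffixes: {known}")
```

Users only need to remember `AutoParser.from_path(...)`; supporting a new format means registering a new class, not changing the public API.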
When Simplicity Is Impossible
Sometimes, things don’t go as we wish. If you look at the API of NVIDIA/Megatron-LM, you’ll find it very complex; the same goes for hiyouga/LLaMA-Factory and MemTensor/MemOS. These libraries are often experimental, and their interfaces are unlikely to be stable. As a result, “getting things running” is a big part of using them, and their APIs are hard to keep simple. When the API can’t be simplified, users must learn about it through some channel. So, our focus shifts to optimizing that channel.
- For NVIDIA/Megatron-LM, users are expected to git clone the entire package. It provides various training template bash scripts for easy modification. For further customization, users can directly read megatron/training/arguments.py, which aggregates all parameters.
- For hiyouga/LLaMA-Factory, which also has many parameters, its documentation provides detailed parameter descriptions for each training paradigm. LLaMA-Factory also launches a Gradio interface, visually unifying parameter descriptions and settings.
- For MemTensor/MemOS, it relies on three channels: detailed example scripts, various cookbooks, and using Pydantic to constrain user input, so that error messages guide users on how to fix issues.
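The last channel, validating input so that the error message itself tells users what to fix, can be sketched without any third-party dependency. The example below swaps Pydantic for a plain dataclass with `__post_init__` checks; `MemoryConfig`, `ConfigError`, and the field names are hypothetical and are not taken from MemOS.

```python
from dataclasses import dataclass

_ALLOWED_BACKENDS = ("local", "remote")


class ConfigError(ValueError):
    """Raised when user-supplied configuration is invalid."""


@dataclass(frozen=True)
class MemoryConfig:
    backend: str = "local"
    top_k: int = 5

    def __post_init__(self):
        # Each message states what was wrong and what would be accepted.
        if self.backend not in _ALLOWED_BACKENDS:
            raise ConfigError(
                f"backend={self.backend!r} is not supported; "
                f"choose one of {list(_ALLOWED_BACKENDS)}"
            )
        if not isinstance(self.top_k, int) or self.top_k < 1:
            raise ConfigError(f"top_k={self.top_k!r} must be a positive integer")
```

Failing at construction time, with the accepted values spelled out, saves users from reading the source to find out why a run broke later.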
Dependency Management
Unlike Rust or Node.js, Python does not have an official package management tool. However, with the gradual standardization of `pyproject.toml` by the Python community [6][7], Python’s package management tools are maturing.
In package management, dependency management is one of the most important aspects. Here, we focus only on dependency-related topics: management tools, dependency types, version ranges, and lazy-import tricks.
Management Tools
There are many options, such as pypa/setuptools and python-poetry/poetry. My personal recommendation is astral-sh/uv, which represents the future direction of packaging. It has three main advantages:
- Written in Rust, so it’s fast [8].
- Supports Python’s official `pyproject.toml` specification, with many schemas closely following Python PEPs.
- Supports integration with the PyTorch ecosystem [9].
Three Types of Dependencies
As described in the official uv documentation, there are three types of dependencies [10]:
- `project.dependencies`: Published dependencies.
- `project.optional-dependencies`: Published optional dependencies, or “extras”.
- `dependency-groups`: Local dependencies for development.
Published dependencies are installed when running `pip install my-package`. These should include all core dependencies: without any one of them, almost none of the package’s code will run. However, there are exceptions: for packages like `torch`, which are platform-dependent and huge, it’s better to guide users to install them themselves (see Lazy Imports and Guided Installation).
Published optional dependencies are installed with `pip install my-package[extra]`. These should include all optional dependencies: without any one of them, the core code still runs, but a major feature block won’t. This requires managing different feature/extra groups. For example, MemTensor/MemOS groups dependencies by memory type, such as `MemoryOS[tree-mem]` and `MemoryOS[mem-reader]`; EleutherAI/lm-evaluation-harness groups dependencies by supported benchmarks, such as `lm-eval[ifeval]` and `lm-eval[math]`.
Local dependencies for development are never published. They include everything needed during development, such as tools for testing, code checking, and documentation. They are used at different stages: testing dependencies (`pytest`, `coverage`) and type-checking and linting dependencies (`mypy`, `ruff`) may only be needed in CI/CD; documentation tools (`sphinx`, `mkdocs`) may only be needed when publishing docs. Not every development dependency needs to be installed locally, so it makes sense to split them into groups.
Specifying Dependency Version Ranges
Generally, we can assume that the latest version of a dependency available at development time is the one our package is most compatible with, so we can set the lower bound to that version. We should also set an upper bound, usually the next major version (the first non-zero component plus one), e.g., `>=1.0.0, <2.0.0` or `>=0.2.0, <0.3.0`.
This is because a bump in the first non-zero component usually signals a breaking change; as long as that component doesn’t change, we can assume the dependency’s API hasn’t broken.
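The rule above is mechanical enough to write down. The helper below is a hypothetical sketch (`compatible_range` is not part of any packaging tool) that turns a plain numeric version such as `1.4.2` into the range described here; it does not handle pre-release or other non-numeric version parts.

```python
def compatible_range(version: str) -> str:
    """Build '>=version, <next breaking version' from a numeric version."""
    parts = [int(part) for part in version.split(".")]
    for index, part in enumerate(parts):
        if part != 0:
            break
    else:
        raise ValueError("version must contain a non-zero component")
    # Bump the first non-zero component and reset everything after it.
    upper = parts[:index] + [part + 1] + [0] * (len(parts) - index - 1)
    return f">={version}, <{'.'.join(map(str, upper))}"
```

For example, `compatible_range("1.4.2")` gives `>=1.4.2, <2.0.0`, and `compatible_range("0.2.0")` gives `>=0.2.0, <0.3.0`.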
Lazy Imports and Guided Installation
In many open-source repositories, `torch` is difficult to manage as a dependency because its installation is platform-dependent and it’s huge.
When using huggingface/transformers, you may also have encountered errors about missing packages like `sentencepiece` or `einops`, which are dependencies for specific models.
For such special dependencies, we don’t want users to install them at `pip install` time. Instead, we want to check at runtime whether they’re installed and, if not, provide installation guidance. A decorator such as `require_python_package`, kept in a module like `dependency.py`, achieves this.
Functions decorated with it check at runtime whether the specified package is installed. If not, an `ImportError` is raised with installation instructions. Note that you should lazily import the package inside the function, not at the top of the file; otherwise the import fails as soon as the module is loaded, and users see errors for features they never use.
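A minimal sketch of such a decorator is shown below. The name `require_python_package` follows the one used in this post, but the parameters (`import_name`, `install_command`) and the message wording are assumptions of this sketch, not a copy of any project’s code.

```python
import functools
import importlib.util


def require_python_package(import_name, install_command=None):
    """Check that a package is importable before running the function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # find_spec returns None when a top-level package is not installed.
            if importlib.util.find_spec(import_name) is None:
                hint = install_command or f"pip install {import_name}"
                raise ImportError(
                    f"'{import_name}' is required by {func.__name__}(). "
                    f"Install it with: {hint}"
                )
            return func(*args, **kwargs)
        return wrapper
    return decorator
```

A decorated function would then import the package inside its own body, so that the check runs first and the heavy import only happens when the feature is actually used.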
Finally, developers should not have to wait until runtime to discover and install these dependencies one by one. So, we should include them in a catch-all optional dependency group, the `all` extras group, so that installing `my-package[all]` installs all of them at once.
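In `pyproject.toml`, this corresponds to an `all` entry under `project.optional-dependencies`. Keeping that entry in sync by hand is error-prone, so a small helper can compute it. The sketch below is hypothetical: `build_all_extra` and the example group names and requirements are invented, and the extras are passed as a plain dict rather than parsed from a real file.

```python
def build_all_extra(extras):
    """Return the sorted union of every extras group except 'all'."""
    merged = set()
    for name, requirements in extras.items():
        if name != "all":
            merged.update(requirements)
    return sorted(merged)


# Hypothetical extras, shaped like project.optional-dependencies.
example_extras = {
    "plots": ["matplotlib"],
    "io": ["pyarrow", "fsspec"],
}
example_extras["all"] = build_all_extra(example_extras)
```

A check like `example_extras["all"] == build_all_extra(example_extras)` could run in CI to catch an `all` group that has drifted from the individual groups.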
Code Structure
Here’s what a modern Python package’s GitHub repository typically contains and how it’s organized.
Repository Root Directory
- README.md: Introduction, usage, installation instructions, etc.
- LICENSE: License file.
- pyproject.toml: Python package metadata, dependencies, build backend, and tool settings (mypy, uv, pytest, etc.).
- Makefile: Contains common commands like `make test`, `make build`, `make clean`, etc.
- .gitignore: Files to ignore in Git, such as build artifacts and test outputs. Recommended: use the `Python.gitignore` from github/gitignore.
- .pre-commit-config.yaml: Pre-commit hook configuration for code style checks before commits.
- src/my_package/ or my_package/: Source code directory. This is what gets installed via pip. Adding a `src/` layer helps avoid import confusion and allows managing multiple packages in one repo (e.g., `src/my_package/` and `src/my_other_package/`).
- tests/: Test code directory. `Pytest` is recommended for its simplicity and Pythonic style.
- docs/: Documentation directory.
- .github/: GitHub configuration, including CI/CD workflows, issue templates, PR templates, etc. See MemTensor/MemOS for examples.
- etc.
Python Package Directory
Inside src/my_package/ or my_package/, you often find the following (all present in MemTensor/MemOS):
- __init__.py: Package initializer, exposing core APIs.
- api: Core HTTP API code. Recommend using `FastAPI` for its simplicity.
- cli: Command-line interface code. Recommend using `argparse`, `click`, etc.
- configs: Configuration protocols, often using `Pydantic`.
- log.py: Custom logging handlers, usually based on Python’s `logging` module.
- exceptions.py: Error handling utilities.
- types.py: Custom type definitions.
- constants/settings: Package-level constants (should be minimal, e.g., `debug` flags).
- deprecation management: Tools for managing deprecated code.
- dependency management: Tools for dependency management, such as the `require_python_package` decorator described above.
- Other business logic.
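As an illustration of the deprecation management item, a small decorator built on the standard `warnings` module is often enough. The sketch below is an assumption about what such a tool could look like; the name `deprecated` and its `since` and `alternative` parameters are hypothetical.

```python
import functools
import warnings


def deprecated(*, since, alternative=None):
    """Mark a function as deprecated and warn each time it is called."""
    def decorator(func):
        message = f"{func.__name__}() is deprecated since version {since}."
        if alternative:
            message += f" Use {alternative} instead."

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 points the warning at the caller, not at wrapper().
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)
        return wrapper
    return decorator
```

Keeping this in one module means every deprecated API warns in the same format, which makes it easier to find and remove old code at the next major version.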
Appendix
Python Package: dowhen
gaogaotiantian/dowhen is an instrumentation tool for testing, debugging, software security analysis, etc. Python does not have a built-in instrumentation tool. Python core developer gaogaotiantian developed this tool using the new `sys.monitoring` feature introduced in Python 3.12.
The core APIs of `dowhen` are just two: one for specifying what to execute (`callback.py`, implementing the `do` part), and one for specifying when to execute (`trigger.py`, implementing the `when` part).
To better combine `do` and `when` (e.g., providing context managers, trigger timing checks), the underlying layer is a `handler.py` module; to use the system’s `sys.monitoring`, there’s a lower-level `instrumenter.py` module. These four modules together form the `dowhen` package.
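To give a feel for the kind of hook `sys.monitoring` provides, here is a minimal sketch that records which lines of one function are executed. It is not `dowhen`’s instrumenter; `record_lines` is a hypothetical helper, it requires Python 3.12 or later, and it assumes tool id 3 or 4 is free.

```python
import sys


def record_lines(func):
    """Run func() and return (result, line offsets executed inside func)."""
    monitoring = sys.monitoring
    # Ids 3 and 4 are not pre-assigned to debuggers, coverage tools,
    # profilers, or optimizers; take whichever is currently unused.
    tool_id = next(i for i in (3, 4) if monitoring.get_tool(i) is None)
    offsets = []

    def on_line(code, line_number):
        # Store each line relative to the first line of the function.
        offsets.append(line_number - code.co_firstlineno)

    monitoring.use_tool_id(tool_id, "line-recorder-sketch")
    monitoring.register_callback(tool_id, monitoring.events.LINE, on_line)
    # Local events apply to this one code object only.
    monitoring.set_local_events(tool_id, func.__code__, monitoring.events.LINE)
    try:
        result = func()
    finally:
        # Undo everything so the tool id can be reused by others.
        monitoring.set_local_events(
            tool_id, func.__code__, monitoring.events.NO_EVENTS
        )
        monitoring.register_callback(tool_id, monitoring.events.LINE, None)
        monitoring.free_tool_id(tool_id)
    return result, offsets
```

A tool like `dowhen` has to layer triggers, callbacks, and cleanup on top of hooks of this kind, which is why the lower-level module is kept separate from the user-facing `do` and `when` parts.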
For basic usage, see the official documentation [11]. Here, I introduce the API design and implementation ideas behind the APIs.
Trigger Workflow
Callback Workflow
Citation
@article{shichaosong2025pythonpackagede,
title = {Python Package Design: API, Dependency and Code Structure},
author = {Shichao Song},
journal = {The Kiseki Log},
year = {2025},
month = {August},
url = {https://ki-seki.github.io/posts/250725-python-dev/}
}
References
[1] Python Packaging Authority. “Packaging Python Projects.” Packaging Python Projects, n.d., https://packaging.python.org/en/latest/tutorials/packaging-projects/. Accessed 3 Aug. 2025.
[2] Hoyt, Ben. “Pythonic API Design.” Ben Hoyt’s blog, n.d., https://benhoyt.com/writings/python-api-design/. Accessed 3 Aug. 2025.
[3] Ferrer, Miqui. “Python Packaging Best Practices.” Medium, n.d., https://medium.com/@miqui.ferrer/python-packaging-best-practices-4d6da500da5f. Accessed 3 Aug. 2025.
[4] Reitz, Kenneth, and Schlusser, Tanya. “Structuring Your Project.” The Hitchhiker’s Guide to Python, n.d., https://docs.python-guide.org/writing/structure/. Accessed 3 Aug. 2025.
[5] Rogozhnikov, Alex. “einops.” einops Docs, n.d., https://einops.rocks/. Accessed 3 Aug. 2025.
[6] Rosen, Stephen. “PEP 735 – Dependency Groups in pyproject.toml.” Python Enhancement Proposals, 20 Nov. 2023, https://peps.python.org/pep-0735/. Accessed 3 Aug. 2025.
[7] Cannon, Brett. “PEP 751 – A file format to record Python dependencies for installation reproducibility.” Python Enhancement Proposals, 24 July 2024, https://peps.python.org/pep-0751/. Accessed 3 Aug. 2025.
[8] Astral. “uv.” uv Docs, n.d., https://docs.astral.sh/uv/. Accessed 3 Aug. 2025.
[9] Astral. “Using uv with PyTorch.” uv Docs, n.d., https://docs.astral.sh/uv/guides/integration/pytorch/. Accessed 3 Aug. 2025.
[10] Astral. “Managing dependencies.” uv Docs, n.d., https://docs.astral.sh/uv/concepts/projects/dependencies/. Accessed 3 Aug. 2025.
[11] Gao, Tian. “dowhen documentation.” dowhen Docs, n.d., https://dowhen.readthedocs.io/en/latest/. Accessed 3 Aug. 2025.