Developing and Building

This section doesn't cover writing your core code in rust; see the excursions for that. It assumes that you already have your core code written (in rust):

Wrapping with pyo3

The relevant section of the pyo3 book does a great job of explaining how to wrap your code, so I'll just touch the highlights here:


  1. Name your library the way you want the module to appear in python. For example, name it fizzbuzzo3 if you want to import it using from fizzbuzz import fizzbuzzo3
  2. Use the cdylib library type
  3. Add a dependency to pyo3

Add the following to ./rust/fizzbuzzo3/Cargo.toml

  [lib]
  name = "fizzbuzzo3"
  path = "src/"
  crate-type = ["cdylib"]  # cdylib required for python import, rlib required for rust tests.

  [dependencies]
  pyo3 = { git = "", branch = "pyo3-testing" }

Note: for now this uses a git dependency to a branch on my fork - until either PR pyo3/#4099 lands or I pull the testing support out into an independent crate

Use the same name for the library and exported module

I didn't spend much time on this, but I couldn't get the import to work when the library and the imported module have different names. Renaming the library to fizzbuzzo3lib leads to a file like python/fizzbuzz/ being generated, but it is unusable:

>>> from fizzbuzz import fizzbuzzo3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'fizzbuzzo3' from 'fizzbuzz' (/workspaces/FizzBuzz/python/fizzbuzz/
>>> from fizzbuzz import fizzbuzzo3lib
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: dynamic module does not define module export function (PyInit_fizzbuzzo3lib)
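
The second error hints at why the names have to line up: for an extension module, CPython looks inside the shared library for an init function called PyInit_ followed by the module name. As a minimal sketch (the helper names init_symbol and candidate_filenames are made up for illustration), this shows what the import system would look for when asked for fizzbuzz.fizzbuzzo3:

```python
import importlib.machinery


def init_symbol(module_name: str) -> str:
    """Name of the init function CPython looks up for an (ASCII-named) extension module."""
    return "PyInit_" + module_name.rpartition(".")[2]


def candidate_filenames(module_name: str) -> list[str]:
    """File names the import system accepts for this extension module on this platform."""
    stem = module_name.rpartition(".")[2]
    return [stem + suffix for suffix in importlib.machinery.EXTENSION_SUFFIXES]


print(init_symbol("fizzbuzz.fizzbuzzo3"))  # PyInit_fizzbuzzo3
print(candidate_filenames("fizzbuzz.fizzbuzzo3"))  # platform dependent
```

If the file on disk is named after one thing and the exported init function after another, one of the two lookups fails, which matches the pair of errors above.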

rust/fizzbuzzo3/Cargo.toml - full source
[package]
name = "fizzbuzzo3"
version = "2.1.0" # Tracks python version
edition = "2021"

[lib]
name = "fizzbuzzo3"
path = "src/"
crate-type = ["cdylib"]  # cdylib required for python import, rlib required for rust tests.

[dependencies]
fizzbuzz = { path = "../fizzbuzz" }
pyo3 = "0.21.2"
rayon = "1.10.0"

[dev-dependencies]
pyo3-testing = "0.3.4"

# See more keys and their definitions at

Adding the wrapped module to your project

I chose to use setuptools & setuptools-rust as my build backend. PyO3 offers two build options: setuptools-rust and maturin. I preferred to try the first because:

  • I am already used to using setuptools and didn't want to change out a working system for something else
  • I found setuptools-rust to be very easy to use
  • The docs point out that it offers more flexibility and is a better fit for a use case where you may also have independent python code

Add the following to ./pyproject.toml

  [build-system]
  requires = ["setuptools", "setuptools-rust"]
  build-backend = "setuptools.build_meta"

  [[tool.setuptools-rust.ext-modules]]
  # The last part of the name (e.g. "_lib") has to match lib.name in Cargo.toml,
  # but you can add a prefix to nest it inside of a Python package.
  target = "fizzbuzz.fizzbuzzo3"
  path = "rust/fizzbuzzo3/Cargo.toml"
  binding = "PyO3"
  features = ["pyo3/extension-module"] # IMPORTANT!!!
  debug = false # Adds `--release` to `pip install -e .` builds, necessary for performance testing

Avoid errors packaging for linux

It is important to specify features = ["pyo3/extension-module"] in ./pyproject.toml to avoid linking the python interpreter into your library and failing quality checks when trying to package for linux.

Background is available by combining the pyo3 FAQ and manylinux specification

Missing out on 4-7x performance gains

For performance benchmarking or faster tests, add debug = false to ./pyproject.toml. This ensures your Rust code is built with the right optimisations even when installing in editable mode via pip install -e .. Without this, only install and wheel builds will be optimised.

The default release profile is well documented in The Cargo Book. I found a 4-7x performance boost when I enabled this!

    # Additional ignores for tests
    "**/test_*.py" = [
        "INP001",  # Missing
        "ANN",     # Missing type annotations
        "S101",    # Use of `assert`
        "PLR2004", # Magic number comparisons are OK in tests
        "D1",      # Don't REQUIRE docstrings for tests - but they are nice
    ]

    "**/" = [
        "D104", # Don't require module docstring in
        "F401", # Unused imports are fine: using to expose them with implicit __ALL__
    ]

Python virtual environment & build

I like to keep things as simple as possible. Python has many virtual environment managers; venv is part of the standard library and does everything we need while leaving us in control of the entire build and integration process.

Quick start with justfile

The justfile ./justfile handles all of this for you. Feel free to copy it.

# build and test a wheel (a suitable venv must already be active!)
test-wheel: clean
  cibuildwheel --only cp312-manylinux_x86_64

Creating a virtual environment with venv

If you are unfamiliar with venv, here are the docs.

Depending on your distro, you may need to install venv as a separate package.

Creating a virtual environment is as simple as:

/projectroot$ python -m venv .venv

Sourcing development dependencies from ./pyproject.toml

To provide a single line python build for local development you will need to source your development dependencies from ./pyproject.toml. These can be split into multiple groups to give more control during automated processes where you don't need everything.

Add the following to ./pyproject.toml

  [project.optional-dependencies]
  lint = ["ruff"]
  test = ["pytest", "pytest-doctest-mkdocstrings"]
  cov = ["fizzbuzz[test]", "pytest-cov"]
  doc = ["black", "mkdocs", "mkdocstrings[python]", "mkdocs-material"]
  dev = ["fizzbuzz[lint,test,cov,doc]"]

Building and installing the wrapped rust code for use in python development

Before you can use the wrapped rust code you need to build the equivalent of python/fizzbuzz/

  1. Make sure your virtual environment is active. If not run . .venv/bin/activate (note the leading dot, which is easier than typing source all the time)
  2. Then simply use pip to create an editable installation of your codebase:
    (.venv)/projectroot$ pip install -e .[dev]
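
If you are ever unsure whether step 1 worked, you can ask the interpreter itself. A minimal sketch (in_virtualenv is a name chosen here for illustration) relying on the standard library's sys.prefix and sys.base_prefix:

```python
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the interpreter it was created from.
    return sys.prefix != sys.base_prefix


print("venv active" if in_virtualenv() else "no venv active")
print(f"interpreter prefix: {sys.prefix}")
```

A check like this could guard a build script so that the editable install never lands in your system python by accident.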


Cleaning up old build artefacts

As with any built language it is a good idea to clean up old build artefacts before generating new ones, or at least before finalising a change. Cargo offers a simple cargo clean for this, but you will also have the python library and various python caches in place which can sometimes cause problems.

To clean both languages:

(.venv)/projectroot$ cargo clean || true
(.venv)/projectroot$ rm -rf .pytest_cache
(.venv)/projectroot$ rm -rf build
(.venv)/projectroot$ rm -rf dist
(.venv)/projectroot$ rm -rf wheelhouse
(.venv)/projectroot$ rm -rf .ruff_cache
(.venv)/projectroot$ find . -depth -type d -not -path "./.venv/*" -name "__pycache__" -exec rm -rf "{}" \;
(.venv)/projectroot$ find . -depth -type d -path "*.egg-info" -exec rm -rf "{}" \;
(.venv)/projectroot$ find . -type f -name "*.egg" -delete
(.venv)/projectroot$ find . -type f -name "*.so" -delete

Or just use just clean from the ./justfile
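
If you would rather not depend on find and rm (for example on Windows), the python half of the clean-up can be sketched with pathlib and shutil. This is an illustration under assumptions, not a copy of the justfile: clean and BUILD_DIRS are names made up here, cargo clean still has to be run separately, and this version skips everything under .venv, not only the __pycache__ directories.

```python
import shutil
import tempfile
from pathlib import Path

BUILD_DIRS = [".pytest_cache", "build", "dist", "wheelhouse", ".ruff_cache"]


def clean(root: Path) -> list[Path]:
    """Remove python build artefacts below root and return what was removed."""
    removed = []
    venv = root / ".venv"
    for name in BUILD_DIRS:
        path = root / name
        if path.is_dir():
            shutil.rmtree(path)
            removed.append(path)
    for pattern in ("__pycache__", "*.egg-info", "*.egg", "*.so"):
        # Materialise the matches first so the tree is not modified while it is walked.
        for path in list(root.rglob(pattern)):
            if venv in path.parents or not path.exists():
                continue  # leave the venv alone; skip entries removed with a parent
            if path.is_dir():
                shutil.rmtree(path)
            else:
                path.unlink()
            removed.append(path)
    return removed


# Try it on a throwaway directory tree rather than on a real project.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / "build").mkdir()
    (project / "python" / "fizzbuzz" / "__pycache__").mkdir(parents=True)
    (project / ".venv" / "lib").mkdir(parents=True)
    (project / ".venv" / "lib" / "keep.so").touch()
    for path in clean(project):
        print("removed", path.relative_to(project).as_posix())
```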

API design

There are a few things to consider when designing your API for python users.


Assuming part of the reason you are doing this is to provide a performance improvement over native python, you will want to consider the (small but noticeable) performance cost each time you cross the python-rust boundary. Discussion pyo3/#4085 covers this topic; further improvements are promised for the next versions of pyo3.

Pass Containers, don't make multiple calls

One simple way to avoid crossing the boundary often is to pass a list or similar rather than making multiple individual calls. The performance difference can be seen below:

Rust: [1 calls of 10 runs fizzbuzzing up to 1_000_000]
Python: [1 calls of 10 runs fizzbuzzing up to 1_000_000]
Rust vector: [1 calls of 10 runs fizzbuzzing a list of numbers up to 1_000_000]
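
A comparison like this can be produced with timeit from the standard library. The sketch below is only a harness: fizzbuzz and fizzbuzz_many are pure-python stand-ins (so both variants will take about the same time here), and compare is a name made up for illustration. Swap in the wrapped functions to measure the real cost of crossing the boundary.

```python
import timeit


def fizzbuzz(n: int) -> str:
    """Pure-python stand-in for the wrapped function."""
    if n % 15 == 0:
        return "fizzbuzz"
    if n % 3 == 0:
        return "fizz"
    if n % 5 == 0:
        return "buzz"
    return str(n)


def fizzbuzz_many(numbers: list[int]) -> list[str]:
    """Stand-in for a single call which receives the whole container."""
    return [fizzbuzz(n) for n in numbers]


def compare(upto: int = 10_000, calls: int = 3, runs: int = 10) -> dict[str, list[float]]:
    """Time `calls` repetitions of `runs` runs for both calling styles."""
    numbers = list(range(1, upto + 1))
    return {
        "one call per number": timeit.repeat(
            lambda: [fizzbuzz(n) for n in numbers], repeat=calls, number=runs
        ),
        "one call with a list": timeit.repeat(
            lambda: fizzbuzz_many(numbers), repeat=calls, number=runs
        ),
    }


for label, timings in compare().items():
    print(f"{label}: [3 calls of 10 runs fizzbuzzing up to 10000]")
    print(timings)
```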

Use the rayon crate to break the GIL and run parallel calculations

Adding parallel processing to a rust iterator is insanely simple. My impression of rust is "easy things are hard, hard things are easy!"

I simply added the following to my core rust code /rust/fizzbuzz/src/

use rayon::prelude::*;
static BIG_VECTOR: usize = 300_000; // Size from which parallelisation makes sense
impl<Num> MultiFizzBuzz for Vec<Num>
where
    Num: FizzBuzz + Sync,
{
    fn fizzbuzz(&self) -> FizzBuzzAnswer {
        if self.len() < BIG_VECTOR {
            FizzBuzzAnswer::Many(self.iter().map(|n| n.fizzbuzz().into()).collect())
        } else {
            FizzBuzzAnswer::Many(self.par_iter().map(|n| n.fizzbuzz().into()).collect())
        }
    }
}

Check it makes sense

Adding parallel processing doesn't always make sense, as spinning up and managing a thread pool adds overhead. You will want to do some benchmarking to find the sweet spot. Benchmarking and performance testing is a topic in itself, so I'll add a dedicated section ...
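
The shape of that decision (stay sequential below a size threshold, split the work above it, keep the answers in order) can be sketched in python too. Note the assumptions: CPython threads will not speed up CPU-bound work because of the GIL, so this illustrates only the dispatch pattern and the ordering guarantee, not the speed-up rayon gives in rust. BIG_LIST, the worker count and the function names are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

BIG_LIST = 300_000  # assumed threshold, mirroring BIG_VECTOR; find yours by benchmarking


def fizzbuzz(n: int) -> str:
    """Pure-python stand-in for the core function."""
    if n % 15 == 0:
        return "fizzbuzz"
    if n % 3 == 0:
        return "fizz"
    if n % 5 == 0:
        return "buzz"
    return str(n)


def fizzbuzz_chunk(chunk: list[int]) -> list[str]:
    return [fizzbuzz(n) for n in chunk]


def fizzbuzz_many(numbers: list[int], threshold: int = BIG_LIST, workers: int = 4) -> list[str]:
    """Process small inputs directly and large inputs in chunks on a thread pool."""
    if len(numbers) < threshold:
        return fizzbuzz_chunk(numbers)
    size = max(1, -(-len(numbers) // workers))  # ceiling division, at least 1
    chunks = [numbers[i:i + size] for i in range(0, len(numbers), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in submission order, so the output stays well ordered.
        return [answer for part in pool.map(fizzbuzz_chunk, chunks) for answer in part]


print(fizzbuzz_many(list(range(1, 16)), threshold=10))
```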

Even more speed by passing a range (and implementing IntoPy and FromPyObject traits)

The world obviously needs the most performant fizzbuzz available! In an attempt to squeeze out even more speed I tried completely avoiding the need to build and pass a list and instead (ab)used a python slice to provide start, stop, and optional step values. This gave another 1.5x speed boost. Surprisingly most of that comes from passing the list to rust, not creating it or processing it:

Timeit results

Rust: [3 calls of 10 runs fizzbuzzing up to 1000000]
[13.941677560000244, 12.671054376998654, 12.669853160998173]
Rust vector: [3 calls of 10 runs fizzbuzzing a list of numbers up to 1000000]
[5.104824486003054, 4.96210950999739, 4.903727466000419]
Rust vector, with python list overhead: [3 calls of 10 runs creating and fizzbuzzing a list of numbers up to 1000000]
[5.363066075999086, 5.316481181002018, 5.361383773997659]
Rust range: [3 calls of 10 runs fizzbuzzing a range of numbers up to 1000000]
[3.8294942710017494, 3.8227306799999496, 3.800879727001302]
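
To see where the time goes, it helps to average the three calls for each variant and compare them. The numbers below are copied from the timeit results above; the variable names are just for illustration.

```python
from statistics import mean

timings = {
    "rust": [13.941677560000244, 12.671054376998654, 12.669853160998173],
    "rust_vector": [5.104824486003054, 4.96210950999739, 4.903727466000419],
    "rust_vector_with_list": [5.363066075999086, 5.316481181002018, 5.361383773997659],
    "rust_range": [3.8294942710017494, 3.8227306799999496, 3.800879727001302],
}
means = {name: mean(values) for name, values in timings.items()}

# Speed-up of passing a range compared with creating and passing a list
speedup = means["rust_vector_with_list"] / means["rust_range"]
# Time spent creating the python list (seconds per 10 runs)
list_creation = means["rust_vector_with_list"] - means["rust_vector"]
# Time saved by not handing a ready-made list over to rust (seconds per 10 runs)
saved_by_range = means["rust_vector"] - means["rust_range"]

print(f"range vs. list including creation: {speedup:.2f}x")  # 1.40x
print(f"creating the list: {list_creation:.2f}s")  # 0.36s
print(f"saved by passing a range instead of a list: {saved_by_range:.2f}s")  # 1.17s
```

On these figures, creating the list accounts for roughly a quarter of the gain; the rest comes from not having to hand the list over to rust.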

Criterion bench results

                    time:   [62.035 ms 63.960 ms 65.921 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

                        time:   [60.295 ms 62.228 ms 64.228 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Excursion to follow ...

This change was quite in depth, so expect an excursion later on the changes to the core rust/fizzbuzz/src/

On the pyo3 side this involved the following in rust/fizzbuzzo3/src/

  1. Creating a struct MySlice to hold the start, stop and step values which:
    1. Can be created from a python slice:
      #[derive(FromPyObject)]
      struct MySlice {
          start: isize,
          stop: isize,
          step: Option<isize>,
      }
    2. Can be converted into a pyo3 PySlice:
      impl IntoPy<Py<PyAny>> for MySlice {
          fn into_py(self, py: Python<'_>) -> Py<PyAny> {
              PySlice::new_bound(py, self.start, self.stop, self.step.unwrap_or(1)).into_py(py)
          }
      }
      Note: There is no rust standard type which pyo3 maps to a slice.
  2. Parsing the slice to provide equivalent logic to python for negative steps:
    fn py_fizzbuzz(num: FizzBuzzable) -> PyResult<String> {
        match num {
            FizzBuzzable::Slice(s) => match s.step {
                None => Ok((s.start..s.stop).fizzbuzz().into()),
                Some(1) => Ok((s.start..s.stop).fizzbuzz().into()),
                Some(step) => match step {
                    1.. => Ok((s.start..s.stop)
                        // ...
                    ),
                    //  ```python
                    //  >>> foo[1:5:0]
                    //  Traceback (most recent call last):
                    //    File "<stdin>", line 1, in <module>
                    //  ValueError: slice step cannot be zero
                    //  ```
                    0 => Err(PyValueError::new_err("step cannot be zero")),
                    //  ```python
                    //  >>> foo=[0,1,2,3,4,5,6]
                    //  >>> foo[6:0:-2]
                    //  [6, 4, 2]
                    //  ```
                    // Rust doesn't accept step < 0 or stop < start so need some trickery
                    ..=-1 => Ok((s.start.neg()..s.stop.neg())
                        .map(|x| x.neg())
                        // ...
                    ),
                },
            },
            // ...
        }
    }
  3. Quite a bit of extra testing ...

#[cfg(test)]
mod test {

    use super::*;

    #[test]
    fn big_vector_is_well_ordered() {
        let input: Vec<_> = (1..BIG_VECTOR + 2).collect();
        let output: Vec<FizzBuzzAnswer> = input.clone().fizzbuzz().collect();
        let mut expected: Vec<FizzBuzzAnswer> = vec![];
        for i in input.iter() {
            // ...
        }
        assert_eq!(output, expected);
    }

    #[test]
    fn fizzbuzz_range() {
        let input = 1..20;
        let mut expected: Vec<FizzBuzzAnswer> = vec![];
        for i in 1..20 {
            // ...
        }
        let output: Vec<FizzBuzzAnswer> = input.fizzbuzz().collect();
        assert_eq!(output, expected)
    }
}
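
The slice handling from step 2 is easier to follow with the python behaviour next to it. The sketch below is a pure-python model, not the wrapped code: fizzbuzz is a stand-in, fizzbuzz_slice is a hypothetical name, and start and stop are assumed to be integers (as they are in MySlice). Python's range accepts negative steps natively, so the negation is there purely to mirror the rust trick of walking the negated bounds upwards and flipping each value back.

```python
def fizzbuzz(n: int) -> str:
    """Pure-python stand-in for the core function."""
    if n % 15 == 0:
        return "fizzbuzz"
    if n % 3 == 0:
        return "fizz"
    if n % 5 == 0:
        return "buzz"
    return str(n)


def fizzbuzz_slice(s: slice) -> list[str]:
    """Fizzbuzz the numbers described by a slice with integer start and stop."""
    step = 1 if s.step is None else s.step
    if step == 0:
        raise ValueError("step cannot be zero")
    if step > 0:
        numbers = range(s.start, s.stop, step)
    else:
        # Walk the negated bounds upwards with a positive step, then negate each value back.
        numbers = (-x for x in range(-s.start, -s.stop, -step))
    return [fizzbuzz(n) for n in numbers]


print(fizzbuzz_slice(slice(4, 1, -1)))  # ['4', 'fizz', '2']
print(fizzbuzz_slice(slice(1, 5, -1)))  # []
```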

Ducktyping & Union types

Remember your primary users are python coders who are used to duck typing 🦆. They will expect fizzbuzz(3.0) to return '3.0' and fizzbuzz(3.1) to return '3.1' unless something is documented regarding rounding to the nearest integer or similar. (Leaving aside any discussion on why floats are inaccurate).

Python also often provides single functions which can receive multiple significantly different types for a single argument: e.g. fizzbuzz([1,2,3]) and fizzbuzz(3) could easily both work. The function signature would be def fizzbuzz(n: int | list[int]) -> str:.

Use a custom enum and match to allow multiple types

This is best done directly in your wrapping library as it is part of the rust-python interface not the core functionality.

In rust/fizzbuzzo3/src/ I used this pattern:

#[derive(FromPyObject)]
enum FizzBuzzable {
    // ...
}

#[pyfunction]
#[pyo3(name = "fizzbuzz", text_signature = "(n)")]
fn py_fizzbuzz(num: FizzBuzzable) -> String {
    match num {
        FizzBuzzable::Int(n) => n.fizzbuzz().into(),
        FizzBuzzable::Float(n) => n.fizzbuzz().into(),
        FizzBuzzable::Vec(v) => v.fizzbuzz().into(),
    }
}

Union type returns: def fizzbuzz(n: int | list[int]) -> str | list[str]

If you would like to provide different return types for different cases:

  1. Implement an enum, or a wrapper struct around an existing enum, that holds the different types.
  2. Provide one or more conversion From traits to convert from the return of your core rust functions.
  3. Provide a conversion IntoPy trait to convert to the relevant PyO3 types.
  4. Use this new type as the return of your wrapped function.

    In /rust/fizzbuzzo3/src/

    struct FizzBuzzReturn(FizzBuzzAnswer);
    impl From<FizzBuzzAnswer> for FizzBuzzReturn {
        fn from(value: FizzBuzzAnswer) -> Self {
            FizzBuzzReturn(value)
        }
    }
    impl IntoPy<Py<PyAny>> for FizzBuzzReturn {
        fn into_py(self, py: Python<'_>) -> Py<PyAny> {
            match self.0 {
                FizzBuzzAnswer::One(string) => string.into_py(py),
                FizzBuzzAnswer::Many(list) => list.into_py(py),
            }
        }
    }
    #[pyfunction]
    #[pyo3(name = "fizzbuzz", text_signature = "(n)")]
    fn py_fizzbuzz(num: FizzBuzzable) -> PyResult<FizzBuzzReturn> {
        // ...
    }

    Thanks to the comments in Issue pyo3/#1637 for pointers on how to get this working.

  5. Add @overload hints for your IDE (see IDE type & doc hinting), so that it understands the relationships between input and output types:

    In /python/fizzbuzz/fizzbuzzo3.pyi:

    from typing import overload

    @overload
    def fizzbuzz(n: int) -> str: ...
    @overload
    def fizzbuzz(n: list[int] | slice) -> list[str]: ...
    def fizzbuzz(n):
        """Returns the correct fizzbuzz answer for any number or list/range of numbers."""


python/fizzbuzz/fizzbuzzo3.pyi - full source
# flake8: noqa: PYI021
"""
An optimised rust version of fizzbuzz.

Provides a fizzbuzz() function which will run on multiple CPU cores if needed.

    >>> from fizzbuzz.fizzbuzzo3 import fizzbuzz
"""

from typing import overload

@overload
def fizzbuzz(n: int) -> str: ...

@overload
def fizzbuzz(n: list[int] | slice) -> list[str]: ...

def fizzbuzz(n):
    """
    Returns the correct fizzbuzz answer for any number or list/range of numbers.

    This is an optimised algorithm compiled in rust. Large lists will utilise multiple CPU cores for processing.
    Passing a slice, to represent a range, is fastest.

    Arguments:
        n: the number(s) to fizzbuzz

    Returns:
        In the case of a single number: a `str` with the correct fizzbuzz answer.
        In the case of a list or range of inputs: a `list` of `str` with the correct fizzbuzz answers.

    Examples:
        a single `int`:
        >>> from fizzbuzz.fizzbuzzo3 import fizzbuzz
        >>> fizzbuzz(1)
        '1'
        >>> fizzbuzz(3)
        'fizz'

        a `list`:
        >>> from fizzbuzz.fizzbuzzo3 import fizzbuzz
        >>> fizzbuzz([1,3])
        ['1', 'fizz']

        a `slice` representing a range:
        >>> from fizzbuzz.fizzbuzzo3 import fizzbuzz
        >>> fizzbuzz(slice(1,4,2))
        ['1', 'fizz']
        >>> fizzbuzz(slice(1,4))
        ['1', '2', 'fizz']
        >>> fizzbuzz(slice(4,1,-1))
        ['4', 'fizz', '2']
        >>> fizzbuzz(slice(1,5,-1))
        []

        Note: Slices are inclusive on the left, exclusive on the right and can contain an optional step.
        Negative steps require start > stop, positive steps require stop > start; other combinations return `[]`.
        A step of zero is invalid and will raise a `ValueError`.
    """

IDE type & doc hinting

Pyo3 does a great job automatically exporting inline rust documentation (using /// ...) as python docstrings. It also creates a simple attribute detailing the function signature, which you can manually adjust.

IDEs, linters, etc. don't actually import your code to read the docstrings and signatures; they parse the source code, and with the source in rust they can't do this directly for your wrapped modules.

Autogenerating hints

Because I hate copy-pasting stuff I created pyo3-stubgen to auto-generate the information. It is available on PyPI (pip install pyo3-stubgen), has a simple command line interface and can also be called from python if you prefer.
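
The general idea behind generating stubs is that the information is available at runtime once the module has been imported, even though it is not available in any python source file. The sketch below illustrates that idea with the standard library's inspect module on a plain python function; it is not how pyo3-stubgen is implemented, and make_stub is a hypothetical helper. For a compiled function you would typically get the text signature without type annotations, so the types still need adding by hand.

```python
import inspect


def make_stub(func) -> str:
    """Render a .pyi-style stub for func from what is visible at runtime."""
    signature = inspect.signature(func)
    doc = inspect.getdoc(func)
    header = f"def {func.__name__}{signature}:"
    if not doc:
        # No docstring: the stub body is just an ellipsis
        return f"{header} ...\n"
    body = "\n".join(f"    {line}" if line else "" for line in doc.splitlines())
    return f'{header}\n    """\n{body}\n    """\n'


def fizzbuzz(n: int) -> str:
    """Returns the correct fizzbuzz answer for a number."""
    return "fizz" if n % 3 == 0 else str(n)


def undocumented(x):
    return x


print(make_stub(fizzbuzz))
print(make_stub(undocumented))
```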

Create a .pyi file

  1. Create a stub file with:
    • the same name as your exported module
    • the extension .pyi
    • in the location you would otherwise have placed the file
  2. Add function definitions with type hints and docstrings but no code
  3. For functions with no docstrings enter ... as the function body
  4. Add the .pyi extension to the files checked by doctest: ./pyproject.toml:
      addopts = [

Pyo3 discusses this topic in Appendix C.
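
To get a feel for what checking the examples in a stub file involves, here is a minimal sketch using the standard library's doctest module directly on a piece of text. It is an illustration under assumptions, not the pytest configuration from this project: fizzbuzz is a pure-python stand-in for the wrapped function, and STUB_DOCSTRING and run_examples are names made up here.

```python
import doctest

STUB_DOCSTRING = """
    >>> fizzbuzz(1)
    '1'
    >>> fizzbuzz([1, 3])
    ['1', 'fizz']
"""


def fizzbuzz(n):
    """Pure-python stand-in for the wrapped function."""

    def one(i: int) -> str:
        if i % 15 == 0:
            return "fizzbuzz"
        if i % 3 == 0:
            return "fizz"
        if i % 5 == 0:
            return "buzz"
        return str(i)

    return [one(i) for i in n] if isinstance(n, list) else one(n)


def run_examples(text: str, namespace: dict) -> doctest.TestResults:
    """Parse the `>>>` examples out of text and run them against namespace."""
    test = doctest.DocTestParser().get_doctest(text, dict(namespace), "stub", "stub.pyi", 0)
    return doctest.DocTestRunner(verbose=False).run(test)


results = run_examples(STUB_DOCSTRING, {"fizzbuzz": fizzbuzz})
print(f"{results.attempted} examples run, {results.failed} failed")
```

Because the examples are taken from text rather than from an imported module, the same approach works for a .pyi file, as long as the names used in the examples can be imported when the examples run.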