Notes on Python package setup

Creating Python packages is a little fiddlier than one might hope.

Operating system prerequisites

If your package has non-standard system dependencies, there are a number of options:

  1. Manual installation by the user.

  2. Packaging your Python code within an OS package, such as:

    • a .deb file, for Debian Linux, easily installable via gdebi, e.g. sudo gdebi DEBFILE;

    • an .rpm file, for Red Hat Linux and derivatives, creatable from a .deb file via alien, and easily installable via sudo yum install RPMFILE;

    • Windows packaging.

  3. Docker

    In general, this is preferred, because it guarantees the OS environment exactly, is fairly simple to install, and performance remains good.

Python package dependencies: install_requires versus requirements.txt


Code in setup.py needs to cope with (a) installation, as in pip install ., and (b) package creation, as in python setup.py sdist.


This is a standard Python problem: “my_package depends on other_package version 1.2.3”.

  • requirements.txt is read by “bots” such as Dependabot on GitHub, so if this is your primary list of requirements, automatic pull requests will work. It’s also read when users install from it manually (pip install -r requirements.txt), and by PyCharm and other IDEs.

    • But possibly it works without this file? Yes; it should. See below.

  • The install_requires=[...] parameter of setup(...) in setup.py is read by pip.

How do these differ?


  • It’s better to specify a dependency version range when you are providing libraries, and an exact version when you are providing an application.

  • When you run a user-defined script from your package, it calls pkg_resources.load_entry_point(dist, group, name) (this is part of setuptools). This doesn’t seem to read or check requirements.txt.

  • pip uses install_requires and not requirements.txt when installing your package.


One “single-source” approach is to define a variable such as INSTALL_REQUIRES in setup.py that is both passed to setup(..., install_requires=INSTALL_REQUIRES) and used to write requirements.txt (e.g. via an extra call by the developer: python setup.py --extras).
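
A minimal sketch of this first approach. The helper name write_requirements() and the pinned version are illustrative assumptions, not taken from any real setup.py, and the handling of the --extras switch is not shown:

```python
# Hypothetical single source of truth; the pinned version is illustrative only.
INSTALL_REQUIRES = [
    "semantic_version==2.8.5",
]


def write_requirements(path, requirements):
    """Write requirements to a requirements.txt-style file, one per line."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("# Generated from INSTALL_REQUIRES in setup.py; do not edit.\n")
        for requirement in requirements:
            f.write(requirement + "\n")


# In setup.py one would then pass the same list to setuptools, e.g.
#     setup(..., install_requires=INSTALL_REQUIRES)
# and call write_requirements("requirements.txt", INSTALL_REQUIRES)
# only when the developer asks for it.
```

The generated file then exists only for the benefit of tools that look for requirements.txt; pip still takes its information from install_requires.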

Another is to read and parse requirements.txt in setup.py.
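
A rough sketch of the parsing approach, assuming a hypothetical helper parse_requirements(). It handles only the simplest file layout: blank lines, whole-line comments, and options beginning with "--" are skipped, while inline comments and "-r"/"-e" lines are not handled at all:

```python
def parse_requirements(text):
    """Return the requirement specifiers found in requirements.txt-style text."""
    requirements = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and pip options such as --index-url.
        if not line or line.startswith("#") or line.startswith("--"):
            continue
        requirements.append(line)
    return requirements


# In setup.py this might be used roughly as follows (not executed here):
#     with open("requirements.txt", encoding="utf-8") as f:
#         INSTALL_REQUIRES = parse_requirements(f.read())
#     setup(..., install_requires=INSTALL_REQUIRES)
```

One cost of this design is that requirements.txt must then be shipped inside the source distribution, since setup.py reads it at install time.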

Experimenting with a package that has a simple requirement for semantic_version:

  • No requirements specified: code will crash at runtime with ModuleNotFoundError: No module named 'semantic_version'

  • install_requires only:

    • PyCharm notices (even if indirected via a variable).

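      But note: it will cope with simple indirection (e.g. a requirements list held in a plain variable), but not with more complex indirection, e.g.

      from io import StringIO
      INSTALL_REQUIRES = []
      with StringIO(REQUIREMENTS_TEXT) as f:
          for line in f.readlines():
              line = line.strip()
              if (not line) or line.startswith('#') or line.startswith('--'):
                  continue  # assumed completion: skip blanks, comments, pip options
              INSTALL_REQUIRES.append(line)  # assumed completion: keep the rest

    • Dependabot is meant to notice. Its code suggests it will cope with arbitrary indirection.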

    • pip install does what’s required and the code runs.

  • requirements.txt only:

    • PyCharm notices.

    • We know Dependabot notices.

    • pip install does NOT install the necessary dependencies.

    So this option is useless.

The next question is whether requirements.txt is necessary at all. One view (e.g. on Reddit) is that it can be kept for development environments, i.e. for the extras required for development but not for running your package.


  • For package distribution, install_requires in setup.py is mandatory, and requirements.txt is optional and therefore perhaps best avoided so that automatic code analysis tools don’t get confused.

Data and other non-Python files: setup.py versus MANIFEST.in

Here’s another tricky thing. In setup.py, you have package_data and include_package_data arguments to setup(). There is also the file MANIFEST.in.

# MANIFEST.in or package_data?
#
# or both?
# ... MANIFEST gets the files into the distribution
# ... package_data gets them installed in the distribution
#
# data_files is from distutils, and we’re using setuptools


The last of these references, in particular, suggests that both MANIFEST.in (required for sdist) and package_data (used for install) are necessary. However, it seems that you can use just MANIFEST.in if you specify include_package_data=True.

For complex file specification, you could build the file list in Python and then write it to MANIFEST.in, but actually the manifest syntax is quite good.

So, the two realistic options are:

  1. Have a setup.py that auto-writes MANIFEST.in when required.

  2. Specify MANIFEST.in properly and use include_package_data=True. This is probably better.
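
For comparison, the first option might look something like the sketch below. The function name write_manifest() and its arguments are hypothetical; only the recursive-include and global-exclude commands are standard manifest syntax:

```python
def write_manifest(path, package, data_patterns, excluded_extensions):
    """Write a simple MANIFEST.in and return the lines written."""
    lines = []
    # Pull matching data files from anywhere under the package directory.
    for pattern in data_patterns:
        lines.append(f"recursive-include {package} {pattern}")
    # Drop unwanted file types wherever they occur in the source tree.
    if excluded_extensions:
        globs = " ".join(f"*.{ext}" for ext in excluded_extensions)
        lines.append(f"global-exclude {globs}")
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
    return lines


# Possible use from setup.py, before calling setup() (not executed here):
#     write_manifest("MANIFEST.in", "my_package", ["*.txt"], ["pyc", "pyo"])
```

A hand-written manifest avoids this extra moving part, which is one reason to favour the second option.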


Use MANIFEST.in plus setup(..., include_package_data=True). Use the full syntax available for MANIFEST.in.

To find all extensions (for the global-exclude command), use:

find . -type f | perl -ne 'print $1 if m/\.([^.\/]+)$/' | sort -u
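
Where perl is not available, a rough Python equivalent is sketched below. The function name find_extensions() is an assumption for illustration. Like the shell pipeline, it reports whatever follows the last dot in each filename, so a hidden file such as .gitignore yields "gitignore"; unlike find -type f, it also counts symlinks to files:

```python
import os


def find_extensions(root="."):
    """Return the sorted, unique file extensions found under root."""
    extensions = set()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Split on the last dot; names without a dot have no extension.
            _stem, dot, ext = name.rpartition(".")
            if dot and ext:
                extensions.add(ext)
    return sorted(extensions)


# Example use from the top of the source tree (not executed here):
#     print("\n".join(find_extensions(".")))
```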

Beware a nasty caching effect

Consider deleting any old MY_PACKAGE_NAME.egg-info directory from within setup.py, before calling setup(). This may be particularly applicable for packages that ship “data”. See,-more-lies-and-python-packaging-documentation-on–package_data-/

Like this, for example:


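import os
import shutil

PACKAGE_NAME = "my_package"  # placeholder; use your package's own name
THIS_DIR = os.path.abspath(os.path.dirname(__file__))  # contains setup.py
EGG_DIR = os.path.join(THIS_DIR, PACKAGE_NAME + ".egg-info")

# Remove stale metadata so that setup() rebuilds the file list from scratch.
shutil.rmtree(EGG_DIR, ignore_errors=True)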


This is perhaps meant to be unnecessary, but maybe isn’t.

It appears to be unnecessary once you shift to MANIFEST.in and include_package_data=True.