Notes on Python package setup
Creating Python packages is a little fiddlier than one might hope.
Operating system prerequisites
If your package has non-standard system dependencies, there are a number of options:
Manual installation by the user.
Packaging your Python code within an OS package, such as:
-
In general, this is preferred, because it guarantees the OS environment exactly, is fairly simple to install, and performance remains good.
Python package dependencies: install_requires versus requirements.txt
Remember
Code in setup.py needs to cope with (a) installation, as in
pip install ., and (b) package creation, as in python setup.py sdist.
Background
This is a standard Python problem: “my_package depends on other_package version 1.2.3”.
requirements.txtis read by “bots” such as Dependabot on Github, so if this is your primary list of requirements, automatic pull requests will work. It’s also read when users do a manual installation from it. And it’s read by PyCharm and other IDEs.But possibly it works without this file? Yes; it should. See below.
The
setup(..., install_requires=[...])parameter insetup.pyis read bypip.
How do these differ? See
https://packaging.python.org/discussions/install-requires-vs-requirements/ (not terribly helpful);
https://medium.com/knerd/best-practices-for-python-dependency-management-cc8d1913db82
https://blog.miguelgrinberg.com/post/the-package-dependency-blues
https://www.b-list.org/weblog/2018/apr/25/lets-talk-about-packages/
https://www.reddit.com/r/Python/comments/3uzl2a/setuppy_requirementstxt_or_a_combination/
https://packaging.python.org/en/latest/distributing/#working-in-development-mode
http://python-packaging-user-guide.readthedocs.org/en/latest/distributing/
http://jtushman.github.io/blog/2013/06/17/sharing-code-across-applications-with-python/
Note:
It’s better to specify a dependency version range when you are providing libraries, and an exact version when you are providing an application.
When you run a user-defined script from your package, it calls
pkg_resources.load_entry_point(dist, group, name)(this is part ofsetuptools). See https://setuptools.readthedocs.io/en/latest/pkg_resources.html. This doesn’t seem to re-callsetup.pyor checkrequirements.txt.pipusesinstall_requiresand notrequirements.txtwhen installing your package.
Options
One “single-source” approach is to define a variable such as
INSTALL_REQUIRES in setup.py that is used in the setup(...,
install_requires=INSTALL_REQUIRES) and is used to write requirements.txt
(e.g. via an extra call by the developer: python setup.py --extras).
Another is to read and parse requirements.txt in setup.py.
Experimenting with a package that has a simple requirement for
semantic_version:
No requirements specified: code will crash at runtime with
ModuleNotFoundError: No module named 'semantic_version'install_requiresonly:PyCharm notices (even if indirected via a variable).
But note: it will cope with simple indirection, e.g.
REQUIREMENTS = [ "semantic_version==32.8.4", ] setup( ..., install_requires=REQUIREMENTS, )
but not with more complex indirection, e.g.
REQUIREMENTS_TEXT = """ semantic_version==32.8.4 """ REQUIREMENTS = [] with StringIO(REQUIREMENTS_TEXT) as f: for line in f.readlines(): line = line.strip() if (not line) or line.startswith('#') or line.startswith('--'): continue REQUIREMENTS.append(line) setup( ..., install_requires=REQUIREMENTS, )
Dependabot is meant to notice. Its code suggests it will cope with arbitrary indirection: https://github.com/dependabot/dependabot-core/blob/main/python/helpers/lib/parser.py
pip installdoes what’s required and the code runs.
requirements.txtonly:PyCharm notices.
We know Dependabot notices.
pip installdoes NOT install the necessary dependencies.
So this option is useless.
The next question is whether requirements.txt is necessary at all. One
view (e.g. Reddit above) is that it can be kept for development environments,
i.e. the extras required for development but not for running your package.
Conclusion
For package distribution,
install_requiresinsetup.pyis mandatory, andrequirements.txtis optional and therefore perhaps best avoided so that automatic code analysis tools don’t get confused.
Data and other non-Python files: setup.py versus MANIFEST.in
Here’s another tricky thing. In setup.py, you have package_data and
include_package_data arguments to setup(). There is also the file
MANIFEST.in.
# # or MANIFEST.in ? # - https://stackoverflow.com/questions/24727709/i-dont-understand-python-manifest-in # noqa: E501 # - https://stackoverflow.com/questions/1612733/including-non-python-files-with-setup-py # noqa: E501 # # or both? # - https://stackoverflow.com/questions/3596979/manifest-in-ignored-on-python-setup-py-install-no-data-files-installed # noqa: E501 # … MANIFEST gets the files into the distribution # … package_data gets them installed in the distribution # # data_files is from distutils, and we’re using setuptools # - https://docs.python.org/3.5/distutils/setupscript.html#installing-additional-files # noqa: E501
See:
http://danielsokolowski.blogspot.co.uk/2012/08/setuptools-includepackagedata-option.html
… relates to an old problem?
https://stackoverflow.com/questions/779495/access-data-in-package-subdirectory
https://packaging.python.org/guides/distributing-packages-using-setuptools/
https://packaging.python.org/guides/using-manifest-in/#using-manifest-in
https://setuptools.readthedocs.io/en/latest/userguide/datafiles.html
https://stackoverflow.com/questions/24727709/i-dont-understand-python-manifest-in
https://stackoverflow.com/questions/1612733/including-non-python-files-with-setup-py
… relevant
https://stackoverflow.com/questions/3596979/manifest-in-ignored-on-python-setup-py-install-no-data-files-installed …
MANIFEST.ingets the files into the distribution; …package_datagets them installed in the distributionhttps://ep2015.europython.eu/media/conference/slides/less-known-packaging-features-and-tricks.pdf
… this one is very good.
… the last, in particular, suggesting that both MANIFEST.in (required for
sdist) and package_data (used for install) are necessary.
However, it seems that you can use just MANIFEST.in if you specify
include_package_data=True.
For complex file specification, you could use Python and then write to
MANIFEST.in, but actually the manifest syntax is quite good:
So, the two realistic options are:
Have a
setup.pythat auto-writesMANIFEST.inwhen required.Specify
MANIFEST.inproperly and useinclude_package_data=True. This is probably better. See in particular https://ep2015.europython.eu/media/conference/slides/less-known-packaging-features-and-tricks.pdf
Conclusion
Use MANIFEST.in plus setup(..., include_package_data=True).
Use the full syntax available for MANIFEST.in.
To find all extensions (for the global-exclude command), use:
find . -type f | perl -ne 'print $1 if m/\.([^.\/]+)$/' | sort -u
Beware a nasty caching effect
Consider deleting any old MY_PACKAGE_NAME.egg_info directory from within
setup.py, before calling setup(). This may be particularly
applicable for packages that ship “data”. See
http://blog.codekills.net/2011/07/15/lies,-more-lies-and-python-packaging-documentation-on–package_data-/
Like this, for example:
# setup.py
import os
import shutil
PACKAGE_NAME = "MY_PACKAGE_NAME"
THIS_DIR = os.path.abspath(os.path.dirname(__file__)) # contains setup.py
EGG_DIR = os.path.join(THIS_DIR, PACKAGE_NAME + ".egg-info")
shutil.rmtree(EGG_DIR, ignore_errors=True)
setup(...)
This is perhaps meant to be unnecessary, per https://stackoverflow.com/questions/3779915/why-does-python-setup-py-sdist-create-unwanted-project-egg-info-in-project-r, but maybe isn’t.
It appears to be unnecessary once you shift to MANIFEST.in and
include_package_data=True.