Session 4: Packaging, releases, and sharing your code
So far, we have created Python files in the structure of a package, written detailed documentation, and saved everything in an open-source, version-controlled repository. But how is our code any different or more useful in this structure than regular Python files? Why did we have to create this structure?
First, let's create a release on GitHub
Before we get into sharing your Python package, let's look at how we can use GitHub's release options to snapshot your code.
When this is done, you will be able to:
- Reference a specific version of your code when you use it in research
- Share your repository with others and have an easy way for them to reference the version of your code
- Use a DOI to reference your software
Semantic versioning
When creating releases on GitHub, you'll ideally want to use SemVer or Semantic Versioning. The summary provided by SemVer.org states:
Given a version number MAJOR.MINOR.PATCH, increment the:
MAJOR version when you make incompatible API changes
MINOR version when you add functionality in a backward compatible manner
PATCH version when you make backward compatible bug fixes
Let's start with version 0.1.0 (see the guide here).
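To make the three rules above concrete, here is a minimal Python sketch of how a version number changes under each kind of increment. The function name `bump_version` is hypothetical (it is not part of any tool used in this course), and the sketch assumes a plain MAJOR.MINOR.PATCH string with no pre-release suffix.

```python
def bump_version(version: str, part: str) -> str:
    """Return the next version after incrementing MAJOR, MINOR, or PATCH."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        # Incompatible API change: reset MINOR and PATCH.
        return f"{major + 1}.0.0"
    if part == "minor":
        # Backward compatible new functionality: reset PATCH.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Backward compatible bug fix.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump_version("0.1.0", "patch"))  # 0.1.1
print(bump_version("0.1.0", "minor"))  # 0.2.0
print(bump_version("0.1.0", "major"))  # 1.0.0
```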
Get a DOI
GitHub has a Zenodo integration that makes it simple to issue DOIs for releases. Before we create any releases, let's set up this integration.
- Log in to Zenodo with GitHub here after reviewing the agreement/data sharing policies
- Go to the Zenodo GitHub page
- "Flip the switch" on the repositories you want DOIs for (aka the repository you are using for this course!)
Now, when you create a new release, you will be issued a DOI.
Creating a release on GitHub
A release is simply a zipped folder containing the contents of your directory at a specific point in time (a specific commit), with an easier-to-reference identifier than your unique git commit hash. You can access the release page from your GitHub repository page, in the right-hand menu under Releases.
Using releases
Create a release for any research code you use and cite the specific release version you use for any work done; if you update the code following feedback from collaborators, supervisors or paper reviewers, create a new release and cite this instead! Follow the SemVer instructions above to decide on version names.
Click Choose a tag, type 0.1.0 into the dialogue box, and click + Create new tag: 0.1.0 on publish. Then, in the Release title text box, enter v0.1.0. Click Generate release notes and add some more detail to the text box stating that this is the initial release.
For now, we are going to ignore some of the extra options available; click Publish release.
Fantastic; now you have your first release!
Get your release DOI
If you go back to the Zenodo GitHub page, next to the repository you created a new release on, you will now have a little DOI badge. If you click on this, a popup will open with a variety of ways that you can share it. You can add this badge to your README.md file, and reference it in your citation file.
Create your pyproject.toml file
Once you've created your pyproject.toml file, you'll be able to:
- Install your package locally using pip
- Build compiled binaries of your package to upload to a GitHub release or to PyPI
One thing we can do is build an installable version of your package so that you can use it from anywhere on your system without copying the Python files into the directory you're working in. We just need a pyproject.toml file for this. Hop back over to the webapp we used before and generate a toml file for your project, and put it in the main folder of your project (on the same level as the README.md file). You can learn more (or build this file from scratch) on the Setuptools documentation site.
Your toml file will look something like this, except with your details:
[build-system]
requires = ["setuptools>=61.0", "setuptools-scm"]
build-backend = "setuptools.build_meta"
[project]
name = "example_package"
description = "A simple Python project"
version = "0.0.1"
readme = "README.md"
authors = [
{ name="Author Full Name", email="authors_email@goes_here.ie" },
]
requires-python = ">=3.10"
classifiers = [
"Programming Language :: Python :: 3",
"License :: OSI Approved :: MIT License",
"Operating System :: OS Independent",
]
dependencies = [
"numpy>=1.21.2",
]
[project.optional-dependencies]
test = [
"pytest==6.2.5",
]
You need to replace the dependencies sections with the packages you need for your project - in this case, you can delete them as you're using pure/base Python!
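Before building, it can be useful to confirm that the file parses and contains the fields you expect. The sketch below is one way to do that, assuming Python 3.11 or later (where `tomllib` is in the standard library) or an installed `tomli` backport on older versions. The helper name `summarise_project` is hypothetical, and the TOML text is a trimmed copy of the example above rather than a file read from disk.

```python
try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:
    import tomli as tomllib  # assumption: the tomli backport is installed

# A trimmed copy of the example pyproject.toml shown above.
EXAMPLE_TOML = """
[project]
name = "example_package"
version = "0.0.1"
requires-python = ">=3.10"
dependencies = ["numpy>=1.21.2"]
"""


def summarise_project(toml_text: str) -> dict:
    """Pull out the [project] fields that matter most for installation."""
    project = tomllib.loads(toml_text).get("project", {})
    return {
        "name": project.get("name"),
        "version": project.get("version"),
        "dependencies": project.get("dependencies", []),
    }


summary = summarise_project(EXAMPLE_TOML)
print(summary)
```

To check your own file, you could read it with `Path("pyproject.toml").read_text()` and pass the result to the same function.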
Dependencies for packaging and release vs. for producing scientific results
While developing your code, you may need other external Python packages, for example numpy, matplotlib, or scipy. These are dependencies. While developing your code, you can use the dependency manager of your choice, such as conda, and then export your dependencies to an environment.yml file or a requirements.txt file. You can then add these dependencies to your pyproject.toml file when you are ready to share the code.
When should you pin dependencies and when should you leave them flexible?
Read this note on pinning specific dependencies. The basic tl;dr is:
- Leave dependencies loose for sharing a Python package: this prevents dependency conflicts when someone tries to install your package.
- Specify exact dependencies when recording scientific/research results or releasing an application (such as a webapp): even if the detailed pinned dependency list will be hard to reproduce on another machine, it's important to record the exact environment in which data were produced.
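For the second case, recording the exact environment, the sketch below shows one possible way to list every installed distribution with its exact version from inside Python, using the standard library's `importlib.metadata`. The function name `pinned_requirements` is hypothetical, and the output depends entirely on what is installed in the environment where you run it, so no example output is shown.

```python
from importlib import metadata


def pinned_requirements() -> list[str]:
    """List installed distributions as exact 'name==version' pins."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip any distribution whose metadata lacks a name
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


# Print the pins; redirect this to a file to keep a record alongside your results.
for line in pinned_requirements():
    print(line)
```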
Example workflow using conda
When writing your code, you should work in a virtual environment, in this case using conda. First you should create an environment with the version of Python you need, and any initial packages you know you will require.
conda create -n ENV-NAME python=3.12 numpy
Then, as needed, you can add packages (with the environment active):
conda install pytest
When you are ready to package/release the code, you can export your environment to an environment.yml file:
conda env export --from-history > environment.yml # again, from inside the activated env
You can modify this file and remove packages that you were just using for development (for example, pytest). You should test that the environment works and that your code can run by installing the env (you may need to change its name):
conda env create -f environment.yml
If you have installed packages with pip from inside your conda env, you will need a few extra steps to add these requirements. ekiwi111 on GitHub provides the following code snippet:
# Extract installed pip packages
pip_packages=$(conda env export | grep -A9999 ".*- pip:" | grep -v "^prefix: ")
# Export conda environment without builds, and append pip packages
conda env export --from-history | grep -v "^prefix: " > new-environment.yml
echo "$pip_packages" >> new-environment.yml
For more flexibility in pip package versions, we can modify this to cut the pip version numbers out:
# Extract installed pip packages
pip_packages=$(conda env export | grep -A9999 ".*- pip:" | grep -v "^prefix: " | cut -f1 -d"=")
# Export conda environment without builds, and append pip packages
conda env export --from-history | grep -v "^prefix: " > new-environment.yml
echo "$pip_packages" >> new-environment.yml
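To see what the `cut -f1 -d"="` step does to each line, here is a small Python sketch of the same idea. The helper name `strip_pip_versions` is hypothetical, and the package names and versions in `pip_section` are made up for illustration; they are not taken from any real export.

```python
def strip_pip_versions(lines):
    """Keep only the text before the first '=', like `cut -f1 -d"="`."""
    return [line.split("=", 1)[0] for line in lines]


# Illustrative pip section as it might appear in `conda env export` output.
pip_section = [
    "  - pip:",
    "    - requests==2.31.0",
    "    - tqdm==4.66.1",
]

for line in strip_pip_versions(pip_section):
    print(line)
```

Lines without an `=` sign, such as the `- pip:` header, pass through unchanged, which matches how `cut` treats lines that do not contain the delimiter.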
Read about when dependencies/requirements should be flexible vs. tight on DeReLiCT.
Build your project locally
Check your conda env
You can check what conda environments you have with:
conda env list
You can activate an environment with:
conda activate ENV-NAME
You can check what's installed in an environment with:
conda list
from inside an active environment.
To create a new env (with Python 3.12): conda create --name NEW-ENV-NAME python=3.12
Because you have a pyproject.toml file and a well-organised directory structure, you can build your Python package. This means you can create pre-built, installable files for your project.
Create a new environment from the terminal called build-env:
conda create --name build-env python=3.12
Activate the new environment:
conda activate build-env
Install build with pip:
python3 -m pip install --upgrade build
Then, to create a dist folder with the installable tar.gz and whl files, run build:
python3 -m build
When you create a new release on GitHub, you can upload these files to allow people to pip install using the release URL.
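The name of the whl file encodes what it can be installed on. As a rough illustration, the sketch below splits a wheel filename into its parts following the standard wheel naming scheme (name, version, optional build tag, Python tag, ABI tag, platform tag). The function name `parse_wheel_filename` is hypothetical, and the example filename is an assumption about what a pure-Python build of the example_package project above would typically be called; check your own dist folder for the real name.

```python
def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename.removesuffix(".whl").split("-")
    if len(parts) not in (5, 6):  # 6 parts when an optional build tag is present
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


info = parse_wheel_filename("example_package-0.0.1-py3-none-any.whl")
print(info)
```

A `py3-none-any` ending means the wheel works with any Python 3 interpreter on any operating system, which is what you would expect for a package written in pure Python.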
Manual binary upload in a GitHub release
One possible workflow for generating a release for your code that includes installable binaries is detailed below.
- Test your code locally using pytest and check that your docs build and are up to date.
- Do a local install with python3 -m pip install . and try loading your package to check that it works.
- Build your package locally with python3 -m build.
- Push all your changes to the main branch (the dist and build files won't be included because of the .gitignore file).
- Create a new release on your latest commit, and use the section Attach binaries by dropping them here or selecting them to upload both files in the dist folder. Upload these individually instead of uploading the dist folder.
- Once your release is published, you can point people to install your package with the following snippet (with the URL updated to the URL of your tar.gz file associated with the release):
python -m pip install https://github.com/YOUR-USERNAME/YOUR-REPO-NAME/releases/download/YOUR-VERSION-NAME/PACKAGENAME-VERSION.tar.gz
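The install URL in the last step follows a fixed pattern, so it can be assembled from its parts. The sketch below is one way to do that; the function name `release_sdist_url` and its parameter names are hypothetical, and the example values are taken from the sample install command used in the README section later in this session.

```python
def release_sdist_url(user: str, repo: str, tag: str, package: str, version: str) -> str:
    """Build the download URL for a tar.gz file attached to a GitHub release."""
    return (
        f"https://github.com/{user}/{repo}/releases/download/"
        f"{tag}/{package}-{version}.tar.gz"
    )


url = release_sdist_url(
    user="murphyqm",
    repo="swd3-testing-ghcodespaces-demo-repo",
    tag="v0.0.1-alpha.2",
    package="hypot",
    version="0.0.1",
)
print(f"python -m pip install {url}")
```

Note that the release tag and the package version are separate values: the tag comes from the GitHub release, while the version in the filename comes from your pyproject.toml file.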
To install the package locally and test it, you can use pip:
python3 -m pip install --editable . # to install an editable version that will update when you change the source code
python3 -m pip install . # to install in non-editable format
You can then open Python and try importing your package and seeing if it works!
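If you would rather check from a script than from an interactive session, the sketch below uses the standard library's `importlib.util.find_spec` to test whether a top-level package can be found without actually importing it. The function name `is_importable` is hypothetical, and `example_package` is the placeholder name from the pyproject.toml example above; replace it with your own package name.

```python
from importlib import util


def is_importable(package_name: str) -> bool:
    """Return True if Python can locate the named top-level package."""
    return util.find_spec(package_name) is not None


# Replace "example_package" with the name of your own package.
if is_importable("example_package"):
    print("example_package is installed in this environment")
else:
    print("example_package was not found; is the right environment active?")
```

This only confirms that the package can be located; importing it and calling one of your functions is still the better test that it works.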
Test and release your project automatically in GitHub using workflows
The folder example_workflows in the template repository you are using contains some suggestions for different automated workflows you can put in place. For example, build-release.yml tests your code with pytest, installs it locally on a test machine and runs some code to check it works (you will have to change this to suit your script), and then builds the binaries. You need to create a release tag using the instructions above, and then set the workflow to run on this tag.
Additionally, this folder contains build-package.yml, which stores artifacts for you following a successful build, and python-app.yml, which is the basic example workflow GitHub provides that will test your code with pytest.
These workflows can be set to run every time someone opens a pull request against the main branch, and can be used to ensure that only code which passes all tests can be merged into the main branch.
Completing our documentation
We mentioned in Session 3 that the README file should contain the following (sections we have already filled in are marked):
- Project title (already filled in)
- Project description (already filled in)
- How to install and run the project
- How to use the project
- Credits and acknowledgements (already filled in)
You can now add some details for number three, How to install and run the project; keep it short and simple, like this:
Create a virtual environment with pip available. From within this env, simply run the pip install command with the URL of the desired packaged binary:
python -m pip install https://github.com/murphyqm/swd3-testing-ghcodespaces-demo-repo/releases/download/v0.0.1-alpha.2/hypot-0.0.1.tar.gz
You can test that it has installed correctly by running:
python -c "import hypot.calc;print(hypot.calc.squared(2))"
You can add this in to both your README file and into your docs by adding a markdown snippet like this (edited to point to your specific URL):
# To install the package with pip
Create a virtual environment with pip available. From within this env, simply run the `pip install` command with the url of the desired packaged binary:
```bash
python -m pip install https://github.com/murphyqm/swd3-testing-ghcodespaces-demo-repo/releases/download/v0.0.1-alpha.2/hypot-0.0.1.tar.gz
```
You can test that it has installed correctly by running:
```bash
python -c "import hypot.calc;print(hypot.calc.squared(2))"
```
For the How to use the project section, you can give a brief outline of the maths behind the code, give an example of usage, and explain how you want people to cite the code if they do use it. You can add in your Zenodo DOI badge here, and an example citation, and point people to your CITATION.cff file if they would prefer that format. See this library guide on how to format a code citation.
Citation file
As well as describing in your README how you want people to cite your code, you can provide a citation file. Use this tool to generate a CFF (Citation File Format) file for your code. You can also use the converter at the previous link to convert your CFF file into a BibTeX entry and other formats that might be useful to have in your repository.