Lec 20 - Continuous integration/testing with GitHub Actions
Setting up Python and installing packages from requirements.txt
See https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python for details.
.github/workflows/python_testing.yml
name: Setting up python
on:
  workflow_dispatch:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
          cache: 'pip'
      - run: pip install -r requirements.txt
      - run: python -c "import sys; print(sys.version)"
Setting up Python with fast caching
A faster cache can be set up with the following workflow (this is what we used for autograding the homework assignments). If speed is not an issue, the simpler steps above work fine.
.github/workflows/python_testing.yml
name: Setting up python with a fast cache
on:
  workflow_dispatch:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - name: Cache dependencies
        uses: actions/cache/restore@v3
        id: cache
        with:
          path: ${{ runner.tool_cache }}/Python/3.10.13 # e.g /opt/hostedtoolcache/Python/3.11.6
          key: ${{ runner.tool_cache }}/Python/3.10.13/${{ runner.arch }}-${{ hashFiles('requirements.txt') }}
      - name: Set up Python env
        uses: actions/setup-python@v5
        with:
          python-version: '3.10.13'
      - name: Install pip dependencies
        shell: bash
        run: pip install -r requirements.txt
      - uses: actions/cache/save@v3
        if: steps.cache.outputs.cache-hit != 'true'
        with:
          path: ${{ runner.tool_cache }}/Python/3.10.13 # e.g /opt/hostedtoolcache/Python/3.11.6
          key: ${{ runner.tool_cache }}/Python/3.10.13/${{ runner.arch }}-${{ hashFiles('requirements.txt') }}
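The cache key above ends with a hash of requirements.txt, so the cache is reused only while that file is unchanged. The following Python sketch illustrates the idea with a plain SHA-256 digest; it is not GitHub's exact hashFiles algorithm, and the function name cache_key, the prefix, and the architecture value are hypothetical stand-ins for the workflow expressions.

```python
import hashlib


def cache_key(prefix: str, arch: str, requirements_text: str) -> str:
    """Build an illustrative cache key from a prefix, an architecture and file contents."""
    digest = hashlib.sha256(requirements_text.encode("utf-8")).hexdigest()
    return f"{prefix}/{arch}-{digest}"


# Hypothetical values standing in for runner.tool_cache and runner.arch.
prefix = "/opt/hostedtoolcache/Python/3.10.13"
key_before = cache_key(prefix, "X64", "pytest\npandas\n")
key_same = cache_key(prefix, "X64", "pytest\npandas\n")
key_after = cache_key(prefix, "X64", "pytest\npandas\nnumpy\n")

print(key_before == key_same)   # True: identical contents give the same key (cache hit)
print(key_before == key_after)  # False: edited contents give a new key (cache miss)
```

When the key changes, the restore step finds nothing, the packages are installed again, and the save step stores the result under the new key.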
Pytest, pycodestyle workflow on every pull request
This combines the previous Python setup with pytest and pycodestyle, which are run on every new pull request (through the pull_request event in the on block, the second block of the workflow).
A test file may, for instance, verify that no missing values are present in the variable df computed in rates.py:
test_missing_values.py
from rates import df


def test_missing():
    assert not df.isnull().values.any()
With the required packages listed in requirements.txt, we can now run pytest on every pull request by adding the extra steps
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      #
      # (...)
      #
      - name: Display Python version
        run: python -c "import sys; print(sys.version)"
      - name: Run pytest
        run: python -m pytest
      - name: Pycodestyle to enforce pep8
        run: pycodestyle *.py
so that the full workflow is given by
.github/workflows/python_testing.yml
name: Python testing
on:
  pull_request:
  workflow_dispatch:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - name: Cache dependencies
        uses: actions/cache/restore@v3
        id: cache
        with:
          path: ${{ runner.tool_cache }}/Python/3.10.13 # e.g /opt/hostedtoolcache/Python/3.11.6
          key: ${{ runner.tool_cache }}/Python/3.10.13/${{ runner.arch }}-${{ hashFiles('requirements.txt') }}
      - name: Set up Python env
        uses: actions/setup-python@v5
        with:
          python-version: '3.10.13'
      - name: Install pip dependencies
        shell: bash
        run: pip install -r requirements.txt
      - uses: actions/cache/save@v3
        if: steps.cache.outputs.cache-hit != 'true'
        with:
          path: ${{ runner.tool_cache }}/Python/3.10.13 # e.g /opt/hostedtoolcache/Python/3.11.6
          key: ${{ runner.tool_cache }}/Python/3.10.13/${{ runner.arch }}-${{ hashFiles('requirements.txt') }}
      - name: Display Python version
        run: python -c "import sys; print(sys.version)"
      - name: Run pytest
        run: python -m pytest
      - name: Pycodestyle to enforce pep8
        run: pycodestyle *.py
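By default, the steps of a job run in order and the job fails at the first step that exits with a non-zero code, so a failing pytest run prevents pycodestyle from running. The following Python sketch mimics that behaviour locally; the helper run_steps and the demo commands are hypothetical and only illustrate the control flow.

```python
import subprocess
import sys


def run_steps(steps):
    """Run (name, command) pairs in order; stop at the first non-zero exit code."""
    for name, command in steps:
        print(f"Running: {name}")
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"Step failed: {name}")
            return result.returncode
    return 0


# Harmless demo commands: the second one fails, so the third one never runs.
demo = [
    ("passes", [sys.executable, "-c", "print('ok')"]),
    ("fails", [sys.executable, "-c", "import sys; sys.exit(3)"]),
    ("never reached", [sys.executable, "-c", "print('skipped')"]),
]
exit_code = run_steps(demo)
print(exit_code)  # 3
```

Assuming pytest and pycodestyle are installed, replacing the demo commands with [sys.executable, "-m", "pytest"] and [sys.executable, "-m", "pycodestyle", "."] gives a rough local equivalent of the last two workflow steps before opening a pull request.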
Periodic backup of a CSV file as GitHub artifacts
The official documentation for uploading and downloading GitHub artifacts is at https://docs.github.com/en/actions/using-workflows/storing-workflow-data-as-artifacts.
Let us assume that the repository contains a file rates.py at its root that constructs a dataframe in the variable df. Let us use the following requirements.txt:
requirements.txt
pytest
pandas
numpy
matplotlib
jupytext
nbconvert
pycodestyle
ipykernel
and assume that we have a small script that exports the dataframe to a CSV file named rates.csv:
export_df_to_csv.py
from rates import df

df.to_csv('rates.csv')
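Since pandas writes missing values as empty cells by default, the exported file can be sanity-checked without pandas. The following is a minimal sketch using only the standard library; the function name count_empty_cells and the column names date and rate are hypothetical, and the sample contents only imitate the shape of a file written by df.to_csv.

```python
import csv
import io


def count_empty_cells(csv_file) -> int:
    """Count empty cells in the data rows of an open CSV file."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row
    return sum(1 for row in reader for cell in row if cell.strip() == "")


# Hypothetical contents: an index column followed by two data columns.
complete = io.StringIO(",date,rate\n0,2024-01-01,1.5\n1,2024-01-02,1.6\n")
with_gap = io.StringIO(",date,rate\n0,2024-01-01,1.5\n1,2024-01-02,\n")

print(count_empty_cells(complete))  # 0
print(count_empty_cells(with_gap))  # 1
```

On the real file, the same function can be called inside with open('rates.csv', newline='') as f.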
We can now back up the dataframe as a CSV file that can be downloaded from the GitHub website for each run of the workflow (until the artifact's retention period expires), by adding the steps
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      #
      # (...)
      #
      - name: Export to csv
        run: python export_df_to_csv.py
      - uses: actions/upload-artifact@v4
        with:
          name: artifact rates.csv daily
          path: rates.csv
so that the full workflow looks like
.github/workflows/csv_backup.yml
name: Backup data in csv as a github artifact
on:
  workflow_dispatch:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - name: Cache dependencies
        uses: actions/cache/restore@v3
        id: cache
        with:
          path: ${{ runner.tool_cache }}/Python/3.10.13 # e.g /opt/hostedtoolcache/Python/3.11.6
          key: ${{ runner.tool_cache }}/Python/3.10.13/${{ runner.arch }}-${{ hashFiles('requirements.txt') }}
      - name: Set up Python env
        uses: actions/setup-python@v5
        with:
          python-version: '3.10.13'
      - name: Install pip dependencies
        shell: bash
        run: pip install -r requirements.txt
      - uses: actions/cache/save@v3
        if: steps.cache.outputs.cache-hit != 'true'
        with:
          path: ${{ runner.tool_cache }}/Python/3.10.13 # e.g /opt/hostedtoolcache/Python/3.11.6
          key: ${{ runner.tool_cache }}/Python/3.10.13/${{ runner.arch }}-${{ hashFiles('requirements.txt') }}
      - name: Run pytest
        run: python -m pytest
      - name: Export to csv
        run: python export_df_to_csv.py
      - uses: actions/upload-artifact@v4
        with:
          name: artifact rates.csv daily
          path: rates.csv
Performing a recurring action
See https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#schedule
The previous workflow (backup of a CSV file as a GitHub artifact) can be run on a schedule. The following on block runs the workflow either manually (workflow_dispatch) or twice a day (using schedule):
on:
  workflow_dispatch:
  schedule:
    # This example triggers the workflow every day at 5:30 and 17:30 UTC.
    # * is a special character in YAML, so the string has to be quoted.
    - cron: '30 5,17 * * *'
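The cron string has five fields: minute, hour, day of month, month, and day of week. As a reading aid, the following Python sketch expands the minute and hour fields of such a string into the times of day at which the workflow is triggered. The helper daily_run_times is hypothetical and handles only plain numbers and comma-separated lists, not ranges or step values.

```python
from datetime import time


def daily_run_times(cron: str) -> list[time]:
    """Expand the minute and hour fields of a cron string into times of day.

    Only plain numbers and comma-separated lists are handled; the last three
    fields (day of month, month, day of week) must all be '*'.
    """
    minute_field, hour_field, *rest = cron.split()
    if rest != ["*", "*", "*"]:
        raise ValueError("this sketch only handles schedules that run every day")
    minutes = [int(m) for m in minute_field.split(",")]
    hours = [int(h) for h in hour_field.split(",")]
    return sorted(time(hour=h, minute=m) for h in hours for m in minutes)


for t in daily_run_times("30 5,17 * * *"):
    print(t.strftime("%H:%M UTC"))
# 05:30 UTC
# 17:30 UTC
```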
GitHub Action to publish the Quarto website on GitHub Pages
See the official documentation at https://quarto.org/docs/publishing/github-pages.html#github-action