
Contributing

Python development setup


  • Fork and clone the repository

  • Create a new Python environment with your favourite environment manager (e.g. virtualenv or conda), using Python 3.9 (newer versions cause a library conflict in auto_doc.py)

  • Install the repository in editable mode with development dependencies:

cd python
pip install -e ".[dev]"
  • Install pre-commit and then activate its hooks. pre-commit is a framework for managing and maintaining multi-language pre-commit hooks. The library uses pre-commit to enforce code style and formatting through ruff. Run the following commands from the python directory:
cd python
pip install --user pre-commit
pre-commit install

Afterwards, pre-commit will run whenever you commit.

  • To run formatting and code-style separately, you can configure your IDE, such as VSCode, to use ruff, or run it via the command line:
# linting
ruff check python --fix
# formatting
ruff format python

Python documentation

We follow a few best practices for writing the Python documentation:

  1. Use the Google docstring style:
"""[One Line Summary]

[Extended Summary]

[!!! example
    import xyz
]

# Arguments
    arg1: Type[, optional]. Description[, defaults to `default`]
    arg2: Type[, optional]. Description[, defaults to `default`]

# Returns
    Type. Description.

# Raises
    Exception. Description.
"""

If Python 3 type annotations are used, they are inserted automatically.

  2. Hopsworks entity engine methods (e.g. ExecutionEngine) only require a single-line docstring.
  3. Private REST API implementations (e.g. FeatureGroupApi) should be fully documented with docstrings without defaults.
  4. Public API such as metadata objects and public REST API implementations should be fully documented with defaults.
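
As an illustration of the docstring template above, here is a minimal sketch of a fully documented public function. The function `scale_values` and its parameters are hypothetical and not part of Hopsworks; they only show how the sections fit together.

```python
def scale_values(values: list[float], factor: float = 1.0) -> list[float]:
    """Multiply every value by a constant factor.

    Returns a new list and leaves the input list untouched.

    !!! example
        scale_values([1.0, 2.0], factor=2.0)

    # Arguments
        values: List[float]. The numbers to scale.
        factor: float, optional. Multiplier applied to each value, defaults to `1.0`.

    # Returns
        List[float]. The scaled values, in the original order.

    # Raises
        ValueError. If `values` is empty.
    """
    # Hypothetical validation, included only so the Raises section has a real case.
    if not values:
        raise ValueError("values must not be empty")
    return [v * factor for v in values]
```

Following the rules above, an engine method would keep only the one-line summary, while a public API method would carry all sections, including the defaults.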

Setup and Build Documentation

We use mkdocs together with mike (for versioning) to build the documentation, and a plugin called keras-autodoc to auto-generate Python API documentation from docstrings.

Background about mike: mike builds the documentation and commits it as a new directory to the gh-pages branch. Each directory corresponds to one version of the documentation. Additionally, mike maintains a JSON file in the root of gh-pages that maps versions and aliases to the available directories. With aliases you can define extra names like dev or latest to indicate unstable and stable releases, respectively.
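
To make the version/alias mapping more concrete, the following sketch resolves a version or alias to its directory. The JSON content and its field names (`version`, `title`, `aliases`) are assumptions made for illustration, and `resolve` is a hypothetical helper, not part of mike.

```python
import json

# Hypothetical contents of the version index kept in the root of gh-pages.
# The field names are an assumption for illustration only.
VERSIONS_JSON = """
[
  {"version": "4.0.0-SNAPSHOT", "title": "4.0.0-SNAPSHOT", "aliases": ["dev"]},
  {"version": "3.8.0", "title": "3.8.0", "aliases": ["latest"]},
  {"version": "3.4.4", "title": "3.4.4", "aliases": []}
]
"""


def resolve(name: str, entries: list[dict]) -> str:
    """Return the version directory that a version or alias points to."""
    for entry in entries:
        if name == entry["version"] or name in entry["aliases"]:
            return entry["version"]
    raise KeyError(f"unknown version or alias: {name}")


entries = json.loads(VERSIONS_JSON)
print(resolve("latest", entries))  # 3.8.0
print(resolve("dev", entries))     # 4.0.0-SNAPSHOT
```

An alias is just a second name for one directory, which is why moving latest to a new release does not require rebuilding the older versions.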

  1. Install the documentation requirements from requirements-docs.txt, then Hopsworks in editable mode:
pip install -r requirements-docs.txt
pip install -e "python[dev]"
  2. To build the docs, first run the auto doc script:
python python/auto_doc.py
Option 1: Build only current version of docs
  1. Either build the docs, or serve them dynamically:

Note: Links and pictures might not resolve properly when checking with this build. The docs are deployed with versioning on docs.hopsworks.ai, which adds another level to all paths, e.g. docs.hopsworks.ai/[version-or-alias]. Relative links should not be affected by this, but building the docs with versioning (Option 2) is recommended.

mkdocs build
# or
mkdocs serve
Option 2 (Preferred): Build multi-version docs with mike
Versioning on docs.hopsworks.ai

On docs.hopsworks.ai we implement the following versioning scheme:

  • current master branches (e.g. of hopsworks corresponding to master of Hopsworks): rendered as current Hopsworks snapshot version, e.g. 4.0.0-SNAPSHOT [dev], where dev is an alias to indicate that this is an unstable version.
  • the latest release: rendered with full current version, e.g. 3.8.0 [latest] with latest alias to indicate that this is the latest stable release.
  • previous stable releases: rendered without alias, e.g. 3.4.4.
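
The scheme above determines which alias, if any, is passed to `mike deploy` in the build instructions. A minimal sketch of that mapping follows; the branch kind names and the helper `mike_deploy_command` are hypothetical and only illustrate the three cases.

```python
from typing import Optional

# Hypothetical branch kinds mapped to the alias used on docs.hopsworks.ai.
ALIAS_FOR_BRANCH = {
    "master": "dev",             # unstable snapshot version
    "latest-release": "latest",  # latest stable release
    "previous-release": None,    # older stable release, rendered without alias
}


def mike_deploy_command(version: str, branch_kind: str) -> str:
    """Build the `mike deploy` command line for a version and branch kind."""
    alias: Optional[str] = ALIAS_FOR_BRANCH[branch_kind]
    parts = ["mike", "deploy", version]
    if alias is not None:
        # Move the alias to this version if it already points elsewhere.
        parts += [alias, "--update-alias"]
    return " ".join(parts)


print(mike_deploy_command("4.0.0-SNAPSHOT", "master"))
print(mike_deploy_command("3.8.0", "latest-release"))
print(mike_deploy_command("3.4.4", "previous-release"))
```

The helper only assembles the command string; running it is left to the shell, as in the build instructions.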
Build Instructions
  1. For this you can either check out and make a local copy of the upstream/gh-pages branch, where mike maintains the current state of docs.hopsworks.ai, or just build documentation for the branch you are updating:

    Building one branch:

    Check out your dev branch with modified docs:

    git checkout [dev-branch]
    

    Generate API docs if necessary:

    python auto_doc.py
    

    Build docs with a version and alias:

    mike deploy [version] [alias] --update-alias
    
    # for example, if you are updating documentation to be merged to master,
    # which will become the new SNAPSHOT version:
    mike deploy 4.0.0-SNAPSHOT dev --update-alias
    
    # if you are updating docs of the latest stable release branch
    mike deploy [version] latest --update-alias
    
    # if you are updating docs of a previous stable release branch
    mike deploy [version]
    

    If no gh-pages branch existed in your local repository, this will have created it.

    Important: If no previous docs were built, you will have to choose a default version to be loaded as the index, as follows:

    mike set-default [version-or-alias]
    

    You can now check out the gh-pages branch and serve:

    git checkout gh-pages
    mike serve
    

    You can also list all available versions/aliases:

    mike list
    

    Delete and reset your local gh-pages branch:

    mike delete --all
    
    # or delete single version
    mike delete [version-or-alias]
    

Adding new API documentation

To add new documentation for APIs, add the method or class you want to document to the PAGES dictionary in the auto_doc.py script:

PAGES = {
    "connection.md": [
        "hopsworks.connection.Connection.connection"
    ],
    "new_template.md": [
        "module",
        "xyz.asd"
    ]
}

Now you can add a template markdown file to the docs/templates directory with the name you specified in the auto_doc.py script. The new_template.md file should contain a tag for each place at which API documentation should be inserted:

## The XYZ package

{{module}}

Some extra content here.

!!! example
    ```python
    import xyz
    ```

{{xyz.asd}}

Finally, run the auto_doc.py script, as described above, to update the documentation.
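
To illustrate the tag mechanism, here is a minimal sketch of how `{{tag}}` placeholders in a template could be filled with generated text. This is not keras-autodoc's actual implementation; `render_template` and the placeholder strings are hypothetical.

```python
def render_template(template: str, generated: dict[str, str]) -> str:
    """Replace every `{{tag}}` in the template with its generated documentation."""
    for tag, docs in generated.items():
        template = template.replace("{{" + tag + "}}", docs)
    return template


# Template mirroring the new_template.md example, without the admonition.
template = (
    "## The XYZ package\n\n"
    "{{module}}\n\n"
    "Some extra content here.\n\n"
    "{{xyz.asd}}\n"
)

# Stand-ins for the documentation that would be generated from docstrings.
generated = {
    "module": "(generated docs for module)",
    "xyz.asd": "(generated docs for xyz.asd)",
}

page = render_template(template, generated)
print(page)
```

Content outside the tags is left as written, so hand-written text and examples in the template survive every regeneration.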

For information about Markdown syntax and available admonitions, highlighting, etc., see the Material for MkDocs reference documentation.