Releasing robin-sparkless

This document describes how to cut a release and publish:

  • Rust crates to crates.io and
  • The Python package sparkless (Sparkless v4) to PyPI.

The repository is a Cargo workspace with these members: robin-sparkless (root, main facade), crates/robin-sparkless-core, crates/robin-sparkless-polars, crates/spark-sql-parser, and the Python extension crate under python/. Most users depend only on robin-sparkless; the subcrates may be published for advanced or minimal-use cases. Both make check and CI build and test the whole workspace (cargo build --workspace --all-features, cargo test --workspace --all-features).

Version pinning (CI and local)

So that local development and CI use the same toolchains and tools:

Pre-release checklist (e.g. X.Y.Z)

  • [ ] Versions: The root, robin-sparkless-core, and robin-sparkless-polars Cargo.toml files have the same version (e.g. 4.x.y). Path dependencies in the root Cargo.toml use a matching version = "4" (or "4.x"). If releasing Python, the versions in python/pyproject.toml and python/Cargo.toml match as well.
  • [ ] CHANGELOG: Add a [X.Y.Z] - YYYY-MM-DD section with Added/Changed/Fixed; move the Unreleased items into it, or leave Unreleased in place for the next release.
  • [ ] README: Rust install examples use the major version or the new release version (e.g. robin-sparkless = "4" or robin-sparkless = "4.x.y").
  • [ ] CI: make check-full passes (format, clippy, audit, deny, Rust tests, Python lint). Push to a branch and confirm CI is green.
  • [ ] Secrets: The GitHub repo has CARGO_REGISTRY_TOKEN (crates.io) and, if publishing Python, PYPI_API_TOKEN (PyPI).
  • [ ] Tag: After merging to main, run git tag vX.Y.Z and git push origin vX.Y.Z; the release workflow runs automatically.
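
The Versions item in the checklist above lends itself to a quick script. The following is a minimal sketch, not part of the repository: the helper names manifest_version and out_of_sync and the sample version 4.0.0 are hypothetical, and the line scanner is a deliberate simplification that only understands a plain version = "..." key in the [package] or [project] table (on Python 3.11+ the standard tomllib module would be a more robust parser).

```python
import re
from typing import Dict, Optional

# Tables whose `version` key holds the package version:
# [package] in Cargo.toml, [project] in pyproject.toml.
VERSION_TABLES = {"package", "project"}


def manifest_version(toml_text: str) -> Optional[str]:
    """Return the version declared in the [package] or [project] table, if any."""
    table = None
    for raw in toml_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments (simplification)
        if line.startswith("["):
            # A new table header; remember its name without the brackets.
            table = line.strip("[]").strip()
            continue
        if table in VERSION_TABLES:
            found = re.fullmatch(r'version\s*=\s*"([^"]+)"', line)
            if found:
                return found.group(1)
    return None


def out_of_sync(manifests: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Return {} when every manifest declares the same version,
    otherwise a mapping of manifest path to the version found (or None)."""
    versions = {path: manifest_version(text) for path, text in manifests.items()}
    found = set(versions.values())
    if len(found) <= 1 and None not in found:
        return {}
    return versions


if __name__ == "__main__":
    # Hypothetical manifest contents; a real check would read the files from disk.
    sample = {
        "Cargo.toml": '[package]\nname = "robin-sparkless"\nversion = "4.0.0"\n',
        "python/pyproject.toml": '[project]\nname = "sparkless"\nversion = "4.0.0"\n',
    }
    print(out_of_sync(sample) or "versions are in sync")
```

Returning the full path-to-version mapping on a mismatch, rather than a boolean, makes it obvious which manifest was missed during the bump.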

Prerequisites

  • The repository must have a GitHub Actions secret named CARGO_REGISTRY_TOKEN set to a crates.io API token.
  • The repository must have a GitHub Actions secret named PYPI_API_TOKEN set to a PyPI API token for the sparkless project.
  • Create a token at crates.io/settings/tokens (requires a crates.io account). Store it as a repo secret in GitHub under Settings → Secrets and variables → Actions.
  • Create a PyPI token at pypi.org/manage/account/token and store it as PYPI_API_TOKEN under the same GitHub settings.

Release steps

  1. Bump the version in all Rust and Python manifests so they stay in sync:
    • Cargo.toml (root, robin-sparkless)
    • crates/robin-sparkless-core/Cargo.toml
    • crates/robin-sparkless-polars/Cargo.toml
    • crates/spark-sql-parser/Cargo.toml
    • python/pyproject.toml (Python package metadata)
    • python/Cargo.toml (native extension crate)

Use the same version for the three robin-sparkless crates and the Python package (e.g. 4.x.y). spark-sql-parser can use the same version or its own (e.g. 0.x.y). Update the version in root and the crates; if you publish the subcrates, also update the dependency version in root (e.g. robin-sparkless-core = { version = "4", path = "..." }, robin-sparkless-polars = { version = "4", path = "..." }). Commit and push to main.

  2. Create and push a tag matching the version with a v prefix:

git tag vX.Y.Z
git push origin vX.Y.Z

  3. Release workflow runs automatically on the tag push (see .github/workflows/release.yml):
    • Rust checks:
      • Format check, Clippy (workspace), cargo audit, cargo deny
      • Build and tests for the whole workspace
      • cargo doc --workspace
    • Crates.io publish (Rust) in dependency order:
      1. spark-sql-parser
      2. robin-sparkless-core
      3. robin-sparkless-polars
      4. robin-sparkless
    • Python checks:
      • Mypy over the Python sources
      • Build per-OS wheels for the native extension (Ubuntu, macOS, Windows)
      • Python smoke tests + fast pytest subset against the built wheel across a Python version/OS matrix
    • PyPI publish (Python):
      • sparkless wheels and sdist are published to PyPI for:
        • manylinux x86_64 + sdist
        • manylinux aarch64
        • musllinux x86_64
        • musllinux aarch64
        • macOS (x86_64-apple-darwin, aarch64-apple-darwin)
        • Windows (x86_64, aarch64)
  4. Verify:
    • Rust:
    • Python:
      • pypi.org/project/sparkless
      • pip install sparkless==X.Y.Z in a clean virtualenv and a quick from sparkless.sql import SparkSession smoke test.
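
The Python verification step could be scripted roughly as follows, run inside the clean virtualenv after pip install sparkless==X.Y.Z. This is a sketch under stated assumptions: the function names installed_version and verify_release are hypothetical, and only the distribution name sparkless and the import path sparkless.sql come from this document.

```python
from importlib import metadata
from typing import Optional

DIST_NAME = "sparkless"  # PyPI distribution name used in this document


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def verify_release(expected: str) -> bool:
    """True when sparkless is installed at `expected` and the smoke import works."""
    if installed_version(DIST_NAME) != expected:
        return False
    try:
        # Same import as the smoke test described in the release steps.
        from sparkless.sql import SparkSession  # noqa: F401
    except ImportError:
        return False
    return True
```

Checking the installed version before importing catches the case where pip resolved an older cached wheel instead of the release just published.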

Version policy

  • Tags must match the version in the root Cargo.toml (e.g. tag vX.Y.Z only when the root crate and both subcrates, robin-sparkless-core and robin-sparkless-polars, have version = "X.Y.Z").
  • Do not re-tag or overwrite tags; crates.io does not allow republishing the same version.
  • The three robin-sparkless crates are published with the same version number so that the main crate can depend on robin-sparkless-core = "4" and robin-sparkless-polars = "4" and resolve to the matching v4 release. spark-sql-parser may use a separate version (e.g. 0.x.y).
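
The first two rules can be checked mechanically before pushing a tag. The sketch below is illustrative only: version_from_tag and tag_is_releasable are hypothetical names, and it assumes tags are always plain vX.Y.Z with no pre-release suffix, as in the examples in this document.

```python
import re
from typing import Iterable, Optional

# Assumption: release tags are exactly "v" followed by MAJOR.MINOR.PATCH.
TAG_PATTERN = re.compile(r"v(\d+)\.(\d+)\.(\d+)")


def version_from_tag(tag: str) -> Optional[str]:
    """Strip the v prefix from a well-formed tag; return None for anything else."""
    match = TAG_PATTERN.fullmatch(tag)
    return ".".join(match.groups()) if match else None


def tag_is_releasable(tag: str, root_version: str, existing_tags: Iterable[str]) -> bool:
    """A tag is releasable when it matches the root Cargo.toml version
    and has not been used before (tags are never re-pushed or overwritten)."""
    return version_from_tag(tag) == root_version and tag not in set(existing_tags)
```

In practice existing_tags could come from the output of git tag --list, and root_version from the root Cargo.toml.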

Manual publish (optional)

If you need to publish from the repo without using the tag workflow (e.g. a one-off fix for one crate), use the same order so dependencies exist on crates.io:

cargo publish -p spark-sql-parser --token <CRATES_IO_TOKEN>
cargo publish -p robin-sparkless-core --token <CRATES_IO_TOKEN>
cargo publish -p robin-sparkless-polars --token <CRATES_IO_TOKEN>
cargo publish -p robin-sparkless --token <CRATES_IO_TOKEN>

Ensure the version in each crate’s Cargo.toml is bumped and that dependency version fields in root match the versions you are publishing.
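
The publish order above is a topological order of the workspace dependency graph, which can be sketched with Python's standard graphlib module. The edges in DEPENDS_ON are assumptions made for illustration: this document only states that robin-sparkless depends on robin-sparkless-core and robin-sparkless-polars, so the remaining edges should be checked against the real Cargo.toml files. The function names are hypothetical as well.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# crate -> crates it depends on. Only the core and polars dependencies of
# robin-sparkless are stated in this document; the other edges are assumed.
DEPENDS_ON: Dict[str, Set[str]] = {
    "spark-sql-parser": set(),
    "robin-sparkless-core": set(),
    "robin-sparkless-polars": {"robin-sparkless-core"},  # assumption
    "robin-sparkless": {
        "robin-sparkless-core",
        "robin-sparkless-polars",
        "spark-sql-parser",  # assumption
    },
}


def publish_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Return one valid order in which every crate follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def respects_dependencies(order: List[str], deps: Dict[str, Set[str]]) -> bool:
    """Check that each crate appears after everything it depends on."""
    position = {crate: index for index, crate in enumerate(order)}
    return all(
        position[dep] < position[crate]
        for crate, needed in deps.items()
        for dep in needed
    )


def publish_commands(order: List[str]) -> List[str]:
    """Render one cargo publish command per crate, in the given order."""
    return [f"cargo publish -p {crate}" for crate in order]
```

A graph like this can have several valid orders; the one listed in this section is one of them, and respects_dependencies can confirm any hand-written order before publishing.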

Notes on the Python package

  • The Python package sparkless (v4+) is published from this repository’s python/ directory.
  • Wheels are built using maturin with a pyo3-based native extension crate (sparkless-native) that links against robin-sparkless.
  • The package exposes SparkSession through from sparkless.sql import SparkSession and uses a private sparkless._native module for the Rust bindings. The legacy sparkless_robin module name is no longer used.