Robin Sparkless
PySpark-style DataFrames in Rust—no JVM. A DataFrame library that mirrors PySpark's API and semantics while using Polars as the execution engine. The same engine powers Sparkless v4 for Python: a drop-in PySpark replacement with no JVM and no Polars Python at runtime.
Quick links
Rust
- User guide — Learn how to use Robin Sparkless (Rust)
- Quickstart — Build, install, and basic usage (Rust)
- Persistence guide — Global temp views and disk-backed saveAsTable
- PySpark differences — Known divergences and caveats
- Roadmap — Development phases and Sparkless integration
Python (Sparkless v4)
- Getting started (Python) — Installation, quick start, core features, testing
- Testing Guide — Dual-mode testing with `sparkless.testing`
- Package README — Why Sparkless v4, Sparkless 3 vs 4.x, API overview, backend
- Migration (PySpark / Sparkless 3) — Switching from PySpark or Sparkless 3.x
- PySpark differences — Same reference applies to Python usage
What is Robin Sparkless?
Robin Sparkless provides a PySpark-like API in Rust so you can write familiar DataFrame code without the JVM. It is designed to power Sparkless—the Python PySpark drop-in replacement—as its execution backend via PyO3.
| Feature | Description |
|---|---|
| Core | SparkSession, DataFrame; lazy by default. ExprIr (engine-agnostic): root col/lit_*/gt/… → filter_expr_ir, collect_rows, agg_expr_ir. Column/Expr (Polars): prelude/functions → filter, with_column, full expression set. Plus groupBy, joins |
| Engine | Polars for fast, native execution |
| Optional | SQL (spark.sql, temp views, global temp views, saveAsTable in-memory or warehouse), Delta Lake (read_delta / write_delta) |
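The "lazy by default" and "engine-agnostic expression" ideas in the table can be illustrated with a small self-contained sketch. This is a toy model only, not the Robin Sparkless API: the `Expr` enum, the `LazyFrame` struct, the `make_row` helper, and the bodies of `col`, `lit_i64`, `gt`, `filter`, and `collect_rows` are all hypothetical stand-ins that borrow names from the table for readability. It assumes integer-only columns and a single comparison operator.

```rust
use std::collections::HashMap;

// Hypothetical toy expression tree. It does not reference any engine,
// which is the sense in which it is "engine-agnostic".
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Col(String),
    LitI64(i64),
    Gt(Box<Expr>, Box<Expr>),
}

// Hypothetical root-level constructors, named after the table's col/lit_*/gt.
fn col(name: &str) -> Expr {
    Expr::Col(name.to_string())
}
fn lit_i64(value: i64) -> Expr {
    Expr::LitI64(value)
}
fn gt(left: Expr, right: Expr) -> Expr {
    Expr::Gt(Box::new(left), Box::new(right))
}

// A row is a map from column name to an integer value (assumption: i64 only).
type Row = HashMap<String, i64>;

// Hypothetical helper that builds a two-column row.
fn make_row(id: i64, age: i64) -> Row {
    let mut row = Row::new();
    row.insert("id".to_string(), id);
    row.insert("age".to_string(), age);
    row
}

// Toy lazy frame: `filter` only records the expression;
// no row is inspected until `collect_rows` is called.
struct LazyFrame {
    rows: Vec<Row>,
    filters: Vec<Expr>,
}

impl LazyFrame {
    fn new(rows: Vec<Row>) -> Self {
        LazyFrame { rows, filters: Vec::new() }
    }

    // Record a predicate without evaluating it.
    fn filter(mut self, predicate: Expr) -> Self {
        self.filters.push(predicate);
        self
    }

    // Evaluate every recorded predicate and return the matching rows, in order.
    fn collect_rows(&self) -> Vec<Row> {
        self.rows
            .iter()
            .filter(|row| self.filters.iter().all(|f| eval_bool(f, row)))
            .cloned()
            .collect()
    }
}

// Evaluate an expression that should yield an integer; None if it cannot.
fn eval_i64(expr: &Expr, row: &Row) -> Option<i64> {
    match expr {
        Expr::Col(name) => row.get(name).copied(),
        Expr::LitI64(value) => Some(*value),
        Expr::Gt(_, _) => None,
    }
}

// Evaluate a predicate; anything that is not a valid comparison counts as false.
fn eval_bool(expr: &Expr, row: &Row) -> bool {
    match expr {
        Expr::Gt(left, right) => match (eval_i64(left, row), eval_i64(right, row)) {
            (Some(a), Some(b)) => a > b,
            _ => false,
        },
        _ => false,
    }
}
```

In this sketch, `LazyFrame::new(rows).filter(gt(col("age"), lit_i64(30)))` only stores the predicate; the rows are scanned when `collect_rows()` runs. A real engine such as Polars would translate the recorded expression into its own plan rather than interpret it row by row.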
Documentation
Rust
- Getting started — Quickstart, Persistence guide, Embedding (bindings / FFI), Releasing
- Reference — PySpark differences, Parity status, Robin-Sparkless missing
- Testing — Run `make check` for Rust checks and tests; `make test-parity-phase-X` for phase-specific parity. See QUICKSTART and TEST_CREATION_GUIDE.
- Sparkless integration — Integration analysis, Full backend roadmap, Logical plan format
- Development — Roadmap, Test creation guide, Converter status, Bugs and improvements plan
Python (Sparkless v4) — mirrors Sparkless doc structure
- Getting started — Python getting started (installation, quick start, DataFrame/SQL, testing, lazy evaluation)
- Testing — Testing Guide (`sparkless.testing` module: dual-mode testing, fixtures, markers, DataFrame comparison)
- API / reference — Package README (API overview), PySpark differences
- Guides — Migration from PySpark / Sparkless 3
- Additional — Parity status (fixture coverage)
For the full list of documents, see the Doc index in the navigation.
Rust API
- docs.rs/robin-sparkless — Crate API reference
License
MIT