Robin Sparkless Documentation
This page is the legacy documentation index. The full documentation is built with MkDocs and published on Read the Docs. To build it locally, run `pip install -r docs/requirements.txt`, then `mkdocs serve`.
Python (Sparkless v4)
Documentation for the Python package mirrors the structure of the Sparkless docs on Read the Docs:
| Document | Description |
|---|---|
| Python getting started | Installation, quick start, core features (DataFrame, SQL, windows), testing, lazy evaluation, next steps |
| Testing Guide | sparkless.testing module: dual-mode testing (sparkless + PySpark), fixtures, markers, DataFrame comparison utilities, CI configuration |
| Package README | Why Sparkless v4, Sparkless 3 vs 4.x, installation, API overview, backend, known limitations |
| Migration (PySpark / Sparkless 3) | Switching from PySpark or Sparkless 3.x; Sparkless 3 vs 4.x table |
| PySpark differences | Known divergences and caveats (applies to both Rust and Python usage) |
| PySpark 4 parity plan | Roadmap for PySpark 4 parity while keeping PySpark 3.2–3.5 compatibility |
Rust and general
| Document | Description |
|---|---|
| QUICKSTART | Build, install, basic usage, optional features (SQL, Delta, JDBC), troubleshooting, benchmarks |
| JDBC_TESTING | JDBC setup for all backends (PostgreSQL, SQLite, MySQL, MariaDB, SQL Server, Oracle, DB2), Docker Compose, env vars, PySpark-compatible options |
| EMBEDDING | Embedding and bindings: prelude::embed, *_engine() API, schema helpers, traits; minimal FFI surface |
| ROADMAP | Development roadmap and Sparkless integration phases |
| RELEASING | How to cut a release (version bump, tag, crates.io publish) |
| CHANGELOG | Version history and release notes |
| PARITY_STATUS | PySpark parity coverage matrix (159 fixtures; 3 plan fixtures; Phases 12–25 + signature alignment + gap closure) |
| PYSPARK_DIFFERENCES | Known divergences from PySpark (window, SQL, Delta, rand/randn semantics; DataFrame cube/rollup/write/saveAsTable and stubs; in-memory tables and catalog; Phase 8 + gap closure) |
| ROBIN_SPARKLESS_MISSING | Features that Sparkless has and robin-sparkless lacks (or provides only as stubs); XML/XPath/sentences deferred |
| SIGNATURE_GAP_ANALYSIS | PySpark vs robin-sparkless signature gap analysis (params, types, defaults) and recommendations |
| SIGNATURE_ALIGNMENT_TASKS | Checklist to align Python param names to PySpark (historical, for the previous Python bindings) |
| CONVERTER_STATUS | Sparkless → robin-sparkless fixture converter |
| SPARKLESS_PARITY_STATUS | Phase 5: pass/fail and failure reasons for converted fixtures |
| FULL_BACKEND_ROADMAP | Phased plan to full Sparkless backend replacement (Phases 12–25 + gap closure; ~295+ functions, 159 fixtures, plan interpreter; Phase 26 crate publish, Phase 27 Sparkless integration) |
| GAP_ANALYSIS_SPARKLESS_3.28 | Full gap analysis vs Sparkless 3.28.0 (installed API comparison) |
| PARITY_CHECK_SPARKLESS_3.28 | Parity double-check: implemented vs. remaining gaps (Feb 2026) |
| PHASE15_GAP_LIST | Function gap list (PYSPARK_FUNCTION_MATRIX vs robin-sparkless) |
| SPARKLESS_INTEGRATION_ANALYSIS | Sparkless backend replacement strategy, architecture, test conversion |
| SPARKLESS_REFACTOR_PLAN | Refactor plan for Sparkless (serializable logical plan) to prepare for robin backend |
| READINESS_FOR_SPARKLESS_PLAN | What robin-sparkless can do in parallel (plan interpreter, fixtures, API) before merge |
| LOGICAL_PLAN_FORMAT | Backend plan format (op list + payload shapes + expression tree) consumed by execute_plan; full expression support (all scalar functions in filter/select/withColumn) |
| TEST_CREATION_GUIDE | How to add parity tests and convert Sparkless fixtures |
| IMPLEMENTATION_STATUS | Polars migration status, build & test status |