Robin Sparkless Documentation

This page is the legacy doc index. The full documentation is built with MkDocs and published on Read the Docs. To build it locally, run pip install -r docs/requirements.txt, then mkdocs serve.

Python (Sparkless v4)

Documentation for the Python package mirrors the structure of the Sparkless documentation on Read the Docs:

| Document | Description |
| --- | --- |
| Python getting started | Installation, quick start, core features (DataFrame, SQL, windows), testing, lazy evaluation, next steps |
| Testing Guide | `sparkless.testing` module: dual-mode testing (sparkless + PySpark), fixtures, markers, DataFrame comparison utilities, CI configuration |
| Package README | Why Sparkless v4, Sparkless 3 vs 4.x, installation, API overview, backend, known limitations |
| Migration (PySpark / Sparkless 3) | Switching from PySpark or Sparkless 3.x; Sparkless 3 vs 4.x table |
| PySpark differences | Known divergences and caveats (applies to both Rust and Python usage) |
| PySpark 4 parity plan | Roadmap for PySpark 4 parity while keeping PySpark 3.2–3.5 compatibility |

Rust and general

| Document | Description |
| --- | --- |
| QUICKSTART | Build, install, basic usage, optional features (SQL, Delta, JDBC), troubleshooting, benchmarks |
| JDBC_TESTING | JDBC setup for all backends (PostgreSQL, SQLite, MySQL, MariaDB, SQL Server, Oracle, DB2), Docker Compose, env vars, PySpark-compatible options |
| EMBEDDING | Embedding and bindings: `prelude::embed`, `*_engine()` API, schema helpers, traits; minimal FFI surface |
| ROADMAP | Development roadmap and Sparkless integration phases |
| RELEASING | How to cut a release (version bump, tag, crates.io publish) |
| CHANGELOG | Version history and release notes |
| PARITY_STATUS | PySpark parity coverage matrix (159 fixtures; 3 plan fixtures; Phases 12–25 + signature alignment + gap closure) |
| PYSPARK_DIFFERENCES | Known divergences from PySpark (window, SQL, Delta, rand/randn semantics; DataFrame cube/rollup/write/saveAsTable and stubs; in-memory tables and catalog; Phase 8 + gap closure) |
| ROBIN_SPARKLESS_MISSING | What Sparkless has and robin-sparkless does not (or stub only); XML/XPath/sentences deferred |
| SIGNATURE_GAP_ANALYSIS | PySpark vs robin-sparkless signature gap analysis (params, types, defaults) and recommendations |
| SIGNATURE_ALIGNMENT_TASKS | Checklist to align Python param names to PySpark (historical, for the previous Python bindings) |
| CONVERTER_STATUS | Sparkless → robin-sparkless fixture converter |
| SPARKLESS_PARITY_STATUS | Phase 5: pass/fail and failure reasons for converted fixtures |
| FULL_BACKEND_ROADMAP | Phased plan to full Sparkless backend replacement (Phases 12–25 + gap closure; ~295+ functions, 159 fixtures, plan interpreter; Phase 26 crate publish, Phase 27 Sparkless integration) |
| GAP_ANALYSIS_SPARKLESS_3.28 | Full gap analysis vs Sparkless 3.28.0 (installed API comparison) |
| PARITY_CHECK_SPARKLESS_3.28 | Double-check parity: implemented vs gap (Feb 2026) |
| PHASE15_GAP_LIST | Function gap list (PYSPARK_FUNCTION_MATRIX vs robin-sparkless) |
| SPARKLESS_INTEGRATION_ANALYSIS | Sparkless backend replacement strategy, architecture, test conversion |
| SPARKLESS_REFACTOR_PLAN | Refactor plan for Sparkless (serializable logical plan) to prepare for robin backend |
| READINESS_FOR_SPARKLESS_PLAN | What robin-sparkless can do in parallel (plan interpreter, fixtures, API) before merge |
| LOGICAL_PLAN_FORMAT | Backend plan format (op list + payload shapes + expression tree) consumed by `execute_plan`; full expression support (all scalar functions in filter/select/withColumn) |
| TEST_CREATION_GUIDE | How to add parity tests and convert Sparkless fixtures |
| IMPLEMENTATION_STATUS | Polars migration status, build & test status |