
NeurIPS 2026 — submission archive

The immutable record of the Sutra paper as submitted to NeurIPS 2026. Downloads:

Paper (PDF)

The Sutra language paper — author-attributed, full version.


Paper (PDF, anonymized)

The double-blind review version — author identity stripped, same content.


Reproduction archive (ZIP)

Compiler source, tests, paper-claim reproduction scripts, and the agent-runnable replication recipe.


This page links to the Sutra paper exactly as submitted to NeurIPS 2026. Use these artifacts when you want the frozen record of what was actually submitted, not the live revision-in-progress.

The permanent, immutable record is the git commit ea6f8a01 below — history never changes, so that snapshot is the canonical “what the reviewers saw.” The working copy under paper/neurips/ tracks it but may carry small factual-correction errata over time (e.g. a license or contact-detail fix); substantive errata are listed at the bottom of this page. The live paper/paper.md is the separate next-venue revision target.

The exact commit the NeurIPS submission was based on

The paper, supplementary, and all reproduction scripts were frozen at the repository state recorded by commit ea6f8a01 on 2026-05-07:

ea6f8a01 — CLAUDE.md: freeze the entire paper, not just title + abstract

The paper/paper.md content immediately before that commit was unchanged for several days; the last edit to the paper text itself was ed56d8e3 (“drop tensor normal form — undefined term”). Either commit is a valid anchor for “what the NeurIPS reviewers actually saw,” with ea6f8a01 being the canonical pin because that’s where the freeze was declared.

You can browse the entire repository at that exact state via:

https://github.com/EmmaLeonhart/Sutra/tree/ea6f8a01

This is the canonical reference for verifying that any later repository change has or has not affected the NeurIPS-cited paths. It’s worth keeping this anchor visible because the project is vibe-coded — surface changes between sessions can break paper reproducibility silently, and the right way to check is to diff the current state against this commit.
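The diff-against-the-pin check described above can be scripted. The following is a minimal sketch, not tooling shipped with the repository: the helper names `diff_command` and `changed_files` are hypothetical, and it assumes `git` is on the PATH and that it runs inside a clone containing commit ea6f8a01.

```python
import subprocess

FROZEN_COMMIT = "ea6f8a01"  # the submission pin named on this page


def diff_command(paths, base=FROZEN_COMMIT, head="HEAD"):
    """Build the git command that lists files changed between base and head.

    Hypothetical helper for illustration; it only assembles the argument list.
    """
    return ["git", "diff", "--name-status", base, head, "--", *paths]


def changed_files(paths, repo_dir="."):
    """Run the diff inside repo_dir and return one 'status<TAB>path' line per change."""
    result = subprocess.run(
        diff_command(paths),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git itself fails (e.g. unknown commit)
    )
    return [line for line in result.stdout.splitlines() if line]
```

Calling `changed_files(["paper/neurips/"])` from a clone returns an empty list when nothing under that path differs from the pinned commit. Any other paths you pass are your own choice of what counts as NeurIPS-cited.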

Downloads

The PDFs and the supplementary zip are built by the paper-pdf.yml GitHub Actions workflow on every change to paper/neurips/ and on manual dispatch. Download the latest from the workflow’s paper-neurips-frozen artifact:

Latest workflow runs →

The artifact bundle contains:

File	What it is
`paper-neurips-named.pdf`	Author-named PDF: the author-attributed, full version of the submitted paper
`paper-neurips-anonymized.pdf`	Double-blind reviewer version (`Anonymous Authors`, line numbers, same content)
`sutra-neurips-supplementary.zip`	The reproduction archive uploaded to OpenReview

Source files (browsable)

File	Link
Paper source markdown	`paper/neurips/paper.md`
LaTeX wrapper	`paper/neurips/paper.tex`
NeurIPS style file	`paper/neurips/neurips_2026.sty`
Supplementary docs	`paper/neurips/supplementary/`
In-repo README	`paper/neurips/README.md`

The supplementary archive itself is not committed; it is a build artifact regenerated by scripts/build_supplementary_zip.py. Reviewers verifying the archive byte-for-byte should re-run the script against the repo state at the submission commit ea6f8a01.
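A byte-for-byte comparison of a rebuilt archive against a downloaded one can be done by hashing both files. This is a minimal sketch with hypothetical helper names (`sha256_of`, `same_bytes`). This page does not say whether the build script produces deterministic output, so a mismatch may reflect zip metadata such as timestamps rather than a content change.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bytes(path_a, path_b):
    """True when both files have identical contents (compared by digest)."""
    return sha256_of(path_a) == sha256_of(path_b)
```

For example, `same_bytes("rebuilt.zip", "sutra-neurips-supplementary.zip")` compares a local rebuild (the file name `rebuilt.zip` is a placeholder) with the archive from the workflow artifact.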

What’s in the supplementary archive (preview)

File	Purpose
`README.md`	Reviewer-facing overview of the archive’s layout
`SKILL.md`	Agent-runnable replication recipe (the entry point for AI reviewers)
`REPRODUCE.md`	Human-facing step-by-step reproduction instructions
`SYNTAX.md`	Surface-syntax reference for the `.su` programs cited in the paper
`sdk/sutra-compiler/`	Python compiler (lexer, parser, codegen, runtime)
`examples/`	The `.su` programs and `_smoke_test.py` driver §5 cites
`experiments/`	The §3 reproduction scripts plus reference output JSONs
`sutraDB/`	Rust FFI for the embedded triplestore tests

Why this archive exists

NeurIPS does not accept post-deadline edits. The repository’s main paper/paper.md may evolve toward future revisions, but the version that was actually submitted to NeurIPS 2026 is preserved here as a permanent snapshot for citation, audit, and reviewer-trail purposes.

In-repo notes:

- paper/neurips/README.md: what’s in the archive and why
- paper-pdf.yml: CI that builds the PDFs

Errata

Errata to the submission archive. None of these change the empirical results or the paper’s headline claims; they’re corrections to descriptions in the supplementary material plus one known spec-vs-code drift discovered after submission.

`map<K, V>` vs `dict<K, V>` description (supplementary SYNTAX.md)

Discovered: 2026-05-09 (commit 061eb974).

The supplementary’s SYNTAX.md § “Maps and codebooks” described map<K, V> as “a literal rotation-hashmap” with “single unbind” lookup. The compiler actually emits a static codebook with cosine-argmax retrieval for map<K, V>; the real rotation-hashmap lives under a separate dict<K, V> type that the original SYNTAX.md did not document.
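To make "static codebook with cosine-argmax retrieval" concrete, here is a generic sketch of that retrieval style. It is not the Sutra compiler's emitted code: the function names, the three-dimensional vectors, and the colour labels are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def codebook_lookup(codebook, query):
    """Return the label whose stored vector has the highest cosine similarity to query."""
    return max(codebook, key=lambda label: cosine(codebook[label], query))


# A fixed (static) codebook: every entry is known up front and never rebound.
codebook = {
    "red": [1.0, 0.0, 0.0],
    "green": [0.0, 1.0, 0.0],
    "blue": [0.0, 0.0, 1.0],
}
noisy_query = [0.1, 0.9, 0.2]
nearest = codebook_lookup(codebook, noisy_query)  # "green"
```

The point of the contrast is that retrieval here scans every stored entry and picks the best match, whereas a single-unbind lookup recovers the value with one inverse operation.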

What this means for reproducibility: all 27 example programs the paper cites still pass the supplementary smoke test, including the ones that use map<K, V> for codebooks. The reproduction is unaffected — only the description of what map<K, V> does was imprecise. The live paper/supplementary/SYNTAX.md (outside the frozen archive) now matches the compiler. The frozen paper/neurips/supplementary/SYNTAX.md retains the original imprecise description for the historical record.

Host-Python-string emission at non-codebook string boundaries

Discovered: 2026-05-10 (commit 895e7a78), fix committed same day.

After the freeze, a second string model (codepoint-array via make_string + AXIS_STRING_FLAG) was added to the spec (2026-05-08 strings.md). The new model wasn’t wired into the codegen’s literal-emission path, so string literals at string-typed function parameters / variable declarations were flowing as host Python str values rather than substrate-encoded Strings.

What this means for reproducibility: the NeurIPS paper describes a codebook + nearest-string string model (basis-vector embeddings + _vector_map_lookup at the boundary). That model was correct throughout and is what hello_world.su (the supplementary smoke check) uses. The bug was on a parallel code path that the paper does not describe and does not depend on. examples/hello_world.su still prints exactly "hello world" post-fix.

License: MIT / Apache-2.0 → AGPL-3.0-only (supplementary README)

Applied: 2026-06-18.

The supplementary README.md § “License” stated the compiler/example sources were MIT and SutraDB (the sutraDB/ Rust crates) was Apache-2.0. The project’s top-level LICENSE was already AGPLv3, and on 2026-06-18 all first-party components were relicensed to AGPL-3.0-only for consistency. This is a factual licensing correction, not a change to any result or claim. With the paper/neurips/ edit-freeze retired the same day, the archived supplementary README.md was updated in place to state AGPL-3.0-only (rather than leaving a now-false license line for the record).

What we know is NOT broken

The four-substrate width-k experiment (§5.1 / Table 1 / Figure 2) — runs unchanged against the frozen reproduction scripts.
The trainable-from-random-init fuzzy-rule program (§5.2 / Figure 4) — runs unchanged.
The single-cycle bind / unbind round-trip precision claim (≈ 1.5e-15) — unchanged.
The 27-program smoke test driver — green at master HEAD as of this errata page’s last update.

Further errata will be added to this page in chronological order as anything else surfaces.

Reviews from clawRxiv (separate from NeurIPS)

Sutra has been auto-submitted to clawRxiv (an AI peer-review platform) on every paper revision. The archived AI reviews live under paper/reviews/ in the repository. These are not part of the NeurIPS submission record — they’re an orthogonal feedback channel.