LeVCS: A Distributed Version Control System

Naming a version control system after yourself is a mild hubris. Linus has been candid that this is exactly what he did with git: “I’m an egotistical bastard, and I name all my projects after myself. First ‘Linux’, now ‘git’.” LeVCS (Levi’s VCS) admits the same joke up front and gets on with it. The work itself is the substrate: an object model, a federation API, a merge engine, and a small instance server, written in Rust, sized to fit on a VPS and auditable by one person in a weekend.

The first instance will be at levcs.levineuwirth.org; the source lives at git.levineuwirth.org/neuwirth/levcs and is mirrored by the same git infrastructure that hosts everything else. This page describes v0.1.0. The workflow surface (the PR review object, issue tracking, web UI, CI conventions) is intentionally not part of v0.1.0 and is the subject of the next document in the series.

This page describes what LeVCS is, why it diverges from git on five specific axes, and how to use and operate it today.


What It Is

LeVCS is a distributed version control system. It uses the same conceptual primitives as every other modern DVCS: content-addressed objects, a directed-acyclic-graph history, signed commits, three-way merge. The substrate is small: about ten Rust crates, 194 passing tests at v0.1.0, and a full cargo test run in under a minute on a laptop.

The short version of what comes with it:

The dependency list is short: a recent stable Rust toolchain (workspace MSRV is 1.75) and a C compiler for the tree-sitter grammars. No database server, no message broker, no external service.


Why a New VCS?

Git is the dominant DVCS, and there is no good case for replacing it on the strength of taste alone. The case for replacing it rests on five specific places where its 2005 design has aged poorly enough that bolt-on solutions have stopped paying their freight:

LeVCS is an attempt at a clean restart that takes the DAG model and content addressing as obvious wins, and rebuilds identity, federation, merging, hashing, and releases as protocol-level concerns rather than conventions or sidecar tools.


The Shape of the System

LeVCS is layered. Each layer has a clean interface to the one below it:

┌───────────────────────────────────────────────────────────┐
│ Workflow tools (TBD: review, issues, web UI)              │
├───────────────────────────────────────────────────────────┤
│ CLI: `levcs init / commit / push / merge / release`       │
├───────────────────────────────────────────────────────────┤
│ Federation HTTP API (instances, mirrors, releases)        │
├───────────────────────────────────────────────────────────┤
│ Object model: Blob / Tree / Commit / Release / Authority  │
│ Merge engine: textual → format-aware → tree-sitter        │
│ Trust root: signed authority chain (Ed25519)              │
│ Content addressing: BLAKE3                                │
└───────────────────────────────────────────────────────────┘

There are five object kinds, all content-addressed by their BLAKE3 digest: Blob, Tree, Commit, Release, and Authority.

A repository is the set of these objects plus a refs/ map (branches/*, releases/*, authority/{genesis,current}) indexing into them. The repo_id is the BLAKE3 of the genesis authority — globally unique by construction, no central registrar needed.
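The relationship between objects, refs, and repo_id can be sketched in a few lines. This is an illustration rather than the code in levcs-core: the Repo type and its methods are hypothetical, and the standard library's 64-bit hasher stands in for BLAKE3 so the sketch has no dependencies.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeMap, HashMap};
use std::hash::{Hash, Hasher};

/// Stand-in object ID. LeVCS uses a 32-byte BLAKE3 digest; this sketch
/// uses the std hasher only to stay dependency-free (assumption).
type ObjectId = u64;

fn object_id(bytes: &[u8]) -> ObjectId {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

/// Hypothetical in-memory repository: an object store plus a refs map.
#[derive(Default)]
struct Repo {
    objects: HashMap<ObjectId, Vec<u8>>,
    refs: BTreeMap<String, ObjectId>,
}

impl Repo {
    /// Store an object under the hash of its content and return that ID.
    fn put(&mut self, bytes: &[u8]) -> ObjectId {
        let id = object_id(bytes);
        self.objects.insert(id, bytes.to_vec());
        id
    }

    /// Create a repository from serialized genesis authority bytes.
    /// The repo_id is simply the ID of that genesis object.
    fn init(genesis: &[u8]) -> (Repo, ObjectId) {
        let mut repo = Repo::default();
        let repo_id = repo.put(genesis);
        repo.refs.insert("authority/genesis".to_string(), repo_id);
        repo.refs.insert("authority/current".to_string(), repo_id);
        (repo, repo_id)
    }
}
```

Two repositories initialized from the same genesis bytes get the same repo_id without consulting any registrar, which is the property the federation layer relies on.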


What’s Different from Git

Axis                git                                   LeVCS
Hash                SHA-1 (deprecated, transitioning)     BLAKE3
Identity            Author string in commit               Signed authority object with explicit roles
Push authorization  Server-side hook or hosting platform  Protocol-level role check
Force-push rule     Server policy (off-protocol)          Protocol enforces maintainer-or-owner role
Federation          URL-bound remotes                     Global repo_id + replicating instances
Mirror replication  git fetch --mirror (best-effort)      First-class with three storage modes
Tags / releases     Mutable string refs (often)           Signed objects with predecessor + parent-release chain
Merge granularity   Line-level (myers / patience)         Cascade: textual → format → tree-sitter → plugin
Merge audit         No artifact                           .levcs/merge-record TOML, signed with the commit
Web UI / issues     Hosting platform                      Out of scope for v1

The rest of this section unpacks each axis worth unpacking.

Identity in the Protocol, Not on Top

Git stores Author: Name <email> and Committer: Name <email> strings in commits. There is nothing cryptographic about either. Even signed commits answer the wrong question — is some key behind this signature? — instead of the right one: is this signer currently authorized to write to this repository?

LeVCS makes membership a first-class object. An authority body has:

schema_version  repo_id  previous_authority  version  created_micros
members:        [(public_key, handle, role, added_micros, added_by), ...]
policy:         [(key, value), ...]

Roles form a strict order: Reader < Contributor < Maintainer < Owner. Every commit references the authority hash that was current when it was signed. Updating membership is a versioned operation: you write a new authority object, signed by an Owner, with previous_authority pointing at the prior one. The instance walks the chain on push and rejects any push whose author key isn’t a current member.
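A minimal sketch of the role check, assuming a simplified in-memory Authority. The type and method names are illustrative, not the levcs-core API. The force-push threshold follows the maintainer-or-owner rule from the comparison table; requiring Contributor for an ordinary push is my assumption for the sketch.

```rust
/// Derived ordering follows declaration order:
/// Reader < Contributor < Maintainer < Owner.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Role {
    Reader,
    Contributor,
    Maintainer,
    Owner,
}

struct Member {
    public_key: [u8; 32],
    handle: String,
    role: Role,
}

/// Simplified authority body: only the fields this check needs.
struct Authority {
    version: u64,
    members: Vec<Member>,
}

impl Authority {
    /// Role of the key in this authority, or None if it is not a member.
    fn role_of(&self, key: &[u8; 32]) -> Option<Role> {
        self.members
            .iter()
            .find(|m| &m.public_key == key)
            .map(|m| m.role)
    }

    /// Whether the key may push against this authority.
    /// Assumption: ordinary pushes need Contributor or higher.
    fn may_push(&self, key: &[u8; 32], force: bool) -> bool {
        let required = if force { Role::Maintainer } else { Role::Contributor };
        self.role_of(key).map_or(false, |role| role >= required)
    }
}
```

Because the ordering is derived from the enum, "at least Maintainer" is a single comparison, and a key that is absent from the current authority fails every check.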

The practical consequence is that “give Bob push access” is not a hosting-platform toggle. It is a signed authority update that travels in the repository and is auditable for the lifetime of the project. This is the design choice I care most about. The alternative, in which authorization is a dashboard somewhere, means the repository is not actually self-describing. You can’t tell, from the repository alone, who could have written this history. With a chained authority object you can.

Federation, Not “Remotes”

A git remote is a URL plus some credentials. There is no fact-of-the-matter about whether two URLs refer to the same repository — git checks by walking commits, but “same project” is a convention.

LeVCS has a global repo_id — the BLAKE3 of the genesis authority object, so two clones of the same project have the same repo_id even if they live on instances on opposite continents. An instance is a federation peer: it serves /levcs/v1/repos/<repo_id>/... endpoints and replicates state from other instances when configured to. Mirroring is the protocol’s normal mode, not a git fetch --mirror cron job.

This composes with three storage modes:

The instance enforces these on push. A release-mode replica refuses pushes that update branches; a metadata-mode replica refuses all pushes (it is populated entirely by mirroring). A migrating maintainer can move the source-of-truth role from one instance to another with levcs migrate, replaying the full history at the destination — the repo_id is unchanged, because the genesis authority is unchanged.
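The push-time rule can be sketched as a small function. Everything here is illustrative: the enum, the error type, and check_push are hypothetical names, and Full is my placeholder label for the mode that accepts all pushes.

```rust
/// Storage modes of a replica. `Full` is a placeholder name (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StorageMode {
    Full,
    Release,
    Metadata,
}

#[derive(Debug, PartialEq, Eq)]
enum PushError {
    /// A release-mode replica was asked to update a branch.
    BranchUpdateRefused,
    /// A metadata-mode replica is populated only by mirroring.
    ReadOnlyReplica,
}

/// Decide whether a push touching `updated_refs` is acceptable.
fn check_push(mode: StorageMode, updated_refs: &[&str]) -> Result<(), PushError> {
    match mode {
        StorageMode::Metadata => Err(PushError::ReadOnlyReplica),
        StorageMode::Release => {
            if updated_refs.iter().any(|r| r.starts_with("refs/branches/")) {
                Err(PushError::BranchUpdateRefused)
            } else {
                Ok(())
            }
        }
        StorageMode::Full => Ok(()),
    }
}
```

For instance, check_push(StorageMode::Release, &["refs/releases/v1.0.0"]) succeeds, while the same call with "refs/branches/main" is refused.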

The Merge Cascade

This is the technical centerpiece.

A traditional three-way merge (git, mercurial, fossil) works at the line level. It is correct for prose and acceptable for code, but it generates false conflicts on reformats (linters, prettifiers, whitespace-policy bumps), key reorderings in JSON / YAML / TOML, import lists in source files that two branches both edited, and Markdown files where two contributors modified disjoint sections of the same paragraph.

LeVCS dispatches per-file to a handler cascade ranked by aggressiveness:

rank 0  textual           universal line-level fallback
rank 1  format-aware      json | yaml | toml | xml | markdown | prose
rank 2  tree-sitter       rust | python | js | ts | go | c | cpp |
                          java | ruby | bash
rank 3  plugin            wasm-sandboxed, user-supplied

A repository’s .levcs/merge.toml maps glob patterns to handlers. Per-user .levcs/merge.local.toml can demote but never promote, so a distrusted plugin can be locally turned off without a repo edit. Each merged file produces a FileRecord in .levcs/merge-record listing the handler used and its hash; the merge-record blob is committed alongside the resolved tree, so every merge in history is auditable.
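The demote-but-never-promote rule reduces to taking the lower of two ranks. A minimal sketch, assuming the repository and local configs have already been resolved to a handler rank for a given file (the enum and function names are hypothetical):

```rust
/// Handler ranks from the cascade table; derived ordering follows rank.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Handler {
    Textual = 0,
    FormatAware = 1,
    TreeSitter = 2,
    Plugin = 3,
}

/// Combine the repository's choice with an optional local override.
/// A local entry can only lower the rank, so the result is the minimum.
fn effective_handler(repo: Handler, local: Option<Handler>) -> Handler {
    match local {
        Some(local) => repo.min(local),
        None => repo,
    }
}
```

A local entry asking for Plugin on a file the repository maps to FormatAware is ignored, while a local entry asking for Textual on a Plugin-mapped file takes effect.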

Two examples illustrate the practical difference:

Format-aware example. package.json where Alice adds a dependency at the top of dependencies and Bob adds one at the bottom. In a short dependencies block the two edits land on adjacent lines, so git’s line-level merge reports a conflict. The JSON handler parses both sides, computes the structural diff, and merges them: both new entries appear in the output, no conflict.
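The structural idea can be sketched as a three-way merge over key/value maps. This is not the JSON handler itself: it assumes the dependencies object has already been parsed into a flat map of name to version, and it uses a BTreeMap, so key order is sorted rather than preserved.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Parsed `dependencies` object: package name -> version string.
type Deps = BTreeMap<String, String>;

/// Three-way merge per key. Returns the merged map, or the list of keys
/// that both sides changed in different ways.
fn merge_maps(base: &Deps, ours: &Deps, theirs: &Deps) -> Result<Deps, Vec<String>> {
    let keys: BTreeSet<&String> = base
        .keys()
        .chain(ours.keys())
        .chain(theirs.keys())
        .collect();
    let mut merged = Deps::new();
    let mut conflicts = Vec::new();
    for key in keys {
        let (b, o, t) = (base.get(key), ours.get(key), theirs.get(key));
        let winner = if o == t {
            o // both sides agree, including "both deleted"
        } else if o == b {
            t // only theirs changed this key
        } else if t == b {
            o // only ours changed this key
        } else {
            conflicts.push(key.clone());
            continue;
        };
        if let Some(value) = winner {
            merged.insert(key.clone(), value.clone());
        }
    }
    if conflicts.is_empty() { Ok(merged) } else { Err(conflicts) }
}
```

Alice's and Bob's additions are different keys, so each falls into an "only one side changed" branch and both survive. A conflict remains only when both sides set the same key to different values.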

Tree-sitter example. Two contributors add unrelated use statements to a Rust file. Line diff conflicts. The tree-sitter handler treats the use_declaration list as an ordered set, merges both additions, no conflict.
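Treating the list as an ordered set can be sketched without any parser. The function below assumes the use declarations have already been extracted as strings. How the real handler orders the result is not specified here, so the "ours first, then theirs" order is my choice for the sketch.

```rust
/// Three-way merge of a list treated as an ordered set.
/// Items keep the order of `ours`, followed by items only `theirs` added.
/// An item present in `base` but removed by either side stays removed.
fn merge_use_lists(base: &[&str], ours: &[&str], theirs: &[&str]) -> Vec<String> {
    let mut merged: Vec<String> = Vec::new();
    for item in ours.iter().chain(theirs.iter()) {
        let deleted = base.contains(item)
            && !(ours.contains(item) && theirs.contains(item));
        let already_present = merged.iter().any(|m| m.as_str() == *item);
        if !deleted && !already_present {
            merged.push(item.to_string());
        }
    }
    merged
}
```

With a base of "use std::fmt;", one side adding "use std::io;" and the other adding "use std::collections::HashMap;", the result contains all three declarations and no conflict markers.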

The cascade is fail-safe. A tree-sitter handler that bails on a syntax error falls through to the format-aware handler if applicable, then to textual. The textual handler always merges — it might produce conflicts, but it never fails to produce some output. This matters for CI and for automated mirror sync: there is no merge that the engine simply refuses to attempt.
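The fall-through can be sketched as a loop over handlers that may decline, ending in a fallback that cannot. The types and function names are hypothetical, and the textual fallback here compares whole files rather than lines, which is far cruder than a real line-level merge but keeps the control flow visible.

```rust
#[derive(Debug, PartialEq)]
enum Merged {
    Clean(String),
    Conflicted(String),
}

/// A structured handler takes (base, ours, theirs) and may decline by
/// returning None, for example when it hits a syntax error.
type Handler = fn(&str, &str, &str) -> Option<Merged>;

/// Whole-file stand-in for the textual handler (simplification).
/// It always returns something, possibly with conflict markers.
fn textual(base: &str, ours: &str, theirs: &str) -> Merged {
    if ours == theirs || theirs == base {
        Merged::Clean(ours.to_string())
    } else if ours == base {
        Merged::Clean(theirs.to_string())
    } else {
        Merged::Conflicted(format!(
            "<<<<<<< ours\n{ours}\n=======\n{theirs}\n>>>>>>> theirs\n"
        ))
    }
}

/// Try handlers from most to least aggressive, then fall back to textual.
/// Returns the name of the handler that produced the result.
fn run_cascade(
    handlers: &[(&str, Handler)],
    base: &str,
    ours: &str,
    theirs: &str,
) -> (String, Merged) {
    for (name, handler) in handlers {
        if let Some(result) = handler(base, ours, theirs) {
            return (name.to_string(), result);
        }
    }
    ("textual".to_string(), textual(base, ours, theirs))
}
```

The returned handler name is the kind of information a merge record would carry: which handler produced the output, after any fall-through.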

Hashing

Git uses SHA-1. SHAttered (2017) was a practical collision, and the SHA-256 transition is incomplete in 2026. LeVCS uses BLAKE3 from day one — faster than SHA-256 in practice (~5 GiB/s on a laptop for blob serialize-plus-hash), tree-hashed, no commitment to a specific length-tag convention. Object IDs are 32 bytes everywhere, with no migration story to live through.

Releases as Objects

Git tags are refs that point to commits — or to tag objects, if you remember to use -a. Either way, they are names, not artifacts. A release in LeVCS is a signed object:

tree            commit's root tree
predecessor     commit being released
parent_release  prior release in the chain (or zero)
authority       authority hash at release time
declarer_key    public key of the signing maintainer/owner
timestamp       Unix micros
label           "v1.0.0" or similar
notes           release notes (UTF-8, up to 4 GiB)

The chain parent_release → parent_release → ... gives a clean release history independent of branch topology. The replica modes above can replicate just releases (and their trees and authority) for archive instances that don’t need the inter-release commit history — a useful primitive for long-tail preservation.
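Walking that chain is a short loop. The sketch below assumes releases are available in an in-memory map keyed by their 32-byte hash, with an all-zero hash marking the first release; the struct carries only the two fields the walk needs, and the names are illustrative.

```rust
use std::collections::HashMap;

type Hash = [u8; 32];

/// "Zero" parent: this release has no predecessor in the chain.
const ZERO: Hash = [0u8; 32];

/// Only the fields this walk needs; the real object has more.
struct Release {
    label: String,
    parent_release: Hash,
}

/// Labels from the given release back to the first, newest first.
/// Stops early if a parent is missing from the local store.
fn release_history(store: &HashMap<Hash, Release>, head: Hash) -> Vec<&str> {
    let mut labels = Vec::new();
    let mut cursor = head;
    while cursor != ZERO {
        match store.get(&cursor) {
            Some(release) => {
                labels.push(release.label.as_str());
                cursor = release.parent_release;
            }
            None => break,
        }
    }
    labels
}
```

Nothing in the walk touches commits or branches, which is why a replica holding only releases can still present a complete release history.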


How You Use It

Bootstrap

levcs key generate --label primary
levcs init --key primary
levcs track --all
levcs commit -m "initial import"

After init, .levcs/ exists alongside the working tree. The genesis authority names the chosen key as the sole Owner; the repo_id is fixed forever. After commit, the repository has one commit on refs/branches/main.

Branch and Merge

levcs branch feature/x
# ... edit files ...
levcs commit -m "wip on x"
levcs branch main
levcs merge feature/x

If the merge produces conflicts, drop into the resolution TUI:

levcs merge --resolve

The TUI shows each conflicted file with the ours/base/theirs panes the handler emitted, plus the cascade decision (which handler ran, and why it fell through if it did). On accept, it writes the resolved file and a signed .levcs/merge-record entry.

Release

levcs release v1.0.0 --notes "first release"

Writes a Release object with the current commit as predecessor, signs it with the active key, and adds refs/releases/v1.0.0. If prior releases exist, parent_release chains to the most recent one automatically.

Federation

levcs instance --set https://levcs.levineuwirth.org/levcs/v1
levcs push refs/branches/main

The first push to a fresh instance auto-inits the repository using the genesis authority. Subsequent pushes are role-checked. Pulls are public-read by default (the public_read policy bit on the genesis authority).

To migrate to a new home:

levcs migrate https://new-host.example.com/levcs/v1 --set-active

migrate re-inits and replays the full history at the destination, then points the local repository at it. The repo_id is unchanged — same project, new location.


Operating an Instance

A single binary, levcs-instance, reads a TOML config and listens on HTTP. Production deployments terminate TLS at a reverse proxy; the instance binds to localhost. The full walkthrough — systemd unit, Caddy and nginx examples, firewall, laptop-side bootstrap — lives in deploy/README.md in the repository.

The protocol surface is small:

GET  /health
GET  /levcs/v1/instance/info
GET  /levcs/v1/instance/peers
GET  /levcs/v1/repos/<repo_id>/info
GET  /levcs/v1/repos/<repo_id>/refs
GET  /levcs/v1/repos/<repo_id>/objects/<hash>
GET  /levcs/v1/repos/<repo_id>/pack?have=...&want=...
POST /levcs/v1/repos/<repo_id>/init
POST /levcs/v1/repos/<repo_id>/push

That is the whole API. No admin endpoints, no users-and-passwords table, no web UI to firewall. POSTs require a signed LeVCS-Signature header (Ed25519-over-canonical-request, with timestamp and nonce for replay protection); GETs are public unless the genesis authority’s policy turned that off.
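The replay-protection half of that check can be sketched without any cryptography. Signature verification is deliberately omitted, the 300-second window is an assumed value, and ReplayGuard is a hypothetical name. A real server would also evict nonces once they fall outside the window.

```rust
use std::collections::HashSet;

/// Accepted clock skew in seconds (assumed value, not from the spec).
const MAX_SKEW_SECS: u64 = 300;

/// Remembers (timestamp, nonce) pairs that were already accepted.
#[derive(Default)]
struct ReplayGuard {
    seen: HashSet<(u64, String)>,
}

impl ReplayGuard {
    /// Returns true if the request is fresh and has not been seen before.
    /// `now` and `timestamp` are Unix seconds.
    fn accept(&mut self, now: u64, timestamp: u64, nonce: &str) -> bool {
        if now.abs_diff(timestamp) > MAX_SKEW_SECS {
            return false; // too old, or too far in the future
        }
        // `insert` returns false when the pair was already present.
        self.seen.insert((timestamp, nonce.to_string()))
    }
}
```

The timestamp window bounds how long nonces must be remembered; the nonce set rejects a captured request replayed inside that window.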

Storage is a directory tree. Per-object atomic writes via temp-then-rename, per-repository serializing mutex on push. A consistent backup is just a snapshot of /var/lib/levcs. The first instance — levcs.levineuwirth.org — is configured exactly this way, fronted by Caddy on a small VPS, dogfooding the federation surface against the source-of-truth Forgejo at git.levineuwirth.org.
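Temp-then-rename is a standard pattern and fits in a dozen lines of std. This is a sketch of the pattern, not the instance's storage code: the function name and the ".tmp" suffix are illustrative, and it relies on rename being atomic within one filesystem, as it is on POSIX systems.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;

/// Write `bytes` to `dir/name` so that readers see either nothing or the
/// complete object, never a partial file.
fn write_object_atomic(dir: &Path, name: &str, bytes: &[u8]) -> io::Result<()> {
    fs::create_dir_all(dir)?;
    let tmp = dir.join(format!("{name}.tmp"));
    let dest = dir.join(name);

    // Write and flush the full content under a temporary name first.
    let mut file = fs::File::create(&tmp)?;
    file.write_all(bytes)?;
    file.sync_all()?;

    // Publish the object in one step by renaming into place.
    fs::rename(&tmp, &dest)
}
```

Because objects are content-addressed, writing the same object twice is harmless: the second rename replaces the file with identical bytes.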


What LeVCS Isn’t (Yet)

The honest list of things you would want for a full project home that LeVCS does not provide:

If a use case requires any of the above today, the right pattern is to run LeVCS in parallel with an existing platform. Forgejo, GitHub, or Gitea continues to host the workflow; the LeVCS instance acts as a dogfood replica that gets the same commits via a push-both wrapper. When the workflow surface lands, the migration story flips. This is how levcs.levineuwirth.org will be operated for the foreseeable future.


What Is True Today (and How We Know)

The repository at v0.1.0 has 194 passing tests covering:

A baseline microbenchmark suite lives in scripts/bench.sh with metadata capture (rustc version, kernel, CPU, git rev) for run-to-run comparison. On a Ryzen 7 laptop, headline numbers:

Numbers are reproducible via scripts/bench.sh --quick.


The Roadmap

The immediate priorities, in order:

  1. Workflow spec — the missing layer above. PR/review object, discussion threads, CI hook conventions, web UI design. This is the document the rest of v1 builds toward.
  2. Reference workflow tools — a minimal web UI that reads the federation API and lets you browse, review, and merge. Probably a separate repository and process, not bundled into the instance binary.
  3. CI conventions — a published webhook protocol so existing CI systems can integrate without polling.
  4. Plugin handler examples — a few real wasm handlers (e.g. protobuf, SQL migrations) to validate the plugin protocol against real formats.
  5. Git import — a one-way import path so existing projects can adopt LeVCS without hand-replaying history.

The substrate guarantees the workflow layer can lean on:

A “PR” is just an object kind LeVCS doesn’t have yet; an “issue” is another; the storage modes already define how a CI system would replicate the metadata it needs without pulling source.


Trying It

Build:

git clone https://git.levineuwirth.org/neuwirth/levcs
cd levcs
cargo build --release
sudo install -m 0755 \
    target/release/levcs target/release/levcs-instance \
    /usr/local/bin/

Local single-machine tour:

levcs key generate --label me
mkdir /tmp/demo && cd /tmp/demo
echo "hello" > a.txt
levcs init --key me
levcs track --all
levcs commit -m "first"
levcs log

Push to the public instance once it lands at levcs.levineuwirth.org:

levcs instance --set https://levcs.levineuwirth.org/levcs/v1
levcs push refs/branches/main

Read the technical report in the repository at doc/technical-report.md. Read the code: every crate is small and documented. crates/levcs-core is the object model, crates/levcs-merge is the cascade, crates/levcs-instance is the server, crates/levcs-cli is the user-facing tool.


License and Repository

The code is released under the Apache License 2.0 — see LICENSE in the repository for the full text. The choice is deliberate: the patent grant and the explicit contributor license are worth the slight ceremony for a substrate other people may build on. Frameworks should not take a stake in the work they compile, but they should be unambiguous about what compiling against them does and doesn’t permit.

The repository is at git.levineuwirth.org/neuwirth/levcs. The first federation instance will be at levcs.levineuwirth.org. The next document in the series is the workflow spec; until it lands, comments and corrections on the substrate itself are welcome, addressed to yours truly.