The Markovian Proof Protocol

Protocol Specification v1.0

Author: SigmaSynth LLC <[email protected]>
Status: Proposed Standard
Date: June 2026
Replaces: None

Contents

  Abstract
  1. Introduction
  2. Terminology
  3. Cryptographic Primitives
  4. The Four Layers
  5. Bundle Format
  6. Verification Endpoint
  7. Miner Prediction API
  8. Security Considerations
  9. Implementation Notes
  10. IANA Considerations
  11. References

Abstract

This document defines the Markovian Proof: a four-layer cryptographic attestation format for intelligence-derived outputs produced by the Markovian Protocol. A Markovian Proof establishes, under a single Merkle root, that: (1) the transition matrix used in computation was committed to a specific training dataset, (2) the inputs to a given synthesis run were sealed before that run executed, (3) the outputs of that run are Merkle-attested and Schnorr-proven, and (4) miner credibility is accumulating from on-chain commit-reveal predictions. The security of Layers 1–3 reduces to the discrete logarithm assumption on the BN128 elliptic curve. Layer 4 security is economic. A verifier submits one input—a Merkle root—and receives a complete four-layer bundle identifying all claims and their validity status.


1. Introduction

The Markovian Protocol produces market regime intelligence as a byproduct of proof-of-work consensus. Each synthesis run takes a set of observable inputs, applies a Hidden Markov Model under a publicly governed transition matrix, and produces a regime classification. The output enters a shared archive. Any party purchasing access to that archive has an interest in confirming that a given output was produced honestly: that the matrix was not altered after training, that the inputs were not selected retroactively, and that the reported output matches the computation that actually ran.

Standard proof-of-work consensus provides no mechanism for this. A hash meeting a difficulty target proves work was performed. It does not prove what computation was performed or whether the claimed inputs and outputs are accurate.

The Markovian Proof addresses this gap. It is not a zk-SNARK circuit proving correct HMM execution. Proving Baum-Welch expectation-maximization in zero knowledge would require a circuit of prohibitive complexity. Instead, the Markovian Proof provides the following guarantees: the matrix M was committed to a specific training dataset at a recorded time; the inputs to a specific synthesis run were hashed and committed before that run executed; the outputs of that run are bound to a Merkle root and proven via Schnorr sigma proof; and miner prediction track records are accumulating on-chain with tamper-evident commit-reveal commitments.

An independent verifier can: confirm the Schnorr proofs are valid, re-download the training data and recompute the training hash independently, and audit the miner prediction ledger. No trusted party is required.

1.1 Requirements Language

The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this document are to be interpreted as described in RFC 2119.


2. Terminology

Term             Definition
GENESIS_M        The 3×3 stochastic transition matrix governing the Markovian Protocol. Trained on OHLCV data for GLD, QQQ, SPY, TLT, and USO from 2000 through 2026. Each row sums to 1.0. States: ACCUMULATION, MARKUP, DISTRIBUTION indexed 0, 1, 2.
training_hash    The SHA-256 digest of a canonical JSON serialization of all OHLCV rows used to train GENESIS_M, sorted deterministically. See Section 3.6.
m_version        Integer. Increments when GENESIS_M is updated through governance. Current production version: 1.
synthesis_run    A single execution of the SigmaSynth intelligence pipeline producing a regime classification and associated signal outputs.
merkle_root      The SHA-256 Merkle root computed over the output fields of a synthesis run. Primary identifier for a synthesis run and the single input to the Markovian Proof verifier.
input_hash       SHA-256 of the canonical JSON serialization of input fields provided to a synthesis run before that run executed.
bundle_hash      SHA-256 of the four layer validity flags encoded as a fixed-length byte string. See Section 5.2.
Markovian Proof  The complete four-layer cryptographic bundle identified by a merkle_root.
Kov              Atomic unit of the MKV token. 1 MKV = 100,000,000 Kovs.
pre-provenance   A synthesis run predating the deployment of the input provenance system (Layer 3). Identified by input_hash = 'pre-provenance'. Layer 3 validity for pre-provenance runs is null, not false.

3. Cryptographic Primitives

3.1 Elliptic Curve

All commitments and proofs use the BN128 pairing-friendly elliptic curve, defined in Ethereum EIP-196. The generator point G is the standard BN128 G1 generator. The curve order is:

curve_order = 21888242871839275222246405745257275088548364400416034343698204186575808495617

The security of all proofs in this specification reduces to the discrete logarithm assumption on BN128. BN128 is the curve exposed by Ethereum's alt_bn128 precompiles (EIP-196), which on-chain verifiers for ZK-EVM proofs commonly rely on.

3.2 Second Generator H

A second generator H is derived deterministically:

_H_SEED = int(SHA-256("SigmaSynth-H-generator-v1"), 16) mod curve_order
H = _H_SEED * G

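The string "SigmaSynth-H-generator-v1" is encoded as UTF-8. This derivation is public and requires no trusted setup. Note that, as specified, H is a known scalar multiple of G: its discrete log relative to G is _H_SEED, which anyone can compute. The binding property of the Pedersen commitments in Section 3.3 requires that this discrete log be unknown, which is normally achieved by hashing the seed string directly to a curve point rather than multiplying G by a hash-derived scalar.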

3.3 Pedersen Commitment

A Pedersen commitment to scalar m with blinding factor r:

C = r*G + m*H

Pedersen commitments are perfectly hiding and computationally binding under the discrete log assumption, provided that no party knows the discrete log of H relative to G.

3.4 Schnorr Sigma Proof

A Schnorr sigma proof proves knowledge of (r, m) such that C = r*G + m*H, without revealing r or m. Made non-interactive via Fiat-Shamir.

Prover:

k1, k2 <- random in [1, curve_order - 1]
R = k1*G + k2*H
e = int(SHA-256(encode(C) || encode(R) || full_context), 16) mod curve_order
s1 = (k1 - e*r) mod curve_order
s2 = (k2 - e*m) mod curve_order
proof = { R, s1, s2, e, full_context }

Where full_context is the complete context string used to compute e, stored in the proof to enable independent verification.

Verifier:

e_check = int(SHA-256(encode(C) || encode(proof.R) || proof.full_context), 16) mod curve_order
assert e_check == proof.e

lhs = proof.s1*G + proof.s2*H + proof.e*C
assert lhs == proof.R

Point coordinates are encoded as their integer representation. The context string is UTF-8 encoded.
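
The flow of Sections 3.2 through 3.4 can be sketched in pure Python as follows. This is a minimal illustration under stated assumptions, not the reference implementation (which uses py_ecc). The helpers add and mul, the point encoding in encode (decimal "x,y"), and the sample scalars and context string are hypothetical choices made for this example. P is the BN128 base-field prime from EIP-196, and N is the curve_order given in Section 3.1.

```python
import hashlib
import secrets

# BN128 G1: y^2 = x^3 + 3 over F_P, generator (1, 2), group order N.
P = 21888242871839275222246405745257275088696311157297823662689037894645226208583
N = 21888242871839275222246405745257275088548364400416034343698204186575808495617
G = (1, 2)

def add(a, b):
    """Affine point addition; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def mul(k, pt):
    """Double-and-add scalar multiplication."""
    acc, k = None, k % N
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

# Section 3.2: second generator derived from the public seed string.
H_SEED = int(hashlib.sha256("SigmaSynth-H-generator-v1".encode("utf-8")).hexdigest(), 16) % N
H = mul(H_SEED, G)

def encode(pt):
    # ASSUMPTION: the spec says only "integer representation"; decimal "x,y" is used here.
    return f"{pt[0]},{pt[1]}".encode("utf-8")

def challenge(C, R, full_context):
    data = encode(C) + encode(R) + full_context.encode("utf-8")
    return int(hashlib.sha256(data).hexdigest(), 16) % N

def prove(m, r, full_context):
    """Commit to m with blinding r (Section 3.3) and prove knowledge of (r, m)."""
    C = add(mul(r, G), mul(m, H))
    k1 = secrets.randbelow(N - 1) + 1
    k2 = secrets.randbelow(N - 1) + 1
    R = add(mul(k1, G), mul(k2, H))
    e = challenge(C, R, full_context)
    proof = {"R": R, "s1": (k1 - e * r) % N, "s2": (k2 - e * m) % N,
             "e": e, "full_context": full_context}
    return C, proof

def verify(C, proof):
    """Recompute the challenge, then check s1*G + s2*H + e*C == R."""
    e = challenge(C, proof["R"], proof["full_context"])
    if e != proof["e"]:
        return False
    lhs = add(add(mul(proof["s1"], G), mul(proof["s2"], H)), mul(e, C))
    return lhs == proof["R"]

# Hypothetical usage: arbitrary scalars and context string.
C, proof = prove(m=12345, r=67890, full_context="demo-run")
assert verify(C, proof)
```

The verifier recomputes the challenge before checking the group equation, so a proof whose full_context was swapped fails early without any curve arithmetic.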

3.5 Merkle Tree

The Merkle tree uses SHA-256 at every node. Leaves are SHA-256 digests of field values. Fields are ordered canonically by alphabetical key name. The root is the SHA-256 of the concatenation of the two child hashes at the root level.
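
A minimal sketch of this construction follows. Several details are not fixed by the text above and are assumptions made for this example: field values are serialized with json.dumps before hashing, child hashes are concatenated as raw 32-byte digests, and the last node is duplicated on levels with an odd number of nodes. The function name merkle_root and the sample fields are hypothetical.

```python
import hashlib
import json

def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(fields: dict) -> str:
    """Return the hex Merkle root over field values, ordered by key name."""
    if not fields:
        raise ValueError("at least one field is required")
    # Leaves: SHA-256 of each value, in alphabetical key order.
    # ASSUMPTION: values are serialized with json.dumps before hashing.
    level = [
        _sha256(json.dumps(fields[key], sort_keys=True).encode("utf-8"))
        for key in sorted(fields)
    ]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # ASSUMPTION: duplicate the last node when a level has odd length.
            level.append(level[-1])
        level = [_sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()

# Hypothetical output fields of a synthesis run.
root = merkle_root({"regime": "MARKUP", "gate": "OPEN"})
```

Because keys are sorted before hashing, the root does not depend on the order in which the producer assembled the fields.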

3.6 Training Hash

rows = all OHLCV rows for [GLD, QQQ, SPY, TLT, USO] from 2000-01-01 through 2026-12-31
rows_sorted = sorted(rows, key=(ticker, date))
payload = JSON.dumps([row.to_dict() for row in rows_sorted], sort_keys=True)
training_hash = SHA-256(payload.encode("utf-8"))

The production training hash for GENESIS_M m_version=1:

4aef008fce8c008ce3b7efa992c64bebbfd85ea637835bf5d317ea351559ad0e

This covers 29,795 OHLCV rows.
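
The recomputation an independent verifier would perform can be sketched as below. The row field names ("ticker", "date", and the OHLCV keys) and the ISO date strings are assumptions; the sample rows are placeholders, not market data. Reproducing the production digest also requires the exact row contents and json.dumps defaults used at commit time, which this sketch does not establish.

```python
import hashlib
import json

def training_hash(rows):
    """SHA-256 over the canonical JSON of rows sorted by (ticker, date)."""
    rows_sorted = sorted(rows, key=lambda row: (row["ticker"], row["date"]))
    payload = json.dumps(rows_sorted, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Placeholder rows for illustration only.
sample_rows = [
    {"ticker": "SPY", "date": "2000-01-04", "open": 2.0, "high": 3.0,
     "low": 1.5, "close": 2.5, "volume": 100},
    {"ticker": "GLD", "date": "2004-11-18", "open": 1.0, "high": 1.2,
     "low": 0.9, "close": 1.1, "volume": 50},
]
digest = training_hash(sample_rows)
```

Sorting by (ticker, date) and by key name makes the digest independent of download order, which is what allows two parties to compare hashes.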

3.7 Input Hash

for each key k, value v in inputs:
    if v is a string and len(v) > 500:
        sanitized[k] = "sha256:" + SHA-256(v.encode("utf-8"))
    elif v is numeric, boolean, or null:
        sanitized[k] = v
    else:
        sanitized[k] = str(v)[:500]

payload = JSON.dumps(sanitized, sort_keys=True)
input_hash = SHA-256(payload.encode("utf-8"))
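
The pseudocode above could be written in Python roughly as follows. The function names sanitize and input_hash and the constant MAX_LEN are illustrative, and the sample inputs are hypothetical.

```python
import hashlib
import json

MAX_LEN = 500  # length limit from the rules above

def sanitize(inputs: dict) -> dict:
    """Apply the per-value rules: hash long strings, keep scalars, truncate the rest."""
    sanitized = {}
    for key, value in inputs.items():
        if isinstance(value, str) and len(value) > MAX_LEN:
            digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
            sanitized[key] = "sha256:" + digest
        elif value is None or isinstance(value, (bool, int, float)):
            sanitized[key] = value
        else:
            # Short strings pass through unchanged; other types are stringified.
            sanitized[key] = str(value)[:MAX_LEN]
    return sanitized

def input_hash(inputs: dict) -> str:
    payload = json.dumps(sanitize(inputs), sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Hypothetical inputs to a synthesis run.
example = input_hash({"ticker": "SPY", "window": 30, "notes": "x" * 1000})
```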

4. The Four Layers

4.1 Layer 1 — Matrix Provenance

Layer 1 proves that GENESIS_M was committed to a specific training dataset at a recorded time.

The commitment scalar is derived from the matrix:

m_scalar = int(SHA-256(JSON.dumps(GENESIS_M.tolist())), 16) mod curve_order

A Pedersen commitment C_m = r*G + m_scalar*H is computed and a Schnorr proof is recorded. This commitment is made once per m_version before any synthesis runs under that version.
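
As a sketch, the scalar derivation might look like the following. The matrix values are placeholders and are not the production GENESIS_M. The UTF-8 encoding of the JSON string is an assumption, since the formula above does not state one, and the row-sum check is an added sanity test rather than part of the specification.

```python
import hashlib
import json

CURVE_ORDER = 21888242871839275222246405745257275088548364400416034343698204186575808495617

def matrix_scalar(matrix):
    """Map a row-stochastic matrix (list of lists) to a scalar mod CURVE_ORDER."""
    for row in matrix:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of a stochastic matrix must sum to 1.0")
    # ASSUMPTION: the JSON text is UTF-8 encoded before hashing.
    payload = json.dumps(matrix).encode("utf-8")
    return int(hashlib.sha256(payload).hexdigest(), 16) % CURVE_ORDER

# Placeholder values for illustration; NOT the production GENESIS_M.
example_m = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.3, 0.5]]
m_scalar = matrix_scalar(example_m)
```

Any change to a single matrix entry changes the JSON text and therefore the scalar, which is what ties the commitment C_m to one specific matrix.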

Storage schema (m_provenance table):

Field          Type       Description
id             INTEGER    m_version
training_hash  TEXT       SHA-256 of training data
training_rows  INTEGER    Row count
m_commitment   TEXT       JSON-encoded C_m point [x, y]
zk_proof       JSONB      Schnorr proof {R, s1, s2, e, full_context}
zk_valid       BOOLEAN    Self-verification result at commit time
committed_at   TIMESTAMP  UTC commit time

Layer 1 verification:

  1. Fetch m_provenance record by m_version.
  2. Deserialize m_commitment as point C.
  3. Run Schnorr verifier against C and zk_proof.
  4. Optionally: re-download OHLCV data, recompute training_hash, assert equality.

Layer 1 must be valid for a Markovian Proof to be considered valid.

4.2 Layer 2 — Output Proof

Layer 2 proves that a specific set of outputs was Merkle-attested and Schnorr-committed.

For each synthesis run, the output fields are serialized, a Merkle root is computed over them, and a Pedersen commitment and Schnorr proof are recorded.

Storage schema (signal_provenance table, Layer 2 columns):

Field          Type       Description
merkle_root    TEXT       Primary key. SHA-256 Merkle root.
model          TEXT       Model identifier
gate           TEXT       Gate decision at synthesis time
zk_commitment  TEXT       JSON-encoded output commitment point
zk_proof       JSONB      Schnorr proof
zk_valid       BOOLEAN    Self-verification at synthesis time
run_at         TIMESTAMP  UTC synthesis time

Layer 2 verification:

  1. Fetch signal_provenance row by merkle_root.
  2. Deserialize zk_commitment as point C.
  3. Run Schnorr verifier against C and zk_proof.

Layer 2 must be valid for a Markovian Proof to be considered valid.

4.3 Layer 3 — Input Provenance

Layer 3 proves that the input vector was committed to before synthesis executed.

The commitment must be recorded before the synthesis pipeline runs. A Layer 3 proof recorded after synthesis output is not compliant.

Storage schema (signal_provenance table, Layer 3 columns):

Field             Type     Description
input_hash        TEXT     SHA-256 of sanitized inputs, or 'pre-provenance'
input_fields      JSONB    Truncated input field snapshot
input_commitment  TEXT     JSON-encoded input commitment point
input_zk_proof    JSONB    Schnorr proof with full_context
input_zk_valid    BOOLEAN  Self-verification at commit time

Pre-provenance handling:

Synthesis runs predating the deployment of Layer 3 must have input_hash = 'pre-provenance'. For such rows, Layer 3 validity in the bundle must be returned as null. It must not be returned as false. Pre-provenance is a historical designation, not a verification failure.

Layer 3 verification:

  1. Fetch signal_provenance row by merkle_root.
  2. If input_hash = 'pre-provenance': return null.
  3. Deserialize input_commitment as point C.
  4. Run Schnorr verifier against C and input_zk_proof.

4.4 Layer 4 — Miner Credibility

Layer 4 accumulates miner prediction track records via a commit-reveal scheme.

Commitment:

nonce = 32 random bytes from CSPRNG
commitment = SHA-256(address || ticker || regime || str(target_block) || nonce.hex())

Where || denotes string concatenation. The commitment is posted on-chain at block height H_commit. The prediction claims the regime of ticker at block height target_block.

Resolution:

At target_block, the actual regime is determined from the chain. The miner reveals (ticker, regime, target_block, nonce). The commitment is recomputed from the revealed values and compared with the commitment posted on-chain. If the two match, the prediction is scored as accurate or inaccurate against the actual regime. The miner's credibility score then updates; it never resets.
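
The commit and reveal steps can be sketched as below. The function names make_commitment and check_reveal, the sample address, and the sample block height are hypothetical. Fields are concatenated without delimiters, following the formula above.

```python
import hashlib
import secrets

def make_commitment(address, ticker, regime, target_block, nonce: bytes) -> str:
    """SHA-256 over the string concatenation defined in the Commitment step."""
    preimage = address + ticker + regime + str(target_block) + nonce.hex()
    return hashlib.sha256(preimage.encode("utf-8")).hexdigest()

def check_reveal(commitment_hash, address, ticker, regime, target_block, nonce_hex) -> bool:
    """Recompute the commitment from revealed values and compare it to the posted one."""
    recomputed = make_commitment(
        address, ticker, regime, target_block, bytes.fromhex(nonce_hex)
    )
    return secrets.compare_digest(recomputed, commitment_hash)

# Hypothetical prediction: 32 random bytes from the OS CSPRNG as the nonce.
nonce = secrets.token_bytes(32)
posted = make_commitment("mkv1example", "SPY", "MARKUP", 1000, nonce)
assert check_reveal(posted, "mkv1example", "SPY", "MARKUP", 1000, nonce.hex())
```

A reveal that changes any field, such as the regime, produces a different digest and is rejected before the prediction is scored.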

Storage schema (miner_predictions table):

Field             Type       Description
id                INTEGER    Auto-increment
address           TEXT       Miner address
ticker            TEXT       Predicted ticker
predicted_regime  TEXT       ACCUMULATION | MARKUP | DISTRIBUTION
target_block      INTEGER    Block at which to resolve
commitment_hash   TEXT       SHA-256 commitment
nonce             TEXT       Revealed nonce (null until resolved)
resolved          BOOLEAN    Whether prediction has been resolved
correct           BOOLEAN    Whether prediction was accurate (null until resolved)
committed_at      TIMESTAMP  UTC commitment time
resolved_at       TIMESTAMP  UTC resolution time (null until resolved)

Layer 4 in the bundle:

If no resolved predictions exist, Layer 4 validity is null. If resolved predictions exist, Layer 4 validity is true when accuracy (correct / resolved) meets the verifier's threshold and false otherwise, as in step 8 of Section 6.2. Credibility scores must be computed only over resolved predictions.


5. Bundle Format

5.1 Wire Format

{
  "type": "markovian_proof",
  "version": 1,
  "merkle_root": "<64 lowercase hex chars>",
  "run_at": "<ISO 8601 UTC>",
  "bundle_hash": "<64 lowercase hex chars>",
  "valid": <bool>,
  "layers": {
    "m_provenance": {
      "valid": <bool>,
      "m_version": <int>,
      "training_hash": "<64 hex chars>",
      "training_rows": <int>,
      "committed_at": "<ISO 8601 UTC>"
    },
    "output_proof": {
      "valid": <bool>,
      "model": "<string>",
      "gate": "<string>",
      "run_at": "<ISO 8601 UTC>"
    },
    "input_provenance": {
      "valid": <bool | null>,
      "input_hash": "<64 hex chars | 'pre-provenance'>",
      "note": "<string if null>"
    },
    "miner_credibility": {
      "valid": <bool | null>,
      "total_predictions": <int>,
      "resolved_predictions": <int>,
      "accuracy": <float | null>,
      "note": "<string if null>"
    }
  }
}

All string fields are UTF-8. Numeric fields are JSON numbers. Boolean and null fields use JSON boolean and null literals.

5.2 Bundle Hash

import hashlib

def encode_validity(v):
    # One byte per layer: true, false, or null.
    if v is True:  return b'\x01'
    if v is False: return b'\x00'
    if v is None:  return b'\xff'
    raise ValueError("validity must be True, False, or None")

# layers is the "layers" object of the bundle (Section 5.1).
payload = (
    encode_validity(layers["m_provenance"]["valid"]) +
    encode_validity(layers["output_proof"]["valid"]) +
    encode_validity(layers["input_provenance"]["valid"]) +
    encode_validity(layers["miner_credibility"]["valid"])
)
bundle_hash = hashlib.sha256(payload).hexdigest()

5.3 Top-Level Validity

valid = (layers["m_provenance"]["valid"] == True) AND
        (layers["output_proof"]["valid"] == True)

Layers 3 and 4 contribute to the bundle_hash and to credibility assessment. They do not gate the top-level validity determination. A proof with Layer 1 and Layer 2 passing is a valid Markovian Proof. Layers 3 and 4 strengthen the proof as the system matures.
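
The interaction between Sections 5.2 and 5.3 can be illustrated with a short sketch. The function name finalize and the sample layers object are hypothetical. The example shows a pre-provenance run with no resolved predictions, which is still a valid Markovian Proof.

```python
import hashlib

LAYER_ORDER = ("m_provenance", "output_proof", "input_provenance", "miner_credibility")

def encode_validity(v):
    # Identity checks avoid treating 1 or 0 as booleans.
    if v is True:
        return b"\x01"
    if v is False:
        return b"\x00"
    if v is None:
        return b"\xff"
    raise ValueError("validity must be True, False, or None")

def finalize(layers):
    """Return (bundle_hash, valid) for a layers object shaped as in Section 5.1."""
    payload = b"".join(encode_validity(layers[name]["valid"]) for name in LAYER_ORDER)
    bundle_hash = hashlib.sha256(payload).hexdigest()
    # Only Layers 1 and 2 gate top-level validity.
    valid = layers["m_provenance"]["valid"] is True and layers["output_proof"]["valid"] is True
    return bundle_hash, valid

# Hypothetical bundle: Layers 1 and 2 pass, Layers 3 and 4 are null.
layers = {
    "m_provenance": {"valid": True},
    "output_proof": {"valid": True},
    "input_provenance": {"valid": None},
    "miner_credibility": {"valid": None},
}
bundle_hash, valid = finalize(layers)
```

Here the hashed payload is the four bytes 01 01 ff ff, so two bundles with the same validity pattern share a bundle_hash regardless of merkle_root.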


6. Verification Endpoint

6.1 Request

GET /verify/{merkle_root}

merkle_root must be 64 lowercase hexadecimal characters. Any other format must return HTTP 400.
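
A format check along these lines is one way to enforce the rule before any database lookup. The function name is_valid_merkle_root is hypothetical.

```python
import re

# Exactly 64 characters, each a lowercase hexadecimal digit.
_MERKLE_ROOT_RE = re.compile(r"[0-9a-f]{64}")

def is_valid_merkle_root(value) -> bool:
    """True only for a 64-character lowercase hex string."""
    return isinstance(value, str) and _MERKLE_ROOT_RE.fullmatch(value) is not None
```

Using fullmatch rather than match rejects inputs with trailing characters, including a trailing newline; a caller would map a False result to HTTP 400.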

6.2 Verification Algorithm

1.  Fetch m_provenance by m_version (latest)
2.  Run Schnorr verify on Layer 1 (m_commitment, zk_proof)
3.  Fetch signal_provenance by merkle_root
4.  If not found: return HTTP 404
5.  Run Schnorr verify on Layer 2 (zk_commitment, zk_proof)
6.  If input_hash == 'pre-provenance':
        Layer 3 valid = null, note = "predates provenance system"
    Else:
        Run Schnorr verify on Layer 3 (input_commitment, input_zk_proof)
7.  Fetch miner_predictions aggregate
8.  If no resolved predictions:
        Layer 4 valid = null, note = "no resolved predictions"
    Else:
        Compute accuracy = correct / resolved
        Layer 4 valid = (accuracy >= threshold)
9.  Compute bundle_hash per Section 5.2
10. Set top-level valid per Section 5.3
11. Return bundle
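
Steps 1 through 8 can be sketched with the storage and Schnorr verifier injected as dependencies. MemoryStore, its method names, build_layers, and the threshold argument are hypothetical stand-ins; the specification does not fix a threshold value. Steps 9 and 10 are omitted here and follow Sections 5.2 and 5.3.

```python
class MemoryStore:
    """Hypothetical in-memory stand-in for the three tables in Section 9.2."""

    def __init__(self, m_row, signal_rows, counts):
        self.m_row = m_row
        self.signal_rows = signal_rows
        self.counts = counts  # (total, resolved, correct)

    def latest_m_provenance(self):
        return self.m_row

    def signal_provenance(self, merkle_root):
        return self.signal_rows.get(merkle_root)

    def prediction_counts(self):
        return self.counts

def build_layers(merkle_root, store, schnorr_verify, threshold):
    """Return (http_status, layers); layers is None when the root is unknown."""
    m = store.latest_m_provenance()                                  # step 1
    layer1 = schnorr_verify(m["m_commitment"], m["zk_proof"])        # step 2
    row = store.signal_provenance(merkle_root)                       # step 3
    if row is None:                                                  # step 4
        return 404, None
    layer2 = schnorr_verify(row["zk_commitment"], row["zk_proof"])   # step 5
    if row["input_hash"] == "pre-provenance":                        # step 6
        layer3 = {"valid": None, "input_hash": "pre-provenance",
                  "note": "predates provenance system"}
    else:
        ok = schnorr_verify(row["input_commitment"], row["input_zk_proof"])
        layer3 = {"valid": ok, "input_hash": row["input_hash"]}
    total, resolved, correct = store.prediction_counts()             # step 7
    if resolved == 0:                                                # step 8
        layer4 = {"valid": None, "accuracy": None, "note": "no resolved predictions"}
    else:
        accuracy = correct / resolved
        layer4 = {"valid": accuracy >= threshold, "accuracy": accuracy}
    layer4.update(total_predictions=total, resolved_predictions=resolved)
    layers = {
        "m_provenance": {"valid": layer1, "m_version": m["id"]},
        "output_proof": {"valid": layer2},
        "input_provenance": layer3,
        "miner_credibility": layer4,
    }
    return 200, layers
```

Layer 1 is checked before the signal_provenance lookup, matching the step order above, although a 404 response discards that result.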

6.3 Response Determinism

For a given merkle_root, the response must be deterministic with the following exception: Layer 4 data updates as predictions resolve. The response must reflect the current state of the miner_predictions table at request time. Implementations should not cache Layer 4 data.

6.4 Response Codes

Code  Condition
200   Bundle returned successfully
400   Invalid merkle_root format
404   merkle_root not found in signal_provenance
500   Internal error; should include error field

7. Miner Prediction API

7.1 Submit Prediction

POST /predict
Content-Type: application/json

{
  "address": "<miner address>",
  "ticker": "<ticker symbol>",
  "regime": "<ACCUMULATION | MARKUP | DISTRIBUTION>",
  "target_block": <integer>
}

The server computes the nonce and the commitment and returns:

{
  "commitment_hash": "<64 hex chars>",
  "nonce": "<64 hex chars>",
  "target_block": <int>,
  "record_id": <int>
}

The nonce must be stored by the miner for reveal at resolution time. Loss of the nonce makes the prediction unresolvable.

7.2 Query Credibility

GET /credibility/{address}

Returns:

{
  "address": "<address>",
  "total": <int>,
  "resolved": <int>,
  "correct": <int>,
  "accuracy": <float | null>,
  "predictions": [...]
}
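
A sketch of how this response could be assembled from miner_predictions rows follows. The function name credibility and the sample rows are hypothetical. Accuracy is computed over resolved predictions only and is null when none are resolved.

```python
def credibility(address, predictions):
    """Build the /credibility response body from a list of prediction rows."""
    mine = [p for p in predictions if p["address"] == address]
    resolved = [p for p in mine if p["resolved"]]
    correct = sum(1 for p in resolved if p["correct"])
    accuracy = correct / len(resolved) if resolved else None
    return {
        "address": address,
        "total": len(mine),
        "resolved": len(resolved),
        "correct": correct,
        "accuracy": accuracy,
        "predictions": mine,
    }

# Hypothetical rows: two resolved and one pending for miner "A", one for miner "B".
rows = [
    {"address": "A", "resolved": True, "correct": True},
    {"address": "A", "resolved": True, "correct": False},
    {"address": "A", "resolved": False, "correct": None},
    {"address": "B", "resolved": True, "correct": True},
]
summary = credibility("A", rows)
```

The pending prediction counts toward total but not toward accuracy, so a miner cannot dilute or inflate the score with unresolved commitments.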

8. Security Considerations

8.1 Cryptographic Security

The security of Layers 1, 2, and 3 reduces to the discrete logarithm assumption on BN128, with SHA-256 modeled as a random oracle for the Fiat-Shamir transform and assumed collision resistant for the Merkle tree and input hashes. Under these assumptions, a computationally bounded adversary cannot produce a valid Schnorr proof for a commitment without knowing an opening (r, m) of that commitment. This means: the committed matrix cannot be substituted after commitment, the committed inputs cannot be altered after commitment, and the committed output cannot be substituted after commitment.

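The second generator H is derived deterministically from a public string, and any party may verify this derivation. No trusted setup is required. Because Section 3.2 computes H as _H_SEED * G, however, the discrete log of H relative to G is public, so the binding guarantee above holds only if H is derived such that this discrete log is unknown to every party.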

8.2 What the Protocol Does Not Prove

Layer 1 proves that GENESIS_M was committed to a training dataset identified by training_hash. It does not prove that the HMM was trained correctly on that dataset. Proving Baum-Welch expectation-maximization in zero knowledge would require a general computation circuit at the current limits of zkML research. An independent verifier can: download the raw OHLCV data, recompute the training_hash, assert equality with the committed value, and then re-run HMM training independently to verify that the same M results. This is a weaker guarantee than a ZK execution proof. It is an honest characterization of what the system provides.

8.3 Layer 4 Security

Layer 4 security is economic, not cryptographic. The commitment scheme prevents retroactive claims. The SHA-256 preimage of the commitment cannot be found without the nonce. However, the credibility score is only meaningful after a statistically significant number of resolved predictions across multiple market regimes. In early protocol operation, Layer 4 validity is null by design. Implementations must not treat null as evidence of a security failure.

8.4 Replay and Forgery

Each Schnorr proof includes a full_context string that binds the proof to a specific run. Copying a valid proof from one synthesis run to another must be detected by the verifier, because the Fiat-Shamir challenge incorporates the commitment point, which differs between runs.

8.5 Pre-Provenance Rows

Synthesis runs predating Layer 3 deployment are designated pre-provenance. This is a historical fact, not a vulnerability. The output proofs (Layer 2) for pre-provenance runs are valid. Only the input pre-commitment is absent. Implementations must represent this as null, not false.

8.6 Post-Quantum Security

The proof system specified in this document is not post-quantum resistant. The discrete logarithm assumption on BN128 does not hold against an adversary equipped with a fault-tolerant quantum computer running Shor's algorithm. This is a known limitation shared by all deployed elliptic curve proof systems, including Ethereum's ZK-EVM.

A future protocol version will substitute a lattice-based commitment scheme in place of BN128 Pedersen commitments and Schnorr sigma proofs. The four-layer bundle structure, Merkle root format, and archive schema are designed to be upgrade-compatible: the cryptographic layer can be replaced without modifying the output format or historical archive. Such an upgrade requires a governance-ratified matrix version increment and a coordinated hard fork at a known block height.

Until that upgrade is ratified, the security of this protocol depends on the assumption that fault-tolerant quantum computation at the scale required to break 128-bit elliptic curve security remains unavailable. Implementations should monitor developments in this area and must not represent the current proof system as post-quantum resistant.


9. Implementation Notes

9.1 Reference Implementation

Python 3.11. Library: py_ecc 8.0.0 for BN128 operations.

File                     Purpose
zk_m_provenance.py       Layer 1 commit and verify
zk_input_provenance.py   Layer 3 commit and verify
neo_claude_synthesis.py  Synthesis pipeline with Layers 2+3 wired
miner_predictions.py     Layer 4 commit-reveal
api_server.py            Verification endpoint

9.2 Database

PostgreSQL 14+ with TimescaleDB. Tables: m_provenance, signal_provenance, miner_predictions.

9.3 Public Endpoint

https://api.quantsynth.net/verify/{merkle_root}

10. IANA Considerations

This document defines no new protocol parameters requiring IANA registration.


11. References

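Normative

RFC 2119: Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
EIP-196: Reitwiessner, C., "Precompiled contracts for addition and scalar multiplication on the elliptic curve alt_bn128", Ethereum Improvement Proposal 196, February 2017.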

Informative