Protocol Specification v1.0
This document defines the Markovian Proof: a four-layer cryptographic attestation format for intelligence-derived outputs produced by the Markovian Protocol. A Markovian Proof establishes, under a single Merkle root, that: (1) the transition matrix used in computation was committed to a specific training dataset, (2) the inputs to a given synthesis run were sealed before that run executed, (3) the outputs of that run are Merkle-attested and Schnorr-proven, and (4) miner credibility accumulates from on-chain commit-reveal predictions. The security of Layers 1–3 reduces to the discrete logarithm assumption on the BN128 elliptic curve. Layer 4 security is economic. A verifier submits a single input, a Merkle root, and receives a complete four-layer bundle identifying all claims and their validity status.
The Markovian Protocol produces market regime intelligence as a byproduct of proof-of-work consensus. Each synthesis run takes a set of observable inputs, applies a Hidden Markov Model under a publicly governed transition matrix, and produces a regime classification. The output enters a shared archive. Any party purchasing access to that archive has an interest in confirming that a given output was produced honestly: that the matrix was not altered after training, that the inputs were not selected retroactively, and that the reported output matches the computation that actually ran.
Standard proof-of-work consensus provides no mechanism for this. A hash meeting a difficulty target proves work was performed. It does not prove what computation was performed or whether the claimed inputs and outputs are accurate.
The Markovian Proof addresses this gap. It is not a zk-SNARK circuit proving correct HMM execution. Proving Baum-Welch expectation-maximization in zero knowledge would require a circuit of prohibitive complexity. Instead, the Markovian Proof provides the following guarantees: the matrix M was committed to a specific training dataset at a recorded time; the inputs to a specific synthesis run were hashed and committed before that run executed; the outputs of that run are bound to a Merkle root and proven via Schnorr sigma proof; and miner prediction track records are accumulating on-chain with tamper-evident commit-reveal commitments.
An independent verifier can confirm that the Schnorr proofs are valid, re-download the training data and recompute the training hash, and audit the miner prediction ledger. No trusted party is required.
The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this document are to be interpreted as described in RFC 2119.
| Term | Definition |
|---|---|
| GENESIS_M | The 3×3 stochastic transition matrix governing the Markovian Protocol. Trained on OHLCV data for GLD, QQQ, SPY, TLT, and USO from 2000 through 2026. Each row sums to 1.0. States: ACCUMULATION, MARKUP, DISTRIBUTION indexed 0, 1, 2. |
| training_hash | The SHA-256 digest of a canonical JSON serialization of all OHLCV rows used to train GENESIS_M, sorted deterministically. See Section 3.6. |
| m_version | Integer. Increments when GENESIS_M is updated through governance. Current production version: 1. |
| synthesis_run | A single execution of the SigmaSynth intelligence pipeline producing a regime classification and associated signal outputs. |
| merkle_root | The SHA-256 Merkle root computed over the output fields of a synthesis run. Primary identifier for a synthesis run and the single input to the Markovian Proof verifier. |
| input_hash | SHA-256 of the canonical JSON serialization of input fields provided to a synthesis run before that run executed. |
| bundle_hash | SHA-256 of the four layer validity flags encoded as a fixed-length byte string. See Section 5.2. |
| Markovian Proof | The complete four-layer cryptographic bundle identified by a merkle_root. |
| Kov | Atomic unit of the MKV token. 1 MKV = 100,000,000 Kovs. |
| pre-provenance | A synthesis run predating the deployment of the input provenance system (Layer 3). Identified by input_hash = 'pre-provenance'. Layer 3 validity for pre-provenance runs is null, not false. |
All commitments and proofs use the BN128 pairing-friendly elliptic curve, defined in Ethereum EIP-196. The generator point G is the standard BN128 G1 generator. The curve order is:
curve_order = 21888242871839275222246405745257275088548364400416034343698204186575808495617
The security of all proofs in this specification reduces to the discrete logarithm assumption on BN128. This is the same assumption underlying Ethereum's ZK-EVM.
A second generator H is derived deterministically:
_H_SEED = int(SHA-256("SigmaSynth-H-generator-v1"), 16) mod curve_order
H = _H_SEED * G
The string "SigmaSynth-H-generator-v1" is encoded as UTF-8. This derivation is public and requires no trusted setup. H's discrete log relative to G is unknown to any party.
A Pedersen commitment to scalar m with blinding factor r:
C = r*G + m*H
Pedersen commitments are perfectly hiding, and computationally binding under the discrete log assumption provided that no party knows the discrete log of H relative to G.
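The commitment above can be sketched in pure Python without external libraries. This is a minimal illustration, not the reference implementation (which uses py_ecc): the affine-coordinate helpers `ec_add` and `ec_mul`, the constant name `FIELD_P`, and the function name `pedersen_commit` are assumptions introduced here. The field modulus and the generator (1, 2) are the BN128 G1 parameters from EIP-196, and H is derived exactly as written in the text above.

```python
import hashlib

# BN128 G1 parameters (EIP-196): y^2 = x^3 + 3 over F_p, generator (1, 2).
FIELD_P = 21888242871839275222246405745257275088696311157297823662689037894645226208583
CURVE_ORDER = 21888242871839275222246405745257275088548364400416034343698204186575808495617
G = (1, 2)


def ec_add(p1, p2):
    """Add two affine points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % FIELD_P == 0:
        return None  # p2 is the negation of p1
    if p1 == p2:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, FIELD_P) % FIELD_P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % FIELD_P, -1, FIELD_P) % FIELD_P
    x3 = (slope * slope - x1 - x2) % FIELD_P
    y3 = (slope * (x1 - x3) - y1) % FIELD_P
    return (x3, y3)


def ec_mul(k, point):
    """Double-and-add scalar multiplication (not constant time)."""
    k %= CURVE_ORDER
    result, addend = None, point
    while k:
        if k & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        k >>= 1
    return result


# Second generator, derived as specified: H = _H_SEED * G.
H_SEED = int(hashlib.sha256("SigmaSynth-H-generator-v1".encode("utf-8")).hexdigest(), 16) % CURVE_ORDER
H = ec_mul(H_SEED, G)


def pedersen_commit(r, m):
    """Return C = r*G + m*H for blinding factor r and message scalar m."""
    return ec_add(ec_mul(r, G), ec_mul(m, H))
```

Because the commitment is linear in (r, m), adding two commitments yields a commitment to the sums of their blinding factors and messages, which is a convenient sanity check for an implementation.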
A Schnorr sigma proof proves knowledge of (r, m) such that C = r*G + m*H, without revealing r or m. Made non-interactive via Fiat-Shamir.
Prover:
k1, k2 <- random in [1, curve_order - 1]
R = k1*G + k2*H
e = int(SHA-256(encode(C) || encode(R) || context)) mod curve_order
s1 = (k1 - e*r) mod curve_order
s2 = (k2 - e*m) mod curve_order
proof = { R, s1, s2, e, full_context }
Where full_context is the complete context string used to compute e, stored in the proof to enable independent verification.
Verifier:
lhs = s1*G + s2*H + e*C
assert lhs == R
e_check = int(SHA-256(encode(C) || encode(proof.R) || proof.full_context)) mod curve_order
assert e_check == e
Point coordinates are encoded as their integer representation. The context string is UTF-8 encoded.
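The prover and verifier steps above can be exercised end to end with the following sketch. It is an illustration under stated assumptions rather than the reference code: the elliptic-curve helpers are a plain affine implementation written for this example, and `encode` assumes that "integer representation" means the decimal x and y coordinates joined by a comma, which the text does not specify. The function names `challenge`, `prove`, and `verify` are likewise hypothetical.

```python
import hashlib
import secrets

# BN128 G1 parameters (EIP-196): y^2 = x^3 + 3 over F_p, generator (1, 2).
FIELD_P = 21888242871839275222246405745257275088696311157297823662689037894645226208583
CURVE_ORDER = 21888242871839275222246405745257275088548364400416034343698204186575808495617
G = (1, 2)


def ec_add(p1, p2):
    """Add two affine points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % FIELD_P == 0:
        return None
    if p1 == p2:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, FIELD_P) % FIELD_P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % FIELD_P, -1, FIELD_P) % FIELD_P
    x3 = (slope * slope - x1 - x2) % FIELD_P
    return (x3, (slope * (x1 - x3) - y1) % FIELD_P)


def ec_mul(k, point):
    """Double-and-add scalar multiplication (not constant time)."""
    k %= CURVE_ORDER
    result, addend = None, point
    while k:
        if k & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        k >>= 1
    return result


H_SEED = int(hashlib.sha256(b"SigmaSynth-H-generator-v1").hexdigest(), 16) % CURVE_ORDER
H = ec_mul(H_SEED, G)


def encode(point):
    # Assumption: decimal "x,y"; the spec only says "integer representation".
    return f"{point[0]},{point[1]}".encode("utf-8")


def challenge(C, R, context):
    """Fiat-Shamir challenge e = SHA-256(encode(C) || encode(R) || context) mod n."""
    digest = hashlib.sha256(encode(C) + encode(R) + context.encode("utf-8")).hexdigest()
    return int(digest, 16) % CURVE_ORDER


def prove(r, m, context):
    """Commit to (r, m) and prove knowledge of the opening."""
    C = ec_add(ec_mul(r, G), ec_mul(m, H))
    k1 = secrets.randbelow(CURVE_ORDER - 1) + 1  # uniform in [1, n - 1]
    k2 = secrets.randbelow(CURVE_ORDER - 1) + 1
    R = ec_add(ec_mul(k1, G), ec_mul(k2, H))
    e = challenge(C, R, context)
    s1 = (k1 - e * r) % CURVE_ORDER
    s2 = (k2 - e * m) % CURVE_ORDER
    return C, {"R": R, "s1": s1, "s2": s2, "e": e, "full_context": context}


def verify(C, proof):
    """Recompute the challenge, then check s1*G + s2*H + e*C == R."""
    if challenge(C, proof["R"], proof["full_context"]) != proof["e"]:
        return False
    lhs = ec_add(ec_add(ec_mul(proof["s1"], G), ec_mul(proof["s2"], H)), ec_mul(proof["e"], C))
    return lhs == proof["R"]
```

The verification equation holds because s1*G + s2*H + e*C = (k1 - e*r)*G + (k2 - e*m)*H + e*(r*G + m*H) = k1*G + k2*H = R. Changing the context string or the commitment changes the recomputed challenge, so the check fails.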
The Merkle tree uses SHA-256 at every node. Leaves are SHA-256 digests of field values. Fields are ordered canonically by alphabetical key name. The root is the SHA-256 of the concatenation of the two child hashes at the root level.
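A possible reading of this construction is sketched below. Several details are not fixed by the text and are assumptions of this example: each field value is serialized as JSON before hashing, child hashes are concatenated as raw bytes, and an odd node at any level is paired with a copy of itself. The function name `merkle_root` and the sample field `confidence` are hypothetical.

```python
import hashlib
import json


def sha256_bytes(data):
    return hashlib.sha256(data).digest()


def merkle_root(fields):
    """Merkle root over field values, with leaves ordered by alphabetical key."""
    if not fields:
        raise ValueError("at least one output field is required")
    # Leaf = SHA-256 of the field value (assumption: JSON-serialized, UTF-8).
    level = [
        sha256_bytes(json.dumps(fields[key], sort_keys=True).encode("utf-8"))
        for key in sorted(fields)
    ]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last node
        # Parent = SHA-256 of the two child hashes concatenated as raw bytes.
        level = [sha256_bytes(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()
```

Because leaves are ordered by key name, the root does not depend on the order in which the output fields were produced, only on their names and values.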
rows = all OHLCV rows for [GLD, QQQ, SPY, TLT, USO] from 2000-01-01 through 2026-12-31
rows_sorted = sorted(rows, key=(ticker, date))
payload = JSON.dumps([row.to_dict() for row in rows_sorted], sort_keys=True)
training_hash = SHA-256(payload.encode("utf-8"))
The production training hash for GENESIS_M m_version=1:
4aef008fce8c008ce3b7efa992c64bebbfd85ea637835bf5d317ea351559ad0e
This covers 29,795 OHLCV rows.
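The training hash computation can be written as runnable Python roughly as follows. This sketch assumes each row is already a plain dict with `ticker` and `date` keys and ISO-formatted date strings; the remaining column names and all values in `example_rows` are placeholders for illustration, not real market data, and they will not reproduce the production hash.

```python
import hashlib
import json


def training_hash(rows):
    """SHA-256 over the canonical JSON of rows sorted by (ticker, date)."""
    rows_sorted = sorted(rows, key=lambda row: (row["ticker"], row["date"]))
    payload = json.dumps(rows_sorted, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Placeholder rows for illustration only; these are not real market data.
example_rows = [
    {"ticker": "SPY", "date": "2000-01-04", "open": 2.0, "high": 2.5,
     "low": 1.5, "close": 2.2, "volume": 200},
    {"ticker": "GLD", "date": "2004-11-18", "open": 1.0, "high": 1.5,
     "low": 0.5, "close": 1.2, "volume": 100},
    {"ticker": "SPY", "date": "2000-01-03", "open": 3.0, "high": 3.5,
     "low": 2.5, "close": 3.2, "volume": 300},
]

digest = training_hash(example_rows)
```

Reproducing a published hash independently requires byte-identical serialization: the same column names, the same numeric types and float formatting, and the same `json.dumps` separators as the original run.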
for each key k, value v in inputs:
    if v is a string and len(v) > 500:
        sanitized[k] = "sha256:" + SHA-256(v.encode("utf-8"))
    elif v is numeric, boolean, or null:
        sanitized[k] = v
    else:
        sanitized[k] = str(v)[:500]
payload = JSON.dumps(sanitized, sort_keys=True)
input_hash = SHA-256(payload.encode("utf-8"))
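The sanitization rules above translate almost directly into Python. The sketch below is one runnable interpretation: the function names `sanitize_inputs` and `input_hash` and the constant `MAX_LEN` are introduced for this example, and "numeric" is taken to mean `int` or `float`.

```python
import hashlib
import json

MAX_LEN = 500  # length limit from the sanitization rules above


def sanitize_inputs(inputs):
    """Apply the three sanitization rules to each input field."""
    sanitized = {}
    for key, value in inputs.items():
        if isinstance(value, str) and len(value) > MAX_LEN:
            # Long strings are replaced by a tagged digest.
            digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
            sanitized[key] = "sha256:" + digest
        elif value is None or isinstance(value, (bool, int, float)):
            sanitized[key] = value
        else:
            # Short strings pass through; other types are stringified and truncated.
            sanitized[key] = str(value)[:MAX_LEN]
    return sanitized


def input_hash(inputs):
    """SHA-256 of the sorted-key JSON serialization of the sanitized inputs."""
    payload = json.dumps(sanitize_inputs(inputs), sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Note that lists and dicts fall into the final branch and are committed through their `str()` form, so their hash depends on Python's string representation of the value.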
Layer 1 proves that GENESIS_M was committed to a specific training dataset at a recorded time.
The commitment scalar is derived from the matrix:
m_scalar = int(SHA-256(JSON.dumps(GENESIS_M.tolist())), 16) mod curve_order
A Pedersen commitment C_m = r*G + m_scalar*H is computed and a Schnorr proof is recorded. This commitment is made once per m_version before any synthesis runs under that version.
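The scalar derivation can be checked in isolation with a few lines of Python. The matrix below is a hypothetical row-stochastic example, not the production GENESIS_M, and the sketch assumes the JSON string is UTF-8 encoded before hashing, which the formula above leaves implicit.

```python
import hashlib
import json

CURVE_ORDER = 21888242871839275222246405745257275088548364400416034343698204186575808495617


def matrix_scalar(matrix_rows):
    """Map a matrix (list of row lists) to a scalar in [0, CURVE_ORDER)."""
    payload = json.dumps(matrix_rows)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return int(digest, 16) % CURVE_ORDER


# Hypothetical 3x3 transition matrix for illustration; each row sums to 1.
example_m = [
    [0.90, 0.08, 0.02],
    [0.05, 0.90, 0.05],
    [0.03, 0.07, 0.90],
]

m_scalar = matrix_scalar(example_m)
```

Any change to a single matrix entry, including a change in float formatting, produces a different scalar and therefore a different commitment.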
Storage schema (m_provenance table):
| Field | Type | Description |
|---|---|---|
| id | INTEGER | m_version |
| training_hash | TEXT | SHA-256 of training data |
| training_rows | INTEGER | Row count |
| m_commitment | TEXT | JSON-encoded C_m point [x, y] |
| zk_proof | JSONB | Schnorr proof {R, s1, s2, e, full_context} |
| zk_valid | BOOLEAN | Self-verification result at commit time |
| committed_at | TIMESTAMP | UTC commit time |
Layer 1 verification:
Layer 1 must be valid for a Markovian Proof to be considered valid.
Layer 2 proves that a specific set of outputs was Merkle-attested and Schnorr-committed.
For each synthesis run, the output fields are serialized, a Merkle root is computed over them, and a Pedersen commitment and Schnorr proof are recorded.
Storage schema (signal_provenance table, Layer 2 columns):
| Field | Type | Description |
|---|---|---|
| merkle_root | TEXT | Primary key. SHA-256 Merkle root. |
| model | TEXT | Model identifier |
| gate | TEXT | Gate decision at synthesis time |
| zk_commitment | TEXT | JSON-encoded output commitment point |
| zk_proof | JSONB | Schnorr proof |
| zk_valid | BOOLEAN | Self-verification at synthesis time |
| run_at | TIMESTAMP | UTC synthesis time |
Layer 2 verification:
Layer 2 must be valid for a Markovian Proof to be considered valid.
Layer 3 proves that the input vector was committed to before synthesis executed.
The commitment must be recorded before the synthesis pipeline runs. A Layer 3 proof recorded after synthesis output is not compliant.
Storage schema (signal_provenance table, Layer 3 columns):
| Field | Type | Description |
|---|---|---|
| input_hash | TEXT | SHA-256 of sanitized inputs, or 'pre-provenance' |
| input_fields | JSONB | Truncated input field snapshot |
| input_commitment | TEXT | JSON-encoded input commitment point |
| input_zk_proof | JSONB | Schnorr proof with full_context |
| input_zk_valid | BOOLEAN | Self-verification at commit time |
Pre-provenance handling:
Synthesis runs predating the deployment of Layer 3 must have input_hash = 'pre-provenance'. For such rows, Layer 3 validity in the bundle must be returned as null. It must not be returned as false. Pre-provenance is a historical designation, not a verification failure.
Layer 3 verification:
input_hash = 'pre-provenance': return null.
Layer 4 accumulates miner prediction track records via a commit-reveal scheme.
Commitment:
nonce = 32 random bytes from CSPRNG
commitment = SHA-256(address || ticker || regime || str(target_block) || nonce.hex())
Where || denotes string concatenation. The commitment is posted on-chain at block height H_commit. The prediction claims the regime of ticker at block height target_block.
Resolution:
At target_block, the actual regime is determined from the chain. The miner reveals (ticker, regime, target_block, nonce). The commitment is recomputed from the revealed values and compared with the commitment posted on-chain. If they match, the prediction is scored as accurate or inaccurate against the actual regime. The miner's credibility score is updated and never resets.
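The commit and reveal steps can be sketched as follows. The function names and the sample address are hypothetical, and the preimage is built by plain string concatenation with no delimiter, following the commitment formula as written.

```python
import hashlib
import hmac
import secrets


def commitment_hash(address, ticker, regime, target_block, nonce):
    """SHA-256 over address || ticker || regime || str(target_block) || nonce.hex()."""
    preimage = address + ticker + regime + str(target_block) + nonce.hex()
    return hashlib.sha256(preimage.encode("utf-8")).hexdigest()


def commit(address, ticker, regime, target_block):
    """Draw a fresh 32-byte nonce from the CSPRNG and return (commitment, nonce)."""
    nonce = secrets.token_bytes(32)
    return commitment_hash(address, ticker, regime, target_block, nonce), nonce


def reveal_matches(posted_hash, address, ticker, regime, target_block, nonce):
    """Recompute the commitment from the revealed values and compare."""
    recomputed = commitment_hash(address, ticker, regime, target_block, nonce)
    return hmac.compare_digest(posted_hash, recomputed)


# Hypothetical miner address and prediction, for illustration only.
posted, nonce = commit("mkv1exampleaddress", "SPY", "MARKUP", 120000)
```

A reveal that changes any committed value, for instance claiming DISTRIBUTION after the fact, no longer matches the posted hash. Scoring the prediction against the actual regime is a separate step that happens only after the reveal matches.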
Storage schema (miner_predictions table):
| Field | Type | Description |
|---|---|---|
| id | INTEGER | Auto-increment |
| address | TEXT | Miner address |
| ticker | TEXT | Predicted ticker |
| predicted_regime | TEXT | One of ACCUMULATION, MARKUP, DISTRIBUTION |
| target_block | INTEGER | Block at which to resolve |
| commitment_hash | TEXT | SHA-256 commitment |
| nonce | TEXT | Revealed nonce (null until resolved) |
| resolved | BOOLEAN | Whether prediction has been resolved |
| correct | BOOLEAN | Whether prediction was accurate (null until resolved) |
| committed_at | TIMESTAMP | UTC commitment time |
| resolved_at | TIMESTAMP | UTC resolution time (null until resolved) |
Layer 4 in the bundle:
If no resolved predictions exist: Layer 4 validity is null. If resolved predictions exist: Layer 4 validity is a function of prediction accuracy. Credibility scores must be computed only over resolved predictions.
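As an illustration, the Layer 4 entry of the bundle could be derived from rows of the miner_predictions table as sketched below. The specification does not state the mapping from accuracy to validity at this point, so the `threshold` parameter and its default of 0.5 are assumptions of this example, as is the function name `layer4_status`.

```python
def layer4_status(predictions, threshold=0.5):
    """Summarize miner credibility over resolved predictions only.

    `predictions` is a list of dicts with `resolved` and `correct` keys,
    mirroring the miner_predictions columns. `threshold` is hypothetical.
    """
    resolved = [p for p in predictions if p["resolved"]]
    if not resolved:
        # No evidence yet: validity is null, never false.
        return {
            "valid": None,
            "total_predictions": len(predictions),
            "resolved_predictions": 0,
            "accuracy": None,
            "note": "no resolved predictions",
        }
    correct = sum(1 for p in resolved if p["correct"])
    accuracy = correct / len(resolved)
    return {
        "valid": accuracy >= threshold,
        "total_predictions": len(predictions),
        "resolved_predictions": len(resolved),
        "accuracy": accuracy,
        "note": None,
    }
```

Unresolved predictions count toward `total_predictions` but never toward `accuracy`, so a miner with many open commitments is neither rewarded nor penalized until those commitments resolve.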
{
  "type": "markovian_proof",
  "version": 1,
  "merkle_root": "<64 lowercase hex chars>",
  "run_at": "<ISO 8601 UTC>",
  "bundle_hash": "<64 lowercase hex chars>",
  "valid": <bool>,
  "layers": {
    "m_provenance": {
      "valid": <bool>,
      "m_version": <int>,
      "training_hash": "<64 hex chars>",
      "training_rows": <int>,
      "committed_at": "<ISO 8601 UTC>"
    },
    "output_proof": {
      "valid": <bool>,
      "model": "<string>",
      "gate": "<string>",
      "run_at": "<ISO 8601 UTC>"
    },
    "input_provenance": {
      "valid": <bool | null>,
      "input_hash": "<64 hex chars | 'pre-provenance'>",
      "note": "<string if null>"
    },
    "miner_credibility": {
      "valid": <bool | null>,
      "total_predictions": <int>,
      "resolved_predictions": <int>,
      "accuracy": <float | null>,
      "note": "<string if null>"
    }
  }
}
All string fields are UTF-8. Numeric fields are JSON numbers. Boolean and null fields use JSON boolean and null literals.
def encode_validity(v):
    if v is True: return b'\x01'
    if v is False: return b'\x00'
    if v is None: return b'\xff'
payload = (
    encode_validity(layers["m_provenance"]["valid"]) +
    encode_validity(layers["output_proof"]["valid"]) +
    encode_validity(layers["input_provenance"]["valid"]) +
    encode_validity(layers["miner_credibility"]["valid"])
)
bundle_hash = SHA-256(payload).hexdigest()
valid = (layers["m_provenance"]["valid"] == True) AND
        (layers["output_proof"]["valid"] == True)
Layers 3 and 4 contribute to the bundle_hash and to credibility assessment. They do not gate the top-level validity determination. A proof with Layer 1 and Layer 2 passing is a valid Markovian Proof. Layers 3 and 4 strengthen the proof as the system matures.
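The flag encoding and the top-level validity rule can be combined in a short runnable sketch. The helper names `bundle_hash` and `top_level_valid` and the `LAYER_ORDER` tuple are introduced for this example. Raising an error on a value other than true, false, or null is a choice made here, not something the specification requires.

```python
import hashlib

# Layer order is fixed so the payload is always exactly four bytes.
LAYER_ORDER = ("m_provenance", "output_proof", "input_provenance", "miner_credibility")


def encode_validity(flag):
    """Map True / False / None to a single byte."""
    if flag is True:
        return b"\x01"
    if flag is False:
        return b"\x00"
    if flag is None:
        return b"\xff"
    raise ValueError(f"validity flag must be True, False, or None, got {flag!r}")


def bundle_hash(layers):
    """SHA-256 over the four encoded validity flags, as lowercase hex."""
    payload = b"".join(encode_validity(layers[name]["valid"]) for name in LAYER_ORDER)
    return hashlib.sha256(payload).hexdigest()


def top_level_valid(layers):
    """Only Layers 1 and 2 gate the top-level validity."""
    return layers["m_provenance"]["valid"] is True and layers["output_proof"]["valid"] is True


# A pre-provenance run with no resolved predictions: Layers 3 and 4 are null.
example_layers = {
    "m_provenance": {"valid": True},
    "output_proof": {"valid": True},
    "input_provenance": {"valid": None},
    "miner_credibility": {"valid": None},
}
```

For `example_layers` the payload is the four bytes `01 01 ff ff` and the bundle is valid at the top level, which illustrates that null in Layers 3 and 4 changes the bundle_hash without failing the proof.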
GET /verify/{merkle_root}
merkle_root must be 64 lowercase hexadecimal characters. Any other format must return HTTP 400.
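A minimal format check for the path parameter might look like the following. The function name is hypothetical. The check uses `fullmatch` rather than a `$`-anchored `match`, because `$` would also accept a value with a trailing newline.

```python
import re

# Exactly 64 characters, each a digit or a lowercase hex letter.
MERKLE_ROOT_RE = re.compile(r"[0-9a-f]{64}")


def is_valid_merkle_root(value):
    """Return True only for a 64-character lowercase hex string."""
    return isinstance(value, str) and MERKLE_ROOT_RE.fullmatch(value) is not None
```

A handler would return HTTP 400 whenever this check fails, before touching the database.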
1. Fetch m_provenance by m_version (latest)
2. Run Schnorr verify on Layer 1 (m_commitment, zk_proof)
3. Fetch signal_provenance by merkle_root
4. If not found: return HTTP 404
5. Run Schnorr verify on Layer 2 (zk_commitment, zk_proof)
6. If input_hash == 'pre-provenance':
       Layer 3 valid = null, note = "predates provenance system"
   Else:
       Run Schnorr verify on Layer 3 (input_commitment, input_zk_proof)
7. Fetch miner_predictions aggregate
8. If no resolved predictions:
       Layer 4 valid = null, note = "no resolved predictions"
   Else:
       Compute accuracy = correct / resolved
       Layer 4 valid = (accuracy >= threshold)
9. Compute bundle_hash per Section 5.2
10. Set top-level valid per Section 5.3
11. Return bundle
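The eleven steps can be sketched as a single function over in-memory data. This is an outline under assumptions, not the api_server.py implementation: database access is replaced by plain dicts, the Schnorr check is passed in as a `schnorr_verify(commitment, proof)` callable, the `threshold` default of 0.5 is hypothetical, and the returned bundle is abbreviated to the fields needed to follow the control flow.

```python
import hashlib
import re

MERKLE_ROOT_RE = re.compile(r"[0-9a-f]{64}")
FLAG_BYTES = {True: b"\x01", False: b"\x00", None: b"\xff"}


def verify_bundle(merkle_root, m_row, signal_rows, predictions, schnorr_verify, threshold=0.5):
    """Return (http_status, body) for one merkle_root, following steps 1-11."""
    if not MERKLE_ROOT_RE.fullmatch(merkle_root):
        return 400, {"error": "invalid merkle_root format"}

    # Steps 1-2: Layer 1 (matrix provenance).
    layer1 = bool(schnorr_verify(m_row["m_commitment"], m_row["zk_proof"]))

    # Steps 3-4: look up the synthesis run.
    row = signal_rows.get(merkle_root)
    if row is None:
        return 404, {"error": "merkle_root not found in signal_provenance"}

    # Step 5: Layer 2 (output proof).
    layer2 = bool(schnorr_verify(row["zk_commitment"], row["zk_proof"]))

    # Step 6: Layer 3 (input provenance); null for pre-provenance runs.
    if row["input_hash"] == "pre-provenance":
        layer3, note3 = None, "predates provenance system"
    else:
        layer3 = bool(schnorr_verify(row["input_commitment"], row["input_zk_proof"]))
        note3 = None

    # Steps 7-8: Layer 4 (miner credibility) over resolved predictions only.
    resolved = [p for p in predictions if p["resolved"]]
    if resolved:
        accuracy = sum(1 for p in resolved if p["correct"]) / len(resolved)
        layer4, note4 = accuracy >= threshold, None
    else:
        accuracy, layer4, note4 = None, None, "no resolved predictions"

    # Step 9: bundle_hash over the four validity flags.
    flags = (layer1, layer2, layer3, layer4)
    bundle_hash = hashlib.sha256(b"".join(FLAG_BYTES[f] for f in flags)).hexdigest()

    # Steps 10-11: top-level validity depends on Layers 1 and 2 only.
    return 200, {
        "type": "markovian_proof",
        "version": 1,
        "merkle_root": merkle_root,
        "bundle_hash": bundle_hash,
        "valid": layer1 and layer2,
        "layers": {
            "m_provenance": {"valid": layer1, "m_version": m_row["id"]},
            "output_proof": {"valid": layer2},
            "input_provenance": {"valid": layer3, "input_hash": row["input_hash"], "note": note3},
            "miner_credibility": {"valid": layer4, "accuracy": accuracy, "note": note4},
        },
    }
```

Passing the verifier in as a callable keeps the control flow testable without elliptic-curve code: a stub that accepts or rejects by proof value is enough to exercise the 400, 404, and null-layer paths.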
For a given merkle_root, the response must be deterministic with the following exception: Layer 4 data updates as predictions resolve. The response must reflect the current state of the miner_predictions table at request time. Implementations should not cache Layer 4 data.
| Code | Condition |
|---|---|
| 200 | Bundle returned successfully |
| 400 | Invalid merkle_root format |
| 404 | merkle_root not found in signal_provenance |
| 500 | Internal error; should include error field |
POST /predict
Content-Type: application/json
{
  "address": "<miner address>",
  "ticker": "<ticker symbol>",
  "regime": "<ACCUMULATION | MARKUP | DISTRIBUTION>",
  "target_block": <integer>
}
Server computes nonce and commitment. Returns:
{
  "commitment_hash": "<64 hex chars>",
  "nonce": "<64 hex chars>",
  "target_block": <int>,
  "record_id": <int>
}
The nonce must be stored by the miner for reveal at resolution time. Loss of the nonce makes the prediction unresolvable.
GET /credibility/{address}
Returns:
{
  "address": "<address>",
  "total": <int>,
  "resolved": <int>,
  "correct": <int>,
  "accuracy": <float | null>,
  "predictions": [...]
}
The security of Layers 1, 2, and 3 reduces entirely to the discrete logarithm assumption on BN128. Under this assumption, a computationally bounded adversary cannot produce a valid Schnorr proof for a commitment without knowing the committed scalar. This means: the committed matrix cannot be substituted after commitment, the committed inputs cannot be altered after commitment, and the committed output cannot be substituted after commitment.
The second generator H is derived deterministically from a public string. Its discrete log relative to G is unknown. Any party may verify this derivation. No trusted setup is required.
Layer 1 proves that GENESIS_M was committed to a training dataset identified by training_hash. It does not prove that the HMM was trained correctly on that dataset. Proving Baum-Welch expectation-maximization in zero knowledge would require a general computation circuit at the current limits of zkML research. An independent verifier can: download the raw OHLCV data, recompute the training_hash, assert equality with the committed value, and then re-run HMM training independently to verify that the same M results. This is a weaker guarantee than a ZK execution proof. It is an honest characterization of what the system provides.
Layer 4 security is economic, not cryptographic. The commitment scheme prevents retroactive claims. The SHA-256 preimage of the commitment cannot be found without the nonce. However, the credibility score is only meaningful after a statistically significant number of resolved predictions across multiple market regimes. In early protocol operation, Layer 4 validity is null by design. Implementations must not treat null as evidence of a security failure.
Each Schnorr proof includes a full_context string that binds the proof to a specific run. The verifier must detect a valid proof that has been copied from one synthesis run to another. The Fiat-Shamir challenge incorporates both the commitment point and the full_context, each of which differs between runs, so a copied proof fails the challenge recomputation when checked against the new run's commitment.
Synthesis runs predating Layer 3 deployment are designated pre-provenance. This is a historical fact, not a vulnerability. The output proofs (Layer 2) for pre-provenance runs are valid. Only the input pre-commitment is absent. Implementations must represent this as null, not false.
The proof system specified in this document is not post-quantum resistant. The discrete logarithm assumption on BN128 does not hold against an adversary equipped with a fault-tolerant quantum computer running Shor's algorithm. This is a known limitation shared by all deployed elliptic curve proof systems, including Ethereum's ZK-EVM.
A future protocol version will substitute a lattice-based commitment scheme in place of BN128 Pedersen commitments and Schnorr sigma proofs. The four-layer bundle structure, Merkle root format, and archive schema are designed to be upgrade-compatible: the cryptographic layer can be replaced without modifying the output format or historical archive. Such an upgrade requires a governance-ratified matrix version increment and a coordinated hard fork at a known block height.
Until that upgrade is ratified, the security of this protocol depends on the assumption that fault-tolerant quantum computation at the scale required to break 128-bit elliptic curve security remains unavailable. Implementations should monitor developments in this area and must not represent the current proof system as post-quantum resistant.
Python 3.11. Library: py_ecc 8.0.0 for BN128 operations.
| File | Purpose |
|---|---|
| zk_m_provenance.py | Layer 1 commit and verify |
| zk_input_provenance.py | Layer 3 commit and verify |
| neo_claude_synthesis.py | Synthesis pipeline with Layers 2+3 wired |
| miner_predictions.py | Layer 4 commit-reveal |
| api_server.py | Verification endpoint |
PostgreSQL 14+ with TimescaleDB. Tables: m_provenance, signal_provenance, miner_predictions.
https://api.quantsynth.net/verify/{merkle_root}
This document defines no new protocol parameters requiring IANA registration.