Relational Cell Morphing

for Array-Manipulating Programs

Marco Gaboardi (Boston University)  ·  gaboardi@bu.edu PRESENTER

Eric Koskinen (Stevens Institute of Technology)  ·  ekoskine@stevens.edu

Hiroshi Unno (Tohoku University)  ·  hiroshi.unno@acm.org

Weihao Qu (Monmouth University)  ·  wqu@monmouth.edu

ARRAY 2026 Workshop

Background: relational properties involving arrays

Relational verification compares two executions. Four relational properties, each with its program and formal claim:

1. monotonicity 2. robustness 3. non-interference 4. relative cost

Monotonicity

arrayMax(A, n): m := A[0] for i = 1 to n-1: if A[i] > m: m := A[i] return m Relational property: if A1[i] ≤ A2[i] for all i, then max(A1) ≤ max(A2)

Robustness / sensitivity

clampPositive(A, n): for i = 0 to n-1: B[i] := max(A[i], 0) return B Relational property: if A1, A2 differ at one index, then Hamming(B1, B2) ≤ 1

Non-interference

arrayDoubleOpNI(A, n): out := 0 for i = 0 to n-1: out := out + 2*pub(A[i]) return out Relational property: if A1 =_L A2 (public part agrees), then out1 = out2

Relative cost analysis

countPositive(A, n): cost := 0 for i = 0 to n-1: if A[i] > 0: cost++ Relational property: if A1, A2 share ≥ d_0 equal positions, then cost1 − cost2 ≤ n − d_0
Talk goal: prove these array-relational claims automatically, with no array-theory reasoning in the solver query.
1 / 4

Level 1 cost analysis: valid but coarse

CountPositive: cost(A) = #{ i : A[i] > 0 }. Relational cost compares two runs: c = cost(A₁) − cost(A₂).

L1 is safe but coarse It sees one fixed-cell flag bk, but it cannot count the full equality set size d₀.
running example
countPositive(A,n): for i = 0..n-1: if A[i] > 0: cost++A1 = [3, 5, 4, -2, 5]A2 = [3, -1, 4, -2, 5]
truthd₀ = 4, n − d₀ = 1
L1 boundn − 1 = 4
what L1 sees
sees bk

one equality flag at the fixed tracked cell k

cannot count d₀

how many array indices are equal across the two runs

fixed-k pfwnCSP skeleton
State: Inv(i, n, k, ak1, ak2, bk, c)HIT equal: i = k ∧ bk = 1 ⇒ c' = cMISS: i ≠ k ⇒ c' = c + 1Goal: c ≤ n − 1 when bk = 1
safe fixed-k boundc ≤ n − 1
missed tight targetc ≤ n − d₀
PCSAT probesP sat,37N unsat,2V unsat,9Probe key: P positive bound · N negated goal · V vacuity check

Takeaway: fixed-k L1 charges equal cells it cannot track

Only cell k can take HIT. Every other position falls through MISS, so equal cells outside k still add +1.

truth c0 L1 charge0 tight n − d₀1
One rule creates the gap
fixed k = 0HIT equal: i = k ∧ bk = 1 ⇒ c' = cMISS: i ≠ k ⇒ c' = c + 1
HIT equal i = k ∧ bk = 1 charge +0
MISS cost-different i ≠ k ∧ [A₁[i] > 0] ≠ [A₂[i] > 0] legitimate +1
MISS equal i ≠ k ∧ A₁[i] = A₂[i] over-charge +1
Press Step to scan positions i = 0 → 4.
MISS over-charges at i = 2, 3, 4: the cells are equal, but fixed-k L1 does not represent those equalities.
Truth: c = 1 = n − d₀ L1 charge: 4 = n − 1 Gap: d₀ − 1 = 3

Takeaway: ARel's β is a set; CHC sees its size d₀ and one bit bk

ARel uses β as a mask of positions where the array relation holds [ARel, ICFP'19].

Bridge: β relation mask → prophecy scalar d₀ This CHC does not store β as a set; it carries d₀ = |β| and queries bk = β(k).
d / d₀0 / 4 c0 n − d₀1
What survives in CHC
ARel: β = { i | A₁[i] = A₂[i] }example: β = {0,2,3,4}prophecy: d₀ = |β| = 4state: (..., bk, d, d₀, c)fixed k: bk = β(k) sees one cellnext: refocus witnesses d₀ β-cells
Press Step to build the β mask, then summarize it into d₀.
β is the set. d₀ is its size. A fixed k sees only one bit, not the whole set.

From cell morphing to relational cell morphing

Standard cell morphing
one execution
( k, a_k )

Track one arbitrary index and replace the array with one scalar cell.

same index k
in both executions
Relational cell morphing
two executions
( k, a_k1, a_k2, b_k )

Track the paired cells and carry a relation abstraction between them.

new ingredient b_k abstracts the relation between a_k1 and a_k2 a_k1 = a_k2 a_k1 ≤ a_k2 |a_k1 - a_k2| ≤ 1

Click a column to explore the relational cell (here b_k tracks equality). Everything else is abstracted by ∀ clauses.

( k, ak1, ak2, bk )
Pick a cell to track.
One paired cell replaces two whole arrays, so array-relational reasoning becomes scalar CHC: no array theory, no quantifier over the arrays.

The scalar CHC template

For this talk, we present three levels. Every encoding has the same skeleton; only the state and step cases grow.

shared CHC skeleton
Init: Inv( i, ...state... )Step: Inv(s') :- Inv(s), case at the tracked cellQuery: the property / bound holds at every terminal stateProbes: main (sat) + negation (unsat) + vacuity (unsat)
L1fixed relational cell
k, a_k1, a_k2, b_k

HIT / MISS at the fixed cell k.

monotonicity · order · NI
L2search + prophecy
+ d, d0, c

PickK searches for related cells, then settles when prophecy is fulfilled.

relative cost c ≤ n - d0
L3async scheduler
+ TT / TF / FT

Scheduler aligns runs whose source loops advance at different paces.

stride-mismatched async
Same Init / Step / Query / Probes throughout. L1 starts with the relational cell; L2 adds counters and PickK; L3 adds the async scheduler.

Backend: pfwnCSP + PCSat

The clause templates become pfwnCSP queries, and every evaluated encoding in this talk is solved by PCSat.

Array programstwo runs, arrays, relational goal
Relational cell morphingscalar state and proof clauses
pfwnCSP .clpCHC-style extension for PCSat
PCSatsynthesizes predicate interpretations
sat / unsatproof and probe evidence
Language

pfwnCSP is predicate-family-wise ∀-∃ constrained Horn clauses: a CHC-style generalization expressive enough for our encodings.

ordinary predicates functional predicates well-founded predicates
Tool

PCSat [Unno, Terauchi, Gu, Koskinen, CAV'21] synthesizes the predicate interpretations and invariants needed to satisfy those queries.

sat in our proof tables means the proof query is satisfiable. For negation and vacuity probes, unsat rules out the corresponding bad or vacuous witness.

Future backend standard CHC output

The three levels: L1, L2, L3

Two orthogonal axes: how the cell is tracked (fixed → PickK) and how the two runs are scheduled (sync → async).

L1  21L2  22L3  26
Cell trackingfixed cell kPickK + prophecyPickK
Schedulingsynchronoussynchronousmerged async scheduler
New stateb_kd, d0, cTT / TF / FT
Provesmonotonicity, NI, sensitivityrelational cost c ≤ n − d0stride-mismatched cost

Running example: CountPositive  cost = #{ i : A[i] > 0 }. L1 and L2 share the same synchronous program and differ only in the encoding; L3 changes the loop structure.

Synchronous program (L1 & L2)

P1: for i = 0 to n-1: if A1[i]>0: c1++P2: for i = 0 to n-1: if A2[i]>0: c2++
L1
fixed cell k
c ≤ n − 1

one tracked relational cell for the whole proof

L2
PickK refocuses
c ≤ n − d0

searches candidate positions; d,d0,c make the same program prove the tight bound

Stride-mismatched program (L3)

P1: for i = 0 to n-1: if A1[i]>0: c1++ (stride 1)P2: for i = 0,2,4,…: if A2[i]>0: c2++ (stride 2)merged scheduler re-aligns the paces⇒ c ≤ n − d0 (async, still tight)
Total: 69 mains with Tier-3 status across L1, L2, L3  (21 + 22 + 26; locked 2026-06-02).

Level 1 pfwnCSP encoding: arbitrary cell k

Level 1 tracks one arbitrary array index k. The proof obligation is parametric in k, so discharging the scalar clauses proves the relational obligation uniformly for every choice of k.

1
array relation ∀j. 0 ≤ A1[j] ≤ A2[j]

The assumed relation between the two runs' input arrays.

2
arbitrary cell 0 ≤ k < n; 0 ≤ A1[k] ≤ A2[k]

One distinguished index, left free; the proof is uniform in k.

3
scalar state Inv(i, k, ak1, ak2, m1, m2, n)

The relational cell plus running maxes. No arrays reach the solver query.

4
pfwnCSP clauses Init / HIT / MISS / Query

The four clause shapes the next slide assembles one by one.

All four ingredients are scalar: no array theory reaches the solver. Next: assemble them into the Level 1 clauses, one component at a time.

The same template proves robustness and non-interference; only the cell relation b_k changes: |ak1−ak2| ≤ 1 (robustness), ak1 = ak2 (NI).

Level 1 components, one by one

Worked example: ArrayMax monotonicity. Step () to assemble the seven Level 1 components; each lights up the matching clause in the encoding.

// ArrayMax (nonnegative): m := 0; for i: if A[i] > m: m := A[i]// prove: ∀i. 0 ≤ A1[i] ≤ A2[i] ⇒ max(A1) ≤ max(A2) State: Inv(i, k, ak1, ak2, m1, m2, n) ak1 = A1[k], ak2 = A2[k]; b_k is 0 ≤ ak1 ≤ ak2 Init: Inv(0, k, ak1, ak2, 0, 0, n) :- 0 ≤ k < n, 0 ≤ ak1 ≤ ak2, n > 0 HIT (i=k): m1' = max(m1, ak1), m2' = max(m2, ak2)MISS (i≠k): m1' ≥ m1, m2' ≥ m2 // others abstracted both branches keep m1' ≤ m2' (invariant) Query: m1 ≤ m2 at i = n
Level 1 component ledger
0 / 7
1
Relational goal∀j. 0 ≤ A1[j] ≤ A2[j] ⇒ max(A1) ≤ max(A2)
2
Arbitrary cell + scalar statek arbitrary; Inv(i,k,ak1,ak2,m1,m2,n)
3
Init clause0 ≤ k < n ∧ 0 ≤ ak1 ≤ ak2
4
Transition case: HITi = k ⇒ use ak1, ak2 exactly
5
Transition case: MISSi ≠ k ⇒ abstract other cells
6
Invariant obligationHIT and MISS both preserve m1' ≤ m2'
7
Query + solver evidencei = n ⇒ m1 ≤ m2PCSAT: sat,5 / neg unsat,3 / vac unsat,3 → FULL

L2: Sync + PickK + Prophecy

PROPHECY
d = d₀ fulfills prophecy
Found the promised d₀ = |β| equal positions.
SETTLED + BOUND
c ≤ n − d₀ ends tight
SETTLED lets c grow worst case while preserving the target bound.
d / d₀
0 / 3
c
0
n − d₀
3
Press Step to scan positions.

L2 — Property class coverage (22 examples)

Click a class to expand its sub-buckets.

Cost upper bound 16
Sensitivity (incl. subtypes) 6
Qualitative 0

Click a class to see its sub-buckets…

Locked corpus status: 59 FULL, 7 PARTIAL, 0 FAIL, 3 RETRY. Per-file details in report/TIER3_VERIFICATION_RECORDS.md.

L3 scheduler: TT / TF / FT, then SEARCH / SETTLE

L3 adds a proof scheduler. It does not change the source loops; it chooses how the relational encoding pairs the next source transitions.
Who steps next?
TTat shared pointboth runs take their next source transition in the same proof step
TFi₁ < i₂advance P1 while P2 waits
FTi₂ < i₁advance P2 while P1 waits
Inside TT: two phases
SETTLEd ≥ d₀

Prophecy threshold met. Keep kp = k; stop relying on further discovery; allow worst-case c.

stride 2v3 epoch: scheduler repairs pace mismatch
(0,0)TT / SEARCH (2,3)TF (4,3)FT (4,6)TF (6,6)TT / SETTLE
A TT starts at a shared comparison point, then both source loops move once. TF / FT catch up until the next joint proof step is available.
TF/FT repair pace mismatch. TT is where PickK searches for β positions, then settles once d ≥ d₀.

Why this scheduler is not cheating

Hostile-reviewer question: did we prove the program, or did we bake in a favorable schedule?
1
Source coverage

The modeled async-step cases are exhaustive: TT, TF, or FT. Each has a corresponding L3 transition case.

both stepTTP1 onlyTFP2 onlyFT
2
Lower bounds

During SEARCH, lower-bound clauses constrain proof traces to make progress toward promised β positions. They add invariant obligations, not source assumptions.

d < d₀ ⇒ require catch-up proof steps
3
Phase switch

Once d ≥ d₀, SETTLE stops relying on further discovery. It fixes kp = k and over-approximates cost gap c.

SEARCH → SETTLE
The scheduler guides the proof trace; it does not change the source trace.
Adequacy status: informal forward-simulation obligation. Evidence: transition-case audit + solver probes; formal forward-simulation proof in progress.

TT/SEARCH on CountPositive: aligned source cell -> CHC update

Running source example
P1: if A1[i1] > 0: cost1++
P2: if A2[i2] > 0: cost2++
c = cost1 - cost2
Names used below
beta hitaligned cells match; record d' = d + 1
HIT-unequalaligned cells differ; record d' = d and use guards for c
Step 0: source frame
This TT/SEARCH step records
schedulerSchTT_search
tracked indexnext: kp = i1 = i2
branchsource frame
updatestep to inspect the first aligned cell
We run CountPositive on two executions. A TT/SEARCH proof step only applies when the runs are aligned, so the source position becomes the tracked cell.
In TT/SEARCH, the scheduler only gives us the aligned step. Equality decides beta progress; the source guards decide the cost-gap update.

Payoff: L3 verifies all stride benchmarks

2v3 example encoding lcm 6
Inv(i1,i2,...,d,d0,c) TT/SEARCH: i1=i2, d<d0, kp=i1 equal selected cells + bk=1 => d'=d+1, c'=c unequal cases adjust c by guards i1'=i1+2, i2'=i2+3 TF: i1<i2, d<d0 => i1'=i1+2 FT: i2<i1, d<d0 => i2'=i2+3 SETTLE: d>=d0 GOAL: 2*c + 2*d0 <= n + 1
forced SEARCH schedule 2v3
(0,0)TT/search (2,3)TF (4,3)FT (4,6)TF (6,6)
Bidirectional catch-up forces the next alignment event, so PickK can discover fresh beta positions until d = d0.
Same L3 template, larger LCM alignments: 2v5 and 3v4 require no special case.
1v2lcm 2FULLsat,20
1v3lcm 3FULLsat,13
2v4lcm 4FULLsat,12
2v3lcm 6FULLsat,27
2v5lcm 10FULLsat,16
3v4lcm 12FULLsat,28

FULL = full benchmark obligation proved. sat,N are locked PCSAT evidence entries, not runtimes or a speed comparison.

L3 — Both cost models for the stride family

Each stride pair is encoded under two cost models. Click a model to see what it counts.

StrideSource-state  (c = source cost1 − cost2)V1 strict  (c = per-construct trace cost diff)
1v2sat,20 / unsat,2sat,54 / unsat,2
1v3sat,13 / unsat,2sat,43 / unsat,2
2v3sat,27 / unsat,3sat,51 / unsat,3
2v4sat,12 / unsat,2sat,60 / unsat,2
2v5sat,16 / unsat,1deferred (template limitation)
3v4sat,28 / unsat,3deferred (template limitation)
4 of 6 stride pairs verified under BOTH cost models. The two deferred V1 cases are solver-template limitations, not counterexamples; the source-state versions are FULL.

Experimental results across 69 Tier-3 CHC cases

59
FULL
required probes completed
7
PARTIAL
not all required probes completed or confirmed
3
RETRY
P probe omitted on retry list
0
FAIL
no expected probe contradicted
Status legend
P = positive bound · N = negated goal · V = vacuity check
FULLP sat + N unsat + V unsat; L2 also has V2 unsat.
PARTIALAt least one expected match is present, but not all required probes completed or confirmed.
RETRYP probe omitted because the main is on the retry list.
FAILAn expected probe is contradicted. Here: 0.
Corpus breakdown
By encoding level
L1 21L2 22L3 26
By property class
cost 41sensitivity 16qualitative 12
Most obligations fully verify; the remaining non-FULL cases are methodology-justified partials, not counterexamples.
Locked headline: 59 FULL / 7 PARTIAL / 0 FAIL / 3 RETRY.

Locked snapshot: 2026-06-02. Per-file records: report/TIER3_VERIFICATION_RECORDS.md.

Honest limitations

01

Solver/template incompleteness

Partial results mean PCSAT did not synthesize every positive proof; they do not imply invalid encodings.

7 PARTIAL; 3 RETRY
02

Corpus scope

The examples exercise mechanisms, not broad benchmark coverage over arbitrary array programs.

focused examples, not coverage
03

Evidence boundary

CHC probes validate encoded obligations; source adequacy and rule soundness are separate proof obligations.

encoding evidence ≠ full metatheory
Bottom line: the evidence supports the encoding story, while the companion proof story closes the semantic gap.

Roadmap: logic, automation, benchmarks

Logic

Finish the companion proof story

Complete the Level-3 awhile-repac rule and the source-to-CHC adequacy argument.

Automation

Strengthen solver strategy

Improve invariant templates and search strategy for LCM-heavy schedules and partial positive probes.

Benchmarks

Broaden the array corpus

Add new program families while preserving the Tier-3 probe discipline from the current evaluation.

L1fixed cellawhile
L2PickK + prophecyawhile-evolving
L3merged schedulerawhile-repac
Direction: keep the CHC encoding practical, and make the logic story explain exactly why the automation is sound.

Implementation + reproducibility

69 mains with Tier-3 status: 59 FULL, 7 PARTIAL, 0 FAIL, 3 RETRY. Reproducible via the CoAR Docker image (coar:latest, built locally; build steps available on request).

$ docker run --rm \ -v "$(pwd)":/root/coar/example \ coar:latest bash -c " cd /root/coar && ./main.exe -c ./config/solver/pcsat_tbq_ar.json -p pcsp \ ./example/encodings/level-2/demo2a_countpositive_pickk_sync_pos.clp"
Press Run to see the recorded solver output.

Solver probes verify the encoded CHCs; paired with the source-to-CHC adequacy argument from the logic side.

Recap

  • 69 mains with Tier-3 status: 59 FULL, 7 PARTIAL, 0 FAIL, 3 RETRY
  • One abstraction: relational cells + scalar CHCs across cost, sensitivity, qualitative classes
  • PickK + prophecy: discover shared positions and prove bounds such as c ≤ n − d_0
  • L3 merged scheduler: keeps stride-mismatched async runs relationally aligned
  • Honest boundary: probes verify encodings; source adequacy is argued separately
Q & A
Thank you!

What is cell morphing?

An array program can be encoded as CHCs directly, but that needs array theory: invariants must quantify over every element, which CHC solvers handle poorly. Cell morphing avoids array theory.

1. array CHC 2. quantified prefix 3. arbitrary k 4. scalar invariant

Start with the direct CHC view: the invariant talks about the whole array A.

Naive: CHC over array theory

Invariants must quantify over all elements, e.g. ∀j. P(A[j]). The proof is valid, but quantified array reasoning is solver-hostile.

Cell morphing

Monniaux & Gonnord, SAS'16: track one arbitrary free index k. At access i, HIT if i = k; otherwise MISS.

Example: prove every element is positive (for i: assert A[i] > 0). Same property, two encodings:

Array-theory invariant

Inv(i, A) // A is an arrayneed: ∀j. 0 ≤ j < i → A[j] > 0∀ over the array → solver stalls

Cell-morphing invariant

Inv(i, k, a_k) // all scalarsneed: 0 ≤ k < i → a_k > 0scalar; k is a free CHC variable
1 / 4

Where Level 2 stops: source loops expose the gap

L2 can refocus with PickK, but its source-transition discipline is still lockstep: both runs take their next source transition in the same proof step. These source loops require proof states where one run advances while the other waits, so L2 has no TF / FT scheduler state.
L2 works: same pace
P1for i1 step +1: if A1[i1] > 0: c1++
P2for i2 step +1: if A2[i2] > 0: c2++
One TT proof step advances both. PickK can search β positions and grow counter d.
L2 breaks: source paces differ
stride 1v2i1 += 1 | i2 += 2
stride 2v3i1 += 2 | i2 += 3
InplaceMap 1v2A[i] = f(A[i]), pace 1 vs 2
heterogeneous 1v2guards differ, pace 1 vs 2
red example: stride 2v3 needs catch-up states
(0,0)TT (2,3)TF (4,3)FT (4,6)TF (6,6)
TT takes both next transitions. TF / FT advance one run while the other waits, realigning the next comparison.
Need Level 3: a merged scheduler with TT / TF / FT transitions keeps PickK aligned across different paces.