Preconditions and Postconditions¶

This document explains how Askalot analyzes preconditions (when items are reachable) and postconditions (what answers are valid) using SMT-based satisfiability checking.

Preconditions: Reachability Analysis¶

For each item \(I_i\), the precondition \(P_i(\mathbf{S})\) is a Boolean predicate over the answer variables \(\mathbf{S} = (S_1, \dots, S_n)\) that determines whether the item is eligible to be asked.

Operationally, items are presented only when \(P_i\) holds. Evaluation of \(P_i\) is always with respect to assignments that satisfy the base domain constraints:

\[ \mathbf{a} \models \bigwedge_{k=1}^n D_k(S_k) \]

Precondition Status (Reachability)¶

We classify \(P_i\) with respect to the base constraints \(B\) via solver-checkable criteria:

\[ \begin{aligned} \textbf{ALWAYS} &:\quad \mathrm{UNSAT}\bigl(B \wedge \lnot P_i\bigr)\\ \textbf{NEVER} &:\quad \mathrm{UNSAT}\bigl(B \wedge P_i\bigr)\\ \textbf{CONDITIONAL} &:\quad \text{otherwise} \end{aligned} \]

These classifications tell us:

ALWAYS reachable: Item \(I_i\) will be asked in every valid survey execution, regardless of how previous questions are answered.
NEVER reachable: Item \(I_i\) can never be reached under any valid assignment. This typically indicates a questionnaire design error.
CONDITIONALLY reachable: Item \(I_i\) is reachable under some (but not all) valid assignments. This is the expected case for branching logic.

Algorithm: Precondition Reachability Classification¶

Input: Base constraints \(B\), precondition \(P_i\)

Output: Reachability status (ALWAYS, NEVER, or CONDITIONAL)

Check \(\mathrm{SAT}(B \wedge \neg P_i)\) using SMT solver
If result is UNSAT, return ALWAYS (item is always reachable)
Check \(\mathrm{SAT}(B \wedge P_i)\) using SMT solver
If result is UNSAT, return NEVER (item is never reachable)
Otherwise, return CONDITIONAL (item is conditionally reachable)

Detailed Classification Definitions¶

Always Satisfiable¶

\(P_i(\mathbf{S})\) is always satisfiable if for every valid assignment \(\mathbf{a}\) of \(\mathbf{S}\) (i.e., \(\mathbf{a}\) satisfies the domain constraints), \(P_i(\mathbf{a})\) evaluates to True.

Formally:

\[ \forall \mathbf{a}, \quad \mathbf{a} \models \bigwedge_{k=1}^n D_k(S_k) \implies P_i(\mathbf{a}) \]

In practice, this means item \(I_i\) is always unlocked no matter how previous questions are answered.

Example: A questionnaire starts with demographic questions that have no dependencies. Their preconditions are always true:

\[ P_1(\mathbf{S}) := \text{True} \]

Never Satisfiable¶

\(P_i(\mathbf{S})\) is never satisfiable if for every valid assignment \(\mathbf{a}\):

\[ \forall \mathbf{a}, \quad \mathbf{a} \models \bigwedge_{k=1}^n D_k(S_k) \implies \neg P_i(\mathbf{a}) \]

Meaning there is no scenario or set of answers that can make \(I_i\) accessible.

Example: A design error where an item requires contradictory conditions:

\[ P_5(\mathbf{S}) := (S_2 = 1) \wedge (S_2 = 0) \]

This precondition can never be satisfied, indicating a bug in the questionnaire design.

Conditionally Satisfiable¶

\(P_i(\mathbf{S})\) is conditionally satisfiable if neither always nor never satisfiable. Equivalently, there exists at least one valid assignment \(\mathbf{a}\) where \(P_i\) is True and at least one valid assignment \(\mathbf{b}\) where \(P_i\) is False.

Formally:

\[ \exists \mathbf{a}, \mathbf{b}, \quad \left(\mathbf{a} \models \bigwedge_{k=1}^n D_k(S_k)\right) \land \left(\mathbf{b} \models \bigwedge_{k=1}^n D_k(S_k)\right) \land P_i(\mathbf{a}) \land \neg P_i(\mathbf{b}) \]

This means item \(I_i\) is reachable only under some (but not all) consistent assignments. Under other valid assignments, \(I_i\) remains locked.

Example: A follow-up question only for parents:

\[ P_4(\mathbf{S}) := (S_3 > 0) \]

where \(S_3\) represents number of children. This is conditional: satisfied when \(S_3 \in \{1, 2, 3, \dots\}\), not satisfied when \(S_3 = 0\).

Summary of Reachability Cases

Always \(\quad\leftrightarrow\quad \forall \mathbf{a}\in \text{Dom},\; P_i(\mathbf{a})\)
Never \(\quad\leftrightarrow\quad \forall \mathbf{a}\in \text{Dom},\; \neg P_i(\mathbf{a})\)
Conditional \(\quad\leftrightarrow\quad\) neither of the above, i.e., \(\exists\mathbf{a},\mathbf{b}\in\text{Dom}\) s.t. \(P_i(\mathbf{a})\) and \(\neg P_i(\mathbf{b})\)

Where \(\text{Dom}\) is the set of all assignments that satisfy the domain constraints \(\bigwedge_{k=1}^n D_k(S_k)\).

Postconditions: Constraint Validity¶

The postcondition \(Q_i(\mathbf{S})\) constrains acceptable outcomes for \(I_i\) to maintain semantic consistency with prior answers.

For example, "family income" should not be less than the sum of individually reported incomes. Postconditions express such logical constraints.

Postcondition Invariants (relative to \(P_i\))¶

Let \(B\) denote the conjunction of all base constraints, \(P_i(\mathbf{S})\) the precondition, and \(Q_i(\mathbf{S})\) the postcondition of item \(I_i\).

We classify the postcondition as:

TAUTOLOGICAL: \(\mathrm{UNSAT}\bigl( B \wedge P_i \wedge \lnot Q_i \bigr)\), i.e., \(B \wedge P_i \models Q_i\); The postcondition always holds whenever the item is asked. It provides no additional constraint.
CONSTRAINING: Both \(\mathrm{SAT}\bigl( B \wedge P_i \wedge Q_i \bigr)\) and \(\mathrm{SAT}\bigl( B \wedge P_i \wedge \lnot Q_i \bigr)\) hold; The postcondition sometimes constrains valid answers, ruling out some possibilities while allowing others.
INFEASIBLE: \(\mathrm{UNSAT}\bigl( B \wedge P_i \wedge Q_i \bigr)\); The postcondition can never be satisfied when the item is asked. This indicates a design error.

Design Guidelines for Postconditions¶

When specifying a postcondition for an item, the classification reveals whether the constraint serves its intended purpose:

CONSTRAINING postconditions are correct: This is the useful case. The postcondition meaningfully restricts the item's domain based on previously collected answers, excluding values that would be semantically inconsistent with the survey history while allowing valid responses.
INFEASIBLE postconditions indicate design errors: If no answer can satisfy the postcondition when the item is reachable, the question becomes impossible to answer correctly. This reveals a logical flaw in the questionnaire design that must be fixed.
TAUTOLOGICAL postconditions are pointless: If the postcondition always holds regardless of the answer given, it provides no actual constraint. Specifying such a postcondition adds no value—it neither validates nor restricts anything. The postcondition should either be strengthened to provide meaningful validation or removed entirely.

Good Postcondition Design

Consider an item asking for family income after collecting individual incomes:

Bad (TAUTOLOGICAL): \(Q_i := (\text{FamilyIncome} \geq 0)\) — this is already enforced by the domain
Good (CONSTRAINING): \(Q_i := (\text{FamilyIncome} \geq \text{MotherIncome} + \text{FatherIncome})\) — meaningfully restricts values based on prior answers
Bad (INFEASIBLE): \(Q_i := (\text{FamilyIncome} < \text{MotherIncome})\) when father's income is positive — impossible to satisfy

In well-designed questionnaires, every specified postcondition should be CONSTRAINING. Any TAUTOLOGICAL or INFEASIBLE result during validation indicates a design issue requiring attention.

Algorithm: Postcondition Classification¶

Input: Base constraints \(B\), precondition \(P_i\), postcondition \(Q_i\)

Output: Classification (TAUTOLOGICAL, CONSTRAINING, INFEASIBLE, or VACUOUS)

Check \(\mathrm{SAT}(B \wedge P_i)\) using SMT solver
If result is UNSAT, return VACUOUS (item never reachable)
Check \(\mathrm{SAT}(B \wedge P_i \wedge Q_i)\) using SMT solver
If result is UNSAT, return INFEASIBLE (postcondition cannot be satisfied)
Check \(\mathrm{SAT}(B \wedge P_i \wedge \neg Q_i)\) using SMT solver
If result is UNSAT, return TAUTOLOGICAL (postcondition always holds)
Otherwise, return CONSTRAINING (postcondition sometimes constrains)

Vacuity Note¶

If \(\mathrm{UNSAT}(B \wedge P_i)\) (i.e., the item is NEVER reachable), postcondition checks are vacuous with respect to \(P_i\). The invariant classification concerns only cases where \(P_i\) can hold.

Global Satisfiability of \(Q_i\)¶

For completeness, we may record the global satisfiability properties of \(Q_i\) independent of \(P_i\):

\[ q_{\text{globally false}} := \mathrm{UNSAT}(B \wedge Q_i), \qquad q_{\text{globally true}} := \mathrm{UNSAT}(B \wedge \lnot Q_i) \]

These do not alter the precondition-based reachability classes but can be reported as diagnostic properties of \(Q_i\).

Dependency Graph and Cycle Detection¶

Understanding which items depend on which is crucial for:

Evaluation order: Items must be evaluated after their dependencies
Cycle detection: Circular dependencies make on-the-fly navigation impossible
Coverage analysis: Identifying which item answers affect which subsequent items

Path Definition¶

Definition (Path in Dependency Graph): A path from item \(I_j\) to item \(I_i\) in the dependency graph \(G = (V, E)\) is a sequence of distinct items:

\[ \pi = (I_j = I_{k_0}, I_{k_1}, \dots, I_{k_m} = I_i) \]

where \(m \geq 0\) and for each \(\ell \in \{0, \dots, m-1\}\), there exists an edge \((k_\ell \to k_{\ell+1}) \in E\).

When \(m = 0\), the path consists of a single item (\(I_j = I_i\)), representing a trivial path from an item to itself.

Dependency Graph Construction¶

The dependency graph \(G = (V, E)\) captures data flow dependencies between items through both direct variable usage and intermediate computations in code blocks. Construction proceeds in three stages:

Stage 1: Build Data Flow Graph¶

Through static analysis of preconditions, postconditions, and code blocks, we construct an intermediate directed graph where vertices represent both items and variables:

Item vertices: Each questionnaire item \(I_i\) is a vertex
Variable vertices: Each variable (item outcome \(S_i\) or intermediate variable \(v\)) is a vertex

We extract dependency edges by analyzing all code blocks and formulas:

When item \(I_j\) is answered, its outcome \(S_j\) is assigned → edge \((I_j \to S_j)\)
When code assigns item outcome to variable \(v\) (e.g., v = S_j) → edge \((S_j \to v)\)
When code modifies variable \(v\) based on \(S_j\) (e.g., v += S_j) → edge \((S_j \to v)\)
When variable \(v\) is used in \(I_i\)'s precondition \(P_i\) or postcondition \(Q_i\) → edge \((v \to I_i)\)
When \(S_j\) is directly used in \(P_i\) or \(Q_i\) → edge \((S_j \to I_i)\)

Stage 2: Compute Item-to-Item Dependencies¶

From the data flow graph, we construct the item dependency graph \(G = (V, E)\) where \(V = \{I_1, \dots, I_n\}\) contains only item vertices.

An edge \((I_j \to I_i) \in E\) exists if and only if there is a path in the data flow graph from \(I_j\) to \(I_i\) through any sequence of variable vertices.

This captures transitive dependencies through intermediate variables: if \(I_j\)'s outcome flows through variables \(v_1, \dots, v_k\) to affect \(I_i\)'s evaluation, then \(I_i\) depends on \(I_j\).

Example: Consider code that introduces variable \(v\), assigns v = S_j, then updates v = v + S_k, and finally uses \(v\) in both \(P_i\) and \(P_m\). The data flow graph contains:

\[ I_j \to S_j \to v, \quad I_k \to S_k \to v, \quad v \to I_i, \quad v \to I_m \]

resulting in item dependencies: \((I_j \to I_i)\), \((I_j \to I_m)\), \((I_k \to I_i)\), \((I_k \to I_m)\).

Stage 3: Topological Ordering¶

To determine a valid evaluation order, we compute a topological sort of \(G\) using Kahn's algorithm:

Kahn's Algorithm for Topological Sorting:

Input: Dependency graph \(G = (V, E)\)

Output: Topological order \(L\), or indication of cycle

Initialize \(S \gets \{\text{items with in-degree } 0\}\), \(L \gets \text{empty list}\)
While \(S\) is non-empty:
Remove an item \(n\) from \(S\)
Add \(n\) to tail of \(L\)
For each item \(m\) with edge \((n \to m)\):
- Remove edge \((n \to m)\) from graph
- If \(m\) has in-degree \(0\), add \(m\) to \(S\)
If graph still has edges, report cycle detected
Otherwise, return \(L\) as valid topological order

The resulting topological order \(L\) ensures that items are evaluated in dependency-respecting sequence: when item \(I_i\) is reached, all variables referenced by \(P_i\) have been assigned.

Cycle Test¶

A questionnaire has no cycles if \(G\) is a Directed Acyclic Graph (DAG). Equivalently, there is no sequence of distinct items \(i_1, i_2, \dots, i_k\) (\(k \geq 2\)) such that:

\[ i_1 \to i_2 \to \dots \to i_k \to i_1 \]

Kahn's algorithm detects cycles: if the algorithm terminates with edges remaining in the graph, those edges form cycles.

Cycles Threaten Navigation

Cycles threaten on-the-fly navigation because no item on the cycle can be evaluated first—each depends on others in the cycle. Askalot detects cycles during questionnaire validation and reports them as errors.

Satisfiability of the Whole Questionnaire¶

Beyond analyzing individual items, we need to check whether the questionnaire as a whole is consistent.

Per-Item Validity¶

For each item \(i\):

\[ \mathrm{UNSAT}\bigl(B \wedge P_i \wedge \lnot Q_i\bigr) \]

Equivalently: \(B \models (P_i \Rightarrow Q_i)\)

This checks that whenever an item is reachable (\(P_i\) holds), its postcondition can be satisfied.

Global Satisfiability¶

Let \(B := \bigwedge_{i=1}^n D_i(S_i)\) and define the global formula:

\[ G := B \wedge \bigwedge_{i=1}^n \bigl(P_i \Rightarrow Q_i\bigr) \]

We say the questionnaire is globally satisfiable if \(\mathrm{SAT}(G)\) holds—that is, there exists at least one assignment satisfying \(B\) that makes all implications true simultaneously.

Important Distinction

\(\mathrm{SAT}(G)\) is existential; it does not require that every model of \(B\) satisfy the implications.

If \(\mathrm{SAT}(B)\) and all items are per-item valid (i.e., \(\mathrm{UNSAT}(B \wedge P_i \wedge \lnot Q_i)\) for all \(i\)), then \(\mathrm{SAT}(G)\).

However, the converse need not hold. See Global vs Local Validity for the precise relationships.

Algorithm: Global Satisfiability Check¶

Input: Set of items with domains \(D_i\), preconditions \(P_i\), postconditions \(Q_i\)

Output: Satisfiability result and witness (if satisfiable)

Gather all domain constraints \(B := \bigwedge_{i=1}^n D_i(S_i)\)
Encode each question's logic as \(P_i(\mathbf{S}) \Rightarrow Q_i(\mathbf{S})\)
Form global formula \(G := B \wedge \bigwedge_{i=1}^n (P_i \Rightarrow Q_i)\)
Invoke SMT solver to check \(\mathrm{SAT}(G)\)
If solver returns SAT with model \(\mathbf{a} = (a_1,\dots,a_n)\):
Return GLOBALLY SATISFIABLE with witness \(\mathbf{a}\)
Otherwise:
Return GLOBALLY UNSATISFIABLE

Practical Diagnostics¶

In practice, Askalot reports:

Per-item reachability: ALWAYS, NEVER, or CONDITIONAL for each item
Per-item postcondition status: TAUTOLOGICAL, CONSTRAINING, INFEASIBLE, or VACUOUS
Dependency cycles: List of items involved in circular dependencies
Global satisfiability: Whether at least one valid survey completion exists
Unreachable items: Items classified as NEVER that indicate design problems

These diagnostics help survey designers identify and fix logical errors before deployment.

Summary¶

This section established:

Reachability classification for preconditions (ALWAYS, NEVER, CONDITIONAL)
Constraint classification for postconditions (TAUTOLOGICAL, CONSTRAINING, INFEASIBLE)
Algorithms for SMT-based classification using Z3
Dependency analysis for detecting cycles and determining evaluation order
Global satisfiability checks to ensure questionnaire consistency

The next section explores the subtle relationship between per-item validity and global satisfiability through formal proofs.