Dependency Analysis and Cycle Detection¶
This document explains how Askalot analyzes dependencies between questionnaire items to ensure they can be evaluated in a consistent order and to detect circular dependencies that would make evaluation impossible.
Overview¶
To ensure questionnaires can be evaluated in a consistent order, we analyze dependencies between items. Understanding which items depend on which is crucial for:
- Evaluation order: Items must be evaluated after their dependencies
- Cycle detection: Circular dependencies make on-the-fly navigation impossible
- Coverage analysis: Identifying which item answers affect which subsequent items
Path in Dependency Graph¶
Definition (Path): A path from item \(I_j\) to item \(I_i\) in the dependency graph \(G = (V, E)\) is a sequence of distinct items:
where \(m \geq 0\) and for each \(\ell \in \{0, \dots, m-1\}\), there exists an edge \((k_\ell \to k_{\ell+1}) \in E\).
When \(m = 0\), the path consists of a single item (\(I_j = I_i\)), representing a trivial path from an item to itself.
Dependency Graph Construction¶
The dependency graph \(G = (V, E)\) captures data flow dependencies between items through both direct variable usage and intermediate computations. Construction proceeds in three stages.
Stage 1: Build Data Flow Graph¶
Through static analysis of preconditions, postconditions, and code blocks, we construct an intermediate directed graph where vertices represent both items and variables:
- Item vertices
- Each questionnaire item \(I_i\) is a vertex
- Variable vertices
- Each variable (item outcome \(S_i\) or intermediate variable \(v\)) is a vertex
We extract dependency edges by analyzing all formulas and code blocks:
- When item \(I_j\) is answered, its outcome \(S_j\) is assigned \(\Rightarrow\) edge \((I_j \to S_j)\)
- When code assigns item outcome to variable \(v\) (e.g., \(v = S_j\)) \(\Rightarrow\) edge \((S_j \to v)\)
- When code modifies variable \(v\) based on \(S_j\) (e.g., \(v \mathrel{+}= S_j\)) \(\Rightarrow\) edge \((S_j \to v)\)
- When variable \(v\) is used in \(I_i\)'s precondition \(P_i\) or postcondition \(Q_i\) \(\Rightarrow\) edge \((v \to I_i)\)
- When \(S_j\) is directly used in \(P_i\) or \(Q_i\) \(\Rightarrow\) edge \((S_j \to I_i)\)
Data Flow Graph Construction
Survey with Intermediate Calculations:
- \(I_1\): "Your monthly rent" with outcome \(S_1\)
- \(I_2\): "Your monthly utilities" with outcome \(S_2\)
- Code block: \(\mathtt{total\_housing} = S_1 + S_2\)
- \(I_3\): "Other monthly expenses" with outcome \(S_3\), precondition \(P_3 = (\mathtt{total\_housing} < 3000)\)
Data flow edges:
- \((I_1 \to S_1)\) — answering \(I_1\) assigns \(S_1\)
- \((I_2 \to S_2)\) — answering \(I_2\) assigns \(S_2\)
- \((S_1 \to \mathtt{total\_housing})\) — \(S_1\) used to compute \(\mathtt{total\_housing}\)
- \((S_2 \to \mathtt{total\_housing})\) — \(S_2\) used to compute \(\mathtt{total\_housing}\)
- \((\mathtt{total\_housing} \to I_3)\) — \(\mathtt{total\_housing}\) used in \(P_3\)
Stage 2: Compute Item-to-Item Dependencies¶
From the data flow graph, we construct the item dependency graph \(G = (V, E)\) where \(V = \{I_1, \dots, I_n\}\) contains only item vertices.
An edge \((I_j \to I_i) \in E\) exists if and only if there is a path in the data flow graph from \(I_j\) to \(I_i\) through any sequence of variable vertices.
This captures transitive dependencies through intermediate variables: if \(I_j\)'s outcome flows through variables \(v_1, \dots, v_k\) to affect \(I_i\)'s evaluation, then \(I_i\) depends on \(I_j\).
Item Dependency Extraction
Continuing the previous example, the data flow graph contains paths:
Resulting item dependencies:
- \((I_1 \to I_3)\) — \(I_3\) depends on \(I_1\)
- \((I_2 \to I_3)\) — \(I_3\) depends on \(I_2\)
Item \(I_3\) cannot be evaluated until both \(I_1\) and \(I_2\) have been answered.
General example: Consider code that introduces variable \(v\), assigns \(v = S_j\), then updates \(v = v + S_k\), and finally uses \(v\) in both \(P_i\) and \(P_m\). The data flow graph contains:
resulting in item dependencies: \((I_j \to I_i)\), \((I_j \to I_m)\), \((I_k \to I_i)\), \((I_k \to I_m)\).
Stage 3: Topological Ordering (Kahn's Algorithm)¶
To determine a valid evaluation order, we compute a topological sort of \(G\) using Kahn's algorithm:
Algorithm (Kahn's Algorithm for Topological Sorting):
Input: Dependency graph \(G = (V, E)\)
Output: Topological order \(L\), or indication of cycle
- Initialize \(S \gets \{\text{items with in-degree } 0\}\), \(L \gets \text{empty list}\)
- While \(S\) is non-empty:
- Remove an item \(n\) from \(S\)
- Add \(n\) to tail of \(L\)
- For each item \(m\) with edge \((n \to m)\):
- Remove edge \((n \to m)\) from graph
- If \(m\) has in-degree \(0\), add \(m\) to \(S\)
- If graph still has edges:
- Return CYCLE DETECTED
- Otherwise:
- Return \(L\) as valid topological order
The resulting topological order \(L\) ensures that items are evaluated in dependency-respecting sequence: when item \(I_i\) is reached, all variables referenced by \(P_i\) have been assigned.
Topological Ordering
Survey with Dependencies:
- \(I_1\): "Annual income" (no dependencies, in-degree 0)
- \(I_2\): "Monthly savings" (no dependencies, in-degree 0)
- \(I_3\): "Savings rate" with \(P_3\) depending on \(S_1\) and \(S_2\) (in-degree 2)
- \(I_4\): "Financial health" with \(P_4\) depending on \(S_3\) (in-degree 1)
Dependency graph: \((I_1 \to I_3)\), \((I_2 \to I_3)\), \((I_3 \to I_4)\)
Kahn's algorithm execution:
- Initially \(S = \{I_1, I_2\}\) (in-degree 0)
- Remove \(I_1\), add to \(L = [I_1]\), decrement in-degree of \(I_3\) to 1
- Remove \(I_2\), add to \(L = [I_1, I_2]\), decrement in-degree of \(I_3\) to 0
- Add \(I_3\) to \(S\), remove \(I_3\), add to \(L = [I_1, I_2, I_3]\), decrement in-degree of \(I_4\) to 0
- Add \(I_4\) to \(S\), remove \(I_4\), add to \(L = [I_1, I_2, I_3, I_4]\)
- No edges remain, return \(L\)
Valid evaluation order: Items can be evaluated in the order \(I_1, I_2, I_3, I_4\) (or \(I_2, I_1, I_3, I_4\)).
Cycle Detection¶
A questionnaire has no cycles if \(G\) is a Directed Acyclic Graph (DAG). Equivalently, there is no sequence of distinct items \(i_1, i_2, \dots, i_k\) (\(k \geq 2\)) such that:
Kahn's algorithm detects cycles: if the algorithm terminates with edges remaining in the graph, those edges form cycles.
Cycles Threaten Navigation
Cycles threaten on-the-fly navigation because no item on the cycle can be evaluated first—each depends on others in the cycle.
Cycle Detection
Survey with Circular Dependency (Design Error):
- \(I_1\): "Estimated total cost" with precondition \(P_1\) depending on \(S_3\)
- \(I_2\): "Budget available" with precondition \(P_2\) depending on \(S_1\)
- \(I_3\): "Cost overrun" with precondition \(P_3\) depending on \(S_2\)
Dependency graph: \((I_1 \to I_2)\), \((I_2 \to I_3)\), \((I_3 \to I_1)\)
Kahn's algorithm execution:
- Initially \(S = \{\}\) (all items have in-degree 1, none can start)
- Algorithm terminates immediately with edges remaining
- Return CYCLE DETECTED: \(I_1 \to I_2 \to I_3 \to I_1\)
This circular dependency makes evaluation impossible—no item can be evaluated first because each depends on another item in the cycle.
Dependency Analysis with Complex Types¶
The dependency graph construction naturally extends to complex types (vectors and matrices). See Complex Types for details.
For vector and matrix items:
- An edge \((I_j \to I_i)\) exists if \(P_i\) references any component of \(S_j\)
- For vector \(S_j\): references to \(S_j[k]\) for any \(k\) create a dependency
- For matrix \(S_j\): references to \(S_j[k][\ell]\) for any \(k, \ell\) create a dependency
Dependency analysis ensures questionnaires can be evaluated in a consistent order and detects design errors where circular dependencies would make navigation impossible.
Further Reading¶
- Questionnaire Analysis Foundations - Core definitions and notation
- Preconditions and Postconditions - Reachability and constraint classification
- Complex Types - Extensions for vector and matrix items with component-level dependencies
- Global vs Local Validity - Relationship between per-item and global checks