Extension to Vector and Matrix Item Types¶

While the core concepts of questionnaire validation have been presented using single integer answer variables for clarity, real-world questionnaires often require more complex answer structures. This document extends our formal framework to handle vector and matrix item types.

Motivation for Complex Types¶

Modern questionnaires frequently employ question formats that cannot be adequately represented by single integer values:

Ranking Questions: "Please rank these 5 features by importance (1 = most important, 5 = least important)"; Requires a vector of integers where each element appears exactly once
Multiple Selection Questions: "Select all symptoms you have experienced"; Produces a binary vector indicating which options were selected
Allocation Questions: "Distribute 100 points among these 4 categories to indicate their relative importance"; Requires a vector with a sum constraint
Grid Questions: "For each product, rate the following attributes on a scale of 1-10"; Generates a matrix where rows represent products and columns represent attributes
Correlation Matrices: "Indicate the relationship strength between each pair of factors"; Creates a symmetric matrix of values

These question types are essential for capturing nuanced respondent preferences, multi-dimensional assessments, and complex relationships that single-value questions cannot express.

Extended Type System¶

Type Definitions¶

We extend our framework to support three item types, where each item \(I_i\) has an associated answer variable \(S_i\) of one of the following types:

Scalar Type: \(S_i \in \mathbb{Z}\)

A single integer value with domain constraint:

\[ D_i: \ell_i \leq S_i \leq u_i \]

Vector Type: \(\mathbf{S}_i \in \mathbb{Z}^{k_i}\)

A vector of \(k_i\) integers with domain constraint:

\[ D_i: \bigwedge_{j=1}^{k_i} \left(\ell_{i,j} \leq S_{i,j} \leq u_{i,j}\right) \]

Matrix Type: \(\mathbf{S}_i \in \mathbb{Z}^{m_i \times n_i}\)

A matrix of \(m_i \times n_i\) integers with domain constraint:

\[ D_i: \bigwedge_{j=1}^{m_i} \bigwedge_{k=1}^{n_i} \left(\ell_{i,j,k} \leq S_{i,j,k} \leq u_{i,j,k}\right) \]

Unified Notation¶

To maintain notational consistency across all item types, we adopt the following conventions:

\(S_i\) denotes the answer variable for item \(I_i\) regardless of its type
For scalar items: \(S_i\) directly represents the integer value
For vector items: \(S_i\) represents the entire vector, with \(S_{i,j}\) or \(S_i[j]\) denoting the \(j\)-th component
For matrix items: \(S_i\) represents the entire matrix, with \(S_{i,j,k}\) or \(S_i[j][k]\) denoting the element at row \(j\), column \(k\)

This notation allows us to write preconditions and postconditions uniformly while accessing components when needed.

Extended Domain Constraints¶

The base constraint formula \(B\) from the foundational section naturally extends to accommodate complex types:

\[ B := \bigwedge_{i=1}^{n} D_i(S_i) \]

Where \(D_i(S_i)\) is defined according to the type of \(S_i\):

For scalar \(S_i\): \(D_i(S_i) := \ell_i \leq S_i \leq u_i\)
For vector \(\mathbf{S}_i\): \(D_i(\mathbf{S}_i) := \bigwedge_{j=1}^{k_i} (\ell_{i,j} \leq S_{i,j} \leq u_{i,j})\)
For matrix \(\mathbf{S}_i\): \(D_i(\mathbf{S}_i) := \bigwedge_{j=1}^{m_i} \bigwedge_{k=1}^{n_i} (\ell_{i,j,k} \leq S_{i,j,k} \leq u_{i,j,k})\)

Preconditions and Postconditions with Complex Types¶

Component Access in Conditions¶

Preconditions and postconditions can reference individual components of vector and matrix items, enabling sophisticated conditional logic:

Vector component access: \(P_i\) may include expressions like \((S_j[2] > 3)\) to check if the second element of vector item \(I_j\) exceeds 3
Matrix element access: \(P_i\) may include \((S_j[1][3] = S_k[2][3])\) to compare specific matrix elements
Aggregate operations: Conditions can use sums, products, or other aggregate functions over vector/matrix components

Common Constraint Patterns¶

Ranking Constraints¶

For a ranking question with \(k\) options, the postcondition typically ensures:

All values are within range \([1, k]\)
All values are distinct (each rank used exactly once)

Formally, for ranking item \(I_i\) with \(k_i\) options:

\[ Q_i := \left(\bigwedge_{j=1}^{k_i} 1 \leq S_{i,j} \leq k_i\right) \land \text{Distinct}(S_{i,1}, \ldots, S_{i,k_i}) \]

Allocation Constraints¶

For allocation questions where respondents distribute a fixed total across categories:

\[ Q_i := \left(\sum_{j=1}^{k_i} S_{i,j} = T_i\right) \land \left(\bigwedge_{j=1}^{k_i} S_{i,j} \geq 0\right) \]

where \(T_i\) is the total to be allocated (e.g., 100 points).

Matrix Symmetry Constraints¶

For correlation or relationship matrices that must be symmetric:

\[ Q_i := \bigwedge_{j=1}^{m_i} \bigwedge_{k=j+1}^{n_i} (S_{i,j,k} = S_{i,k,j}) \]

Z3 Compilation for Complex Types¶

Askalot uses Z3's array theory to represent vectors and matrices efficiently.

Vector Representation in Z3¶

Vectors are represented using Z3's array theory, where arrays map integer indices to integer values. The array theory provides efficient indexing and update operations.

Matrix Representation in Z3¶

Matrices use nested arrays, where the outer array maps row indices to inner arrays (representing rows). This representation allows natural matrix element access syntax.

Compiling Complex Type Operations¶

The Python-to-Z3 compiler extends to handle vector and matrix operations:

Vector Subscript Access¶

Python code accessing vector components translates directly to array access in Z3.

Aggregate Operations¶

Sum operations over vector components compile to Z3 sum expressions.

Matrix Operations¶

Matrix operations require nested array access and iteration.

Reachability Analysis for Complex Types¶

The reachability analysis framework extends naturally to complex types. The key insight is that precondition satisfiability checking remains fundamentally the same—we simply have richer constraint expressions.

Example: Ranking-Based Conditional Logic¶

Consider a questionnaire where follow-up questions depend on ranking priorities. For a ranking of 4 product features (price, quality, support, design):

Item \(I_1\): Rank features (1=most, 4=least important)
Type: Vector of 4 integers
Domain: Each \(S_1[j] \in [1,4]\)
Postcondition: All values distinct
Item \(I_2\): Deep-dive on price (asked if price is top priority)
Precondition: \(P_2 = (S_1[0] == 1)\) where price is first element
Item \(I_3\): Deep-dive on quality (asked if quality is top 2)
Precondition: \(P_3 = (S_1[1] == 1) \vee (S_1[1] == 2)\) where quality is second element
Item \(I_4\): Trade-off question (asked if both price and quality are top 2)
Precondition: \(P_4 = (S_1[0] \leq 2) \wedge (S_1[1] \leq 2)\)

The Z3 solver can determine:

\(I_2\) is conditionally reachable when price is ranked first
\(I_3\) is conditionally reachable when quality is ranked first or second
\(I_4\) is conditionally reachable but requires checking if the ranking constraint allows both conditions simultaneously

Dependency Graph with Complex Types¶

The dependency graph construction extends to track component-level dependencies:

An edge \((j \to i)\) exists if \(P_i\) references any component of \(S_j\)
For vector \(S_j\): references to \(S_j[k]\) for any \(k\) create a dependency
For matrix \(S_j\): references to \(S_j[k][\ell]\) for any \(k, \ell\) create a dependency

This ensures that cycle detection and coverage analysis remain sound for complex types.

Practical Examples¶

Example 1: Employee Satisfaction Survey¶

A survey measuring employee satisfaction across multiple dimensions:

Item \(I_1\): Rate 5 aspects of job satisfaction (1-10 scale)
Type: Vector[5], Domain: Each \(S_1[j] \in [1,10]\)
Item \(I_2\): Detailed questions if any aspect rated below 5
Precondition: \(P_2 = \bigvee_{j=0}^{4} (S_1[j] < 5)\)
Item \(I_3\): Rank problematic areas (only those rated < 5)
Precondition: \(P_3 = P_2\)
Item \(I_4\): Allocate 100 points for improvement priority
Type: Vector[5], Precondition: \(P_4 = P_2\)
Postcondition: \(Q_4 = \left(\sum_{j=0}^{4} S_4[j] == 100\right)\)

This demonstrates:

Vector items for multi-dimensional ratings
Conditional logic based on vector components
Allocation constraint with sum requirement

Example 2: Product Comparison Matrix¶

A market research questionnaire comparing products:

Item \(I_1\): Compare 4 products on 6 attributes (0-100 scale)
Type: Matrix[4][6], Domain: Each \(S_1[j][k] \in [0,100]\)
Item \(I_2\): Follow-up if any product scores perfect on all attributes
Precondition: \(P_2 = \bigvee_{j=0}^{3} \left(\bigwedge_{k=0}^{5} (S_1[j][k] == 100)\right)\)
Item \(I_3\): Identify best product per attribute
Type: Vector[6] storing product index for each attribute
Postcondition: \(S_3[k] = \arg\max_j(S_1[j][k])\) for each \(k\)

This demonstrates:

Matrix items for two-dimensional data
Complex preconditions involving matrix aggregations
Derived calculations from matrix data

Example 3: Budget Allocation with Constraints¶

A questionnaire analyzing spending priorities with budget constraints:

Item \(I_1\): Allocate annual budget across 5 categories
Type: Vector[5], Domain: Each \(S_1[j] \in [0, 100000]\)
Postcondition: \(Q_1 = \left(\sum_{j=0}^{4} S_1[j] \leq 500000\right)\) (total budget limit)
Item \(I_2\): Monthly breakdown for highest priority category
Type: Vector[12] for monthly allocations
Let \(m = \arg\max_j(S_1[j])\) be the index of the max element
Postcondition: \(Q_2 = \left(\sum_{k=0}^{11} S_2[k] == S_1[m]\right) \wedge \left(\bigwedge_{k=0}^{11} S_2[k] \geq 0\right)\)

Constraint Structure¶

Constraint Patterns¶

Complex types generate structured constraint systems:

Vector of size \(k\): Requires \(k\) domain constraints (one per element)
Matrix of size \(m \times n\): Requires \(m \times n\) domain constraints (one per element)
Distinct constraints: For \(k\) elements, requires \(\binom{k}{2}\) pairwise inequality constraints

For example, a ranking question with 10 options requires:

10 domain constraints (one per element)
\(\binom{10}{2} = 45\) pairwise distinctness constraints
Total: 55 constraints

Symmetry in Constraint Systems¶

For symmetric structures like correlation matrices where \(S_i[j][k] = S_i[k][j]\), constraint generation can exploit symmetry. Instead of generating \(n(n-1)\) bidirectional constraints, only \(\frac{n(n-1)}{2}\) constraints for the upper triangle are needed due to the symmetric property.

Implementation in QML¶

Complex types are built into the QML (Questionnaire Markup Language) type system. Each item's outcome attribute can be:

Integer: For scalar questions (single value)
List[Integer]: For vector questions (rankings, multiple selections)
List[List[Integer]]: For matrix questions (grids, comparison tables)

Because Python uses dynamic typing, the outcome attribute can be any of these types at runtime. The questionnaire framework determines the actual type based on the item's configuration in the QML specification.

Compiler Extensions¶

The Python-to-Z3 compiler recognizes complex type patterns:

item.outcome[i] → Vector access
item.outcome[i][j] → Matrix access
sum(item.outcome) → Vector sum
len(item.outcome) → Vector/matrix dimension

The compiler automatically generates appropriate Z3 array declarations and constraint translations based on the detected access patterns.