Extension to Vector and Matrix Item Types¶
While the core concepts of questionnaire validation have been presented using single integer answer variables for clarity, real-world questionnaires often require more complex answer structures. This document extends our formal framework to handle vector and matrix item types, demonstrating that the validation techniques—reachability analysis, constraint satisfaction, and dependency analysis—naturally generalize to these richer structures.
Motivation for Complex Types¶
Modern questionnaires frequently employ question formats that cannot be adequately represented by single integer values:
- Ranking Questions
    - "Please rank these 5 features by importance (1 = most important, 5 = least important)"
    - Requires a vector of integers where each element appears exactly once
- Multiple Selection Questions
    - "Select all symptoms you have experienced"
    - Produces a binary vector indicating which options were selected
- Allocation Questions
    - "Distribute 100 points among these 4 categories to indicate their relative importance"
    - Requires a vector with a sum constraint
- Grid Questions
    - "For each product, rate the following attributes on a scale of 1-10"
    - Generates a matrix where rows represent products and columns represent attributes
- Correlation Matrices
    - "Indicate the relationship strength between each pair of factors"
    - Creates a symmetric matrix of values
These question types are essential for capturing nuanced respondent preferences, multi-dimensional assessments, and complex relationships that single-value questions cannot express.
Extended Type System¶
We extend our framework to support three item types, where each item \(I_i\) has an associated answer variable \(S_i\) of one of the following types.
Type Definitions¶
Scalar Type:
A single integer value with domain constraint \(D_i: \ell_i \leq S_i \leq u_i\)
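Vector Type:
A vector \(\mathbf{S}_i \in \mathbb{Z}^{k_i}\) of \(k_i\) integers with domain constraint:
\[D_i: \bigwedge_{j=1}^{k_i} \left(\ell_{i,j} \leq S_{i,j} \leq u_{i,j}\right)\]
Matrix Type:
A matrix \(\mathbf{S}_i \in \mathbb{Z}^{m_i \times n_i}\) of \(m_i \times n_i\) integers with domain constraint:
\[D_i: \bigwedge_{j=1}^{m_i} \bigwedge_{k=1}^{n_i} \left(\ell_{i,j,k} \leq S_{i,j,k} \leq u_{i,j,k}\right)\]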
Unified Notation¶
To maintain notational consistency across all item types, we adopt the following conventions:
- \(S_i\) denotes the answer variable for item \(I_i\) regardless of its type
- For scalar items: \(S_i\) directly represents the integer value
- For vector items: \(S_i\) represents the entire vector, with \(S_{i,j}\) or \(S_i[j]\) denoting the \(j\)-th component
- For matrix items: \(S_i\) represents the entire matrix, with \(S_{i,j,k}\) or \(S_i[j][k]\) denoting the element at row \(j\), column \(k\)
This notation allows us to write preconditions and postconditions uniformly while accessing components when needed.
Unified Notation in Practice
Product Feedback Survey:
- \(I_1\): "Rate overall satisfaction" (scalar) → \(S_1 \in [1, 5]\)
- \(I_2\): "Rate 3 product features" (vector) → \(\mathbf{S}_2 \in \mathbb{Z}^3\) with \(S_2[j] \in [1, 5]\) for \(j \in \{1,2,3\}\)
- \(I_3\): "Compare 2 products on 4 attributes" (matrix) → \(\mathbf{S}_3 \in \mathbb{Z}^{2 \times 4}\) with \(S_3[j][k] \in [1, 10]\)
All use the same notation pattern \(S_i\), with component access as needed.
Extended Domain Constraints¶
The base constraint formula \(B\) from Questionnaire Analysis naturally extends to accommodate complex types:
\[B := \bigwedge_{i} D_i(S_i)\]
where the conjunction ranges over all items and \(D_i(S_i)\) is defined according to the type of \(S_i\):
- For scalar \(S_i\): \(D_i(S_i) := \ell_i \leq S_i \leq u_i\)
- For vector \(\mathbf{S}_i\): \(D_i(\mathbf{S}_i) := \bigwedge_{j=1}^{k_i} (\ell_{i,j} \leq S_{i,j} \leq u_{i,j})\)
- For matrix \(\mathbf{S}_i\): \(D_i(\mathbf{S}_i) := \bigwedge_{j=1}^{m_i} \bigwedge_{k=1}^{n_i} (\ell_{i,j,k} \leq S_{i,j,k} \leq u_{i,j,k})\)
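The three cases of \(D_i\) can be sketched as plain Python checks. This is a minimal illustration, not the framework's implementation: the function names and the `(lower, upper)` tuple representation of bounds are assumptions made for this example.

```python
from typing import List, Tuple

Bounds = Tuple[int, int]  # (lower, upper), both inclusive; hypothetical representation


def scalar_in_domain(value: int, bounds: Bounds) -> bool:
    """D_i for a scalar: l_i <= S_i <= u_i."""
    lower, upper = bounds
    return lower <= value <= upper


def vector_in_domain(values: List[int], bounds: List[Bounds]) -> bool:
    """D_i for a vector: conjunction of one bound check per component."""
    return len(values) == len(bounds) and all(
        scalar_in_domain(v, b) for v, b in zip(values, bounds)
    )


def matrix_in_domain(rows: List[List[int]], bounds: List[List[Bounds]]) -> bool:
    """D_i for a matrix: conjunction over every row and column."""
    return len(rows) == len(bounds) and all(
        vector_in_domain(r, b) for r, b in zip(rows, bounds)
    )


# Product Feedback Survey from the unified-notation example
s1_ok = scalar_in_domain(4, (1, 5))
s2_ok = vector_in_domain([5, 3, 1], [(1, 5)] * 3)
# The value 11 lies outside [1, 10], so the matrix check fails
s3_ok = matrix_in_domain(
    [[1, 10, 5, 7], [2, 11, 3, 4]],
    [[(1, 10)] * 4 for _ in range(2)],
)
```

Note that the formulas index components from 1, while Python lists index from 0.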
Preconditions and Postconditions with Complex Types¶
Component Access in Conditions¶
Preconditions and postconditions can reference individual components of vector and matrix items, enabling sophisticated conditional logic:
- Vector component access
    - \(P_i\) may include expressions like \((S_j[2] > 3)\) to check if the second element of vector item \(I_j\) exceeds 3
- Matrix element access
    - \(P_i\) may include \((S_j[1][3] = S_k[2][3])\) to compare specific matrix elements
- Aggregate operations
    - Conditions can use sums, products, or other aggregate functions over vector/matrix components
Component Access in Conditional Logic
Employee Performance Review:
- \(I_1\): "Rate 5 performance areas" (vector) with \(\mathbf{S}_1 \in \mathbb{Z}^5\), each \(S_1[j] \in [1, 5]\)
- \(I_2\): "Improvement plan details" with precondition \(P_2 = \bigvee_{j=1}^{5} (S_1[j] \leq 2)\)
- Asked if any area rated ≤2 (needs improvement)
- \(I_3\): "Performance bonus eligibility" with precondition \(P_3 = \bigwedge_{j=1}^{5} (S_1[j] \geq 4)\)
- Asked only if all areas rated ≥4 (excellent performance)
This demonstrates how vector components can be used in both disjunctive (∨) and conjunctive (∧) preconditions.
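As a small sketch of the Employee Performance Review preconditions, the disjunction maps onto Python's `any` and the conjunction onto `all`. The function names are hypothetical and chosen only for this illustration.

```python
def needs_improvement_plan(s1):
    """P_2: at least one performance area is rated <= 2."""
    return any(rating <= 2 for rating in s1)


def bonus_eligible(s1):
    """P_3: every performance area is rated >= 4."""
    return all(rating >= 4 for rating in s1)


mixed_review = [5, 4, 2, 4, 5]
strong_review = [4, 5, 4, 4, 5]

mixed_flags = (needs_improvement_plan(mixed_review), bonus_eligible(mixed_review))
strong_flags = (needs_improvement_plan(strong_review), bonus_eligible(strong_review))
```

The two preconditions are mutually exclusive: if every rating is at least 4, no rating can be 2 or lower, so no respondent sees both \(I_2\) and \(I_3\).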
Common Constraint Patterns¶
Ranking Constraints¶
For a ranking question with \(k\) options, the postcondition typically ensures:
- All values are within range \([1, k]\)
- All values are distinct (each rank used exactly once)
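Formally, for ranking item \(I_i\) with \(k_i\) options:
\[Q_i = \left(\bigwedge_{j=1}^{k_i} 1 \leq S_i[j] \leq k_i\right) \land \text{Distinct}(S_i[1], \ldots, S_i[k_i])\]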
Ranking Constraint
Feature Priority Ranking:
- \(I_1\): "Rank 4 product features by importance" (1=most important, 4=least important)
- Type: Vector of 4 integers
- Domain: Each \(S_1[j] \in [1, 4]\)
- Postcondition: \(Q_1 = \left(\bigwedge_{j=1}^{4} 1 \leq S_1[j] \leq 4\right) \land \text{Distinct}(S_1[1], S_1[2], S_1[3], S_1[4])\)
Valid response: \((2, 1, 4, 3)\) with components ordered (price, quality, support, design): quality ranked 1st, price 2nd, design 3rd, support 4th
Invalid response: \((1, 1, 2, 3)\) — violates distinctness (two features ranked 1st)
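The ranking postcondition can be checked directly on a concrete response. The sketch below is an illustration only; `is_valid_ranking` is a hypothetical helper, not part of the framework.

```python
def is_valid_ranking(ranks):
    """Check the ranking postcondition: values in [1, k] and pairwise distinct."""
    k = len(ranks)
    in_range = all(1 <= r <= k for r in ranks)
    distinct = len(set(ranks)) == k
    return in_range and distinct


valid = is_valid_ranking((2, 1, 4, 3))     # every rank used exactly once
duplicate = is_valid_ranking((1, 1, 2, 3))  # rank 1 used twice
```

When both conditions hold, the response is a permutation of \(1, \ldots, k\).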
Allocation Constraints¶
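For allocation questions where respondents distribute a fixed total across categories:
\[Q_i = \left(\sum_{j=1}^{k_i} S_i[j] = T_i\right) \land \left(\bigwedge_{j=1}^{k_i} S_i[j] \geq 0\right)\]
where \(T_i\) is the total to be allocated (e.g., 100 points).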
Allocation Constraint
Budget Distribution:
- \(I_1\): "Distribute 100 points across 4 business areas to indicate investment priority"
- Type: Vector of 4 integers
- Domain: Each \(S_1[j] \in [0, 100]\)
- Postcondition: \(Q_1 = \left(\sum_{j=1}^{4} S_1[j] = 100\right) \land \left(\bigwedge_{j=1}^{4} S_1[j] \geq 0\right)\)
Valid response: \((40, 30, 20, 10)\) — R&D gets 40%, Marketing 30%, Sales 20%, Operations 10%
Invalid response: \((50, 30, 20, 10)\) — violates sum constraint (totals 110, not 100)
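A minimal sketch of the allocation check, using the two responses from the Budget Distribution example. The helper name and its `total` parameter are assumptions for illustration.

```python
def is_valid_allocation(points, total=100):
    """Check the allocation postcondition: non-negative parts that sum to the total."""
    return all(p >= 0 for p in points) and sum(points) == total


valid = is_valid_allocation((40, 30, 20, 10))       # sums to 100
over_budget = is_valid_allocation((50, 30, 20, 10))  # sums to 110
```

The non-negativity check matters on its own: a response such as \((120, -20, 0, 0)\) sums to 100 but is still rejected.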
Matrix Symmetry Constraints¶
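For correlation or relationship matrices that must be symmetric (which requires a square matrix, \(m_i = n_i\)):
\[Q_i = \bigwedge_{j=1}^{n_i} \bigwedge_{k=j+1}^{n_i} \left(S_i[j][k] = S_i[k][j]\right)\]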
Symmetry Constraint
Team Collaboration Matrix:
- \(I_1\): "Rate collaboration level between each pair of team members (0-10 scale)"
- Type: Matrix \(4 \times 4\) (4 team members)
- Postcondition: \(Q_1 = \bigwedge_{j=1}^{4} \bigwedge_{k=j+1}^{4} (S_1[j][k] = S_1[k][j])\)
This ensures that if Alice rates her collaboration with Bob as 8, Bob's rating of collaboration with Alice must also be 8 (symmetric relationship).
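The symmetry postcondition only needs to compare each element above the diagonal with its mirror image. A short sketch, with a hypothetical helper name and made-up sample ratings:

```python
def is_symmetric(matrix):
    """Check S[j][k] == S[k][j] for every pair j < k of a square matrix."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        return False  # symmetry is only defined for square matrices
    return all(
        matrix[j][k] == matrix[k][j]
        for j in range(n)
        for k in range(j + 1, n)
    )


# Illustrative 4 x 4 collaboration ratings (values invented for this example)
team = [
    [10, 8, 5, 3],
    [8, 10, 6, 4],
    [5, 6, 10, 7],
    [3, 4, 7, 10],
]
team_ok = is_symmetric(team)

# Changing one side of a pair breaks the constraint
broken = [row[:] for row in team]
broken[0][1] = 9
broken_ok = is_symmetric(broken)
```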
Reachability Analysis for Complex Types¶
The reachability analysis framework from Preconditions and Postconditions extends naturally to complex types. The key insight is that precondition satisfiability checking remains fundamentally the same—we simply have richer constraint expressions.
Ranking-Based Conditional Logic
Product Feature Survey:
Consider a questionnaire where follow-up questions depend on ranking priorities:
Item \(I_1\): "Rank 4 product features by importance (1=most, 4=least)"
- Type: Vector of 4 integers (price, quality, support, design)
- Domain: Each \(S_1[j] \in [1,4]\)
- Postcondition: All values distinct
Item \(I_2\): "Deep-dive on price" (asked if price is top priority)
- Precondition: \(P_2 = (S_1[1] = 1)\) where price is first element
- Reachability: CONDITIONALLY reachable (when price ranked 1st)
Item \(I_3\): "Deep-dive on quality" (asked if quality is top 2)
- Precondition: \(P_3 = (S_1[2] = 1) \vee (S_1[2] = 2)\) where quality is second element
- Reachability: CONDITIONALLY reachable (when quality ranked 1st or 2nd)
Item \(I_4\): "Trade-off analysis" (asked if both price and quality are top 2)
- Precondition: \(P_4 = (S_1[1] \leq 2) \wedge (S_1[2] \leq 2)\)
- Reachability: CONDITIONALLY reachable
- Note: The distinctness constraint ensures this is satisfiable (e.g., price=1, quality=2 or price=2, quality=1)
Analysis: The satisfiability solver verifies that each precondition is satisfiable under the ranking constraint, so every follow-up item is reachable and the branching logic contains no impossible paths.
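For a domain this small, the satisfiability check can be imitated by exhaustive enumeration. The sketch below is an assumption-laden stand-in for the solver, not the solver itself: it enumerates all valid rankings (permutations of 1 to 4) and searches for a witness for each precondition. The names `preconditions` and `find_witness` are hypothetical.

```python
from itertools import permutations

# Component order: (price, quality, support, design); Python indices are 0-based,
# so s[0] corresponds to S_1[1] and s[1] to S_1[2].
preconditions = {
    "I2": lambda s: s[0] == 1,                 # price ranked 1st
    "I3": lambda s: s[1] in (1, 2),            # quality ranked 1st or 2nd
    "I4": lambda s: s[0] <= 2 and s[1] <= 2,   # price and quality both in top 2
}


def find_witness(precondition, k=4):
    """Return a valid ranking satisfying the precondition, or None if none exists."""
    for ranking in permutations(range(1, k + 1)):
        if precondition(ranking):
            return ranking
    return None


witnesses = {item: find_witness(p) for item, p in preconditions.items()}

# A precondition that conflicts with distinctness has no witness:
# price and quality cannot both be ranked 1st.
unreachable = find_witness(lambda s: s[0] == 1 and s[1] == 1)
```

Enumeration grows factorially with the number of ranked options, which is why a constraint solver is the practical choice beyond toy sizes.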
Dependency Graph with Complex Types¶
The dependency graph construction from Preconditions and Postconditions extends to track component-level dependencies:
- An edge \((I_j \to I_i)\) exists if \(P_i\) references any component of \(S_j\)
- For vector \(S_j\): references to \(S_j[k]\) for any \(k\) create a dependency
- For matrix \(S_j\): references to \(S_j[k][\ell]\) for any \(k, \ell\) create a dependency
This ensures that cycle detection and coverage analysis remain sound for complex types.
Dependency Graph with Vectors
Budget Planning Survey:
- \(I_1\): "Allocate total budget across 5 departments" (vector)
- \(I_2\): "R&D sub-allocation" with precondition depending on \(S_1[1]\) (R&D budget component)
- \(I_3\): "Marketing sub-allocation" with precondition depending on \(S_1[2]\) (Marketing budget component)
Dependency graph:
- \((I_1 \to I_2)\) — \(I_2\) depends on \(S_1[1]\)
- \((I_1 \to I_3)\) — \(I_3\) depends on \(S_1[2]\)
Both \(I_2\) and \(I_3\) depend on \(I_1\), but can be evaluated in parallel after \(I_1\) is completed (no dependency between \(I_2\) and \(I_3\)).
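The item-level graph for this example can be sketched by collapsing component references to their source item. The `precondition_refs` structure is a hypothetical representation invented for this illustration; the ordering uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical input: for each item, the (item, component index) pairs
# that its precondition references.
precondition_refs = {
    "I1": [],
    "I2": [("I1", 1)],  # P_2 references S_1[1]
    "I3": [("I1", 2)],  # P_3 references S_1[2]
}


def build_dependencies(refs):
    """Map each item to the set of items whose components its precondition uses."""
    return {
        item: {source for source, _component in components}
        for item, components in refs.items()
    }


deps = build_dependencies(precondition_refs)

# static_order() yields items so that every dependency comes before its dependents;
# it raises graphlib.CycleError if the graph contains a cycle.
order = list(TopologicalSorter(deps).static_order())
```

Any reference to a component of \(S_1\) produces the same edge from \(I_1\), so the component index is dropped when building the graph.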
Practical Examples¶
Example 1: Employee Satisfaction Survey¶
A survey measuring employee satisfaction across multiple dimensions with conditional follow-up questions.
Survey Structure:
Item \(I_1\): "Rate 5 aspects of job satisfaction on a 1-10 scale"
- Type: Vector of 5 integers
- Domain: Each \(S_1[j] \in [1,10]\) for \(j \in \{1,2,3,4,5\}\)
- Aspects: compensation, work-life balance, management, career growth, work environment
Item \(I_2\): "Detailed questions about problem areas"
- Precondition: \(P_2 = \bigvee_{j=1}^{5} (S_1[j] < 5)\)
- Asked if any aspect rated below 5 (dissatisfaction threshold)
Item \(I_3\): "Rank problematic areas by priority for improvement"
- Type: Vector (variable size, only low-rated aspects included)
- Precondition: \(P_3 = P_2\) (same condition as \(I_2\))
Item \(I_4\): "Allocate 100 points across aspects to indicate improvement priority"
- Type: Vector of 5 integers
- Precondition: \(P_4 = P_2\)
- Postcondition: \(Q_4 = \left(\sum_{j=1}^{5} S_4[j] = 100\right)\)
Key Features:
- Vector items for multi-dimensional ratings
- Conditional logic based on vector component thresholds
- Allocation constraint ensuring total equals 100
Example 2: Product Comparison Matrix¶
A market research questionnaire comparing products across multiple attributes.
Survey Structure:
Item \(I_1\): "Compare 4 products on 6 attributes (0-100 scale)"
- Type: Matrix \(4 \times 6\)
- Domain: Each \(S_1[j][k] \in [0,100]\)
- Rows: Products A, B, C, D
- Columns: Price, Quality, Design, Support, Innovation, Value
Item \(I_2\): "Follow-up on perfect-scoring products"
- Precondition: \(P_2 = \bigvee_{j=1}^{4} \left(\bigwedge_{k=1}^{6} (S_1[j][k] = 100)\right)\)
- Asked if any product scores 100 on all attributes
Item \(I_3\): "Identify best product per attribute"
- Type: Vector of 6 integers (product indices)
- Domain: Each \(S_3[k] \in \{1,2,3,4\}\) (product A, B, C, or D)
- Postcondition: For each attribute \(k\), \(S_3[k]\) must be the index of the product with maximum score for that attribute
Key Features:
- Matrix items for two-dimensional comparisons
- Complex preconditions with nested disjunctions and conjunctions over rows and columns
- Derived vector from matrix aggregations
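The precondition \(P_2\) and the postcondition on \(S_3\) can be sketched as follows. The helper names are hypothetical, and the treatment of ties is an assumption: the text does not say what happens when two products share the maximum score, so this sketch accepts any product that attains it.

```python
def has_perfect_product(scores):
    """P_2: some row (product) scores 100 in every column (attribute)."""
    return any(all(value == 100 for value in row) for row in scores)


def is_best_per_attribute(scores, best):
    """Postcondition on S_3: best[k] is the 1-based index of a top-scoring product."""
    if len(best) != len(scores[0]):
        return False
    for k, product_index in enumerate(best):
        column = [row[k] for row in scores]
        if column[product_index - 1] != max(column):
            return False
    return True


# Rows: products A, B, C, D; columns: the 6 attributes (values invented)
scores = [
    [100, 100, 100, 100, 100, 100],
    [50, 60, 70, 80, 90, 100],
    [0, 0, 0, 0, 0, 0],
    [10, 10, 10, 10, 10, 10],
]

perfect = has_perfect_product(scores)
all_a = is_best_per_attribute(scores, [1, 1, 1, 1, 1, 1])
tie_on_last = is_best_per_attribute(scores, [1, 1, 1, 1, 1, 2])  # A and B tie at 100
wrong = is_best_per_attribute(scores, [2, 1, 1, 1, 1, 1])        # B is not best on attribute 1
```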
Constraint Complexity¶
Constraint Growth with Complex Types¶
Complex types generate more constraints than scalar items, which need only a single domain constraint each:
- Vector of size \(k\)
    - Generates \(k\) domain constraints (one per element)
- Matrix of size \(m \times n\)
    - Generates \(m \times n\) domain constraints (one per element)
- Distinct constraints
    - For \(k\) elements, generates \(\binom{k}{2} = \frac{k(k-1)}{2}\) pairwise inequality constraints
Constraint Complexity Example
Ranking Question with 10 Options:
- Domain constraints: 10 (one per element, ensuring each \(S_i[j] \in [1, 10]\))
- Distinctness constraints: \(\binom{10}{2} = 45\) pairwise inequalities
- Total: 55 constraints
This demonstrates how constraint count grows quadratically with the number of ranked options.
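The counts in this example follow directly from the formulas above and can be reproduced with `math.comb`. The function name is hypothetical.

```python
from math import comb


def ranking_constraint_count(k):
    """Return (domain, distinctness, total) constraint counts for k ranked options."""
    domain = k                 # one bound constraint per element
    distinctness = comb(k, 2)  # one inequality per unordered pair
    return domain, distinctness, domain + distinctness


ten_options = ranking_constraint_count(10)
twenty_options = ranking_constraint_count(20)
```

Doubling the options from 10 to 20 raises the pairwise count from 45 to 190, roughly a fourfold increase, which is the quadratic growth described above.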
Symmetry Exploitation¶
For symmetric structures like correlation matrices where \(S_i[j][k] = S_i[k][j]\), the constraint system can exploit symmetry:
- Instead of generating \(n(n-1)\) bidirectional constraints
- Only \(\frac{n(n-1)}{2}\) constraints for the upper triangle are needed
- The symmetric property is enforced by the postcondition
Symmetry in Correlation Matrix
Team Relationship Matrix (5 team members):
Without symmetry:
- Would need \(5 \times 5 = 25\) elements
- Plus \(5 \times 5 - 5 = 20\) equality constraints for symmetry
With symmetry exploitation:
- Only need upper triangle: \(\binom{5}{2} = 10\) elements
- Diagonal elements (self-correlation): 5 elements
- Total: 15 elements instead of 25 (40% reduction)
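One way to picture the reduced representation is to store only the diagonal and upper triangle and derive the full matrix on demand. This is a sketch under assumptions: the dictionary keyed by `(row, column)` pairs and the sample values are invented for illustration and do not describe how the framework stores matrices.

```python
def expand_upper_triangle(n, upper):
    """Rebuild the full n x n symmetric matrix from entries (j, k) with j <= k."""
    return [
        [upper[(min(j, k), max(j, k))] for k in range(n)]
        for j in range(n)
    ]


n = 5
# 0-based indices; diagonal entries set to 10, off-diagonal values are arbitrary
upper = {(j, k): 10 if j == k else j + k for j in range(n) for k in range(j, n)}

full = expand_upper_triangle(n, upper)
stored = len(upper)  # diagonal plus upper triangle
total = n * n        # elements in the full matrix
```

Because the lower triangle is derived from the upper one, the rebuilt matrix is symmetric by construction and no equality constraints are needed.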
Implementation Notes¶
Complex types are implemented in the QML (Questionnaire Markup Language) type system. Each item's outcome variable can have one of three types:
- Integer
    - For scalar questions (single value responses)
- List[Integer]
    - For vector questions (rankings, multiple selections, allocations)
- List[List[Integer]]
    - For matrix questions (grids, comparison tables, correlation matrices)
Because QML uses Python's dynamic typing, the outcome type is determined at runtime based on the item's configuration in the QML specification. The validation framework automatically handles type-appropriate constraint generation and satisfiability checking.
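To make the three outcome types concrete, the sketch below classifies a Python value by its shape. It is not QML's code: the text states that QML determines the type from the item's configuration, whereas this hypothetical `outcome_kind` helper inspects a value directly, purely to illustrate how the three shapes differ at runtime.

```python
def _is_int(value):
    """True for plain integers; bool is excluded although it subclasses int."""
    return isinstance(value, int) and not isinstance(value, bool)


def outcome_kind(value):
    """Classify a value as 'scalar', 'vector', or 'matrix' (illustrative only)."""
    if _is_int(value):
        return "scalar"
    if isinstance(value, list) and all(_is_int(v) for v in value):
        return "vector"
    if isinstance(value, list) and all(
        isinstance(row, list) and all(_is_int(v) for v in row) for row in value
    ):
        return "matrix"
    raise TypeError(f"unsupported outcome value: {value!r}")


kinds = [
    outcome_kind(4),                              # Integer
    outcome_kind([2, 1, 4, 3]),                   # List[Integer]
    outcome_kind([[1, 2, 3, 4], [5, 6, 7, 8]]),   # List[List[Integer]]
]
```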
Further Reading¶
- Questionnaire Analysis Foundations - Core definitions and notation
- Preconditions and Postconditions - Reachability and constraint classification
- Dependency Analysis and Cycle Detection - Ensuring evaluation order and detecting circular dependencies
- Global vs Local Validity - Relationship between per-item and global checks