AI-Assisted Questionnaire Generation¶
This document describes the AI-assisted questionnaire generation feature that helps researchers create logically sound, comprehensive surveys by combining natural language understanding with formal mathematical validation.
Overview¶
AI-assisted questionnaire generation transforms research briefs and source documents into formally verified QML questionnaires. The system combines:
- Natural language understanding to interpret research objectives and extract quantitative dimensions
- Domain expertise to recommend appropriate question types and response scales
- Formal verification using SMT solvers (Z3) to ensure logical consistency
- Iterative refinement to fix validation issues automatically
Research Domain Support¶
The system supports a wide range of research domains:
| Domain | Examples | Typical Focus |
|---|---|---|
| Sociology | Social attitudes, demographic studies, community surveys | Population characteristics, behavioral patterns |
| Market Research | Consumer preferences, brand perception, product testing | Purchasing behavior, satisfaction metrics |
| Health & Medical | Patient outcomes, clinical trials, wellness assessments | Symptom tracking, treatment efficacy, quality of life |
| Political | Voter preferences, policy opinions, election polling | Party support, issue importance, demographic cross-tabs |
| Risk & Compliance | Internal controls, regulatory compliance, risk assessment | Control effectiveness, compliance gaps, risk exposure |
| Due Diligence | Vendor assessment, M&A analysis, background checks | Financial health, operational risks, reputational factors |
| Procurement | Supplier evaluation, contract compliance, bid assessment | Capability verification, pricing analysis, SLA adherence |
| IT Security | Threat analysis, vulnerability assessment, security posture | Risk vectors, control coverage, incident history |
| Conflict of Interest | Ethics disclosures, relationship mapping, independence verification | Financial relationships, personal connections, decision influence |
Document Understanding Pipeline¶
Large research projects often involve extensive source materials. The AI processes these documents through a structured pipeline to extract the essence relevant for questionnaire design.
flowchart TD
subgraph input["Input Layer"]
docs[Source Documents]
prompt[Research Brief]
end
subgraph processing["Document Processing"]
ingest[Document Ingestion] --> chunk[Semantic Chunking]
chunk --> extract[Key Concept Extraction]
extract --> quantify[Quantifiable Dimension Identification]
end
subgraph synthesis["Research Synthesis"]
quantify --> themes[Theme Clustering]
themes --> chapters[Chapter Identification]
chapters --> outline[Questionnaire Outline]
end
docs --> ingest
prompt --> themes
outline --> alignment[Researcher Alignment]
Stage 1: Document Ingestion¶
The system accepts multiple document formats:
- PDF documents (reports, research papers, regulations)
- Word documents (briefs, specifications, guidelines)
- Spreadsheets (data dictionaries, variable lists, coding schemes)
- Text files (transcripts, notes, requirements)
For large document sets, the AI:
- Indexes all documents with metadata extraction
- Identifies document types (regulatory, academic, operational, etc.)
- Establishes cross-references between related documents
- Prioritizes by relevance to the research brief
Stage 2: Concept Extraction¶
From the ingested documents, the AI extracts:
| Extraction Type | Description | Example |
|---|---|---|
| Key entities | Organizations, roles, processes mentioned | "Audit Committee", "Risk Manager", "Monthly Review" |
| Measurable constructs | Concepts that can be quantified | "Satisfaction level", "Frequency of occurrence", "Severity rating" |
| Categorical variables | Discrete classification dimensions | "Department", "Region", "Product category" |
| Relationships | Dependencies between concepts | "Satisfaction correlates with retention" |
| Constraints | Rules and requirements that affect question logic | "Only applicable if revenue > $10M" |
Stage 3: Quantifiable Dimension Identification¶
The AI identifies aspects of the research that can be measured through survey questions:
Document excerpt: "Employees must report conflicts of interest annually.
The policy requires disclosure of financial interests
exceeding $5,000 and any family relationships with
vendors or competitors."
Extracted dimensions:
┌─────────────────────────────┬────────────────────────────────────┐
│ Dimension │ Measurement Approach │
├─────────────────────────────┼────────────────────────────────────┤
│ Reporting compliance │ Yes/No + date of last disclosure │
│ Financial interest amount │ Numeric range with thresholds │
│ Relationship types │ Multiple selection (family types) │
│ Entity categories │ Checkbox (vendor, competitor, etc) │
└─────────────────────────────┴────────────────────────────────────┘
Stage 4: Chapter Identification¶
For comprehensive research projects, the AI organizes the questionnaire into logical chapters:
User: "Create a questionnaire for evaluating IT security posture
across our organization"
AI: Based on the security framework documents provided, I recommend
organizing the questionnaire into 6 chapters:
Chapter 1: General Information (10 questions)
- Organization details, respondent role, IT environment scope
Chapter 2: Access Control (15 questions)
- Authentication methods, authorization policies, privilege management
Chapter 3: Network Security (12 questions)
- Perimeter protection, segmentation, monitoring capabilities
Chapter 4: Data Protection (14 questions)
- Classification, encryption, backup, retention policies
Chapter 5: Incident Response (11 questions)
- Detection capabilities, response procedures, recovery testing
Chapter 6: Compliance & Governance (8 questions)
- Regulatory requirements, audit findings, policy currency
Would you like me to proceed with this structure, or would you
prefer to modify the chapter organization?
Researcher Alignment¶
Before generating QML, the AI engages in a structured dialogue to ensure alignment with research objectives.
Clarification Questions¶
The AI asks targeted questions to refine requirements:
AI: I've analyzed your research brief on customer satisfaction.
Before generating the questionnaire, I need to clarify a few points:
1. Target Population
Who are the intended respondents?
[ ] All customers
[ ] Recent purchasers (last 30/90/180 days)
[ ] Specific product line users
[ ] High-value customers only
2. Response Format Preference
For satisfaction ratings, which scale do you prefer?
[ ] 5-point Likert (Strongly Disagree -> Strongly Agree)
[ ] 7-point Likert (more granular)
[ ] 0-10 NPS-style scale
[ ] Let AI recommend based on analysis type
3. Open-Ended Questions
Should the questionnaire include free-text responses?
[ ] Yes, after key sections
[ ] Yes, only at the end
[ ] No, quantitative only
4. Estimated Completion Time
What's the acceptable survey length?
[ ] Under 5 minutes (~15-20 questions)
[ ] 5-10 minutes (~25-40 questions)
[ ] 10-15 minutes (~40-60 questions)
[ ] Comprehensive (60+ questions, with progress saves)
Research Design Validation¶
The AI validates that the proposed questionnaire design matches research methodology:
AI: Based on your research objectives, I've identified the following
analytical requirements:
Statistical Analysis Planned:
┌─────────────────────────┬─────────────────────────────────────┐
│ Analysis Type │ Questionnaire Implication │
├─────────────────────────┼─────────────────────────────────────┤
│ Cross-tabulation │ Need categorical demographics │
│ Regression analysis │ Need continuous outcome variables │
│ Factor analysis │ Need multiple items per construct │
│ Segment comparison │ Need clear grouping variables │
└─────────────────────────┴─────────────────────────────────────┘
Recommended additions:
- Add 3 demographic questions to enable cross-tabulation
- Include 4-item scales for key constructs to support factor analysis
- Add "segment identifier" question for group comparisons
Accept these recommendations?
QML Generation Flow¶
Once aligned on requirements, the AI enters an iterative generation cycle that produces formally verified QML.
flowchart TD
subgraph generation["Generation Cycle"]
gen[Generate/Enhance QML] --> validate[SMT Validation]
validate --> report{Validation<br/>Report}
report -->|Errors Found| correct[Correct QML]
correct --> validate
report -->|Valid| output[Valid QML Output]
end
subgraph feedback["Feedback Loop"]
output --> review[Customer Review]
review --> test[Test in SirWay]
test --> changes{Changes<br/>Needed?}
changes -->|Yes| gen
changes -->|No| final[Final Questionnaire]
end
input[Research Alignment] --> gen
Step 1: Initial QML Generation¶
The AI generates QML code for each chapter based on the research synthesis:
AI: Generating Chapter 2: Access Control...
Creating 15 questions covering:
- Authentication mechanisms (4 questions)
- Authorization policies (5 questions)
- Privilege management (4 questions)
- Access monitoring (2 questions)
Applying QML best practices:
- Using appropriate control types for each question
- Setting realistic domain constraints (min/max)
- Adding preconditions for conditional questions
- Including postconditions for logical validation
- Organizing items in dependency order
Generated QML snippet:
- id: q_auth_methods
kind: Question
title: "Which authentication methods are used for system access?"
input:
control: Checkbox
labels:
1: "Username/password"
2: "Multi-factor authentication (MFA)"
4: "Single sign-on (SSO)"
8: "Biometric authentication"
16: "Hardware tokens"
32: "Certificate-based"
- id: q_mfa_coverage
kind: Question
title: "What percentage of user accounts have MFA enabled?"
precondition:
- predicate: (q_auth_methods.outcome & 2) != 0
input:
control: Slider
min: 0
max: 100
step: 5
left: "0%"
right: "100%"
labels:
0: "None"
50: "Half"
100: "All"
postcondition:
- predicate: q_mfa_coverage.outcome >= 0
hint: "Please specify the MFA coverage percentage"
Step 2: SMT Validation¶
The generated QML is sent to the SMT-based questionnaire validator, which performs three levels of analysis:
Level 1: Per-Item Validation¶
For each item, the validator checks:
| Check | Classification | Meaning |
|---|---|---|
| Precondition reachability | ALWAYS / NEVER / CONDITIONAL | Can this question ever be reached? |
| Postcondition validity | TAUTOLOGICAL / CONSTRAINING / INFEASIBLE | Does the validation rule make sense? |
Validation Report - Chapter 2: Access Control
=============================================
Item: q_auth_methods
Precondition: ALWAYS (no dependencies)
Postcondition: TAUTOLOGICAL (no constraints)
Status: Valid
Item: q_mfa_coverage
Precondition: CONDITIONAL (requires MFA selected)
Postcondition: TAUTOLOGICAL
Status: Valid
Item: q_password_complexity
Precondition: CONDITIONAL (requires password auth)
Postcondition: CONSTRAINING (enforces minimum length)
Status: Valid
Item: q_token_expiry
Precondition: NEVER (!)
Issue: Precondition (q_auth_methods.outcome & 64) != 0
references bit 64, but no label with value 64 exists
Status: Error - unreachable question
Level 2: Global Consistency¶
The validator checks whether the entire questionnaire has at least one valid completion path:
Global Satisfiability Check
===========================
Formula: B AND (P_i => Q_i) for all items
Result: SAT (satisfiable)
Witness (example valid completion):
q_auth_methods = 7 (password + MFA + SSO)
q_mfa_coverage = 85
q_password_complexity = 3 (complex)
...
Level 3: Path-Based Validation¶
For comprehensive validation, the system checks that no questions become unreachable due to accumulated constraints:
Accumulated Reachability Analysis
=================================
Item: q_privileged_account_review
Direct precondition: q_has_privileged_accounts.outcome == 1
Accumulated constraints: q_account_management.outcome >= 2
Check: Can both conditions be satisfied simultaneously?
Result: SAT (reachable on valid paths)
Item: q_legacy_system_access
Direct precondition: q_has_legacy_systems.outcome == 1
Accumulated constraints: q_modernization_complete.outcome == 1
Check: Can both conditions be satisfied simultaneously?
Result: UNSAT (!) (dead code - never reachable)
Reason: If modernization is complete, legacy systems flag is forced to 0
Step 3: Automatic Correction¶
When validation errors are found, the AI automatically generates corrections:
AI: Found 2 issues in Chapter 2. Applying corrections...
Issue 1: q_token_expiry references non-existent bit flag
---------------------------------------------------------
Original precondition:
predicate: (q_auth_methods.outcome & 64) != 0
Analysis: Bit 64 not in label set. Hardware tokens use bit 16.
Correction:
predicate: (q_auth_methods.outcome & 16) != 0
Issue 2: q_legacy_system_access is dead code
--------------------------------------------
Original precondition:
predicate: q_has_legacy_systems.outcome == 1
Analysis: Earlier postcondition forces q_has_legacy_systems = 0
when q_modernization_complete = 1, but the precondition
of this question also requires q_modernization_complete = 1.
Correction options:
a) Remove the conflicting postcondition on q_modernization_complete
b) Change this question's precondition to not require modernization
c) Remove this question (if legacy systems are truly gone)
Recommended: Option (a) - relax the postcondition
Re-validating...
All items now valid
Step 4: Iteration Until Valid¶
The generation cycle continues until:
- All per-item validations pass
- Global consistency is confirmed
- No dead code exists (optional strictness level)
Validation Summary - Chapter 2: Access Control
==============================================
Iterations: 3
Final status: All validations passed
Per-item results:
15/15 items have valid preconditions
15/15 items have feasible postconditions
0 items classified as dead code
Global consistency: SAT (valid completion exists)
Path coverage: All CONDITIONAL items reachable
QML ready for review.
Customer Feedback Loop¶
After initial validation, the questionnaire enters a feedback cycle with the researcher.
Testing in SirWay¶
The AI can create a test survey in SirWay for the researcher to experience:
AI: I've created a test instance of Chapter 2 in SirWay.
Test URL: https://sirway.dev.askalot.io/survey/test-abc123
You can:
1. Walk through all question paths
2. See how conditional logic works
3. Test validation messages
4. Experience the respondent interface
The test includes debug mode showing:
- Current variable values
- Active preconditions
- Postcondition status
After testing, let me know what changes you'd like.
Incorporating Feedback¶
Researchers can request modifications through natural language:
User: "The MFA coverage question should only appear if they selected
MFA as an authentication method. Also, add a follow-up question
about MFA bypass procedures when coverage is below 100%."
AI: Understood. Making the following changes to Chapter 2:
Change 1: q_mfa_coverage precondition
-------------------------------------
The precondition already checks for MFA selection:
predicate: (q_auth_methods.outcome & 2) != 0
This is correct (bit 2 = MFA). No change needed.
Change 2: New question q_mfa_bypass
------------------------------------
Adding new question after q_mfa_coverage:
- id: q_mfa_bypass
kind: Question
title: "Are there documented procedures for MFA bypass in emergencies?"
precondition:
- predicate: q_mfa_coverage.outcome < 100
input:
control: Radio
labels:
1: "Yes, formally documented and tested"
2: "Yes, but informal/ad-hoc"
3: "No bypass procedures exist"
4: "Unknown"
Re-validating...
New question validated successfully
Precondition: CONDITIONAL (appears when MFA < 100%)
No conflicts with existing logic
QML Best Practices Applied¶
The AI applies best practices from the Creating Surveys guide:
Question Design¶
| Practice | AI Implementation |
|---|---|
| Be specific | Generates precise question titles based on document context |
| Use appropriate controls | Selects control types based on variable type and response distribution |
| Set realistic ranges | Derives min/max from domain documents and statistical norms |
| Provide context | Adds left/right text for units and clarifications |
Dependency Management¶
| Practice | AI Implementation |
|---|---|
| Keep related items together | Groups questions by theme within blocks |
| Order matters | Ensures dependency order in item sequence |
| Test all paths | Uses SMT validation to verify reachability |
Validation Strategy¶
| Practice | AI Implementation |
|---|---|
| Domain constraints | Sets input control min/max from extracted requirements |
| Postconditions | Generates logical consistency rules from document relationships |
| Helpful hints | Creates clear error messages explaining validation failures |
| Edge cases | Adds bounds checking and extreme value handling |
Complete Workflow Example¶
User: "I need to create a due diligence questionnaire for evaluating
potential acquisition targets. I've uploaded 5 documents:
- M&A policy (PDF)
- Financial checklist (Excel)
- Legal requirements (Word)
- IT integration guidelines (PDF)
- HR transition procedures (PDF)"
AI: Analyzing documents...
Document Summary:
+----------------------------+----------+-----------------------+
| Document | Pages | Key Topics |
+----------------------------+----------+-----------------------+
| M&A Policy | 24 | Process, approvals |
| Financial Checklist | 8 sheets | Revenue, assets, debt |
| Legal Requirements | 18 | Contracts, IP, claims |
| IT Integration Guidelines | 31 | Systems, security |
| HR Transition Procedures | 15 | Staff, benefits |
+----------------------------+----------+-----------------------+
Extracted 147 quantifiable dimensions across 6 themes.
Recommended questionnaire structure (8 chapters, ~85 questions):
1. Company Overview (8 questions)
2. Financial Health (15 questions)
3. Legal & Compliance (12 questions)
4. Operational Assessment (14 questions)
5. Technology & Systems (11 questions)
6. Human Resources (10 questions)
7. Market Position (9 questions)
8. Risk Summary (6 questions)
Estimated completion time: 20-25 minutes
Before I generate the QML, I have some clarification questions...
[Alignment dialogue continues]
---
Generating Chapter 1: Company Overview...
8 questions generated
SMT validation passed (2 iterations)
Generating Chapter 2: Financial Health...
15 questions generated
SMT validation found 1 issue
- Fixed: postcondition on debt_ratio referenced wrong variable
Re-validation passed
[... continues for all chapters ...]
Generation Complete
===================
Total questions: 85
Validation iterations: 11
Final status: All chapters validated
Test URL: https://sirway.dev.askalot.io/survey/test-dd-eval-2026
Would you like to review any specific chapter, or should I
proceed with creating the production questionnaire in Targetor?
Integration with Armiger¶
The AI-assisted generation integrates seamlessly with Armiger, the browser-based QML development environment:

The Armiger interface showing QML code (left), flowchart visualization (center), and AI assistant (right).
AI Panel Features¶
| Feature | Description |
|---|---|
| Chat interface | Natural language interaction for requirements and feedback |
| Document upload | Drag-and-drop source documents for analysis |
| Live preview | See generated QML update in real-time |
| Validation status | Instant feedback on SMT validation results |
| Suggestion cards | Proactive recommendations for improvements |
Code Synchronization¶
Changes made by the AI are synchronized with the code editor:
- AI-generated code appears with highlighting
- Manual edits trigger re-validation
- Conflict resolution for concurrent changes
- Version history with rollback capability
Related Documentation¶
- QML Syntax Reference - Complete QML language specification
- Creating Surveys - Manual survey creation guide
- Questionnaire Analysis - Mathematical foundations
- Comprehensive Validation - Validation theory
- AI-Assisted Campaign Management - Using AI for campaign setup