AI-Assisted Result Analysis

Balansor's AI Analyst evaluates survey data quality across two complementary dimensions: sample representativeness (does the sample match the target population?) and response quality (are questions producing informative, diverse answers?). The analyst combines statistical metrics with methodological expertise to produce structured reports with actionable recommendations.

Overview

After collecting survey responses, researchers need to answer two fundamental questions before trusting their data:

  1. Is the sample representative? — Do respondent demographics match the target population defined in the sampling strategy?
  2. Are the responses reliable? — Do answer patterns indicate genuine engagement, or are there signs of satisficing, inattention, or bias?

Balansor addresses both questions through computed metrics and an AI analyst agent that interprets results in methodological context.

```mermaid
flowchart LR
    subgraph input["Input"]
        b[Bronze Dataset]
        s[Sampling Strategy]
    end

    subgraph metrics["Quality Metrics"]
        r[Sample Representativeness]
        q[Response Quality]
    end

    subgraph output["Output"]
        w[Silver Dataset]
        rpt[Analyst Report]
    end

    b --> r
    s --> r
    b --> q
    r --> rpt
    q --> rpt
    b -->|apply raking| w
    w --> r
```

Quality Metrics

The platform computes quality metrics across two dimensions, sample representativeness and response quality, plus weighting diagnostics for weighted (Silver) datasets. All are grounded in established survey methodology (Groves, Kish, Krosnick, Shannon, Cronbach):

| Dimension | Key Metrics |
| --- | --- |
| Sample Representativeness | RMSE, MAE, Chi-Square, Max Deviation, Composite Quality Score |
| Weighting Diagnostics | Design Effect (DEFF), Effective Sample Size, Weight CV, Completion Rate |
| Response Quality | Normalized Entropy, Straightlining Score, Cronbach's Alpha, Acquiescence Bias, Speeder Detection, Multi-Flag Aggregation |

See the Quality Metrics Reference for complete definitions, mathematical formulas, interpretation thresholds, and recommended actions.
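As an illustration of the representativeness metrics above, here is a minimal sketch for a single demographic factor. The input shape (category-to-proportion maps) and exact formulas are assumptions; see the Quality Metrics Reference for Balansor's actual definitions.

```python
import math

def representativeness(actual: dict[str, float], target: dict[str, float]) -> dict[str, float]:
    """Compare an observed category distribution against strategy targets.

    Both inputs map category -> proportion (each summing to ~1.0).
    """
    cats = sorted(set(actual) | set(target))
    devs = [actual.get(c, 0.0) - target.get(c, 0.0) for c in cats]
    return {
        "rmse": math.sqrt(sum(d * d for d in devs) / len(devs)),
        "mae": sum(abs(d) for d in devs) / len(devs),
        "max_deviation": max(abs(d) for d in devs),
    }

# Age 18-24 over-represented by 8pp (32% actual vs 24% target)
m = representativeness(
    actual={"18-24": 0.32, "25-44": 0.40, "45+": 0.28},
    target={"18-24": 0.24, "25-44": 0.44, "45+": 0.32},
)
```

Lower values are better on all three metrics; the composite quality score would aggregate these per-factor results.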


The AI Analyst Agent

The AI Analyst is a specialized agent that interprets quality metrics in methodological context and produces structured reports. It runs as a background task in Balansor, powered by the askalot_ai agent framework.

How It Works

```mermaid
sequenceDiagram
    participant U as Researcher
    participant B as Balansor
    participant R as Redis
    participant A as Analyst Agent
    participant P as Portor MCP

    U->>B: Start Quality Analysis
    B->>R: Create progress queue
    B->>A: Spawn background thread
    B-->>U: Return task_id

    A->>P: get_dataset_quality(dataset_id)
    P-->>A: RMSE, MAE, per-factor breakdown, weighting diagnostics
    A->>P: get_sampling_strategy(strategy_id)
    P-->>A: Target distributions
    A->>P: get_campaign(campaign_id)
    P-->>A: Completion rates, respondent counts
    A->>P: get_dataset_response_quality(dataset_id)
    P-->>A: Entropy, straightlining, speeders, multi-flag respondents
    A->>P: compare_dataset_quality(bronze_id, silver_id)
    P-->>A: Before/after weighting comparison

    A->>R: Emit progress events
    A->>R: Emit completed report

    U->>B: Poll for status
    B->>R: Read progress
    B-->>U: Display report
```

  1. Credential resolution: The agent resolves AI provider credentials from user settings, organization configuration, or environment variables (in that priority order)
  2. Background execution: The analysis runs in a background thread via askalot_ai.runner.subprocess, isolating it from the web request lifecycle
  3. MCP tool access: The agent connects to Portor's MCP interface to call quality assessment tools, gather campaign context, and retrieve sampling strategy targets
  4. Progress tracking: Redis Streams provide cross-worker progress reporting — safe across Gunicorn's multiple worker processes
  5. Report generation: The agent produces a structured markdown report following a defined template
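The cross-worker progress pattern in steps 2 and 4 can be sketched as follows. Here `emit` stands in for XADD on a Redis Stream keyed by task ID; the step names and event shape are illustrative assumptions, not Balansor's actual schema.

```python
import threading
from queue import Queue

def run_analysis(task_id: str, emit) -> None:
    """Background analysis: emit progress events, then the completed report."""
    for step in ("fetch_metrics", "fetch_strategy", "generate_report"):
        emit({"task_id": task_id, "event": "progress", "step": step})
    emit({"task_id": task_id, "event": "completed", "report": "..."})

# The web worker spawns the thread, returns the task_id immediately, and
# later drains the stream to answer status polls from any Gunicorn worker.
events: Queue = Queue()
worker = threading.Thread(target=run_analysis, args=("task-123", events.put))
worker.start()
worker.join()
```

With a real Redis client, `emit` would append each event to a stream (e.g. via `xadd`), which is what makes the progress readable from a different worker process than the one that spawned the thread.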

Agent Profile

| Property | Value |
| --- | --- |
| Agent type | analyst |
| Model tier | High (Claude Opus) |
| Temperature | 0.4 (low; favors precision over creativity) |
| Max turns | 30 |
| Knowledge base | Data quality assessment, weighting methodology, response quality metrics |

The analyst's system prompt encodes survey methodology expertise from AAPOR standards, Kish (1965), Groves et al. (2009), Kalton & Flores-Cervantes (2003), Krosnick (1991), and ESOMAR guidelines — including specific formulas, thresholds, and decision frameworks for interpreting design effects, straightlining scores, speeder flags, and multi-flag aggregation.

Report Structure

The analyst produces a report with six sections:

1. Executive Summary

2–3 sentences: overall quality assessment, fitness for purpose, and the single most important recommendation.

2. Sample Representativeness

  • Overall quality score with interpretation
  • RMSE and MAE values with context
  • Per-factor analysis identifying which demographics match targets and which deviate
  • Specific numbers: "Age 18–24 is over-represented by 8pp (32% actual vs 24% target)"

3. Weighting Assessment (if Silver dataset exists)

  • Quality improvement percentages from raking
  • Per-factor improvement breakdown
  • Weighting diagnostics: design effect (DEFF), effective sample size (ESS), weight CV, and weight ratio interpretation
  • Flags for any factors that worsened after weighting
  • Assessment of whether weighting was effective or structural changes are needed
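The weighting diagnostics above follow Kish's standard formulas; this sketch shows the relationships between them (Balansor's implementation details may differ):

```python
import statistics

def weighting_diagnostics(weights: list[float]) -> dict[str, float]:
    """Kish design effect, effective sample size, and weight dispersion."""
    n = len(weights)
    s1 = sum(weights)
    s2 = sum(w * w for w in weights)
    return {
        "deff": n * s2 / (s1 * s1),   # Kish design effect
        "ess": s1 * s1 / s2,          # effective sample size = n / DEFF
        "weight_cv": statistics.pstdev(weights) / statistics.fmean(weights),
        "weight_ratio": max(weights) / min(weights),
    }
```

Note the identity DEFF = 1 + CV²: a weight CV of 0.35 implies DEFF ≈ 1.12, i.e. roughly an 11% loss of effective sample size from weighting alone.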

4. Response Quality

  • Speeder detection: number flagged, percentage of sample, median completion time
  • Straightlining: groups with high scores (> 0.5)
  • Item non-response: questions exceeding 10% missing rates
  • Acquiescence bias index (if Likert scales present)
  • Multi-flag respondents (baseball rule) — exclusion recommendation if any

5. Key Findings

3–5 specific, data-driven findings, each referencing actual numbers from the quality metrics.

6. Recommendations

2–4 prioritized, actionable recommendations, each stating what to change, why (linked to a specific finding), and the expected impact.
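The speeder and multi-flag checks that feed section 4 of the report can be sketched as follows. The 0.5x-median speeder cutoff and the three-flag exclusion threshold are common conventions in the survey-quality literature and are assumptions here, not confirmed Balansor parameters.

```python
import statistics

def flag_speeders(completion_times: list[float], factor: float = 0.5) -> list[int]:
    """Indices of respondents finishing in under `factor` x the median time."""
    cutoff = factor * statistics.median(completion_times)
    return [i for i, t in enumerate(completion_times) if t < cutoff]

def multi_flag_exclusions(flags: dict[str, list[str]], strikes: int = 3) -> list[str]:
    """'Baseball rule': recommend excluding respondents with >= `strikes` flags."""
    return sorted(r for r, fs in flags.items() if len(fs) >= strikes)
```

A single flag rarely justifies exclusion on its own; aggregating flags per respondent keeps the exclusion rate low while catching the least attentive cases.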


Using Quality Analysis in Balansor

Viewing Quality Metrics

  1. Navigate to Quality from the main menu
  2. Select a dataset from the dropdown
  3. Two tabs are available:
    • Sample Representativeness — demographics vs strategy targets
    • Response Quality — answer diversity, straightlining, consistency

Sample Representativeness Tab

Shows per-factor breakdowns with actual vs target distributions. If both Bronze and Silver datasets exist, a side-by-side comparison shows improvement from weighting.

If the dataset has no linked sampling strategy, a strategy selector appears so you can assign one.

Response Quality Tab

Displays four aggregate summary cards:

| Card | Metric | What It Shows |
| --- | --- | --- |
| Diversity | Mean Normalized Entropy | Average answer diversity across categorical questions |
| Acquiescence | Bias Index | Agreement tendency in Likert scales |
| Non-Response | Overall Rate | Average skip rate across questions |
| Coverage | Question Count | Number of questions analyzed by type |

Below the summary: a per-question metrics table, straightlining detection panel (for groups), internal consistency panel (Cronbach's alpha for groups with 3+ sub-items), speeder detection results, and multi-flag respondent aggregation.
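Two of the per-question metrics shown here can be sketched briefly. Note the entropy normalization below uses observed categories, a simplification; the platform may normalize by the scale's full option count instead.

```python
import math
from collections import Counter

def normalized_entropy(answers: list[str]) -> float:
    """Shannon entropy of categorical answers, scaled to [0, 1] by log(k)."""
    counts = Counter(answers)
    n, k = len(answers), len(counts)
    if k < 2:
        return 0.0  # everyone gave the same answer: zero diversity
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return h / math.log(k)

def straightlining_rate(grid_rows: list[list[int]]) -> float:
    """Share of respondents giving one identical answer across a grid group."""
    return sum(len(set(row)) == 1 for row in grid_rows) / len(grid_rows)
```

High entropy means answers are spread across options (informative); entropy near zero on a substantive question suggests the question is not discriminating between respondents.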

Running AI Analysis

  1. On the Quality page, click "Run AI Analysis"
  2. The analyst agent starts in the background — progress updates appear in real time
  3. When complete, the structured report appears with methodology-grounded interpretation

The AI analysis requires:

  • An AI provider API key (configured in user profile, organization settings, or environment)
  • A Portor MCP endpoint (for accessing quality tools and campaign context)
  • Redis (for cross-worker progress tracking)

Dataset Detail Page

Each dataset's detail page includes a compact Response Quality card showing diversity score, non-response rate, straightlining summary, and question type breakdown. Click through to the full quality analysis page for detailed metrics.


Interpreting Results

For detailed interpretation guidance — common patterns, recommended actions, and the two-dimensional quality matrix — see the Quality Metrics Reference: Interpreting Results.


MCP Tools

Quality analysis tools are available through the MCP interface for programmatic or AI-assisted access:

| Tool | Purpose |
| --- | --- |
| `get_dataset_quality` | Sample representativeness metrics (RMSE, MAE, Chi-Square, per-factor breakdown); for Silver datasets, also weighting diagnostics (DEFF, ESS, weight CV) and completion rate |
| `get_dataset_response_quality` | Response quality metrics (entropy, straightlining, consistency, acquiescence, speeder detection, multi-flag aggregation) |
| `compare_dataset_quality` | Side-by-side Bronze vs Silver comparison |
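At the wire level, MCP tool calls are JSON-RPC 2.0 `tools/call` requests. This sketch builds such a payload for one of the tools above; transport details (session setup, auth, SSE vs stdio) are omitted, and the `dataset_id` value is illustrative.

```python
import itertools
import json

_ids = itertools.count(1)

def mcp_tool_call(name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request as framed by MCP."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })

req = mcp_tool_call("get_dataset_quality", {"dataset_id": "ds-bronze-001"})
```

In practice you would use an MCP client session rather than hand-built payloads; the sketch only shows what crosses the wire.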

See the Dataset Tools Reference for complete parameter documentation.