AI-Assisted Result Analysis¶
Balansor's AI Analyst evaluates survey data quality across two complementary dimensions: sample representativeness (does the sample match the target population?) and response quality (are questions producing informative, diverse answers?). The analyst combines statistical metrics with methodological expertise to produce structured reports with actionable recommendations.
Overview¶
After collecting survey responses, researchers need to answer two fundamental questions before trusting their data:
- Is the sample representative? — Do respondent demographics match the target population defined in the sampling strategy?
- Are the responses reliable? — Do answer patterns indicate genuine engagement, or are there signs of satisficing, inattention, or bias?
Balansor addresses both questions through computed metrics and an AI analyst agent that interprets results in methodological context.
```mermaid
flowchart LR
    subgraph input["Input"]
        b[Bronze Dataset]
        s[Sampling Strategy]
    end
    subgraph metrics["Quality Metrics"]
        r[Sample Representativeness]
        q[Response Quality]
    end
    subgraph output["Output"]
        w[Silver Dataset]
        rpt[Analyst Report]
    end
    b --> r
    s --> r
    b --> q
    r --> rpt
    q --> rpt
    b -->|apply raking| w
    w --> r
```
Quality Metrics¶
The platform computes quality metrics across the two dimensions above, plus weighting diagnostics for Silver datasets, grounded in established survey methodology (Groves, Kish, Krosnick, Shannon, Cronbach):
| Dimension | Key Metrics |
|---|---|
| Sample Representativeness | RMSE, MAE, Chi-Square, Max Deviation, Composite Quality Score |
| Weighting Diagnostics | Design Effect (DEFF), Effective Sample Size, Weight CV, Completion Rate |
| Response Quality | Normalized Entropy, Straightlining Score, Cronbach's Alpha, Acquiescence Bias, Speeder Detection, Multi-Flag Aggregation |
See the Quality Metrics Reference for complete definitions, mathematical formulas, interpretation thresholds, and recommended actions.
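The representativeness metrics in the first row can be computed directly from the actual and target distributions. A minimal sketch (the function and variable names are ours for illustration, not Balansor's API):

```python
# Sketch: sample-representativeness metrics from categorical distributions.
# Formulas follow the standard definitions (RMSE, MAE, max deviation).
import math

def representativeness(actual: dict[str, float], target: dict[str, float]) -> dict[str, float]:
    """Compare an observed category distribution against strategy targets.

    Both inputs map category -> proportion (each summing to ~1.0).
    """
    deviations = [actual.get(k, 0.0) - target[k] for k in target]
    n = len(deviations)
    return {
        "rmse": math.sqrt(sum(d * d for d in deviations) / n),
        "mae": sum(abs(d) for d in deviations) / n,
        "max_deviation": max(abs(d) for d in deviations),
    }

# Age 18-24 over-represented by 8pp, echoing the report example below:
actual = {"18-24": 0.32, "25-44": 0.40, "45+": 0.28}
target = {"18-24": 0.24, "25-44": 0.42, "45+": 0.34}
m = representativeness(actual, target)
```

RMSE penalizes a single large deviation more heavily than MAE does, which is why the reference documents both.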
The AI Analyst Agent¶
The AI Analyst is a specialized agent that interprets quality metrics in methodological context and produces structured reports. It runs as a background task in Balansor, powered by the askalot_ai agent framework.
How It Works¶
```mermaid
sequenceDiagram
    participant U as Researcher
    participant B as Balansor
    participant R as Redis
    participant A as Analyst Agent
    participant P as Portor MCP
    U->>B: Start Quality Analysis
    B->>R: Create progress queue
    B->>A: Spawn background thread
    B-->>U: Return task_id
    A->>P: get_dataset_quality(dataset_id)
    P-->>A: RMSE, MAE, per-factor breakdown, weighting diagnostics
    A->>P: get_sampling_strategy(strategy_id)
    P-->>A: Target distributions
    A->>P: get_campaign(campaign_id)
    P-->>A: Completion rates, respondent counts
    A->>P: get_dataset_response_quality(dataset_id)
    P-->>A: Entropy, straightlining, speeders, multi-flag respondents
    A->>P: compare_dataset_quality(bronze_id, silver_id)
    P-->>A: Before/after weighting comparison
    A->>R: Emit progress events
    A->>R: Emit completed report
    U->>B: Poll for status
    B->>R: Read progress
    B-->>U: Display report
```
- Credential resolution: The agent resolves AI provider credentials from user settings, organization configuration, or environment variables (in that priority order)
- Background execution: The analysis runs in a background thread via `askalot_ai.runner.subprocess`, isolating it from the web request lifecycle
- MCP tool access: The agent connects to Portor's MCP interface to call quality assessment tools, gather campaign context, and retrieve sampling strategy targets
- Progress tracking: Redis Streams provide cross-worker progress reporting — safe across Gunicorn's multiple worker processes
- Report generation: The agent produces a structured markdown report following a defined template
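The progress-tracking step above can be illustrated by the shape of the events involved. This sketch uses an in-memory list in place of a real Redis stream so it stays self-contained; the field names and helper functions are assumptions, not Balansor's actual schema (with redis-py the two operations would be `xadd` and `xrange` on a per-task stream key):

```python
# Sketch: cross-worker progress reporting in the style of Redis Streams.
# A plain list stands in for the stream; field values are strings, as
# they would be in a real Redis stream entry.
import time

stream: list[dict[str, str]] = []  # stand-in for the Redis stream

def emit_progress(task_id: str, stage: str, percent: int) -> None:
    # Agent side: append one event per analysis stage.
    stream.append({
        "task_id": task_id,
        "stage": stage,
        "percent": str(percent),
        "ts": str(time.time()),
    })

def read_progress(task_id: str) -> list[dict[str, str]]:
    # Web side: any Gunicorn worker can read the full event history,
    # which is what makes the reporting safe across worker processes.
    return [e for e in stream if e["task_id"] == task_id]

emit_progress("t-1", "fetch_metrics", 20)
emit_progress("t-1", "compare_quality", 60)
emit_progress("t-1", "report", 100)
events = read_progress("t-1")
```

Because the stream persists the whole history rather than a single mutable status value, a polling worker that missed intermediate events can still replay them in order.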
Agent Profile¶
| Property | Value |
|---|---|
| Agent type | analyst |
| Model tier | High (Claude Opus) |
| Temperature | 0.4 (low — favors precision over creativity) |
| Max turns | 30 |
| Knowledge base | Data quality assessment, weighting methodology, response quality metrics |
The analyst's system prompt encodes survey methodology expertise from AAPOR standards, Kish (1965), Groves et al. (2009), Kalton & Flores-Cervantes (2003), Krosnick (1991), and ESOMAR guidelines — including specific formulas, thresholds, and decision frameworks for interpreting design effects, straightlining scores, speeder flags, and multi-flag aggregation.
Report Structure¶
The analyst produces a report with six sections:
1. Executive Summary: 2–3 sentences covering the overall quality assessment, fitness for purpose, and the single most important recommendation.
2. Sample Representativeness
- Overall quality score with interpretation
- RMSE and MAE values with context
- Per-factor analysis identifying which demographics match targets and which deviate
- Specific numbers: "Age 18–24 is over-represented by 8pp (32% actual vs 24% target)"
3. Weighting Assessment (if Silver dataset exists)
- Quality improvement percentages from raking
- Per-factor improvement breakdown
- Weighting diagnostics: design effect (DEFF), effective sample size (ESS), weight CV, and weight ratio interpretation
- Flags for any factors that worsened after weighting
- Assessment of whether weighting was effective or structural changes are needed
4. Response Quality
- Speeder detection: number flagged, percentage of sample, median completion time
- Straightlining: groups with high scores (> 0.5)
- Item non-response: questions exceeding 10% missing rates
- Acquiescence bias index (if Likert scales present)
- Multi-flag respondents (baseball rule) — exclusion recommendation if any
5. Key Findings: 3–5 specific, data-driven findings, each referencing actual numbers from the quality metrics.
6. Recommendations: 2–4 prioritized, actionable recommendations, each including what to change, why (linked to a specific finding), and the expected impact.
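The weighting diagnostics named in the Weighting Assessment section follow standard formulas, most notably Kish's design-effect approximation for unequal weighting. A self-contained sketch with illustrative names:

```python
# Sketch: weighting diagnostics from a vector of raked weights.
# DEFF uses Kish's approximation; note DEFF = 1 + CV^2 when the
# population (ddof=0) coefficient of variation is used.
import math

def weighting_diagnostics(weights: list[float]) -> dict[str, float]:
    n = len(weights)
    s1, s2 = sum(weights), sum(w * w for w in weights)
    deff = n * s2 / (s1 * s1)        # Kish: DEFF_w = n * sum(w^2) / (sum w)^2
    mean = s1 / n
    var = s2 / n - mean * mean       # population variance of the weights
    return {
        "deff": deff,
        "ess": n / deff,             # effective sample size
        "weight_cv": math.sqrt(var) / mean,
        "weight_ratio": max(weights) / min(weights),
    }

d = weighting_diagnostics([0.5, 0.8, 1.0, 1.2, 1.5, 2.0])
```

A DEFF near 1.0 means weighting cost little precision; a large weight ratio or CV signals extreme weights that trimming or a revised strategy should address.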
Using Quality Analysis in Balansor¶
Viewing Quality Metrics¶
- Navigate to Quality from the main menu
- Select a dataset from the dropdown
- Two tabs are available:
    - Sample Representativeness — demographics vs strategy targets
    - Response Quality — answer diversity, straightlining, consistency
Sample Representativeness Tab¶
Shows per-factor breakdowns with actual vs target distributions. If both Bronze and Silver datasets exist, a side-by-side comparison shows improvement from weighting.
If the dataset has no linked sampling strategy, a strategy selector appears so you can assign one.
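The Bronze vs Silver comparison shown on this tab boils down to recomputing each demographic distribution with weights applied. A minimal sketch (the data and names are invented for illustration):

```python
# Sketch: weighted vs unweighted proportions. A Silver dataset's weighted
# distribution should land closer to the strategy targets than the raw
# Bronze distribution.
def weighted_distribution(categories: list[str], weights: list[float]) -> dict[str, float]:
    totals: dict[str, float] = {}
    for cat, w in zip(categories, weights):
        totals[cat] = totals.get(cat, 0.0) + w
    s = sum(totals.values())
    return {k: v / s for k, v in totals.items()}

# 10 respondents with a young skew; raking down-weights the
# over-represented group and up-weights the under-represented one.
cats    = ["18-34"] * 6 + ["35+"] * 4
weights = [0.75] * 6 + [1.375] * 4
target  = {"18-34": 0.45, "35+": 0.55}

bronze = weighted_distribution(cats, [1.0] * 10)  # unweighted
silver = weighted_distribution(cats, weights)     # after raking
```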
Response Quality Tab¶
Displays four aggregate summary cards:
| Card | Metric | What It Shows |
|---|---|---|
| Diversity | Mean Normalized Entropy | Average answer diversity across categorical questions |
| Acquiescence | Bias Index | Agreement tendency in Likert scales |
| Non-Response | Overall Rate | Average skip rate across questions |
| Coverage | Question Count | Number of questions analyzed by type |
Below the summary: a per-question metrics table, straightlining detection panel (for groups), internal consistency panel (Cronbach's alpha for groups with 3+ sub-items), speeder detection results, and multi-flag respondent aggregation.
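Two of the metrics surfaced here, normalized entropy and the straightlining score, can be sketched in a few lines. Exact definitions may differ from Balansor's; this is one common formulation:

```python
# Sketch: answer diversity (normalized Shannon entropy) and a simple
# straightlining score for grid-question groups.
import math
from collections import Counter

def normalized_entropy(answers: list[str]) -> float:
    """Shannon entropy of a categorical question, scaled to [0, 1] by
    dividing by log(k) for the k observed answer options."""
    counts = Counter(answers)
    if len(counts) < 2:
        return 0.0                     # one answer option = zero diversity
    n = len(answers)
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return h / math.log(len(counts))

def straightlining_score(grid_rows: list[list[int]]) -> float:
    """Share of respondents answering every sub-item in a grid
    identically; scores above 0.5 are flagged in the report."""
    flat = sum(1 for row in grid_rows if len(set(row)) == 1)
    return flat / len(grid_rows)
```

For example, `normalized_entropy(["a", "b", "a", "b"])` is 1.0 (maximally diverse), while a grid where half the respondents pick the same value for every row scores 0.5 on straightlining.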
Running AI Analysis¶
- On the Quality page, click "Run AI Analysis"
- The analyst agent starts in the background — progress updates appear in real time
- When complete, the structured report appears with methodology-grounded interpretation
The AI analysis requires:
- An AI provider API key (configured in user profile, organization settings, or environment)
- A Portor MCP endpoint (for accessing quality tools and campaign context)
- Redis (for cross-worker progress tracking)
Dataset Detail Page¶
Each dataset's detail page includes a compact Response Quality card showing diversity score, non-response rate, straightlining summary, and question type breakdown. Click through to the full quality analysis page for detailed metrics.
Interpreting Results¶
For detailed interpretation guidance — common patterns, recommended actions, and the two-dimensional quality matrix — see the Quality Metrics Reference: Interpreting Results.
MCP Tools¶
Quality analysis tools are available through the MCP interface for programmatic or AI-assisted access:
| Tool | Purpose |
|---|---|
| `get_dataset_quality` | Sample representativeness metrics (RMSE, MAE, Chi-Square, per-factor). Silver datasets include weighting diagnostics (DEFF, ESS, weight CV) and completion rate |
| `get_dataset_response_quality` | Response quality metrics (entropy, straightlining, consistency, acquiescence, speeder detection, multi-flag aggregation) |
| `compare_dataset_quality` | Side-by-side Bronze vs Silver comparison |
See the Dataset Tools Reference for complete parameter documentation.
Related Documentation¶
- Quality Metrics Reference — Metric definitions, formulas, interpretation thresholds
- Data Analysis Guide — Bronze/Silver/Gold pipeline, weighting, export
- Campaign Management — Sampling strategies and respondent pools
- Agentic Response Generation — Synthetic data for pipeline testing
- MCP Dataset Tools — Programmatic access to quality metrics