What are cohort stats?

After candidates complete their assessments, Promptster computes aggregate metrics across the cohort. You can see how each candidate compares to everyone else who took the same assessment — useful for calibrating hiring decisions and spotting outliers.
For cohort purposes, a session counts as completed once it reaches a terminal status: completed, expired, or hired all qualify. Sessions still in progress are excluded.
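The inclusion rule can be sketched in Python; the session shape and field names below are illustrative, only the status values come from this page:

```python
# Terminal statuses that count toward cohort stats.
INCLUDED_STATUSES = {"completed", "expired", "hired"}

def cohort_sessions(sessions):
    """Keep only the sessions that contribute to cohort stats."""
    return [s for s in sessions if s["status"] in INCLUDED_STATUSES]

sessions = [
    {"id": "a", "status": "completed"},
    {"id": "b", "status": "in_progress"},  # still running: excluded
    {"id": "c", "status": "expired"},
]
print(len(cohort_sessions(sessions)))  # 2
```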

Get cohort stats for an assessment

GET /v1/assessments/:id/cohort-stats

Returns aggregate performance metrics for all completed sessions in the assessment. If you pass a sessionId query parameter, the response also includes percentile ranks for that specific candidate. If no completed sessions exist yet, the response returns { "cohortStats": null }.

Query parameters

sessionId
string
UUID of a specific session. When provided, the response includes a percentiles object showing how that session ranks within the cohort.

Example request

curl "https://api.promptster.ai/v1/assessments/ASSESSMENT_ID/cohort-stats?sessionId=SESSION_ID" \
  -H "Authorization: Bearer <your-token>"
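
The same request can be sketched in Python. The base URL and sessionId parameter come from this page; the helper names and the use of urllib are illustrative, not part of any official SDK:

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.promptster.ai/v1"

def cohort_stats_url(assessment_id, session_id=None):
    """Build the cohort-stats URL shown in the curl example above."""
    url = f"{API_BASE}/assessments/{assessment_id}/cohort-stats"
    if session_id is not None:
        url += "?" + urllib.parse.urlencode({"sessionId": session_id})
    return url

def fetch_cohort_stats(token, assessment_id, session_id=None):
    """Perform the GET request with a bearer token (not executed here)."""
    req = urllib.request.Request(
        cohort_stats_url(assessment_id, session_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```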

Example response

{
  "cohortStats": {
    "sessionCount": 12,
    "metrics": {
      "avgDurationMs": 4320000,
      "avgPromptCount": 24.5,
      "avgCommandFailRate": 0.18,
      "avgVerifyIntensity": 1.4,
      "avgManualEditRatio": 0.12,
      "avgFirstChangeLatencyMs": 185000,
      "testPassRate": 75
    },
    "percentiles": {
      "durationMs": 0.82,
      "promptCount": 0.65,
      "commandFailRate": 0.78,
      "verifyIntensity": 0.91,
      "manualEditRatio": 0.55,
      "firstChangeLatencyMs": 0.70
    }
  }
}
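
A minimal sketch of consuming this response, including the documented cohortStats: null case; the summarize helper is illustrative:

```python
import json

def summarize(payload):
    """Turn a cohort-stats response into a one-line summary."""
    stats = payload.get("cohortStats")
    if stats is None:
        return "no completed sessions yet"
    metrics = stats["metrics"]
    return (f"{stats['sessionCount']} sessions, "
            f"{metrics['testPassRate']}% passed all tests")

print(summarize(json.loads('{"cohortStats": null}')))
# -> no completed sessions yet

# A payload shaped like the example response above:
example = {"cohortStats": {"sessionCount": 12, "metrics": {"testPassRate": 75}}}
print(summarize(example))
# -> 12 sessions, 75% passed all tests
```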

Response fields

cohortStats
object | null
Aggregate cohort data. Returns null if there are no completed sessions for this assessment.
cohortStats.sessionCount
integer
Total number of completed sessions included in the cohort.
cohortStats.metrics.avgDurationMs
integer | null
Average session duration in milliseconds across the cohort. null if no duration data is available.
cohortStats.metrics.avgPromptCount
number
Average number of prompts submitted per session.
cohortStats.metrics.avgCommandFailRate
number
Average ratio of failed terminal commands to total commands (0.0–1.0). Lower is better.
cohortStats.metrics.avgVerifyIntensity
number
Average verification intensity, measured as terminal commands run per file change. A higher value indicates more frequent verification behavior.
cohortStats.metrics.avgManualEditRatio
number
Average ratio of manual (non-AI-generated) edits to total edits (0.0–1.0).
cohortStats.metrics.avgFirstChangeLatencyMs
integer | null
Average time in milliseconds from session start to the first file change. null if no latency data is available.
cohortStats.metrics.testPassRate
integer
Percentage of sessions (0–100) where all tests passed at submission time.
cohortStats.percentiles
object
Percentile ranks for a specific session, returned only when sessionId is provided and the cohort has at least 2 completed sessions. Each value is between 0.0 and 1.0, where 1.0 means the candidate outperformed all others in the cohort on that metric.
cohortStats.percentiles.durationMs
number | null
Percentile rank for session duration. Higher means the candidate finished faster than most of the cohort.
cohortStats.percentiles.promptCount
number | null
Percentile rank for prompt count. Higher means the candidate used fewer prompts than most of the cohort.
cohortStats.percentiles.commandFailRate
number | null
Percentile rank for command fail rate. Higher means the candidate had fewer failed commands than most of the cohort.
cohortStats.percentiles.verifyIntensity
number | null
Percentile rank for verification intensity. Higher means the candidate verified their work more frequently than most of the cohort.
cohortStats.percentiles.manualEditRatio
number | null
Percentile rank for manual edit ratio. Higher means the candidate made fewer manual edits than most of the cohort.
cohortStats.percentiles.firstChangeLatencyMs
number | null
Percentile rank for first-change latency. Higher means the candidate’s latency was longer than most of the cohort (they spent more time exploring before making changes). Interpret in context — some tasks warrant more upfront exploration than others.
Percentile ranks are only computed when the cohort has at least 2 completed sessions.
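
Promptster's exact percentile formula isn't documented on this page. A definition consistent with the field descriptions above (1.0 means the candidate outperformed everyone else, and lower raw values rank higher for metrics like durationMs or commandFailRate) could be sketched as follows; the function name and tie handling are assumptions:

```python
def percentile_rank(value, others, lower_is_better=True):
    """Fraction of the other cohort members this session outperforms.

    `others` holds the same metric for every other completed session,
    so a non-None result requires a cohort of at least 2 sessions,
    matching the note above. Ties count as not outperformed.
    """
    if not others:
        return None
    if lower_is_better:
        beaten = sum(1 for o in others if value < o)
    else:
        beaten = sum(1 for o in others if value > o)
    return beaten / len(others)

# Fastest of three sessions ranks 1.0 on durationMs:
print(percentile_rank(3_500_000, [4_320_000, 5_000_000]))  # 1.0
# verifyIntensity is higher-is-better:
print(percentile_rank(1.9, [1.4, 0.8], lower_is_better=False))  # 1.0
```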