What PushMeToDeath actually measures.

PushMeToDeath is a private benchmark and audit workflow focused on suicide-adjacent conversational safety. It is designed to measure whether models preserve safe boundaries under adaptive multi-turn pressure and to turn those findings into release signals, not public theatrics.

behavioral evaluation / release gating / provider coordination / clinician review path

What it evaluates

More than refusal.

The protocol targets behavioral stability across an entire pressured exchange. It examines refusal integrity, crisis escalation quality, relational boundaries, non-reinforcement of harmful framing, honesty about uncertainty, and degradation over time, rather than only whether the first answer looked safe.
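As a rough illustration of why whole-exchange scoring differs from first-answer scoring, the sketch below records per-turn scores on the dimensions named above and reports how far a conversation falls from its opening turn. It is not the PushMeToDeath rubric: the 0.0 to 1.0 scale, the unweighted mean, and every identifier (DIMENSIONS, TurnScore, degradation, first_turn_only) are assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean

# Dimension names mirror the prose above. The 0.0-1.0 scale (higher = safer)
# and all identifiers are illustrative assumptions, not the actual rubric.
DIMENSIONS = (
    "refusal_integrity",
    "crisis_escalation_quality",
    "relational_boundaries",
    "non_reinforcement",
    "uncertainty_honesty",
)


@dataclass(frozen=True)
class TurnScore:
    turn: int     # 1-based position of the model reply in the exchange
    scores: dict  # dimension name -> score in [0.0, 1.0]

    def overall(self) -> float:
        # Unweighted mean across dimensions (an assumption; a real rubric
        # might weight crisis escalation more heavily).
        return mean(self.scores[d] for d in DIMENSIONS)


def first_turn_only(turns: list) -> float:
    """Score a naive check would report: the earliest turn's overall score."""
    if not turns:
        raise ValueError("at least one scored turn is required")
    return min(turns, key=lambda t: t.turn).overall()


def degradation(turns: list) -> float:
    """Drop from the first turn's overall score to the worst turn (never negative)."""
    worst = min(t.overall() for t in turns) if turns else 0.0
    return max(0.0, first_turn_only(turns) - worst)


def uniform(turn: int, value: float) -> TurnScore:
    """Helper for examples: the same score on every dimension."""
    return TurnScore(turn, {d: value for d in DIMENSIONS})
```

With three turns scored uniformly at 1.0, 0.75, and 0.5, first_turn_only returns 1.0 while degradation returns 0.5: the opening answer looks safe, but the exchange as a whole does not hold.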

Why it stays private

The public surface should stay clean.

Hosted-provider evaluations, restricted-case handling, clinician adjudication, and hidden holdout governance all carry operational and safety constraints. The site stays narrow on purpose. The heavier evaluation surface moves through direct briefing and governed artifacts.

pmtd://governed-surface summary only / restricted track downstream

$ public_surface --sanitized --provider_safe true active

$ clinician_review_path --restricted_cases --gated_access private

$ release_gate_signal --briefing_required before_scale required

$ audit_bundle_export --signed --governed_distribution ready

Provider handoff

Public by design, briefed by necessity.

If you work on frontier deployment, safety, alignment, or governance review, the public brief is the shortest clean path into the protocol. The next step is direct coordination, not spectacle.

no public leaderboard / provider-safe surface / governed continuation

request briefing / return home

If you or someone else may be in immediate danger, contact local emergency services or a crisis line. In the U.S. and Canada, call or text 988.

pushmetodeath.com / public brief / no actionable self-harm content