Wiser Human Blog

Does providing an escalation channel for models change their internal activations? [Experiment results]
Adding an escalation channel to an agentic misalignment scenario reduced desperation activations from the model's first tokens and cut blackmail from…
Jun 9 • Francesca Gomez

May 2026

Can the design of an AI agent's decision environment reduce unsanctioned behaviour?
My plan to test and design inference-time controls that shape the behaviour distribution of AI agents
May 19 • Francesca Gomez

April 2026

Is it time for frontier AI developers to start adopting Operational Risk Management?
Five incidents in two months at Anthropic suggest the AI model developer has a process problem: operational risk management is designed to address this
Apr 27 • Francesca Gomez

October 2025

Can we steer AI models toward safer actions by making these instrumentally useful?
An empirical study adapting and testing insider risk mitigations for Agentic Misalignment
Oct 22, 2025 • Francesca Gomez
2050: Who's in Control?
‘2050: Who’s in Control?’ is a game built using the ArcWeave platform for people to explore choices and power structures in a world shaped by advanced…
Oct 5, 2025 • Francesca Gomez

April 2025

We need a harm severity impact scale for loss of control
Existing scales for harmful outcomes focus on deaths, dollars and disruptions, but miss human autonomy, agency and reversibility.
Apr 7, 2025 • Francesca Gomez
© 2026 Francesca Gomez