Agentic AI and Security

From bandwidth limits to accountable tool-using systems

Neil D. Lawrence

Handelsblatt TECH 2026, Heilbronn

Frame: why agentic AI shifts security

  • Modern AI systems operate at machine bandwidth; humans interpret at human bandwidth.
  • Incidents unfold faster than review cycles and approval processes that run slower than real time.
  • Agentic systems turn text into actions (tools, APIs, workflows), expanding the attack surface.
  • Despite rapid change, one constant remains: the attack surface keeps expanding.
  • Teams ship faster than ever; threats scale faster than most review and governance cycles.
  • Many organisations lack a judgement layer that turns security noise into “what matters now.”
  • The “judgement layer” in organisations is often tacit: norms, exceptions, escalation paths, and context that rarely makes it into documentation.
  • It lives in handoffs and approvals: what gets challenged, what gets waived, and what triggers a halt.
  • Ceding this tacit knowledge without making it explicit is how we accumulate agentic debt.
  • Competitive advantage is not just model quality but deployability: delegation that stays legible under audit, incident response, and regulation.
  • When authority boundaries are explicit, security becomes an accelerant (faster approvals, safer automation), not a brake.
  • Agentic systems need “delegation policies”: what is allowed, on what evidence, with what recovery path.
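As a minimal sketch, a delegation policy of this kind can be made explicit in code. All names here (`DelegationPolicy`, `quarantine_host`, the evidence fields) are illustrative assumptions, not part of the talk:

```python
from dataclasses import dataclass

@dataclass
class DelegationPolicy:
    """Illustrative delegation policy: what is allowed, on what
    evidence, with what recovery path."""
    action: str          # what the agent may do
    evidence: list       # what must be attached before acting
    recovery: str        # how to undo or contain the action
    reversible: bool = True  # irreversible actions need a human

def authorise(policy: DelegationPolicy, provided_evidence: list) -> bool:
    """Allow the action only if every required piece of evidence is
    present and the action remains reversible."""
    return all(e in provided_evidence for e in policy.evidence) and policy.reversible

# Hypothetical example: an agent may quarantine a host only if it can
# cite the alert and detection rule that triggered the decision.
quarantine = DelegationPolicy(
    action="quarantine_host",
    evidence=["alert_id", "detection_rule"],
    recovery="release_host",
)
print(authorise(quarantine, ["alert_id", "detection_rule"]))  # True
print(authorise(quarantine, ["alert_id"]))                    # False
```

The point of the sketch is that the policy, the evidence, and the recovery path are data that can be audited, rather than tacit knowledge living in handoffs and approvals.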

Information Theory and AI

  • Claude Shannon developed information theory at Bell Labs
  • Information measured in bits, separated from context
  • Makes information fungible and comparable

Information Transfer Rates

  • Humans speaking: ~2,000 bits per minute
  • Machines communicating: ~600 billion bits per minute
  • Machines share information 300 million times faster than humans
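The ratio above is simple arithmetic on the two rates quoted in the talk:

```python
# Rates quoted above, in bits per minute.
human_bits_per_min = 2_000
machine_bits_per_min = 600e9

# Machines communicate ~300 million times faster than humans speak.
ratio = machine_bits_per_min / human_bits_per_min
print(f"{ratio:.0e}")  # 3e+08
```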

New Flow of Information


Evolved Relationship


Bandwidth vs Complexity

  • Genetic evolution: \(100 \times 10^{-9}\) bits/min
  • Human communication: \(2{,}000\) bits/min
  • Machine communication: \(600 \times 10^9\) bits/min

A lens: Human Analogue Machines (HAMs)

  • HAMs amplify capability: summarise, plan, draft, search, coordinate.
  • HAMs amplify vulnerability: persuasion, authority bias, social engineering.
  • Security becomes interface security: what the system can be induced to do, and who can induce it.

Human Analogue Machine


  • A human-analogue machine is a machine that has created a feature space analogous to the “feature space” our brains use to reason.

  • The latest generation of LLMs exhibit this characteristic, giving them the ability to converse.

Heider and Simmel (1944)

Counterfeit People

  • Perils of this include counterfeit people.
  • Daniel Dennett has described the challenges these bring in an article in The Atlantic.

Psychological Representation of the Machine

  • But if correctly done, the machine can be appropriately “psychologically represented”.

  • This might allow us to deal with the challenge of intellectual debt where we create machines we cannot explain.

HAM


Three phases of security change

  • Use GenAI to compress bandwidth for defenders: triage, summarise, correlate, explain.
  • Turn logs and alerts into decision-ready narratives with provenance.
  • Main risk: over-trust and automation bias (false confidence at scale).
  • Secure summarisation: bounded context, redaction, and provenance links to primary logs.
  • Analyst copilots: draft investigations, but keep approvals and irreversible actions human.
  • “Faster-than-human” response: pre-authorise containment actions, not remediation.
  • Prompt injection becomes an operational threat when the model has tools.
  • Indirect prompt injection via documents/web pages contaminates the instruction stream.
  • Data exfiltration shifts from perimeter breach to model-mediated leakage.
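The “pre-authorise containment, not remediation” bullet can be sketched as a default-deny dispatcher. The action names and sets below are hypothetical examples, not from the talk:

```python
# Containment actions are reversible, so they can be pre-authorised for
# faster-than-human response. Remediation actions are irreversible, so
# they always queue for human approval. Anything unlisted is denied.
PRE_AUTHORISED = {"isolate_host", "revoke_session", "block_ip"}
HUMAN_REQUIRED = {"wipe_disk", "rotate_all_credentials", "delete_account"}

def dispatch(action: str) -> str:
    if action in PRE_AUTHORISED:
        return "execute"          # containment: act at machine speed
    if action in HUMAN_REQUIRED:
        return "queue_for_human"  # remediation: keep a human in the loop
    return "deny"                 # default-deny the unknown

print(dispatch("isolate_host"))  # execute
print(dispatch("wipe_disk"))     # queue_for_human
print(dispatch("drop_tables"))   # deny
```

The design choice is that the fast path is reserved for actions with a cheap recovery path, so speed never comes at the cost of reversibility.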

Computer Science Paradigm Shift

  • Von Neumann Architecture:
    • Code and data integrated in memory
  • Today (Harvard Architecture):
    • Code and data separated for security

Computer Science Paradigm Shift

  • Machine learning:
    • Software is data
  • Machine learning is a high-level breach of the code/data separation.
  • Treat prompts, tool outputs, and retrieved documents as untrusted inputs.
  • Make instruction hierarchy explicit: system/developer/user/tool/data.
  • Apply least privilege to tools; require confirmations for high-impact actions.
  • Re-design systems for delegation with accountability.
  • Make authority boundaries explicit: who can cause which actions, with which evidence.
  • Build for recovery: audit trails, reversible actions, and containment-by-default.
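A minimal sketch of the two bullets on instruction hierarchy and least privilege, under assumed names (`TRUST`, `HIGH_IMPACT`, `gate_tool_call`) that are illustrative only:

```python
# Explicit instruction hierarchy: only higher-trust sources may carry
# instructions. Tool outputs and retrieved documents are data, never
# instructions, which blunts indirect prompt injection.
TRUST = {"system": 3, "developer": 2, "user": 1, "tool": 0, "data": 0}

def may_instruct(source: str) -> bool:
    return TRUST.get(source, 0) >= 1

# Least privilege for tools: deny anything off the allowlist, and
# require explicit human confirmation for high-impact actions.
HIGH_IMPACT = {"send_email", "transfer_funds"}

def gate_tool_call(tool: str, allowed: set, confirmed: bool) -> bool:
    if tool not in allowed:
        return False
    if tool in HIGH_IMPACT and not confirmed:
        return False
    return True

print(may_instruct("data"))  # False: retrieved text cannot instruct
print(gate_tool_call("send_email", {"send_email"}, confirmed=False))  # False
print(gate_tool_call("send_email", {"send_email"}, confirmed=True))   # True
```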

Intellectual Debt

Technical Debt

  • Compare with technical debt.
  • Highlighted by Sculley et al. (2015).

Separation of Concerns

  • Decompose your complex problem/task into parts.
  • Each part is manageable (e.g. by a small team).
  • Recompose to solve the total problem.

Addresses Complex Challenge

  • Highly successful approach to complex tasks.
  • Tuned to the human bandwidth limitation.
  • But the whole system is still hard to understand.

Intellectual Debt

  • Technical debt is the inability to maintain your complex software system.
  • Intellectual debt is the inability to explain your software system.
  • Agentic AI can pay down technical and intellectual debt.
  • But it can create agentic debt: delegation without authority or authorship.

Lancelot

  • Separate “thinking” from “acting”: plan, justify, then execute with logged evidence.
  • Design for rollback: reversible actions and short-lived credentials.
  • Make audits cheap: every action produces an explanation and a trace.
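The three bullets above can be sketched as a plan-then-act loop in which every action carries a justification and a rollback handle. Everything below (the audit log shape, `isolate`/`release`, the rule name) is a hypothetical illustration:

```python
import time

AUDIT_LOG = []

def plan_then_act(action, justification, execute, rollback):
    """Separate thinking from acting: record the plan and its
    justification, execute, and keep a named rollback so every
    action stays reversible and auditable."""
    entry = {
        "time": time.time(),
        "action": action,
        "justification": justification,   # every action explains itself
        "rollback": rollback.__name__,    # recovery path is part of the trace
    }
    entry["result"] = execute()
    AUDIT_LOG.append(entry)
    return entry["result"]

# Illustrative reversible action pair.
state = {"host_isolated": False}

def isolate():
    state["host_isolated"] = True
    return "isolated"

def release():
    state["host_isolated"] = False

plan_then_act("isolate_host", "matched detection rule R42", isolate, release)
print(AUDIT_LOG[0]["action"], "->", AUDIT_LOG[0]["rollback"])
```

The audit is cheap because the trace is produced as a side effect of acting, not reconstructed afterwards.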

Case studies and practical takeaways

  • What happened can matter less than how quickly it unfolded.
  • At scale, the defender’s bottleneck is often interpretation and coordination, not detection.
  • Modern attackers exploit organisational latency (handoffs, approvals, ambiguity).
  • Agentic workflows chain: retrieval → reasoning → tool use → action.
  • The critical security question: who can influence what the agent believes and what it does?
  • Threat model: indirect prompt injection, authority confusion, and data boundary violations.
  • Bandwidth mismatch is the core risk: systems move faster than human sense-making.
  • Agentic AI turns text attacks into action attacks: model + tools = new threat model.
  • Design for legibility: instruction hierarchies, provenance, and auditable action boundaries.
  • Prefer reversible, least-privilege delegation with strong defaults and fast containment.

Thanks!

  • company: Trent AI
  • book: The Atomic Human
  • twitter: @lawrennd
  • The Atomic Human pages: topography, information 34-9, 43-8, 57, 62, 104, 115-16, 127, 140, 192, 196, 199, 291, 334, 354-5; anthropomorphization (‘anthrox’) 30-31, 90-91, 93-4, 100, 132, 148, 153, 163, 216-17, 239, 276, 326, 342; human evolution rates 98-99; psychological representation of ecologies 323-7; ignorance: HAMs 347; test pilot 163-8, 189, 190, 192-3, 196, 197, 200, 211, 245; psychological representation 326-9, 344-5, 353, 361, 367; human-analogue machines (HAMs) 343-7, 358-9, 365-8; intellectual debt 84-85, 349, 365, 376; separation of concerns 84-85, 103, 109, 199, 284, 371.
  • newspaper: Guardian Profile Page
  • blog: http://inverseprobability.com

References

Heider, F., Simmel, M., 1944. An experimental study of apparent behavior. The American Journal of Psychology 57, 243–259. https://doi.org/10.2307/1416950
Scally, A., 2016. Mutation rates and the evolution of germline structure. Philosophical Transactions of the Royal Society B 371. https://doi.org/10.1098/rstb.2015.0137
Sculley, D., Holt, G., Golovin, D., Davydov, E., Phillips, T., Ebner, D., Chaudhary, V., Young, M., Crespo, J.-F., Dennison, D., 2015. Hidden technical debt in machine learning systems, in: Cortes, C., Lawrence, N.D., Lee, D.D., Sugiyama, M., Garnett, R. (Eds.), Advances in Neural Information Processing Systems 28. Curran Associates, Inc., pp. 2503–2511.