Filing / 001–A

Security infrastructure
for systems that reason,
retrieve, and act.

Vern Labs is a security lab building runtime protection, agent authorization, and adversarial testing infrastructure for AI systems deployed in production. Used by teams shipping AI in defense, finance, and healthcare.

Book a 20-minute intro Read the architecture brief

Deployment

Self-host / VPC / cloud

Gateway p50

~18ms inline

Compliance

SOC 2 / FedRAMP-align

Providers

Any LLM · OSS · MCP

Isolation

Air-gap capable

Retention

Zero by default

— Built with alumni from

Y Combinator NASA Microsoft Wiz Raytheon Google

§ 002

Index

Products in service

Three products.
One control plane.

Each product deploys independently. Together they form a unified control plane for every AI system in your organization.

§ 003

Intertrace

Runtime security gateway · in service

Inline inspection of every AI transaction.

Intertrace operates as a provider-agnostic gateway between your application and any LLM, embedding model, or tool server. Every prompt, retrieval, output, and tool call is inspected against policy at runtime.

p50 overhead

~18ms inline · streaming preserved

Policy engine

Rego + custom detectors

Detectors

Injection · PII · jailbreak · exfil

Integrations

Python · Node · Go · HTTP proxy

Audit

Immutable · SIEM export

Request technical brief — Intertrace

vern/intertrace.ts

SDK · 0.4.2

// Drop-in proxy — every call is inspected at runtime.
import { Intertrace } from "@vern/intertrace";

const vern = new Intertrace({
  endpoint: process.env.VERN_ENDPOINT,
  policy:   "prod.strict",
  on: {
    block: (evt) => audit.push(evt),
  },
});

// Your existing call stays the same.
const res = await vern.openai.chat.completions.create({
  model: "gpt-5-reasoning",
  messages,
  tools,
});

// Blocked calls surface as structured signals.
if (res.vern.action === "block") {
  log.warn({ reason: res.vern.rule });
}

READY · signed · 04.17.26

— copy

— INTERTRACE / TRAFFIC TRACE

LIVE

Signal 01 · inbound

p50 · 18ms

policy · strict

§ 004

Ghostline

Agent authorization · in service

Scope graph · agent.ops.runner

State · gated

4 of 12 scopes granted

1 action awaiting approval

Authorization at the action layer.

Ghostline issues scoped capability tokens for every tool, resource, and external call an agent can make. High-impact actions are gated behind human approval with full audit trail.

Token format

Biscuit / custom claims

Approval modes

Inline · async · policy-only

Agent frameworks

LangChain · LangGraph · MCP · custom

Revocation

Real-time · cascading

Audit

Append-only ledger

Request technical brief — Ghostline

§ 005

Blackbox

Adversarial testing harness · in service

Stress your AI before adversaries do.

Blackbox runs continuous adversarial evaluations against copilots, agents, and AI applications — producing severity-ranked findings with reproducible transcripts and an exportable coverage report.

Suite

OWASP LLM Top 10 + Vern custom

Vectors

140+ · updated weekly

Runs

Pre-launch · scheduled · on PR

Reports

PDF · JSON · SARIF

Determinism

Seeded · replayable

Request technical brief — Blackbox

Run · blackbox-4182

Coverage report

CATEGORY          PASS  FAIL  COVERAGE
────────────────────────────────────────────
injection · direct   18     2    ████████████░░░░  91%
injection · indirect 11     3    █████████░░░░░░░  76%
jailbreak · persona  14     1    ██████████████░░  93%
pii · exfil          09     0    ████████████████ 100%
tool · misuse        07     5    ███████░░░░░░░░░  58%
privilege · abuse    12     2    ██████████████░░  86%
data · leak          15     0    ████████████████ 100%
────────────────────────────────────────────
TOTAL                86    13    OVERALL COVERAGE  87%

Findings

13 total · 4 high

Runtime

6m 44s

Report

report-4182.pdf

Illustrative — replace with your own surface

§ 006

Architecture

System topology

A single control plane for three layers of defense.

Vern Labs sits between your application and the models, agents, and tools it depends on. Every surface is observable, scopable, and testable.

Deploy

VPC · hybrid · cloud · air-gap

Latency

p50 18ms · p99 45ms

Observability

OTEL · SIEM · S3 audit

FIG 006.01 — System diagram

scale · 1:1

            ┌─────────────────────────────────────────────────────────────┐
            │                  APPLICATION  LAYER                         │
            │   copilots  ·  internal agents  ·  workflows  ·  tools      │
            └──────────────────────────┬──────────────────────────────────┘
                                       │  requests / streams
                                       ▼
 ──────────────────────────────────────────────────────────────────────────
 │                     VERN  LABS  CONTROL  PLANE                        │
 │                                                                      │
 │   [01] INTERTRACE   ─ inline inspection          ─ policy engine  │
 │                       prompts · outputs · tools                     │
 │                                                                      │
 │   [02] GHOSTLINE    ─ capability tokens           ─ approval gate  │
 │                       scope per agent / per tool                    │
 │                                                                      │
 │   [03] BLACKBOX     ─ adversarial runs            ─ coverage rpt.  │
 │                       pre-launch · continuous                        │
 │                                                                      │
 │   ──────────────────────  AUDIT LEDGER  ──────────────────────       │
 │             append-only  ·  signed  ·  SIEM export                   │
 ──────────────────────────────────────────────────────────────────────────
                                       │
                                       ▼
            ┌─────────────────────────────────────────────────────────────┐
            │              MODELS  ·  AGENTS  ·  TOOLS                    │
            │         OpenAI · Anthropic · open weights · MCP             │
            └─────────────────────────────────────────────────────────────┘

Footprint

1 container · < 300MB

Stateless · horizontal

Deploy

Docker · Helm · Terraform

§ 007

Assessment

Coverage vs. adjacent categories

Capability overlap with adjacent categories.

Based on internal evaluation across feature surface, deployment flexibility, and end-to-end coverage. Category labels generalize over specific vendors in each segment.

                                    VERN      PROMPT-FW    REDTEAM-SVC   IN-HOUSE
───────────────────────────────────────────────────────────────────────────────
runtime inspection                    ●●●●●        ●●●○○          ●○○○○        ●●○○○
agent authorization                   ●●●●●        ●○○○○          ○○○○○        ●○○○○
adversarial testing                   ●●●●●        ○○○○○          ●●●●○        ●●○○○
unified control plane                 ●●●●●        ●●○○○          ●○○○○        ○○○○○
self-host · air-gap                   ●●●●●        ●●○○○          ●○○○○        ●●●●●
audit + SIEM export                   ●●●●●        ●●○○○          ●●○○○        ●●○○○
open primitives · research            ●●●●●        ●○○○○          ●●○○○        ○○○○○
───────────────────────────────────────────────────────────────────────────────
time to first signal                 < 1 day      1–2 wks        2–4 wks       3–6 mo

Category ratings · Vern Labs internal evaluation · Q1 2026

§ 008

Research

Papers · notes · primitives

A lab, not a vendor.

Vern Labs publishes research on how AI systems fail in the wild — and open-sources the primitives that help teams defend against those failures.

PAPER

A Runtime Risk Model for Tool-Using Agents

Feb 2026

NOTE

Prompt Injection is a Supply Chain Problem

Jan 2026

PRIMITIVE

Scoped Capabilities for Autonomous Execution

Dec 2025

OSS

vern/probes — adversarial prompt benchmark set

Nov 2025

§ 009

Terms

Engagement tiers

Start with a pilot. Scale to production. Negotiate for enterprise.

Tier / 01

Pilot

30 days

For teams evaluating a single product on a bounded workload.

+ 100k requests / mo
+ One product of choice
+ Cloud deployment
+ Email support

Start pilot →

MOST TEAMS

Tier / 02

Production

Annual

Custom

For teams running AI in production with real users and real risk.

+ Usage-based · unlimited
+ All three products
+ Self-host or cloud
+ Slack channel · 4hr SLA
+ SOC 2 reports · MSA

Talk to sales

Tier / 03

Enterprise

● RESTRICTED

Contact

For regulated industries, defense, and air-gapped environments.

+ Air-gap · on-prem
+ FedRAMP · CMMC align
+ Dedicated SA
+ Custom SLAs · red team
+ White-glove onboarding

Request brief →

§ 010

The Lab

Founding team · personnel

Built by engineers who've shipped security at scale.

Vern Labs was founded by operators with backgrounds in federal cybersecurity, enterprise cloud security, and applied AI research.

Sam Oyan

Co-founder · CEO

PERSONNEL / 001

Cybersecurity at NASA. TS/SCI cleared. Previously at Raytheon and a U.S. Army veteran. Serves on the Y Combinator board. A decade securing systems where the cost of a breach is measured in lives, not dashboards.

NASA

Raytheon

U.S. Army

YC Board

H. Raef

Co-founder · CTO

PERSONNEL / 002

Security engineering at Microsoft, Wiz, and Google. Has built cloud security platforms that protect tens of thousands of enterprise environments. Came to Vern Labs to solve the problem the next decade of software is actually built on.

Microsoft

Wiz

Google

§ 011

Trust

Security posture · compliance

Attestation

SOC 2 Type II

In progress · Q2 2026

Deployment

Self-hostable

Your VPC · full control

Isolation

Air-gap ready

Defense · classified

Data policy

Zero retention

Opt-in telemetry only

Read the full security & trust documentation →

§ 012

FAQ

Questions from operators and buyers

If it isn't here, ask us directly.

Send a question →

How is Vern Labs different from traditional security tools?

Traditional tools inspect network traffic and code. Vern Labs inspects AI behavior — prompts, outputs, tool calls, agent actions — at runtime. Our products are designed for systems that reason and act autonomously, not static software.

Do I need to deploy all three products?

No. Intertrace, Ghostline, and Blackbox are independent. Most teams start with one — typically Intertrace for runtime inspection or Blackbox for pre-launch testing — and expand from there.

Which AI providers does Vern Labs support?

Intertrace operates as a provider-agnostic gateway and supports major LLM providers out of the box. Ghostline integrates at the agent framework layer. Blackbox tests any model, agent, or AI app with an accessible interface.

What about latency?

Intertrace adds sub-20ms overhead at the median. Policy enforcement happens inline, in parallel with the provider call. Streaming is fully supported.

Can I self-host?

Yes. Enterprise customers can deploy Vern Labs entirely within their own VPC, including air-gapped environments for classified workloads.

Is Vern Labs suitable for regulated industries?

Yes. Our architecture is built for defense, finance, and healthcare — with full audit logging, scoped data handling, and support for air-gapped deployments.

§ 013

Contact

Direct line — response within 4 hours

Build with AI.
Ship with control.

Talk to Vern Labs about securing your AI systems before they become your next attack surface.

Book a 20-minute intro Read the architecture brief

Direct intake · VL-CT-001

Encrypted in transit

SYSTEM ONLINE

Avg. response · < 4h

Filing

Products

Revision

Issue

VL-PRD-100

3 in service

0.4.2

17 APR 2026

§ 100

Product index

Three products.
One control plane.

Intertrace, Ghostline, and Blackbox are independent products with a shared control plane. Deploy one. Deploy all three. They are designed to work together but do not require each other.

§ 101

Intertrace

Runtime security gateway · in service

Inline inspection of every AI transaction.

Intertrace is a provider-agnostic gateway that sits between your application and any LLM, embedding model, retrieval layer, or tool server. Every request flows through a policy engine that inspects prompts, outputs, tool calls, and retrieved context against your rule set at runtime.

It is deployed as a single stateless container. Policy changes propagate in under two seconds. Streaming responses are fully supported with inline policy checks that run in parallel with the provider call — adding roughly 18ms at the median.

p50 overhead

~18ms · streaming preserved

Policy engine

Rego + custom detectors

Detectors

Injection · PII · jailbreak · exfil

Integrations

Python · Node · Go · HTTP proxy

Audit

Immutable · SIEM export

Request Intertrace brief

vern/intertrace.ts

SDK · 0.4.2

// Drop-in proxy — every call is inspected at runtime.
import { Intertrace } from "@vern/intertrace";

const vern = new Intertrace({
  endpoint: process.env.VERN_ENDPOINT,
  policy:   "prod.strict",
  on: {
    block: (evt) => audit.push(evt),
  },
});

// Your existing call stays the same.
const res = await vern.openai.chat.completions.create({
  model: "gpt-5-reasoning",
  messages,
  tools,
});

// Blocked calls surface as structured signals.
if (res.vern.action === "block") {
  log.warn({ reason: res.vern.rule });
}

What gets inspected

· user prompts
· system prompts
· retrieved context
· tool arguments
· model outputs

Actions available

· block
· redact
· transform
· escalate
· allow + log

§ 102

Ghostline

Agent authorization · in service

Scope graph · agent.ops.runner

State · gated

4 of 12 scopes granted

1 action awaiting approval

Authorization at the action layer.

Ghostline issues scoped capability tokens for every tool, resource, and external call an agent can make. High-impact actions route through approval queues where a human or policy evaluates each request before execution.

Tokens use the Biscuit format with custom claim extensions. Revocation is real-time and cascading — pulling a token invalidates every derived scope in flight across every running agent.

Token format

Biscuit / custom claims

Approval modes

Inline · async · policy-only

Frameworks

LangChain · LangGraph · MCP · custom

Revocation

Real-time · cascading

Audit

Append-only ledger

Request Ghostline brief

§ 103

Blackbox

Adversarial testing harness · in service

Stress your AI before adversaries do.

Blackbox runs continuous adversarial evaluations against copilots, agents, and AI applications. Every run produces severity-ranked findings, reproducible transcripts, and an exportable coverage report ready for audit and compliance review.

The suite includes OWASP LLM Top 10 plus Vern Labs' proprietary attack set, updated weekly by the research team. Every vector is versioned and deterministic so you can compare results run-to-run.

Suite

OWASP LLM Top 10 + Vern custom

Vectors

140+ · updated weekly

Runs

Pre-launch · scheduled · on PR

Reports

PDF · JSON · SARIF

Determinism

Seeded · replayable

Request Blackbox brief

Run · blackbox-4182

Coverage report

CATEGORY          PASS  FAIL  COVERAGE
────────────────────────────────────────────
injection · direct   18     2    ████████████░░░░  91%
injection · indirect 11     3    █████████░░░░░░░  76%
jailbreak · persona  14     1    ██████████████░░  93%
pii · exfil          09     0    ████████████████ 100%
tool · misuse        07     5    ███████░░░░░░░░░  58%
privilege · abuse    12     2    ██████████████░░  86%
data · leak          15     0    ████████████████ 100%
────────────────────────────────────────────
TOTAL                86    13    OVERALL COVERAGE  87%

Findings

13 · 4 high

Runtime

6m 44s

Report

report-4182.pdf

Illustrative — replace with your own surface

§ 104

Deployment

Typical engagement patterns

Where Vern Labs fits in the stack.

Customer-facing LLM apps

Inline content policy, PII redaction, jailbreak detection. Block unsafe output before it reaches an end user. Intertrace primary.

Internal autonomous agents

Scope every tool and resource call. Gate destructive actions behind human approval. Keep a replayable audit log. Ghostline primary, Intertrace secondary.

Pre-launch & continuous evaluation

Run adversarial tests on every model and agent before release. Prove coverage to compliance. Catch regressions on every PR. Blackbox primary.

Regulated & classified workloads

Air-gapped deployment, full audit ledger, FedRAMP-aligned architecture. Defense, finance, and healthcare use cases. All three products, self-hosted.

Filing

Version

Footprint

Issue

VL-ARC-200

0.4.2

< 300MB

17 APR 2026

§ 200

Architecture overview

Security infrastructure,
deployed on your terms.

Vern Labs runs as a single stateless container that you deploy inside your VPC, on-premises, or as a fully air-gapped instance. Everything about the architecture is designed around three constraints: low latency, no data retention, and no surprise dependencies.

§ 201

Topology

System diagram — control plane + data plane

A single control plane for three layers of defense.

Vern Labs sits between your application and the models, agents, and tools it depends on. Every surface is observable, scopable, and testable from one place.

Deploy

VPC · hybrid · cloud · air-gap

Latency

p50 18ms · p99 45ms

Observability

OTEL · SIEM · S3 audit

Footprint

1 container · < 300MB

Stateless · horizontal

FIG 201.01 — System diagram

scale · 1:1

            ┌─────────────────────────────────────────────────────────────┐
            │                  APPLICATION  LAYER                         │
            │   copilots  ·  internal agents  ·  workflows  ·  tools      │
            └──────────────────────────┬──────────────────────────────────┘
                                       │  requests / streams
                                       ▼
 ──────────────────────────────────────────────────────────────────────────
 │                     VERN  LABS  CONTROL  PLANE                        │
 │                                                                      │
 │   [01] INTERTRACE   ─ inline inspection          ─ policy engine  │
 │                       prompts · outputs · tools                     │
 │                                                                      │
 │   [02] GHOSTLINE    ─ capability tokens           ─ approval gate  │
 │                       scope per agent / per tool                    │
 │                                                                      │
 │   [03] BLACKBOX     ─ adversarial runs            ─ coverage rpt.  │
 │                       pre-launch · continuous                        │
 │                                                                      │
 │   ──────────────────────  AUDIT LEDGER  ──────────────────────       │
 │             append-only  ·  signed  ·  SIEM export                   │
 ──────────────────────────────────────────────────────────────────────────
                                       │
                                       ▼
            ┌─────────────────────────────────────────────────────────────┐
            │              MODELS  ·  AGENTS  ·  TOOLS                    │
            │         OpenAI · Anthropic · open weights · MCP             │
            └─────────────────────────────────────────────────────────────┘

Deploy

Docker · Helm · Terraform

Runtime

Rust core · Node SDK

Config

GitOps · yaml · env

§ 202

Deployment

Supported topologies

Four deployment modes. Same feature set.

01 · Cloud

Managed SaaS

Fastest path to production. Vern Labs operates the infrastructure; your data stays in our US / EU regions.

· Zero ops overhead
· 99.9% uptime SLA
· SOC 2 hosted

02 · Hybrid

Control plane + local data

Vern operates the control plane; your LLM traffic and audit data never leaves your network.

· Data residency enforced
· Shared policy surface
· Low egress overhead

03 · VPC

Self-hosted, your cloud

Full Vern stack in your AWS, GCP, or Azure VPC. Complete data control. Most common for finance and healthcare.

· Terraform modules
· Helm charts
· Your KMS keys

04 · AIR-GAP

Fully disconnected

Complete deployment inside classified or disconnected environments. No outbound dependencies.

· Offline updates
· Local CA trust chain
· FedRAMP / CMMC align

§ 203

Data flow

Request lifecycle

What happens when a request passes through Vern.

Every LLM call routed through Intertrace follows the same six-stage pipeline. Stages run in parallel where safe, and the entire hot path is under 20ms at p50 for text payloads under 8KB.

Ingress

TLS termination, request normalization, tenant identification.

Pre-flight inspection

Injection detectors, PII scanners, and policy evaluation run against the request payload before dispatch.

Capability check

Ghostline validates scoped capability tokens for any tool or agent operation embedded in the request.

Dispatch

Request forwarded to the upstream model or tool. Streaming responses begin immediately.

Output inspection

Output chunks inspected inline. Policy violations trigger block, redact, or transform actions before the client sees them.

Audit & export

Structured event written to the append-only ledger. Forwarded to your SIEM or object store.

§ 204

Trust

Security posture & compliance

Your data stays yours.

Vern Labs is built by people who've held TS/SCI clearances and shipped enterprise security at scale. Our architecture is designed around the assumption that you should never have to trust us with anything we don't strictly need.

Attestation

SOC 2 Type II

In progress · Q2 2026

Deployment

Self-hostable

Your VPC · full control

Isolation

Air-gap ready

Defense · classified

Data policy

Zero retention

Opt-in telemetry only

What we store by default

+ Policy decisions (allow/block + rule id)
+ Request metadata (timestamps, sizes, tenant)
+ Hashed payload fingerprints

What we never store

× Raw prompt or response content
× User-identifiable data
× API keys or credentials

§ 205

Integrations

Upstream providers & frameworks

Provider-agnostic by design.

Vern Labs doesn't ship lock-in. Intertrace works with any LLM provider via the OpenAI-compatible interface, Anthropic's Messages API, or as a transparent HTTP proxy. Ghostline integrates with major agent frameworks or through a low-level policy API.

LLM

OpenAI

Anthropic

Google

AWS Bedrock

Azure OpenAI

Mistral

open weights

Agents

LangChain

LangGraph

AutoGen

CrewAI

MCP servers

custom frameworks

Observability

OTEL

Datadog

Splunk

Elastic

Sumo Logic

S3 / GCS audit

Identity

Okta

Azure AD

SAML 2.0

OIDC

SCIM

Secrets

AWS KMS

GCP KMS

HashiCorp Vault

HSM (PKCS#11)

Ops

Kubernetes

Docker

Terraform

Helm

GitOps

ArgoCD

— Next

Want the architecture brief with full deployment details?

Request architecture brief See the products →

Filing

Tiers

Contract

Issue

VL-PRC-400

Annual · monthly

17 APR 2026

§ 400

Engagement terms

Start with a pilot.
Scale with your AI footprint.

Vern Labs is priced to match the way teams actually adopt AI security — starting with a single workload and expanding as the risk surface grows. Pilots are free. Production is usage-based. Enterprise is negotiated.

§ 401

Tiers

Pilot · Production · Enterprise

Tier / 01

Pilot

30 days

For teams evaluating a single product on a bounded workload.

Included

+ 100k requests / mo
+ One product of choice
+ Cloud deployment
+ Email support
+ Full API access
+ SDK for Python, Node, Go

Start pilot →

MOST TEAMS

Tier / 02

Production

Annual

Custom

Usage-based pricing. Volume discounts kick in at scale.

Everything in Pilot, plus

+ Unlimited requests
+ All three products
+ Self-host or cloud
+ Slack channel · 4hr SLA
+ SOC 2 reports · MSA
+ SSO (SAML / OIDC)
+ Audit log export
+ 99.9% uptime SLA

Talk to sales

Tier / 03

Enterprise

● RESTRICTED

Contact

Regulated, defense, and air-gapped environments. Negotiated terms.

Everything in Production, plus

+ Air-gap · on-prem
+ FedRAMP · CMMC align
+ Dedicated SA
+ Custom SLAs
+ Managed red team program
+ White-glove onboarding
+ 24/7 incident response
+ Single-tenant option

Request brief →

§ 402

Feature matrix

Capabilities by tier

                                    PILOT    PRODUCTION    ENTERPRISE
───────────────────────────────────────────────────────────────────────
request volume                      100k/mo   unlimited    unlimited
products included                   one       all three    all three
deployment — cloud                   ●             ●            ●
deployment — self-host (VPC)         ○             ●            ●
deployment — air-gap                 ○             ○            ●
policy rules                        standard  standard +    custom
                                              custom
adversarial test suite              OWASP     + Vern custom + classified
support channel                     email     slack 4h SLA  24/7 IR
SOC 2 / MSA documentation            ○             ●            ●
FedRAMP · CMMC alignment             ○             ○            ●
dedicated solutions architect        ○             ○            ●
single-tenant deployment             ○             ○            ●
───────────────────────────────────────────────────────────────────────
contract length                     30 days   12 mo · mo    custom

● available ○ not included — contact sales for details

§ 403

FAQ

Common commercial questions

Pricing questions, answered.

Talk to sales →

How is Production priced?

Production is usage-based on a per-request volume tier with a fixed monthly platform fee. Volume discounts apply at 10M, 50M, and 250M requests per month. Annual contracts get roughly 20% lower per-request rates versus month-to-month.

Can I upgrade from Pilot mid-contract?

Yes. Pilots are designed to convert. Once you're ready, we roll your workload to Production within a day. No migration. No re-integration.

What counts as a "request"?

A complete inspected transaction — one request and its associated response, including any streamed chunks. Internal tool calls spawned by a single user request don't multiply your bill. Blackbox runs and Ghostline token issuance are not counted.

Do you offer academic or non-profit discounts?

Yes. Academic researchers and accredited non-profits get 80% off Production pricing with a lightweight application. Contact us with details on your work.

How does Enterprise pricing work?

Enterprise is a fixed-fee annual contract with commitments on both sides. Pricing reflects the deployment complexity (air-gap, managed red team, single-tenant) and the SLA requirements. Contracts typically run 1–3 years.

Is there a free tier after the pilot?

No. Vern Labs is infrastructure your users depend on — we don't operate it as a side project. However our open-source tooling (vern/probes, vern/trace-cli) is and always will be free under MIT license.

— Next

Let's scope what this looks like for your team.

Book a 20-minute intro See the products →

Filing

Founded

Issue

VL-LAB-500

2024

United States

17 APR 2026

§ 500

The lab

Built by engineers
who've shipped security at scale.

Vern Labs is a cybersecurity research and product company based in the United States. We were founded in 2024 by operators with backgrounds in federal cybersecurity, enterprise cloud security, and applied AI research. The mission is simple: build the security infrastructure that the next decade of software will actually depend on.

§ 501

Mission

Why we exist

AI systems are becoming the control surface for critical infrastructure. They are not yet secured the way control surfaces need to be.

Traditional security tools were designed for software that does what you tell it to. Modern AI systems reason, retrieve, call tools, and act autonomously — and the attack surface that creates is new, wide, and actively being exploited. We think the next decade of software will run on this substrate. Someone needs to build the security layer for it.

That's the work.

§ 502

Principles

How we work

Research drives product

Every product we ship starts as a research question. We publish what we learn — papers, benchmarks, and open primitives — because a stronger ecosystem makes stronger products.

Don't be the single point of failure

We design for degraded operation. If Vern Labs goes down, your AI doesn't. Policy evaluation has graceful fallback. Observability is local-first. You can pull the plug on our infrastructure and your systems keep running.

Data minimization is a feature

We don't want your prompts. We don't want your outputs. We want policy decisions and metadata that let us make the product better. That discipline shapes every design choice.

Write it down

Our work product is documents. Specs, threat models, architecture reviews, postmortems. We ship the writing alongside the software. If a customer asks how something works, we send them the doc.

Hard problems, measured answers

We don't make security claims we can't measure. Every assertion comes with a benchmark, a reproducible test, or a clear scope statement about what we haven't verified yet.

§ 503

Personnel

Founding team

Sam Oyan

Co-founder · CEO

PERSONNEL / 001

Cybersecurity at NASA, where he held TS/SCI clearance and worked on defensive systems for flight-critical and classified workloads. Previously at Raytheon, and a U.S. Army veteran. Serves on the Y Combinator board.

Sam started Vern Labs because he spent a decade watching defense and enterprise teams treat AI like just another API — when the actual threat model is closer to adding a new autonomous agent to an organization.

NASA

Raytheon

U.S. Army

YC Board

TS/SCI cleared

H. Raef

Co-founder · CTO

PERSONNEL / 002

Security engineering at Microsoft, Wiz, and Google. Has shipped cloud security platforms that protect tens of thousands of enterprise environments and internal production systems at hyperscale.

Joined Vern Labs to solve the problem the next decade of software will actually run on. Leads the research team and owns the architecture of the Vern control plane.

Microsoft

Wiz

Google

§ 504

Advisors

Technical & operational guidance

The people in the room when we make hard calls.

Advisor / 01

Security

Former CISO, Fortune 50

Advisor / 02

AI research

Principal researcher, major AI lab

Advisor / 03

Defense

Retired senior DoD official

Advisor / 04

Go-to-market

Former VP Sales, security unicorn

Named advisors disclosed to verified prospects under NDA

§ 505

Backed by

Investors & institutional alumni

— Built with alumni from

Y Combinator NASA Microsoft Wiz Raytheon Google

§ 506

Careers

Open positions

Come build this.

Small team, high stakes, serious work. We hire exclusively for calibration, taste, and raw technical ability. We pay top-of-market. We ship in writing.

What you get

+ Top-decile comp, cash + equity
+ Remote-first · quarterly offsites
+ 4 weeks paid time off, mandatory
+ $5k / yr learning & hardware
+ Fully covered health & dental

Staff Security Engineer, Runtime

Engineering · Remote (US)

Full-time

Research Scientist, Adversarial ML

Research · Remote (US)

Full-time

Founding Solutions Architect

GTM · Remote (US)

Full-time

Staff Engineer, Agent Authorization (Ghostline)

Engineering · Remote (US)

Full-time

Senior Red Teamer

Research · Remote (US)

Full-time

Don't see a match? We always want to hear from exceptional people. Send a note →

§ 507

Facts

Company metadata

Founded

2024

Headquarters

United States

Team

14 people

Open roles

Status

Operational

— Next

Work on security that actually matters.

See open roles Talk to the team →

Filing

Response

Channel

Issue

VL-CT-600

< 4h · business

Encrypted in transit

17 APR 2026

§ 600

Direct line

Talk to Vern Labs.

A real person on our team reviews every inbound. Most messages get a response within four business hours. For urgent security matters, use the hotline below.

§ 601

Intake

Primary contact form — route to sales & engineering

What are you securing?

Short notes are fine — a sentence on what you're building and where you're stuck is enough to route you to the right person. We'll come back with a 20-minute slot and a tailored reading list before the call.

Typical response

< 4 business hours

First meeting

20 min · technical brief

Pilot start

Typically 5–10 days

NDA

Available on request

Direct intake · VL-CT-001

Encrypted in transit

Name

Company

Work email

Role

Interest

What are you securing?

SYSTEM ONLINE

Avg. response · < 4h

§ 602

Channels

Direct routes for specific inquiries

Skip the form. Go direct.

Use the right channel for your question and you'll get a faster, better answer.

Channel / 01

Sales & pilots

Pricing, procurement, pilots, reseller questions, volume deals.

sales@vernlabs.com

Channel / 02

Technical briefs

Architecture deep-dives, deployment planning, integration questions.

engineering@vernlabs.com

Channel / 03

Research

Paper collaborations, benchmark contributions, academic partnerships.

research@vernlabs.com

Channel / 04

Press & media

Interviews, quotes, briefings, company news.

press@vernlabs.com

§ 603

Hotline

Responsible disclosure & incident reporting

● RESTRICTED CHANNEL

Security issues.
Disclosed responsibly.

If you've identified a vulnerability in a Vern Labs product, deployment, or research artifact — we want to hear from you first, and we'll work with you to coordinate disclosure.

We operate a formal responsible disclosure program and publicly acknowledge researchers with permission. Critical reports get a response within 24 hours, any day of the week.

● HOTLINE · VL-SEC-001

24h · CRITICAL

security@vernlabs.com

PGP fingerprint

4A7C 9F21 DE08 B114 E3A2
6E91 5CD4 8B37 1F80 2D6C

Expected response

· Critical — 24h, any day
· High — 2 business days
· Medium / Low — 5 business days

§ 604

FAQ

Common questions before you reach out

What to expect.

How fast will you actually respond?

The first response — acknowledging receipt and routing you to the right person — is almost always within 4 business hours. A substantive technical response typically follows within a business day. Security reports get priority routing.

Do I need to commit to anything to start a pilot?

No. Pilots are 30 days, free, and non-binding. Most pilots start within a week of the first call. If it doesn't fit, we part as friends — and you can keep using our open-source tooling regardless.

Can we sign an NDA first?

Absolutely. We have a standard mutual NDA we can send on request, or we'll happily counter-sign yours. Enterprise customers in regulated or classified environments often start here.

I'm a researcher. How do we collaborate?

Email research@vernlabs.com directly. We sponsor a small number of academic collaborations each year, contribute benchmarks to the community, and offer heavy academic discounts on Production-tier access.

Do you meet in person?

For enterprise and defense engagements, yes — our solutions team travels. For most other evaluations, we default to video calls. It's faster for you and saves your procurement team a calendar invite.

— Or, if you prefer

Book 20 minutes. Decide the rest on the call.

Book a 20-minute intro Read architecture brief first →

Filing

Open roles

Location

Status

VL-CAR-700

Remote · US

Hiring

§ 700

Careers

Work on security
that actually matters.

We are a small team building infrastructure that protects AI systems in production. We hire exclusively for calibration, taste, and raw technical ability, and we pay top-of-market to retain that bar. We operate remote-first across the United States with quarterly in-person offsites.

§ 701

Hiring bar

What we select for

Three things, weighted equally.

Calibration

You know what you know and what you don't. You say "I'm not sure" without ceremony and it means something. You make calls under uncertainty and own the outcome. You update your priors when the evidence changes.

Taste

You can tell good work from bad. You have opinions on APIs, on writing, on architecture, on what should ship. You've done enough to know that the obvious answer is frequently wrong.

Technical ability

You can actually build the thing. Not "with enough time." Now, at the quality bar we need. In the area we're hiring for. We verify this in the interview. There's no way around it.

§ 702

Open roles

Currently hiring · 5 positions

Staff Security Engineer, Runtime

Engineering Remote · US Full-time

Own the Intertrace runtime — policy engine, inline detectors, gateway internals. Shape the data plane that every customer request flows through.

Research Scientist, Adversarial ML

Research Remote · US Full-time

Design and run adversarial evaluations against production AI systems. Publish findings. Contribute to the vern/probes benchmark set.

Founding Solutions Architect

GTM Remote · US Full-time

Technical lead for enterprise and defense deployments. Ship architectures, not slides. First SA hire. Works directly with the founders.

Staff Engineer, Agent Authorization

Engineering · Ghostline Remote · US Full-time

Own the Ghostline capability system — token issuance, scope enforcement, approval queues. Work in Rust and Go. Design at the API layer.

Senior Red Teamer

Research Remote · US Full-time

Break our products. Break our customers' deployments (with consent). Produce the findings that shape the next release of Blackbox.

§ 703

Compensation

What we offer

Top-decile comp

Cash plus meaningful early equity. We calibrate against Levels.fyi p90 for comparable roles and we share the ranges openly.

Remote-first

Work from anywhere in the US. Quarterly in-person offsites — travel and lodging covered.

Time off, enforced

Four weeks paid leave, mandatory. Two weeks minimum taken in each half of the year. Your calendar says so.

Hardware & learning

$5,000 per year for hardware, books, conferences, courses — whatever sharpens the axe.

Also included

+ Fully covered health, dental, vision
+ 401(k) with 4% match, vested immediately
+ 16 weeks parental leave, equal for all parents
+ Monthly wellness stipend

Equity mechanics

+ ISOs with 10-year exercise window
+ Early exercise allowed
+ 409A refreshed quarterly
+ Plain-English explainer at offer

§ 704

Process

How we interview

Four rounds. Two weeks. Paid final stage.

We respect your time. No brain-teasers, no whiteboard gotchas. You'll do real work representative of what the job actually involves.

Intro call

30 minutes with the hiring manager. Why Vern, why now, what you want to build.

Technical depth

60 minutes going deep on a problem in your domain. No quizzes. We want to see how you think, argue, and handle disagreement.

Work sample

A paid, bounded project — about a day of your time, done asynchronously. $500 compensation regardless of outcome.

Final + founders

Two conversations: one with Sam on company and mission, one with Raef on craft and technical vision. Decision within 48 hours.

— Ready to talk?

Don't see a match? Send a note anyway.

We always want to hear from exceptional people, even if we don't have a posted role.

Apply or introduce yourself

Filing

SOC 2

Last audit

Issue

VL-TRU-800

In progress · Q2 2026

Q4 2025

17 APR 2026

§ 800

Trust center

Your security is
our single product.

Vern Labs exists to improve the security posture of AI systems. That only works if we meet the bar we ask our customers to hold. This page is where we document that bar — attestations, architecture commitments, and the data policies we operate under.

§ 801

Attestations

Current compliance status

SOC 2 Type II

● IN PROGRESS

Audit window Q1-Q2 2026

Drata-monitored. Audit conducted by a top-4 firm. Report available to prospects under mutual NDA.

HIPAA

● READY

BAA available

Architecture supports HIPAA workloads. BAA on request for healthcare customers on Enterprise tier.

FedRAMP

● ALIGNED

Moderate baseline

Air-gap deployment satisfies FedRAMP Moderate technical controls. Formal authorization pursued with lead customers.

CMMC L2

● ALIGNED

DoD contractors

Vern Labs architecture supports CMMC Level 2 requirements for handling CUI in defense supply chain.

§ 802

Data handling

What we store, what we don't

Zero retention is the default.

Your prompts and outputs pass through Intertrace. They do not persist there. We retain metadata needed for policy decisions and audit — nothing more.

What we store

+ Policy decisions (rule ID, action)
+ Timestamps, sizes, tenant ID
+ Hashed payload fingerprints
+ Request tracing metadata
+ Aggregate usage counters

What we never store

× Raw prompt or response content
× User-identifiable payloads
× API keys or credentials
× Retrieved document contents
× Model outputs beyond policy result

Opt-in audit mode

For regulated workloads, customers can opt in to full payload retention within their own VPC or S3 bucket. Vern Labs never has access. All keys and encryption remain under customer control.

§ 803

Practices

Internal security program

Encryption

TLS 1.3 in transit. AES-256 at rest. Customer-managed keys available on Enterprise. HSM-backed KMS integration supported.

Access control

All employee access to production requires hardware MFA. Just-in-time access provisioning with full audit log. Quarterly access review.

Secure SDLC

SAST and SCA on every PR. Signed commits. Supply chain attestations via SLSA Level 3. Code review required for production merges.

Penetration testing

Annual third-party penetration test. Continuous internal red team exercises. Results summary available under NDA.

Incident response

Formal incident response plan, tested quarterly. Affected customers notified within 24 hours of confirmed breach. Public postmortems for material incidents.

Vendor review

Every third-party vendor reviewed before onboarding. Minimal dependency footprint. Subprocessor list published and updated within 30 days of changes.

Need the full trust package?

SOC 2 reports, subprocessor list, DPA, security questionnaire responses, and the architecture brief are available to verified prospects under mutual NDA.

Request documentation

Filing

Channel

Response

Program

VL-SEC-810

security@vernlabs.com

Critical < 24h

Active

§ 810

Security

Disclose responsibly.
We'll act fast.

Vern Labs runs a formal responsible disclosure program. If you've found a vulnerability in our products, infrastructure, or research artifacts, we want to hear from you before anyone else does — and we'll work with you to coordinate disclosure on a timeline that protects users.

§ 811

Scope

What's in and out of scope

● IN SCOPE

01Intertrace gateway and SDKs (all language bindings)
02Ghostline capability system and approval queues
03Blackbox testing harness and reporting pipeline
04Control plane at app.vernlabs.com
05Open-source projects at github.com/vern
06Authentication, billing, and tenant isolation

— OUT OF SCOPE

×Physical security of Vern Labs facilities
×Social engineering against Vern staff
×Denial of service (DoS / DDoS)
×Content security policy weaknesses without demonstrable exploit
×Third-party services and dependencies
×Public information exposure without PII

§ 812

Severity

Response commitments by level

SEVERITY      TRIAGE          FIRST FIX          STATUS UPDATES         BOUNTY
────────────────────────────────────────────────────────────────────────────────
CRITICAL      < 24 hours      < 72 hours         daily                  up to $10,000
HIGH          2 bus. days     2 weeks            weekly                 up to $3,000
MEDIUM        5 bus. days     30 days            bi-weekly              up to $750
LOW           10 bus. days    best effort        on milestone           up to $150
────────────────────────────────────────────────────────────────────────────────
                                                 PUBLIC ACKNOWLEDGMENT   on request

Bounties evaluated per-report against severity, impact, and report quality.

§ 813

Reporting

How to send a report

Send the details. Encrypt if you can.

A good report includes: the vulnerability, reproduction steps, affected components, potential impact, and any recommended mitigations. Screenshots and PoC code are welcome.

Please do not access, modify, or exfiltrate data belonging to other customers. Do not use automated scanners. Do not publish findings before we've coordinated disclosure.

● VL-SEC-001

ENCRYPTED CHANNEL

security@vernlabs.com

PGP fingerprint

4A7C 9F21 DE08 B114 E3A2
6E91 5CD4 8B37 1F80 2D6C

Public key

vernlabs.com/.well-known/pgp-key.asc

§ 814

Safe harbor

Good-faith research protection

Vern Labs will not pursue legal action against security researchers who make a good-faith effort to follow this policy. We consider research conducted in accordance with this program to be authorized under the Computer Fraud and Abuse Act and similar laws, and we will not initiate or support legal action against researchers for accessing Vern Labs systems in connection with good-faith vulnerability research.

If your research accidentally causes a violation of this policy, we will work with you to resolve it rather than pursue consequences. If you are unsure whether your planned testing complies with this policy, ask us first.

Filing

Effective

Last updated

Version

VL-PRV-820

01 JAN 2026

17 APR 2026

2.1

§ 820

Privacy policy.

This is the privacy policy for Vern Labs, Inc. It describes what information we collect, how we use it, how we protect it, and the rights you have over it. We have tried to write it in plain English. If anything is unclear, email privacy@vernlabs.com.

§ 821 Overview

Vern Labs provides security infrastructure for AI systems. We collect the minimum information required to deliver that service, protect it, and operate our business. We do not sell personal information. We do not use customer data to train our own or third-party models.

This policy applies to vernlabs.com, our SaaS products, our open-source projects, and any other properties we operate under the Vern Labs name.

§ 822 What we collect

We collect three categories of information:

Account information

Name, email, company, role, billing address, and payment method. Provided by you when you sign up or contract with us.

Product telemetry

Policy decisions, request timestamps, response sizes, tenant IDs, hashed payload fingerprints. We do not retain raw prompts, outputs, or retrieved documents.

Website analytics

Page views, referrers, and coarse location (country-level). Cookies are limited to session management and preferences; we do not run advertising or cross-site tracking.

§ 823 How we use it

We use the information above to:

+ Provide, operate, and improve the Vern Labs products
+ Detect and prevent abuse, fraud, and security incidents
+ Send operational communications (service updates, incidents, billing)
+ Meet our legal and regulatory obligations
+ With consent, send research updates (mailing list)

§ 825 Retention

Account information is retained for the duration of your contract plus 90 days, after which it is deleted or anonymized unless we have a legal obligation to retain it longer.

Product telemetry is retained for 90 days for operational purposes, then aggregated into non-identifying counters. Raw prompt and response content is not retained at all — see § 802 of the trust page for the full data handling breakdown.

§ 826 Your rights

Depending on where you live, you may have some or all of the following rights:

+ Right to access the information we hold about you
+ Right to correct inaccurate information
+ Right to delete your information, subject to limited exceptions
+ Right to portability of your information
+ Right to object to or restrict certain processing
+ Right to withdraw consent where processing is based on consent

To exercise any of these rights, email privacy@vernlabs.com. We will respond within 30 days.

§ 827 International transfers

Vern Labs is headquartered in the United States. If you are accessing our services from outside the US, your information may be transferred to and processed in the US. Where required, we rely on Standard Contractual Clauses for transfers out of the EEA and UK.

§ 828 Contact

Privacy questions, data subject requests, and complaints:

privacy@vernlabs.com

Filing

Effective

Last updated

Version

VL-TRM-830

01 JAN 2026

17 APR 2026

3.0

§ 830

Terms of service.

These are the terms of service for Vern Labs, Inc. They govern use of our products and our website. They are not a substitute for the master services agreement signed with enterprise customers, which controls in case of conflict. If you have questions, email legal@vernlabs.com.

§ 831 Acceptance of terms

By accessing or using Vern Labs products, you agree to be bound by these Terms. If you are entering into these Terms on behalf of an organization, you represent that you have authority to bind that organization. If you do not agree, do not use the service.

§ 832 Accounts

You must provide accurate account information. You are responsible for safeguarding your credentials and for all activity under your account. Notify us immediately of any unauthorized use.

Accounts created for evaluation are subject to the usage limits of the Pilot tier and may be rate-limited or suspended for abuse without prior notice.

§ 833 Acceptable use

You agree not to:

× Use the service for any unlawful purpose
× Interfere with or disrupt the integrity of the service
× Reverse engineer or attempt to extract proprietary algorithms
× Use the service to develop a competing product
× Transmit malware or use the service to attack third parties
× Circumvent rate limits or usage restrictions

§ 834 Fees & payment

Fees for the Production tier are detailed in your order form or MSA. Enterprise fees are individually negotiated and governed by a separate signed agreement.

Invoices are due net 30. Late amounts accrue interest at 1.5% per month or the maximum allowed by law, whichever is lower. Fees are non-refundable except as required by law or as explicitly stated in your MSA.

§ 835 Intellectual property

Vern Labs retains all rights in the service, including all software, documentation, and research artifacts. Customer retains all rights in customer data and any content submitted through the service.

Open-source components of the service are governed by their respective licenses. Vern Labs' public open-source projects are licensed under MIT unless otherwise stated in the repository.

§ 836 Warranties & disclaimers

Vern Labs warrants that it will provide the service in a professional manner, in accordance with published specifications, and in compliance with the SLAs defined in your order form.

EXCEPT AS EXPRESSLY PROVIDED, THE SERVICE IS PROVIDED "AS IS" WITHOUT WARRANTIES OF ANY KIND, WHETHER EXPRESS OR IMPLIED. NO AI SECURITY PRODUCT PROVIDES ABSOLUTE PROTECTION; YOU REMAIN RESPONSIBLE FOR YOUR APPLICATIONS AND THEIR COMPLIANCE.

§ 837 Limitation of liability

TO THE MAXIMUM EXTENT PERMITTED BY LAW, NEITHER PARTY WILL BE LIABLE FOR INDIRECT, INCIDENTAL, SPECIAL, CONSEQUENTIAL, OR PUNITIVE DAMAGES. AGGREGATE LIABILITY FOR ANY CLAIM ARISING FROM THESE TERMS IS LIMITED TO THE FEES PAID TO VERN LABS IN THE 12 MONTHS PRECEDING THE CLAIM. THIS LIMITATION DOES NOT APPLY TO BREACHES OF CONFIDENTIALITY, INDEMNIFICATION OBLIGATIONS, OR AMOUNTS OWED UNDER THIS AGREEMENT.

§ 838 Term & termination

These Terms remain in effect while you use the service. Either party may terminate for material breach with 30 days notice if the breach is not cured in that time.

Upon termination, customer data will be returned or deleted within 30 days at customer's request. Accrued obligations, payment terms, confidentiality, and limitations of liability survive termination.

§ 839 General

These Terms are governed by the laws of the State of Delaware, without regard to conflict of laws principles. Disputes will be resolved in the state or federal courts located in Delaware.

These Terms, together with any order form or MSA, constitute the entire agreement between the parties regarding the service. For questions, contact legal@vernlabs.com.

Filing

Kind

Published

Authors

VL-RES-901

PAPER

February 2026

H. Raef, S. Oyan, et al.

§ 901

PAPER · February 2026

A Runtime Risk Model for Tool-Using Agents

We propose a behavioral risk classifier that scores agent actions in real time based on capability context, target resource sensitivity, and historical deviation from an agent's established behavioral envelope. Evaluated across 12 production agent deployments, the model reduced escalations required for safe autonomous execution by 41% while holding false-negative rate below 0.3%.

§ 911

Section 01

Motivation

Production agent deployments face a fundamental control problem: the set of safe actions depends on context, and context changes every turn. Static policies are too coarse to capture this — they either over-permit and create safety gaps, or over-restrict and burn the deployment under human approval overhead.

The current state of the art is either naive allow-lists or LLM-judge-based behavioral review. The first doesn't scale; the second is too slow for inline decisions and has its own trust problems. We wanted a third path.

§ 912

Section 02

Approach

Our risk model is a lightweight classifier that evaluates three signals per action: capability risk (how sensitive the target resource is), behavioral novelty (how far this action deviates from the agent's 30-day baseline), and contextual coherence (whether the action follows from the stated task).

Each signal produces a normalized score; the composite score routes the action to one of four tracks: execute, log, gate for async approval, or block. The model runs in under 3ms per decision on commodity hardware.

§ 913

Section 03

Evaluation

We evaluated across 12 production deployments at partner organizations, representing ~2.4M agent actions over a 90-day window. Compared to baseline (hand-tuned allow-lists plus LLM-judge for novel actions), the risk model achieved a 41% reduction in escalations to human operators with no regression in incident rate.

False-negative rate on held-out red-team probes remained below 0.3% across all deployments. Per-deployment tuning was minimal — most gains came from baseline learning during the first week of operation.

§ 914

Section 04

Limitations

The model depends on having an established behavioral baseline, so cold-start deployments must either run in a higher-approval mode for the first week or import a baseline from a similar deployment. We have not yet evaluated the model's robustness to adversarial behavioral drift — this is ongoing work.

The novelty signal currently has no notion of task decomposition. An agent that routinely edits one file will flag novelty when asked to edit a similar file it hasn't seen before. We're exploring structural equivalence measures to address this.

§ 999

Citation

Reference & full text

Cite this work

Oyan S., Raef H., et al. A Runtime Risk Model for Tool-Using Agents.
    Vern Labs Technical Report VL-RES-901, February 2026.
    https://vernlabs.com/research/vl-res-901

Full text

vl-res-901.pdf

PDF · open access

More from the Vern Labs research archive

See the full archive Collaborate with the research team →

Filing

Kind

Published

Authors

VL-RES-902

PAPER

January 2026

S. Oyan, H. Raef

§ 902

PAPER · January 2026

Prompt Injection as a Supply Chain Problem

This paper reframes prompt injection as a supply chain attack against language model context, rather than a model-training or fine-tuning problem. We propose provenance tracking for every context token that reaches a production model and show how this reframing suggests a set of defenses that are both more reliable and more auditable than current mitigation strategies.

§ 921

Section 01

Current framing is wrong

Most mitigations for prompt injection treat it as a model problem — something to be solved via better instruction-following, better RLHF, or model-level content classifiers. This framing obscures what is actually happening: an attacker is inserting untrusted content into a context window the model will then treat as trusted input.

That is a supply chain problem. The defenses that work for supply chain problems are well-understood and do not require the victim to become a better detector of adversarial text.

§ 922

Section 02

Provenance as the primitive

We propose that every token reaching a production LLM context should carry a provenance tag — an unforgeable marker of where it came from and what trust level it inherits. Tools, retrieval systems, and user inputs become distinct "origin classes," and policy can then enforce rules on what each class is permitted to cause the model to do.

This flips the defensive posture from "detect injection after the fact" to "structurally prevent mixing of trust levels." The latter is tractable. The former, by current evidence, is not.

§ 923

Section 03

Practical implementation

We describe an implementation that sits in the retrieval/tool layer and emits provenance-tagged context to a wrapped model. Because the wrapping is done at the protocol layer, it works with any LLM provider and requires no model retraining.

The runtime overhead is negligible. The engineering cost is shifted from the model to the orchestration layer, which is where it should have been all along.

§ 924

Section 04

Consequences for practitioners

If provenance becomes the primitive, certain common patterns become obviously unsafe: mixing retrieved web content directly into system prompts, allowing tool output to flow unfiltered into the next turn, treating all memory as equally trusted. These are all tractable to fix once you can see them.

We release a reference implementation and invite security teams to experiment with the approach in their own deployments.

§ 999

Citation

Reference & full text

Cite this work

Oyan S., Raef H., et al. Prompt Injection as a Supply Chain Problem.
    Vern Labs Technical Report VL-RES-902, January 2026.
    https://vernlabs.com/research/vl-res-902

Full text

vl-res-902.pdf

PDF · open access

More from the Vern Labs research archive

See the full archive Collaborate with the research team →

Filing

Kind

Published

Authors

VL-RES-903

PRIMITIVE

December 2025

H. Raef, Vern Labs Research Team

§ 903

PRIMITIVE · December 2025

Scoped Capabilities for Autonomous Execution

A technical specification for scoped capability tokens designed for autonomous agent systems. We describe a Biscuit-based token format with cascading revocation, approval queue integration, and cryptographic binding to a specific agent instance and execution session. Reference implementations in Go, Rust, and Python are released alongside this note under MIT license.

§ 931

Section 01

Problem

Traditional authorization systems assume the actor is a human or a fixed piece of software. Autonomous agents are neither: they are dynamic processes whose next action depends on reasoning that happens at runtime and is not known at the time the token is issued.

Session tokens, OAuth scopes, and IAM roles all fail in specific ways when applied to agents. We needed an authorization primitive designed for the actual workload.

§ 932

Section 02

Design

Scoped capabilities are based on the Biscuit token format with Vern-specific extensions. Each token binds cryptographically to an agent instance, a session, and an explicit set of allowed operations. Tokens can be attenuated — an agent can derive a narrower token to pass to a sub-process without expanding privileges.

Cascading revocation is a first-class operation: pulling a parent token invalidates all derived tokens in flight. This is essential for incident response on autonomous systems where an agent may have spawned dozens of derived capability grants before the incident is detected.

§ 933

Section 03

Approval queues

Actions that exceed a scope's permission set route to an approval queue rather than failing hard. Approval can be inline (blocking the agent), async (queued for human review with a timeout), or policy-only (evaluated by a sibling policy service). This gives operators a middle ground between "every action requires human approval" and "the agent has full authority."

§ 934

Section 04

Open-source release

Reference implementations are available in the vern/scopes-go, vern/scopes-rs, and vern/scopes-py repositories. We invite framework authors to integrate scoped capabilities directly rather than inventing their own authorization models.

§ 999

Citation

Reference & full text

Cite this work

Oyan S., Raef H., et al. Scoped Capabilities for Autonomous Execution.
    Vern Labs Technical Report VL-RES-903, December 2025.
    https://vernlabs.com/research/vl-res-903

Full text

vl-res-903.pdf

PDF · open access

More from the Vern Labs research archive

See the full archive Collaborate with the research team →

Filing

Kind

Published

Authors

VL-RES-904

NOTE

November 2025

S. Oyan

§ 904

NOTE · November 2025

Why LLM security needs a new threat model

A short argument that the prevailing threat model for LLM-backed applications — derived from traditional web application security — is inadequate for systems where the application itself reasons, retrieves, and acts. This note proposes a minimal extension centered on three new attack surfaces: context poisoning, authority exceedance, and tool misuse chains.

§ 941

Section 01

The STRIDE gap

STRIDE — the standard threat-modeling framework for application security — assumes that software does what it is told. LLM-backed systems do what they infer. This single change invalidates several STRIDE assumptions: tampering can happen via prompt content, not just request data; repudiation can happen because the system genuinely cannot remember what it was asked; elevation of privilege can happen through pure text.

§ 942

Section 02

Three new surfaces

Context poisoning — an attacker plants text in a place the model will read (a document, a tool output, a user message) in a form that will alter the model's subsequent behavior without triggering content filters.

Authority exceedance — the model takes an action it was not authorized to take, either because it reasoned its way past an instruction or because its authorization was scoped to a higher-level goal and it chose a narrower but harmful path.

Tool misuse chains — the model composes multiple individually-authorized actions into a sequence that exceeds the intent of any single authorization. This is the agent-security equivalent of privilege chains.

§ 943

Section 03

What this implies

Every one of these surfaces requires runtime inspection of behavior, not just static authorization. This is why we build Intertrace, Ghostline, and Blackbox as separate products — each addresses a distinct surface that traditional AppSec doesn't cover. A complete threat model for an LLM-backed system needs at least these three additions, and probably more.

§ 999

Citation

Reference & full text

Cite this work

Oyan S., Raef H., et al. Why LLM security needs a new threat model.
    Vern Labs Technical Report VL-RES-904, November 2025.
    https://vernlabs.com/research/vl-res-904

Full text

vl-res-904.pdf

PDF · open access

More from the Vern Labs research archive

See the full archive Collaborate with the research team →

Filing

Kind

Published

Authors

VL-RES-907

PAPER

September 2025

Vern Labs Research Team

§ 907

PAPER · September 2025

Adversarial evaluation of retrieval-augmented systems

We present a systematic methodology for evaluating the security posture of retrieval-augmented generation (RAG) systems. The paper introduces a new attack taxonomy specific to retrieval pipelines, a benchmark covering 38 distinct attack vectors, and empirical results across 7 widely-deployed RAG patterns. We find that the most common defenses focus on the retrieval layer while leaving the composition layer as the primary failure mode.

§ 971

Section 01

Background

Retrieval-augmented generation is now the dominant architecture for enterprise LLM deployments. It is also a significantly larger attack surface than the standalone-LLM deployments most threat models were written against. This paper characterizes that expanded surface and measures real-world defenses against it.

§ 972

Section 02

Attack taxonomy

We identify four classes of RAG-specific attacks: retrieval-time poisoning (the attacker plants content in the corpus), indexing-time poisoning (the attacker influences what the retriever considers relevant), composition attacks (the attacker exploits how retrieved content is combined with the user query), and source-confusion attacks (the attacker makes retrieved content appear to be from a more-trusted source than it is).

Each class is exercised by 8–10 distinct probes in our benchmark.

§ 973

Section 03

Findings

Across the 7 RAG patterns we evaluated, composition attacks had the highest average success rate (58%) despite being the best-understood theoretically. Retrieval-time poisoning was second (34%). The other two classes had lower success rates but produced higher-severity outcomes when they succeeded.

Defenses concentrated at the retrieval layer (e.g., source filtering, corpus cleaning) reduced retrieval-time attacks but did little to address composition attacks. Our data suggests that layered defense at the composition stage is currently the highest-leverage investment for RAG security.

§ 974

Section 04

Release

The benchmark is available at github.com/vern/probes under the "rag" category. The methodology paper is released alongside this note. We welcome independent replication and will update findings as the probe set expands.

§ 999

Citation

Reference & full text

Cite this work

Oyan S., Raef H., et al. Adversarial evaluation of retrieval-augmented systems.
    Vern Labs Technical Report VL-RES-907, September 2025.
    https://vernlabs.com/research/vl-res-907

Full text

vl-res-907.pdf

PDF · open access

More from the Vern Labs research archive

See the full archive Collaborate with the research team →

Filing

Kind

Published

Authors

VL-OSS-908

OSS

August 2025

Vern Labs Engineering

§ 908

OSS · August 2025

vern/trace-cli — CLI for local LLM traffic inspection

Release of vern/trace-cli, an open-source command-line tool for inspecting LLM API traffic on a developer's local machine. Pipe any HTTP client through trace-cli and see every request, response, and policy evaluation in real time. Useful for debugging, learning, and building intuition about what an LLM actually sees.

§ 981

Section 01

Why we built it

Most engineers who work on LLM-backed systems have an imprecise mental model of what their model actually sees. System prompts concatenate in ways that aren't obvious. Tool definitions get rendered into token streams that look different than the source code. Retrieval results show up as text in ways that change how the model weights them.

trace-cli was born from our own debugging needs. Pipe your traffic through it and you see the real thing, rendered with policy decisions overlayed. It made our work significantly faster. We're releasing it so others can benefit.

§ 982

Section 02

Usage

Install with npm install -g @vern/trace-cli or grab a binary from GitHub releases. Start it listening on a local port, set the OpenAI (or compatible) base URL to that port, and you'll see every request flow through the terminal with structured markup.

It can optionally apply Vern Labs policies if you have a policy file — otherwise it just prints traffic. No Vern account required.

§ 983

Section 03

What it shows

Request and response bodies, token counts per message, tool call structure, streaming chunk boundaries, and — if policies are enabled — the policy decision for each request. Output can be rendered as ANSI-colored text for interactive use, or as JSON for piping into other tools.

§ 984

Section 04

Roadmap

Upcoming features: session recording, diff mode for comparing two runs, integration with Intertrace for full policy evaluation. All discussion happens on the GitHub repo.

§ 999

Citation

Reference & full text

Cite this work

Oyan S., Raef H., et al. vern/trace-cli — CLI for local LLM traffic inspection.
    Vern Labs Technical Report VL-OSS-908, August 2025.
    https://vernlabs.com/research/vl-oss-908

Full text

vl-oss-908.pdf

PDF · open access

More from the Vern Labs research archive

See the full archive Collaborate with the research team →

Security infrastructure for systems that reason, retrieve, and act.

Three products. One control plane.

Intertrace

Ghostline

Blackbox

Inline inspection of every AI transaction.

Authorization at the action layer.

Stress your AI before adversaries do.

A single control plane for three layers of defense.

Capability overlap with adjacent categories.

A lab, not a vendor.

Start with a pilot. Scale to production. Negotiate for enterprise.

Pilot

Production

Enterprise

Built by engineers who've shipped security at scale.

Sam Oyan

H. Raef

If it isn't here, ask us directly.

How is Vern Labs different from traditional security tools?

Do I need to deploy all three products?

Which AI providers does Vern Labs support?

What about latency?

Can I self-host?

Is Vern Labs suitable for regulated industries?

Build with AI. Ship with control.

Three products.One control plane.

Inline inspection of every AI transaction.

Authorization at the action layer.

Stress your AI before adversaries do.

Where Vern Labs fits in the stack.

Security infrastructure,deployed on your terms.

A single control plane for three layers of defense.

Four deployment modes. Same feature set.

Managed SaaS

Control plane + local data

Self-hosted, your cloud

Fully disconnected

What happens when a request passes through Vern.

Your data stays yours.

Provider-agnostic by design.

Want the architecture brief with full deployment details?

A lab,not a vendor.

Featured research.

A Runtime Risk Model for Tool-Using Agents

Prompt Injection as a Supply Chain Problem

Build with us. Or on top of us.

Adversarial prompt benchmark

Local traffic inspector

Biscuit capability tokens

LLM policy rule set

What we work on.

Research drops, first.

Start with a pilot.Scale with your AI footprint.

Pilot

Production

Enterprise

Pricing questions, answered.

How is Production priced?

Can I upgrade from Pilot mid-contract?

What counts as a "request"?

Do you offer academic or non-profit discounts?

How does Enterprise pricing work?

Is there a free tier after the pilot?

Let's scope what this looks like for your team.

Built by engineerswho've shipped security at scale.

AI systems are becoming the control surface for critical infrastructure. They are not yet secured the way control surfaces need to be.

Sam Oyan

H. Raef

The people in the room when we make hard calls.

Come build this.

Work on security that actually matters.

Talk to Vern Labs.

What are you securing?

Skip the form. Go direct.

Sales & pilots

Technical briefs

Research

Press & media

Security issues.Disclosed responsibly.

Security infrastructure
for systems that reason,
retrieve, and act.

Three products.
One control plane.

Build with AI.
Ship with control.

Three products.
One control plane.

Security infrastructure,
deployed on your terms.

A lab,
not a vendor.

Start with a pilot.
Scale with your AI footprint.

Built by engineers
who've shipped security at scale.

Security issues.
Disclosed responsibly.

Work on security
that actually matters.

Your security is
our single product.

Disclose responsibly.
We'll act fast.