agent-dns-firewall

agent-dns-firewall is an in-process domain firewall for AI agents. Before your agent calls fetch(), call isDomainBlocked(hostname) and drop known-bad destinations — no infrastructure required. It is a TypeScript npm library that downloads public blocklists, builds a fast lookup index, and exposes a single synchronous check.

Why I Built This

AI agents make network requests. Most agent frameworks give you hooks to intercept those requests — but the default is to let everything through. Adding a domain-level blocklist check before fetch() is simple in principle and surprisingly absent in practice. The tooling for "don't let the agent call this domain" is either heavy infrastructure (DNS server, HTTP proxy, network policy) or nonexistent.

agent-dns-firewall is not a DNS server. It is not an HTTP proxy. It is not a system-level network blocker. It is an in-process lookup: call isDomainBlocked(hostname) before you call fetch(). If the domain is blocked, don't call fetch(). The check is synchronous, has zero runtime dependencies, and works with any HTTP client or framework — fetch, axios, undici, anything.
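
The check-before-fetch pattern can be sketched as a thin guard around whatever HTTP client the agent uses. This is an illustration, not part of the library: the DomainFirewall interface below is a hypothetical minimal shape modeled on the isDomainBlocked() call described here, and assertAllowed and guardedGet are made-up helper names.

```typescript
// Hypothetical minimal shape of the firewall used by this sketch;
// the real library's exported types may differ.
interface BlockDecision {
  blocked: boolean;
  reason?: 'custom-deny' | 'blocklist';
  listId?: string;
}

interface DomainFirewall {
  isDomainBlocked(hostname: string): BlockDecision;
}

// Synchronous gate: throws if the URL's hostname is blocked.
function assertAllowed(firewall: DomainFirewall, url: string): void {
  const hostname = new URL(url).hostname;
  const decision = firewall.isDomainBlocked(hostname);
  if (decision.blocked) {
    throw new Error(`Blocked ${hostname} (${decision.reason ?? 'no reason given'})`);
  }
}

// Runs the gate first and only then hands the URL to the HTTP client.
// The client is passed in, so fetch, axios, or undici all fit.
async function guardedGet<T>(
  firewall: DomainFirewall,
  httpGet: (url: string) => Promise<T>,
  url: string,
): Promise<T> {
  assertAllowed(firewall, url);
  return httpGet(url);
}
```

With fetch, a call might look like guardedGet(firewall, (u) => fetch(u), 'https://example.com/'). Because the gate throws before the client is invoked, a blocked request never leaves the process.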

The library wraps two well-maintained public blocklists: StevenBlack's unified hosts file (malware, adware, tracking) and Hagezi's DNS light list. It fetches them on start(), builds an in-memory index with subdomain matching, and auto-refreshes using conditional HTTP requests (If-None-Match and If-Modified-Since) to minimize bandwidth on repeat fetches.
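
To make the indexing step concrete, here is a rough sketch of turning a downloaded list into a set of domains. It assumes hosts-style lines ("0.0.0.0 domain") or one bare domain per line. parseBlocklist and the list of skipped local names are illustrative and not exhaustive; the library's actual parser may handle more formats.

```typescript
// Names that hosts files map for the local machine; not real block entries.
const LOCAL_NAMES = new Set([
  'localhost',
  'localhost.localdomain',
  'local',
  'broadcasthost',
  '0.0.0.0',
]);

// Parses hosts-style ("<ip> <domain>") or plain ("<domain>") lines into a set.
function parseBlocklist(text: string): Set<string> {
  const domains = new Set<string>();
  for (const rawLine of text.split(/\r?\n/)) {
    // Drop comments (whole-line or trailing) and surrounding whitespace.
    const line = rawLine.split('#')[0].trim();
    if (line === '') continue;
    const fields = line.split(/\s+/);
    const domain = (fields.length > 1 ? fields[1] : fields[0]).toLowerCase();
    if (!LOCAL_NAMES.has(domain)) domains.add(domain);
  }
  return domains;
}
```

A Set gives constant-time membership checks, which is what keeps the later lookup synchronous and cheap.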

How It Works

The API is a single factory function and three methods.

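// Assumes these are named exports of the npm package.
import {
  createDomainFirewall,
  PRESET_STEVENBLACK_UNIFIED,
  PRESET_HAGEZI_LIGHT,
} from 'agent-dns-firewall';

const firewall = createDomainFirewall({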
  sources: [PRESET_STEVENBLACK_UNIFIED, PRESET_HAGEZI_LIGHT],
  allow: ['safe.example.com'],
  deny:  ['evil.example.com'],
  refreshMinutes: 60,
});

await firewall.start();   // fetches and indexes blocklists

const decision = firewall.isDomainBlocked('malware-domain.example.com');
// decision.blocked → true
// decision.reason  → 'blocklist'
// decision.listId  → 'stevenblack-unified'

isDomainBlocked() returns a BlockDecision with three fields: blocked (boolean), reason ('custom-deny' | 'blocklist' | undefined), and listId (which source matched when reason is 'blocklist'). The check evaluates in order: allow list first, then deny list, then blocklists with subdomain matching. It never throws.
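
The evaluation order can be illustrated with a small stand-alone function. This is a simplified sketch, not the library's code: the Rules shape and decide() are hypothetical, and it uses exact matches only to keep the precedence easy to see.

```typescript
type BlockReason = 'custom-deny' | 'blocklist';

interface BlockDecision {
  blocked: boolean;
  reason?: BlockReason;
  listId?: string;
}

// Hypothetical container for the three rule sources.
interface Rules {
  allow: Set<string>;
  deny: Set<string>;
  lists: Array<{ id: string; domains: Set<string> }>;
}

// Precedence: allow list, then deny list, then each blocklist in order.
function decide(hostname: string, rules: Rules): BlockDecision {
  const host = hostname.trim().toLowerCase();
  if (rules.allow.has(host)) return { blocked: false };
  if (rules.deny.has(host)) return { blocked: true, reason: 'custom-deny' };
  for (const list of rules.lists) {
    if (list.domains.has(host)) {
      return { blocked: true, reason: 'blocklist', listId: list.id };
    }
  }
  return { blocked: false };
}
```

Checking the allow list first means an explicit allow entry overrides both a custom deny rule and any blocklist hit for the same hostname.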

Blocklist fetches use conditional HTTP requests automatically. On the first fetch, the library stores the ETag and Last-Modified values from the server response. Subsequent refreshes send those values back in the If-None-Match and If-Modified-Since request headers; a 304 Not Modified response reuses the cached domain set without re-downloading. No configuration is needed; this behavior is built into every start() and refresh cycle.
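
For readers unfamiliar with conditional requests, the exchange looks roughly like the sketch below. The names (CacheEntry, conditionalHeaders, applyResponse, refresh) are illustrative assumptions, not the library's internals. The header names are standard HTTP.

```typescript
// Validators and body remembered from the last successful download.
interface CacheEntry {
  etag?: string;
  lastModified?: string;
  body: string;
}

// Minimal response shape so the logic does not depend on a specific client.
interface HttpResult {
  status: number;
  etag?: string;
  lastModified?: string;
  body: string;
}

// Builds request headers for a refresh from the stored validators.
function conditionalHeaders(cached?: CacheEntry): Record<string, string> {
  const headers: Record<string, string> = {};
  if (cached?.etag) headers['If-None-Match'] = cached.etag;
  if (cached?.lastModified) headers['If-Modified-Since'] = cached.lastModified;
  return headers;
}

// 304 keeps the cached entry; 200 replaces it with the new body and validators.
function applyResponse(cached: CacheEntry | undefined, res: HttpResult): CacheEntry {
  if (res.status === 304 && cached) return cached;
  if (res.status === 200) {
    return { etag: res.etag, lastModified: res.lastModified, body: res.body };
  }
  throw new Error(`Unexpected status ${res.status}`);
}

// One refresh cycle using the global fetch.
async function refresh(url: string, cached?: CacheEntry): Promise<CacheEntry> {
  const res = await fetch(url, { headers: conditionalHeaders(cached) });
  return applyResponse(cached, {
    status: res.status,
    etag: res.headers.get('ETag') ?? undefined,
    lastModified: res.headers.get('Last-Modified') ?? undefined,
    body: res.status === 304 ? '' : await res.text(),
  });
}
```

On a 304 the server sends no body, so the refresh costs little more than a round trip.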

The subdomain matching logic ensures that a block on malware.example.com also blocks sub.malware.example.com without requiring a wildcard entry in the blocklist. Public blocklists list domains, not wildcards; the library adds the subdomain matching layer.
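
One common way to implement this kind of matching is to test the hostname and then each parent domain against the set, stripping one leading label at a time. The function below is a sketch of that technique under a made-up name, findBlockedSuffix. The library's index may be built differently.

```typescript
// Returns the blocked entry that covers `hostname`, or undefined if none does.
function findBlockedSuffix(hostname: string, blocked: Set<string>): string | undefined {
  // Normalize case and drop a trailing dot (fully qualified form).
  const labels = hostname.toLowerCase().replace(/\.$/, '').split('.');
  // Try the full name first, then each parent domain in turn.
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join('.');
    if (blocked.has(candidate)) return candidate;
  }
  return undefined;
}
```

Matching on whole labels matters: a block on malware.example.com covers sub.malware.example.com but not notmalware.example.com, which a naive string endsWith check would get wrong.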

What I Learned

The library's narrow scope is its most important design decision. It is not a DNS server and not an HTTP proxy; it is a domain-level in-process check. That constraint is what lets it have zero runtime dependencies and stay synchronous. A DNS server adds infrastructure; a proxy adds a network hop and TLS complexity. An in-process library adds a function call. Each step up in infrastructure raises the cost of adoption, and most teams will not stand up a DNS server just to get a domain blocklist working in their agent.

The BlockDecision return type, rather than a plain boolean, turned out to be worth the extra verbosity. Agents (and the developers debugging them) need to know not just whether a domain was blocked but why: was it a custom deny rule or a blocklist match, and if a blocklist, which one. That structured response makes the library useful for logging and auditing, not just dropping requests. Boolean functions are easy to write and hard to debug in production.

Auto-refresh with conditional requests matters more than it seems for a long-running agent process. Blocklists update regularly; an agent running for hours should not be checking against an index that is days old. Conditional fetches make the refresh cheap enough to run on a short interval without hammering the blocklist providers' servers. Both the bandwidth savings and the fresher index add up once agents run for long periods or in large numbers.


Check before fetch — drop the bad destinations before they become bad decisions.