Security Model
Sockguard's defense-in-depth model — transport admission, client admission, method/path filtering, request-body inspection, ownership isolation, visibility-controlled reads, and structured access plus audit logging.
Why Socket Proxying Matters
The Docker socket (/var/run/docker.sock) is equivalent to root access on the host. Any container with unrestricted socket access can:
- Create a privileged container that mounts the host filesystem
- Execute arbitrary commands via docker exec
- Access host PID, network, and IPC namespaces
- Pull and run malicious images
- Manipulate Swarm clusters
A socket proxy sits between consumers and the raw socket, filtering requests to limit what each consumer can do.
Defense in Depth
Sockguard implements multiple layers of filtering:
Layer 0: Policy Integrity
Before any request is evaluated, Sockguard can verify that the loaded
configuration was signed by a trusted key. When policy_bundle.enabled: true,
the on-disk YAML is treated as untrusted until a cosign sigstore bundle at
policy_bundle.signature_path confirms it. An unsigned, malformed, or
wrong-key bundle aborts startup with a wrapped policy bundle: error — no
rules compile, no listener opens.
Two verification paths are supported:
- Keyed: PEM-encoded ECDSA, RSA, or ed25519 public keys listed under policy_bundle.allowed_signing_keys. No network round-trip required.
- Keyless (Fulcio + Rekor): policy_bundle.allowed_keyless entries constrain the Fulcio cert chain by exact OIDC issuer URL and subject SAN regex. When policy_bundle.require_rekor_inclusion: true, a Rekor transparency-log entry is additionally required.
Verification also runs on every hot reload. A reload whose bundle fails
verification is rejected with result=reject_signature in
sockguard_config_reload_total and never touches the running policy. The
trust material (enabled, allowed_signing_keys, allowed_keyless,
require_rekor_inclusion, verify_timeout) is reload-immutable so a SIGHUP
cannot silently widen the set of accepted signers; only signature_path is
reload-mutable so an operator can re-sign without a restart.
The verified signer (keyed:<spki-fingerprint> or keyless:<issuer>:<san>)
and the YAML's SHA-256 digest are stamped onto GET /admin/policy/version in
the bundle_signer and bundle_digest fields, giving operators a tamper-evident
audit trail of exactly who signed the running policy and over which bytes.
Layer 1: Transport Admission
Non-loopback TCP listeners require mutual TLS 1.3 by default via listen.tls. Plaintext remote TCP is rejected unless you set both listen.insecure_allow_plain_tcp: true and listen.insecure_allow_unauthenticated_clients: true — two deliberate acknowledgments (one without the other is rejected) for legacy compatibility on a private network. Unix socket listeners bypass this layer because they are filesystem-bounded. listen.tls.client_ca_file defines the issuing trust root, and the optional listen.tls.common_names, dns_names, ip_addresses, uri_sans, and public_key_sha256_pins fields can narrow that trust to specific verified client certificates instead of implicitly accepting every client cert issued by the configured CA.
Layer 2: Client Admission
clients.allowed_cidrs gates incoming TCP callers by source CIDR before any rule evaluation runs. When clients.container_labels.enabled is true, Sockguard resolves the calling container by source IP and enforces per-client com.sockguard.allow.<method> label allowlists in addition to the global rule set.
Named client profiles sit on top of that admission layer. Sockguard can now select a per-client ruleset and request-body policy by source IP, verified mTLS certificate selectors (common_names, dns_names, ip_addresses, uri_sans, spiffe_ids, public_key_sha256_pins), or unix peer credentials (uids, gids, pids), with a configurable default profile for unmatched callers. That turns one proxy from "one ruleset in front of Docker" into a shared control plane for multiple consumers without collapsing back to broad allowlists.
Layer 3: Method Filtering
Block entire HTTP methods. Most consumers only need GET (read-only mode).
Layer 4: Path Filtering
Allow or deny specific Docker API endpoint paths using glob patterns. Before matching, Sockguard strips Docker API version prefixes (/v1.45/), percent-decodes the path (including double-encoded separators and mixed-case escapes such as %2F, %2E, and %252F), and resolves . / .. segments via path.Clean. That means /v1.45/containers/%2e%2e/images/json canonicalizes to /images/json before the glob matcher sees it, so adversarial path shapes cannot slip past a literal allowlist or skip a request-body inspector.
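The canonicalization steps can be sketched in Go as follows. The helper canonicalize, the decode-pass bound, and the order of the steps are assumptions of this example rather than Sockguard's actual code; only path.Clean is named by the text above.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"path"
	"regexp"
)

// versionPrefix matches a leading Docker API version segment such as /v1.45.
var versionPrefix = regexp.MustCompile(`^/v[0-9]+\.[0-9]+(/|$)`)

// maxDecodePasses bounds how many layers of percent-encoding are unwrapped.
const maxDecodePasses = 4

// canonicalize is a hypothetical helper: it percent-decodes until the path
// stops changing, resolves dot segments, then drops the version prefix.
func canonicalize(raw string) (string, error) {
	p := raw
	for i := 0; i < maxDecodePasses; i++ {
		dec, err := url.PathUnescape(p)
		if err != nil {
			return "", err // malformed escape: fail closed
		}
		if dec == p {
			clean := path.Clean("/" + p)
			return versionPrefix.ReplaceAllString(clean, "/"), nil
		}
		p = dec
	}
	return "", errors.New("too many layers of percent-encoding")
}

func main() {
	for _, raw := range []string{
		"/v1.45/containers/%2e%2e/images/json",
		"/containers/%252E%252E/images/json",
		"/v1.45/containers/json",
	} {
		got, err := canonicalize(raw)
		fmt.Println(raw, "->", got, err)
	}
}
```

Because matching happens on the returned value, a glob such as /images/json sees the same string no matter how the caller encoded the request.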
Layer 5: Request Body Inspection
POST /containers/create bodies are parsed on every request and denied when they contain dangerous configuration:
- HostConfig.Privileged: true
- HostConfig.NetworkMode: host
- HostConfig.PidMode: host
- HostConfig.IpcMode: host
- HostConfig.UsernsMode: host
- A non-empty HostConfig.Sysctls map (kernel parameter tuning), unless request_body.container_create.allow_sysctls is set
- Any bind mount whose source is outside request_body.container_create.allowed_bind_mounts
- Any HostConfig.Devices host path outside request_body.container_create.allowed_devices
- HostConfig.DeviceRequests unless explicitly allowed
- HostConfig.DeviceCgroupRules unless explicitly allowed
Five further HostConfig fields are denied unconditionally — no policy setting opts back in — because each one opens a namespace-escape or privilege-escalation path: VolumesFrom, UTSMode: host, a non-empty CgroupParent, GroupAdd, and ExtraHosts.
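A reduced version of this inspection might look like the Go sketch below. It decodes only a handful of the HostConfig fields listed above; the type createRequest, the function denyReason, and the reason strings are hypothetical, and bind-mount and device checks are omitted for brevity.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// createRequest decodes only the policy-relevant subset of the create body.
type createRequest struct {
	HostConfig struct {
		Privileged  bool
		NetworkMode string
		PidMode     string
		IpcMode     string
		UsernsMode  string
		Sysctls     map[string]string
		VolumesFrom []string
	}
}

// denyReason (hypothetical helper) returns a non-empty reason when the body
// should be denied. A decode error should also be treated as a denial.
func denyReason(body []byte, allowSysctls bool) (string, error) {
	var req createRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return "", err
	}
	hc := req.HostConfig
	switch {
	case hc.Privileged:
		return "privileged container", nil
	case hc.NetworkMode == "host":
		return "host network namespace", nil
	case hc.PidMode == "host":
		return "host PID namespace", nil
	case hc.IpcMode == "host":
		return "host IPC namespace", nil
	case hc.UsernsMode == "host":
		return "host user namespace", nil
	case len(hc.Sysctls) > 0 && !allowSysctls:
		return "sysctls not allowed", nil
	case len(hc.VolumesFrom) > 0:
		return "VolumesFrom is always denied", nil
	}
	return "", nil
}

func main() {
	body := []byte(`{"Image":"alpine","HostConfig":{"PidMode":"host"}}`)
	fmt.Println(denyReason(body, false))
}
```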
POST /containers/*/exec and POST /exec/*/start can also be inspected now. When request_body.exec.allowed_commands is configured, Sockguard denies argv vectors that match no allowlist entry, denies privileged exec unless explicitly allowed, denies root-user exec unless explicitly allowed, and re-inspects POST /exec/*/start against Docker's stored exec metadata before the command runs. Each allowlist entry is an argv template whose tokens are sockguard globs (* matches a run of non-slash characters, ** matches any sequence): a command matches when its token count equals an entry's and every token matches the glob at that position, so an exec whose argv carries a variable component — a run ID, timestamp, or generated path — can be allowlisted without enumerating every literal form. Keep glob tokens as tight as the use case allows; a token of ** matches anything.
The exec-start re-inspection is necessarily best effort: Docker exposes metadata inspection and exec start as separate API calls, so Sockguard cannot make the check atomic with the eventual start operation. Treat this as a narrow TOCTOU window inherent to Docker's API shape, and prefer tight exec allowlists plus conservative per-client profile assignment for clients that do not need interactive command execution.
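The argv-template matching rule (equal token count, every token matching the glob at its position) can be sketched like this. The translation of globs to regular expressions and the names globToRegexp and argvAllowed are choices of this example, not a description of Sockguard's matcher.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// globToRegexp translates one glob token: "**" matches any sequence, "*"
// matches a run of non-slash characters, and everything else is literal.
func globToRegexp(glob string) *regexp.Regexp {
	var b strings.Builder
	b.WriteString("(?s)^")
	for i := 0; i < len(glob); i++ {
		switch {
		case strings.HasPrefix(glob[i:], "**"):
			b.WriteString(".*")
			i++ // consume the second star
		case glob[i] == '*':
			b.WriteString("[^/]*")
		default:
			b.WriteString(regexp.QuoteMeta(glob[i : i+1]))
		}
	}
	b.WriteString("$")
	return regexp.MustCompile(b.String())
}

// argvAllowed (hypothetical helper) reports whether argv matches at least one
// allowlist entry token for token.
func argvAllowed(argv []string, allowlist [][]string) bool {
	for _, entry := range allowlist {
		if len(entry) != len(argv) {
			continue
		}
		matched := true
		for i, tok := range entry {
			if !globToRegexp(tok).MatchString(argv[i]) {
				matched = false
				break
			}
		}
		if matched {
			return true
		}
	}
	return false
}

func main() {
	allow := [][]string{{"pg_dump", "--file", "/backups/*.sql"}}
	fmt.Println(argvAllowed([]string{"pg_dump", "--file", "/backups/run-42.sql"}, allow))
	fmt.Println(argvAllowed([]string{"pg_dump", "--file", "/backups/a/b.sql"}, allow))
}
```

The second call fails because * cannot cross a slash, which is why a single * token is usually a tighter choice than **.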
POST /images/create is inspected by default. Sockguard blocks fromSrc imports unless explicitly allowed and constrains pulls to Docker Hub official images unless the operator opts into allow_all_registries or an explicit registry allowlist.
POST /build is inspected by default. Sockguard blocks remote build contexts, networkmode=host, and Dockerfiles containing RUN instructions unless those behaviors are explicitly allowed.
POST /services/create and POST /services/*/update are inspected by default. Sockguard blocks host-network services, bind mounts outside request_body.service.allowed_bind_mounts, and service images outside the configured official/allowlisted registry set.
POST /volumes/create, POST /secrets/create, and POST /configs/create are inspected by default. Sockguard blocks non-local volume drivers and driver options unless explicitly allowed, and blocks custom or template drivers on secrets/configs unless explicitly allowed.
POST /swarm/init, POST /swarm/join, and POST /swarm/update are inspected by default. Sockguard blocks ForceNewCluster, external CA configuration, non-allowlisted join targets, token rotations, manager unlock-key rotations, manager autolock, and signing-CA updates unless explicitly allowed.
POST /plugins/pull, POST /plugins/*/upgrade, POST /plugins/*/set, and POST /plugins/create are inspected by default. Sockguard constrains remote registries, privilege grants, plugin-set assignments, local plugin tar config.json, host mounts, device exposure, and capability requests unless explicitly allowed. POST /plugins/create is treated as multipart/form-data as well as raw tar: Sockguard spools the upload to a temporary file, parses the multipart envelope, extracts config.json from the embedded tar, and applies the same plugin policy it applies to POST /plugins/pull. Uploads without a parseable config.json, or whose config.json fails policy, are denied before the body reaches Docker.
POST /networks/create, POST /networks/*/connect, and POST /networks/*/disconnect are inspected by default. Sockguard blocks custom network drivers, swarm/ingress/attachable/config-only networks, custom IPAM drivers/config/options, driver options, endpoint static IP/MAC/alias/driver options, and forced disconnects unless explicitly allowed.
POST /containers/*/update and PUT /containers/*/archive are inspected by default. Sockguard blocks restart-policy/resource-control changes, privileged/device/capability-like update fields, unsafe archive target paths, tar traversal, setuid/setgid entries, device nodes, and escaping symlinks/hardlinks unless explicitly allowed.
POST /images/load is inspected by default. Image archives are denied unless image-load policy allows matching registries or untagged images; Docker manifest.json repo tags are checked against the same official/registry allowlist model used for pulls.
POST /swarm/unlock and POST /nodes/*/update are inspected by default. Swarm unlock is denied unless explicitly allowed, and node updates block role, availability, name, and arbitrary label mutations unless the corresponding node policy permits them. The default owner-label key remains allowed for controlled node claims.
Bounded JSON/tar inspectors read request bodies under per-endpoint byte caps and return 413 Payload Too Large when those caps are exceeded, instead of streaming unbounded bodies into memory. A malformed or hostile client cannot tie up the filter or the Docker daemon with oversized payloads — the bounded reader short-circuits before the JSON decode or tar parse even begins. The filter also applies a 30-second read deadline to the request body before an inspector runs, so a client that opens a request and then dribbles the body slowly cannot pin the inspector indefinitely. On the upstream side, the reverse-proxy and side-channel transports set a 30-second response-header timeout, so a Docker daemon that accepts a connection but never replies cannot pin a goroutine; streaming endpoints send headers promptly and are unaffected.
These inspectors intentionally decode only the Docker request fields Sockguard actually enforces. They are not full Docker-schema validators, so full payload validation still belongs to Docker once Sockguard has checked the policy-relevant subset.
The remaining blind-write guardrail covers body-bearing writes Sockguard still cannot constrain safely, chiefly arbitrary exec without an allowlist, POST /swarm/join without configured allowed_join_remote_addrs, and plugin setting writes without allowed assignment prefixes. Validation refuses to start with those rules allowed unless you explicitly set insecure_allow_body_blind_writes: true, to keep the enforcement boundary honest.
Sockguard now applies the same honesty rule to raw archive/export and log/attach streaming reads. Validation refuses to start broad read rules that would expose GET /containers/*/archive, GET /containers/*/export, GET /containers/*/logs, GET /containers/*/attach/ws, POST /containers/*/attach, GET /services/*/logs, GET /tasks/*/logs, GET /images/get, or GET /images/*/get unless you explicitly set insecure_allow_read_exfiltration: true. Because this validation layer only sees method + path, /containers/*/logs is treated conservatively whether or not the caller also sets follow=1. That keeps backup/export and raw-stream use cases possible without letting a casual GET /containers/** or Tecnativa-style section gate silently include filesystem or log stream exfiltration.
Layer 6: Owner Label Isolation
When ownership.owner is set, Sockguard stamps label-capable creates and build-produced images with an owner label, injects owner filters into list/prune/events responses, and inspects target resources on individual requests to deny cross-owner access. That now covers owned containers, images, networks, volumes, services, tasks, secrets, configs, nodes, and swarm state, with service writes stamping both the service and its task template so downstream tasks inherit the same owner identity, /nodes using Docker's node.label filter key, and unlabeled node/swarm resources only claimable through their update paths. This turns one shared Docker socket into N isolated identity views.
Layer 7: Visibility-Controlled Reads
Sockguard's response filter applies to known protected Docker JSON response shapes on successful body-bearing 2xx responses across request methods, not only GET 200. If a protected successful response cannot be parsed or sanitized safely, Sockguard fails closed with a generic 502 instead of forwarding unsanitized data. Non-success responses, HEAD responses, no-body statuses, non-protected paths, and streaming endpoints (logs, attach, events) pass through unmodified — those are protected by request-side rules and the read-side exfiltration guardrail, not by response rewriting.
Together with request-side visibility and exfiltration guardrails, the read-side layer narrows what callers can see:
- Inject label visibility selectors into GET /containers/json, GET /images/json, GET /networks, GET /volumes, and GET /events
- Inject label visibility selectors into GET /services, GET /tasks, GET /secrets, GET /configs, and GET /nodes
- Return 404 for hidden targets on inspect/log-style reads such as GET /containers/*/json, GET /images/*/json, GET /networks/*, GET /volumes/*, GET /exec/*/json, GET /services/*, GET /services/*/logs, GET /tasks/*, GET /tasks/*/logs, GET /secrets/*, GET /configs/*, GET /nodes/*, and GET /swarm
- Fail startup unless raw archive/export and stream-style reads are explicitly acknowledged via insecure_allow_read_exfiltration: true
- Redact Config.Env on GET /containers/*/json
- Redact HostConfig.Binds host paths plus Mounts[*].Source on container list/inspect responses
- Redact volume Mountpoint on GET /volumes and GET /volumes/*
- Redact container and network address topology on container/network list and inspect responses
- Redact service/task env, mount, secret/config-reference, and network metadata
- Redact config payload data, plugin env/path metadata, node/swarm TLS material, swarm join/unlock material, and /info plus /system/df topology-sensitive fields
Single-resource inspect denials honor rollout mode: under a profile in warn or audit mode, a target that visibility policy would hide is forwarded upstream with a would_deny audit verdict instead of being hard-404'd, so visibility policy can be staged like every other deny gate. When response.name_patterns or response.image_patterns filter a list response, Sockguard buffers the upstream body under an 8 MiB cap and rejects a larger response with a 502 rather than buffering it unbounded.
These controls are on by default where they are pure redaction because runtime env vars routinely carry credentials and Docker read APIs expose raw host mount paths plus internal network layout.
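As an illustration of one such redaction, the sketch below blanks Config.Env in a container inspect body while leaving every other field untouched. The helper redactEnv is hypothetical, and replacing the value with an empty list is an assumption; the actual replacement Sockguard emits is not specified here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// redactEnv replaces Config.Env with an empty list. Any parse failure is
// returned as an error so the caller can fail closed instead of forwarding
// the unsanitized body.
func redactEnv(body []byte) ([]byte, error) {
	var doc map[string]json.RawMessage
	if err := json.Unmarshal(body, &doc); err != nil {
		return nil, err
	}
	rawCfg, ok := doc["Config"]
	if !ok {
		return body, nil // nothing to redact
	}
	var cfg map[string]json.RawMessage
	if err := json.Unmarshal(rawCfg, &cfg); err != nil {
		return nil, err
	}
	if cfg == nil {
		return body, nil // "Config": null
	}
	if _, hasEnv := cfg["Env"]; hasEnv {
		cfg["Env"] = json.RawMessage(`[]`)
	}
	newCfg, err := json.Marshal(cfg)
	if err != nil {
		return nil, err
	}
	doc["Config"] = newCfg
	return json.Marshal(doc)
}

func main() {
	in := []byte(`{"Id":"abc","Config":{"Env":["TOKEN=s3cret"],"Image":"alpine"}}`)
	out, err := redactEnv(in)
	fmt.Println(string(out), err)
}
```

Keeping untouched fields as raw JSON means the sketch does not need to model the rest of Docker's inspect schema.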
Layer 8: Structured Access And Audit Logging
Every request is stamped with a proxy-generated canonical X-Request-Id and logged with method, raw path, normalized_path, decision, matched rule index, selected client profile when present, latency, request ID, trace context, and client metadata. If the caller supplied its own request ID, Sockguard preserves it separately as client_request_id in logs instead of trusting it as the canonical correlation key.
path is the client-controlled URL path exactly as received and is retained for forensic replay. Detection logic, SIEM grouping, and policy analysis should use normalized_path, which is the canonical path after Sockguard strips Docker API version prefixes, decodes escaped separators, and resolves dot segments before rule evaluation.
When log.audit.enabled is true, Sockguard also emits a dedicated JSON audit event with a stable schema: request ID, client request ID, trace ID, trace parent/span IDs, sampled flag, raw and normalized path, decision, machine-readable reason_code, human-readable reason, matched rule, selected profile, flattened actor and transport identity fields, ownership context, and final HTTP status. Upstream reverse-proxy errors overwrite the audit reason code with bounded values such as upstream_socket_unreachable or upstream_response_rejected_by_policy, so the terminal result remains explicit even after an allow decision has already been made.
The audit ownership object is emitted on every event. If ownership.owner is configured, that owner identifier is repeated in every audit record, not only resource ownership decisions, so it should be a non-secret tenant/workload label suitable for the audit sink.
Sockguard preserves valid W3C traceparent trace IDs and sampled flags, forwards a proxy-local span ID, and includes trace_id, trace_parent_id, trace_span_id, and trace_sampled in access, audit, and upstream reverse-proxy error logs. Invalid or absent trace context starts a fresh local trace without enabling any OTLP span exporter.
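Validation of an incoming traceparent header can be sketched as below, following the W3C Trace Context layout of version, trace ID, parent ID, and flags. The type traceContext and function parseTraceparent are hypothetical, and the sketch accepts only version 00.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// traceContext holds the fields worth preserving from a valid header.
type traceContext struct {
	TraceID  string
	ParentID string
	Sampled  bool
}

// isLowerHex reports whether s is exactly n lowercase hexadecimal characters.
func isLowerHex(s string, n int) bool {
	if len(s) != n {
		return false
	}
	for _, c := range s {
		if !(c >= '0' && c <= '9' || c >= 'a' && c <= 'f') {
			return false
		}
	}
	return true
}

// parseTraceparent returns ok=false for anything that is not a well-formed
// version-00 header, in which case a fresh local trace would be started.
func parseTraceparent(h string) (traceContext, bool) {
	parts := strings.Split(h, "-")
	if len(parts) != 4 || parts[0] != "00" {
		return traceContext{}, false
	}
	traceID, parentID, flags := parts[1], parts[2], parts[3]
	if !isLowerHex(traceID, 32) || !isLowerHex(parentID, 16) || !isLowerHex(flags, 2) {
		return traceContext{}, false
	}
	// All-zero IDs are explicitly invalid in the specification.
	if traceID == strings.Repeat("0", 32) || parentID == strings.Repeat("0", 16) {
		return traceContext{}, false
	}
	f, _ := strconv.ParseUint(flags, 16, 8) // already validated as hex
	return traceContext{TraceID: traceID, ParentID: parentID, Sampled: f&1 == 1}, true
}

func main() {
	tc, ok := parseTraceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	fmt.Println(tc.TraceID, tc.ParentID, tc.Sampled, ok)
}
```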
When health.watchdog.enabled is true, Sockguard actively probes the upstream Docker socket, logs reachable/unreachable state transitions, and lets /health reflect the latest watchdog state. When metrics.enabled is true, Sockguard serves Prometheus text metrics from /metrics by default, including a sockguard_build_info{version,commit,build_date,go_version} gauge, a sockguard_start_time_seconds gauge, and watchdog state and check counters if the watchdog is enabled. The scrape endpoint is local to Sockguard, is never forwarded to Docker, bypasses Docker API allow rules like /health, and remains behind listener security plus client ACLs.
Dangerous Docker API Endpoints
| Risk Level | Endpoints |
|---|---|
| Critical | POST /containers/create, POST /containers/{id}/exec, POST /exec/{id}/start, PUT /containers/{id}/archive |
| High | POST /images/create, POST /images/load, POST /build, POST /services/create, POST /services/{id}/update, POST /swarm/init, POST /swarm/join, POST /swarm/update, POST /swarm/unlock, POST /nodes/{id}/update, POST /plugins/pull, POST /plugins/{name}/upgrade, POST /plugins/{name}/set, POST /plugins/create |
| Medium | POST /containers/{id}/update, POST /volumes/create, POST /networks/create, POST /networks/{id}/connect, POST /networks/{id}/disconnect, POST /secrets/create, POST /configs/create, DELETE /containers/{id} |
| Low | GET /containers/json, GET /events, GET /version, GET /_ping |
Image Security
Sockguard's container image is built on Wolfi (Chainguard):
- Minimal package set, which keeps the base image's CVE exposure low
- Built-in SBOM output and build provenance when release visibility supports attestations
- Cosign-signed for verification; see the image verification guide for the canonical cosign verify invocation
- No shell, no package manager in production image
Runtime Hardening
Sockguard runs as UID 65532 (Chainguard nonroot) inside the container.
On stock Docker hosts where /var/run/docker.sock is owned by the docker
group you may need a group_add: [docker] override or a matching GID. For a
Docker socket proxy, the real security frontier is what the daemon will accept
through the proxy, not the UID the proxy process reports after it has already
opened the upstream socket.
The runtime controls that matter are:
- Correct policy rules and request-body inspection
- read_only: true
- cap_drop: [ALL]
- security_opt: ["no-new-privileges:true"]
- Docker's default seccomp profile or a stricter custom profile
- AppArmor/SELinux confinement on the host
- Rootless dockerd on the host when available
The getting-started examples use the container-level controls above by default so the drop-in path stays simple without hiding the real hardening story.
Known Limitations
These are architectural constraints inherent to Sockguard's position in the stack. They are documented here for honest operator awareness rather than as open bugs.
IP-based client identity is soft isolation. When
clients.container_labels.enabled is true, or any clients.profiles[*].match
rule keys on source_cidrs, Sockguard resolves the calling container by
source IP through the Docker API. This is soft isolation — adequate against
configuration drift and friendly-fire mistakes, but not a hard boundary
against an attacker who can influence which container a given bridge IP
points at:
- A container restart can race the label lookup: if a new container acquires the same bridge IP before the lookup completes, the lookup may return the new container's labels rather than the previous container's.
- An attacker who can create containers on the same user-defined bridge can, in principle, claim a privileged IP and inherit its policy until the next legitimate container takes it back.
- Host-network containers (network_mode: host) all share the host IP, so IP-keyed allowlists cannot tell them apart.
For workloads where caller identity is part of the security boundary, listen
on a unix socket and use clients.unix_peer_profiles with uids/gids.
SO_PEERCRED is supplied by the kernel and cannot be spoofed from within
the calling container — that is the hard-isolation path.
Exec TOCTOU (inspect/start split). Docker exposes exec metadata inspection
and exec start as separate API calls. Sockguard re-checks POST /exec/*/start
against Docker's stored exec metadata before the command runs, but the gap
between that metadata check and the start call is an unavoidable
time-of-check/time-of-use window inherent to Docker's API shape. Keep exec
allowlists narrow and client profile assignments conservative for clients
that do not need interactive command execution.
Hijacked-stream redaction limits. Sockguard's response filter — including
the response.redact_container_env, response.redact_mount_paths,
response.redact_network_topology, and response.redact_sensitive_data
toggles — operates only on structured JSON responses with known shapes. Raw
streaming endpoints — GET /containers/*/logs, POST /containers/*/attach,
GET /services/*/logs, GET /events, exec attach, and image-build progress
output — switch the connection to a raw byte stream (or a non-JSON
chunked stream) after the initial HTTP response, at which point Sockguard
cannot inspect or redact the byte stream. A secret an application writes
to its own stdout will reach a caller that has been allowed to attach.
These paths are gated at request time via per-profile rule allowlists and
the insecure_allow_read_exfiltration guardrail (which keeps the streaming
read endpoints denied by default), but there is no post-admission byte-level
filtering of the stream content. Restrict these paths in your rules to only
the profiles and callers that genuinely need them, and treat the redaction
toggles as a guarantee for Docker's structured metadata only — not for
arbitrary workload output.
Migration
Drop-in migration paths from Tecnativa, LinuxServer, wollomatic, 11notes, and CetusGuard socket proxies — current env compatibility, same intent, stronger inspection underneath.
Image Verification
Verify sockguard images and release tarballs with cosign keyless GitHub Actions OIDC signatures.