Enterprise collector deployment
Enterprise collector rollout is the target deployment model for fleets of 100 or more hosts. The current public install path still uses hand-issued mTLS certificates for evaluation and small pilots; that is not the intended long-term production workflow.
ARIS uses published Docker Hub images for core and web, and is moving toward a bounded bootstrap enrollment model for collectors. The following now exist as MVP primitives: the server-side enrollment data model, token hashing rules, development CA signer, external-command signer integration point, admin enrollment API MVP, collector enrollment endpoint, initial aris-collector enroll command, renewal endpoint, explicit aris-collector renew command, automatic renewal scheduling in aris-collector run, fleet-health heartbeat API, initial fleet-health UI, central config API lifecycle, and aris-collector diag command. The full customer-facing workflow is still incomplete because polished fleet revocation and rollback UI controls have not shipped. The target workflow is:
- An admin creates an enrollment policy that defines accepted collector identities, certificate lifetime, platform constraints, and rollout limits.
- An admin creates short-lived bootstrap tokens scoped to that policy.
- Fleet tooling installs the collector package and delivers the bootstrap token through a protected channel.
- The collector generates its private key locally, requests a client certificate from core, writes its config, and validates the install.
- The collector forwards over mTLS and later renews its certificate automatically without reusing the bootstrap token.
Manual OpenSSL certificate generation remains useful for labs, demos, and break-glass recovery. Do not treat one manually copied certificate per host as the production plan for a broad rollout.
Current support
Today, production ingest uses mTLS with manually provisioned collector client certificates:
- core verifies collector certificates against ARIS_INGEST_CLIENT_CA_FILE;
- each collector pins core's ingest CA and server name;
- each collector certificate must carry exactly one DNS or URI Subject Alternative Name (SAN) identity;
- Common Name (CN)-only collector certificates are not accepted for identity;
- core derives the collector host_id from that single DNS or URI SAN and rejects envelopes that claim another identity;
- when ARIS_INGEST_REQUIRE_ENROLLMENT=true, core also requires an active enrolled certificate row and active collector identity row before accepting ingest.
Use a stable identity value your PKI or enrollment flow can issue, such as DNS SAN eng-laptop-42.acme.internal or URI SAN spiffe://acme.internal/device/eng-laptop-42. In secure mode the collector derives host_id from the certificate SAN; if host_id is set explicitly in collector.yaml, it must match the SAN exactly.
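The identity rules above can be sketched as a small check. This is a minimal illustration, not ARIS code: the (kind, value) SAN pairs and the derive_host_id helper are hypothetical, and the sketch assumes SAN types other than DNS and URI are simply ignored when counting identities.

```python
from typing import List, Optional, Tuple

# Each SAN is modeled as a (kind, value) pair, e.g. ("DNS", "host.example").
# Hypothetical in-memory shape for illustration, not an ARIS data structure.
San = Tuple[str, str]


def derive_host_id(sans: List[San], configured_host_id: Optional[str] = None) -> str:
    """Return the identity from exactly one DNS or URI SAN, else raise ValueError."""
    # Assumption: only DNS and URI entries count as identity candidates.
    identities = [value for kind, value in sans if kind in ("DNS", "URI")]
    if len(identities) != 1:
        raise ValueError(
            f"expected exactly one DNS or URI SAN, found {len(identities)}"
        )
    host_id = identities[0]
    # An explicit host_id in collector.yaml must match the SAN exactly.
    if configured_host_id is not None and configured_host_id != host_id:
        raise ValueError("configured host_id does not match certificate SAN")
    return host_id
```

A fleet pre-flight script could run a check like this against issued certificates before rollout, so identity mismatches surface before core rejects envelopes.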
Current manual mTLS support does not include background automatic certificate renewal or ARIS-managed CRL/OCSP checks. If a collector private key is exposed, replace that host's certificate and key. The old certificate can remain accepted until expiry unless ARIS_INGEST_REQUIRE_ENROLLMENT=true and the certificate is revoked in ARIS state through the admin enrollment API, or unless revocation is enforced by customer PKI/TLS infrastructure, an external proxy, or collector client CA rotation.
See "Deploy the collector" for the current evaluation setup flow.
Published images
Core, web, and collector runtime images are published on Docker Hub:
| Component | Image |
|---|---|
| Core API, directory sync, enrollment management, and ingest | ryora/aris-core:<version> |
| Operator UI | ryora/aris-web:<version> |
| Collector runtime image for containerized deployments and smoke tests | ryora/aris-collector:<version> |
Use the version and digest from your ARIS release notes. For production and mirrors, pin by digest:
ryora/aris-core:1.2.3@sha256:<digest>
ryora/aris-web:1.2.3@sha256:<digest>
ryora/aris-collector:1.2.3@sha256:<digest>
Air-gapped installs should mirror the images, SBOMs, release manifest, package repository metadata, and detached signatures together.
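Mirror or deployment tooling can reject references that are not pinned by digest. The following is a rough sketch under stated assumptions: the regular expression and the is_digest_pinned helper are illustrative, cover only Docker Hub style names such as the ones above, and are not part of ARIS.

```python
import re

# Matches references shaped like name:version@sha256:<64 lowercase hex chars>.
# Assumption: no registry host or port prefix, as in the Docker Hub names above.
PINNED_IMAGE = re.compile(
    r"^(?P<name>[a-z0-9]+(?:[._/-][a-z0-9]+)*)"
    r":(?P<version>[A-Za-z0-9_][A-Za-z0-9_.-]{0,127})"
    r"@sha256:(?P<digest>[0-9a-f]{64})$"
)


def is_digest_pinned(reference: str) -> bool:
    """Return True when the reference carries both a version tag and a digest."""
    return PINNED_IMAGE.match(reference) is not None
```

References that point at a private mirror with a host:port prefix would need an extended pattern.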
Enrollment MVP
Core exposes admin-scoped endpoints to create enrollment policies, bootstrap tokens, and revocation actions:
| Endpoint | Purpose |
|---|---|
| POST /v1/collector/enrollment-policies | Create an enrollment policy. |
| POST /v1/collector/enrollment-policies/{id}/tokens | Create a short-lived bootstrap token; the raw token is shown only in this response. |
| POST /v1/collector/enrollment-policies/{id}/freeze | Stop future enrollment under a policy. |
| POST /v1/collector/tokens/{id}/revoke | Revoke a bootstrap token. |
| POST /v1/collector/identities/{id}/revoke | Revoke an enrolled collector identity and its active certificates. |
| POST /v1/collector/certificates/{id}/revoke | Revoke one enrolled collector certificate. |
Policy creation rejects negative enrollment limits and caps enrollment certificate TTLs at 30 days. Token creation rejects negative use limits and caps bootstrap-token TTLs at seven days. A zero ttl_seconds keeps the server default of one hour. Freeze and revoke requests can include an optional audit reason code such as device_retired or key_exposed; reason codes must match ^[a-z][a-z0-9_]{0,63}$.
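Automation that calls these endpoints can pre-check inputs before sending them. The sketch below mirrors the documented reason-code pattern and token TTL limits. The helper names are hypothetical, and it assumes that "caps" means an oversized TTL is reduced to the limit and that a negative TTL is an error; core may behave differently, for example by rejecting oversized values.

```python
import re

REASON_CODE = re.compile(r"^[a-z][a-z0-9_]{0,63}$")
MAX_TOKEN_TTL_SECONDS = 7 * 24 * 3600   # seven-day bootstrap-token TTL cap
DEFAULT_TOKEN_TTL_SECONDS = 3600        # server default of one hour


def is_valid_reason_code(code: str) -> bool:
    """Check an audit reason code against the documented pattern."""
    # fullmatch avoids accepting a trailing newline after the code.
    return REASON_CODE.fullmatch(code) is not None


def effective_token_ttl(ttl_seconds: int) -> int:
    """Estimate the TTL a bootstrap token would end up with."""
    if ttl_seconds < 0:
        # Assumption: negative TTLs are treated as invalid input.
        raise ValueError("ttl_seconds must not be negative")
    if ttl_seconds == 0:
        return DEFAULT_TOKEN_TTL_SECONDS
    # Assumption: values above the cap are reduced to the cap.
    return min(ttl_seconds, MAX_TOKEN_TTL_SECONDS)
```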
When core is started with an enrollment signer, it exposes POST /v1/collector/enroll. The local CA mode uses ARIS_ENROLLMENT_SIGNER_MODE=local-ca with ARIS_ENROLLMENT_CA_CERT_FILE and ARIS_ENROLLMENT_CA_KEY_FILE. Production environments that cannot load a CA private key into core can use ARIS_ENROLLMENT_SIGNER_MODE=external-command, ARIS_ENROLLMENT_SIGNER_COMMAND=/absolute/path/to/signer, and ARIS_ENROLLMENT_CA_CERT_FILE as the trust bundle for returned certificates; core sends the CSR/profile as JSON on stdin, verifies the returned certificate chain, and validates the returned certificate profile before recording it. The collector MVP command can then enroll against an admin-created policy/token:
aris-collector enroll \
--core https://core.example.com \
--ingest core.example.com:8443 \
--token-file /etc/aris/collector/enrollment.token \
--device-id eng-laptop-42 \
--identity-prefix spiffe://acme.internal/device/ \
--ca-bundle /etc/aris/collector/core-ca.crt
Use --token-stdin instead of --token-file when fleet tooling can pipe the secret. Use --ingest-ca-bundle when the ingest server CA differs from the management CA. Do not pass bootstrap tokens in command-line arguments.
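As a sketch of how fleet tooling might pipe the secret, the following wrapper builds the enroll command from the flags shown above and passes the token only on standard input. The function names are hypothetical, and it assumes --token-stdin takes no value and reads the whole token from stdin.

```python
import subprocess
from typing import List


def build_enroll_argv(core_url: str, ingest: str, device_id: str,
                      identity_prefix: str, ca_bundle: str) -> List[str]:
    """Build the enroll command line; the token is deliberately absent."""
    return [
        "aris-collector", "enroll",
        "--core", core_url,
        "--ingest", ingest,
        "--token-stdin",
        "--device-id", device_id,
        "--identity-prefix", identity_prefix,
        "--ca-bundle", ca_bundle,
    ]


def enroll(token: str, argv: List[str]) -> int:
    """Run enrollment, passing the bootstrap token only on stdin."""
    # The token never appears in argv, so it stays out of process listings.
    completed = subprocess.run(argv, input=token.encode("utf-8"), check=False)
    return completed.returncode
```

Keeping argument construction separate from execution makes it easy to assert in tests that the secret never reaches the command line.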
The enrollment management URL should be HTTPS. HTTP enrollment is accepted only with an explicit loopback-development flag and is not appropriate for fleet rollout.
For HTTPS enrollment, the generated collector config includes management.core_url and leaves management.server_name unset so TLS verifies the management URL host. A running enrolled collector uses that URL for mTLS management heartbeats. Loopback HTTP development enrollment intentionally omits management.core_url because heartbeat authentication requires HTTPS client certificates.
Delete the bootstrap token file after successful enrollment. If the generated config is intended to run as root, pass --allow-root-acknowledged I-UNDERSTAND-ROOT-MODE; other values will not satisfy runtime validation.
If a reverse proxy or load balancer sits in front of core enrollment, terminate TLS there and forward to core over a trusted private path. The current MVP source-IP policy and failed-use throttle use the direct remote address observed by aris-server; proxy-aware trusted header extraction is not implemented yet.
Renewal MVP
When core is started with an enrollment signer, it also exposes POST /v1/collector/renew. Renewal uses the current collector client certificate on an HTTPS mTLS management request, not the bootstrap token. Core checks that the presented certificate is still an active enrolled certificate for an active collector identity, requires the renewal CSR to request the same identity SAN, issues a replacement certificate through the configured signer, returns renew_after, and records a renewal audit event.
The explicit MVP command is:
aris-collector renew \
--config /etc/aris/collector/collector.yaml \
--core https://core.example.com \
--ca-bundle /etc/aris/collector/core-ca.crt
By default the command reads the current cert/key paths from collector.yaml, generates a fresh local private key, renews over mTLS, verifies that core returned the same identity and a certificate matching the new key, replaces the configured cert/key files as one logical pair with rollback on failure, and validates the config.
When management.core_url is set, aris-collector run also schedules automatic renewal. It renews before certificate expiry with deterministic per-certificate jitter, retries failed renewal attempts with a delay bounded by remaining certificate lifetime, and restarts the in-process supervisor after a successful renewal so forwarding and management calls reload the new cert/key material.
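To illustrate what deterministic per-certificate jitter and a lifetime-bounded retry delay can look like, here is a minimal sketch. It is not the collector's actual schedule: the two-thirds base point, the 10% jitter window, the one-minute backoff base, and the half-of-remaining cap are all assumptions chosen for the example.

```python
import hashlib
from datetime import datetime, timedelta


def renewal_time(not_before: datetime, not_after: datetime,
                 cert_fingerprint: str,
                 base_fraction: float = 2 / 3,
                 jitter_fraction: float = 0.1) -> datetime:
    """Pick a renewal instant that is stable for a given certificate."""
    lifetime = not_after - not_before
    # Map the fingerprint to a stable, roughly uniform value between 0 and 1.
    digest = hashlib.sha256(cert_fingerprint.encode("utf-8")).digest()
    unit = int.from_bytes(digest[:8], "big") / 2**64
    # Same certificate, same offset; different certificates spread out.
    return not_before + lifetime * (base_fraction + jitter_fraction * unit)


def retry_delay(attempt: int, remaining: timedelta,
                base: timedelta = timedelta(minutes=1)) -> timedelta:
    """Exponential backoff capped at half of the remaining lifetime."""
    backoff = base * (2 ** min(attempt, 16))
    return min(backoff, remaining / 2)
```

Deriving the jitter from the certificate rather than from a random source keeps the schedule stable across restarts while still smearing a fleet's renewals over time.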
The shipped aris-server management listener is normally run behind TLS termination and does not itself provide a management TLS listener. For renewal and heartbeat behind a reverse proxy, configure the proxy to strip any incoming client-cert header from external requests, verify the collector certificate itself, and inject the URL-escaped PEM certificate in a private header named by ARIS_ENROLLMENT_CLIENT_CERT_HEADER. Renewal and heartbeat requests also carry proof signatures made by the current collector private key, so a copied public certificate alone cannot renew or write health state. Heartbeat proof covers a sent_at timestamp that core accepts only inside a short clock-skew window. Without that trusted header configuration, renewal and heartbeat only work in embedded/test deployments where aris-server receives the client certificate directly in r.TLS.
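The header encoding can be pictured with a short round-trip sketch. It assumes the proxy percent-encodes the whole PEM block, as common proxies do for escaped client certificates; the helper names are hypothetical, and the exact escaping core expects should be confirmed against your proxy and ARIS release.

```python
from urllib.parse import quote, unquote

PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def escape_pem(pem: str) -> str:
    """Percent-encode a PEM certificate so it fits in one header value."""
    return quote(pem, safe="")


def unescape_pem(header_value: str) -> str:
    """Decode the header value and check it still looks like a PEM block."""
    pem = unquote(header_value)
    stripped = pem.strip()
    if not (stripped.startswith(PEM_HEADER) and stripped.endswith(PEM_FOOTER)):
        raise ValueError("header does not contain a PEM certificate")
    return pem
```

The decoded certificate is only trustworthy when the proxy strips inbound copies of the header and verifies the client certificate itself, as described above.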
Operators should still monitor renewal failures and keep a re-enrollment runbook for hosts that miss the renewal grace window.
Fleet health MVP
When an enrollment signer is configured, core exposes POST /v1/collector/heartbeat for enrolled collectors and GET /v1/collector/health for audit-scoped operators. Heartbeats are authenticated with the collector's current client certificate as observed directly by aris-server, or from the trusted proxy header above when ARIS_ENROLLMENT_CLIENT_CERT_HEADER is configured, plus a heartbeat proof signature from the corresponding private key. They are rejected for revoked, replaced, expired, unknown, or proof-mismatched certificates. Deploy the trusted-header mode only behind a proxy that strips inbound copies of that header and verifies collector mTLS before injection.
The health API reports enrolled count, active recent heartbeats, heartbeat thresholds, last envelope time, certificate status and expiry, collector version, config hashes, queue depth, dropped record count, renewal status, last auth failure, and derived states such as healthy, stale, cert_expiring, queue_at_risk, config_invalid, unsupported_version, renewal_failed, revoked, and enrolling. Collectors compute running_config_hash locally; optional management.desired_config_hash and management.last_known_good_config_hash values can be supplied by fleet tooling for reporting during migration to central config. Those reporting-only fields are excluded from the computed running hash. Fleet tooling should derive a comparable desired hash by running the same aris-collector binary against the candidate rendered config and copying config.running_config_hash from aris-collector diag --config <candidate>. To find collectors that enrolled and heartbeat but never forwarded, filter rows whose last_heartbeat_at is present and last_envelope_at is absent.
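The heartbeat-without-forwarding filter described above can be sketched as follows. The row shape is an assumption: only last_heartbeat_at and last_envelope_at come from this page, and the health API's actual JSON layout may nest or name rows differently.

```python
from typing import Any, Dict, List


def heartbeat_only(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return rows that have a heartbeat but no forwarded envelope."""
    return [
        row for row in rows
        # Missing, null, or empty timestamps all count as absent.
        if row.get("last_heartbeat_at") and not row.get("last_envelope_at")
    ]
```

Collectors in this list enrolled and can reach the management path but have not delivered telemetry, which usually points at ingest connectivity or source configuration rather than enrollment.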
The Web /fleet page is the first UI over this report. It lets audit-scoped operators inspect fleet state, filter by collector state, search device/identity/platform/version fields, identify desired/running/last-known-good config drift, distinguish recent heartbeats from actual envelope forwarding, and export CSVs for stale collectors, queue risk, renewal failures, and config drift. It is inspect/export only; revocation and central-config rollback remain API/runbook workflows until dedicated permission-gated action UI ships.
Core has an initial API-first central config lifecycle. Admin-scoped API callers can publish validated collector YAML, assign it as desired for a collector identity, and roll desired config back to the last-known-good hash recorded from heartbeats. Enrolled collectors can fetch assigned desired config from POST /v1/collector/config/desired over the same mTLS management path used by renewal and heartbeat, with a signed freshness proof from the collector private key. Collector-side automatic fetch/validate/activate loops and Web rollback controls are still future work.
Published collector YAML is stored and returned verbatim after validation. Keep secrets, private comments, bootstrap tokens, certificates, and private keys out of centrally managed collector YAML; reference local files or your secret manager instead.
Diagnostics MVP
aris-collector diag --config /etc/aris/collector/collector.yaml emits a JSON support bundle. The bundle includes version/commit, config validation result, forwarder endpoint and public certificate metadata, and explicit redaction notes. It does not include private keys, private key paths, bootstrap tokens, raw telemetry, or queue contents. The command exits 0 when it successfully writes the bundle, even when validation.ok is false; scripts should inspect the JSON field. Live connectivity checks and recent server-side reason codes are not yet included.
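Because the exit code does not reflect validation, a script needs to read the bundle. A minimal sketch, assuming the bundle is available as a JSON string and that validation.ok is a boolean nested under a top-level validation object; the helper name is hypothetical.

```python
import json


def diag_validation_ok(bundle_json: str) -> bool:
    """Return True only when the bundle reports validation.ok as true."""
    bundle = json.loads(bundle_json)
    validation = bundle.get("validation")
    # Treat a missing or malformed validation section as a failure.
    return isinstance(validation, dict) and validation.get("ok") is True
```

Failing closed on a missing field keeps a truncated or unexpected bundle from being read as a healthy install.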
Enterprise rollout requirements
Before rolling ARIS out beyond a small pilot with current manual mTLS support, plan for:
- private networking or a private load balancer for collector ingest;
- one durable device identity per managed host;
- no shared collector client certificates;
- certificate rotation procedures and a revocation plan appropriate to the current manual mTLS limits;
- fleet tooling such as MDM, Ansible, Jamf, Intune, package repositories, or equivalent;
- queue and disk limits sized for core outages;
- health monitoring for stale collectors, queue risk, cert expiry, auth failures, and version drift, using ARIS surfaces as they ship plus your fleet tooling where needed;
- package rollback and config rollback procedures owned by your fleet tooling until ARIS-published packages and central config are available.
For the enrollment model, also plan for short-lived bounded bootstrap credentials, scheduled renewal behavior, operator-facing identity/certificate revocation runbooks, and fleet-health monitoring. The admin enrollment API is an API-first MVP; polished UI workflows are still future work.
Hostnames are display metadata, not authoritative device identity. For broad rollout, bind collector identity to a durable source with trusted issuance or attestation, such as MDM-attested inventory, cloud instance identity, customer PKI subject with issuance controls, or a protected per-device secret. Serial numbers and asset IDs are useful claims only when they come from a trusted inventory or attested source.
Queue limits MVP
Collector queue defaults for enterprise rollout are:
| Setting | Default |
|---|---|
| buffer.max_disk_usage_mb | 100 |
| buffer.max_record_age_seconds | 604800 |
| buffer.drop_newest_on_full | false |
| buffer.queue_at_risk_bytes_percent | 80 |
| buffer.queue_at_risk_oldest_seconds | 86400 |
At the cap, the default behavior is backpressure/blocking for lossless sources. drop_newest_on_full is an explicit degraded mode for fleets that prefer freshness during outages. Drop-oldest is not supported.
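For capacity planning, the at-risk thresholds can be evaluated with a small helper. This is a sketch with the defaults from the table; it assumes a queue is at risk when either threshold is crossed and that one megabyte means 1024 * 1024 bytes, neither of which is confirmed here.

```python
def queue_at_risk(used_bytes: int, oldest_record_age_seconds: float,
                  max_disk_usage_mb: int = 100,
                  bytes_percent: int = 80,
                  oldest_seconds: int = 86400) -> bool:
    """Return True when the queue crosses the size or age threshold."""
    # Assumption: the cap is counted in binary megabytes.
    cap_bytes = max_disk_usage_mb * 1024 * 1024
    # Integer comparison avoids floating-point rounding at the boundary.
    over_bytes = used_bytes * 100 >= cap_bytes * bytes_percent
    over_age = oldest_record_age_seconds >= oldest_seconds
    # Assumption: either condition alone marks the queue as at risk.
    return over_bytes or over_age
```

With the defaults, the size threshold is reached at 83,886,080 bytes and the age threshold at one day.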
ARIS also ships aris-fleet-sim, a local deterministic JSON simulator for smoke, 100-host, and 1,000-host rollout scenarios. It models enrollment token races, outage queue growth, reconnect jitter, renewal wave smearing and peak rate, revocation, bad config, clock anomalies, disk-full hosts, corrupt queue DB hosts, long-offline recovery, and queue bottlenecks. It is useful for planning and regression checks, but it is not a substitute for a full process-level load test on customer-like infrastructure.
For Linux pilots, ARIS has internal Debian and RPM package builders that preserve the documented runtime contract and install a systemd unit with a validation ExecCondition. That guard lets fleet tooling install the package before enrollment without causing a service restart loop. The packages intentionally do not ship collector.yaml; enrollment creates it. The intended sequence is package install, token and CA staging, aris-collector enroll with package-compatible output paths, token removal, package runtime ownership repair for the generated config/key material, and then systemctl enable --now aris-collector.
Linux pilot hardening now includes an internal host-side evidence checker that records package version, systemd state, validate-config, redacted diag, and key runtime path modes as JSON. Use that evidence beside /fleet and the health API when validating healthy, stale, queue-risk, config-invalid, renewal-failed, revoked, uninstall, purge, package rollback, and config rollback scenarios. The checker is an internal pilot aid, not a published customer support bundle.
Windows pilot packaging now has an internal WiX MSI builder and Windows Service entrypoint. The first Windows mode installs aris-collector.exe under C:\Program Files\ARIS\Collector, creates a demand-start service under NT AUTHORITY\LocalService, and keeps config, state, and file-backed certificate material under C:\ProgramData\ARIS\Collector. Intune or SCCM should handle CA staging, protected one-time token-file staging, enrollment, service SID ACL repair, token cleanup, config validation, and service start. Authenticode signing, full Windows-host smoke, Windows certificate store/TPM-backed key storage, and Intune Win32 metadata remain rollout work.
Release automation also has a manifest contract. aris-release-manifest consumes a strict JSON spec for package artifacts, required SBOM files, and image references pinned as image:version@sha256:<digest> with SemVer release tags, then emits an aris.release-manifest.v1 JSON manifest with SHA-256 checksums and byte sizes. Treat that manifest as the checksum source for customer mirrors and release notes. Run manifest generation on Unix release workers for hardened symlink/path handling. The command does not sign artifacts or generate SBOMs by itself.
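A mirror operator can recompute the two values the manifest records for each artifact. The sketch below shows only the SHA-256 and byte-size calculation; the helper name is hypothetical, and how those values are keyed inside an aris.release-manifest.v1 document is not described here, so the comparison step is left to the reader.

```python
import hashlib
from pathlib import Path
from typing import Tuple


def sha256_and_size(path: Path, chunk_size: int = 1024 * 1024) -> Tuple[str, int]:
    """Return the hex SHA-256 digest and byte size of a file."""
    digest = hashlib.sha256()
    size = 0
    with path.open("rb") as handle:
        # Read in chunks so large packages do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return digest.hexdigest(), size
```

Checksums confirm integrity against the manifest only; they do not replace verifying the detached signatures.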
Enterprise security mode status: core now supports a production integration point for customer-operated signers through the external-command signer mode. Vault PKI, AWS KMS/Private CA, GCP CAS/KMS, HSMs, and offline CA workflows should be wrapped behind that contract until first-class provider clients are added. ARIS does not claim FIPS validation for its Go cryptographic module or package builders; customers with FIPS requirements should run core and signer wrappers on FIPS-validated platforms and treat the signer boundary as the controlled cryptographic module. Air-gapped installs should mirror packages, SBOMs, manifest, repository metadata, and detached signatures together; checksum-only mirrors are not sufficient for production acceptance. Hardware attestation, TPM-bound collector keys, macOS Keychain, and Windows certificate-store private-key support remain later security tiers.
ARIS_ENROLLMENT_SIGNER_COMMAND is an executable path only, not a shell command line with arguments. Keep provider-specific flags and credentials in the wrapper's own config file or managed runtime environment.
Remaining rollout roadmap
The first enterprise spine is in place, but full deployment support is sequenced behind the current Debian package MVP:
| Order | Workstream | Outcome |
|---|---|---|
| 1 | Multi-distro Linux packaging | Add RHEL 9-family RPM support, artifact signing, package repository or mirror guidance, and package smoke tests. Initial RPM builder and containerized package-smoke support are present; signing and repository automation remain. |
| 2 | macOS deployment | Ship a signed/notarized package, choose LaunchDaemon or LaunchAgent, document certificate material handling, and provide Jamf-oriented guidance. Initial unsigned LaunchDaemon package builder support is present with root-acknowledged Jamf enrollment guidance; signing, notarization, and managed-host smoke remain. |
| 3 | Fleet operations UI | Initial /fleet health workflow is present for stale collectors, renewal failures, queue risk, config drift, and exports. Revocation UI remains. |
| 4 | Linux pilot hardening | Internal evidence checker and Debian/RPM Ansible pilot support are present; real-host execution on 10-25 Linux hosts remains. |
| 5 | Windows deployment | Initial MSI/service path and Intune enrollment script are present; signing, full Windows-host smoke, certificate store/TPM key storage, and Intune metadata remain. |
| 6 | Central config lifecycle | Initial Core publish, desired assignment, collector fetch, rollback-to-last-known-good, audit, and fleet-health hash flow are present; collector automatic activation and Web actions remain. |
| 7 | Enterprise security modes | Initial external-command signer integration point is present; first-class provider clients, full external PKI lifecycle, FIPS validation claims, and attestation remain. |
Rollout stages
Use staged promotion instead of deploying to the full fleet at once:
| Stage | Size | Promotion criteria |
|---|---|---|
| Lab | 1-3 hosts | Pull digest-pinned Docker Hub images, install collector package or binary, enroll or provision certs, forward, heartbeat, diagnose, uninstall, and rollback. |
| Pilot | 5 hosts | All collectors forward and heartbeat successfully; no unexpected queue growth; /fleet shows expected state. |
| Canary | 25 hosts | Stale collector detection, renewal, package rollback, and config rollback runbooks verified. A stale collector is one that was expected to forward or heartbeat but has not been seen inside your operational threshold. |
| Initial enterprise rollout | 100 hosts | CPU, memory, queue risk, auth failures, renewal status, config drift, and forwarding success stay within thresholds agreed from lab and pilot observations. |
| Broad rollout | Remaining fleet | Compatibility and rollback path are confirmed. |