Speeding up APT with GoAPTCacher

In CI/CD pipelines and on isolated networks, apt performance often tanks because every build fetches the same packages over the WAN - sometimes from flaky mirrors. GoAPTCacher solves this with a pull-through cache: requested artifacts are stored locally and served from disk on subsequent requests. You keep your normal repository layout and package signatures; clients don’t need vendor-specific tooling. 🚀

This post explains what GoAPTCacher does, how the request flow works, when HTTPS interception makes sense (and when it’s a bad idea), and how to roll it out without shooting yourself in the foot. 😅

GoAPTCacher is open source and available on GitLab: https://gitlab.com/bella.network/goaptcacher. If you find it useful, star the repo and share your feedback! I’m also happy to help with setup questions or feature requests. 🙌

Why a pull-through cache?

APT traffic looks deceptively simple (“just download packages”), but the reality is noisy:

CI jobs are bursty: 20 runners start at once and pull the same index/package set repeatedly.
Mirrors vary in latency, availability, and sometimes correctness during sync windows.
Air-gapped or “no direct internet” environments still need updates - just controlled.

A pull-through cache is a pragmatic solution because it does not try to mirror everything. It caches only what clients actually request and becomes “warm” quickly in real workloads.

What you gain

Speed & predictability: Once warm, repeated installs become disk-speed instead of WAN-speed.
Bandwidth relief: Identical .deb, index files, and metadata are reused locally.
Mirror governance: You can allowlist domains and steer requests to preferred mirrors.
Isolation: Clients can be blocked from direct internet access while still getting updates via the proxy.
Simplicity: No need to set up a full mirror or deal with sync complexities and high storage requirements.

The honest downside

A caching proxy is another moving part. If it’s down, your installs are down (unless clients can bypass it). Also: enabling HTTPS interception is a security-sensitive decision, not a “performance toggle”. More on that later.

The security model of APT

APT relies heavily on package signatures and repository metadata for security. All the verification and security checks still happen on the client side using GPG keys and repository metadata. The client verifies that the packages and metadata it receives are signed by trusted keys, and that they haven’t been tampered with. This is a fundamental part of how APT ensures the integrity and authenticity of the software it installs.
This model allows APT to still maintain security guarantees even when a caching proxy is involved and all data is served unencrypted using HTTP. The client is still able to verify signatures and ensure that the packages it installs are legitimate, regardless of whether they were served directly from the upstream repository or through a caching proxy like GoAPTCacher.

Therefore, a caching proxy must preserve this integrity - it should never modify packages or metadata, and clients must still be able to verify signatures as if they were talking directly to the upstream repository. This is a critical aspect of the design and operation of GoAPTCacher.
If the proxy were to modify packages or metadata, it would break the security guarantees of APT and could potentially allow for malicious packages to be served.

GoAPTCacher is designed to be a transparent cache that does not alter the content it serves, and clients continue to verify signatures as they normally would with upstream repositories.

GoAPTCacher Skyscraper

What GoAPTCacher is

GoAPTCacher is a Go-based proxy for Debian/Ubuntu style repositories. It supports HTTP and HTTPS, optional TLS interception (MITM) to cache encrypted repositories, passthrough domains for traffic that must never be intercepted, domain allowlists, URL remapping, mirror overrides, DNS-SRV/mDNS discovery, a built-in web UI under /_goaptcacher/, persistent local stats/metadata, and automatic expiration of unused cache entries.

It aims to be “APT-oriented”: handle GET, HEAD, and CONNECT well - and avoid becoming a generic “do-everything proxy”.

By using a pull-through cache, you get the benefits of caching without the overhead of maintaining a full mirror. You only store what you actually use, and you can control which repositories are cached and how requests are handled.

Screenshots

One picture is worth a thousand words, so here are some screenshots of the web UI and stats pages to give you a feel for what it looks like in action:

GoAPTCacher Web UI Overview Web UI overview page showing basic instructions

GoAPTCacher Setup Instructions Setup instructions page with config examples

GoAPTCacher Stats Stats page showing hit/miss counts and bandwidth saved

How the request flow works (and why it matters)

GoAPTCacher supports these methods:

GET → download artifact (cacheable)
HEAD → metadata checks (cacheable behavior)
CONNECT → HTTPS tunnel, optionally intercepted

Cache hit / miss behavior

GET
- HIT → serves files directly from disk, no upstream request involved
- MISS → streams upstream response to client while writing to cache
HEAD
- If cached → returns metadata headers
- If not cached → fetched once, then headers returned
CONNECT
- https.prevent: true → blocked (403)
- passthrough domain or https.intercept: false → tunnel mode (no TLS interception)
- https.intercept: true → intercepted flow handled like proxy logic

The important “gotcha”: empty domain lists

If both domains and passthrough_domains are empty:

everything is technically allowed,
but GET/HEAD requests are tunneled → effectively no cache usage,
and the service logs a warning.

In this mode everything is “allowed” and restrictions must be applied using another mechanism (e.g., firewall rules, network policies) - but the proxy won’t cache anything because all traffic is tunneled.

Installation

You have multiple common paths to get GoAPTCacher up and running:

1) Debian/Ubuntu package (recommended)

The cleanest path is installing via your repository on repo.bella.network:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
echo "Types: deb
URIs: https://repo.bella.network/deb
Suites: stable
Components: main
Architectures: amd64 arm64
Signed-By: /usr/share/keyrings/bella-archive-keyring.gpg
" | sudo tee /etc/apt/sources.list.d/repo.bella.network.sources

curl -fsSL https://repo.bella.network/_static/bella-archive-keyring.gpg \
  | sudo tee /usr/share/keyrings/bella-archive-keyring.gpg >/dev/null

sudo apt update
sudo apt install goaptcacher

Then create a minimal config at /etc/goaptcacher/config.yaml (example file is available in the same directory) and start the service:

1
sudo systemctl enable --now goaptcacher

Web UI (if enabled): http://:8090/_goaptcacher/ or https://:8091/_goaptcacher/

2) Docker (example)

You can also run GoAPTCacher in a container. Here’s a quick example using the official image:

1
2
3
4
5
6
7
docker run -d --name goaptcacher \
  -p 8090:8090 \
  -p 8091:8091 \
  -p 3142:3142 \
  -v "$PWD/config.yaml":/etc/goaptcacher/config.yaml:ro \
  -v "$PWD/cache":/var/cache/goaptcacher \
  registry.gitlab.com/bella.network/goaptcacher:latest

Quick start: a minimal, safe config

Create /etc/goaptcacher/config.yaml:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
cache_directory: "/var/cache/goaptcacher"
listen_port: 8090
listen_port_secure: 8091

# Optional: expose additional ports (e.g., classic apt-cacher-ng port)
alternative_ports:
  - 3142

# Allowlist of cacheable domains
domains:
  - "archive.ubuntu.com"
  - "security.ubuntu.com"
  - "ports.ubuntu.com"
  - "security.debian.org"
  - ".debian.org"

# Domains that are always tunneled (never cached)
passthrough_domains:
  - "esm.ubuntu.com"

https:
  prevent: false
  intercept: false

index:
  enable: true
  hostnames:
    - "cache.example.com"

Start it:

1
sudo systemctl enable --now goaptcacher

Client configuration

Static proxy config (simple, reliable)

1
2
3
4
5
6
cat <<'APT' | sudo tee /etc/apt/apt.conf.d/10proxy
Acquire::http::Proxy "http://<cache-host>:8090/";
Acquire::https::Proxy "http://<cache-host>:8090/";
APT

sudo apt update

This is boring - and that’s good. For managed hosts (VMs, servers) which are always run in the same environment it’s usually the best approach. It’s explicit, easy to debug, and doesn’t rely on discovery mechanisms that can fail in weird ways (e.g., DNS issues).

HTTPS: tunnel mode vs interception (MITM)

GoAPTCacher supports two modes for handling HTTPS traffic. The “tunnel mode” is the default and simplest option, where HTTPS traffic is passed through without interception and any interception. In this mode all traffic is blindly passed through the proxy, without knowledge or control over it.
The main benefit here is that client’s don’t need to trust any custom CA, and you avoid the operational and security complexities of running a CA-like component.

The “interception (MITM)” mode allows GoAPTCacher to mint leaf certificates and intercept HTTPS flows to enable caching for HTTPS-based repositories. On the first request to an HTTPS repo, the proxy will issue itself a certificate for the target domain, and clients will need to trust the proxy’s CA to avoid TLS errors.

Tunnel Mode (CONNECT)

The tunnel mode is the default and simplest option, where HTTPS traffic is passed through without interception. This will blindly pass through all HTTPS traffic, only knowing the hostname from the CONNECT request and the transfer size, but without any insight or control over the actual requests or responses.

Pros: You do not need to distribute trust anchors to clients, and you avoid the operational and security complexities of running a CA-like component. This is a good option if you have a mix of HTTP and HTTPS repos and only care about caching the HTTP ones, or are unable to manage trust on clients.

Cons: You cannot cache HTTPS downloads (only “pass through”) - so if your repos are HTTPS-only, you won’t get caching benefits for those. Also all traffic here is tunneled, you do not have insights or control over it what is really going through the proxy, so you need to be careful about what you allow. You can still use passthrough_domains to control this, but it’s not as flexible as interception for caching.

Use this if:

your environment can’t safely deploy trust anchors
you don’t control clients well
or you simply don’t need HTTPS caching

Interception (MITM)

GoAPTCacher can create leaf certs on-the-fly and intercept HTTPS flows to enable caching for HTTPS-based repositories.

Pros: Caching works even when repos are HTTPS-only, and you can apply allowlists and remaps to intercepted traffic. You also get visibility into the requested domains and paths, which can be useful for monitoring and governance. Every request is audited and clients get the full caching benefits regardless of protocol.
Basically, the client only needs to trust the interception CA, and then you can cache everything that goes through the proxy. This has the benefit that other certificate authorities don’t need to be trusted by clients, and you can have a single trust anchor for all intercepted traffic.

Cons: You are operating a CA-like component; compromise is fatal (traffic observation/manipulation, impersonation), and you need a secure strategy for key storage, distribution, rotation, and revocation. You also need to be careful about which domains you intercept - some (auth portals, subscription services) may break if intercepted or are even illegal to intercept due to terms of service or legal restrictions. You can use passthrough_domains to exclude those, but it requires careful configuration and ongoing maintenance.

If you enable interception, treat the CA key as crown jewels:

permissions 0600, minimal access,
no casual backups,
rotate/revoke strategy,
incident plan (what happens if the key leaks?).

Nevertheless, my recommended approach is to use interception for all cacheable traffic, and use passthrough for the few exceptions that must never be touched or only used by a very small subset of clients where caching doesn’t offer much benefit.
This way you get the maximum caching benefits while minimizing the risk of breaking things.

Example interception config (high-level):

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
https:
  intercept: true
  # exact paths depend on your setup
  cert: "/etc/goaptcacher/intermediate.crt"
  key: "/etc/goaptcacher/intermediate.key"

# Never intercept these (auth/pinning/subscription portals)
passthrough_domains:
  - "esm.ubuntu.com"
  - "enterprise.proxmox.com"

Client trust distribution

Clients must trust the interception CA (or intermediate), otherwise you’ll see TLS errors:

1
2
sudo cp your-ca.crt /usr/local/share/ca-certificates/goaptcacher.crt
sudo update-ca-certificates

My recommendation: Use a carefully crafted Certificate Authority (CA) strategy:

Use a base Root Certificate Authority (offline, highly protected) to only sign Intermediate CAs.
Use one or more Intermediate CAs for the proxy, which can be rotated/revoked without affecting the root.
Distribute the Root CA to clients, not the intermediate, to allow for future rotation without re-distribution.

What must go into passthrough_domains

Any repo domain that:

uses certificate pinning (this isn’t very common in APT repos, but some enterprise services do it),
requires special auth flows,
is subscription-bound (tokens, client certs, SSO),
or is simply “too risky to touch”.

In practice, passthrough is not a rare exception - it’s a core control. It’s also a good way to allow traffic for some repos on appliances that doesn’t allow you to add custom CAs.

Mirror governance: overrides and remaps

Debian and Ubuntu have multiple official mirrors, and many third-party ones. Some are faster than others, but clients often end up hitting the same “random” mirror due to DNS load balancing or geo-routing - which can lead to inconsistent performance. One of my main problems some time ago was that the default mirror for my region was very slow with only 100 Mbps bandwidth for Austria. With GoAPTCacher, I can override the mirror selection and steer clients to a faster, more reliable mirror of my choice - like instead of a slow 100 Mbps mirror to a 20 Gbps one (e.g., mirror.netcologne.de).

Note: NetCologne is a great example of a high-quality mirror with excellent performance and reliability. It’s not in my country, but I really like their service and their public Munin stats that show current load and bandwidth. - Big thanks to them for providing such a great mirror and making it easy to monitor!

GoAPTCacher supports:

distro overrides (ubuntu_server, debian_server)
path remap rules (remap)

Use cases:

forcing regional mirrors,
keeping CI deterministic,
replacing “random mirror rotation” with something stable.

Example:

1
2
3
4
5
6
7
remap:
  - from: "ubuntu.uni-klu.ac.at" # 100 Mbps mirror
    to: "ubuntu.anexia.at" # 10 Gbps mirror

overrides:
  ubuntu_server: "mirror.netcologne.de"
  debian_server: "mirror.netcologne.de"

Reality check: mirror steering is only as good as your mirror choice. If you pin to a bad mirror, you’ll get consistently bad results. 😄

Web UI and operational endpoints

GoAPTCacher includes a built-in web UI for monitoring and setup instructions. It’s not a full-blown dashboard, but it gives you quick access to stats, cache contents, and configuration tips.

Everything lives under /_goaptcacher/. Examples:

/_goaptcacher/ overview
/_goaptcacher/setup setup guide
/_goaptcacher/stats traffic stats
/_goaptcacher/goaptcacher.crt CA cert (if interception enabled)
/_goaptcacher/revocation.crl CRL (if enabled)
/.well-known/security.txt security contact
/robots.txt disallow-all

Debug endpoints exist only when debug.enable: true:

/_goaptcacher/debug JSON diagnostics
/_goaptcacher/debug/pprof pprof handlers

Hard rule: keep debug endpoints local-only (debug.allow_remote: false). If you expose pprof to the network, you’re basically begging for trouble.

Cache persistence and housekeeping

The primary cache storage is on disk under cache_directory (default: /var/cache/goaptcacher). A fast SSD is recommended for best performance, especially in CI environments with high concurrency as this is the limiting factor for cache hit performance.

This also includes metadata and stats storage, this is stored in the same directory but separated from the actual cached artifacts.

statistics: cache_directory/.stats.json - aggregated hit/miss counts, bandwidth saved, etc. written periodically and on shutdown
per-file metadata: sidecar files like *.access.json that track individual artifact usage, timestamps, and upstream info

It also has:

conditional upstream checks (If-Modified-Since / If-None-Match)
expiration of unused cache entries

The key operational question is: how big is your cache allowed to grow and how fast do you evict?

If you set eviction too aggressively, you’ll never reach a warm cache and performance won’t improve. If you never evict, you’ll eventually fill disk and start failing writes (storage errors due to full disk). A good starting point is to use a separate partition or disk for the cache, and monitor usage closely.

Recommendation: Start with a large cache size (e.g., 100 GB or more) and a long eviction time (e.g., 90 days) to let the cache warm up. Use a monitoring tool for disk usage and set up alerts for when you approach capacity. Once you have a sense of your workload and hit/miss patterns, you can adjust eviction policies accordingly. Real use is very dependent on your specific environment and workload, so it’s worth experimenting with different settings.

In my case, 100 GB is sufficient for a Ubuntu/Debian environment with 150+ VMs and a few dozens of CI runners, and I have a long eviction time (90 days) to keep the cache warm.

Running this in CI/CD pipelines

Especially in CI, the benefits of GoAPTCacher can be huge. CI pipelines often involve many machines (runners) that repeatedly install the same packages, and they can be very sensitive to mirror performance and availability. By caching packages locally, you can speed up builds, reduce WAN bandwidth usage, and improve overall stability.

By speeding up APT, you can reduce build times significantly, especially for larger jobs that install many packages. This can lead to faster feedback loops and more efficient development cycles. On top due to faster builds, you can also save on CI costs if your provider charges based on build time, number of runners/VMs, or bandwidth usage.

Typical patterns:

A dedicated cache VM/service inside the CI network.
Runners configured with proxy env or apt.conf.d snippet.

Auto-discovery

If you have a dynamic environment where machines come and go, or you want to avoid manual config on clients, GoAPTCacher supports auto-discovery via DNS SRV records and optional mDNS announcements. This allows clients to find the proxy without hardcoding its address.

This is especially useful in environments like client networks with client devices (e.g. company with developer laptops) which use WSL or similar setups where you don’t want to manage static proxy configs on each device and adapt on every location change (e.g. office, home, coffee shop). In such cases, discovery mechanisms can provide a more seamless experience.

Supported discovery mechanisms:

If you have a DNS infrastructure that supports SRV records and overrides of DNS queries for non-authoritative zones, you can set up a record like this:

`1`	`_apt_proxy._tcp.archive.ubuntu.com. 300 IN SRV 10 5 8090 cache1.example.com.`

Counterpoint: discovery is convenient, but static config is easier to reason about and debug. In production, “boring config” tends to win.

Alternative: You can also use auto-apt-proxy for client-side discovery via DNS lookups to a static DNS domain like apt-proxy.<DNS-Suffix> to find the proxy address. If your clients are correctly configured with either a DNS suffix, search domain or a FQDN that matches the discovery domain, they can automatically find the proxy without manual config.

If the destination isn’t resolvable or reachable, the client will fall back to direct connections which results in no impact.

Security notes (the part people skip, but you shouldn’t)

A proxy is a choke point. If it’s compromised, it can become a traffic observation or manipulation point.

With interception enabled, the proxy becomes a certificate issuer. That changes the threat model massively. Restrict and monitor access to the proxy and its keys like you would with any CA. If the key leaks, an attacker can impersonate any site to your clients. This problem is not unique to GoAPTCacher, but it’s critical to understand if you enable interception.

What GoAPTCacher intentionally does not do

This is important for expectations:

GoAPTCacher is not a full mirror or sync tool. It does not try to replicate entire repositories or keep them in sync with upstream. It only caches what clients request, and if something isn’t requested, it won’t be cached.

This means if you build a cache with GoAPTCacher, you won’t have a full mirror of Debian/Ubuntu repositories on disk. When the target repository goes down and you request a package that isn’t cached, you’ll get an error instead of a cached response. The cache is “warm” for popular packages and indexes, but it won’t magically have everything. This is a tradeoff for simplicity and storage efficiency - you only store what you actually use.

No rolling updates or sync windows. You won’t have a “snapshot” of the repository at a given point in time. If upstream changes and you request something new, it will be fetched and cached on demand. This is different from mirror tools that try to keep a local copy in sync with upstream. It’s not designed to be a management tool on what packages are the latest one or what to distribute like WSUS or similar tools. It’s a “cache-as-you-go” model.

No own repository management. You can’t use GoAPTCacher to host your own packages or manage a custom repository. It’s purely a caching proxy for existing repositories.

Real-world story

GoAPTCacher was born out of my frustration with slow and unreliable APT performance in my homelab and CI environments. I had dozens of VMs and runners that all hit the same mirrors, and it was a nightmare during peak times or mirror syncs. I wanted a simple solution that didn’t require setting up a full mirror or dealing with complex sync issues.

Additionally, other solutions I tried were either too heavy (full mirrors), too generic (squid), or didn’t support HTTPS well. The nearest thing was apt-cacher-ng, but due to many known bugs and performance issues my CI builds frequently failed or all VMs failed to update anymore until I restarted the service or deleted all locally cached files (basically resetting the entire instance and starting over). I wanted something more robust, able to handle HTTPS properly, and with better operational visibility.

I built GoAPTCacher to solve this specific problem, and it’s been a game-changer for my environments. The performance improvements are significant, and the ability to control mirror selection and cache behavior has made my CI pipelines, homelab and even the environment in some companies much more stable.

Network isolation

In my environment, I want to keep my VMs isolated from the internet for security reasons, but they still need to get updates. With GoAPTCacher, I can allow them to access the proxy while blocking direct internet access. This gives me a controlled way to provide updates without exposing my machines to unnecessary risks. The proxy acts as a gatekeeper, and I can use domain allowlists to ensure that only approved repositories are accessed. This has been a critical part of my security strategy while still maintaining the ability to keep my systems up to date.

Server upgrades

When I upgrade my servers or VMs, I often need to install a lot of packages. With GoAPTCacher, the first upgrade might be slow as it populates the cache, but subsequent upgrades are much faster because the packages are served from disk. This has made maintenance windows much more efficient (as long as you don’t pre-download everything on every VM).

In my environment, a typical upgrade of a VM with Ubuntu 22.04 LTS with 200+ packages to Ubuntu 24.04 LTS takes around 15 minutes without caching, and around 2-3 minutes with a warm cache. This is a huge improvement in terms of downtime and maintenance efficiency.
I also performed these upgrades with Debian 10 -> 11 -> 12 -> 13 and saw similar improvements, same with the upgrade of Proxmox VE 7 to 8 which also involves a lot of package changes. The performance boost is especially noticeable when you have multiple VMs upgrading at the same time, as they can all benefit from the shared cache instead of hitting the WAN individually.

In addition, it’s very nice to see the gigabit connection between the cache and the VMs nearly fully saturated during upgrades, which is a good sign that the cache is working as intended and providing the expected performance boost. Instead of maxing out the WAN connection (~26 MB/s) I get around 110 MB/s from the cache over a single gigabit link, which is a huge difference and makes the upgrade process much smoother. As the machine where GoAPTCacher is running has plenty of free RAM, frequently accessed packages are cached in memory by the OS, which further boosts performance and reduces latency for repeated requests.

CI builds

If packages are not bundled in the base image, CI builds mostly needs to install dependencies on every run. With GoAPTCacher, this repetitive installation becomes much faster after the first run, which has significantly reduced build times and improved developer productivity.

Some of GitLab CI pipelines trigger around 20 runners at once, and without caching they all hit the same mirrors and consume a lot of bandwidth. Due to limited bandwidth on my side and the mirror’s side, this often led to waiting times of 30-60 seconds until the real CI operations could start. With GoAPTCacher, the first run populates the cache, and subsequent runs are much faster (around 5-10 seconds) because they hit the local cache instead of the WAN.

In my case, the runners are configured with a static proxy config that points to GoAPTCacher, and they have the interception CA installed to get caching benefits for HTTPS repositories. Daily, the total run time of CI pipelines has been reduced by around 30-50% due to faster package installations, which has a big impact on overall development speed and feedback loops. Especially due to limited homelab resources, additional scaling of runners is not an option, so improving the performance of existing runners with caching has been a critical optimization.

Golden Images

Very frequent, some images even daily are built from scratch with a base image and a long list of packages. With GoAPTCacher, the nearly no WAN traffic is involved in these builds, and they run much faster because the packages are served from the local cache. This reduces the package fetching nightly from many gigabytes to a few megabytes (only the new packages that aren’t cached yet and the metadata), which has a huge impact on build times and bandwidth usage.

Branch offices

In a company environment with multiple branch offices, GoAPTCacher can be deployed in each office to provide local caching for APT traffic. This can significantly improve performance for users in those offices, especially if they have limited bandwidth or unreliable connections to the central repository. By deploying a cache in each office, you can reduce WAN bandwidth usage and improve the user experience when installing updates or new software.

In a tested scenario with a branch office, centralized internal APT repositories were accessed over a Site-to-Site VPN with limited bandwidth (around 50 Mbps). With GoAPTCacher deployed in the branch office, the first installation of packages was slow as it fetched from the central repository, but subsequent installations were much faster (around 10-15 seconds) because they hit the local cache instead of the WAN. This has improved the user experience and reduced the load on the central repository.

Closing

GoAPTCacher is a practical tool for a very real problem: APT becomes slow and unreliable when you scale installs across many machines or runners. A pull-through cache gives you speed, stability, and governance - it caches what you actually use, and serves it locally. Instead of investing in longer build and update times, you can invest in a cache that pays off immediately and keeps paying off as your environment grows.

If you’re running isolated networks, CI, homelabs, or have limited bandwidth, GoAPTCacher can be a game-changer.
It’s not a silver bullet, and it does add complexity, but for the right use cases, caching close to where work happens is one of those “small changes, huge payoff” moves.

GoAPTCacher is open source and available on GitLab: https://gitlab.com/bella.network/goaptcacher.
Give it a try, star the repo if you find it useful, and share your feedback!