mirror of
https://github.com/zhaofengli/attic.git
synced 2024-12-14 11:57:30 +00:00
book/faqs: Talk about compression
parent 0c1f362a62
commit ee16664046
1 changed file with 33 additions and 0 deletions
@@ -1,5 +1,7 @@
# FAQs

<!-- TODO: Write more about design decisions in a separate section -->

## Does it replace [Cachix](https://www.cachix.org)?

No, it does not.

@@ -29,6 +31,37 @@ Authentication is done via signed JWTs containing the allowed permissions.
Each instance of `atticd --mode api-server` is stateless.
This design may be revisited later, with the option of a more stateful method of authentication.

## How is compression handled?

Uploaded NARs are compressed on the server before being streamed to the storage backend.
We use the hash of the _uncompressed NAR_ to perform global deduplication.

```
                   ┌───────────────────────────────────►NAR Hash
                   │
                   │
                   ├───────────────────────────────────►NAR Size
                   │
             ┌─────┴────┐  ┌──────────┐  ┌───────────┐
NAR Stream──►│NAR Hasher├─►│Compressor├─►│File Hasher├─►File Stream─►S3
             └──────────┘  └──────────┘  └─────┬─────┘
                                               │
                                               ├───────►File Hash
                                               │
                                               │
                                               └───────►File Size
```
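
The pipeline in the diagram can be sketched in a few lines of Python. This is only an illustration of the data flow, not the server's actual code: `upload_nar`, `sink`, and `chunk_size` are hypothetical names, and SHA-256 and zlib are stand-ins for whatever hash and compression algorithms the server is configured with.

```python
import hashlib
import io
import zlib


def upload_nar(nar_stream, sink, chunk_size=64 * 1024):
    """Hash the raw NAR, compress it, and hash the compressed file in one pass."""
    nar_hasher = hashlib.sha256()    # "NAR Hasher": sees uncompressed bytes
    file_hasher = hashlib.sha256()   # "File Hasher": sees compressed bytes
    compressor = zlib.compressobj()  # stand-in compressor (assumption)
    nar_size = 0
    file_size = 0

    def emit(compressed):
        # Forward compressed bytes to the sink while tracking file hash and size.
        nonlocal file_size
        if compressed:
            file_hasher.update(compressed)
            file_size += len(compressed)
            sink.write(compressed)

    while True:
        chunk = nar_stream.read(chunk_size)
        if not chunk:
            break
        nar_hasher.update(chunk)
        nar_size += len(chunk)
        emit(compressor.compress(chunk))
    emit(compressor.flush())  # drain whatever the compressor still buffers

    return {
        "nar_hash": nar_hasher.hexdigest(),
        "nar_size": nar_size,
        "file_hash": file_hasher.hexdigest(),
        "file_size": file_size,
    }


# In-memory streams stand in for the client upload and the storage backend.
nar = b"example NAR contents " * 1000  # placeholder payload, not a real NAR
sink = io.BytesIO()
info = upload_nar(io.BytesIO(nar), sink)
```

The NAR hash and size describe the uncompressed input, while the file hash and size describe what actually lands in storage. The whole NAR is never held in memory at once.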

At first glance, performing compression on the client and deduplicating the result may sound appealing, but it has problems:

1. Different compression algorithms and levels naturally lead to different results, which can't be deduplicated.
2. Even with the same compression algorithm, the results are often non-deterministic, varying with the number of compression threads, the library version, etc.
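
The first point is easy to reproduce. The sketch below uses zlib purely as a stand-in compressor (an assumption made for the example) and a placeholder payload instead of a real NAR. It compresses the same bytes at two levels and compares the hashes of the results.

```python
import hashlib
import zlib

nar = b"example NAR contents " * 1000  # placeholder payload, not a real NAR

fast = zlib.compress(nar, 1)  # lowest compression level
best = zlib.compress(nar, 9)  # highest compression level

# Both outputs decompress to the same NAR...
same_nar = zlib.decompress(fast) == zlib.decompress(best) == nar
# ...but the compressed files themselves hash differently.
different_files = hashlib.sha256(fast).hexdigest() != hashlib.sha256(best).hexdigest()
```

Deduplicating on the hash of the compressed file would treat `fast` and `best` as two distinct objects, even though they contain the same NAR.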

When we compress on the server and use the hashes of uncompressed NARs for lookups, non-determinism is no longer an issue, since each NAR is compressed only once.
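
A minimal sketch of this lookup, assuming an in-memory dictionary in place of the real database and storage backend: `DedupStore` and its fields are hypothetical names, and SHA-256 and zlib are again stand-ins.

```python
import hashlib
import zlib


class DedupStore:
    """Hypothetical in-memory store keyed by the hash of the uncompressed NAR."""

    def __init__(self):
        self.objects = {}      # NAR hash -> compressed bytes
        self.compressions = 0  # how many times compression actually ran

    def upload(self, nar: bytes) -> str:
        nar_hash = hashlib.sha256(nar).hexdigest()
        if nar_hash not in self.objects:
            # First time this NAR is seen: compress once and keep the result.
            self.objects[nar_hash] = zlib.compress(nar)
            self.compressions += 1
        return nar_hash


store = DedupStore()
first = store.upload(b"x" * 100)
second = store.upload(b"x" * 100)  # same NAR again: found by hash, not recompressed
```

Because the key is derived from the uncompressed bytes, it does not matter whether a second compression run would have produced different output.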

On the other hand, performing compression on the server leads to additional CPU usage, increasing compute costs and the need to scale.
Such design decisions are to be revisited later.

## At what granularity is deduplication done?

Currently, global deduplication is done at the level of NAR files.