
Update dashtable doc

This commit is contained in:
Roman Gershman 2022-05-23 19:41:55 +03:00
parent 8bed96d20b
commit 5ad6352ad7
3 changed files with 70 additions and 28 deletions

doc/bgsave_memusage.svg Normal file (new image, 78 KiB)

File diff suppressed because one or more lines are too long

doc/dashtable.md

@@ -94,23 +94,21 @@ Please note that for all the efficiency of Dashtable, it cannot drastically decrease
overall memory usage. Its primary goal is to reduce waste around dictionary management.
Having said that, by reducing metadata waste we could insert dragonfly-specific attributes
into a table's metadata in order to implement other intelligent algorithms like forkless save. This is where some of Dragonfly's disruptive qualities [can be seen](#forkless-save).
## Benchmarks
There are many other improvements in Dragonfly that save memory besides DT. I will not be
able to cover them all here. The results below show the final result as of May 2022.
### Populate single-threaded
To compare RD vs DT, I often use an internal debugging command "debug populate" that quickly fills both datastores with data. It saves time and gives more consistent results compared to memtier_benchmark.
It also shows the raw speed at which each dictionary gets filled, without intermediary factors like networking, parsing, etc.
I deliberately fill the datasets with small values to show how the metadata overhead differs between the two data structures.
I run "debug populate 20000000" (20M) on both engines on my home machine "AMD Ryzen 5 3400G with 8 cores".
| | Dragonfly | Redis 6 |
|-------------|-----------|---------|
| Time | 10.8s | 16.0s |
@@ -120,49 +118,53 @@ When looking at Redis 6 "info memory" stats, you can see that the `used_memory_overhead` field amounts
to `1.0GB`. That means that out of 1.73GB allocated, a whopping 1.0GB is used for
the metadata. For small-data use cases, the cost of metadata in Redis is larger than the data itself.
### Populate multi-threaded
Now I run Dragonfly on all 8 cores. Redis results, of course, stay the same.
| | Dragonfly | Redis 6 |
|-------------|-----------|---------|
| Time | 2.43s | 16.0s |
| Memory used | 896MB | 1.73G |
Due to its shared-nothing architecture, Dragonfly maintains a dashtable per thread, each with its own slice of data. Each thread fills the 1/8th of the 20M range that it owns - and it is much faster, almost 8 times faster. You can see that the total memory usage is even smaller, because now we maintain smaller tables in each thread (though that is not always the case - we could end up with slightly worse memory usage than in the single-threaded case, depending on where we land relative to hash table utilization).
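As a sketch of how the multi-threaded run is launched: the `--proactor_threads` flag below is how current Dragonfly builds control the number of shard threads, but treat the exact flag and ports as assumptions for your version:
```bash
# Start Dragonfly with 8 shard threads, then repeat the populate step.
./dragonfly --proactor_threads=8 --port 6380 &
redis-cli -p 6380 debug populate 20000000
```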
### Forkless Save
This example shows how much memory Dragonfly uses during BGSAVE under load compared to Redis. By the way, BGSAVE and SAVE in Dragonfly are the same procedure, because both are implemented with a fully asynchronous algorithm that maintains point-in-time snapshot guarantees.
This test consists of 3 steps:
1. Execute `debug populate 5000000 key 1024` command on both servers to quickly fill them up
with ~5GB of data.
2. Run `memtier_benchmark --ratio 1:0 -n 600000 --threads=2 -c 20 --distinct-client-seed --key-prefix="key:" --hide-histogram --key-maximum=5000000 -d 1024` command in order to send constant update traffic. This traffic should not substantially affect the memory usage of either server.
3. Finally, run `bgsave` on both servers while measuring their memory.
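Put together, the three steps look roughly like this (the ports are assumptions, and memtier is shown against Dragonfly only; repeat it with `-p 6379` for Redis):
```bash
# Step 1: ~5GB of 1KB values in each server.
redis-cli -p 6379 debug populate 5000000 key 1024
redis-cli -p 6380 debug populate 5000000 key 1024

# Step 2: constant update traffic in the background (shown for Dragonfly).
memtier_benchmark -p 6380 --ratio 1:0 -n 600000 --threads=2 -c 20 \
  --distinct-client-seed --key-prefix="key:" --hide-histogram \
  --key-maximum=5000000 -d 1024 &

# Step 3: trigger the snapshot while memory is being sampled.
redis-cli -p 6379 bgsave
redis-cli -p 6380 bgsave
```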
Technically, it is very hard to measure the exact memory usage of Redis during BGSAVE, because it creates a child process that partly shares memory with its parent. We chose `cgroupsv2` as the measuring tool: we put each server into a separate cgroup and sampled the `memory.current` attribute of each cgroup. Since a forked Redis process inherits the cgroup of its parent, we get an accurate estimate of their total memory usage. Although we did not need this for Dragonfly, we applied the same approach to both for consistency.
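A minimal sketch of that sampling loop, assuming cgroup v2 is mounted at `/sys/fs/cgroup`, with hypothetical group names and PID variables:
```bash
# Create one cgroup per server and move each server process into it.
sudo mkdir /sys/fs/cgroup/dragonfly /sys/fs/cgroup/redis
echo "$DRAGONFLY_PID" | sudo tee /sys/fs/cgroup/dragonfly/cgroup.procs
echo "$REDIS_PID" | sudo tee /sys/fs/cgroup/redis/cgroup.procs

# Sample each group's total memory (parent plus any forked child) once a second.
while sleep 1; do
  printf '%s\t%s\t%s\n' "$SECONDS" \
    "$(cat /sys/fs/cgroup/dragonfly/memory.current)" \
    "$(cat /sys/fs/cgroup/redis/memory.current)"
done >> memory_bgsave.tsv
```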
![BGSAVE](./bgsave_memusage.svg)
As you can see on the graph, Redis uses about 50% more memory than Dragonfly even before BGSAVE starts. Around second 14, BGSAVE kicks off on both servers. Visually you cannot see this event on the Dragonfly graph, but it is seen very well on the Redis graph. It took just a few seconds for Dragonfly to finish its snapshot (again, not visible on the graph), and around second 20 Dragonfly is already past its BGSAVE. You can see a distinguishable cliff at second 39
where Redis finishes its snapshot, reaching at peak almost 3 times more memory than Dragonfly.
### Expiry of items during writes
Efficient expiry is very important for many scenarios. See, for example,
[Pelikan paper'21](https://twitter.github.io/pelikan/2021/segcache.html). The Twitter team says
that their memory footprint could be reduced by as much as 60% by employing better expiry methodology. The authors of the post show the pros and cons of expiration methods in the table below:
<img src="https://twitter.github.io/pelikan/assets/img/segcache/expiration.svg" width="400">
They argue that proactive expiration is very important for timely deletion of expired items.
Dragonfly employs its own intelligent garbage collection procedure. By leveraging the DashTable
compartmentalized structure, it can employ a very efficient passive expiry algorithm with low CPU overhead. Our passive procedure is complemented with proactive, gradual scanning of the table in the background.
The procedure is as follows:
A dashtable grows when one of its segments becomes full during insertion and needs to be split.
This is a convenient point to perform garbage collection, but only for that segment.
We scan its buckets for expired items. If we delete some of them, we may avoid growing the table altogether! The cost of scanning the segment before the potential split is no more than the cost of the split itself, so it can be estimated as `O(1)`.
We use `memtier_benchmark` for this experiment to compare the expiry efficiency of Dragonfly vs Redis.
Locally, we run the following command:
@@ -172,7 +174,7 @@ memtier_benchmark --ratio 1:0 -n 600000 --threads=2 -c 20 --distinct-client-seed
--key-prefix="key:" --hide-histogram --expiry-range=30-30 --key-maximum=100000000 -d 256
```
We load larger values this time (256 bytes) to reduce the impact of Dragonfly's metadata savings.
@@ -180,8 +182,7 @@ of Dragonfly.
|             | Dragonfly | Redis 6 |
|-------------|-----------|---------|
| Memory peak usage | 1.45GB | 1.95GB |
| Avg SET qps | 131K | 100K |
Please note that Redis could sustain 30% less qps. That means that the optimal working sets of Dragonfly and Redis differ - the former needed to host at least `20s*131K ≈ 2.6M` items
at any point in time, while the latter only needed to keep `20s*100K = 2M` items.
So, for a `30%` bigger working set, Dragonfly needed `25%` less memory at peak.

doc/memory_bgsave.tsv Normal file

@@ -0,0 +1,40 @@
Time Dragonfly Redis
4 4738531328 6819917824
5 4738637824 6819917824
6 4738658304 6819913728
7 4738777088 6820589568
8 4738781184 6820638720
9 4738768896 6820769792
10 4738494464 6820777984
11 4738756608 6820683776
12 4740325376 6820687872
13 4740243456 6820691968
14 4740194304 6820687872
15 4740194304 7429746688
16 4740734976 7942115328
17 4740370432 8400957440
18 4740366336 8863305728
19 4740390912 9302515712
20 4740399104 9697935360
21 4740423680 10074103808
22 4748312576 10362601472
23 4750438400 10649939968
24 4750315520 10926985216
25 4750426112 11195555840
26 4750180352 11444666368
27 4750417920 11665764352
28 4750131200 11872944128
29 4750233600 12060946432
30 4750475264 12232212480
31 12379299840
32 12521598976
33 12647915520
34 12756508672
35 12848570368
36 12944240640
37 13025046528
38 13105799168
39 13181427712
40 8000053248
41 7048486912
42 7048507392