Mirror of https://github.com/dragonflydb/dragonfly.git, synced 2024-12-14 11:58:02 +00:00

Commit 5ad6352ad7 (parent 8bed96d20b): Update dashtable doc
3 changed files with 70 additions and 28 deletions
doc/bgsave_memusage.svg: new file, 78 KiB (file diff suppressed because one or more lines are too long)
@@ -94,23 +94,21 @@
Please note that with all the efficiency of Dashtable, it cannot drastically decrease
overall memory usage. Its primary goal is to reduce waste around dictionary management.

Having said that, by reducing metadata waste we could insert dragonfly-specific attributes
into a table's metadata in order to implement other intelligent algorithms like forkless save. This is where some of Dragonfly's most impactful, disrupting qualities [can be seen](#forkless-save).

## Benchmarks

There are many other improvements in Dragonfly that save memory besides DT. I will not be able to cover them all here. The results below show the state of things as of May 2022.

### Populate single-threaded

To compare RD vs DT I often use an internal debugging command, "debug populate", that quickly fills both datastores with data. It saves time and gives more consistent results compared to memtier_benchmark, and it also shows the raw speed at which each dictionary gets filled without intermediary factors like networking, parsing, etc. I deliberately fill the datasets with small values to show how the metadata overhead differs between the two data structures.

I run "debug populate 20000000" (20M) on both engines on my home machine, an "AMD Ryzen 5 3400G" with 8 cores.
|             | Dragonfly | Redis 6 |
|-------------|-----------|---------|
| Time        | 10.8s     | 16.0s   |

@@ -120,49 +118,53 @@
When looking at Redis 6 "info memory" stats, you can see that `used_memory_overhead` amounts
to `1.0GB`. That means that out of the 1.73GB allocated, a whopping 1.0GB is used for
the metadata. For small-data use cases the cost of metadata in Redis is larger than the data itself.
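A quick way to see this split yourself is to grep the standard Redis INFO fields after the populate run (the port is an assumption):

```bash
# Compare payload vs. bookkeeping after the populate run (port is an assumption).
redis-cli -p 6379 info memory | grep -E "used_memory_human|used_memory_overhead|used_memory_dataset"
```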
### Populate multi-threaded

Now I run Dragonfly on all 8 cores. Redis results stay the same, of course, since it is single-threaded.

|             | Dragonfly | Redis 6 |
|-------------|-----------|---------|
| Time        | 2.43s     | 16.0s   |
| Memory used | 896MB     | 1.73G   |

Due to its shared-nothing architecture, Dragonfly maintains a dashtable per thread, each holding its own slice of the data. Each thread fills the 1/8th of the 20M key range that it owns, and it is much faster, almost 8 times faster. You can see that the total memory usage is even smaller, because now we maintain smaller tables in each thread (this is not always the case, though: we could get slightly worse memory usage than in the single-threaded case, depending on where we stand relative to hash table utilization).
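As a rough reproduction sketch, assuming the `--proactor_threads` flag controls the thread count (check `dragonfly --help` on your build) and that the server listens on port 6380:

```bash
# Sketch only: binary name, flag and port are assumptions; restart/flush the server between runs.
dragonfly --port 6380 --proactor_threads=1 &     # single-threaded run
time redis-cli -p 6380 debug populate 20000000

dragonfly --port 6380 --proactor_threads=8 &     # one dashtable shard per thread
time redis-cli -p 6380 debug populate 20000000
```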
### Forkless Save

This example shows how much memory Dragonfly uses during BGSAVE under load compared to Redis. By the way, BGSAVE and SAVE in Dragonfly are the same procedure, because both are implemented using a fully asynchronous algorithm that maintains point-in-time snapshot guarantees.

The test consists of 3 steps:
1. Execute the `debug populate 5000000 key 1024` command on both servers to quickly fill them up with ~5GB of data.
2. Run `memtier_benchmark --ratio 1:0 -n 600000 --threads=2 -c 20 --distinct-client-seed --key-prefix="key:" --hide-histogram --key-maximum=5000000 -d 1024` in order to send constant update traffic. This traffic should not substantially affect the memory usage of either server.
3. Finally, run `bgsave` on both servers while measuring their memory.

It is technically very hard to measure the exact memory usage of Redis during BGSAVE because it creates a child process that partially shares memory with its parent. We chose `cgroups v2` as the tool to measure memory: we put each server into a separate cgroup and sampled the `memory.current` attribute of each cgroup. Since a forked Redis process inherits the cgroup of its parent, we get an accurate estimate of the total memory usage. Although we did not need this for Dragonfly, we applied the same approach to both servers for consistency.
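A minimal sketch of that measurement is shown below; the cgroup names, process names and sampling window are placeholders, and it assumes a cgroups v2 mount at `/sys/fs/cgroup` with a single instance of each server running:

```bash
# Sketch only: cgroup names, process names and duration are placeholders.
# You may need to enable the memory controller first:
#   echo +memory | sudo tee /sys/fs/cgroup/cgroup.subtree_control
sudo mkdir -p /sys/fs/cgroup/redis /sys/fs/cgroup/dragonfly
pgrep -x redis-server | sudo tee /sys/fs/cgroup/redis/cgroup.procs
pgrep -x dragonfly    | sudo tee /sys/fs/cgroup/dragonfly/cgroup.procs

# Forked children (the Redis BGSAVE child) stay in the parent's cgroup,
# so memory.current captures parent and child together.
for sec in $(seq 1 60); do
  printf "%s\t%s\t%s\n" "$sec" \
    "$(cat /sys/fs/cgroup/dragonfly/memory.current)" \
    "$(cat /sys/fs/cgroup/redis/memory.current)"
  sleep 1
done
```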
![BGSAVE](./bgsave_memusage.svg)

As you can see on the graph, Redis uses 50% more memory than Dragonfly even before BGSAVE starts. Around second 14, BGSAVE kicks off on both servers. You cannot visually see this event on the Dragonfly graph, but it is clearly visible on the Redis graph. It took just a few seconds for Dragonfly to finish its snapshot (again, barely visible on the graph), and by around second 20 Dragonfly already has BGSAVE behind it. You can see a distinguishable cliff at second 39, where Redis finishes its snapshot, reaching almost 3x more memory usage at peak.
### Expiry of items during writes

Efficient expiry is very important for many scenarios. See, for example, the
[Pelikan paper'21](https://twitter.github.io/pelikan/2021/segcache.html). The Twitter team says
that their memory footprint could be reduced by as much as 60% by employing a better expiry methodology. The authors of the post above show the pros and cons of expiration methods in the table below:

<img src="https://twitter.github.io/pelikan/assets/img/segcache/expiration.svg" width="400">

They argue that proactive expiration is very important for the timely deletion of expired items.
Dragonfly employs its own intelligent garbage collection procedure. By leveraging the DashTable
compartmentalized structure, it can employ a very efficient passive expiry algorithm with low CPU overhead. Our passive procedure is complemented with proactive, gradual scanning of the table in the background.

The procedure is as follows:
A dashtable grows when one of its segments becomes full during an insertion and needs to be split.
This is a convenient point to perform garbage collection, but only for that segment.
We scan its buckets for expired items. If we delete some of them, we may avoid growing the table altogether! The cost of scanning the segment before a potential split is no more than the cost of the
split itself, so it can be estimated as `O(1)`.

We use `memtier_benchmark` for this experiment to demonstrate Dragonfly vs Redis expiry efficiency.
We run the following command locally:

@@ -172,7 +174,7 @@
```bash
memtier_benchmark --ratio 1:0 -n 600000 --threads=2 -c 20 --distinct-client-seed \
  --key-prefix="key:" --hide-histogram --expiry-range=30-30 --key-maximum=100000000 -d 256
```
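While the benchmark runs, one can sample both servers to watch how expiry keeps the live dataset bounded; a small sketch, with ports and interval being assumptions:

```bash
# Sketch only: ports and interval are assumptions; INFO field sets differ slightly between the two servers.
# dbsize shows how many live keys each server keeps; used_memory* fields show the memory footprint.
while true; do
  echo "dragonfly: $(redis-cli -p 6380 dbsize) keys, $(redis-cli -p 6380 info memory | grep used_memory_human)"
  echo "redis:     $(redis-cli -p 6379 dbsize) keys, $(redis-cli -p 6379 info memory | grep used_memory_human)"
  sleep 5
done
```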
We load larger values this time (256 bytes) to reduce the impact of the metadata savings of Dragonfly.

@@ -180,8 +182,7 @@
|                   | Dragonfly | Redis 6 |
|-------------------|-----------|---------|
| Memory peak usage | 1.45GB    | 1.95GB  |
| Avg SET qps       | 131K      | 100K    |

Please note that Redis could sustain 30% less qps. That means that the optimal working sets for Dragonfly and Redis are different: the former needed to host at least `20s*131K` items
at any point in time, while the latter only needed to keep `20s*100K` items.
So for a `30%` bigger working set, Dragonfly needed `25%` less memory at peak.
doc/memory_bgsave.tsv (new file, 40 lines)

@@ -0,0 +1,40 @@
Time Dragonfly Redis
4 4738531328 6819917824
5 4738637824 6819917824
6 4738658304 6819913728
7 4738777088 6820589568
8 4738781184 6820638720
9 4738768896 6820769792
10 4738494464 6820777984
11 4738756608 6820683776
12 4740325376 6820687872
13 4740243456 6820691968
14 4740194304 6820687872
15 4740194304 7429746688
16 4740734976 7942115328
17 4740370432 8400957440
18 4740366336 8863305728
19 4740390912 9302515712
20 4740399104 9697935360
21 4740423680 10074103808
22 4748312576 10362601472
23 4750438400 10649939968
24 4750315520 10926985216
25 4750426112 11195555840
26 4750180352 11444666368
27 4750417920 11665764352
28 4750131200 11872944128
29 4750233600 12060946432
30 4750475264 12232212480
31 12379299840
32 12521598976
33 12647915520
34 12756508672
35 12848570368
36 12944240640
37 13025046528
38 13105799168
39 13181427712
40 8000053248
41 7048486912
42 7048507392