Mirror of https://github.com/dragonflydb/dragonfly.git (synced 2024-12-14 11:58:02 +00:00)
fix: typos (#1986)
This commit is contained in:
parent 20b924f9d5
commit d16195bfb7

7 changed files with 24 additions and 24 deletions
@@ -131,7 +131,7 @@ There are also some Dragonfly-specific arguments:
./dragonfly-x86_64 --logtostderr --requirepass=youshallnotpass --cache_mode=true -dbnum 1 --bind localhost --port 6379 --save_schedule "*:30" --maxmemory=12gb --keys_output_limit=12288 --dbfilename dump.rdb
```

-Arguments can be also provided from a configuration file by runnning `dragonfly --flagfile <filename>`. The file should list one flag per line, with equal signs instead of spaces for key-value flags.
+Arguments can be also provided from a configuration file by running `dragonfly --flagfile <filename>`. The file should list one flag per line, with equal signs instead of spaces for key-value flags.

For more options like logs management or TLS support, run `dragonfly --help`.

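As a concrete illustration of the `--flagfile` format described above, a hypothetical `flags.conf` could reuse the flags from the example command (the file name and the exact subset of flags are assumptions, not part of the commit):

```
--logtostderr
--requirepass=youshallnotpass
--cache_mode=true
--dbnum=1
--bind=localhost
--port=6379
--maxmemory=12gb
--keys_output_limit=12288
--dbfilename=dump.rdb
```

It would then be loaded with `dragonfly --flagfile flags.conf`.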
@@ -170,7 +170,7 @@ Go to the URL `:6379/metrics` to view Prometheus-compatible metrics.
The Prometheus exported metrics are compatible with the Grafana dashboard, [see here](tools/local/monitoring/grafana/provisioning/dashboards/dashboard.json).

-Important! The HTTP console is meant to be accessed within a safe network. If you expose Dragonfly's TCP port externally, we advise you disable the console with `--http_admin_console=false` or `--nohttp_admin_console`.
+Important! The HTTP console is meant to be accessed within a safe network. If you expose Dragonfly's TCP port externally, we advise you to disable the console with `--http_admin_console=false` or `--nohttp_admin_console`.

## <a name="background"><a/>Background

@@ -2,7 +2,7 @@

## Running the server

-Dragonfly runs on linux. We advice running it on linux version 5.11 or later
+Dragonfly runs on linux. We advise running it on linux version 5.11 or later
but you can also run Dragonfly on older kernels as well.

> :warning: **Dragonfly releases are compiled with LTO (link time optimization)**:

@@ -1,7 +1,7 @@

# Dashtable in Dragonfly

-Dashtable is very important data structure in Dragonfly. This document explain
+Dashtable is a very important data structure in Dragonfly. This document explains
how it fits inside the engine.

Each selectable database holds a primary dashtable that contains all its entries. Another instance of Dashtable holds an optional expiry information, for keys that have TTL expiry on them. Dashtable is equivalent to Redis dictionary but have some wonderful properties that make Dragonfly memory efficient in various situations.

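A minimal sketch of the per-database layout described in that paragraph, using standard containers as stand-ins for dashtables (the type and member names here are illustrative assumptions, not Dragonfly's actual classes):

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Stand-ins for the dashtable; the real structure is a segmented,
// open-addressing hash table, but any associative container shows the split.
using PrimeTable = std::unordered_map<std::string, std::string>;  // key -> value
using ExpireTable = std::unordered_map<std::string, uint64_t>;    // key -> expiry deadline

// One selectable database: every entry lives in the primary table; only keys
// that carry a TTL also get an entry in the expiry table.
struct Database {
  PrimeTable prime;
  ExpireTable expire;
};
```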
@@ -80,7 +80,7 @@ in terms of memory and cpu.
In practice, each segment grows independently from others,
so the table has smooth memory usage of 22-32 bytes per item or **6-16 bytes overhead**.

-1. Speed: RD requires an allocation for dictEntry per insertion and deallocation per deletion. In addition, RD uses chaining, which is cache unfriendly on modern hardware. There is a consensus in engineering and research communities that classic chaining schemes are slower tha open addressing alternatives.
+1. Speed: RD requires an allocation for dictEntry per insertion and deallocation per deletion. In addition, RD uses chaining, which is cache unfriendly on modern hardware. There is a consensus in engineering and research communities that classic chaining schemes are slower than open addressing alternatives.
Having said that, DT also needs to go through a single level of indirection when
fetching a segment pointer. However, DT's directory size is relatively small:
in the example above, all 9K could resize in L1 cache. Once the segment is determined,

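A quick consistency check of those numbers: 22 - 6 = 32 - 16 = 16, so the quoted 22-32 bytes per item correspond to a 16-byte entry (for example, two 8-byte words for key and value pointers, which is an assumption here) plus the stated 6-16 bytes of table overhead.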
@@ -95,7 +95,7 @@ Please note that with all efficiency of Dashtable, it can not decrease drastical
overall memory usage. Its primary goal is to reduce waste around dictionary management.

Having said that, by reducing metadata waste we could insert dragonfly-specific attributes
-into a table's metadata in order to implement other intelligent algorithms like forkless save. This is where some the Dragonfly's disrupting qualities [can be seen](#forkless-save).
+into a table's metadata in order to implement other intelligent algorithms like forkless save. This is where some of the Dragonfly's disrupting qualities [can be seen](#forkless-save).

## Benchmarks

@@ -128,10 +128,10 @@ Now I run Dragonfly on all 8 cores. Redis has the same results, of course.
| Time | 2.43s | 16.0s |
| Memory used | 896MB | 1.73G |

-Due to shared-nothing architecture, Dragonfly maintains a dashtable per thread with its own slice of data. Each thread fills 1/8th of 20M range it owns - and it much faster, almost 8 times faster.You can see that the total usage is even smaller, because now we maintain
+Due to shared-nothing architecture, Dragonfly maintains a dashtable per thread with its own slice of data. Each thread fills 1/8th of 20M range it owns - and it much faster, almost 8 times faster. You can see that the total usage is even smaller, because now we maintain
smaller tables in each
thread (it's not always the case though - we could get slightly worse memory usage than with
-single-threaded case ,depends where we stand compared to hash table utilization).
+single-threaded case, depends where we stand compared to hash table utilization).

### Forkless Save

@@ -155,7 +155,7 @@ where Redis finishes its snapshot, reaching almost x3 times more memory usage at

Efficient Expiry is very important for many scenarios. See, for example,
[Pelikan paper'21](https://twitter.github.io/pelikan/2021/segcache.html). Twitter team says
-that their their memory footprint could be reduced by as much as by 60% by employing better expiry methodology. The authors of the post above show pros and cons of expiration methods in the table below:
+that their memory footprint could be reduced by as much as by 60% by employing better expiry methodology. The authors of the post above show pros and cons of expiration methods in the table below:

<img src="https://pelikan.io/assets/img/segcache/expiration.svg" width="400">

@@ -190,7 +190,7 @@ at any point of time and the latter only needed to keep `20s*100K` items.
So for `30%` bigger working set Dragonfly needed `25%` less memory at peak.

<em>*Please ignore the performance advantage of Dragonfly over Redis in this test - it has no meaning.
-I run it locally on my machine and ot does not represent a real throughput benchmark. </em>
+I run it locally on my machine and it does not represent a real throughput benchmark. </em>

<br>

@@ -6,7 +6,7 @@ Dragonfly is a modern replacement for memory stores like Redis and Memcached. It

Dragonfly uses a single process with a multiple-thread architecture. Each Dragonfly thread is indirectly assigned several responsibilities via fibers.

-One such responsibility is handling incoming connections. Once a socket listener accepts a client connection, the connection spends its entire lifetime bound to a single thread inside a fiber. Dragonfly is written to be 100% non-blocking; it uses fibers to provide asynchronisity in each thread. One of the essential properties of asynchronisity is that a thread cannot be blocked as long as it has pending CPU tasks. Dragonfly preserves this property by wrapping each unit of execution context in a fiber; we wrap units of execution that can potentially be blocked on I/O. For example, a connection loop runs within a fiber; a function that writes a snapshot runs inside a fiber, and so on.
+One such responsibility is handling incoming connections. Once a socket listener accepts a client connection, the connection spends its entire lifetime bound to a single thread inside a fiber. Dragonfly is written to be 100% non-blocking; it uses fibers to provide asynchronicity in each thread. One of the essential properties of asynchronicity is that a thread cannot be blocked as long as it has pending CPU tasks. Dragonfly preserves this property by wrapping each unit of execution context in a fiber; we wrap units of execution that can potentially be blocked on I/O. For example, a connection loop runs within a fiber; a function that writes a snapshot runs inside a fiber, and so on.

As a side comment - asynchronicity and parallelism are different terms. Nodejs, for example, provides asynchronous execution but is single-threaded. Similarly, each Dragonfly thread is asynchronous on its own; therefore, Dragonfly is responsive to incoming events even when it handles long-running commands like saving to disk or running Lua scripts.

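The fiber model described in that hunk can be sketched with plain `Boost.Fibers` (a generic illustration, not Dragonfly or helio code): several fibers share one thread, and the thread keeps making progress as long as any fiber has pending CPU work, switching only at explicit suspension points.

```cpp
#include <boost/fiber/all.hpp>
#include <iostream>

// Build with: -lboost_fiber -lboost_context
int main() {
  // One fiber plays the role of a connection loop, another of a background
  // task such as snapshot writing; both are multiplexed on this single thread.
  boost::fibers::fiber connection_loop([] {
    for (int i = 0; i < 3; ++i) {
      std::cout << "connection fiber: handling request " << i << "\n";
      boost::this_fiber::yield();  // suspension point, e.g. awaiting I/O
    }
  });

  boost::fibers::fiber snapshot_writer([] {
    for (int i = 0; i < 3; ++i) {
      std::cout << "snapshot fiber: writing chunk " << i << "\n";
      boost::this_fiber::yield();
    }
  });

  connection_loop.join();
  snapshot_writer.join();
  return 0;
}
```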
@@ -31,9 +31,9 @@ So when we say that thread 1 is an I/O thread, we mean that Dragonfly can pin fi
I suggest reading my [intro post](https://www.romange.com/2018/12/15/introduction-to-fibers-in-c-/) about `Boost.Fibers` to learn more about fibers.

By the way, I want to compliment `Boost.Fibers` library–it has been exceptionally well designed:
-it's unintrusive, lightweight, and efficient. Moreover, its default scheduler can be overidden. In the case of `helio`, the I/O library that powers Dragonfly, we overrode the `Boost.Fibers` scheduler to support shared-nothing architecture and integrate it with the I/O polling loop.
+it's unintrusive, lightweight, and efficient. Moreover, its default scheduler can be overridden. In the case of `helio`, the I/O library that powers Dragonfly, we overrode the `Boost.Fibers` scheduler to support shared-nothing architecture and integrate it with the I/O polling loop.

-Importantly, fibers require bottom-up support in the application layer to preserve their asynchronisity. For example, in the snippet below, a blocking write into `fd` won't magically allow a fiber to preempt and switch to another fiber. No, the whole thread will be blocked.
+Importantly, fibers require bottom-up support in the application layer to preserve their asynchronicity. For example, in the snippet below, a blocking write into `fd` won't magically allow a fiber to preempt and switch to another fiber. No, the whole thread will be blocked.

```cpp
@@ -77,7 +77,7 @@ Another way to think of this flow is that a connection fiber serves as a coordin
<br>
<img src="http://static.dragonflydb.io/repo-assets/coordinator.svg" border="0"/>

-Here, a coordinator (or connection fiber) might even reside on one of the threads that coincidently owns one of the shards. However, it iseasier to think of it as a separate entity that never directly accesses any shard data.
+Here, a coordinator (or connection fiber) might even reside on one of the threads that coincidently owns one of the shards. However, it is easier to think of it as a separate entity that never directly accesses any shard data.

The coordinator serves as a virtualization layer that hides all the complexity of talking to multiple shards. It employs start-of-the-art algorithms to provide atomicity (and strict serializability) semantics for multi-key commands like "mset, mget, and blpop." It also offers strict serializability for Lua scripts and multi-command transactions.

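As a toy illustration of the coordinator idea (not Dragonfly's actual routing or locking code; the hash-modulo mapping and names are assumptions), a connection fiber could group a multi-key command's keys by owning shard before dispatching one hop per shard:

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Group the keys of, say, an MGET by the shard/thread that owns them.
// Dragonfly's real scheme layers atomicity and strict serializability on top
// of this kind of partitioning; that part is not shown here.
std::vector<std::vector<std::string>> GroupKeysByShard(
    const std::vector<std::string>& keys, size_t num_shards) {
  std::vector<std::vector<std::string>> per_shard(num_shards);
  for (const std::string& key : keys) {
    size_t shard = std::hash<std::string>{}(key) % num_shards;
    per_shard[shard].push_back(key);
  }
  return per_shard;  // the coordinator then awaits a reply from each shard
}
```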
@@ -12,7 +12,7 @@ SORT does not take any locale into account.

## Expiry ranges.
Expirations are limited to 8 years. For commands with millisecond precision like PEXPIRE or PSETEX,
-expirations greater than 2^28ms are quietly rounded to the nearest second loosing precision of less than 0.001%.
+expirations greater than 2^28ms are quietly rounded to the nearest second losing precision of less than 0.001%.

## Lua
We use lua 5.4.4 that has been released in 2022.

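A rough sanity check of that bound: 2^28 ms = 268,435,456 ms, which is about 74.6 hours, and rounding such a value to the nearest second changes it by at most 500 ms, i.e. by at most 500 / 268,435,456 ≈ 0.0002%, comfortably below the stated 0.001%.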
@@ -14,7 +14,7 @@ We followed the trend of other technological companies like Elastic, Redis, Mong
License wise you are free to use dragonfly in your production as long as you do not provide dragonfly as a managed service. From a code maturity point of view, Dragonfly's code is covered with unit testing. However as with any new software there are use cases that are hard to test and predict. We advise you to run your own particular use case on dragonfly for a few days before considering production usage.

## Dragonfly provides vertical scale, but we can achieve similar throughput with X nodes in a Redis cluster.
-Dragonfly utilizes the underlying hardware in an optimal way. Meaning it can run on small 8GB instances and scale verticly to large 768GB machines with 64 cores. This versatility allows to drastically reduce complexity of running cluster workloads to a single node saving hardware resources and costs. More importantly, it reduces the complexity (total cost of ownership) of handling the multi-node cluster. In addition, Redis cluster-mode imposes some limitations on multi-key and transactinal operations while Dragonfly provides the same semantics as single node Redis.
+Dragonfly utilizes the underlying hardware in an optimal way. Meaning it can run on small 8GB instances and scale vertically to large 768GB machines with 64 cores. This versatility allows to drastically reduce complexity of running cluster workloads to a single node saving hardware resources and costs. More importantly, it reduces the complexity (total cost of ownership) of handling the multi-node cluster. In addition, Redis cluster-mode imposes some limitations on multi-key and transactional operations while Dragonfly provides the same semantics as single node Redis.

## If only Dragonfly had this command I would use it for sure
Dragonfly implements ~130 Redis commands which we think represent a good coverage of the market. However this is not based empirical data. Having said that, if you have commands that are not covered, please feel free to open an issue for that or vote for an existing issue. We will do our best to prioritise those commands according to their popularity.

@@ -20,9 +20,9 @@ TODO: to rename them in the codebase to another name (SnapshotShard?) since `sna
3. Each SnapshotShard instantiates its own RdbSerializer that is used to serialize each K/V entry into a binary representation according to the Redis format spec. SnapshotShards combine multiple blobs from the same Dash bucket into a single blob. They always send blob data at bucket granularity, i.e. they never send blob into the channel that only partially covers the bucket. This is needed in order to guarantee snapshot isolation.
4. The RdbSerializer uses `io::Sink` to emit binary data. The SnapshotShard instance passes into it a `StringFile` which is just a memory-only based sink that wraps `std::string` object. Once `StringFile` instance becomes large, it's flushed into the channel (as long as it follows the rules above).
4. RdbSave also creates a fiber (SaveBody) that pull all the blobs from the channel. Blobs migh come in unspecified order though it's guaranteed that each blob is self sufficient but itself.
-5. DF uses direct I/O, to improve i/o throughput, which, in turn requires properly aligned memory buffers to work. Unfortunately, blobs that come from the rdb channel come in different sizes and they are not aligned by OS page granularity. Therefore, DF passes all the data from rdb channel through AlignedBuffer transformation. The purpose of this class is to copy the incoming data into a properly aligned buffer. Once it accumalates enough data, it flushes it into the output file.
+5. DF uses direct I/O, to improve i/o throughput, which, in turn requires properly aligned memory buffers to work. Unfortunately, blobs that come from the rdb channel come in different sizes and they are not aligned by OS page granularity. Therefore, DF passes all the data from rdb channel through AlignedBuffer transformation. The purpose of this class is to copy the incoming data into a properly aligned buffer. Once it accumulates enough data, it flushes it into the output file.

-To summarize, this configuration employes a single sink to create one file or one stream of data that represents the whole database.
+To summarize, this configuration employs a single sink to create one file or one stream of data that represents the whole database.

## Dragonfly Snapshot (TBD)

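A rough sketch of the idea in step 5 above, i.e. copying incoming blobs into an aligned buffer and flushing once enough data has accumulated (the class name, the 4 KiB alignment, and the flush callback are illustrative assumptions, not Dragonfly's actual `AlignedBuffer` interface):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <functional>

// `capacity` is assumed to be a multiple of the 4096-byte alignment, since
// direct I/O typically requires both the buffer address and the write size
// to be block-aligned.
class AlignedAccumulator {
 public:
  using FlushFn = std::function<void(const char* data, size_t len)>;

  AlignedAccumulator(size_t capacity, FlushFn flush)
      : capacity_(capacity),
        flush_(std::move(flush)),
        buf_(static_cast<char*>(std::aligned_alloc(4096, capacity))) {}

  ~AlignedAccumulator() { std::free(buf_); }

  // Append a blob of arbitrary size, flushing whenever the buffer fills up.
  void Add(const char* data, size_t len) {
    while (len > 0) {
      size_t chunk = std::min(len, capacity_ - used_);
      std::memcpy(buf_ + used_, data, chunk);
      used_ += chunk;
      data += chunk;
      len -= chunk;
      if (used_ == capacity_) Flush();
    }
  }

  // Hand the aligned, contiguous chunk to the writer (e.g. a direct-I/O file).
  void Flush() {
    if (used_ > 0) flush_(buf_, used_);
    used_ = 0;
  }

 private:
  size_t capacity_;
  FlushFn flush_;
  char* buf_;
  size_t used_ = 0;
};
```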
@@ -48,7 +48,7 @@ this *relaxed snapshotting*. The reason for relaxed snapshotting is to avoid kee
of all mutations during the snapshot creation.

As a side comment - we could, in theory, support the same (relaxed)
-semantics for file snapshots, but it's no necessary since it might increase the snapshot sizes.
+semantics for file snapshots, but it's not necessary since it might increase the snapshot sizes.

The snapshotting phase (full-sync) can take up lots of time which add lots of memory pressure on the system.
Keeping the change-log aside during the full-sync phase will only add more pressure.

@@ -58,7 +58,7 @@ in order to know when the snapshotting finished and the stable state replication

## Conservative and relaxed snapshotting variations

-Both algorithms maintain a scanning process (fiber) that iterarively goes over the main dictionary
+Both algorithms maintain a scanning process (fiber) that iteratively goes over the main dictionary
and serializes its data. Before starting the process, the SnapshotShard captures
the change epoch of its shard (this epoch is increased with each write request).

@@ -70,7 +70,7 @@ For sake of simplicity, we can assume that each entry in the shard maintains its
By capturing the epoch number we establish a cut: all entries with `version <= SnapshotShard.epoch`
have not been serialized yet and were not modified by the concurrent writes.

-The DashTable iteration algorithm guarantees convergeance and coverage ("at most once"),
+The DashTable iteration algorithm guarantees convergence and coverage ("at most once"),
but it does not guarantee that each entry is visited *exactly once*.
Therefore, we use entry versions for two things: 1) to avoid serialization of the same entry multiple times,
and 2) to correctly serialize entries that need to change due to concurrent writes.

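A minimal sketch of how that version cut can be used, in C++-flavored pseudocode (the field and function names are assumptions for illustration, not the actual Dragonfly code; the real algorithm operates on dashtable buckets and has more detail):

```cpp
#include <cstdint>

struct Entry {
  uint64_t version;  // epoch of the last write to this entry
  // ... key/value payload ...
};

// Entries with version <= snapshot_epoch have not been serialized yet.
// Called both by the scanning fiber when it visits an entry and by the
// on-write hook right before a concurrent mutation, so each entry is
// serialized at most once and before a concurrent write overwrites it.
void MaybeSerialize(Entry& e, uint64_t snapshot_epoch, uint64_t current_epoch) {
  if (e.version <= snapshot_epoch) {
    // SerializeToChannel(e);    // hypothetical sink for the serialized blob
    e.version = current_epoch;   // mark as serialized so later visits skip it
  }
}
```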
@@ -86,7 +86,7 @@ Serialization Fiber:
}
```

-To allow concurrent writes during the snapshotting phase, we setup a hook that is triggerred on each
+To allow concurrent writes during the snapshotting phase, we setup a hook that is triggered on each
entry mutation in the table:

OnWriteHook: