* fix(replication): Correctly replicate commands even when OOM
Before this change, an OOM in shard callbacks could lead to data
inconsistency between the master and the replica. One example is commands
such as `LMOVE`, which mutated data on one shard but failed on another.
After this change, callbacks that result in an OOM will correctly
replicate their work (none, partial or complete) to replicas.
Note that `MSET` and `MSETNX` required special handling: they are the
only commands that can _create_ multiple keys, so some of those key
creations can fail.
Fixes #2381
* fixes
* test fix
* RecordJournal
* UNDO idiotnessness
* 2 shards
* fix pytest
The (subtle) bug is that the previous code used an `initializer_list` c'tor, which copies the
`string_view` into a local temporary. It then kept a reference to that copied `string_view`,
which goes out of scope on the next line.
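
The lifetime problem described above can be sketched as follows. This is a minimal illustration and not the code from the fix: `ViewSlice`, `FirstArg` and `SafeFirstArg` are hypothetical names, and the buggy form is left in a comment because executing it is undefined behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <string>
#include <string_view>

// Hypothetical non-owning view over an array of string_views.
struct ViewSlice {
  const std::string_view* data = nullptr;
  std::size_t size = 0;

  // Points into the initializer_list's backing array, which is a temporary.
  ViewSlice(std::initializer_list<std::string_view> il)
      : data(il.begin()), size(il.size()) {}

  // Points at storage owned by the caller.
  ViewSlice(const std::string_view* d, std::size_t n) : data(d), size(n) {}
};

std::string FirstArg(const ViewSlice& s) {
  assert(s.size > 0);
  return std::string(s.data[0]);
}

// BUGGY pattern (kept as a comment on purpose):
//   ViewSlice args{key};   // backing array dies at the end of this statement
//   FirstArg(args);        // reads a dangling pointer
//
// Safe pattern: point the slice at an object that outlives it.
std::string SafeFirstArg(std::string_view key) {
  ViewSlice args(&key, 1);  // `key` lives until the function returns
  return FirstArg(args);
}
```

With this sketch, `SafeFirstArg("mykey")` returns `"mykey"` because the slice refers to the named parameter rather than to a temporary array.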
* feat: allow throttling tiered writes
The throttling is controlled by the `tiered_storage_throttle_us` flag
and can be disabled by passing `--tiered_storage_throttle_us=0`.
This introduces soft back-pressure during writes.
On my machine, `debug POPULATE 10000000 key 1000 RAND` with `--tiered_storage_throttle_us=0`
offloads 12% of all the entries, but with `--tiered_storage_throttle_us=1` it offloads
almost 100%, at the cost of prolonging the operation from 0.96s to 1.72s.
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* feat(server): Implement `CLIENT KILL`
Currently, it supports the following syntax:
* `CLIENT KILL <addr>:<port>`
* `CLIENT KILL ID <id>`
* `CLIENT KILL ADDR <addr>:<port>`
* `CLIENT KILL LADDR <addr>:<port>`
It will not allow killing an admin-connection from a non-admin port.
There are a few parameters of `CLIENT KILL` that Redis supports but this
PR does not yet add. Let's add them as needed.
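
The supported syntax above could be parsed roughly as follows. This is only a sketch under assumptions: `KillFilter`, `KillRequest` and `ParseClientKill` are hypothetical names, keyword matching is assumed to be case-insensitive, and the admin-port check is not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <optional>
#include <string>
#include <vector>

// Hypothetical filter kinds matching the four supported forms.
enum class KillFilter { kAddr, kId, kLaddr };

struct KillRequest {
  KillFilter filter;
  std::string value;  // "<addr>:<port>" or "<id>", left unparsed here
};

std::string ToUpper(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(),
                 [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
  return s;
}

// `args` holds the tokens that follow `CLIENT KILL`.
// Returns std::nullopt for anything outside the supported syntax.
std::optional<KillRequest> ParseClientKill(const std::vector<std::string>& args) {
  if (args.size() == 1) {
    // Old-style form: CLIENT KILL <addr>:<port>
    return KillRequest{KillFilter::kAddr, args[0]};
  }
  if (args.size() != 2) return std::nullopt;

  const std::string keyword = ToUpper(args[0]);
  if (keyword == "ID") return KillRequest{KillFilter::kId, args[1]};
  if (keyword == "ADDR") return KillRequest{KillFilter::kAddr, args[1]};
  if (keyword == "LADDR") return KillRequest{KillFilter::kLaddr, args[1]};
  return std::nullopt;  // e.g. filters Redis supports but this PR does not add
}

// Usage, expressed as assertions.
void ParseClientKillExample() {
  auto req = ParseClientKill({"LADDR", "10.0.0.1:6379"});
  assert(req.has_value() && req->filter == KillFilter::kLaddr);
}
```

Returning `std::nullopt` for unknown keywords keeps the unsupported Redis parameters as an explicit error path rather than silently ignoring them.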
Fixes #1614
* Add tests
* fixes
* chore: fix our release pipeline
Also remove the alpine prod.wip file, which has not been used and is unlikely to be used for prod.
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: remove atomic<> from ReplicaInfo::state
This field is protected by ReplicaInfo::mu, so unprotected access to it indicates a design problem.
Indeed, it was made atomic so that it could be read without a mutex inside the ReplicationLags() function.
I moved the access to this field into GetReplicasRoleInfo, where we need to lock ReplicaRoleInfo anyway.
Also did some cleanups in the file.
Finally, raised the threshold for "tx queue too long" warnings.
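
The general pattern (a plain field guarded by a mutex, read only where the lock is already taken) can be sketched like this. All names here (`ReplicaEntry`, `SyncState`, `RoleInfo`, `CollectRoleInfo`) are hypothetical stand-ins and not the actual Dragonfly types.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical replication states.
enum class SyncState { kPreparation, kFullSync, kStableSync };

struct ReplicaEntry {
  std::mutex mu;
  SyncState state = SyncState::kPreparation;  // guarded by mu, no atomic<> needed
};

struct RoleInfo {
  std::size_t index;
  SyncState state;
};

// Reads `state` only while holding the entry's mutex, so the field can be
// a plain enum instead of an atomic.
std::vector<RoleInfo> CollectRoleInfo(std::vector<ReplicaEntry>& replicas) {
  std::vector<RoleInfo> out;
  out.reserve(replicas.size());
  for (std::size_t i = 0; i < replicas.size(); ++i) {
    std::lock_guard<std::mutex> lk(replicas[i].mu);
    out.push_back(RoleInfo{i, replicas[i].state});
  }
  return out;
}

// Usage, expressed as assertions.
void CollectRoleInfoExample() {
  std::vector<ReplicaEntry> replicas(1);
  assert(CollectRoleInfo(replicas)[0].state == SyncState::kPreparation);
}
```

Keeping every read under the same lock as the writes makes the locking discipline visible at the call site, which is the design point the message argues for.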
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Fixes #2296
Added a regression test that covers both policy-based eviction and heartbeat eviction.
---------
Signed-off-by: Yue Li <61070669+theyueli@users.noreply.github.com>
* feat: add os string
Add it to both the info log and the server section of the INFO response, similar
to how Redis reports it.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* Update src/server/server_family.cc
Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>
Signed-off-by: Roman Gershman <romange@gmail.com>
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Signed-off-by: Roman Gershman <romange@gmail.com>
Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>
Also change its build directory to build-release.
Simplify its configuration steps a bit as well. No change in functionality is expected.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* feat(cluster): Add `RestoreStreamer`.
`RestoreStreamer`, like `JournalStreamer`, streams journal changes to a
sink. However, in addition, it traverses the DB like `RdbSerializer` and
sends existing entries as `RESTORE` commands.
Adding it required a bit of plumbing to get all journal changes to be
slot-aware.
In a follow-up PR I will remove the now unneeded `SerializerBase`.
* Fix build
* Fix bug
* Remove unimplemented function
* Iterate DB, drop support for db1+
* Send FULL-SYNC-CUT
chore: cosmetic improvements in dash code
1. Better naming
2. Improve the interface of the ForEachSlot command
3. Wrap the repeating code of updating the bucket version into the UpdateVersion function
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* feat(server): Add `RestoreSerializer`
This utility class serializes `CompactObj`s as `RESTORE` commands, and
has a similar interface to (and a common base class with) `RdbSerializer`.
* RETURN_ON_ERR
* fixes
The issue was that in `ServerFamilyTest.SlowLogLen` we set the threshold to
0 microseconds and expect all commands to be logged as slow.
However, in opt builds, some commands sometimes take 0 microseconds, which
fails the test.
Confirmed via:
```
./server_family_test --gtest_repeat=100 --gtest_filter=ServerFamilyTest.SlowLogLen
```
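
One way to see why a zero threshold is fragile, assuming the slow-log check is a strict comparison (an assumption: the message does not show the actual condition or the fix), is the sketch below. `IsSlowStrict` and `IsSlowInclusive` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical predicate: logs only commands strictly slower than the threshold.
bool IsSlowStrict(std::uint64_t duration_us, std::uint64_t threshold_us) {
  return duration_us > threshold_us;
}

// Hypothetical alternative: also logs commands that exactly hit the threshold.
bool IsSlowInclusive(std::uint64_t duration_us, std::uint64_t threshold_us) {
  return duration_us >= threshold_us;
}

// Usage, expressed as assertions: a command measured at 0 us with a 0 us
// threshold is skipped by the strict check but kept by the inclusive one.
void ZeroThresholdExample() {
  assert(!IsSlowStrict(0, 0));
  assert(IsSlowInclusive(0, 0));
}
```

Under that assumption, a test that expects every command to be logged only passes when each command takes at least 1 microsecond, which optimized builds do not guarantee.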
* chore: consolidate facade stats under a single struct
Remove connection stats from server state and move them under FacadeStats.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: fixing comments
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* feat: introduce user timeout
* feat: introduce tcp_user_timeout flag.
See TCP_USER_TIMEOUT flag in tcp(7) man page.
This Linux-only setting allows the send operation to fail faster
if for some reason the remote socket is unresponsive and does not send ACKs for
the transmitted segments.
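
For reference, setting this option on a socket looks roughly like the sketch below. It uses the standard `setsockopt`/`getsockopt` calls with the `TCP_USER_TIMEOUT` option from tcp(7), which takes an unsigned int in milliseconds. `SetTcpUserTimeout` and `GetTcpUserTimeout` are hypothetical helper names, and how the `tcp_user_timeout` flag maps onto this call is not shown in the message.

```cpp
#include <cassert>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Sets TCP_USER_TIMEOUT (in milliseconds) on a TCP socket. Linux-only.
// Returns true on success.
bool SetTcpUserTimeout(int fd, unsigned int timeout_ms) {
  assert(fd >= 0);
  return setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, &timeout_ms,
                    sizeof(timeout_ms)) == 0;
}

// Reads the option back; returns -1 if getsockopt fails.
long GetTcpUserTimeout(int fd) {
  assert(fd >= 0);
  unsigned int value = 0;
  socklen_t len = sizeof(value);
  if (getsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, &value, &len) != 0) {
    return -1;
  }
  return static_cast<long>(value);
}
```

A value of 0 leaves the system default behavior in place, per tcp(7).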
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* Update src/facade/dragonfly_listener.cc
Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>
Signed-off-by: Roman Gershman <romange@gmail.com>
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Signed-off-by: Roman Gershman <romange@gmail.com>
Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>