The test assumed that any shutdown takes no more than 1s. This doesn't
always hold, and a fixed 1s wait isn't ideal either, because shutdown
usually takes less than that.
Changed to use `assert_eventually` instead.
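
For illustration, here is a minimal sketch of the polling idea behind this kind of helper. It is not Dragonfly's actual `assert_eventually`; the `wait_until` name, its parameters, and the defaults are assumptions made up for this example.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.01):
    """Poll `predicate` until it returns True or `timeout` seconds elapse.

    Hypothetical helper: returns as soon as the condition holds instead of
    always sleeping for a fixed duration, and fails only after `timeout`.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return
        if time.monotonic() >= deadline:
            raise AssertionError(f"condition not met within {timeout}s")
        time.sleep(interval)


if __name__ == "__main__":
    # Simulate a shutdown that completes after roughly 50 ms.
    start = time.monotonic()
    wait_until(lambda: time.monotonic() - start > 0.05, timeout=2.0)
    print("shutdown observed")
```

A fast shutdown lets the test continue almost immediately, while a slow one still passes as long as it finishes within the timeout.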
Fixes #3684
There are 2 minor issues with this test:
1. It specified `cmdstat_replconf` as `cmd_stats` instead of `cmd`.
That's clearly a typo, as `cmd_stats` is a map with stats, while
`replconf` is a Dragonfly command.
2. Command `MULTI` is allowed to run even when the server is in paused
state, see
[here](https://github.com/dragonflydb/dragonfly/blob/main/src/server/main_service.cc#L1197):
```
// Don't interrupt running multi commands or admin connections.
```
Fixes #3675
Before this change, whenever Dragonfly was paused (either by a user or by
internal processes like takeover or slot migration finalization),
migrations and replications were also paused.
This could cause timing issues, which sometimes result in migration
failures. Specifically, when 2 nodes have migrations from one to the
other **in parallel** (A->B and B->A), the `Pause()` that happens on A
(because it's a source node) stops A from processing incoming traffic
from B (incoming because A is also a target node).
With the right timing, A stays blocked until it times out, and so the
migration fails.
The fix is to prevent replications and migrations from adhering to
`Pause()`s, which I think should not have happened in the first place
because they should use the admin port anyway.
Fixes #3319
We disable address space randomization when building the binary
and use addr2line to symbolize the stacktrace if it exists.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
In some cases, this map can grow indefinitely.
This change makes it less detailed by making sure that the number of possible keys is bounded.
It can still provide a good summary of the nature of exec transactions.
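
The message above does not say how the keys are bounded. One common way to cap the cardinality of a stats map is to bucket values into a fixed set of ranges, sketched below. The `bucket` function, the `MAX_BUCKET` cap, and the use of a per-transaction command count as the key are all assumptions for illustration, not the scheme used in this change.

```python
from collections import Counter

MAX_BUCKET = 1024  # hypothetical cap: everything larger shares one bucket


def bucket(n: int) -> int:
    """Round n up to the nearest power of two, capped at MAX_BUCKET."""
    rounded = 1 << (max(n, 1) - 1).bit_length()
    return min(rounded, MAX_BUCKET)


def record(stats: Counter, num_commands: int) -> None:
    """Count one transaction under its bucketed key."""
    stats[bucket(num_commands)] += 1


if __name__ == "__main__":
    stats = Counter()
    for n in (1, 3, 5, 7, 1000, 5000, 123456):
        record(stats, n)
    # At most 11 distinct keys (1, 2, 4, ..., 1024) can ever appear.
    print(sorted(stats.items()))
```

The map loses per-value detail but keeps the overall shape of the distribution, and its size no longer depends on the workload.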
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
chore: deprecate RecordsPopper and serialize channel records during push
The records channel is redundant for DFS/replication because we have a single producer/consumer
scenario and both run on the same thread. Unfortunately, we need it for RDB snapshotting.
For non-RDB cases we could just pass an io sink to the snapshot producer,
so that it would use it directly instead of StringFile inside FlushChannelRecord.
This would reduce memory usage, eliminate yet another memory copy, and generally make everything simpler.
For that to work, we must serialize the order of FlushChannelRecord calls, and that's implemented by
this PR. Also fixes #3658.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: cosmetic changes around Snapshot functions
Some renames and added comments. Refactored StartIncremental into a separate function
without any functional changes.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: fix comments
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Also, remove the dependency on the bloated monstrosity that is absl::TimeZone, which was required by
the absl::FormatTime API, even though we do not actually format a timezone.
When absl::LocalTimeZone is accessed, it allocates hundreds of thousands of bytes
for each shard thread (maybe due to a lack of thread safety during lazy initialization).
In the end, strftime does a great job without any shenanigans.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>