Pre-this change, whenever Dragonfly was paused (either by a user or by
internal processes like takeover or slot migration finalization),
migrations and replications were also paused.
This could cause timing issues, which sometime result in migration
failures. Specifically, when 2 nodes have migrations from one to the
other **in parallel** (A->B and B->A), the `Pause()` that happens on A
(which happens because it's a source node) will stop it from processing
incoming traffic from B (incoming because it is also a target node).
If timed correctly, it will be locked until it times out, and so the
migration will fail.
The fix is to prevent replications and migrations from adhering to
`Pause()`s, which I think should not have happened in the first place
because they should use the admin port anyway.
Fixes#3319
We disable address space randomization when building the binary
and use addr2line to symbolize the stacktrace if it exists.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
For some cases, this map can grow indefinitely.
This change makes it less detailed by makes sure that number of possible keys is bounded.
Still it can provide a good summary of nature of exec transactions.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
chore: deprecate RecordsPopper and serialize channel records during push
Records channel is redundant for DFS/replication because we have single producer/consumer
scenario and both running on the same thread. Unfortunately we need it for RDB snapshotting.
For non-rdb cases we could just pass a io sink to the snapshot producer,
so that it would use it directly instead of StringFile inside FlushChannelRecord.
This would reduce memory usage, eliminate yet another memory copy and generally would make everything simpler.
For that to work, we must serialize the order of FlushChannelRecord, and that's implemented by
this PR. Also fixes#3658.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: cosmetic changes around Snapshot functions
Some renames and added comments. Refactored StartIncremental into a separate function
without any functional changes.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: fix comments
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Also, remove dependence of absl::TimeZone bloated monstrosity, which was required by
absl::FormatTime api, even though we do not actually format a timezone.
When absl::LocalTimeZone is accessed it allocates hundreds of thousands of bytes
for each shard thread (maybe due to lack thread safety during lazy initialization).
At the end, strftime does a great job without any shenanigans.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
fixes#3636
The problem was that instead of implementing GT operator, we implemented
!LT, which is GE operator. As a result the iterators inside the sort algorithm reached
out of bounds.
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* fix: crash when passing empty arguments
Fix the case where we pass an empty argument, which then is parsed as an
empty string view with null pointer. The null pointer is not handled correctly
by our low level c code, hence switch to using ""sv for that.
* chore: add more list asserts + improve test_hypothesis
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
1. The offset value can be negative, in that case we should return an empty array.
2. Fix edge cases of inf*0 and -inf + inf, so they will result in 0 and non NaN (similarily to Valkey).
Signed-off-by: Roman Gershman <roman@dragonflydb.io>