* fix(regression_test): fix test_flushall_in_full_sync
The bug: the test checked the replication state with the ROLE command on
the replica. The replica updates its status to "full sync" when it starts
the full-sync flow, but at that point the master may not have started
snapshotting yet.
The fix: check the status with the ROLE command on the master, because the
master updates the status only after snapshotting has started (see the
sketch below).
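A minimal sketch of the corrected check, assuming a redis-py async client
and an illustrative shape for Dragonfly's ROLE reply (the reply format
here is an assumption, not taken from the PR):

```python
import asyncio

# Poll ROLE on the *master*: it reports "full sync" only once
# snapshotting has actually started.
async def wait_for_full_sync(c_master, timeout=5.0):
    for _ in range(int(timeout / 0.1)):
        reply = await c_master.execute_command("ROLE")
        # Assumed reply shape: ["master", [[ip, port, state], ...]]
        replicas = reply[1]
        if replicas and all(state == "full sync" for _, _, state in replicas):
            return
        await asyncio.sleep(0.1)
    raise TimeoutError("replicas never reached full sync")
```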
Signed-off-by: adi_holden <adi@dragonflydb.io>
* sec: Adjust flag checks when using TLS.
* Trust default certificates if no specific roots are given
* Add regression tests for the different scenarios
* Validate that client connections work as well
The test sometimes fails when starting the master after killing it.
The reason is that the OS had not released the port by the time we
started the master again.
The fix: add a sleep after the kill, as sketched below.
Once pytest assigns randomly selected ports we can remove this sleep.
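A hedged sketch of the workaround, assuming a test helper with
stop()/start() methods (the helper API is illustrative, not the real
fixture):

```python
import time

def restart_master(master):
    master.stop(kill=True)
    # Give the OS time to release the listening port before rebinding.
    # Can be dropped once tests pick random ports.
    time.sleep(1)
    master.start()
```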
Signed-off-by: adi_holden <adi@dragonflydb.io>
* fix(regression_test): fixes in shutdown and replication pytests
- skip test_gracefull_shutdown test
- fix test_take_over_seeder test:
  bug: the dbfilename was not unique, so between runs the server reloaded
  the snapshot from the previous test run, which failed the test.
  fix: use a random dbfilename (see the sketch after this list)
- fix test_take_over_timeout test:
  bug: the REPLTAKEOVER timeout was not small enough for the opt dfly build.
  fix: decrease the timeout
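A sketch of the random-dbfilename fix, assuming an instance factory that
accepts a dbfilename argument (the fixture wiring is illustrative):

```python
import uuid

def unique_dbfilename(prefix="take-over-seeder"):
    # A fresh name per run, so a server never reloads a stale snapshot
    # left behind by a previous test run.
    return f"{prefix}-{uuid.uuid4().hex}"

# e.g.: df_factory.create(dbfilename=unique_dbfilename())
```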
Signed-off-by: adi_holden <adi@dragonflydb.io>
1. add tls-ca-cert-file flag
2. add tls-ca-cert-dir flag
3. enable redis-cli to connect over TLS without the --insecure flag by
   properly validating the certificate with the CA
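An illustrative client-side check (redis-py shown instead of redis-cli;
host, port, and paths are assumptions): with the server started with
--tls and --tls-ca-cert-file pointing at the CA, a client validating
against that same CA should connect cleanly.

```python
import redis

client = redis.Redis(
    host="localhost",
    port=6380,
    ssl=True,
    ssl_ca_certs="/path/to/ca.pem",  # the CA passed via --tls-ca-cert-file
)
assert client.ping()  # no --insecure-style bypass needed
```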
The issue was that, sometimes, the ID generated for one of the nodes
contained the slot ID used in the test (either 5259 or 5260).
This caused the test to replace the "slot" part of the ID, which in turn
caused the node to think that it no longer owned any slots.
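A sketch of the bug class (field names and values are illustrative, not
the actual test code): a naive string replace over the whole config can
also rewrite a node ID that happens to contain the slot number, whereas
a substitution anchored to the slot fields cannot.

```python
import re

config = '{"node_id": "abc5259def", "slot_ranges": [{"start": 5259, "end": 5259}]}'

bad = config.replace("5259", "5260")  # also mangles the node ID
good = re.sub(r'("start"|"end"): 5259', r"\1: 5260", config)  # slots only
```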
* fix(server): Initialize ServerFamily with all listeners.
- Add a test for CLIENT LIST, which is the visible result of this change
  (see the sketch below).
* use std::move
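A hedged sketch of that check (ports are illustrative): once ServerFamily
knows about all listeners, CLIENT LIST should report connections made on
every port, not only the first one.

```python
import redis

c1 = redis.Redis(port=6379)
c2 = redis.Redis(port=16379)  # a second listener
c1.ping()
c2.ping()
clients = c1.execute_command("CLIENT", "LIST")
# Expect entries for both connections in the output.
```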
* feat: Implement replicas take over (see the sketch after this list)
* Basic test
* Address CR comments
* Write a better test. Sadly it fails
* chore: Expose AwaitDispatches for reuse in takeover
* Ensure that no commands can execute during or after a takeover
* CR progress
* Actually disable the expiration
* Improve tests coverage
* Fix the dispatch waiting code
* Improve testing coverage and fix a shutdown snapshot bug
* don't replicate a replica
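A minimal sketch of the takeover flow from the test's point of view
(client API and reply shapes are assumptions; the timeout argument
follows the REPLTAKEOVER note elsewhere in this log):

```python
async def take_over(c_replica, timeout_sec=5):
    # The replica asks to take over from its master; after success it
    # should report itself as a master via ROLE.
    await c_replica.execute_command("REPLTAKEOVER", timeout_sec)
    reply = await c_replica.execute_command("ROLE")
    assert reply[0] == "master"
```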
Enables execution of global Lua scripts inside MULTI/EXEC transactions if
the default script config enables global execution for scripts. This
change is only a fix and does not provide any safeguards against other
execution scenarios (namely, enabling globality via script flags). In the
future, the proper execution mode should be determined more carefully by
inspecting the scripts to be executed.
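An illustrative client-side view (redis-py; the script body is just an
arbitrary example of a script with global effects):

```python
import redis

r = redis.Redis()
pipe = r.pipeline(transaction=True)          # MULTI/EXEC
pipe.eval("return redis.call('DBSIZE')", 0)  # global script, no declared keys
print(pipe.execute())
```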
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
Co-authored-by: Kostas Kyrimis <kostaskyrim@gmail.com>
The test case checking is_loading == 1 is inherently racy, because the
client can connect at any time before or after the Dragonfly instance
loads the snapshot.
This PR is a temporary solution for clients that are not properly removed
from the connection pool, which triggers an active-client assertion
during Dragonfly instance shutdown.
fix: remove bad check-fail in the transaction code.
Fixes #1421.
The failure reproduces for Dragonfly running with a single thread, where
all the arguments are grouped within the same ShardData (see the repro
sketch below).
Also, we improve verbosity levels inside reply_builder.cc.
For that we extend SinkReplyBuilder to support protocol error reporting,
and we remove the ad-hoc code for this from dragonfly_connection.
This is required to track errors easily with `--vmodule=reply_builder=1`.
Finally, a pytest is added to cover the issue.
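A hedged repro sketch (the exact command from issue #1421 is not given
here; this only illustrates the single-shard condition): with one
proactor thread, every argument of a multi-key command lands in the same
ShardData.

```python
import redis

# Assumes dragonfly was started with --proactor_threads=1.
r = redis.Redis()
r.execute_command("MSET", "k1", "v1", "k2", "v2")  # all keys -> one shard
```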
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
In this case, `redis.RedisCluster`.
To be doubly sure, I also looked at the actual packets and saw that the
client asks for `CLUSTER SLOTS`, and then, after the redistribution of
slots and a few `MOVED` replies, it asks for the new slots again.
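An example of that client behavior (host and port are illustrative):

```python
import redis

rc = redis.RedisCluster(host="localhost", port=7000)
rc.set("foo", "bar")  # a MOVED reply here makes the client refetch the slot map
```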
This allows masters to send data of non-owned keys to their replicas,
which is useful when:
1. Config is temporarily different between master and replica
2. Preparing to take ownership of currently not-owned slots (in the upcoming migration feature)
Fixed #1319
* feat: Support ACKs from replica to master (see the sketch after this list)
* Rework after CR
* Split the acks into a different fiber and remove the PING loop
* const convention
* move around the order.
* revert sleep removal
* Exit ack fiber on cancellation
* Don't send ACKs if server doesn't support it
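A hedged sketch of the resulting replica-side loop (asyncio stands in
for a fiber; the send function and offset getter are illustrative):

```python
import asyncio

async def ack_fiber(send_cmd, get_offset, master_supports_acks, interval=1.0):
    if not master_supports_acks:
        return  # don't send ACKs if the server doesn't support them
    try:
        while True:
            # REPLCONF ACK <offset> is fire-and-forget on the replication link.
            await send_cmd("REPLCONF", "ACK", str(get_offset()))
            await asyncio.sleep(interval)
    except asyncio.CancelledError:
        pass  # exit the ack fiber on cancellation
```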