mirror of https://github.com/dragonflydb/dragonfly.git synced 2024-12-14 11:58:02 +00:00
555 commits

Author SHA1 Message Date
Shahar Mike
c3f9ec18ae
fix: Fix test_flushall_in_full_sync (#3929)
* fix: Fix `test_flushall_in_full_sync`

This test failed in CI many times. The issue was that we reach stable
sync too quickly, and miss the full sync stage.

I changed the seeder to add 100k (instead of 30k) keys for the stage to
take longer.

* StaticSeeder
2024-10-15 11:48:32 +00:00
adiholden
a1830e1b5e
feat(server): use listpack node encoding for list (#3914)
Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-10-15 13:55:26 +03:00
Shahar Mike
c868b27bbe
fix: Support replicating Valkey and Redis 7.2 (#3927)
Until now, we only tested Dragonfly against Redis 6.2.  It appears that
something has changed in the way Redis sends stable sync commands, and
now they also forward `MULTI` and `EXEC` as part of their replication.

Since we do not allow all commands to run under `MULTI`/`EXEC`,
specifically `SELECT`, a Dragonfly replica of such servers failed these
commands and became inconsistent with the data on the master.

The proposed fix is to simply ignore (i.e. not execute) `MULTI`/`EXEC`
coming from a Redis/Valkey master, and run the commands within those
transactions individually, like we do for other transactions.

To test this we randomly choose a redis/valkey server based on 3
available installed binaries and test against them.
2024-10-15 13:12:16 +03:00
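The approach described in #3927 (drop `MULTI`/`EXEC` markers from a Redis/Valkey master and run the enclosed commands one at a time) can be illustrated with a small Python sketch. This is not Dragonfly's implementation, which lives in the C++ server. The function name `flatten_transactions` and the list-of-lists command representation are assumptions made for illustration only.

```python
def flatten_transactions(commands):
    """Yield replicated commands one by one, skipping MULTI/EXEC markers.

    `commands` is an iterable of argument lists, e.g. ["SET", "k", "v"].
    Hypothetical helper: the name and data shape are illustrative only.
    """
    for cmd in commands:
        name = cmd[0].upper()
        if name in ("MULTI", "EXEC"):
            # Transaction markers are ignored, not executed.
            continue
        yield cmd


stream = [["MULTI"], ["SELECT", "1"], ["SET", "k", "v"], ["EXEC"]]
applied = list(flatten_transactions(stream))
```

Here `SELECT` reaches the executor as an ordinary standalone command, so it is no longer rejected for appearing inside a transaction.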
Kostas Kyrimis
588d6cc339
chore: relax assertion in test_noreply_pipeline (#3908)
* adjust assert condition

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-10-14 09:27:57 +03:00
Vladislav
e71f083f34
feat(search): STOPWORDS (#3851)
Adds support for STOPWORDS option
2024-10-10 21:58:12 +03:00
Kostas Kyrimis
a5fa3ab9f5
chore: skip flaky test_noreply_pipeline (#3903)
* Disable test_noreply_pipeline because it is really flaky. I will look into this once I wrap up my pending tasks.
2024-10-10 07:37:13 +00:00
Vladislav
786c9cd44d
chore: collection size (#3844) 2024-10-08 18:51:11 +03:00
Shahar Mike
b2ebfd05d4
fix: Do not publish to connections without context (#3873)
* fix: Do not publish to connections without context

This is a rare case where a closed connection is kept alive while the
handling fiber yields, therefore leaving `cc_` (the connection context)
pointing to null for other fibers to see.

As far as I can see, this can only happen during server shutdown, but
there could be other cases that I have missed.

The test on its own does _not_ reproduce the crash, however with added
`ThisFiber::SleepFor()`s I could reproduce the crash:

* Right before `DispatchBrief()`
  [here](e3214cb603/src/server/channel_store.cc (L154))
* Right after connection context `reset()`
  [here](2ab480e160/src/facade/dragonfly_connection.cc (L750))

In any case, calling `SendPubMessageAsync()` to a connection where `cc_`
is null is a bug, and we fix that here.

* rewording
2024-10-08 14:45:57 +03:00
adiholden
fa288c19b2
test: more stable test_bgsave_and_save (#3843)
Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-10-01 15:03:12 +03:00
Kostas Kyrimis
ec353e1522
chore: add logs to test_acl_cat_commands_multi_exec_squash (#3826)
* add logs to test

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-30 12:03:40 +00:00
Kostas Kyrimis
b19f722011
chore: do not close connections at the end of pytest (#3811)
A common case is that we need to clean up a connection via its .close() method before exiting a test. Otherwise the connection raises a warning that it was left unclosed. However, remembering to call .close() on each connection at the end of every test is cumbersome. Luckily, Python fixtures can be marked as async, which allows us to:

* cache all clients created by DflyInstance.client()
* clean them all at the end of the fixture in one go

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-30 09:54:41 +03:00
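The idea in #3811 (cache every client an instance hands out, then close them all when the fixture finishes) might look roughly like the sketch below. `FakeClient`, `Instance`, and `instance_fixture` are hypothetical stand-ins, not the repository's `DflyInstance` code. The async generator mirrors the body of an async pytest fixture but is driven by hand here so the example runs without pytest.

```python
import asyncio


class FakeClient:
    """Stand-in for a Redis client; only tracks whether it was closed."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


class Instance:
    """Hypothetical instance wrapper that remembers the clients it creates."""

    def __init__(self):
        self._clients = []

    def client(self):
        c = FakeClient()
        self._clients.append(c)  # cache for cleanup
        return c

    async def close_clients(self):
        for c in self._clients:
            await c.close()
        self._clients.clear()


async def instance_fixture():
    # Same shape as an async fixture: setup, yield, teardown.
    inst = Instance()
    try:
        yield inst
    finally:
        await inst.close_clients()  # one cleanup pass for all clients


async def demo():
    fixture = instance_fixture()
    inst = await fixture.__anext__()   # setup
    a, b = inst.client(), inst.client()
    await fixture.aclose()             # teardown runs the finally block
    return a.closed, b.closed
```

With this shape, a test body never calls `.close()` itself; the teardown step handles every cached client.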
Kostas Kyrimis
ed11c8d3a4
chore: allow config set notify_keyspace_events (#3790)
Until now, notify_keyspace_events could not be set at runtime via the config set command.

* allow notify_keyspace_events in config set command
* add tests

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-30 09:54:02 +03:00
Borys
ce0320300b
test: update test_noreply_pipeline to prevent false fail (#3801) 2024-09-26 12:12:53 +00:00
Kostas Kyrimis
105c2bd761
fix: bitop do not add dst key if result is empty (#3751)
* fix bitop creating the dst key if result is empty
* fix replicating dst with the wrong type
* make bitop a blind update (similar to set command)

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-25 09:45:20 +03:00
Shahar Mike
526bce4222
chore: Forbid replicating a replica (#3779)
* chore: Forbid replicating a replica

We do not support connecting a replica to a replica, but before this PR
we allowed doing so. This PR disables that behavior.

Fixes #3679

* `replicaof_mu_`
2024-09-24 13:42:22 +00:00
Shahar Mike
9aadc0cd2b
fix: Fix flaky test test_acl_revoke_pub_sub_while_subscribed (#3768)
fix: Fix flaky test `test_acl_revoke_pub_sub_while_subscribed`

The reason it failed is that, in some rare cases, the subscriber did not
get the first few messages of the publisher. This is likely due to
timing of subscribe and publish, in different connections / threads.

Given Pub/Sub has very weak guarantees, it's probably ok as is, so I
just added a sleep to get the test to pass always.
2024-09-24 11:47:17 +03:00
Kostas Kyrimis
15fce9df2d
chore: logs on assert fail for test_acl_cat_commands_multi_exec_squash (#3749)
* print result if assertion fails

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-23 09:51:58 +03:00
Roman Gershman
29b18f0dcb
fix: tune test_replicaof_reject_on_load parameters (#3762)
Reduce the snapshot size by 20% and increase the timeout to avoid failures due to slow loads.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-09-23 09:50:10 +03:00
Roman Gershman
e09ebe0c5c
fix: test deadlock with processing the stdout of sed (#3735)
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-09-22 15:40:27 +03:00
adiholden
4d38271efa
feat(server): introduce rss oom limit (#3702)
* introduce rss denyoom limit

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-09-22 13:28:24 +03:00
adiholden
5cf917871c
feat(server): introduce oom_deny_commands flag (#3718)
* server: introduce oom_deny_commands flag

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-09-22 09:32:18 +03:00
Kostas Kyrimis
ed21867fe9
chore: add missing await in test_take_over_seeder (#3744)
* add missing await

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-19 17:03:11 +00:00
Shahar Mike
55e3647248
chore: Switch ports for cluster_mgr_test.py (#3741)
We saw failures due to ports already being in use.
2024-09-19 12:32:31 +03:00
adiholden
409c2a3beb
test: add test for replication deadlock on replication timeout (#3691)
* test: add test for replication deadlock on replication timeout

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-09-19 12:11:28 +03:00
Borys
bbaa2669f9
test: unskip test for debugging purpose (#3738) 2024-09-18 14:13:07 +00:00
Borys
f122a19a02
test: add tests for replication (#3734)
* test: add tests for replication
2024-09-18 16:32:21 +03:00
Kostas Kyrimis
6e45c9c3e2
fix: properly track json memory usage (#3641)
* add JsonMemTracker
* add logic based on MiMallocResource deltas that calculates json object usage
* add test

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-18 13:08:43 +00:00
Shahar Mike
ffb4c2b601
fix: Fix test_take_over_seeder (#3733)
The test assumed that any shutdown takes no more than 1s. This doesn't
always hold, and waiting for a fixed 1s isn't ideal either, because
shutdown usually takes less than that.

Changed to use `assert_eventually` instead.

Fixes #3684
2024-09-18 14:17:09 +03:00
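The switch in #3733 from a fixed 1s wait to `assert_eventually` follows a common polling pattern: retry an assertion until it passes or a deadline expires. The sketch below is a generic illustration of that pattern, not the repository's `assert_eventually` helper. The name `eventually` and its `timeout` and `interval` parameters are assumptions.

```python
import asyncio
import time


async def eventually(check, timeout=5.0, interval=0.05):
    """Await `check()` repeatedly until it stops raising AssertionError.

    Re-raises the last AssertionError once `timeout` seconds have passed.
    Hypothetical helper, not the one used in the Dragonfly test suite.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return await check()
        except AssertionError:
            if time.monotonic() >= deadline:
                raise
            await asyncio.sleep(interval)


async def demo():
    calls = 0

    async def check():
        # Simulates a condition that becomes true on the third poll.
        nonlocal calls
        calls += 1
        assert calls >= 3, "not ready yet"
        return calls

    return await eventually(check, timeout=2.0, interval=0.01)
```

Compared with a fixed sleep, this returns as soon as the condition holds and tolerates slow runs up to the timeout.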
Shahar Mike
1c6be62a0b
fix: Fix cluster_mgr.py (#3730)
We updated the reply of `SLOT-MIGRATION-STATUS`, so `cluster_mgr.py`
needs to be adjusted as well.
2024-09-18 11:44:15 +03:00
Shahar Mike
a115bc2b9f
fix: Fix test test_client_pause_with_replica (#3729)
There are 2 minor issues with this test:
1. It specified `cmdstat_replconf` as `cmd_stats` instead of `cmd`,
   that's clearly a typo as `cmd_stats` is a map with stats, while
   `replconf` is a Dragonfly command
2. Command `MULTI` is allowed to run even when the server is in paused
   state, see
   [here](https://github.com/dragonflydb/dragonfly/blob/main/src/server/main_service.cc#L1197):

   ```
   // Don't interrupt running multi commands or admin connections.
   ```

Fixes #3675
2024-09-18 09:40:26 +03:00
Andy Dunstall
a64fc74ce1
tests: fix and enable s3 snapshot test (#3720)
* test: fix s3 snapshot test

* ci: configure s3 regression test

* tests: only run s3 snapshot test if bucket not empty
2024-09-17 17:35:53 +03:00
Roman Gershman
e21ba0b3d9
chore: symbolize stack traces in tests upon crash (#3714)
We disable address space randomization when building the binary
and use addr2line to symbolize the stacktrace if it exists.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-09-16 13:43:16 +03:00
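As a rough illustration of the symbolization step in #3714, the sketch below collects hexadecimal addresses from a crash log and builds an `addr2line` command line for them. The helper name `addr2line_command` and the assumption that frames appear as plain `0x...` addresses are illustrative; the actual test harness may parse the trace differently.

```python
import re

# Assumption: stack frames contain raw addresses such as 0x55d1c0a1b2c3.
HEX_ADDR = re.compile(r"0x[0-9a-fA-F]+")


def addr2line_command(binary_path, stack_trace):
    """Return an addr2line argv for the addresses in `stack_trace`.

    Returns None when the text contains no addresses (no stack trace).
    -e selects the binary, -f prints function names, -C demangles C++
    symbols, -p prints each frame on one line.
    """
    addresses = HEX_ADDR.findall(stack_trace)
    if not addresses:
        return None
    return ["addr2line", "-e", binary_path, "-f", "-C", "-p", *addresses]


trace = "    @  0x55d1c0a1b2c3  (unknown)\n    @  0x55d1c0a1b400  (unknown)\n"
cmd = addr2line_command("./dragonfly", trace)
```

The resulting list could be passed to `subprocess.run(cmd, capture_output=True, text=True)`. Raw addresses only map cleanly to source lines when the binary is loaded at a fixed address, which is why the commit disables address space randomization at build time.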
Borys
93de559977
Update dflycluster slot-migration-status reply (#3707)
* feat: update DFLYCLUSTER SLOT-MIGRATION-STATUS reply
2024-09-15 09:44:40 +03:00
Kostas Kyrimis
b5929f0162
fix: allow parsing extra spaces on acl files (#3703)
* allow parsing extra whitespace characters in acl files

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-13 10:17:20 +03:00
Kostas Kyrimis
5819755af1
fix: test_replicaof_reject_on_load assert failure (#3697)
* increase snapshot size for the test

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-12 09:57:48 +03:00
Borys
bae2767707
test: fix test_cluster_replication_migration (#3699) 2024-09-11 23:00:53 +03:00
Kostas Kyrimis
d041386184
fix: test_acl_revoke_pub_sub_while_subscribed (#3680)
* add logs
* add asyncio sleep to avoid producer stalls

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-11 12:00:49 +03:00
Borys
35c287b813
test: unskip cluster tests and add debug info (#3681) 2024-09-09 22:21:17 +03:00
Shahar Mike
b10a4a5348
feat(server): Support CLIENT SETINFO (#3673)
Add support for `CLIENT SETINFO <LIB-NAME | LIB-VER>` and also return
that as part of `CLIENT LIST`, like Valkey.

Fixes #3137
2024-09-09 11:03:05 +03:00
Borys
1ed3702af7
test: fix MC test_expiration (#3663) 2024-09-06 14:20:21 +03:00
Borys
2cc2a23247
fix: deadlock in the cluster migration process (#3653) 2024-09-05 21:55:15 +03:00
Kostas Kyrimis
f8f8c69e6a
chore: disable big value ser on reg tests (#3629)
* disable big value ser on reg tests

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-03 08:04:27 +00:00
Borys
8fca7dd9f8
test: fix search tests (#3625) 2024-09-03 09:21:47 +03:00
Kostas Kyrimis
959b96e7cc
fix(test_auth_resp3_bug): release build failing (#3621)
* remove problematic assertion

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-02 14:32:53 +00:00
adiholden
658243fd09
fix pytest: use generic random dbfilename in tests (#3617)
Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-09-02 09:27:22 +03:00
Borys
8e9b097b9d
fix: fix expiration processing for set command (#3607)
* fix: fix expiration processing for set command
2024-09-02 08:44:11 +03:00
Shahar Mike
de5ecc7447
chore: Split --cluster_announce_ip and --replica_announce_ip (#3615)
chore: Split `cluster_announce_ip` and `replica_announce_ip`

This PR partially reverts #3421

Fixes #3541
2024-09-01 12:43:44 +00:00
Roman Gershman
dd0effac6f
feat: add slave_repl_offset to the replication section. (#3596)
* feat: add slave_repl_offset to the replication section.

In Valkey, slave_repl_offset denotes the replication offset on the replica side during the stable sync phase.
During the full sync phase it appears with the value 0.

In Dragonfly this field appears only after full sync has completed, so it can be used
to check whether Dragonfly has reached the stable sync phase. The value of this field describes the cumulative progress
of all the replication flows and does not directly correspond to master side metrics.

In addition, this PR fixes a bug in the wait_available_async() function in our replication tests.
This function is intended to wait until a replica reaches stable state. It did so by sending pings until they no longer
returned a LOADING error, on the assumption that the replica was already in full sync state.

However, it can happen that master_link_status is "up" but the replica has not yet reached full sync state, so the PING succeeds
just because wait_available_async() was called before full sync started. The whole approach of polling the state is fragile.

Now we use `slave_repl_offset` explicitly to see whether the replica has reached stable state.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>

* chore: simplify wait_available_async

* chore: comments

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-08-30 18:58:07 +03:00
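The check described in #3596 relies on `slave_repl_offset` being present in the replication section only after full sync has completed. A minimal sketch of that test on raw `INFO replication` output is shown below. The helper names `parse_info` and `reached_stable_sync` are hypothetical, and this is not the repository's `wait_available_async()` code.

```python
def parse_info(text):
    """Parse INFO-style output ("key:value" lines) into a dict.

    Blank lines and "# Section" headers are skipped.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key] = value
    return fields


def reached_stable_sync(info_text):
    """True once the replica reports slave_repl_offset.

    Assumption taken from the commit message: Dragonfly only emits this
    field after full sync has finished.
    """
    return "slave_repl_offset" in parse_info(info_text)


during_full_sync = "# Replication\r\nrole:slave\r\nmaster_link_status:up\r\n"
after_full_sync = during_full_sync + "slave_repl_offset:42\r\n"
```

A test would typically poll this predicate against fresh `INFO replication` output rather than treat a successful `PING` as proof of stable sync.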
Kostas Kyrimis
0705bbb536
feat(acl): add pub/sub (#3574)
* add support for pub/sub
* add tests

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-08-30 15:41:28 +03:00
Roman Gershman
20336805f3
chore: enable experimental_new_io by default. (#3605)
* chore: enable experimental_new_io by default.

It has been running for weeks with the flag on, so enable it for the community as well.

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
Co-authored-by: Vladislav Oleshko <vlad@dragonflydb.io>
2024-08-29 23:30:26 +03:00