1
0
Fork 0
mirror of https://github.com/dragonflydb/dragonfly.git synced 2024-12-14 11:58:02 +00:00
Commit graph

625 commits

Author SHA1 Message Date
Shahar Mike
3c65651c69
feat: Huge values breakdown in cluster migration (#4144)
* feat: Huge values breakdown in cluster migration

Before this PR we used `RESTORE` commands for transferring data between
source and target nodes in cluster slots migration.

While this _works_, it has a side effect of consuming 2x memory for huge
values (i.e. if a single key's value takes 10gb, serializing it will
take 20gb or even 30gb).

With this PR we break down huge keys into multiple commands (`RPUSH`,
`HSET`, etc), respecting the existing `--serialization_max_chunk_size`
flag.

Part of #4100
2024-11-25 15:58:18 +02:00
Shahar Mike
6a7f345bc5
chore: Hide replicas from CLUSTER subcmds in managed mode (#4174)
* chore: Hide replicas from `CLUSTER` subcmds in managed mode

Part of #4173 (see for context)

* server.client()
2024-11-24 13:10:32 +00:00
Roman Gershman
91caa940b9
chore: fix shutdown sequence in Dragonfly server (#4168)
1. Better logging in regtests
2. Release resources in dfly_main in more controlled manner.
3. Switch to ignoring signals when unregister signal handlers during the shutdown.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-24 10:35:00 +02:00
Roman Gershman
b8c2dd888a
chore: log exit code of failing dragonfly in tests (#4166)
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-22 11:40:10 +02:00
Shahar Mike
24a1ec6ab2
fix: Huge entries fail to load outside RDB / replication (#4154)
* fix: Huge entries fail to load outside RDB / replication

We have an internal utility tool that we use to deserialize values in
some use cases:

* `RESTORE`
* Cluster slot migration
* `RENAME`, if the source and target shards are different

We [recently](https://github.com/dragonflydb/dragonfly/issues/3760)
changed this area of the code, which caused this regression as it only
handled RDB / replication streams.

Fixes #4143
2024-11-20 14:00:07 +00:00
Roman Gershman
0e7ae34fe4
fix: enforce load limits when loading snapshot (#4136)
* fix: enforce load limits when loading snapshot

Prevent loading snapshots with used memory higher than max memory limit.

1. Store the used memory metadata only inside the summary file
2. Load the summary file before loading anything else, and if the used-memory is higher,
   abort the load.
---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-20 06:12:47 +02:00
Roman Gershman
d467a348ac
fix: allow SELECT in multi/exec if it's a noop (#4146)
Fixes #4120

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-18 22:27:34 +02:00
Daniel M
d241839cff
chore:update fakeredis, remove irrelevant tests (#4014)
* chore:update fakeredis, remove irrelevant tests
2024-11-17 20:24:46 +02:00
Roman Gershman
8e3b8ccbe3
chore: run tests with list_experimental_v2 enabled (#4112)
Also fix issues with memory_test.py running locally.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-15 10:33:45 +01:00
Borys
4bc9ad6f01
test: add test for snapshoting during migration (#4108)
* test: add test for snapshotting during migration

* test: add test to run replication after migration
2024-11-13 13:40:00 +02:00
Kostas Kyrimis
91c236ab2f
fix: slow CI tests (#4117)
* refactor test_big_value_serialization
* remove no needed replication tests for big value

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-11-13 10:32:31 +02:00
Roman Gershman
fa8f3f5564
fix: regression in squashing code when determining eval commands (#4116)
The regression was caused by #3947 and it causes crashes in bullmq.
It has not been found till now because python client sends commands in uppercase.
Fixes #4113

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Co-authored-by: Kostas Kyrimis <kostas@dragonflydb.io>
2024-11-11 19:54:47 +00:00
Roman Gershman
1eef773d0a
fix: test_noreply_pipeline flakiness (#4102)
Fixes #3896. Now we retry several times.
In my checks this should significantly reduce the failure probability.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-10 13:39:24 +02:00
adiholden
ae3faf59fb
feat(server): dont use channel for replication / save df (#4041)
* feat server: dont use channel for replication / save df

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-11-05 16:50:01 +02:00
dependabot[bot]
54c67a9198
chore(deps): bump pytest-repeat from 0.9.1 to 0.9.3 in /tests/dragonfly (#4057)
Bumps [pytest-repeat](https://github.com/pytest-dev/pytest-repeat) from 0.9.1 to 0.9.3.
- [Release notes](https://github.com/pytest-dev/pytest-repeat/releases)
- [Changelog](https://github.com/pytest-dev/pytest-repeat/blob/main/CHANGES.rst)
- [Commits](https://github.com/pytest-dev/pytest-repeat/compare/v0.9.1...v0.9.3)

---
updated-dependencies:
- dependency-name: pytest-repeat
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-04 23:47:24 +02:00
dependabot[bot]
94ee96afd0
chore(deps): bump redis-om from 0.2.2 to 0.3.3 in /tests/dragonfly (#4060)
Bumps [redis-om](https://github.com/redis/redis-om-python) from 0.2.2 to 0.3.3.
- [Release notes](https://github.com/redis/redis-om-python/releases)
- [Commits](https://github.com/redis/redis-om-python/compare/v0.2.2...v0.3.3)

---
updated-dependencies:
- dependency-name: redis-om
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-04 22:35:18 +02:00
dependabot[bot]
b70f9703f4
chore(deps): bump tomli from 2.0.1 to 2.0.2 in /tests/dragonfly (#4059)
Bumps [tomli](https://github.com/hukkin/tomli) from 2.0.1 to 2.0.2.
- [Changelog](https://github.com/hukkin/tomli/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hukkin/tomli/compare/2.0.1...2.0.2)

---
updated-dependencies:
- dependency-name: tomli
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-04 22:31:29 +02:00
Roman Gershman
cc6fdd7fbf
chore: add retry to test_noreply_pipeline test (#4045) 2024-11-04 08:32:48 +00:00
Borys
e4b468d953
fix: reduce memory consumption during migration (#4017)
* refactor: reduce memory consumption for RestoreStreamer
* fix: add Throttling into RestoreStreamer::WriteBucket
2024-11-03 17:03:45 +02:00
Borys
5a597cf6cc
test: update test_big_containers (#4025) 2024-11-03 16:20:11 +02:00
Roman Gershman
c8b56b69b4
chore: print info stats if test_noreply_pipeline fails (#4016) 2024-10-30 08:17:15 +00:00
Kostas Kyrimis
4b495182e8
fix: separate Heartbeat and ShardHandler to fibers (#3936)
* separate shard_handler from Heartbeat
* add test

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-10-29 09:22:53 +02:00
Roman Gershman
41d8c66a15
fix: flaky test_failover test (#4007)
Prevent from checking replica too early,
to avoid "Loading" exception.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-10-28 14:58:36 +02:00
Borys
c80d21fcba
fix: crash if we OOM during migration process (#3968) 2024-10-23 17:04:08 +03:00
Stepan Bagritsevich
ea9dc9c454
chore(fakeredis): Enable JSON tests in the Fakeredis tests (#3773)
* chore(fakeredis): Enable JSON tests in the Fakeredis tests

fixes dragonflydb#3671

Signed-off-by: Stsiapan Bahrytsevich <stefan@dragonflydb.io>

* refactor: address comments

Signed-off-by: Stsiapan Bahrytsevich <stefan@dragonflydb.io>

* tmp commit

Signed-off-by: Stsiapan Bahrytsevich <stefan@dragonflydb.io>

* refactor: address comments 2

Signed-off-by: Stsiapan Bahrytsevich <stefan@dragonflydb.io>

---------

Signed-off-by: Stsiapan Bahrytsevich <stefan@dragonflydb.io>
2024-10-23 08:53:14 +02:00
Kostas Kyrimis
37fec87070
chore: increase load in test_noreply_pipeline (#3960)
Test is flaky because it relies that the producer (the pytest) to send fast enough a bunch of commands before they get dispatched synchronously so I increased the load.
2024-10-22 21:00:25 +03:00
Borys
dec0712e15
test: add test to test big collections or collections with big values (#3959) 2024-10-22 15:01:32 +03:00
Kostas Kyrimis
119723316e
chore: tune test_rss_used_mem_gap (#3958)
It appears that newer versions of the gh runner require more memory. Some cases of the test test_rss_used_mem_gap allocate more than 6.5-7 gb of memory leaving barely 0.5gb to the gh runner (7.5 in total available) which sometimes cause the instance to run out of memory.
2024-10-22 10:31:13 +03:00
Kostas Kyrimis
478a5d476d
chore: disable test_cluster_memory_consumption_migration (#3948)
Test takes more than 10 minutes on the CI and it causes it to timeout
2024-10-21 08:41:15 +03:00
Vladislav
32a31cf1d8
chore(facade): Fix bad new IO glue (#3940)
* chore(facade): Fix bad new IO glue

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2024-10-18 23:25:56 +03:00
Borys
866c82a3fa
test: add test to reproduce a lot of memory consumtion during migration (#3939) 2024-10-17 14:23:26 +03:00
Shahar Mike
c3f9ec18ae
fix: Fix test_flushall_in_full_sync (#3929)
* fix: Fix `test_flushall_in_full_sync`

This test failed in CI many times. The issue was that we reach stable
sync too quickly, and miss the full sync stage.

I changed the seeder to add 100k (instead of 30k) keys for the stage to
take longer.

* StaticSeeder
2024-10-15 11:48:32 +00:00
adiholden
a1830e1b5e
feat(server): use listpack node encoding for list (#3914)
Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-10-15 13:55:26 +03:00
Shahar Mike
c868b27bbe
fix: Support replicating Valkey and Redis 7.2 (#3927)
Until now, we only tested Dragonfly against Redis 6.2.  It appears that
something has changed in the way Redis sends stable sync commands, and
now they also forward `MULTI` and `EXEC` as part of their replication.

Since we do not allow all commands to run under `MULTI`/`EXEC`,
specifically `SELECT`, a Dragonfly replica of such servers failed these
commands and became inconsistent with the data on the master.

The proposed fix is to simply ignore (i.e. not execute) `MULTI`/`EXEC`
coming from a Redis/Valkey master, and run the commands within those
transactions individually, like we do for other transactions.

To test this we randomly choose a redis/valkey server based on 3
available installed binaries and test against them.
2024-10-15 13:12:16 +03:00
Kostas Kyrimis
588d6cc339
chore: relax assertion in test_noreply_pipeline (#3908)
*adjust assert condition

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-10-14 09:27:57 +03:00
Vladislav
e71f083f34
feat(search): STOPWORDS (#3851)
Adds support for STOPWORDS option
2024-10-10 21:58:12 +03:00
Kostas Kyrimis
a5fa3ab9f5
chore: skip flaky test_noreply_pipeline (#3903)
* Disable the test_noreply_pipeline because it's really flaky. Will look on this once I wrap up with my pending tasks.
2024-10-10 07:37:13 +00:00
Vladislav
786c9cd44d
chore: collection size (#3844) 2024-10-08 18:51:11 +03:00
Shahar Mike
b2ebfd05d4
fix: Do not publish to connections without context (#3873)
* fix: Do not publish to connections without context

This is a rare case where a closed connection is kept alive while the
handling fiber yields, therefore leaving `cc_` (the connection context)
pointing to null for other fibers to see.

As far as I can see, this can only happen during server shutdown, but
there could be other cases that I have missed.

The test on its own does _not_ reproduce the crash, however with added
`ThisFiber::SleepFor()`s I could reproduce the crash:

* Right before `DispatchBrief()`
  [here](e3214cb603/src/server/channel_store.cc (L154))
* Right after connection context `reset()`
  [here](2ab480e160/src/facade/dragonfly_connection.cc (L750))

In any case, calling `SendPubMessageAsync()` to a connection where `cc_`
is null is a bug, and we fix that here.

* rewording
2024-10-08 14:45:57 +03:00
Daniel M
1958e09a9a
refactor: refactor fakeredis tests (#3852)
* refactor:fakeredis tests
2024-10-03 12:41:05 +03:00
adiholden
fa288c19b2
test: more stabe test_bgsave_and_save (#3843)
Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-10-01 15:03:12 +03:00
Kostas Kyrimis
ec353e1522
chore: add logs to test_acl_cat_commands_multi_exec_squash (#3826)
* add logs to test

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-30 12:03:40 +00:00
Kostas Kyrimis
b19f722011
chore: do not close connections at the end of pytest (#3811)
A common case is that we need to clean up a connection before we exit a test via .close() method. This is needed because otherwise the connection will raise a warning that it is left unclosed. However, remembering to call .close() at each connection at the end of the test is cumbersome! Luckily, fixtures in python can be marked as async which allow us to:

* cache all clients created by DflyInstance.client()
* clean them all at the end of the fixture in one go

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-30 09:54:41 +03:00
Kostas Kyrimis
ed11c8d3a4
chore: allow config set notify_keyspace_events (#3790)
We do not allow notify_keyspace_events to be set at runtime via config set command.

* allow notify_keyspace_events in config set command
* add tests

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-30 09:54:02 +03:00
Borys
ce0320300b
test: update test_noreply_pipeline to prevent false fail (#3801) 2024-09-26 12:12:53 +00:00
Kostas Kyrimis
105c2bd761
fix: bitop do not add dst key if result is empty (#3751)
* fix bitiop creating the dst key if result is empty
* fix replicating dst with the wrong type
* make bitop a blind update (similar to set command)

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-25 09:45:20 +03:00
Shahar Mike
526bce4222
chore: Forbid replicating a replica (#3779)
* chore: Forbid replicating a replica

We do not support connecting a replica to a replica, but before this PR
we allowed doing so. This PR disables that behavior.

Fixes #3679

* `replicaof_mu_`
2024-09-24 13:42:22 +00:00
Shahar Mike
9aadc0cd2b
fix: Fix flaky test test_acl_revoke_pub_sub_while_subscribed (#3768)
fix: Fix flaky test `test_acl_revoke_pub_sub_while_subscribed`

The reason it failed is that, in some rare cases, the subscriber did not
get the first few messages of the publisher. This is likely due to
timing of subscribe and publish, in different connections / threads.

Given Pub/Sub has very weak guarantees, it's probably ok as is, so I
just added a sleep to get the test to pass always.
2024-09-24 11:47:17 +03:00
Borys
3804076ea9
fix: setrange with empty value doesn't modify the DB (#3771) 2024-09-23 19:09:53 +03:00
Kostas Kyrimis
15fce9df2d
chore: logs on assert fail for test_acl_cat_commands_multi_exec_squash (#3749)
* print result if assertion fails

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-23 09:51:58 +03:00