mirror of https://github.com/dragonflydb/dragonfly.git synced 2024-12-14 11:58:02 +00:00
Commit graph

645 commits

Author SHA1 Message Date
Kostas Kyrimis
b37287bf14
chore: test metrics for huge value serialization (#4262)
* fix seeder bugs
* add test
* add assertions for huge value metrics

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-12-12 14:19:14 +02:00
Borys
f892d9b7fb
fix: increase cluster migration default timeout (#4293) 2024-12-11 14:39:41 +00:00
adiholden
03d679ac31
fix(server): don't apply eviction on rss over limit (#4276)
Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-12-09 14:19:25 +02:00
Shahar Mike
bafd8b3b8b
chore: Fix test_rss_used_mem_gap for all types (#4254)
* chore: Fix `test_rss_used_mem_gap` for all types

The test fails when it checks the gap between `used_memory` and
`object_used_memory`, by assuming that all used memory is consumed by
the `type` it `DEBUG POPULATE`s with.

This assumption is wrong, because there are other overheads, for example
the dash table and string keys.

The test failed for types `STRING` and `LIST` because they used a larger
number of keys as part of the test parameters, which added a larger
overhead.

I fixed the parameters such that all types use the same number of keys,
and also the same number of elements, modifying only the element sizes
(except for `STRING` which doesn't have sub-elements) so that the
overall `min_rss` requirement of 3.5gb still passes.

Fixes #3723

* threshold

* list

* comments test assert

* previous numbers

* ???
2024-12-08 13:34:59 +02:00
Kostas Kyrimis
f9f93b108c
chore: monotonically increasing ports for cluster tests (#4268)
We have cascading failures in cluster tests because, on assertion failures, the nodes are not properly cleaned up, and subsequent test cases that use the same ports fail. I added a monotonically increasing port generator to mitigate this effect.
2024-12-06 12:07:23 +02:00
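A monotonically increasing port generator like the one this commit describes might look roughly like the sketch below. The class name `PortGenerator`, the starting port, and the lock are assumptions for illustration only; the actual helper in the test suite is not shown in this log.

```python
import threading


class PortGenerator:
    """Hypothetical helper: hands out ports in strictly increasing order.

    Because a port is never handed out twice, a node left running by a
    failed test cannot collide with the ports of a later test.
    """

    def __init__(self, start: int = 30000, end: int = 65535):
        if not 0 < start <= end <= 65535:
            raise ValueError("invalid port range")
        self._next = start
        self._end = end
        self._lock = threading.Lock()

    def next_port(self) -> int:
        # The lock keeps the counter consistent if tests request ports
        # from several threads.
        with self._lock:
            if self._next > self._end:
                raise RuntimeError("port range exhausted")
            port = self._next
            self._next += 1
            return port
```

Each test would call `next_port()` once per node instead of reusing a fixed port number.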
Borys
17651b2610
fix: test_network_disconnect_during_migration data size was too big f… (#4260)
fix: test_network_disconnect_during_migration data size was too big for timeout
2024-12-05 13:04:58 +00:00
Stepan Bagritsevich
5483d1d05e
fix(eviction): Tune eviction threshold in cache mode (#4142)
* fix(eviction): Tune eviction threshold in cache mode

fixes #4139

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: small fix in tiered_storage_test

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: address comments

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* chore(dragonfly_test): Remove ResetService

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: fix test_cache_eviction_with_rss_deny_oom test

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: address comments

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* fix(dragonfly_test): Fix DflyEngineTest.Bug207

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* fix(dragonfly_test): Increase string size in the test Bug207

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: address comments 3

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: address comments 4

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* fix: Fix failing tests

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: address comments 5

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: resolve conflicts

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

---------

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>
2024-12-05 12:26:59 +00:00
Kostas Kyrimis
267d5ab370
chore: remove DbSlice mutex and add ConditionFlag in SliceSnapshot (#4073)
* remove DbSlice mutex
* add ConditionFlag in SliceSnapshot
* disable compression when big value serialization is on
* add metrics

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-12-05 13:24:23 +02:00
Kostas Kyrimis
7ccad66fb1
feat: add support for big values in SeederV2 (#4222)
* add support for big values in SeederV2

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-12-05 08:47:41 +00:00
Borys
071e299971
refactor: remove redundant allocations for streamer (#4225)
* refactor: remove redundant allocations for streamer
2024-12-05 08:15:31 +00:00
adiholden
7a23ec2aac
fix(server): fix memory leak on lua error (#4236)
The bug:
Calling lua_error does not return; instead, it unwinds the Lua call stack until an error handler is found or the
script exits. This led to a memory leak in objects that should release memory in their destructors.
A specific example is absl::FixedArray<string_view, 4> args(argc), which allocates on the heap if argc > 4. The free was never called, leading to a memory leak.
The fix:
Add scoping to the function so that the destructor is called before raising the error.

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-12-03 16:47:43 +02:00
Shahar Mike
b0d633fb61
chore: Hide replica info in real cluster if --managed_service_info (#4241)
So far we only handled emulated cluster. This PR adds real cluster
support.

Related to #4173
2024-12-02 20:22:07 +02:00
Shahar Mike
779bba71f9
fix: Fix test_network_disconnect_during_migration test (#4224)
There are actually a few failures fixed in this PR, only one of which is a test bug:

* `db_slice_->Traverse()` can yield, causing `fiber_cancelled_`'s value to change
* When a migration is cancelled, it may never finish `WaitForInflightToComplete()` because it has `in_flight_bytes_` that will never reach destination due to the cancellation
* `IterateMap()` with numeric key/values overrode the key's buffer with the value's buffer

Fixes #4207
2024-12-02 15:55:23 +02:00
Borys
dc04b196d5
test: fix and unskip test_migration_timeout_on_sync (#4216) 2024-11-28 14:54:17 +02:00
Borys
d6f2b76666
fix: cluster_mgr script (#4210) 2024-11-27 14:09:19 +00:00
Kostas Kyrimis
66e0fd0908
fix: stream memory tracking (#4067)
* add object memory usage for streams
* add test
2024-11-27 12:41:08 +02:00
Borys
0531c39aae
test: skip test_cluster_mgr because of unclosed instance (#4191) 2024-11-26 09:34:06 +00:00
Roman Gershman
2c663f3833
chore: produce core files in regtests (#4185)
Should work only for self-hosted runners.
The core files will be kept in /var/crash/
We also copy automatically dragonfly binary into /var/crash to be able to debug later.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-25 17:13:11 +00:00
Roman Gershman
63742dd0cf
fix: stop using openssl for container healthchecks (#4181)
Dragonfly responds to ASCII-based requests to the TLS port with:
`-ERR Bad TLS header, double check if you enabled TLS for your client.`

Therefore, it is now possible to test both TLS and non-TLS ports with a plain-text PING.
Fixes #4171

Also, blacklist the bloom-filter test that Dragonfly does not support yet.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-25 17:41:17 +02:00
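The plain-text PING healthcheck from this commit can be sketched as follows. This is not the project's healthcheck script: the function names, the timeout, and the choice to treat the `Bad TLS header` error as a sign of a live TLS port are assumptions made for illustration.

```python
import socket

# Prefix of the error quoted in the commit message for plain-text
# requests that hit a TLS port.
TLS_ERROR_PREFIX = b"-ERR Bad TLS header"


def is_healthy_reply(reply: bytes) -> bool:
    """Return True if the first reply line shows that the server is alive."""
    first_line = reply.split(b"\r\n", 1)[0]
    return first_line == b"+PONG" or first_line.startswith(TLS_ERROR_PREFIX)


def probe(host: str, port: int, timeout: float = 3.0) -> bool:
    """Send an inline PING over a plain TCP socket and classify the reply."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")
            return is_healthy_reply(conn.recv(256))
    except OSError:
        # Connection refused, timeout, reset, and similar failures.
        return False
```

With this approach the check needs no TLS client at all, which matches the commit's goal of dropping the openssl dependency.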
Borys
43c83d29fa
feat: cluster migrations restart immediately if a timeout happens (#4081)
* feat: cluster migrations restart immediately if a timeout happens

* feat: add DEBUG MIGRATION PAUSE command
2024-11-25 16:02:22 +02:00
Shahar Mike
3c65651c69
feat: Huge values breakdown in cluster migration (#4144)
* feat: Huge values breakdown in cluster migration

Before this PR we used `RESTORE` commands for transferring data between
source and target nodes in cluster slots migration.

While this _works_, it has a side effect of consuming 2x memory for huge
values (i.e. if a single key's value takes 10gb, serializing it will
take 20gb or even 30gb).

With this PR we break down huge keys into multiple commands (`RPUSH`,
`HSET`, etc), respecting the existing `--serialization_max_chunk_size`
flag.

Part of #4100
2024-11-25 15:58:18 +02:00
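The idea of breaking a huge value into several commands can be illustrated with a small sketch for lists. The real implementation lives in the C++ server and is not shown here; the function name `chunk_list_into_rpush` and the byte-counting rule are assumptions, and `max_chunk_size` merely stands in for the `--serialization_max_chunk_size` flag.

```python
from typing import Iterator, List, Sequence


def chunk_list_into_rpush(
    key: str, elements: Sequence[bytes], max_chunk_size: int
) -> Iterator[List[bytes]]:
    """Yield RPUSH commands whose element payload stays within max_chunk_size.

    A single element larger than the limit is still sent, alone in its
    own command, because an element cannot be split.
    """
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")

    batch: List[bytes] = []
    batch_bytes = 0
    for element in elements:
        # Flush the current batch before it would exceed the limit.
        if batch and batch_bytes + len(element) > max_chunk_size:
            yield [b"RPUSH", key.encode(), *batch]
            batch, batch_bytes = [], 0
        batch.append(element)
        batch_bytes += len(element)
    if batch:
        yield [b"RPUSH", key.encode(), *batch]
```

Because each command carries only a bounded slice of the value, the sender never has to hold a second full copy of the value in serialized form, which is the memory overhead the commit message attributes to `RESTORE`.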
Shahar Mike
6a7f345bc5
chore: Hide replicas from CLUSTER subcmds in managed mode (#4174)
* chore: Hide replicas from `CLUSTER` subcmds in managed mode

Part of #4173 (see for context)

* server.client()
2024-11-24 13:10:32 +00:00
Roman Gershman
91caa940b9
chore: fix shutdown sequence in Dragonfly server (#4168)
1. Better logging in regtests
2. Release resources in dfly_main in more controlled manner.
3. Switch to ignoring signals when unregistering signal handlers during shutdown.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-24 10:35:00 +02:00
Roman Gershman
b8c2dd888a
chore: log exit code of failing dragonfly in tests (#4166)
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-22 11:40:10 +02:00
Shahar Mike
24a1ec6ab2
fix: Huge entries fail to load outside RDB / replication (#4154)
* fix: Huge entries fail to load outside RDB / replication

We have an internal utility tool that we use to deserialize values in
some use cases:

* `RESTORE`
* Cluster slot migration
* `RENAME`, if the source and target shards are different

We [recently](https://github.com/dragonflydb/dragonfly/issues/3760)
changed this area of the code, which caused this regression as it only
handled RDB / replication streams.

Fixes #4143
2024-11-20 14:00:07 +00:00
Roman Gershman
0e7ae34fe4
fix: enforce load limits when loading snapshot (#4136)
* fix: enforce load limits when loading snapshot

Prevent loading snapshots with used memory higher than max memory limit.

1. Store the used memory metadata only inside the summary file
2. Load the summary file before loading anything else, and if the used-memory is higher,
   abort the load.
---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-20 06:12:47 +02:00
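The two steps listed in this commit can be sketched as a check that runs before any data file is read. All names here (`SnapshotSummary`, `SnapshotTooLargeError`, `load_snapshot`, `load_file`) are hypothetical; the actual loader is part of the C++ server and may differ.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


class SnapshotTooLargeError(RuntimeError):
    """Raised when the snapshot's recorded memory exceeds the limit."""


@dataclass(frozen=True)
class SnapshotSummary:
    used_memory: int  # bytes in use when the snapshot was taken


def load_snapshot(
    summary: SnapshotSummary,
    shard_files: Iterable[str],
    max_memory: int,
    load_file: Callable[[str], None],
) -> int:
    """Check the summary first, then load each file; return the file count."""
    if summary.used_memory > max_memory:
        # Abort before touching any data file.
        raise SnapshotTooLargeError(
            f"snapshot needs {summary.used_memory} bytes, limit is {max_memory}"
        )
    loaded = 0
    for path in shard_files:
        load_file(path)
        loaded += 1
    return loaded
```

Reading the summary first means an oversized snapshot is rejected before any memory is spent on loading it.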
Roman Gershman
d467a348ac
fix: allow SELECT in multi/exec if it's a noop (#4146)
Fixes #4120

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-18 22:27:34 +02:00
Daniel M
d241839cff
chore:update fakeredis, remove irrelevant tests (#4014)
* chore:update fakeredis, remove irrelevant tests
2024-11-17 20:24:46 +02:00
Roman Gershman
8e3b8ccbe3
chore: run tests with list_experimental_v2 enabled (#4112)
Also fix issues with memory_test.py running locally.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-15 10:33:45 +01:00
Borys
4bc9ad6f01
test: add test for snapshotting during migration (#4108)
* test: add test for snapshotting during migration

* test: add test to run replication after migration
2024-11-13 13:40:00 +02:00
Kostas Kyrimis
91c236ab2f
fix: slow CI tests (#4117)
* refactor test_big_value_serialization
* remove unneeded replication tests for big values

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-11-13 10:32:31 +02:00
Roman Gershman
fa8f3f5564
fix: regression in squashing code when determining eval commands (#4116)
The regression was introduced by #3947 and causes crashes in bullmq.
It went unnoticed until now because the Python client sends commands in uppercase.
Fixes #4113

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Co-authored-by: Kostas Kyrimis <kostas@dragonflydb.io>
2024-11-11 19:54:47 +00:00
Roman Gershman
1eef773d0a
fix: test_noreply_pipeline flakiness (#4102)
Fixes #3896. Now we retry several times.
In my checks this should significantly reduce the failure probability.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-10 13:39:24 +02:00
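Retrying a flaky check several times, as this commit does, might be expressed with a helper along these lines. The helper name `retry`, its parameters, and the choice to retry only on `AssertionError` are assumptions; the test's real retry logic is not shown in this log.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(check: Callable[[], T], attempts: int = 3, delay: float = 0.0) -> T:
    """Run check() up to `attempts` times; re-raise the last assertion failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return check()
        except AssertionError:
            if attempt + 1 == attempts:
                raise  # out of attempts: surface the failure
            time.sleep(delay)
    raise AssertionError("unreachable")  # keeps type checkers satisfied
```

If a single run fails independently with probability p, all n attempts fail with probability p^n, which is why a few retries can reduce flakiness sharply without hiding a deterministic bug.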
adiholden
ae3faf59fb
feat(server): don't use channel for replication / save df (#4041)
* feat server: don't use channel for replication / save df

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-11-05 16:50:01 +02:00
dependabot[bot]
54c67a9198
chore(deps): bump pytest-repeat from 0.9.1 to 0.9.3 in /tests/dragonfly (#4057)
Bumps [pytest-repeat](https://github.com/pytest-dev/pytest-repeat) from 0.9.1 to 0.9.3.
- [Release notes](https://github.com/pytest-dev/pytest-repeat/releases)
- [Changelog](https://github.com/pytest-dev/pytest-repeat/blob/main/CHANGES.rst)
- [Commits](https://github.com/pytest-dev/pytest-repeat/compare/v0.9.1...v0.9.3)

---
updated-dependencies:
- dependency-name: pytest-repeat
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-04 23:47:24 +02:00
dependabot[bot]
94ee96afd0
chore(deps): bump redis-om from 0.2.2 to 0.3.3 in /tests/dragonfly (#4060)
Bumps [redis-om](https://github.com/redis/redis-om-python) from 0.2.2 to 0.3.3.
- [Release notes](https://github.com/redis/redis-om-python/releases)
- [Commits](https://github.com/redis/redis-om-python/compare/v0.2.2...v0.3.3)

---
updated-dependencies:
- dependency-name: redis-om
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-04 22:35:18 +02:00
dependabot[bot]
b70f9703f4
chore(deps): bump tomli from 2.0.1 to 2.0.2 in /tests/dragonfly (#4059)
Bumps [tomli](https://github.com/hukkin/tomli) from 2.0.1 to 2.0.2.
- [Changelog](https://github.com/hukkin/tomli/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hukkin/tomli/compare/2.0.1...2.0.2)

---
updated-dependencies:
- dependency-name: tomli
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-04 22:31:29 +02:00
Roman Gershman
cc6fdd7fbf
chore: add retry to test_noreply_pipeline test (#4045) 2024-11-04 08:32:48 +00:00
Borys
e4b468d953
fix: reduce memory consumption during migration (#4017)
* refactor: reduce memory consumption for RestoreStreamer
* fix: add Throttling into RestoreStreamer::WriteBucket
2024-11-03 17:03:45 +02:00
Borys
5a597cf6cc
test: update test_big_containers (#4025) 2024-11-03 16:20:11 +02:00
Roman Gershman
c8b56b69b4
chore: print info stats if test_noreply_pipeline fails (#4016) 2024-10-30 08:17:15 +00:00
Kostas Kyrimis
4b495182e8
fix: separate Heartbeat and ShardHandler to fibers (#3936)
* separate shard_handler from Heartbeat
* add test

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-10-29 09:22:53 +02:00
Roman Gershman
41d8c66a15
fix: flaky test_failover test (#4007)
Avoid checking the replica too early,
which triggers a "Loading" exception.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-10-28 14:58:36 +02:00
Borys
c80d21fcba
fix: crash if we OOM during migration process (#3968) 2024-10-23 17:04:08 +03:00
Stepan Bagritsevich
ea9dc9c454
chore(fakeredis): Enable JSON tests in the Fakeredis tests (#3773)
* chore(fakeredis): Enable JSON tests in the Fakeredis tests

fixes dragonflydb#3671

Signed-off-by: Stsiapan Bahrytsevich <stefan@dragonflydb.io>

* refactor: address comments

Signed-off-by: Stsiapan Bahrytsevich <stefan@dragonflydb.io>

* tmp commit

Signed-off-by: Stsiapan Bahrytsevich <stefan@dragonflydb.io>

* refactor: address comments 2

Signed-off-by: Stsiapan Bahrytsevich <stefan@dragonflydb.io>

---------

Signed-off-by: Stsiapan Bahrytsevich <stefan@dragonflydb.io>
2024-10-23 08:53:14 +02:00
Kostas Kyrimis
37fec87070
chore: increase load in test_noreply_pipeline (#3960)
The test is flaky because it relies on the producer (the pytest) sending a batch of commands fast enough, before they get dispatched synchronously, so I increased the load.
2024-10-22 21:00:25 +03:00
Borys
dec0712e15
test: add test to test big collections or collections with big values (#3959) 2024-10-22 15:01:32 +03:00
Kostas Kyrimis
119723316e
chore: tune test_rss_used_mem_gap (#3958)
It appears that newer versions of the gh runner require more memory. Some cases of the test test_rss_used_mem_gap allocate more than 6.5-7 GB of memory, leaving barely 0.5 GB for the gh runner (7.5 GB available in total), which sometimes causes the instance to run out of memory.
2024-10-22 10:31:13 +03:00
Kostas Kyrimis
478a5d476d
chore: disable test_cluster_memory_consumption_migration (#3948)
The test takes more than 10 minutes on the CI, which causes the CI run to time out.
2024-10-21 08:41:15 +03:00
Vladislav
32a31cf1d8
chore(facade): Fix bad new IO glue (#3940)
* chore(facade): Fix bad new IO glue

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2024-10-18 23:25:56 +03:00