mirror of https://github.com/dragonflydb/dragonfly.git synced 2024-12-14 11:58:02 +00:00
Commit graph

2660 commits

Author SHA1 Message Date
Roman Gershman
6a13329523
fix: make sure dfly_bench reliably connects (#3802)
1. Issue ping upon connect, add a comment why.
2. log error if dfly_bench disconnects before all the requests were processed.
3. Refactor memcache parsing code into ParseMC function.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-09-26 23:28:20 +03:00
Roman Gershman
3945b7e4fa
chore: tune TieredStorageTest.MemoryPressure (#3805)
* chore: tune TieredStorageTest.MemoryPressure

* chore: print more stats on failure
2024-09-26 15:42:28 +00:00
adiholden
fbf12e9abb
chore: cleanup not used opcodes in replication (#3804)
feat server: cleanup not used opcodes in replication

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-09-26 14:51:57 +00:00
Stepan Bagritsevich
c2da601f6b
fix(generic_family): Update indexes in the RESTORE and RENAME commands (#3803) 2024-09-26 14:04:09 +00:00
Borys
cd279af6d4
fix: bitcount invalid range (#3792) 2024-09-26 16:31:09 +03:00
Borys
ce0320300b
test: update test_noreply_pipeline to prevent false fail (#3801) 2024-09-26 12:12:53 +00:00
Roman Gershman
70ad113e4b
chore: ScheduleInternal refactoring (#3794)
A small refactoring to improve the flow of ScheduleInternal() as well as
to prepare it for the next change that will reduce the CPU load from the shard queue.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-09-26 08:14:26 +03:00
Kostas Kyrimis
a5d34adc4c
chore: remove goto statements (#3791)
* replace goto statements with lambda calls

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-25 16:08:31 +03:00
romange
734be21407 chore(helm-chart): update to v1.23.0 2024-09-25 09:58:17 +00:00
Lakshya Garg
fb2ee90b2d
chore(acl_family): add allcommands and nocommands (#3783)
* add allcommands alias for acl
* add nocommands alias for acl
* add test
2024-09-25 10:58:33 +03:00
Kostas Kyrimis
105c2bd761
fix: bitop do not add dst key if result is empty (#3751)
* fix bitop creating the dst key if result is empty
* fix replicating dst with the wrong type
* make bitop a blind update (similar to set command)

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-25 09:45:20 +03:00
Borys
987e6feaa5
fix: GETRANGE params validation (#3781)
fix: getrange params validation
2024-09-24 13:54:35 +00:00
Shahar Mike
526bce4222
chore: Forbid replicating a replica (#3779)
* chore: Forbid replicating a replica

We do not support connecting a replica to a replica, but before this PR
we allowed doing so. This PR disables that behavior.

Fixes #3679

* `replicaof_mu_`
2024-09-24 13:42:22 +00:00
Shahar Mike
9aadc0cd2b
fix: Fix flaky test test_acl_revoke_pub_sub_while_subscribed (#3768)
fix: Fix flaky test `test_acl_revoke_pub_sub_while_subscribed`

The reason it failed is that, in some rare cases, the subscriber did not
get the first few messages of the publisher. This is likely due to
timing of subscribe and publish, in different connections / threads.

Given Pub/Sub has very weak guarantees, it's probably ok as is, so I
just added a sleep to get the test to pass always.
2024-09-24 11:47:17 +03:00
Borys
3804076ea9
fix: setrange with empty value doesn't modify the DB (#3771) 2024-09-23 19:09:53 +03:00
Roman Gershman
b7b4cabacc
chore: some renames + fix a typo in RETURN_ON_BAD_STATUS (#3763)
* chore: some renames + fix a typo in RETURN_ON_BAD_STATUS

Renames in transaction.h - no functional changes.
Fix a typo in error.h following #3758
---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-09-23 13:16:50 +03:00
Borys
9303591010
fix: mark pubsub commands as unsupported for cluster (#3767) 2024-09-23 09:59:13 +00:00
Roman Gershman
9c49aee43d
chore: give up on InlinedVector due to spurious warnings with optional (#3765) 2024-09-23 11:34:39 +03:00
adiholden
7df95dfb6e
fix server: fix last error reply (#3728)
fix 1: in the multi-command squasher the error message was not set, so it was not logged for the relevant command, only on exec. Fixed by setting the last error in CapturingReplyBuilder::SendError.
fix 2: cached error replies were not cleared before the command is invoked.

---------

Signed-off-by: adi_holden <adi@dragonflydb.io>
Co-authored-by: kostas <kostas@dragonflydb.io>
2024-09-23 11:34:13 +03:00
Andy Dunstall
45ffc605bd
feat(zset_family): add ZRANGESTORE (#3757) 2024-09-23 11:28:12 +03:00
Borys
6185617949
fix: substr/getrange result for invalid range (#3766) 2024-09-23 08:20:08 +00:00
Roman Gershman
0a049ab631
chore: add more error logs around ziplist parsing checks (#3764)
Also, reformat ziplist.c to valkey 8 formatting (no code changes besides this).

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-09-23 10:13:36 +03:00
Kostas Kyrimis
15fce9df2d
chore: logs on assert fail for test_acl_cat_commands_multi_exec_squash (#3749)
* print result if assertion fails

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-23 09:51:58 +03:00
Roman Gershman
29b18f0dcb
fix: tune test_replicaof_reject_on_load parameters (#3762)
Reduce the snapshot size by 20% and increase the timeout to avoid failures due to slow loads.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-09-23 09:50:10 +03:00
Roman Gershman
f1f8ee17dc
fix: make snapshotting process more responsive (#3759)
* fix: improve BreakStalledFlowsInShard heuristic

Before this change - we wrote in a single call whatever record chunks we pulled from the channel.
This can be problematic for 1GB chunks for example, which might take 10sec to write.

Lately we added a replication breaker on the master side that breaks the full sync after
a predefined threshold has passed. By default it was 10sec. To improve the robustness of this
breaker, we now write chunks of up to 1MB and update last_write_time_ns_ more frequently.

Also, we added more logs to make replication delays on both sides more visible.
We also added logs of breaking the replication on the master side.

Unfortunately, this did not make BreakStalledFlowsInShard more robust, because the
problem moved to the replica side, which may take 20+ seconds to parse huge values.
Therefore, I increased the threshold for breaking the replication to 30s.

Finally, instrument the GetMetrics call, as it sometimes takes more than 1 sec.

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-09-22 17:05:28 +03:00
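The chunked-write idea described in f1f8ee17dc (#3759) can be illustrated with a minimal sketch. This is not Dragonfly's code (which is C++); `ChunkedWriter`, `is_stalled`, and the other names are hypothetical, and only the 1MB chunk size and the 30s threshold come from the commit message. The point is that a large payload is written in bounded pieces and the progress timestamp is refreshed after each piece, so a watchdog that compares the timestamp against a threshold does not mistake one slow, large write for a stalled flow.

```python
import io
import time

CHUNK_SIZE = 1 << 20  # 1 MiB, mirroring the "up to 1MB" figure in the commit message


class ChunkedWriter:
    """Hypothetical writer that records progress after every chunk."""

    def __init__(self, sink, chunk_size=CHUNK_SIZE):
        self.sink = sink
        self.chunk_size = chunk_size
        self.last_write_time_ns = time.monotonic_ns()
        self.chunks_written = 0

    def write(self, payload: bytes) -> None:
        # Slice through a memoryview so no chunk is copied before it is written.
        view = memoryview(payload)
        for offset in range(0, len(view), self.chunk_size):
            self.sink.write(view[offset:offset + self.chunk_size])
            # Refresh the timestamp per chunk, not once per payload.
            self.last_write_time_ns = time.monotonic_ns()
            self.chunks_written += 1

    def is_stalled(self, threshold_s: float) -> bool:
        # A watchdog would call this periodically from another task.
        elapsed_ns = time.monotonic_ns() - self.last_write_time_ns
        return elapsed_ns > threshold_s * 1_000_000_000


sink = io.BytesIO()
writer = ChunkedWriter(sink)
writer.write(b"x" * (3 * CHUNK_SIZE + 10))  # three full chunks plus a 10-byte tail
```

With a single unchunked write, the timestamp would only move once the whole payload had been flushed, which is the situation the commit message describes for 1GB chunks.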
Roman Gershman
2e9b133ea0
chore: add integrity checks to consumer->pel (#3754)
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-09-22 15:40:42 +03:00
Roman Gershman
e09ebe0c5c
fix: test deadlock with processing the stdout of sed (#3735)
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-09-22 15:40:27 +03:00
adiholden
4d38271efa
feat(server): introduce rss oom limit (#3702)
* introduce rss denyoom limit

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-09-22 13:28:24 +03:00
adiholden
5cf917871c
feat(server): introduce oom_deny_commands flag (#3718)
* server: introduce oom_deny_commands flag

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-09-22 09:32:18 +03:00
Stefan Roman
69db21db4c
feat(helm): add hostNetwork, topologySpreadConstraint and clusterIP su… (#3389)
* add(helm): add hostNetwork, topologySpreadConstraint and clusterIP support

* parameters hostNetwork and clusterIP should not be templated if they are not explicitly used

---------

Signed-off-by: Stefan Roman <elegant.frog3113@fastmail.com>
Co-authored-by: Stefan Roman <elegant.frog3113@fastmail.com>
2024-09-22 08:08:01 +03:00
Vladislav
d9f8f2553b
chore: fix return on bad status (#3758) 2024-09-22 01:36:39 +03:00
Roman Gershman
cce2eb35ed
chore: refactor a lambda function into a named one (#3753)
Also did some cosmetic improvements. No functionality should be changed.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-09-22 01:35:56 +03:00
Andy Dunstall
9dd79657ce
fix: zset store conclude transaction on error (#3755) 2024-09-21 19:08:53 +03:00
Borys
ce79da0f7a
fix: add value range check for SETBIT command (#3750) 2024-09-20 18:20:35 +03:00
Roman Gershman
c9a2334f6d
fix: allow the healthcheck run in non-privileged containers as well (#3731)
fix: allow the healthcheck running in non-privileged containers as well

Fixes #3644 (again).

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-09-20 05:41:06 +00:00
Kostas Kyrimis
ed21867fe9
chore: add missing await in test_take_over_seeder (#3744)
* add missing await

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-19 17:03:11 +00:00
Roman Gershman
abf3acec4a
chore: introduce a Clone function for the dense set (#3740)
* chore: introduce a Clone function for the dense set

We use a state machine to prefetch data in batches.
After this change, the hot spots are predominantly inside ObjectClone and
Hash methods.

All in all benchmarks show ~45% CPU reduction:
```
BM_Clone/elements:32000    1517322 ns      1517338 ns         2772
BM_Fill/elements:32000      841087 ns       841097 ns         4900
```

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-09-19 16:14:33 +03:00
Vladislav
3af2dfc4e7
chore: add SetReplies (#3727) 2024-09-19 12:54:25 +03:00
Kostas Kyrimis
0e0b2e78a4
chore: change log level to warning for empty keys (#3722)
* adjust log level to warning for allowed empty keys in rdb_load and rdb_save

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-19 09:45:09 +00:00
Shahar Mike
55e3647248
chore: Switch ports for cluster_mgr_test.py (#3741)
We saw failures due to port already in use
2024-09-19 12:32:31 +03:00
adiholden
409c2a3beb
test: add test for replication deadlock on replication timeout (#3691)
* test: add test for replication deadlock on replication timeout

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-09-19 12:11:28 +03:00
Borys
efa4efd2bf
refactor: use CmdArgParser for XGROUP command (#3739) 2024-09-18 22:30:37 +03:00
Borys
bbaa2669f9
test: unskip test for debugging purpose (#3738) 2024-09-18 14:13:07 +00:00
Borys
f122a19a02
test: add tests for replication (#3734)
* test: add tests for replication
2024-09-18 16:32:21 +03:00
Kostas Kyrimis
6e45c9c3e2
fix: properly track json memory usage (#3641)
* add JsonMemTracker
* add logic based on MiMallocResource deltas that calculates json object usage
* add test

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-18 13:08:43 +00:00
Stepan Bagritsevich
b235617a0d
fix(json_family): Fix out of bound ranges for the JSON.ARR* commands (#3712)
* fix(json_family): Fix out of bound ranges for the JSON.ARR* commands

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor(json_family): address comments

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor(json_family): address comments 2

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

---------

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>
2024-09-18 14:31:17 +02:00
Vladislav
41ba864924
chore: Remove ReqSerializer (#3721)
Signed-off-by: Vladislav <vladislav.oleshko@gmail.com>
2024-09-18 14:31:47 +03:00
Shahar Mike
ffb4c2b601
fix: Fix test_take_over_seeder (#3733)
The test assumed any shutdown takes no more than 1s. This doesn't
always hold, and waiting a full 1s isn't ideal either, because shutdown
usually takes less than that.

Changed to use `assert_eventually` instead.

Fixes #3684
2024-09-18 14:17:09 +03:00
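The reasoning in ffb4c2b601 (#3733), polling for a condition instead of sleeping for a fixed interval, can be sketched as follows. This is an illustrative stand-in, not the `assert_eventually` helper from the Dragonfly test suite, whose signature is not shown here; `wait_until`, `shutdown_finished`, and the timeout values are assumptions made for the example.

```python
import time


def wait_until(predicate, timeout_s=5.0, interval_s=0.05):
    """Poll `predicate` until it returns True or `timeout_s` elapses.

    Returns True as soon as the condition holds, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Simulated shutdown that is only observed as complete on the third poll.
polls = {"count": 0}


def shutdown_finished():
    polls["count"] += 1
    return polls["count"] >= 3


result = wait_until(shutdown_finished, timeout_s=1.0, interval_s=0.01)
```

Compared with a fixed `sleep(1)`, the polling loop returns as soon as the condition holds and tolerates slower runs up to the timeout, which addresses both problems named in the commit message.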
Shahar Mike
1c6be62a0b
fix: Fix cluster_mgr.py (#3730)
We updated the reply of `SLOT-MIGRATION-STATUS`, so `cluster_mgr.py`
needs to be adjusted as well.
2024-09-18 11:44:15 +03:00
Shahar Mike
a115bc2b9f
fix: Fix test test_client_pause_with_replica (#3729)
There are 2 minor issues with this test:
1. It specified `cmdstat_replconf` as `cmd_stats` instead of `cmd`,
   that's clearly a typo as `cmd_stats` is a map with stats, while
   `replconf` is a Dragonfly command
2. Command `MULTI` is allowed to run even when the server is in paused
   state, see
   [here](https://github.com/dragonflydb/dragonfly/blob/main/src/server/main_service.cc#L1197):

   ```
   // Don't interrupt running multi commands or admin connections.
   ```

Fixes #3675
2024-09-18 09:40:26 +03:00