1
0
Fork 0
mirror of https://github.com/dragonflydb/dragonfly.git synced 2024-12-14 11:58:02 +00:00
Commit graph

324 commits

Author SHA1 Message Date
Borys
43808da27f
fix(cluster): fix slot filtration to RestoreStreamer (#2477)
* fix(cluster): fix slot filtration to RestoreStreamer

* test: add cluster data migration test
2024-01-28 12:29:54 +02:00
Vladislav
a5b9401449
fix: reduce test_pipeline_batching_while_migrating flakiness (#2475)
* fix: reduce test_pipeline_batching_while_migrating flakiness
2024-01-25 17:55:12 +03:00
Kostas Kyrimis
ba972923b3
feat(lua): add no-op redis.log command (#2476)
* add no-op redis.log command
2024-01-25 15:45:47 +02:00
Shahar Mike
e6f418575b
test(cluster): Enable seeder to work against a Dragonfly cluster (#2462) 2024-01-24 20:02:04 +02:00
Borys
a16b940a65
feat(cluster): add tx execution in cluster_shard_migration (#2385)
* feat(cluster): add tx execution in cluster_shard_migration
refactor(replication): move code that is common for cluster and
replica into a separate file, add full-sync-cut cmd
2024-01-22 21:19:39 +02:00
Shahar Mike
2f0287429d
fix(replication): Correctly replicate commands even when OOM (#2428)
* fix(replication): Correctly replicate commands even when OOM

Before this change, OOM in shard callbacks could have led to data
inconsistency between the master and the replica. For example, commands
which mutated data on 1 shard but failed on another, like `LMOVE`.

After this change, callbacks that result in an OOM will correctly
replicate their work (none, partial or complete) to replicas.

Note that `MSET` and `MSETNX` required special handling, in that they are
the only commands that can _create_ multiple keys, and so some of them
can fail.

Fixes #2381

* fixes

* test fix

* RecordJournal

* UNDO idiotnessness

* 2 shards

* fix pytest
2024-01-18 12:29:59 +02:00
Kostas Kyrimis
39e7e5ad87
fix: missing error reply to client after AddOrFind throw std::bad_alloc (#2411)
* Handle properly and reply on execution paths that throw std::bad_alloc within AddOrFind
2024-01-15 10:13:10 +02:00
Shahar Mike
13718699d8
feat(server): Implement CLIENT KILL (#2404)
* feat(server): Implement `CLIENT KILL`

Currently, it supports the following syntax:

* `CLIENT KILL <addr>:<port>`
* `CLIENT KILL ID <id>`
* `CLIENT KILL ADDR <addr>:<port>`
* `CLIENT KILL LADDR <addr>:<port>`

It will not allow killing an admin-connection from a non-admin port.

There are a few parameters of `CLIENT KILL` that Redis supports but this
PR does not yet add. Let's add them as needed.

Fixes #1614

* Add tests

* fixes
2024-01-15 09:49:23 +02:00
Vladislav
484b4de216
Fix flush when migrating connection (#2407)
fix: don't miss flush for control messages
2024-01-13 09:57:33 +03:00
Yue Li
8d09478474
bug(server): log evicted keys in journal in PrimeEvictionPolicy. (#2302)
fixes #2296

added a regression test that tests both policy based eviction as well as heart beat eviction.

---------

Signed-off-by: Yue Li <61070669+theyueli@users.noreply.github.com>
2024-01-11 01:45:29 -08:00
adiholden
f37c57c704
fix(server): crash on rename save command on background save (#2375)
* fix(server): crash on rename save command on baground save

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-01-07 12:21:09 +02:00
Yue Li
6f9107291e
test: Adding integration test using Relay benchmark (#2348)
Adding integration test using Relay benchmark
2024-01-02 12:44:22 -08:00
Borys
03f69ff6c3
feat: add SLOT-MIGRATION-STATUS cmd for source node (#2349)
* feat: add SLOT-MIGRATION-STATUS cmd for source node
implements #2232
add ability using SLOT-MIGRATION-STATUS without args
to print info about all migration processes for the current node
2024-01-02 12:10:06 +02:00
adiholden
5d67c95797
bug(server): reject replicaof while loading from snapshot (#2338)
fix #2337
The bug:
replicaof was not rejected while loading snapshot
The fix:
replicaof is allowed while server is in loading state to allow replicaof while replication in full sync mode
I now reject replicaof if the server is in loading state and it is master

Another bug fix:
allow cron snapshot if --replicaof flag was set

Signed-off-by: adi_holden <adi@dragonflydb.io>
2023-12-27 13:57:49 +02:00
Shahar Mike
a360b308c9
refactor(server): Privatize PreUpdate() and PostUpdate() (#2322)
* refactor(server): Privatize `PreUpdate()` and `PostUpdate()`

While at it:
* Make `PreUpdate()` not decrease object size
* Remove redundant leftover call to `PreUpdate()` outside `DbSlice`

* Add pytest

* Test delete leads to 0 counters

* Improve test

* fixes

* comments
2023-12-25 07:49:57 +00:00
Roman Gershman
700a65ece5
chore: refactor VersionMonitor into a separate file (#2326)
* chore: refactor VersionMonitor into a separate file
---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-12-24 22:06:57 +02:00
Roman Gershman
bbe3d9303b
feat: introduce transaction statistics in the info output (#2328)
1. How many transactions we processed by type
2. How many transactions we processed by width (number of unique shards).

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-12-23 13:18:49 +02:00
Roman Gershman
365cb439cf
chore: remove support for save_schedule flag (#2327)
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-12-22 11:17:18 +02:00
Borys
fd76c51310
feat: add command flow for slot migration process (#2292)
* feat(cluster): add command flow for slot migration process
fixes #2295

DFLYMIGRATE FLOW command was added to establish
connections for every shard replication process.
Slow serialization step is the separate issue so
for now only eof_token is sent for reply to
DFLYMIGRATE FLOW command.
Expected state for START-SLOT-MIGRATION is FULL_SYNC now.
2023-12-20 18:47:11 +02:00
Vladislav
aaf01d4244
feat(cluster): Cancel blocking commands on cluster update (#2255)
Handle blocking commands during cluster config update

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-12-17 15:32:35 +03:00
s-shiraki
bd3e57d262
feat(server): Implement NUMSUB subcommand (#2282)
* feat(server): Implement NUMSUB subcommand

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix: test

* fix: build error
2023-12-16 20:42:15 +02:00
Vladislav
7ca07a498f
fix(server): Fix client pause and add test (#2298)
Fixes a bug in which we incorrectly determined paused dispatches, which led to not allowing multiple (overlapping) client pauses
2023-12-12 19:28:48 +03:00
Kostas Kyrimis
8640edad71
feat(acl): add acl keys to acl log command (#2274)
* add acl keys to acl log command
* add tests
2023-12-12 17:00:41 +02:00
Kostas Kyrimis
8323c82dc5
feat(acl): add acl keys to acl save/load (#2273)
* add acl keys to acl savel/load
* add tests
2023-12-08 16:08:33 +00:00
Kostas Kyrimis
2703d4635d
feat(acl): add validation for acl keys (#2272)
* add validation for acl keys
* add tests
2023-12-08 17:28:53 +02:00
Kostas Kyrimis
8126cf8252
feat(acl): add acl keys to acl list command (#2261)
* add acl keys to acl list
2023-12-08 15:32:15 +03:00
Vladislav
11ef6623dc
feat: DispatchTracker to replace everything (#2179)
* feat: DispatchTracker

Use a DispatchTracker to track ongoing dispatches for commands that change global state

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-12-05 11:02:11 +03:00
Borys
24b13434cf
feat: add slot-migration-status command (#2239)
* feat: add slot-migration-status command
2023-12-04 12:47:46 +02:00
Roman Gershman
26512fdba4
fix: remove string copy in SendMGetResponse (#2246)
fix: eliminate the redundant string copy in SendMGetResponse

Also, allow selectively create DflyInstance in pytests that is attached to
an existing dragonfly port, created outside of tests.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-12-03 18:14:19 +02:00
Borys
bfb1b3b624
Start slot migration (#2218)
* feat: add new command START-SLOT-MIGRATION
2023-11-29 13:38:13 +02:00
Roman Gershman
b853b2ab00
fix: memcached VERSION is now parseable by php-memcached client (#2220)
The DF version is being unparseable by Memcached::getVersion() that expects n.n.n string.
Change the version to emulate the old memcached server.
The DF version can still be fetched via Memcached::getStats() function.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-11-27 20:54:00 +02:00
Vladislav
d6044edbab
fix(squashing): Reset base command id (#2209)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-11-26 12:40:37 +02:00
Borys
e6f3522d59
fix: forbid parallel save operations (#2172)
* fix: forbid parallel save operations

* feat: add SAVE option to takeover command
2023-11-21 13:56:27 +02:00
Vladislav
604c600166
fix(pytest): Fix renamed flag (#2197)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-11-20 20:54:11 +00:00
Vladislav
d21f82a5f9
chore: connection fixes (#2192)
* chore: add more states to client connections

* fix: clear pipelined messages before close

* fix: skip same thread on backpressure
---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
Co-authored-by: Roman Gershman <roman@dragonflydb.io>
2023-11-20 17:08:12 +00:00
Kostas Kyrimis
4a1cb5bfa2
fix(memcached): add length check for key values (#2153)
* fix length checks for store commands
* add test
2023-11-20 14:37:29 +02:00
adiholden
c95f4961be
fix(server): client pause fix on pipeline squash (#2180)
* fix(server): client pause fix on pipeline squash

allow squashing commands on pause
move await on client pause inside InvokeCommand - this way all flows of command invoke will read pause state

Signed-off-by: adi_holden <adi@dragonflydb.io>
2023-11-16 13:30:02 +02:00
adiholden
b61d07d2c1
regression: skip client pause test utill we fix the bug (#2177)
Signed-off-by: adi_holden <adi@dragonflydb.io>
2023-11-15 16:31:45 +02:00
Roy Jacobson
c3a2da559e
feat(server): Implement CLIENT PAUSE (#1875)
* feat(server): Implement CLIENT PAUSE

Signed-off-by: adi_holden <adi@dragonflydb.io>
2023-11-15 08:56:49 +02:00
Kostas Kyrimis
09415c4f57
chore(tls): add tls config test for ca_dir (#2152)
This PR introduces a test case for TLS with `ca_dir`. First, we
did not have any tests for this case. Second, using `ca_dir` requires
to call `c_rehash` on the directory before it is loaded by DF. We
did not have this use case anywhere and therefore we thought there was
a bug when we used `ca_dir` only to find out that we need to call
`c_rehash` on the directory before we load the certificates. Now,
both a test and a use case are properly documented

* add missing test for ca_dir
* use rehash to properly show how to load ca directories instead of
  files
2023-11-13 14:11:14 +02:00
Vladislav
46292968ad
fix(search): Fix replication (#2159)
* fix(search): Support replication

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-11-13 11:58:54 +03:00
Vladislav
564e38c05c
chore: lower takeover test load, add comments (#2151)
* chore: lower takeover test load, add comments

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-11-12 12:08:05 +03:00
Kostas Kyrimis
5381746158
fix(regTests): increase cancel replication test timeout (#2143)
* increase timeout on cancel replication immediately 
* reduce the amount of commands run to 100 in the test
2023-11-08 23:00:00 +03:00
Vladislav
821884e333
chore(search): Extend FT.INFO (#2133)
* chore(search): Add index definition info to ft.info

* chore(search): Add flags to ft.info

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-11-06 16:18:13 +03:00
Kostas Kyrimis
2baadd1e90
fix(acl): case insensitive parsing from files and serialization format (#2123)
* replace > with # for acl files
* replace ACL SETUSER with USER for acl files
* add case insensitive parsing for acl files
* update tests
2023-11-05 11:43:11 +02:00
Roman Gershman
7aa3dba423
chore: use decode_responses when creating a redis client (#2109)
* chore: use decode_responses when creating a redis client

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-11-03 11:00:26 +02:00
Borys
2f39e89189
fix: add ability to set snapshot_cron flag during runtime (#2101)
* fix: add validating for snapshot_cron flag during runtime
* refactor: move warning log to upper level
2023-11-03 10:10:16 +02:00
Shahar Mike
169c9d3975
fix(regTests): Wait between ACTIVE until `stable_sync (#2111)
Regression test sometimes fails because for a short period of time after `wait_available_async()` returns, the result of `ROLE` could still be different from `stable_sync`

[Failure example](https://github.com/dragonflydb/dragonfly/actions/runs/6726461923/job/18282759612#step:6:1863)

We change our state from `LOADING` to `ACTIVE` [here](d08d7f13b4/src/server/replica.cc (L426)), but then we change the sync state 2 times:
1. `!R_SYNCING` [here](d08d7f13b4/src/server/replica.cc (L427C28-L427C37))
2. And only later to `R_SYNC_OK` (meaning `stable_sync`) [here](d08d7f13b4/src/server/replica.cc (L221))

This is easy to reproduce by adding a sleep right after the set of state to `ACTIVE`, either before or after the flipping of `R_SYNCING` (with different returned states).

BTW without that added sleep I was not able to reproduce, having tried 1000s of times in various configurations.

We could change the order of things such that we first change `state_mask_` and only then switch state from `LOADING` to `ACTIVE` (which is probably the right thing to do), but that would require a subtle refactor, as we change these in a couple of places.

But we should keep in mind that this has no effect on users. So a simple sleep on the test side should fix this fairly well.
2023-11-02 13:09:42 +02:00
Kostas Kyrimis
d08d7f13b4
fix(regTests): can't execute command while loading on snapshots (#2110) 2023-11-02 12:17:08 +02:00
Roman Gershman
8a65aec805
chore: help users to fix a mistake of setting quotes in the flagfile (#2092)
* chore: help users to fix a common mistake of setting quotes in the flagfile

Specifically, the confusion is often around the cron expression.
---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-10-30 22:59:00 +02:00