dragonflydb-dragonfly

mirror of https://github.com/dragonflydb/dragonfly.git synced 2024-12-14 11:58:02 +00:00

Author	SHA1	Message	Date
Vladislav	1b1a83dc58	chore: SinkReplyBuilder2 with vec batching (#3454 )	2024-08-07 18:34:52 +03:00
Vladislav	8587377a03	chore(server): Remove old blocking debug (#3460 )	2024-08-07 16:52:09 +03:00
Roman Gershman	1cbfcd4912	chore: add timeout to replication sockets (#3434 ) * chore: add timeout fo replication sockets Master will stop the replication flow if writes could not progress for more than K millis. --------- Signed-off-by: Roman Gershman <roman@dragonflydb.io> Signed-off-by: Roman Gershman <romange@gmail.com> Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>	2024-08-07 16:33:03 +03:00
Roman Gershman	7c84b8e524	chore: change how we track memory_budget during evictions (#3457 ) * chore: change how we track memory_budget during evictions We compared memory_budget vs 0 before inserting a new item in DbSlice, and retired cool pages if we are low on memory. The problem - when we decide whether we allow growing a table, we estimate the possible object size increase due to the future table growth. And the memory check described before was not consistent with the actual logic that rejected the insertion. Moreover, the memory_budget tracking interaction with EvictionPolicy was over-complicated: we passed the memory_budget counter to the evp object and then read it back, even though evp did not track object deletions memory impact during objects evictions. Now, we remove the responsibility from evp to update memory_budget_ so it's solely updated by DbSlice. We also update memory_budget_ during deletions, and when we pass it to evp, we add cool memory size as potential memory resource to avoid rejections in case we have lots of cool memory. Fixes #3456 --------- Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-08-07 15:43:45 +03:00
Kostas Kyrimis	41be819dbb	fix: pytest teardown exception in os.remove(LAST_LOGS) (#3463 ) os.remove(LAST_LOGS) might throw an exception if the file does not exist which we do not handle. Wrap it in try/catch block * wrap in try/catch os.remove --------- Signed-off-by: kostas <kostas@dragonflydb.io>	2024-08-07 14:27:04 +03:00
romange	5db8d27363	chore(helm-chart): update to v1.21.0	2024-08-07 09:13:04 +00:00
Stepan Bagritsevich	75452a7108	feat(json_family): Add support of the JSON legacy mode (#3284 )	2024-08-06 18:04:45 +02:00
Roman Gershman	7df72fd6d0	chore: integrate quicklist changes from valkey (#3440 ) The quicklist.* files are mostly equal to the versions at commit 0fc43edc6 Also, removed some unused code. Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-08-06 16:25:16 +03:00
Borys	070e7b02c6	fix: JSON.MSET command (#3459 )	2024-08-06 15:37:03 +03:00
Kostas Kyrimis	ff716bb8b0	fix: missing logs on ci timeout (#3452 ) The env variables exported when regression tests timeout are not working properly and the if statement on the action step Print last log on timeout would fail to read and upload the files set in /tmp/last_log_file.txt. Furthermore, another problem is the job.timeout argument that kills the whole job/matrix before the upload log step has a chance to run. For that, we need manual timeouts on the workflow similar to what we do in regression tests action. * remove print last log on timeout action step * copy the logs on timeout directly within the timeout step * replace global timeout on CI workflow with timeout command per step --------- Signed-off-by: kostas <kostas@dragonflydb.io>	2024-08-06 14:52:18 +03:00
Roman Gershman	420046aac8	fix: properly seriailize meta buffer in SendStringArrInternal (#3455 ) Fixes #3449 that was introduced by #3425 Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-08-06 10:43:05 +03:00
Roman Gershman	e482eefcbb	chore: disable serialization_max_chunk_size in regtests (#3445 ) Intended to stabilize regression tests before releasing our next version. Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-08-05 07:38:18 +00:00
Shahar Mike	38fba1d398	fix: cluster_mgr.py to use `CLUSTER MYID` (#3444 )	2024-08-05 07:29:31 +00:00
Borys	faea4eef45	test: fix test_disconnect_replica (#3442 )	2024-08-05 10:07:27 +03:00
Roman Gershman	6da445fcfe	feat: DEBUG REPLICA PAUSE now pauses fullsync (#3441 ) Before that PAUSE paused the reconnection reconciler flow, now it also stops the ongoing full sync replication if such exists. In addition, this PR applies some clean-ups and removes redundant code Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-08-05 09:42:57 +03:00
Kostas Kyrimis	3f08a60148	chore: reset serialization_max_chunk_size to 0 (#3432 ) * reset serialization_max_chunk_size to 0 * reword flag information --------- Signed-off-by: kostas <kostas@dragonflydb.io>	2024-08-05 09:36:23 +03:00
Roman Gershman	9eacedf58e	chore: simplify master replication cancelation interface (#3439 ) * chore: simplify master replication cancelation interface Before that CancelReplication did too many things, moreover, we had StopReplication that did the same. This PR moves CancelReplication under ReplicaInfo struct, and reduces code duplication around this change. Signed-off-by: Roman Gershman <roman@dragonflydb.io> * Update src/server/dflycmd.cc Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com> Signed-off-by: Roman Gershman <romange@gmail.com> --------- Signed-off-by: Roman Gershman <roman@dragonflydb.io> Signed-off-by: Roman Gershman <romange@gmail.com> Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>	2024-08-04 19:00:52 +00:00
Vladislav	55d39b66ff	chore: fix memcached pipeline test (#3438 )	2024-08-04 15:41:17 +03:00
Roman Gershman	8f7c36e4b3	chore: reorganize EngineShard::Heartbeat (#3437 ) * chore: reorganize EngineShard::Heartbeat 1. Simplify CacheStats by using accessorts directly provided by DbSlice 2. Separate eviction for tiering as tiering can be done on replica. --------- Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-08-04 15:00:43 +03:00
Roman Gershman	cfd2273fb0	chore: improve replication locks (#3436 ) * chore: improve replication locks Allow non-exclusive, read-only access to Dfly::ReplicaInfo structure. The most important change is in DflyCmd::CancelReplication, where before it has locked ReplicaInfo mutex and then continued with locking the global mutex. It is dangerous because most operation lock them in the opposite order. Also rename ambigous GetReplicaInfo accessors to clearer names. Signed-off-by: Roman Gershman <roman@dragonflydb.io> * chore: comments * chore: comments --------- Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-08-04 10:55:50 +00:00
Vladislav	2ef475865f	test(cluster): Migration replication test (#3417 )	2024-08-04 12:45:02 +03:00
Shahar Mike	2aa0b70035	feat(server): Support `replica-announce-ip`/`port` (#3421 ) * feat: Support `replica-announce-ip`/`port` Before this PR, we only supported `cluster_announce_ip`. It's basically the same feature, but used for cluster announcements instead of replication. This PR adds support for `replica-announce-ip` and `replica-announce-port`, which can be set via new flags `--announce_ip=` and `--announce_port=`. These flags apply to both cluster and replica announcements. Tested via running Sentinel, and making sure it is able to connect to announced ip+port, while it can't connect to announced false / unavailable ip+port. Note: this PR deprecates `--cluster_announce_ip`, but continues to support it. We will remove it in a future version. Fixes #3380 * fix failing test * destructure	2024-08-04 12:35:14 +03:00
Roman Gershman	c9ed3f7b2b	chore: retire TEST_EnableHeartBeat (#3435 ) Now unit tests will run the same Hearbeat fiber like in prod. The whole feature was redundant, with just few explicit settings of maxmemory_limit I succeeeded to make all unit tests pass. In addition, this change allows passing a global handler that is called by heartbeat from a single thread. This is not used yet - preparation for the next PR to break hung up replication connections on a master. Finally, this change has some non-functional clean-ups and warning fixes to improve code quality. Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-08-03 20:17:23 +03:00
Vladislav	82298b8122	fix(server): Implement SCRIPT GC command (#3431 ) * fix(server): Implement SCRIPT GC command	2024-08-02 23:49:51 +03:00
Roman Gershman	f652f10743	chore: optimize SendStringArrInternal even more (#3425 ) Before - sending 200K items requires more than 12K send calls. Now - requires less than 2K calls. Latency also went down though not by x6. Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-08-02 14:53:20 +03:00
Roman Gershman	8622c27ce1	chore: expose metric that shows how many task submitters are blocked (#3427 ) * chore: expose metric that shows how many task submitters are blocked This should help us in identifying deadlocks quickly. --------- Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-08-01 21:27:15 +03:00
Borys	e2b6cfb384	chore: skip cluster tests if redis-server wasn't found (#3416 ) * chore: skip cluster tests if redis-server wasn't found	2024-08-01 13:04:02 +00:00
Roman Gershman	a0918de2d3	feat: Support non-root paths for json.merge (#3419 ) * feat: Support non-root paths for json.merge Pass path argument and rewrite the JSON.MERGE code similar to OpToggle or other mutating functions. Currently works only with --experimental_flat_json=false. Signed-off-by: Roman Gershman <roman@dragonflydb.io> * chore: comments --------- Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-08-01 08:33:36 +00:00
Roman Gershman	0ad310717d	chore: Tiered fixes (#3401 ) 1. Add background offloading stats 2. remove direct_fd override - helio is already updated with default=false, so it's not needed anymore. 3. remove redundant tiered_storage_memory_margin flag Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-08-01 11:03:13 +03:00
Roman Gershman	71b861572a	chore: remove verbose printing of tests (#3420 ) Motivation: to avoid 80MB logs into stdout like this one: https://github.com/dragonflydb/dragonfly/actions/runs/10174852001/job/28141278813?pr=3401 Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-08-01 10:52:50 +03:00
Vladislav	e273015c0b	fix(connection): Count memchached pipelined commands (#3413 )	2024-08-01 10:14:36 +03:00
Kostas Kyrimis	7e911c100a	fix: json.merge exception crash (#3409 ) json.merge would throw an exception when the json object did not contain the element to replace because RecursiveMerge functions used &dest->at(k_v.key()) which threw the exception. Remove RecursiveMerge completely and use the one implemented in jsoncons lib. * add test * replace RecursiveMerge with mergepatch::apply_merge_patch * add exception handling for that flow --------- Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-31 17:03:36 +03:00
Borys	558a22d5b8	fix: crash with NS in multi/exec #3410 (#3415 )	2024-07-31 10:06:32 +00:00
Kostas Kyrimis	1aa0720843	chore: increase timeout of regression tests (#3412 ) The recent changes of the serialization_max_chunk_size set to 1 for extreme testing increased the running time of the tests causing them sometimes to timeout. * increase timeout on reg tests from 40 to 50 --------- Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-31 07:28:44 +00:00
Kostas Kyrimis	aa02070e3d	chore: add db_slice lock to protect segments from preemptions (#3406 ) DastTable::Traverse is error prone when the callback passed preempts because the segment might change. This is problematic and we need atomicity while traversing segments with preemption. The fix is to add Traverse in DbSlice and protect the traversal via ThreadLocalMutex. * add ConditionFlag to DbSlice * add Traverse in DbSlice and protect it with the ConditionFlag * remove condition flag from snapshot * remove condition flag from streamer --------- Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-30 15:02:54 +03:00
Vladislav	f536f8afbd	chore: cancel slot migrations on shutdown (#3405 )	2024-07-30 12:47:58 +03:00
Roman Gershman	e464990643	feat: stabilize non-coordinated omission mode (#3407 ) * feat: stabilize non-coordinated omission mode 1. Our latency/RPS computations were off because we started measuring before drivers started running. Now, Run/Start phases are separated, so the start time is measured more precisely (after the start phase) 2. Introduced progress per connection - one of my discoveries is that driver connections progress with differrent pace when running in coordinated omission mode. This can reach x5 speed differrences. Now we measure and output fastest/slowest progress. 3. Coordinated omission is great when the Server Under Test is able to sustain the required RPS. But if the actual RPS is lower than the one is sent than the final latencies will be infinitely high. We fix it by introducing self-adjusting sleep interval, so if the actual RPS is lower we will increase the interval to be closer to the actual RPS. Show p99 latency and maximum pending requests per connection. Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com> Signed-off-by: Roman Gershman <romange@gmail.com> --------- Signed-off-by: Roman Gershman <roman@dragonflydb.io> Signed-off-by: Roman Gershman <romange@gmail.com> Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>	2024-07-30 11:55:43 +03:00
Shahar Mike	89a48a7aa8	chore: Support setting the value of `replica-priority` (#3400 ) * chore: Support setting the value of `replica-priority` This PR adds a small refactor to the way we set and get config names which have dashes (`-`) and underscores (`_`). Until now, words were separated by underscores because this is how our flags library (absl) works. However, this is incompatible with Valkey, which uses dashes as a word separator. Once merged, we will support both underscores and dashes in config names, but will only return the name with dashes. This is a behavior change. We're doing this in order to be compatible with `replica-priority` and possibly other config names that Valkey uses. * Flag restore * normalize to '_'	2024-07-29 23:02:49 +03:00
Shahar Mike	7100168bab	chore: Don't print password to log on replica `AUTH` failure (#3403 )	2024-07-29 22:36:39 +03:00
Roman Gershman	776bd79381	fix: reenable macos builds (#3399 ) * fix: reenable macos builds Also, add debug function that prints local state if deadlocks occure. * fix: free cold memory for non-cache mode as well * chore: disable FreeMemWithEvictionStep again Because it heavily affects the performance when performing evictions. --------- Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-07-28 22:40:51 +03:00
Vladislav	1a8c12225b	chore(tiering): Move cool entry warmup to DbSlice (#3397 ) Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>	2024-07-28 17:30:41 +03:00
Shahar Mike	20bda84317	Revert "chore: set serialization_max_chunk_size to 1 byte (#3379 )" (#3398 ) This reverts commit `2867d54a05`.	2024-07-28 06:48:46 +00:00
Stepan Bagritsevich	28cfde0a27	fix: Fix unsupported object type rejson-rl in RedisInsight (#3384 ) * fix: Fix unsupported object type rejson-rl in RedisInsight Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com> * fix(generic_family): fix case for the TYPE option in SCAN command Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com> * feat(generic_family_test): Add test for the Redis GUI Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com> * refactor: address comments Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com> * refactor: address comments 2 Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com> * refactor: change variable name from obj_type_as_string to obj_type Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com> --------- Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com>	2024-07-27 19:05:00 +02:00
Roman Gershman	6b67f44e29	chore: tiering - make Modify work with cool storage (#3395 ) 1. Fully support tiered_experimental_cooling for all operations 2. Offset cool storage usage when computing memory pressure situations in Hearbeat. 3. Introduce realtime entry counting per db_slice and provide DCHECK to verify it vs the old approach. Later we will switch to realtime entry and free memory computations when computing bytes per object, and remove the old approach in CacheStats(). 4. Show hit rate during the run of dfly_bench loadtest. Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-07-27 14:31:29 +03:00
Kostas Kyrimis	9d16bd6f6e	fix(acl): remove none from acl categories (#3392 ) None does not exist in Valkey and its entry was missing from the indexes we use to map categories to commands leading to an out of bounds access and causing a segfault. * remove none from acl categories --------- Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-26 09:27:29 +03:00
Roman Gershman	0a26a06065	chore: tiered fixes (#3393 ) 1. Use introsive::list for CoolQueue. 2. Make sure that we ignore cool memory usage when computing average object size to prevent evictions during dashtable growth attempts. 3. Remove items from the cool storage before evicting them from the dash table. Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-07-25 23:38:44 +03:00
Kostas Kyrimis	2867d54a05	chore: set serialization_max_chunk_size to 1 byte (#3379 ) Update the flag for extreme testing. We should remove this before the release. * set serialization_max_chunk_size to 1 byte --------- Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-25 23:10:44 +03:00
Kostas Kyrimis	6d9e370e2d	fix: test_big_value_serialization_memory_limit shutdown timeout (#3390 ) The problem is that the test test_big_value_serialization_memory_limit will try to shutdown dragonfly at the end with a timeout of 15 seconds. Dragonfly during shutdown takes a snapshot which might take more than 15 seconds and the test fails. * call flushall before we exit the test --------- Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-25 21:38:09 +03:00
Kostas Kyrimis	79d7f57b67	fix: disable inline transactions when db_slice has registered callbacks (#3391 ) Inline transactions do not acquire any locks and therefore they should not preempt. This is no longer true when db_slice has registered callbacks. * disable inline transactions when db_slice has registered callbacks --------- Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-25 16:06:35 +00:00
Kostas Kyrimis	a95cf2e857	chore: do not preempt on db_slice::RegisterOnChange (#3388 ) For big value serialization it is required to support preemption when db_slice::RegisterOnChange is called to avoid UB when a code path is iterating over the change_cb_ and preempts because it serializes a big value. As this is problematic and can lead to data inconsistencies I replace the std::vector with std::list and bound the iteration of change_cb_ on paths that preempt. * replace std::vector with std::list for change_cb_ * bound iteration of change_cb_ on paths that preempt --------- Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-25 16:08:02 +03:00

1 2 3 4 5 ...

2545 commits