dragonflydb-dragonfly

mirror of https://github.com/dragonflydb/dragonfly.git synced 2024-12-14 11:58:02 +00:00

Author	SHA1	Message	Date
Kostas Kyrimis	2867d54a05	chore: set serialization_max_chunk_size to 1 byte (#3379 ) Update the flag for extreme testing. We should remove this before the release. * set serialization_max_chunk_size to 1 byte --------- Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-25 23:10:44 +03:00
Kostas Kyrimis	6d9e370e2d	fix: test_big_value_serialization_memory_limit shutdown timeout (#3390 ) The problem is that the test test_big_value_serialization_memory_limit tries to shut down Dragonfly at the end with a timeout of 15 seconds. During shutdown Dragonfly takes a snapshot, which might take more than 15 seconds, and then the test fails. * call flushall before we exit the test --------- Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-25 21:38:09 +03:00
Kostas Kyrimis	79d7f57b67	fix: disable inline transactions when db_slice has registered callbacks (#3391 ) Inline transactions do not acquire any locks and therefore they should not preempt. This is no longer true when db_slice has registered callbacks. * disable inline transactions when db_slice has registered callbacks --------- Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-25 16:06:35 +00:00
Kostas Kyrimis	a95cf2e857	chore: do not preempt on db_slice::RegisterOnChange (#3388 ) For big value serialization it is required to support preemption when db_slice::RegisterOnChange is called to avoid UB when a code path is iterating over the change_cb_ and preempts because it serializes a big value. As this is problematic and can lead to data inconsistencies I replace the std::vector with std::list and bound the iteration of change_cb_ on paths that preempt. * replace std::vector with std::list for change_cb_ * bound iteration of change_cb_ on paths that preempt --------- Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-25 16:08:02 +03:00
Kostas Kyrimis	4b851be57a	fix: remove fiber guard from non atomic section (#3381 ) We might preempt when we serialize a big value, and the code in journal was protected by an atomic guard, which triggered a check failure. * remove fiber guard from non atomic section * move LocalBlockingCounter to common --------- Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-25 16:06:35 +03:00
Roman Gershman	e2d65a0900	chore: reenable evictions upon insertion to avoid OOM rejections (#3387 ) * chore: reenable evictions upon insertion to avoid OOM rejections Before: when running dragonfly with --cache_mode we could get OOM rejections even though the eviction policy allowed evicting items to free memory. Ideally, dragonfly in cache mode should not respond with the OOM error. This PR reuses the same Eviction step we have in the Heartbeat and conditionally applies it during the insertion. In my test the OOM errors went from 500K to 0 and the server still respected the memory limit. Also, remove the old heuristics that have never been used. Test: ./dfly_bench --key_prefix=bar: -d 1024 --ratio=1:0 --qps=200 -n 3000 ./dragonfly --dbfilename= --proactor_threads=2 --maxmemory=600M --cache_mode --------- Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-07-25 15:28:57 +03:00
Shahar Mike	fb4222d01e	fix: Fix `test_take_over_seeder` (#3385 ) * fix: Fix `test_take_over_seeder` There are a few issues with the test: 1. Not using the admin port, which could cause pause to deadlock 2. Not waiting for some of the `task`s (although that won't cause a failure) But also in the product code: 1. We used to `std::move()` the same pointer multiple times 2. We assigned to the same status object from multiple threads Hopefully this fixes the test. It used to fail every ~100 attempts on my machine, now it's been >1,000 and they all passed. * add comments * remove shard_ptr param	2024-07-25 08:00:05 +00:00
Roman Gershman	181d356341	chore: update cached stats inside PollExecution (#3376 ) * chore: update cached stats inside PollExecution	2024-07-25 10:46:03 +03:00
Roman Gershman	8a9c9adbc5	chore: introduce a cool queue that gradually retires cool items (#3377 ) * chore: introduce a cool queue that gradually retires cool items This PR introduces a new state in which the offloaded value is not freed from memory but instead stays in the cool queue. Upon Read we convert the cool value back to the hot table and delete it from storage. When we are low on memory we retire the oldest cool values until we are above the threshold. The PR does not fully finish the feature but it is workable enough to start (load)testing. Missing: a) Handle Modify operations b) Retire cool items in more cases where we are low on memory. Specifically, refrain from evictions as long as cool items exist. --------- Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-07-25 09:09:40 +03:00
Roman Gershman	02b72c9042	chore: dfly_bench - print ongoing error counts (#3382 )	2024-07-24 22:13:11 +03:00
Kostas Kyrimis	52b29b302c	update: replication_acks_interval flag to 1000 (#3378 ) * update replication_acks_interval flag to 1000 --------- Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-24 13:28:56 +00:00
Kostas Kyrimis	929222a7df	chore: add mem test for big values and default the flag (#3369 ) * default serialization_max_chunk_size to 10 mb * add test for big values * small rename of enum to conform style guide --------- Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-24 16:07:27 +03:00
Roman Gershman	03b3f86aed	chore: Track db_slice table memory instantly (#3375 ) We update table_memory upon each deletion and insertion of an element. Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-07-24 14:13:08 +03:00
Vladislav	f73c7d0e42	fix(transaction): Properly store block cancel status (#3371 )	2024-07-24 14:05:00 +03:00
Stepan Bagritsevich	37cb247cd4	chore(replica): remove unused methods in the Replica class (#3374 ) Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com>	2024-07-24 12:05:13 +02:00
Kostas Kyrimis	cd863b89b4	chore: disable cluster_fuzzymigration (#3373 ) * mark cluster_fuzzymigration as skipped Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-24 11:46:44 +03:00
Roman Gershman	499fa2268b	chore: simplify computation of used_mem_current (#3372 ) * chore: simplify computation of used_mem_current Before - each thread updated its own variable and then, the global "used_mem_current" was updated by summing used memory from each thread. Now, each thread updates used_mem_current directly. The code is simpler and also provides more precise results more frequently. --------- Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-07-24 06:58:01 +00:00
Roman Gershman	eba722b774	chore: add a test for HeapSize() function (#3349 )	2024-07-23 19:02:57 +03:00
Roman Gershman	c8a98fd110	chore: small fixes around tiering (#3368 ) There are no changes in functionality here. Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-07-23 16:00:50 +03:00
Roman Gershman	cd1f9d3923	chore: Introduce CoolQueue (#3365 ) Also, add the according API to compact object. Now external objects can be in two states: Cool and Offloaded. Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-07-23 12:41:10 +03:00
Kostas Kyrimis	bcdfccc039	fix: protect OnJournalEntry with ConditionGuard (#3367 ) * add ConditionGuard on JournalEntry such that the stream state stays consistent Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-23 09:05:34 +00:00
Kostas Kyrimis	cd0e03a737	chore: disable compression on big values (#3358 ) * compression when we chunk big values Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-23 08:57:21 +00:00
Vladislav	759631e9ed	fix(transaction): Fix namespace access (#3364 ) Our area of attack during concurrent transaction access is the call to DisarmInShard and DisarmInShardWhen, which only access is_armed - an atomic variable. It is not safe to arbitrarily call GetNamespace() if we write to it in InitBase. Solution: Don't write to it post first initialization	2024-07-23 11:25:04 +03:00
Shahar Mike	76edd0d027	fix(server): Require >=1 args to `GETEX` (#3366 ) Without this change, issuing `redis-cli getex` crashes Dragonfly	2024-07-23 08:17:23 +00:00
Roman Gershman	aac90f25b5	fix: failure in test_cluster_fuzzymigration (#3363 )	2024-07-22 22:39:41 +03:00
Vladislav	f81a893368	chore(tiering): Range functions + small refactoring (#3207 )	2024-07-22 18:36:11 +03:00
Roman Gershman	7df6771eaa	fix: do not upload offload values on a first hit (#3360 )	2024-07-22 16:23:22 +03:00
Vladislav	3ffd30f193	chore(server): Introduce StringSetWrapper (#3347 ) chore: use StringSetWrapper Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>	2024-07-22 15:51:19 +03:00
Roman Gershman	4b1574b5c8	chore: fix test_parser_memory_stats flakiness (#3354 ) * chore: fix test_parser_memory_stats flakiness 1. Added a robust assert_eventually decorator for pytests 2. Improved the assertion condition in TieredStorageTest.BackgroundOffloading 3. Added total_uploaded stats for tiering that tells how many times offloaded values were promoted back to RAM. * chore: skip test_cluster_fuzzymigration	2024-07-22 10:41:26 +00:00
Roman Gershman	1fc226b03c	chore: fixes to dfly_bench (#3353 ) 1. Moved CommandGenerator to thread scope - there is no need to maintain separate command generator per connection. 2. Added "done" metric - to know how much was done so far. Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-07-22 12:38:51 +03:00
Borys	8a1038647d	refactor: reduce number of logs for cluster (#3357 )	2024-07-22 09:19:30 +00:00
Kostas Kyrimis	93ae2ef0e6	chore: small rename and add dcheck on LocalBlockingCounter (#3356 ) * add suffix _ to members * add dcheck in destructor Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-22 06:40:57 +00:00
Shahar Mike	2f9dc29dc6	chore: Log connection context when issuing dangerous cmds (#3352 ) * chore: Log connection context when issuing dangerous cmds * raise VLOG level	2024-07-21 15:33:03 +03:00
Roman Gershman	feb9bc266a	chore: pull helio (#3350 ) * chore: pull helio --------- Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-07-21 15:26:25 +03:00
Roman Gershman	c46d95db2f	chore: clean up TaskQueue since we do not need multiple fibers for it (#3348 ) * chore: clean up TaskQueue since we do not need multiple fibers for it Implement TaskQueue as a wrapper around FiberQueue. --------- Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-07-21 07:27:53 +00:00
Roman Gershman	7b2603aa46	fix: corruption in replication stream (#3344 ) Before it was possible to issue several concurrent AsyncWrite requests. But these are not atomic, which leads to replication stream corruption. Now we wait for the previous request to finish before sending the next one. ThrottleIfNeeded now takes into account the pending buffer size for throttling. Fixes #3329 Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-07-20 13:50:21 -04:00
Roman Gershman	fb7782bcce	chore: remove redundant metrics from memory stats (#3345 ) Leave only connection memory usage in memory stats. We should think how we can move it also to /metrics. In addition, added a test verifying that redis parser memory usage is tracked. Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-07-20 06:02:55 -04:00
Roman Gershman	4e8c6ce515	fix: AllocationTracker::Remove return value was reversed (#3341 )	2024-07-19 15:47:24 +00:00
Vladislav	be59b5eeb4	chore: Make KeyIndex iterable (#3326 )	2024-07-19 14:23:46 +03:00
Shahar Mike	2b54fd985f	fix: Cancel outgoing migration when retrying / closing (#3339 )	2024-07-19 07:49:49 +00:00
Kostas Kyrimis	8a2d6ad1f4	fix: ub in RegisterOnChange and regression tests for big values (#3336 ) * fix replication test flag name for big values * fix a bug that triggers ub when RegisterOnChange is called on flows that iterate over the callbacks and preempt * add a stress test for big value serialization Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-19 07:03:17 +00:00
Borys	cad62679a4	Fix blocking commands moved error (#3334 ) * fix: BLPOP BZPOP(MIN\|MAX) moved error	2024-07-18 20:38:13 +03:00
Stepan Bagritsevich	d648e3ddd1	feat(hset_family): Add NX option to HSETEX (#3295 ) * feature(hset_family): Add NX option to HSETEX fixes dragonflydb#3265 Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com> * refactor(hset_family): Fix returned value in the HSETEX command Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com> * refactor: Revert the changes of the returned value for the HSETEX command Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com> --------- Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com>	2024-07-18 16:09:04 +04:00
Kostas Kyrimis	bfa5df5d6c	feat: add an option to flush serialized entries on threshold limit (#3241 ) * serialize big slots in chunks * allow preemption on large slots * disable big entries serialization for RDB files * add test Signed-off-by: kostas <kostas@dragonflydb.io>	2024-07-18 10:15:41 +00:00
Roman Gershman	37b992f27d	chore: implement sequential pass without the overlapping traffic (#3335 ) We divide the keyspace between connections in advance. This allows us to easily cover chunks of a key space in a predictable manner without having overlapping traffic. Excess traffic will just wrap around. Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-07-18 13:02:02 +03:00
Roman Gershman	c670ffd09e	chore: Add coordinated omission mode (#3332 ) * chore: Add coordinated omission mode * chore: implement sequential mode in dfly_bench	2024-07-18 05:11:27 -04:00
Roman Gershman	b9f8671df9	chore(tiering): add protection against overrunning memory budget (#3327 ) chore(tiering): Introduce second chance replacement strategy Introduce hot/cold replacement strategy https://www.geeksforgeeks.org/second-chance-or-clock-page-replacement-policy/ Also, add protection against overrunning memory budget Finally, cancel in-flight offloading requests for entries that were looked up. Signed-off-by: Roman Gershman <roman@dragonflydb.io>	2024-07-18 03:52:43 -04:00
Stepan Bagritsevich	d51fea09e2	fix(json_family): fix JSON.STRAPPEND command for JSON legacy mode (#3264 ) * fix(json_family): fix JSON.STRAPPEND command for JSON legacy mode Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com> * fix(json_family): add tests Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com> * refactor(json_family): address comments Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com> * refactor(json_family): code clean up Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com> * refactor(json_family): address comments 2 Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com> * refactor(json_family_test): remove map_single_element_vector_ flag Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com> * fix(json_family_test): add more tests Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com> --------- Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com>	2024-07-18 09:30:27 +04:00
Borys	1acc824eff	fix(test): copy logs for failed test during TEARDOWN phase (#3331 ) * fix(test): copy logs for failed test during TEARDOWN phase	2024-07-17 22:16:08 +03:00
Shahar Mike	4898b25b49	fix: Proper shutdown sequence with Namespaces (#3333 ) This removes a race between periodic fiber and namespaces during shutdown.	2024-07-17 16:58:22 +03:00


2399 commits