mirror of https://github.com/dragonflydb/dragonfly.git synced 2024-12-14 11:58:02 +00:00
Commit graph

2936 commits

Author SHA1 Message Date
Kostas Kyrimis
37fec87070
chore: increase load in test_noreply_pipeline (#3960)
The test is flaky because it relies on the producer (the pytest client) sending a batch of commands fast enough, before they get dispatched synchronously, so I increased the load.
2024-10-22 21:00:25 +03:00
Borys
dec0712e15
test: add test to test big collections or collections with big values (#3959) 2024-10-22 15:01:32 +03:00
Kostas Kyrimis
119723316e
chore: tune test_rss_used_mem_gap (#3958)
It appears that newer versions of the gh runner require more memory. Some cases of test_rss_used_mem_gap allocate more than 6.5-7 GB of memory, leaving barely 0.5 GB for the gh runner (7.5 GB available in total), which sometimes causes the instance to run out of memory.
2024-10-22 10:31:13 +03:00
Stepan Bagritsevich
e96a99a868
fix(search_family): Temporarily remove the error when a field name does not have the '@' sign at the beginning in the FT.AGGREGATE command (#3956) 2024-10-21 18:35:23 +02:00
Kostas Kyrimis
478a5d476d
chore: disable test_cluster_memory_consumption_migration (#3948)
The test takes more than 10 minutes on the CI, which causes the CI run to time out.
2024-10-21 08:41:15 +03:00
Roman Gershman
3028314701
chore: pass SinkReplyBuilder and Transaction explicitly. Part1 (#3946)
For some of our commands we need to inject another transaction and another SinkReplyBuilder.
This results in error-prone injections of temporary objects into ConnectionContext.
Most commands just need Transaction and SinkReplyBuilder, so let's pass them explicitly.
The final goal will be to remove Transaction and reply_builder fields from ConnectionContext.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-10-20 19:54:50 +03:00
Roman Gershman
f0c30a6d59
feat: track request sizes histograms (#3951)
This PR introduces "DEBUG RECVSIZE ENABLE|DISABLE|tid"
command that allows tracking of request sizes.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-10-20 19:54:34 +03:00
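To illustrate what tracking request sizes as a histogram could look like, here is a minimal Python sketch. The power-of-two bucketing and the `RequestSizeHistogram` class are assumptions made for illustration only; the commit message does not describe how Dragonfly buckets sizes or formats its output.

```python
from collections import Counter


class RequestSizeHistogram:
    """Hypothetical histogram that groups request sizes into power-of-two buckets."""

    def __init__(self) -> None:
        self.enabled = False
        self._buckets = Counter()

    def record(self, size: int) -> None:
        # Tracking is opt-in, mirroring the ENABLE/DISABLE switch.
        if not self.enabled:
            return
        # Bucket key is the smallest power of two >= size (minimum 1).
        bucket = 1 if size <= 1 else 1 << (size - 1).bit_length()
        self._buckets[bucket] += 1

    def snapshot(self) -> dict:
        # Return bucket -> count, ordered by bucket size.
        return dict(sorted(self._buckets.items()))
```

With tracking enabled, requests of 100 and 128 bytes land in the same 128-byte bucket, while a 129-byte request falls into the 256-byte bucket.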
Vladislav
32a31cf1d8
chore(facade): Fix bad new IO glue (#3940)
* chore(facade): Fix bad new IO glue

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2024-10-18 23:25:56 +03:00
Roman Gershman
14220a6a20
chore: get rid of ToUpper/ToLower mutations on arguments (#3950)
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-10-18 23:23:59 +03:00
Roman Gershman
84e22aa658
chore: remove ToUpper calls in main_service (#3947)
* chore: remove ToUpper calls in main_service

Also, test for IsPaused() first to avoid doing more checks for common-case.

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-10-18 14:04:47 +00:00
Roman Gershman
a7c9fde38e
chore: get rid of ToUpper call and use AsciiStrToUpper (#3944)
Also remove std:: in bitops family to reduce noise.
No functional changes.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-10-18 12:47:40 +03:00
Roman Gershman
5ab32b97d9
chore(refactoring): header clean ups (#3943)
Move privately used header code to cc files. Remove redundant includes.
No functional changes.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-10-18 12:47:26 +03:00
Borys
866c82a3fa
test: add test to reproduce a lot of memory consumption during migration (#3939) 2024-10-17 14:23:26 +03:00
Borys
ef814f6670
chore: ignore applying the same cluster config twice (#3932) 2024-10-16 09:07:01 +03:00
romange
98bb5da67d chore(helm-chart): update to v1.24.0 2024-10-16 05:56:17 +00:00
Vladislav
7870f59466
chore(search): Fix deprecated functions (#3933)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2024-10-15 22:50:22 +03:00
Roman Gershman
e9e169d074
fix: dragonfly_connection should only access the original reply_builder (#3924)
ConnectionContext.reply_builder can be injected and replaced by the service logic.
Before, dragonfly_connection accessed it via cc_->reply_builder in some places,
which led it to access the injected object. Moreover, EVAL commands can be offloaded
to another thread and that thread could inject the object, making the access to cc_->reply_builder_
non thread-safe.

Now dragonfly_connection copies aside the reply_builder_ pointer, and uses only this pointer for communicating with the client.

Also, remove redundant arguments.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-10-15 15:37:50 +03:00
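The idea behind fix #3924 above can be sketched in a few lines of Python. This is an illustrative model, not Dragonfly's C++ code: the `ReplyBuilder`, `ConnectionContext` and `Connection` classes below are simplified stand-ins, and the method names are hypothetical.

```python
class ReplyBuilder:
    """Stand-in for a reply sink; collects messages in a list."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.sent = []

    def send(self, msg: str) -> None:
        self.sent.append(msg)


class ConnectionContext:
    def __init__(self, reply_builder: ReplyBuilder) -> None:
        # Service logic may temporarily replace this field.
        self.reply_builder = reply_builder


class Connection:
    def __init__(self, ctx: ConnectionContext) -> None:
        self.ctx = ctx
        # Copy the original builder aside once; never re-read it from ctx.
        self._reply_builder = ctx.reply_builder

    def reply_to_client(self, msg: str) -> None:
        # Always talks to the builder captured at construction time.
        self._reply_builder.send(msg)
```

Because the connection holds its own reference, a later injection into the context does not redirect what the connection sends to the client.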
Shahar Mike
c3f9ec18ae
fix: Fix test_flushall_in_full_sync (#3929)
* fix: Fix `test_flushall_in_full_sync`

This test failed in CI many times. The issue was that we reach stable
sync too quickly, and miss the full sync stage.

I changed the seeder to add 100k (instead of 30k) keys for the stage to
take longer.

* StaticSeeder
2024-10-15 11:48:32 +00:00
adiholden
a1830e1b5e
feat(server): use listpack node encoding for list (#3914)
Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-10-15 13:55:26 +03:00
Shahar Mike
c868b27bbe
fix: Support replicating Valkey and Redis 7.2 (#3927)
Until now, we only tested Dragonfly against Redis 6.2.  It appears that
something has changed in the way Redis sends stable sync commands, and
now they also forward `MULTI` and `EXEC` as part of their replication.

Since we do not allow all commands to run under `MULTI`/`EXEC`,
specifically `SELECT`, a Dragonfly replica of such servers failed these
commands and became inconsistent with the data on the master.

The proposed fix is to simply ignore (i.e. not execute) `MULTI`/`EXEC`
coming from a Redis/Valkey master, and run the commands within those
transactions individually, like we do for other transactions.

To test this we randomly choose a redis/valkey server based on 3
available installed binaries and test against them.
2024-10-15 13:12:16 +03:00
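The approach described in #3927 above, ignoring `MULTI`/`EXEC` and running the enclosed commands individually, can be sketched as a simple filter over the replicated command stream. This is a minimal Python illustration; the function name and the list-of-strings command representation are assumptions, not Dragonfly's actual replication code.

```python
from typing import Iterable, Iterator, List

# Commands that only delimit a transaction and carry no data changes.
_TX_MARKERS = {"MULTI", "EXEC"}


def strip_transaction_markers(stream: Iterable[List[str]]) -> Iterator[List[str]]:
    """Yield replicated commands, dropping MULTI/EXEC wrappers."""
    for cmd in stream:
        # Command names are case-insensitive, so compare in upper case.
        if cmd and cmd[0].upper() in _TX_MARKERS:
            continue
        yield cmd
```

A stream such as `MULTI`, `SELECT 1`, `SET k v`, `EXEC` is reduced to the two inner commands, so `SELECT` is never executed inside a transaction.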
Kostas Kyrimis
d2a83121e4
fix: pre-commit ci workflow (#3917)
* fix: pre-commit ci workflow


Signed-off-by: kostas <kostas@dragonflydb.io>
2024-10-15 12:15:57 +03:00
Vladislav
f455981927
feat(search): Prefix search* (#3913) 2024-10-14 13:54:07 +03:00
Kostas Kyrimis
588d6cc339
chore: relax assertion in test_noreply_pipeline (#3908)
* adjust assert condition

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-10-14 09:27:57 +03:00
Shahar Mike
cf2e94f374
chore: Add --allocator_tracker for default tracking (#3901)
* chore: Add `--allocator_tracker` for default tracking

Before, in order to use allocation tracker, one had to issue a `MEMORY
TRACK` command. This flag is identical to that, but allows starting
Dragonfly with certain ranges without issuing a command.

While here, fix a bug. Apparently, `absl::InlinedVector<>` has a bug in
the implementation of `max_size()` and so in practice we did not limit
the number of trackers. I switched to use `capacity()` instead, which I
tested and it works well.

Notes:
1. Currently the flag always adds 100% "sampling"; we can extend that in
   the future if need be
2. I added the flag in `dfly_main.cc` with custom initialization,
   because it's low level, and I couldn't get it reasonably working with
   changes only to `allocation_tracker.cc`

* fixes
2024-10-13 12:50:05 +03:00
Vladislav
92217b6045
chore(search): Rax TreeMap (#3909)
RaxTreeMap based on redis rax.h
2024-10-13 09:18:54 +03:00
Roman Gershman
04a29bfafa
fix: macos build (#3912)
It has been failing due to the OVERFLOW constant, which is predefined
as a macro on macOS.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-10-12 22:50:42 +03:00
Roman Gershman
4012ad1855
fix: prevents Dragonfly from blocking in epoll during snapshotting (#3911)
The problem - we used file write in non-direct mode when writing snapshots in epoll mode.
As a result - lots of data was cached into OS memory. But then during the rename operation,
when we rename "xxx.dfs.tmp" into "xxx.dfs", the OS flushes the file caches and the thread
is stuck in OS system call rename for a long time.

The fix - to use DIRECT mode and to avoid caching the data into OS caches at all.
Fixes #3895

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-10-12 18:26:12 +03:00
Diskein
ba57145c53
fix!: fix BITPOS command responses (#3893) (#3910)
BITPOS returns 0 for non-existent keys according to Redis's
implementation.

BITPOS allows only 0 and 1 as the bit mode argument.

Signed-off-by: Denis K <kalantaevskii@gmail.com>
2024-10-12 10:55:01 +03:00
Roman Gershman
5d2c308c99
chore: schedule chains (#3819)
Use intrusive queue that allows batching of scheduling calls instead of handling each call separately.
This optimization improves latency and throughput by 3-5%.
In addition, we expose batching statistics in info transaction block.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-10-11 22:41:31 +03:00
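The batching idea in #3819 above can be illustrated with a small Python sketch. It uses a plain `collections.deque` rather than an intrusive queue, and the `BatchingScheduler` class and its `batch_sizes` statistic are hypothetical names chosen for the example, not Dragonfly's API.

```python
from collections import deque
from typing import Callable, Deque, List


class BatchingScheduler:
    """Hypothetical scheduler that drains all queued calls in one pass."""

    def __init__(self) -> None:
        self._pending: Deque[Callable[[], None]] = deque()
        self.batch_sizes: List[int] = []  # simple batching statistics

    def schedule(self, call: Callable[[], None]) -> None:
        # Enqueue only; the work runs when the queue is drained.
        self._pending.append(call)

    def drain(self) -> int:
        # Handle every call queued so far as a single batch.
        batch = list(self._pending)
        self._pending.clear()
        for call in batch:
            call()
        if batch:
            self.batch_sizes.append(len(batch))
        return len(batch)
```

Three `schedule()` calls followed by one `drain()` run as a single batch of three, which is the kind of figure a batching statistic would report.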
Vladislav
e71f083f34
feat(search): STOPWORDS (#3851)
Adds support for STOPWORDS option
2024-10-10 21:58:12 +03:00
adiholden
d876bcd5cb chore(helm-chart): update to v1.23.2 2024-10-10 18:22:34 +00:00
Kostas Kyrimis
0cea5fe2ff
chore: parse cgroup v2 (#3857)
* chore: parse cgroup v2

* small parsing logic
2024-10-10 19:15:26 +03:00
Andy Dunstall
1e429c8e59
fix(cluster): fix unknown migration error (#3899) 2024-10-10 10:14:15 +01:00
Kostas Kyrimis
a5fa3ab9f5
chore: skip flaky test_noreply_pipeline (#3903)
* Disable test_noreply_pipeline because it's really flaky. I will look into this once I wrap up my pending tasks.
2024-10-10 07:37:13 +00:00
Vladislav
786c9cd44d
chore: collection size (#3844) 2024-10-08 18:51:11 +03:00
Shahar Mike
50a7f2bcb1
fix: Do not kill Dragonfly on failed DFLY LOAD (#3892)
Today, some of the failures to load an RDB file passed via
`--dbfilename` cause Dragonfly to terminate with an error code. This is
ok and works as expected.

The problem is that the same code path is used for `DFLY LOAD`, which
means that if there's an error loading the file (such as corrupted
file), Dragonfly will exit instead of returning an error code to the
client.

This change fixes that, by only exiting in the code path which loads on
init.

Note to reviewer: apparently we can't call `Future::Get()` more than
once, as the first call resets the state of the future and drops the
previously saved value, so we use a Fiber here instead.
2024-10-08 14:47:31 +03:00
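The split described in #3892 above, where only the load-on-init path exits and the command path returns an error, can be sketched as follows. This is a simplified Python model under stated assumptions: `load_snapshot`, `load_on_init` and `handle_load_command` are hypothetical names, and the "bad magic" check merely stands in for real RDB parsing.

```python
class RdbLoadError(Exception):
    """Raised when a snapshot file cannot be loaded."""


def load_snapshot(data: bytes) -> int:
    # Toy validation standing in for real RDB parsing.
    if not data.startswith(b"REDIS"):
        raise RdbLoadError("bad magic")
    return len(data)


def load_on_init(data: bytes) -> int:
    # Startup path: a broken snapshot is fatal.
    try:
        return load_snapshot(data)
    except RdbLoadError as err:
        raise SystemExit(f"failed to load snapshot: {err}")


def handle_load_command(data: bytes) -> str:
    # Command path: report the failure to the client and keep running.
    try:
        load_snapshot(data)
    except RdbLoadError as err:
        return f"-ERR {err}"
    return "+OK"
```

Both paths share the same loader; only the caller decides whether a failure terminates the process or becomes an error reply.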
Shahar Mike
b2ebfd05d4
fix: Do not publish to connections without context (#3873)
* fix: Do not publish to connections without context

This is a rare case where a closed connection is kept alive while the
handling fiber yields, therefore leaving `cc_` (the connection context)
pointing to null for other fibers to see.

As far as I can see, this can only happen during server shutdown, but
there could be other cases that I have missed.

The test on its own does _not_ reproduce the crash, however with added
`ThisFiber::SleepFor()`s I could reproduce the crash:

* Right before `DispatchBrief()`
  [here](e3214cb603/src/server/channel_store.cc (L154))
* Right after connection context `reset()`
  [here](2ab480e160/src/facade/dragonfly_connection.cc (L750))

In any case, calling `SendPubMessageAsync()` to a connection where `cc_`
is null is a bug, and we fix that here.

* rewording
2024-10-08 14:45:57 +03:00
Kostas Kyrimis
c1e9d510a3
chore: lock keys for optimistic transactions (#3865)
We do not acquire any locks for transactions that are executing optimistically. However, this is problematic for callbacks that need to preempt (e.g. because a journal is active).

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-10-08 14:25:36 +03:00
Kostas Kyrimis
dac1b0f3ca
chore: allow rdb version 12 (#3860)
We currently support rdb files up to version 11. This is a blocker for people who want to migrate to dragonfly with newer versions of the format. As of now, there is only v12 and it only includes the addition of RDB_OPCODE_SLOT_INFO.

* adds support to load rdb files up to version 12
* reads and discards with a warning the contents of RDB_OPCODE_SLOT_INFO if found in the rdb file

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-10-07 07:14:11 +00:00
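The two changes in #3860 above (accept versions up to 12, then read and discard the slot info) can be sketched in Python. Several details here are assumptions not stated in the commit: the opcode value 244, the payload layout of three length-encoded integers (slot id, slot size, expires slot size), and all function names. Treat it as an illustration of the technique rather than Dragonfly's loader.

```python
import io
import struct

MAX_SUPPORTED_RDB_VERSION = 12
RDB_OPCODE_SLOT_INFO = 244  # assumed opcode value, not stated in the commit


def parse_version(header: bytes) -> int:
    # An RDB file starts with the magic "REDIS" followed by 4 ASCII digits.
    if len(header) != 9 or header[:5] != b"REDIS" or not header[5:].isdigit():
        raise ValueError("not an RDB header")
    version = int(header[5:])
    if not 1 <= version <= MAX_SUPPORTED_RDB_VERSION:
        raise ValueError(f"unsupported RDB version {version}")
    return version


def read_length(stream) -> int:
    # RDB length encoding: the top two bits of the first byte pick the width.
    first = stream.read(1)[0]
    kind = first >> 6
    if kind == 0:
        return first & 0x3F
    if kind == 1:
        return ((first & 0x3F) << 8) | stream.read(1)[0]
    if first == 0x80:
        return struct.unpack(">I", stream.read(4))[0]
    if first == 0x81:
        return struct.unpack(">Q", stream.read(8))[0]
    raise ValueError("special encodings are not lengths")


def skip_slot_info(stream) -> tuple:
    # Assumed payload: slot id, slot size, expires slot size.
    fields = tuple(read_length(stream) for _ in range(3))
    print(f"warning: ignoring slot info {fields}")
    return fields
```

A loader would call `skip_slot_info` after reading an opcode byte equal to `RDB_OPCODE_SLOT_INFO`, then continue with the next opcode.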
Shahar Mike
5efc8f11d2
opt: Optimize AllocationTracker to be efficient when enabled (#3875)
Today there's a cost to enabling AllocationTracker, even for rarely used
memory bands.

This PR slightly optimizes the "happy path" (i.e. allocations outside
the tracked range), and also for the case where we track 100% of the
allocations.

Also, add a unit test for this class.
2024-10-06 22:00:57 +03:00
Roman Gershman
45aba139b6
chore: reduce usage of ToUpper (#3874)
We would like to stop passing MutableSlice as arguments, and removing ToUpper
is the first step toward that.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-10-06 21:45:11 +03:00
Kostas Kyrimis
129ff0b0f7
chore: run memory decommit after snapshot load/save (#3828)
Sometimes for large values during snapshot loading/saving we allocate a lot of extra memory. For that, we might need to manually run memory decommit for mimalloc to release memory pages back to the OS. This PR addresses that by manually running memory decommit after each shard finishes loading or saving a snapshot.

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-10-06 08:19:24 +03:00
Roman Gershman
612c75c67b
chore: Refactor AddMany (#3869)
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-10-05 19:27:48 +03:00
Andy Dunstall
4dbed3f8dd
feat(rdb_load): add support for loading huge streams (#3855)
* chore: remove RdbLoad Ltrace::arr nested vector

* feat(rdb_load): add support for loading huge streams
2024-10-05 07:19:03 +03:00
Roman Gershman
07e0b9db4b
chore: ClearInternal now can clear partially (#3867)
* chore: ClearInternal now can clear partially

Intended for future use - to deallocate large objects gradually.
Currently nothing is changed in the functionality besides some cleanups.
---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-10-04 22:50:43 +03:00
Roman Gershman
bd972b6384
chore: Implement AddMany method (#3866)
* chore: Implement AddMany method

1. Fix a performance bug in Find2 that made redundant comparisons
2. Provide a method to StringSet that adds several items in a batch
3. Use AddMany inside set_family

Before:
```
BM_Add        4253939 ns      4253713 ns          991
```

After:
```
BM_Add        3482177 ns      3482050 ns         1206
BM_AddMany    3101622 ns      3101507 ns         1360
```

Signed-off-by: Roman Gershman <roman@dragonflydb.io>

* chore: fixes

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-10-04 22:50:05 +03:00
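The benchmark figures quoted in #3866 above can be turned into relative improvements with a short calculation. The sketch below assumes the usual Google Benchmark column layout, where the first numeric column is wall-clock time per iteration; the helper name `improvement_pct` is made up for this example.

```python
def improvement_pct(before_ns: float, after_ns: float) -> float:
    """Relative reduction in time per iteration, in percent."""
    return (before_ns - after_ns) / before_ns * 100.0


# Times (first numeric column) copied from the commit message.
BM_ADD_BEFORE = 4253939
BM_ADD_AFTER = 3482177
BM_ADD_MANY = 3101622

add_gain = improvement_pct(BM_ADD_BEFORE, BM_ADD_AFTER)
add_many_gain = improvement_pct(BM_ADD_BEFORE, BM_ADD_MANY)
print(f"BM_Add: {add_gain:.1f}% less time per iteration")
print(f"BM_AddMany vs. old BM_Add: {add_many_gain:.1f}% less time per iteration")
```

Under these numbers, `BM_Add` takes roughly 18% less time after the change, and `BM_AddMany` takes roughly 27% less time than the original `BM_Add`.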
Roman Gershman
a86fcf80be
chore: Remove DenseSet::AddOrFindDense and AddSds (#3864)
Clean up interface a bit. AddOrFindDense does not make much sense as a single function
because it does not provide any performance benefits - we still must perform a lookup
before inserting. AddSds should have been removed a long time ago.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-10-04 17:14:49 +03:00
Roman Gershman
2e1d81ac80
chore: improve performance of ClearInternal (#3863)
Before:
`BM_Clear/elements:32000     418763 ns       418749 ns         9965`

After:
`BM_Clear/elements:32000     323252 ns       323239 ns        12790`

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-10-04 14:26:25 +03:00
Joakim Rishaug
0f972a0ec6
feat: add HEXPIRE and FIELDEXPIRE (#3842)
* add hexpire
* add fieldexpire
* add tests
2024-10-04 14:24:16 +03:00
Roman Gershman
819f6e125d
chore: simplify CloneBatch code (#3862)
Remove awkward fetch_tail case and streamline the code.
Fix invalid prefetch addresses. Performance improved a little.

Before:
`BM_Fill/elements:32000     874677 ns       874647 ns         4774`

After:
`BM_Fill/elements:32000     831786 ns       831761 ns         5111`

Also added a benchmark for Clear() operation.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-10-04 12:08:41 +03:00