mirror of https://github.com/dragonflydb/dragonfly.git synced 2024-12-14 11:58:02 +00:00
Commit graph

2934 commits

Author SHA1 Message Date
Kostas Kyrimis
66e0fd0908
fix: stream memory tracking (#4067)
* add object memory usage for streams
* add test
2024-11-27 12:41:08 +02:00
Roman Gershman
065a63cab7
chore: qlist improvements (#4194)
fix: qlist improvements + bug fix in Erase

1. Reduce code duplication.
2. Expose qlist node count via "debug object"
3. Add more tests to qlist_test
4. Fix a bug in QList::Erase

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-27 11:28:40 +02:00
Roman Gershman
45f8e8446f
chore: get back on the decision to put a hard limit on command interface (#4203)
* chore: get back on the decision to put a hard limit on command interface

Limiting commands to only Transaction* and SinkReplyBuilder does not hold.
We sometimes need to access context fields for a multitude of reasons.

But I do not want to pass the huge ConnectionContext object, because it is then
hard to track unusual access patterns.

The compromise: introduce CommandContext, which currently has tx, rb and extended fields.
It will be relatively easy to identify irregular access patterns by tracking the extended field.

This commit is the first in a series of probably 10-15 commits. No functional changes here.

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-27 11:28:02 +02:00
Borys
3327e1a908
feat: add ability reading stream_listpacks_2/3 rdb types (#4192)
* feat: add ability reading stream_listpacks_2/3 rdb types

* refactor: address comments
2024-11-26 16:43:30 +02:00
Roman Gershman
f84e1eeac8
fix: debug object encoding names (#4188)
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-26 16:11:18 +02:00
Kostas Kyrimis
2f748c24dd
chore: fix false positives sanitizers (#4190)
* disable false positive

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-11-26 11:49:20 +02:00
Borys
0531c39aae
test: skip test_cluster_mgr because of unclosed instance (#4191)
2024-11-26 09:34:06 +00:00
dependabot[bot]
5c84f21caf
chore(deps): bump github/codeql-action from 3.27.4 to 3.27.5 in the actions group (#4186)
chore(deps): bump github/codeql-action in the actions group

Bumps the actions group with 1 update: [github/codeql-action](https://github.com/github/codeql-action).


Updates `github/codeql-action` from 3.27.4 to 3.27.5
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](ea9e4e3799...f09c1c0a94)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-26 10:04:40 +02:00
Roman Gershman
1709061ae6
chore: stop periodic task earlier during the shutdown process (#4187)
Fixes #4151

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-26 10:02:15 +02:00
romange
e7b49fa1c4
chore(helm-chart): update to v1.25.3
2024-11-26 02:24:10 +00:00
Roman Gershman
2c663f3833
chore: produce core files in regtests (#4185)
Should work only for self-hosted runners.
The core files will be kept in /var/crash/
We also copy automatically dragonfly binary into /var/crash to be able to debug later.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-25 17:13:11 +00:00
Roman Gershman
63742dd0cf
fix: stop using openssl for container healthchecks (#4181)
Dragonfly responds to ascii based requests to tls port with:
`-ERR Bad TLS header, double check if you enabled TLS for your client.`

Therefore, it is possible to test now both tls and non-tls ports with a plain-text PING.
Fixes #4171

Also, blacklist the bloom-filter test that Dragonfly does not support yet.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-25 17:41:17 +02:00
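The healthcheck idea in the entry above (#4181) can be illustrated with a small sketch. It assumes the check sends a plain-text PING and only inspects the first bytes of the reply; the function name `is_alive` and the exact acceptance rule are hypothetical and are not taken from the actual healthcheck script.

```python
# Hypothetical classifier for the reply to a plain-text PING.
# Assumption: either a normal PONG (non-TLS port) or the "Bad TLS header"
# error quoted in the commit message (TLS port) proves the server is up.
BAD_TLS_PREFIX = b"-ERR Bad TLS header"


def is_alive(response: bytes) -> bool:
    """Return True if the reply indicates a responsive server."""
    return response.startswith(b"+PONG") or response.startswith(BAD_TLS_PREFIX)


plain_reply = b"+PONG\r\n"
tls_reply = (
    b"-ERR Bad TLS header, double check if you enabled TLS for your client.\r\n"
)
no_reply = b""
```

With this rule a single plain-text probe covers both port types, so the check no longer needs an openssl client.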
Roman Gershman
7ac1631424
fix: deduplicate mget response (#4175)
* fix: deduplicate mget response

In case of duplicate mget keys, skips fetching the same key twice.
The optimization is straightforward: we just copy the response for the original key.
Since the response is a shallow object, we potentially save lots of memory with this
deduplication. Always deduplicate inside OpMGet.

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-25 17:29:53 +02:00
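The deduplication described in the entry above (#4175) can be sketched as follows. This is a minimal Python illustration, not the C++ code in OpMGet; `mget_dedup` and the `fetch` callback are hypothetical names introduced only for this example.

```python
from typing import Callable, Dict, List, Optional


def mget_dedup(
    keys: List[str], fetch: Callable[[str], Optional[bytes]]
) -> List[Optional[bytes]]:
    """Fetch each distinct key once and reuse the result for repeated keys."""
    cache: Dict[str, Optional[bytes]] = {}
    out: List[Optional[bytes]] = []
    for key in keys:
        if key not in cache:
            cache[key] = fetch(key)  # first occurrence: perform the real lookup
        out.append(cache[key])  # duplicates reference the same object
    return out


store = {"a": b"1", "b": b"2"}
calls: List[str] = []


def fetch(key: str) -> Optional[bytes]:
    calls.append(key)  # record lookups so the saving is visible
    return store.get(key)


result = mget_dedup(["a", "b", "a", "missing", "a"], fetch)
```

Here five requested keys trigger only three lookups, and the repeated entries share one value object instead of holding separate copies.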
Borys
43c83d29fa
feat: cluster migrations restarts immediately if timeout happens (#4081)
* feat: cluster migrations restarts immediately if timeout happens

* feat: add DEBUG MIGRATION PAUSE command
2024-11-25 16:02:22 +02:00
Shahar Mike
3c65651c69
feat: Huge values breakdown in cluster migration (#4144)
* feat: Huge values breakdown in cluster migration

Before this PR we used `RESTORE` commands for transferring data between
source and target nodes in cluster slots migration.

While this _works_, it has a side effect of consuming 2x memory for huge
values (i.e. if a single key's value takes 10gb, serializing it will
take 20gb or even 30gb).

With this PR we break down huge keys into multiple commands (`RPUSH`,
`HSET`, etc), respecting the existing `--serialization_max_chunk_size`
flag.

Part of #4100
2024-11-25 15:58:18 +02:00
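The breakdown described in the entry above (#4144) can be sketched for a list value. This is a simplified illustration: it assumes the chunk limit is a byte budget on the element payload of each command, which may not match the exact semantics of `--serialization_max_chunk_size`. The helper name `rpush_chunks` is hypothetical.

```python
from typing import Iterator, List


def rpush_chunks(
    key: str, values: List[bytes], max_chunk_size: int
) -> Iterator[List[bytes]]:
    """Yield RPUSH commands whose element payload stays within the budget.

    A single element larger than the budget still gets its own command.
    """
    batch: List[bytes] = []
    batch_bytes = 0
    for value in values:
        if batch and batch_bytes + len(value) > max_chunk_size:
            yield [b"RPUSH", key.encode(), *batch]
            batch, batch_bytes = [], 0
        batch.append(value)
        batch_bytes += len(value)
    if batch:
        yield [b"RPUSH", key.encode(), *batch]


commands = list(
    rpush_chunks("big", [b"aaaa", b"bbbb", b"cc", b"dddddddd"], max_chunk_size=8)
)
```

Because each command carries only a bounded slice of the value, the sender never needs to hold a second full serialized copy of the value in memory.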
Stepan Bagritsevich
2b3c182cc9
fix(search_family): Fix LOAD fields parsing in the FT.AGGREGATE and FT.SEARCH commands (#4012)
* fix(search_family): Fix LOAD fields parsing in the FT.AGGREGATE and FT.SEARCH commands

fixes dragonflydb#3989

Signed-off-by: Stsiapan Bahrytsevich <stefan@dragonflydb.io>

* refactor: address comments

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor(search_family): Address comments 2

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor(search_family): address comments 3

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor(search_family): address comments 4

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor(search_family): address comments 5

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

---------

Signed-off-by: Stsiapan Bahrytsevich <stefan@dragonflydb.io>
Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>
2024-11-25 09:50:31 +00:00
Tarun Pothulapati
3d68c9c99e
fix(release/helm): allow empty commits for rerun (#4163)
* fix(release/helm): allow empty commits for rerun

* remove empty commit and just move forward
2024-11-25 08:29:10 +00:00
Roman Gershman
872d5e2d7d
chore: more parser improvements (#4177)
The long-term goal is to make the parser consume the whole input when
it returns INPUT_PENDING. It requires several baby-step PRs.

This PR:
1. Adds more invariant checks
2. Avoids calling RedisParser::Parse with an empty buffer.
3. In bulk string parsing, remove the redundant "optimization" of rejecting partial strings of less than 32 bytes;
   in other words, consume small parts as well. The unit test is adjusted accordingly.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-25 09:15:29 +02:00
s13k
ff2359af30
fix(tools): Prevent dragonfly.logrotate to stop logrotate service (#4176)
Update dragonfly.logrotate

If multiple logs are being rotated and one of them fails (due to exit 1), the other logs that follow won't be rotated either, unless logrotate is run again.

If you want to prevent the rotation of a specific log file and not affect the rest of the logs, you'll want to handle the condition properly to ensure that logrotate doesn't abort due to the failure of the prerotate script.

To prevent the rotation of a specific log file without causing issues for other logs, you can use exit 0 to prevent rotation cleanly or design your prerotate script to handle conditions carefully.

Signed-off-by: s13k <s13k@pm.me>
2024-11-24 17:27:05 +00:00
Shahar Mike
6a7f345bc5
chore: Hide replicas from CLUSTER subcmds in managed mode (#4174)
* chore: Hide replicas from `CLUSTER` subcmds in managed mode

Part of #4173 (see for context)

* server.client()
2024-11-24 13:10:32 +00:00
Andy Dunstall
e05363995f
feat(server): add eval_ro and evalsha_ro (#4091)
2024-11-24 11:53:06 +00:00
Roman Gershman
91caa940b9
chore: fix shutdown sequence in Dragonfly server (#4168)
1. Better logging in regtests
2. Release resources in dfly_main in a more controlled manner.
3. Switch to ignoring signals when unregistering signal handlers during shutdown.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-24 10:35:00 +02:00
Sebastian Struß
cfca3e798d
adjusted grafana dashboard to be more user friendly (#4165)
2024-11-24 09:16:00 +02:00
Kostas Kyrimis
a012539f2c
fix: remove DenseSet::IteratorBase::TraverseApply (#4170)
Signed-off-by: kostas <kostas@dragonflydb.io>
2024-11-23 18:21:50 +02:00
Roman Gershman
b8c2dd888a
chore: log exit code of failing dragonfly in tests (#4166)
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-22 11:40:10 +02:00
Roman Gershman
a694bf46b8
chore: fix a regression build break (#4162)
2024-11-21 10:40:18 +00:00
Roman Gershman
581cfbf6c5
chore: allow slow and precise memory measurement of an object (#4160)
Specifically fixes "MEMORY USAGE" for lists.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-21 09:21:48 +02:00
romange
7d6f745636
chore(helm-chart): update to v1.25.2
2024-11-21 00:02:14 +00:00
Shahar Mike
24a1ec6ab2
fix: Huge entries fail to load outside RDB / replication (#4154)
* fix: Huge entries fail to load outside RDB / replication

We have an internal utility tool that we use to deserialize values in
some use cases:

* `RESTORE`
* Cluster slot migration
* `RENAME`, if the source and target shards are different

We [recently](https://github.com/dragonflydb/dragonfly/issues/3760)
changed this area of the code, which caused this regression as it only
handled RDB / replication streams.

Fixes #4143
2024-11-20 14:00:07 +00:00
Roman Gershman
36135f516f
fix: test_replication_all failure (#4155)
Fixes #4150. The failure can be reproduced with high probability on ARM via
`pytest dragonfly/replication_test.py -k test_replication_all[df_factory0-mode0-8-t_replicas3-seeder_config3-2000-False]`

Not sure why this barrier is needed, but #4146 removes it,
which breaks a delicate balance in the code in an unexpected way.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-20 14:00:28 +02:00
Roman Gershman
0e7ae34fe4
fix: enforce load limits when loading snapshot (#4136)
* fix: enforce load limits when loading snapshot

Prevent loading snapshots with used memory higher than max memory limit.

1. Store the used memory metadata only inside the summary file
2. Load the summary file before loading anything else, and if the used-memory is higher,
   abort the load.
---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-20 06:12:47 +02:00
Borys
4e7800f94f
fix: UB during cmd squashing reply size calculation (#4149)
* fix: UB during cmd squashing reply size calculation

* feat: add prometheus metric commands_squashing_replies_bytes
2024-11-19 13:40:30 +02:00
Roman Gershman
794bd1cdb3
chore: tune logs and improve restrict denied error (#4145)
1. Now error stats show "restrict_denied" instead of "Cannot execute restricted command ..." error.
2. Increased verbosity level when loading a key with expired timestamp.
3. Pulled helio with better log coverage of tls_socket.cc code.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-18 23:14:14 +02:00
dependabot[bot]
907346e3e6
chore(deps): bump github/codeql-action from 3.27.1 to 3.27.4 in the actions group (#4148)
chore(deps): bump github/codeql-action in the actions group

Bumps the actions group with 1 update: [github/codeql-action](https://github.com/github/codeql-action).


Updates `github/codeql-action` from 3.27.1 to 3.27.4
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](4f3212b617...ea9e4e3799)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-18 23:05:42 +02:00
Roman Gershman
d467a348ac
fix: allow SELECT in multi/exec if it's a noop (#4146)
Fixes #4120

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-18 22:27:34 +02:00
Borys
e16ef838e4
feat: add INFO memory section for squashing replies memory consuming (#4147)
* feat: add INFO memory section for squashing replies memory consuming

* refactor: address comments
2024-11-18 21:16:41 +02:00
Borys
5e2b48c3f3
fix: migration ACK response processing (#4140)
2024-11-18 09:28:07 +02:00
Roman Gershman
ee01dc4fb5
chore: fix a potential crash during client list (#4141)
2024-11-18 06:32:30 +02:00
Daniel M
d241839cff
chore:update fakeredis, remove irrelevant tests (#4014)
* chore:update fakeredis, remove irrelevant tests
2024-11-17 20:24:46 +02:00
adiholden
59c81fb98a
fix server: fix write to slowlog on squashing flow (#4138)
Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-11-17 16:03:30 +02:00
Roman Gershman
8bd2b9ed3e
chore: optimize info command (#4137)
* chore: optimize info command

    Info command has a large latency when returning all the sections.
    But often a single section is required. Specifically,
    SERVER and REPLICATION sections are often fetched by clients
    or management components.

    This PR:
    1. Removes any hops for "INFO SERVER" command.
    2. Removes some redundant stats.
    3. Prints latency stats around GetMetrics command if it took too long.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>

* Update src/server/server_family.cc

Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>
Signed-off-by: Roman Gershman <romange@gmail.com>

* chore: remove GetMetrics dependency from the REPLICATION section

Also, address comments

Signed-off-by: Roman Gershman <roman@dragonflydb.io>

* fix: clang build

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Signed-off-by: Roman Gershman <romange@gmail.com>
Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>
2024-11-17 13:33:29 +02:00
Roman Gershman
8e3b8ccbe3
chore: run tests with list_experimental_v2 enabled (#4112)
Also fix issues with memory_test.py running locally.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-15 10:33:45 +01:00
Roman Gershman
2ff6bf35c1
chore: improve the state machine of RedisParser (#4085)
1. Simplify conditions inside the main loop.
2. Improve the logic inside ConsumeBulk() function.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-15 11:14:50 +02:00
Roman Gershman
c46cb2514f
chore: fix plain node insertion (#4134)
The blob allocation had an invalid size and the value was never copied.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-14 23:53:34 +02:00
adiholden
db67b35f8e
fix server: fix stats of pipeline squashed commands (#4132)
Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-11-14 20:57:05 +02:00
dependabot[bot]
a887d822a9
chore(deps): bump github/codeql-action from 3.27.0 to 3.27.1 in the actions group (#4115)
chore(deps): bump github/codeql-action in the actions group

Bumps the actions group with 1 update: [github/codeql-action](https://github.com/github/codeql-action).


Updates `github/codeql-action` from 3.27.0 to 3.27.1
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](662472033e...4f3212b617)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-14 19:28:25 +02:00
Shahar Mike
1513134247
fix: Use MOVED error type for moved replies (#4125)
**The problem:**

When in cluster mode, `MOVED` replies (which are arguably not even errors) are aggregated per slot-id + remote host, and displayed in `# Errorstats` as such. For example, in a server that does _not_ own 8k slots, we will aggregate 8k different errors, and their counts (in memory).

This slows down all `INFO` replies, takes a lot of memory, and also makes `INFO` replies very long.

**The fix:**

Use `type` `MOVED` for moved replies, making them all the same under `# Errorstats`

Fixes #4118
2024-11-14 13:12:38 +02:00
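The effect of the fix in the entry above (#4125) can be illustrated by comparing two aggregation keys. This is a sketch only: the reply strings, the host address, and the `error_key` helper are made up for the example and do not reflect how Dragonfly stores its error stats internally.

```python
from collections import Counter


def error_key(reply: str) -> str:
    """Return the aggregation key: the leading error-type token of a reply."""
    return reply.lstrip("-").split(" ", 1)[0]


# Hypothetical replies from a node that does not own 8000 slots.
replies = [f"-MOVED {slot} 10.0.0.1:6379" for slot in range(8000)]
replies.append("-ERR unknown command")

by_message = Counter(replies)  # one entry per slot-id + remote host
by_type = Counter(error_key(r) for r in replies)  # one entry per error type
```

Keying by full message produces 8001 distinct entries here, while keying by type collapses them into two, which keeps the `# Errorstats` section short and cheap to build.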
Shahar Mike
f04e5497bc
fix: Do not use cc_ in connection if it's null (#4131)
* fix: Do not use `cc_` in connection if it's null

This is a rare condition, which we know can happen during shutdown (see
[here](https://github.com/dragonflydb/dragonfly/pull/3873#issue-2568503374))

* add comment
2024-11-14 12:41:51 +02:00
Roman Gershman
ab6088f5d6
chore: simplify BumpUps deduplication (#4098)
* chore: simplify BumpUps deduplication

PR #2474 introduced iterator protection by
tracking which keys were bumped up during the transaction operation.
This was done by managing a keys view set. However, this can be simplified
using fingerprints. Also, fingerprints do not require that the original keys exist.

In addition, PR #3241 introduced FetchedItemsRestorer, which tracks the bumped set and
saves it to protect against fiber context switches. My claim is that it's redundant.
Since we only keep the auto-laundering iterators, when a fiber preempts, these iterators recognize it
(see IteratorT::LaunderIfNeeded) and refresh themselves anyway.

To summarize: fetched_items_ protects us from iterator invalidation during atomic scenarios,
and auto-laundering protects us from everything else, so fetched_items_ can be cleared in that case.
---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-13 17:27:53 +02:00
adiholden
fb84d47b4d
feat server: experimental_new_io flag add as deprecated (#4127)
* feat server: experimental_new_io flag add as deprecated

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-11-13 13:29:40 +00:00