There are actually a few failures fixed in this PR, only one of which is a test bug:
* `db_slice_->Traverse()` can yield, causing `fiber_cancelled_`'s value to change
* When a migration is cancelled, it may never finish `WaitForInflightToComplete()` because it has `in_flight_bytes_` that will never reach the destination due to the cancellation
* `IterateMap()` with numeric key/values overrode the key's buffer with the value's buffer
Fixes #4207
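The buffer-override failure mode can be illustrated with a minimal sketch. This is not the actual `IterateMap()` code: `FormatInt`, `SharedBufferPair` and `SeparateBufferPair` are hypothetical names, and the sketch only assumes that numeric keys and values are formatted into a scratch buffer before being handed out as views.

```cpp
#include <charconv>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical helper (not Dragonfly code): writes `v` as decimal text into
// caller-provided storage and returns a view over the written characters.
std::string_view FormatInt(long long v, char* buf, std::size_t len) {
  auto res = std::to_chars(buf, buf + len, v);
  return std::string_view(buf, static_cast<std::size_t>(res.ptr - buf));
}

// Buggy shape: both views alias the same scratch buffer, so formatting the
// value overwrites the characters that the key view still points at.
std::pair<std::string, std::string> SharedBufferPair(long long k, long long v) {
  char scratch[32];
  std::string_view key = FormatInt(k, scratch, sizeof(scratch));
  std::string_view val = FormatInt(v, scratch, sizeof(scratch));
  return {std::string(key), std::string(val)};
}

// Fixed shape: the key and the value each get their own buffer.
std::pair<std::string, std::string> SeparateBufferPair(long long k, long long v) {
  char key_buf[32];
  char val_buf[32];
  std::string_view key = FormatInt(k, key_buf, sizeof(key_buf));
  std::string_view val = FormatInt(v, val_buf, sizeof(val_buf));
  return {std::string(key), std::string(val)};
}
```

With the shared buffer, the key silently takes on the value's digits; with separate buffers both survive.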
* fix: bugs in stream code
1. Memory leak in streamGetEdgeID
2. Addresses CVE-2022-31144
3. Fixes XAUTOCLAIM bugs and adds tests.
4. Limits the count argument in XAUTOCLAIM command to 2^18 (CVE-2022-35951)
Also fixes #3830
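A bounded count argument of this kind could be validated roughly as follows. This is a sketch, not the code from this change: `ParseBoundedCount` and `kMaxCount` are hypothetical names, and the lower bound of 1 is an assumption; only the 2^18 upper bound comes from the description above.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Upper bound taken from the description above: 2^18 = 262144.
constexpr std::uint32_t kMaxCount = 1u << 18;

// Hypothetical helper: returns the parsed count, or nullopt when the argument
// is not a plain decimal integer in the range [1, kMaxCount].
// The lower bound of 1 is an assumption made for this sketch.
std::optional<std::uint32_t> ParseBoundedCount(std::string_view arg) {
  std::uint64_t value = 0;
  const char* first = arg.data();
  const char* last = first + arg.size();
  auto [ptr, ec] = std::from_chars(first, last, value);
  if (ec != std::errc() || ptr != last) return std::nullopt;  // not a number
  if (value < 1 || value > kMaxCount) return std::nullopt;    // out of range
  return static_cast<std::uint32_t>(value);
}
```

Rejecting the argument up front means the command never sizes a reply from an attacker-controlled count.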
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Signed-off-by: Roman Gershman <romange@gmail.com>
Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>
1. Use transaction time in streams code, similarly to how we do it in other commands.
Stop using mstime() and delete unused redis code.
2. Check for sequence overflow issue when passing huge sequence ids.
Add a test.
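One possible form of a sequence overflow check is sketched below. `StreamId` and `NextId` are hypothetical names and the exact check in this change may differ; the sketch only assumes that a stream ID is a pair of unsigned 64-bit numbers (milliseconds and sequence).

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stream ID layout for illustration: <ms>-<seq>.
struct StreamId {
  std::uint64_t ms;
  std::uint64_t seq;
};

// Advances `id` to the smallest ID greater than it.
// Returns false (leaving `id` untouched) when both parts are already at their
// maximum, instead of letting the sequence wrap around to zero.
bool NextId(StreamId* id) {
  constexpr std::uint64_t kMax = std::numeric_limits<std::uint64_t>::max();
  if (id->seq == kMax) {
    if (id->ms == kMax) return false;  // no larger ID exists
    ++id->ms;
    id->seq = 0;
  } else {
    ++id->seq;
  }
  return true;
}
```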
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Also fix "debug objhist" so that its value histogram will show effective malloc
used distributions for all types.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
After running `debug POPULATE 100 list 100 rand type list elements 10000`
with `--list_experimental_v2=false`:
```
type_used_memory_list:16512800
used_memory:105573120
```
When running with `--list_experimental_v2=true`:
```
used_memory:105573120
type_used_memory_list:103601700
```
TODO: does not yet handle compressed entries correctly but we do not enable compression by default.
Fixes #3800
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: Add more qlist tests
Also fix a typo bug in NodeAllowMerge.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: fix build
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
fix(search_family): Fix crash when no SEPARATOR is specified in the FT.CREATE command
Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>
* chore: go back on the decision to put a hard limit on the command interface
Limiting commands to only Transaction* and SinkReplyBuilder does not hold.
We sometimes need to access context fields for a multitude of reasons.
But I do not want to pass the huge ConnectionContext object because it then becomes
hard to track unusual access patterns.
The compromise is to introduce CommandContext, which currently has tx, rb and extended fields.
It will be relatively easy to identify irregular access patterns by tracking the extended field.
This commit is the first in a series of probably 10-15 commits. No functional changes here.
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* fix: deduplicate mget response
In case of duplicate mget keys, skip fetching the same key twice.
The optimization is straightforward: we just copy the response for the original key.
Since the response is a shallow object, we potentially save lots of memory with this
deduplication. Always deduplicate inside OpMGet.
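The deduplication idea can be sketched as follows. This is not the real `OpMGet` code: `Store`, `Response` and `MGetDedup` are hypothetical, and a `std::string_view` stands in for the shallow response object.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins: a key/value store and a shallow response that only
// references the stored value instead of copying it.
using Store = std::unordered_map<std::string, std::string>;
using Response = std::optional<std::string_view>;

// Looks up each distinct key once; duplicates reuse the response of the first
// occurrence. `lookups` reports how many store lookups were performed.
std::vector<Response> MGetDedup(const Store& store,
                                const std::vector<std::string>& keys,
                                std::size_t* lookups) {
  std::vector<Response> out(keys.size());
  std::unordered_map<std::string_view, std::size_t> first_seen;
  *lookups = 0;
  for (std::size_t i = 0; i < keys.size(); ++i) {
    auto [it, inserted] = first_seen.emplace(keys[i], i);
    if (!inserted) {
      out[i] = out[it->second];  // shallow copy of the original response
      continue;
    }
    ++*lookups;
    auto found = store.find(keys[i]);
    if (found != store.end()) out[i] = std::string_view(found->second);
  }
  return out;
}
```

For keys `a b a missing a`, only three lookups happen, and the repeated `a` entries share the first response.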
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* feat: Huge values breakdown in cluster migration
Before this PR we used `RESTORE` commands for transferring data between
source and target nodes in cluster slots migration.
While this _works_, it has a side effect of consuming 2x memory for huge
values (e.g. if a single key's value takes 10GB, serializing it will
take 20GB or even 30GB).
With this PR we break down huge keys into multiple commands (`RPUSH`,
`HSET`, etc), respecting the existing `--serialization_max_chunk_size`
flag.
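As a rough illustration of the breakdown for a list value, the sketch below splits elements into several `RPUSH` commands under a byte budget. `SplitListIntoRpush` and `Command` are hypothetical, the budget counts only element payload bytes, and the real serializer is likely to differ in how it measures and emits chunks.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical command representation: the command name followed by its arguments.
using Command = std::vector<std::string>;

// Splits `elems` into RPUSH commands whose element payload stays within
// `max_chunk_bytes`. An element larger than the budget still gets a command
// of its own, so every command carries at least one element.
std::vector<Command> SplitListIntoRpush(const std::string& key,
                                        const std::vector<std::string>& elems,
                                        std::size_t max_chunk_bytes) {
  std::vector<Command> cmds;
  Command cur;
  std::size_t cur_bytes = 0;
  for (const std::string& e : elems) {
    // Flush the current command if adding `e` would exceed the budget.
    if (!cur.empty() && cur_bytes + e.size() > max_chunk_bytes) {
      cmds.push_back(std::move(cur));
      cur.clear();
      cur_bytes = 0;
    }
    if (cur.empty()) {
      cur.push_back("RPUSH");
      cur.push_back(key);
    }
    cur.push_back(e);
    cur_bytes += e.size();
  }
  if (!cur.empty()) cmds.push_back(std::move(cur));
  return cmds;
}
```

The target applies the commands in order, so only one chunk has to be held in serialized form at a time.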
Part of #4100
The long-term goal is to make the parser consume the whole input when
it returns INPUT_PENDING. It requires several baby-step PRs.
This PR:
1. Adds more invariant checks
2. Avoids calling RedisParser::Parse with an empty buffer.
3. In bulk string parsing, removes the redundant "optimization" of rejecting partial strings of less than 32 bytes;
in other words, small parts are consumed as well. The unit test is adjusted accordingly.
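The "consume small parts as well" behaviour can be sketched with a simplified reader for a bulk string body. This is not `RedisParser`: `BulkBodyReader` and this `ParseResult` enum are hypothetical, and the trailing CRLF handling is deliberately left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical result codes, named after the INPUT_PENDING state mentioned above.
enum class ParseResult { OK, INPUT_PENDING };

// Reads the body of a bulk string whose length is already known.
// Simplification: the trailing CRLF of the RESP encoding is ignored.
class BulkBodyReader {
 public:
  explicit BulkBodyReader(std::size_t len) : remaining_(len) {}

  // Consumes every available byte that belongs to the body, however few, and
  // reports the number of bytes taken through `consumed`.
  ParseResult Feed(std::string_view input, std::size_t* consumed) {
    std::size_t n = std::min(remaining_, input.size());
    body_.append(input.substr(0, n));
    remaining_ -= n;
    *consumed = n;
    return remaining_ == 0 ? ParseResult::OK : ParseResult::INPUT_PENDING;
  }

  const std::string& body() const { return body_; }

 private:
  std::size_t remaining_;
  std::string body_;
};
```

Because every call drains what it can, the caller does not have to keep a partially filled buffer around between reads.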
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
1. Better logging in regtests
2. Release resources in dfly_main in a more controlled manner.
3. Switch to ignoring signals when unregistering signal handlers during the shutdown.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* fix: Huge entries fail to load outside RDB / replication
We have an internal utility tool that we use to deserialize values in
some use cases:
* `RESTORE`
* Cluster slot migration
* `RENAME`, if the source and target shards are different
We [recently](https://github.com/dragonflydb/dragonfly/issues/3760)
changed this area of the code, which caused this regression as it only
handled RDB / replication streams.
Fixes #4143
Fixes #4150. The failure can be reproduced with high probability on ARM via
`pytest dragonfly/replication_test.py -k test_replication_all[df_factory0-mode0-8-t_replicas3-seeder_config3-2000-False]`
It is not clear why this barrier is needed, but #4146 removes the barrier,
which breaks a delicate balance in the code in an unexpected way.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* fix: enforce load limits when loading snapshot
Prevent loading snapshots with used memory higher than max memory limit.
1. Store the used memory metadata only inside the summary file
2. Load the summary file before loading anything else, and if the used memory is higher
than the max memory limit, abort the load.
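The check described above amounts to comparing one number from the summary file against the configured limit before any data is read. In the sketch below, `SnapshotSummary` and `CheckLoadLimit` are hypothetical names and the error text is made up for illustration.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical view of the metadata stored in the summary file.
struct SnapshotSummary {
  std::uint64_t used_memory;  // bytes used when the snapshot was taken
};

// Returns an error message when the snapshot should not be loaded,
// or nullopt when loading may proceed.
std::optional<std::string> CheckLoadLimit(const SnapshotSummary& summary,
                                          std::uint64_t max_memory) {
  if (summary.used_memory > max_memory) {
    return "snapshot used memory " + std::to_string(summary.used_memory) +
           " exceeds max memory limit " + std::to_string(max_memory);
  }
  return std::nullopt;
}
```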
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
1. Now error stats show "restrict_denied" instead of "Cannot execute restricted command ..." error.
2. Increased verbosity level when loading a key with an expired timestamp.
3. Pulled helio with better log coverage of tls_socket.cc code.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: optimize info command
The INFO command has high latency when returning all the sections.
But often a single section is required. Specifically,
SERVER and REPLICATION sections are often fetched by clients
or management components.
This PR:
1. Removes any hops for "INFO SERVER" command.
2. Removes some redundant stats.
3. Prints latency stats around the GetMetrics command if it took too long.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* Update src/server/server_family.cc
Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>
Signed-off-by: Roman Gershman <romange@gmail.com>
* chore: remove GetMetrics dependency from the REPLICATION section
Also, address comments
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* fix: clang build
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Signed-off-by: Roman Gershman <romange@gmail.com>
Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>