1. Use libc malloc for allocating fiber stacks and reduce the fiber stack size.
libc malloc uses the default 4KB OS pages when reserving memory from the OS.
We do not use mi_malloc here because mimalloc is set up to use 2MB large OS pages.
Allocating stacks, however, is one of the cases where smaller 4KB memory pages are actually more
RSS-efficient, because memory pages become hot at a finer granularity.
2. Add "memory_fiberstack_vms_bytes" metric exposing fiber stack vm usage.
3. Fix macOS dependencies & update CI versions.
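The page-granularity argument in item 1 can be illustrated with a small model. This is only a sketch under simplifying assumptions: stacks are laid out back to back, each fiber touches a contiguous range at one end of its stack, and a page becomes resident as soon as any byte in it is touched. The function name and the numbers (16 stacks of 128KB, 8KB touched each) are hypothetical and are not taken from Dragonfly.

```python
def resident_bytes(n_stacks, stack_size, touched, page_size):
    """Bytes kept resident when each stack touches `touched` bytes.

    Simplified model (an assumption, not Dragonfly's allocator): stacks are
    laid out back to back starting at address 0, and every fiber touches a
    contiguous range of `touched` bytes (touched > 0) at one end of its stack.
    """
    pages = set()
    for i in range(n_stacks):
        start = i * stack_size
        first = start // page_size
        last = (start + touched - 1) // page_size
        # Every page overlapping the touched range becomes resident.
        pages.update(range(first, last + 1))
    return len(pages) * page_size


KB = 1024
MB = 1024 * KB

# Hypothetical workload: 16 stacks of 128KB, each touching 8KB.
small_pages = resident_bytes(16, 128 * KB, 8 * KB, 4 * KB)
large_pages = resident_bytes(16, 128 * KB, 8 * KB, 2 * MB)
print(small_pages, large_pages)
```

In this model the 4KB case keeps only the touched pages resident, while the 2MB case keeps the whole reserved range resident, because every large page contains at least one touched byte.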
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* feat(json): Deserialize ReJSON format
This PR adds support for Redis-based JSON RDB format deserialization.
Since Redis uses ReJSON as a module, serialization is slightly different
from other types, but overall it's not a big change once we know where
all bits should be.
While this change knows how to _read_ Redis-based JSON keys, it does not
_save_ them in Redis format. That will be in a different PR.
This PR also ignores unknown (non-keys) module data instead of failing the load.
Fixes #2718
* Cleanup
* Add tests
* Skip unsupported modules
* Small refactor
1. Replaces run_barrier as a synchronization point with is_armed + an embedded blocking counter for awaiting running jobs
2. Replaces the IsArmedInShard + GetLocalMask + is_armed.exchange chain with a single DisarmInShard() / DisarmInShardWhen()
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
1. Pipeline squashing was not recorded
2. Apparently Redis counts the commands of MULTI/EXEC transactions separately, so I assume we should too
-> Place RecordCmd() in Invoke()
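The idea behind moving the recording into the invocation path can be sketched as follows. This is a toy model, not Dragonfly's code: `Dispatcher`, `dispatch`, `exec_multi` and `squash_pipeline` are hypothetical names, and only `RecordCmd()` and `Invoke()` come from the notes above. When every execution path funnels through one `invoke()`, recording there covers plain commands, commands inside MULTI/EXEC, and squashed pipelines alike.

```python
from collections import Counter


class Dispatcher:
    """Toy dispatcher with a single recording point (hypothetical names)."""

    def __init__(self):
        self.cmd_calls = Counter()

    def record_cmd(self, name):
        # Stand-in for RecordCmd(): bump the per-command call counter.
        self.cmd_calls[name.upper()] += 1

    def invoke(self, name):
        # Stand-in for Invoke(): the one place where stats are recorded.
        self.record_cmd(name)

    def dispatch(self, name):
        # A regular, standalone command.
        self.invoke(name)

    def exec_multi(self, queued):
        # Each queued command of a MULTI/EXEC transaction is counted separately.
        for name in queued:
            self.invoke(name)

    def squash_pipeline(self, pipelined):
        # Squashed pipeline commands go through the same path.
        for name in pipelined:
            self.invoke(name)


d = Dispatcher()
d.dispatch("GET")
d.exec_multi(["SET", "GET"])
d.squash_pipeline(["GET", "GET"])
print(dict(d.cmd_calls))
```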
* chore: prevent crashing upon inconsistent expiry table
Also, introduce "DFLY LOAD <filename>" command in addition to "DEBUG LOAD"
as an official command to load snapshots into the running server.
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
We already have a Fiber-aware DNS resolver in Helio, so it's trivial to
change and use.
I tested this end-to-end and it really resolves DNS addresses, not just
localhost.
Fixes #947
* chore: improve compatibility of EXPIRE functions with Redis
Also, report the module name when encountering module data that cannot be loaded
by Dragonfly.
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* feat(server): Use mimalloc in SSL calls
Until now, OpenSSL used `malloc()` directly. This PR overrides it to use
mimalloc.
Fixes #2709
* Add generate-tls-files.sh
* feat(cluster): Add `--cluster_id` flag
This flag sets the unique ID of a node in a cluster.
It is UB (and bad) to assign the same ID to multiple nodes in the same
cluster.
If unset (default), the `master_replid` (previously known as `master_id`) is used.
Fixes #2643
Related to #2636
* gh comments
* oops - revert line removal
* fix
* replica
* disallow cluster_node_id in emulated mode
* fix replica test
* chore: add malloc-based stats and decommit
Provides more stats and control with the glibc malloc-based allocator.
For example,
with v1.15.0 (--proactor_threads=2) and an empty database, `info memory` returns:
```
used_memory:614576
used_memory_human:600.2KiB
used_memory_peak:614576
used_memory_peak_human:600.2KiB
used_memory_rss:19922944
used_memory_rss_human:19.00MiB
```
Then, during `memtier_benchmark -n 300000 --key-maximum 100000 --ratio 0:1 --threads=30 -c 100` (i.e. GET-only with 3k connections):
```
used_memory:614576
used_memory_human:600.2KiB
used_memory_peak:614576
used_memory_peak_human:600.2KiB
used_memory_rss:59985920
used_memory_rss_human:57.21MiB
used_memory_peak_rss:59985920
```
Connection overhead grows by ~39MB.
When the traffic stops, `used_memory_rss_human` becomes `30.35MiB`,
and we do not know where the extra 11MB is held; `MEMORY DECOMMIT` does not reduce the RSS.
With this change, `memory malloc-stats` returns the following during the memtier traffic:
```
malloc arena: 394862592
malloc fordblks: 94192
```
i.e. 395MB of virtual memory was allocated by malloc and only 94KB of it is in chunks available for reuse.
These 395MB are arena virtual memory, not RSS, but at least we have some visibility into malloc reservations.
The RSS usage is the same ~57MB; the difference between virtual memory and RSS exists because we reserve fiber stacks of 131KB each but touch only part of each stack.
After the traffic stops, `arena` is reduced to 134520832 bytes and `fordblks` is 133016592, i.e. the majority of the reserved ranges are also free (available for reuse) in the malloc pools.
RSS goes down to ~31MB, similarly to before.
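The arithmetic above can be reproduced with a short script. This is a sketch that assumes the output of `memory malloc-stats` contains the two `malloc <name>: <bytes>` lines shown earlier; the real output may have more lines, and the helper names are hypothetical. Treating `arena - fordblks` as "in use" is only a rough reading of the two counters.

```python
def parse_malloc_stats(text):
    """Parse lines like 'malloc arena: 394862592' into {'arena': 394862592}."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep or not key.startswith("malloc "):
            continue
        try:
            stats[key[len("malloc "):].strip()] = int(value)
        except ValueError:
            # Skip lines whose value is not a plain integer.
            continue
    return stats


def in_use_bytes(stats):
    # Rough estimate: reserved arena minus free chunks available for reuse.
    return stats["arena"] - stats["fordblks"]


during = parse_malloc_stats("malloc arena: 394862592\nmalloc fordblks: 94192\n")
after = {"arena": 134520832, "fordblks": 133016592}

print(in_use_bytes(during))  # bytes not free during the traffic
print(in_use_bytes(after))   # bytes not free after the traffic stops
```

With the numbers quoted above, nearly all of the arena is in use during the traffic, while only about 1.5MB of it remains in use afterwards.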
So far, this only demonstrates the increased visibility into mmapped ranges reserved by glibc malloc.
The additional functional change is in `MEMORY DECOMMIT` that now trims malloc RSS usage from reserved but unused (fordblks) pages
by calling `malloc_trim`.
After the call, RSS is: `used_memory_rss_human:20.29MiB` which is almost the same as when we started the empty process.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: fix build for older glibc environments
Disable these extensions for Alpine and use the legacy version
for older glibc libraries.
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
In the fiber we used to call `mi_heap_collect()` when we're done
deleting items. But since that fiber captures a `vector` of intrusive
pointers to `DbTable`s, it can't free all memory used by the tables
themselves.
A local test shows that this fix helps almost entirely: with a DB
occupying 5GB, `FLUSHALL` reduces RSS by 4.7GB, leaving 300MB still used. A
follow-up `MEMORY DECOMMIT` *will* indeed release these 300MB, but I'm
still not sure why they are not released immediately. Still looking...
Addresses (1) of #2690
* chore: add oom stats to /metrics
Expose oom/cmd errors when we reject executing a command if we reached OOM state (controlled by oom_deny_ratio flag).
Expose oom/insert errors when we do not insert a new key or do not grow a dashtable (controlled by table_growth_margin).
Move OOM command check to a place that covers all types of transactions - including multi and squashing transactions.
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
1. Add back the search files to the macOS build (linker errors are fixed now).
2. Add default maxmemory argument (if not present already) when launching dragonfly process in regression tests.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Due to its old Lua engine and a bunch of historical reasons, Redis returns floats as integers from Lua scripts.
This means `eval "return 42.9" 0` would return 42 as a long integer.
Dragonfly supports both integers and floats in its Lua engine, returning a precise "42.9" in the same scenario.
RESP2 does not support float types, so "42.9" is returned as a bulk string for RESP2 connections. For RESP3, Dragonfly
returns 42.9 as a native RESP3 double primitive.
This PR introduces an optional legacy behavior for Dragonfly, for the RESP2 protocol only. When the `--lua_resp2_legacy_float` flag is passed,
Dragonfly will round down the double value to the nearest integer and return it as a native RESP2 long integer.
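The three reply shapes described above can be sketched as wire-level encodings. This is an illustration only, not Dragonfly's implementation: `encode_lua_number` and its parameters are hypothetical, and the use of `math.floor` follows the "round down" wording, so the result for negative values is an assumption.

```python
import math


def encode_lua_number(value, resp_version, legacy_float=False):
    """Encode a Lua number as a RESP reply (hypothetical helper)."""
    if isinstance(value, int):
        # Integers are a native type in both RESP2 and RESP3.
        return b":%d\r\n" % value
    if resp_version >= 3:
        # RESP3 has a native double type, prefixed with ','.
        return b",%s\r\n" % repr(value).encode()
    if legacy_float:
        # Legacy RESP2 mode: round down and reply with an integer.
        # Assumption: floor semantics, as suggested by "round down".
        return b":%d\r\n" % math.floor(value)
    # Default RESP2 mode: the float is sent as a bulk string.
    text = repr(value).encode()
    return b"$%d\r\n%s\r\n" % (len(text), text)


print(encode_lua_number(42.9, resp_version=2))
print(encode_lua_number(42.9, resp_version=2, legacy_float=True))
print(encode_lua_number(42.9, resp_version=3))
```

In this sketch the legacy flag has no effect on RESP3 connections, matching the statement that the legacy behavior applies to RESP2 only.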
Fixes #2664
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* WIP: `cluster_mgr.py` to work with remote targets
* Documentation
* No admin port
* Support different hostname move/migrate
* Fix migrate bug
* Fix typo in --help
* fix test
* self.update_id()