* fix truncation of the timeout red dots on CI failures
* fix warnings about the deprecated use of `with timeout`
* remove `@pytest.mark.dbg_only` as it doesn't exist
---------
Signed-off-by: kostas <kostas@dragonflydb.io>
* chore(generic_family): Fix bad data format error in the RESTORE command
Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com>
* chore(generic_family): Remove the error reply for OpStatus::WRONG_TYPE in the RESTORE command, since it is no longer in use
Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com>
---------
Signed-off-by: Stepan Bagritsevich <bagr.stepan@gmail.com>
There are some problematic flows. First, we did not handle deletions, so all sorts of consistency issues could arise while interleaving DbSlice::Traverse() and DbSlice::Del(). Second, we did not handle FlushAll (same as before: Traverse() preempts and FlushAll() kicks in). Third, we did not handle expirations.
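For illustration only (all names below are hypothetical, not the actual DbSlice API), a traversal can tolerate such interleavings by advancing a stable cursor instead of holding iterators, and by re-checking a flush epoch after every preemption point:

```cpp
// Sketch: a preemptible traversal that survives concurrent deletions,
// FlushAll and expirations. Types and names are illustrative only.
#include <cstdint>
#include <functional>
#include <map>
#include <optional>

struct Table {
  std::map<int, int> items;   // key -> value; erase() models Del()/expiration
  uint64_t flush_epoch = 0;   // bumped by FlushAll()

  // Next existing key strictly greater than `cursor`; a cursor, unlike an
  // iterator, stays valid when entries are deleted during preemption.
  std::optional<int> NextKey(int cursor) const {
    auto it = items.upper_bound(cursor);
    if (it == items.end()) return std::nullopt;
    return it->first;
  }
};

void Traverse(const Table& table, const std::function<void(int)>& cb,
              const std::function<void()>& preempt) {
  uint64_t epoch = table.flush_epoch;
  int cursor = -1;
  while (auto key = table.NextKey(cursor)) {
    cb(*key);
    cursor = *key;
    preempt();  // may yield to Del(), FlushAll() or expirations
    if (table.flush_epoch != epoch)
      return;  // the table was flushed while we slept: stop, don't touch stale state
  }
}
```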
---------
Signed-off-by: kostas <kostas@dragonflydb.io>
**Background**
In v1.21.0 we introduced support for `--announce_ip`, which lets replicas
announce their public IP addresses.
Like Valkey, the replica uses `REPLCONF IP-ADDRESS` to announce its IP
address to the master.
**The issue**
Older Dragonfly releases (<1.21) did not support this feature: the
master side simply returned an error for such `REPLCONF` attempts.
However, the replica code treated that error as fatal and failed the
replication, making the versions incompatible.
**The fix**
The fix is simple: just log an error if the master did not respect
`REPLCONF IP-ADDRESS`. We can make this non-optional again in the
future.
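As an illustration of the tolerant handling (hypothetical helper, not the actual replica code), the idea is simply to treat a master-side error for this specific `REPLCONF` option as a logged warning rather than a replication failure:

```cpp
// Sketch only: decide whether a REPLCONF IP-ADDRESS reply should abort replication.
#include <iostream>
#include <string_view>

// Returns true if replication may continue despite the reply.
bool HandleAnnounceIpReply(std::string_view reply) {
  if (reply == "+OK") return true;        // master supports the feature
  if (reply.substr(0, 4) == "-ERR") {     // older master (<1.21): tolerate and log
    std::cerr << "Master did not respect REPLCONF IP-ADDRESS, continuing without it: "
              << reply << '\n';
    return true;
  }
  return false;                           // anything else is unexpected
}

int main() {
  std::cout << HandleAnnounceIpReply("-ERR Unrecognized REPLCONF option") << '\n';  // 1
  std::cout << HandleAnnounceIpReply("+OK") << '\n';                                // 1
}
```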
In addition, I added a regression test to make sure we remain
backwards compatible with v1.19.2. We'll bump this version up every
once in a while.
* feat: Allow pre-declaring Lua SHAs to run with undeclared keys
By using `--lua_undeclared_keys_shas=SHA,SHA,SHA` users can now specify
which scripts should run globally (undeclared keys) without explicit
support from the scripts themselves.
Fixes #2442
* chore: add timeout for replication sockets
Master will stop the replication flow if writes cannot progress for more than K milliseconds.
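Conceptually the check looks like the sketch below (hypothetical helper, not Dragonfly's socket layer): any write progress resets the deadline, and exceeding it cancels the flow for that replica.

```cpp
// Sketch: abort a replication stream when writes make no progress for timeout_ms.
#include <chrono>
#include <cstddef>
#include <functional>

using Clock = std::chrono::steady_clock;

// `try_write` attempts a non-blocking write and returns bytes written (0 = would block).
// A real implementation would poll/yield instead of spinning between attempts.
bool WriteWithTimeout(const char* buf, size_t len, int timeout_ms,
                      const std::function<size_t(const char*, size_t)>& try_write) {
  auto last_progress = Clock::now();
  size_t written = 0;
  while (written < len) {
    size_t n = try_write(buf + written, len - written);
    if (n > 0) {
      written += n;
      last_progress = Clock::now();  // any progress resets the deadline
    } else if (Clock::now() - last_progress > std::chrono::milliseconds(timeout_ms)) {
      return false;  // caller stops the replication flow for this replica
    }
  }
  return true;
}
```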
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Signed-off-by: Roman Gershman <romange@gmail.com>
Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>
* chore: change how we track memory_budget during evictions
Previously we compared memory_budget against 0 before inserting a new item into DbSlice,
and retired cool pages if we were low on memory.
The problem: when we decide whether to allow growing a table, we estimate the possible memory increase due to the future table growth,
and the memory check described above was not consistent with the actual logic that rejected the insertion.
Moreover, the interaction between memory_budget tracking and EvictionPolicy was over-complicated: we passed the memory_budget counter to the evp object
and then read it back, even though evp did not track the memory impact of deleting objects during evictions.
Now we remove from evp the responsibility of updating memory_budget_, so it is updated solely by DbSlice.
We also update memory_budget_ during deletions, and when we pass it to evp we add the cool-memory size as a potential memory resource, to avoid
rejections in case we have lots of cool memory.
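The intended accounting can be sketched roughly as follows (SliceBudget and its members are hypothetical names, not the real DbSlice fields): the slice alone owns the budget, updates it on both inserts and deletions, and the eviction policy only sees an effective budget that already includes cool memory.

```cpp
// Sketch of the accounting described above; names are illustrative only.
#include <cstddef>
#include <cstdint>

class SliceBudget {
 public:
  explicit SliceBudget(int64_t initial) : memory_budget_(initial) {}

  void OnInsert(size_t obj_bytes) { memory_budget_ -= static_cast<int64_t>(obj_bytes); }
  void OnDelete(size_t obj_bytes) { memory_budget_ += static_cast<int64_t>(obj_bytes); }

  // Budget handed to the eviction policy: cool pages can be reclaimed on demand,
  // so they count as available memory and do not cause spurious rejections.
  int64_t EffectiveBudget(size_t cool_memory_bytes) const {
    return memory_budget_ + static_cast<int64_t>(cool_memory_bytes);
  }

  // Grow check: the estimated growth cost must fit into the effective budget.
  bool CanGrowTable(size_t estimated_growth_bytes, size_t cool_memory_bytes) const {
    return EffectiveBudget(cool_memory_bytes) >= static_cast<int64_t>(estimated_growth_bytes);
  }

 private:
  int64_t memory_budget_;  // updated only by the slice, never by the eviction policy
};
```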
Fixes #3456
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
os.remove(LAST_LOGS) might throw an exception if the file does not exist, which we do not handle. Wrap it in a try/except block.
* wrap os.remove in try/except
---------
Signed-off-by: kostas <kostas@dragonflydb.io>
The quicklist.* files are mostly equal to the versions at commit 0fc43edc6
Also, removed some unused code.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
The env variables exported when regression tests time out do not work properly, and the if statement in the "Print last log on timeout" action step would fail to read and upload the files listed in /tmp/last_log_file.txt. Another problem is the job.timeout argument, which kills the whole job/matrix before the upload-log step has a chance to run. For that we need manual timeouts in the workflow, similar to what we do in the regression tests action.
* remove the "Print last log on timeout" action step
* copy the logs on timeout directly within the timeout step
* replace the global timeout on the CI workflow with a timeout command per step
---------
Signed-off-by: kostas <kostas@dragonflydb.io>
Before this change, PAUSE only paused the reconnection reconciler flow;
now it also stops the ongoing full sync replication if one exists.
In addition, this PR applies some clean-ups and removes redundant code.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: simplify master replication cancellation interface
Before this change, CancelReplication did too many things; moreover,
we had StopReplication, which did the same.
This PR moves CancelReplication under the ReplicaInfo struct
and reduces code duplication around this change.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* Update src/server/dflycmd.cc
Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>
Signed-off-by: Roman Gershman <romange@gmail.com>
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Signed-off-by: Roman Gershman <romange@gmail.com>
Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>
* chore: reorganize EngineShard::Heartbeat
1. Simplify CacheStats by using accessors provided directly by DbSlice.
2. Separate out the eviction performed for tiering, as tiering can be done on a replica (see the sketch below).
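A rough sketch of that split, assuming (as item 2 suggests) that cache eviction stays master-only while tiering offload may also run on a replica; all names are illustrative, not the real EngineShard members:

```cpp
// Sketch only: heartbeat with cache eviction gated on the master role,
// while tiering offload runs on replicas as well.
#include <functional>

struct HeartbeatDeps {
  bool is_replica = false;
  std::function<void()> update_cache_stats;  // stats via accessors from the slice
  std::function<void()> evict_for_cache;     // cache-mode eviction (master only)
  std::function<void()> offload_to_tier;     // tiering can run on replicas too
};

void Heartbeat(const HeartbeatDeps& d) {
  d.update_cache_stats();
  if (!d.is_replica)
    d.evict_for_cache();
  d.offload_to_tier();
}
```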
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: improve replication locks
Allow non-exclusive, read-only access to the Dfly::ReplicaInfo structure.
The most important change is in DflyCmd::CancelReplication: before,
it locked the ReplicaInfo mutex and then proceeded to lock the global mutex.
This is dangerous because most operations lock them in the opposite order (see the sketch below).
Also rename the ambiguous GetReplicaInfo accessors to clearer names.
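The hazard and the safer ordering can be shown with a generic sketch (mutex and struct names are illustrative, not the actual DflyCmd members):

```cpp
// Sketch: lock-order inversion. If one path takes the per-replica mutex before
// the global mutex while every other path does the opposite, they can deadlock.
#include <mutex>
#include <shared_mutex>

struct ReplicaInfo {
  std::shared_mutex mu;  // shared_mutex permits non-exclusive, read-only access
  // ... replication state ...
};

std::mutex global_mu;

void SafeCancel(ReplicaInfo& ri) {
  // Acquire in the same order as everyone else: global first, then per-replica.
  std::lock_guard global_lk(global_mu);
  std::unique_lock ri_lk(ri.mu);
  // ... cancel the replication flow ...
}

void ReadOnlyInspect(ReplicaInfo& ri) {
  std::shared_lock ri_lk(ri.mu);  // readers do not block each other
  // ... read replication state without mutating it ...
}
```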
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: comments
* chore: comments
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>