The main change here is to add a helper function
`gather_optional_coroutines`, which works in a similar way as
`yieldable_gather_results` but takes a set of coroutines rather than a
function
Fixes#17823
While we're at it, makes a change where the redactions are sent as the
admin if the user is not a member of the server (otherwise these fail
with a "User must be our own" message).
Reset `sliding_sync_membership_snapshots` -> `forgotten` status when
membership changes (like rejoining a room).
Fix https://github.com/element-hq/synapse/issues/17781
### What was the problem before?
Previously, if someone used `/forget` on one of their rooms, it would
update `sliding_sync_membership_snapshots` as expected but when someone
rejoined the room (or had any membership change), the upsert didn't
overwrite and reset the `forgotten` status so it remained `forgotten`
and invisible down the Sliding Sync endpoint.
This adds Python 3.13.0 to the trial test matrix
Also updates `cffi` and `zope.interface` in the locked dependencies to
make sure we have versions compatible with Python 3.13. For some
reasons, they are not being picked up by dependabot.
Bumps [mypy](https://github.com/python/mypy) from 1.10.1 to 1.11.2.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/python/mypy/blob/master/CHANGELOG.md">mypy's
changelog</a>.</em></p>
<blockquote>
<h3>Mypy 1.11.2</h3>
<ul>
<li>Alternative fix for a union-like literal string (Ivan Levkivskyi, PR
<a
href="https://redirect.github.com/python/mypy/pull/17639">17639</a>)</li>
<li>Unwrap <code>TypedDict</code> item types before storing (Ivan
Levkivskyi, PR <a
href="https://redirect.github.com/python/mypy/pull/17640">17640</a>)</li>
</ul>
<h3>Acknowledgements</h3>
<p>Thanks to all mypy contributors who contributed to this release:</p>
<ul>
<li>Alex Waygood</li>
<li>Alexander Leopold Shon</li>
<li>Ali Hamdan</li>
<li>Anders Kaseorg</li>
<li>Ben Brown</li>
<li>Bénédikt Tran</li>
<li>bzoracler</li>
<li>Christoph Tyralla</li>
<li>Christopher Barber</li>
<li>dexterkennedy</li>
<li>gilesgc</li>
<li>GiorgosPapoutsakis</li>
<li>Ivan Levkivskyi</li>
<li>Jelle Zijlstra</li>
<li>Jukka Lehtosalo</li>
<li>Marc Mueller</li>
<li>Matthieu Devlin</li>
<li>Michael R. Crusoe</li>
<li>Nikita Sobolev</li>
<li>Seo Sanghyeon</li>
<li>Shantanu</li>
<li>sobolevn</li>
<li>Steven Troxler</li>
<li>Tadeu Manoel</li>
<li>Tamir Duberstein</li>
<li>Tushar Sadhwani</li>
<li>urnest</li>
<li>Valentin Stanciu</li>
</ul>
<p>I’d also like to thank my employer, Dropbox, for supporting mypy
development.</p>
<h2>Mypy 1.10</h2>
<p>We’ve just uploaded mypy 1.10 to the Python Package Index (<a
href="https://pypi.org/project/mypy/">PyPI</a>). Mypy is a static type
checker for Python. This release includes new features, performance
improvements and bug fixes. You can install it as follows:</p>
<pre><code>python3 -m pip install -U mypy
</code></pre>
<p>You can read the full documentation for this release on <a
href="http://mypy.readthedocs.io">Read the Docs</a>.</p>
<h3>Support TypeIs (PEP 742)</h3>
<p>Mypy now supports <code>TypeIs</code> (<a
href="https://peps.python.org/pep-0742/">PEP 742</a>), which allows</p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="789f02c83a"><code>789f02c</code></a>
Bump version to 1.11.2</li>
<li><a
href="917cc75fd6"><code>917cc75</code></a>
An alternative fix for a union-like literal string (<a
href="https://redirect.github.com/python/mypy/issues/17639">#17639</a>)</li>
<li><a
href="7d805b364e"><code>7d805b3</code></a>
Unwrap TypedDict item types before storing (<a
href="https://redirect.github.com/python/mypy/issues/17640">#17640</a>)</li>
<li><a
href="32675dddfa"><code>32675dd</code></a>
Revert "Fix Literal strings containing pipe characters" (<a
href="https://redirect.github.com/python/mypy/issues/17638">#17638</a>)</li>
<li><a
href="778542b93a"><code>778542b</code></a>
Revert "Fix <code>RawExpressionType.accept</code> crash with
<code>--cache-fine-grained</code>" (<a
href="https://redirect.github.com/python/mypy/issues/1">#1</a>...</li>
<li><a
href="14ab742dec"><code>14ab742</code></a>
Bump version to 1.11.2+dev</li>
<li><a
href="570b90a7a3"><code>570b90a</code></a>
Bump version to 1.11</li>
<li><a
href="b3a102ef31"><code>b3a102e</code></a>
Fix <code>RawExpressionType.accept</code> crash with
<code>--cache-fine-grained</code> (<a
href="https://redirect.github.com/python/mypy/issues/17588">#17588</a>)</li>
<li><a
href="aec04c7448"><code>aec04c7</code></a>
Fix PEP 604 isinstance caching (<a
href="https://redirect.github.com/python/mypy/issues/17563">#17563</a>)</li>
<li><a
href="cb44e4d8f1"><code>cb44e4d</code></a>
Fix <code>typing.TypeAliasType</code> being undefined on python <
3.12 (<a
href="https://redirect.github.com/python/mypy/issues/17558">#17558</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/python/mypy/compare/v1.10.1...v1.11.2">compare
view</a></li>
</ul>
</details>
<br />
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=mypy&package-manager=pip&previous-version=1.10.1&new-version=1.11.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Spawning from @kegsay [pointing
out](https://matrix.to/#/!cnVVNLKqgUzNTOFQkz:matrix.org/$ExOO7J8uPUQSyH-9Uxc_QCa8jlXX9uK4VRtkSC0EI3o?via=element.io&via=matrix.org&via=jki.re)
that the Sliding Sync endpoint doesn't handle a large room with a lot of
state well on initial sync (requesting all state via `required_state: [
["*","*"] ]`) (it just takes forever).
After investigating further, the slow part is just
`get_events_as_list(...)` fetching all of the current state ID's out for
the room (which can be 100k+ events for rooms with a lot of membership).
This is just a slow thing in Synapse in general and the same thing
happens in Sync v2 or the `/state` endpoint.
---
The only idea I had to improve things was to use `batch_iter` to only
try fetching a fixed amount at a time instead of working with large
maps, lists, and sets. This doesn't seem to have much effect though.
There is already a `batch_iter(event_ids, 200)` in
`_fetch_event_rows(...)` for when we actually have to touch the database
and that's inside a queue to deduplicate work.
I did notice one slight optimization to use `get_events_as_list(...)`
directly instead of `get_events(...)`. `get_events(...)` just turns the
result from `get_events_as_list(...)` into a dict and since we're just
iterating over the events, we don't need the dict/map.
Fixes https://github.com/element-hq/synapse/issues/17698
This handles `required_state` changes by checking if new state has been
added to the config, and if so fetching and returning that from the
current state.
This also takes care to ensure that given a state entry S that is added,
removed and then re-added that we do *not* send S down a second time if
there have been no changes to S in the current state. This is fine for
Rust SDK (as it just remembers all state), but we might decide not to do
this behaviour in the MSC. If we decide to always send down S then its
easy enough to rip out all the code.
---------
Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
c.f. https://wiki.ubuntu.com/Releases for the currently supported Ubuntu
releases.
Note: this removes support for 23.04 and 23.10, which are EOL.
Fixes#17811
- better validation on user input
- fix an early task completion
- when checking membership in rooms, check for rooms user has been
banned from as well
Two changes: a) use a batch lookup function instead of a loop, b) check
existing data to see if we already have what we need and only fetch what
we don't.
Adds the option to load the Redis password from a file, instead of
giving it in the config directly. The code is similar to how it’s done
for `registration_shared_secret_path`. I changed the example in the
documentation to represent the best practice regarding the handling of
secrets.
Reading secrets from files has the security advantage of separating the
secrets from the config. It also simplifies secrets management in
Kubernetes.
Added a note in the documentation suggesting that users may set
`PYTHONMALLOC=malloc` when using `jemalloc`. This allows jemalloc to
track memory usage more accurately by bypassing Python's internal
small-object allocator (`pymalloc`), helping to ensure that
`cache_autotuning` functions as expected.
This doc change aims to provide more clarity for users configuring
jemalloc with Synapse.
Based on:
4ac783549c/synapse/metrics/jemalloc.py (L198-L201)
There is a bug with the `StreamChangeCache` where it would incorrectly
return that all entities had changed if asked for entities changed
*since* the earliest stream position.
Note that for streams we use the inequalities: `$min_stream_id <
stream_id <= $max_stream_id`, i.e. when we ask the stream change cache
for all things that have changed since `$stream_id` we don't care for
events that happened *at* `$stream_id`.
Specifically: `_earliest_known_stream_pos` is the position at which we
know that we'll have entries for all changes since that point, we can
use the cache for any stream IDs that equal
`_earliest_known_stream_pos`.
`_earliest_known_stream_pos` is set in three places:
- On startup we set it either to:
- the current maximum stream ID, with not prefilled values; or
- the minimum of the latest N values we pulled from the DB
- When we evict items from the bottom, we set it to the stream ID of the
evicted items.
This was changed in https://github.com/matrix-org/synapse/pull/14435,
but I think we were overly conservative there.
---------
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Based on #17765.
Basically the idea is to reduce the overhead of calling
`ObservableDeferred` in a loop. The two gains are: a) just using a list
of deferreds rather than the machinery of `ObservableDeferred`, and b)
only calling `PreseverLoggingContext` once.
`PreseverLoggingContext` in particular is expensive to call a lot as
each time it needs to call `get_thread_resource_usage` twice, so that it
an update the CPU metrics of the log context.
The notifier is quite inefficient when it has to wake up many user
streams all at once
From a silly benchmark this takes the time to notify 1M user streams
from ~30s to ~5s
This works as instead of passing *all* rooms to `record_sent_rooms` we
only need to pass rooms that were previously not in the LIVE state.
This came from a py-spy where we were spending ~10% CPU calling these
functions. Note that `record_sent_rooms` is a no-op for rooms that are
already in the `LIVE` state, so we only need to call them for
`PREVIOUSLY` or `INITIAL` rooms.
This was a note added in the PR to move to AGPL, which we failed to
remove before landing.
(The context for this was that we needed to decide if we were going to
change which debian repository we published too, but decided not to in
the end)
This is basically exactly the same logic as for receipts. Essentially we
just need to track which room account data we have and haven't sent down
to clients, and use that when we pull stuff out.
I think this just needs a couple of extra tests written
---------
Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
Performance optimization: We can avoid fetching rooms that the user has
left themselves (which could be a significant amount), then only add
back rooms that the user has `newly_left` (left in the token range of an
incremental sync). It's a lot faster to fetch less rooms than fetch them
all and throw them away in most cases. Since the user only leaves a room
(or is state reset out) once in a blue moon, we can avoid a lot of work.
Based on @erikjohnston's branch, erikj/ss_perf
---------
Co-authored-by: Erik Johnston <erik@matrix.org>
Add cache to `get_tags_for_room(...)`
This helps Sliding Sync because `get_tags_for_room(...)` is going to be
used in https://github.com/element-hq/synapse/pull/17695
Essentially, we're just trying to match `get_account_data_for_room(...)`
which already has a tree cache.
No need to sort if the range is large enough to cover all of the rooms
in the list. Previously, we would only do this optimization if the range
was exactly large enough.
Follow-up to https://github.com/element-hq/synapse/pull/17672
This appears to be enough to make Element Web work (or at least move it
on to the next hurdle)
---------
Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
Move filters tests to rest layer in order to test the new (with sliding
sync tables) and fallback paths that Sliding Sync can use.
Also found a bug in the new path because it's not being tested which is
also fixed in this PR. We now take into account `has_known_state` when
filtering.
Spawning from
https://github.com/element-hq/synapse/pull/17662#discussion_r1755574791.
This should have been done when we started using the new sliding sync
tables in https://github.com/element-hq/synapse/pull/17630
This PR changes `from pydantic import BaseModel` to `from
synapse._pydantic_compat import BaseModel` (as well as `constr`,
`conbytes`, `conint`, `confloat`).
It allows `check_pydantic_models.py` to mock those pydantic objects only
in the synapse module, and not interfere with pydantic objects in
external dependencies.
This should solve the CI problems for #17144, which breaks because
`check_pydantic_models.py` patches pydantic models from
[scim2-models](https://scim2-models.readthedocs.io/).
/cc @DMRobertson @gotmax23
fixes#17659
### Pull Request Checklist
<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->
* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
- Use markdown where necessary, mostly for `code blocks`.
- End with either a period (.) or an exclamation mark (!).
- Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct
(run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
We need to bust the `get_sliding_sync_rooms_for_user`
cache when the room encryption is updated and any
other field that is used in the query.
Follow-up to https://github.com/element-hq/synapse/pull/17630
- Bust cache for membership change (cross-reference
`get_rooms_for_user`)
- Bust cache for room `encryption` (cross-reference
`get_room_encryption`)
- Bust cache for `forgotten` (cross-reference
`did_forget`/`get_forgotten_rooms_for_user`)
For rooms with a name we can skip fetching a full room summary, as we
don't need to calculate heroes, and instead just fetch the room counts
directly.
This also changes things to not return counts and heroes for non-joined
rooms. For left/banned rooms we were returning zero values anyway, and
for invite/knock rooms we don't really want to leak such information
(even if some of is included in the stripped state).
For rooms with a name we can skip fetching a full room summary, as we
don't need to calculate heroes, and instead just fetch the room counts
directly.
This also changes things to not return counts and heroes for non-joined
rooms. For left/banned rooms we were returning zero values anyway, and
for invite/knock rooms we don't really want to leak such information
(even if some of is included in the stripped state).
Speed up incremental sync by avoiding extra work. We first look at the
state delta changes and only fetch and calculate further derived things
if they have changed.
Instead of having a large cache of `room_id -> bool` about whether a
room is partially stated, replace with a "fetch rooms the user is which
are partially-stated". This is a lot faster as the set of partially
stated rooms at any point across the whole server is small, and so such
a query is fast.
The main issue with the bulk cache lookup is the CPU time looking all
the rooms up in the cache.
We ended up spending ~10% CPU creating a new dictionary and
`_RoomMembershipForUser`, so let's avoid creating new dicts and copying
by returning `newly_joined`, `newly_left` and `is_dm` as sets directly.
---------
Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
I thought ruff check would also format, but it doesn't.
This runs ruff format in CI and dev scripts. The first commit is just a
run of `ruff format .` in the root directory.
This is to make it easier to reuse the logic when adding support for the
new tables
---------
Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>