synapse

mirror of https://github.com/element-hq/synapse.git synced 2025-03-09 17:36:59 +00:00

Author	SHA1	Message	Date
Eric Eastwood	6ec5e13ec9	Fix join being denied after being invited over federation (#18075 ) This also happens for rejecting an invite. Basically, any out-of-band membership transition where we first get the membership as an `outlier` and then rely on federation filling us in to de-outlier it. This PR mainly addresses automated test flakiness, bots/scripts, and options within Synapse like [`auto_accept_invites`](https://element-hq.github.io/synapse/v1.122/usage/configuration/config_documentation.html#auto_accept_invites) that are able to react quickly (before federation is able to push us events), but also helps in generic scenarios where federation is lagging. I initially thought this might be a Synapse consistency issue (see issues labeled with [`Z-Read-After-Write`](https://github.com/matrix-org/synapse/labels/Z-Read-After-Write)) but it seems to be an event auth logic problem. Workers probably do increase the number of possible race condition scenarios that make this visible though (replication and cache invalidation lag). Fix https://github.com/element-hq/synapse/issues/15012 (probably fixes https://github.com/matrix-org/synapse/issues/15012 (https://github.com/element-hq/synapse/issues/15012)) Related to https://github.com/matrix-org/matrix-spec/issues/2062 Problems: 1. We don't consider [out-of-band membership](https://github.com/element-hq/synapse/blob/develop/docs/development/room-dag-concepts.md#out-of-band-membership-events) (outliers) in our `event_auth` logic even though we expose them in `/sync`. 1. (This PR doesn't address this point) Perhaps we should consider authing events in the persistence queue as events already in the queue could allow subsequent events to be allowed (events come through many channels: federation transaction, remote invite, remote join, local send). But this doesn't save us in the case where the event is more delayed over federation. ### What happened before? 
I wrote a Complement test that stresses this exact scenario and reproduces the problem: https://github.com/matrix-org/complement/pull/757 ``` COMPLEMENT_ALWAYS_PRINT_SERVER_LOGS=1 COMPLEMENT_DIR=../complement ./scripts-dev/complement.sh -run TestSynapseConsistency ``` We have `hs1` and `hs2` running in monolith mode (no workers): 1. `@charlie1:hs2` is invited and joins the room: 1. `hs1` invites `@charlie1:hs2` to a room which we receive on `hs2` as `PUT /_matrix/federation/v1/invite/{roomId}/{eventId}` (`on_invite_request(...)`) and the invite membership is persisted as an outlier. The `room_memberships` and `local_current_membership` database tables are also updated which means they are visible down `/sync` at this point. 1. `@charlie1:hs2` decides to join because it saw the invite down `/sync`. Because `hs2` is not yet in the room, this happens as a remote join `make_join`/`send_join` which comes back with all of the auth events needed to auth successfully and now `@charlie1:hs2` is successfully joined to the room. 1. `@charlie2:hs2` is invited and tries to join the room: 1. `hs1` invites `@charlie2:hs2` to the room which we receive on `hs2` as `PUT /_matrix/federation/v1/invite/{roomId}/{eventId}` (`on_invite_request(...)`) and the invite membership is persisted as an outlier. The `room_memberships` and `local_current_membership` database tables are also updated which means they are visible down `/sync` at this point. 1. Because `hs2` is already participating in the room, we also see the invite come over federation in a transaction and we start processing it (not done yet, see below) 1. `@charlie2:hs2` decides to join because it saw the invite down `/sync`. Because `hs2` is already in the room, this happens as a local join but we deny the event because our `event_auth` logic thinks that we have no membership in the room ❌ (expected to be able to join because we saw the invite down `/sync`) 1. 
We finally finish processing the `@charlie2:hs2` invite event from federation and de-outlier it. - If this finished before we tried to join we would have been fine but this is the race condition that makes this situation visible. Logs for `hs2`: ``` 🗳️ on_invite_request: handling event <FrozenEventV3 event_id=$PRPCvdXdcqyjdUKP_NxGF2CcukmwOaoK0ZR1WiVOZVk, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=invite, outlier=False> 🔦 _store_room_members_txn update room_memberships: <FrozenEventV3 event_id=$PRPCvdXdcqyjdUKP_NxGF2CcukmwOaoK0ZR1WiVOZVk, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=invite, outlier=True> 🔦 _store_room_members_txn update local_current_membership: <FrozenEventV3 event_id=$PRPCvdXdcqyjdUKP_NxGF2CcukmwOaoK0ZR1WiVOZVk, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=invite, outlier=True> 📨 Notifying about new event <FrozenEventV3 event_id=$PRPCvdXdcqyjdUKP_NxGF2CcukmwOaoK0ZR1WiVOZVk, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=invite, outlier=True> ✅ on_invite_request: handled event <FrozenEventV3 event_id=$PRPCvdXdcqyjdUKP_NxGF2CcukmwOaoK0ZR1WiVOZVk, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=invite, outlier=True> 🧲 do_invite_join for @user-2-charlie1:hs2 in !sfZVBdLUezpPWetrol:hs1 🔦 _store_room_members_txn update room_memberships: <FrozenEventV3 event_id=$bwv8LxFnqfpsw_rhR7OrTjtz09gaJ23MqstKOcs7ygA, type=m.room.member, state_key=@user-1-alice:hs1, membership=join, outlier=True> 🔦 _store_room_members_txn update room_memberships: <FrozenEventV3 event_id=$oju1ts3G3pz5O62IesrxX5is4LxAwU3WPr4xvid5ijI, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=join, outlier=False> 📨 Notifying about new event <FrozenEventV3 event_id=$oju1ts3G3pz5O62IesrxX5is4LxAwU3WPr4xvid5ijI, type=m.room.member, state_key=@user-2-charlie1:hs2, membership=join, outlier=False> ... 
🗳️ on_invite_request: handling event <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=False> 🔦 _store_room_members_txn update room_memberships: <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=True> 🔦 _store_room_members_txn update local_current_membership: <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=True> 📨 Notifying about new event <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=True> ✅ on_invite_request: handled event <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=True> 📬 handling received PDU in room !sfZVBdLUezpPWetrol:hs1: <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=False> 📮 handle_new_client_event: handling <FrozenEventV3 event_id=$WNVDTQrxy5tCdPQHMyHyIn7tE4NWqKsZ8Bn8R4WbBSA, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=join, outlier=False> ❌ Denying new event <FrozenEventV3 event_id=$WNVDTQrxy5tCdPQHMyHyIn7tE4NWqKsZ8Bn8R4WbBSA, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=join, outlier=False> because 403: You are not invited to this room. synapse.http.server - 130 - INFO - POST-16 - <SynapseRequest at 0x7f460c91fbf0 method='POST' uri='/_matrix/client/v3/join/%21sfZVBdLUezpPWetrol:hs1?server_name=hs1' clientproto='HTTP/1.0' site='8080'> SynapseError: 403 - You are not invited to this room. 
📨 Notifying about new event <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=False> ✅ handled received PDU in room !sfZVBdLUezpPWetrol:hs1: <FrozenEventV3 event_id=$O_54j7O--6xMsegY5EVZ9SA-mI4_iHJOIoRwYyeWIPY, type=m.room.member, state_key=@user-3-charlie2:hs2, membership=invite, outlier=False> ```	2025-01-27 11:21:10 -06:00
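The race described in #18075 can be pictured with a toy model. This is only a sketch under stated assumptions, not Synapse's `event_auth` code: `can_join`, `room_state`, `out_of_band_membership` and `consider_out_of_band` are hypothetical names, and the room is assumed to be invite-only.

```python
from typing import Dict


def can_join(
    user_id: str,
    room_state: Dict[str, str],
    out_of_band_membership: Dict[str, str],
    consider_out_of_band: bool,
) -> bool:
    """Toy auth check for joining an invite-only room (hypothetical helper).

    `room_state` stands in for memberships that have been fully processed
    (de-outliered). `out_of_band_membership` stands in for memberships that
    were persisted as outliers and are already visible down /sync.
    """
    membership = room_state.get(user_id)
    if membership is None and consider_out_of_band:
        # Fall back to the out-of-band (outlier) membership, if any.
        membership = out_of_band_membership.get(user_id)
    return membership in ("invite", "join")


# The invite for charlie2 exists only as an outlier when the join is attempted.
state = {"@charlie1:hs2": "join"}
outliers = {"@charlie2:hs2": "invite"}

denied_before = not can_join("@charlie2:hs2", state, outliers, consider_out_of_band=False)
allowed_after = can_join("@charlie2:hs2", state, outliers, consider_out_of_band=True)
```

In this model, ignoring the outlier invite reproduces the 403 from the logs, while consulting it lets the join through without waiting for federation to de-outlier the event.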
Sven Mäder	9c5d08fff8	Ratelimit presence updates (#18000 )	2025-01-24 19:58:01 +00:00
Quentin Gliech	7d52ce7d4b	Format files with Ruff (#17643 ) I thought ruff check would also format, but it doesn't. This runs ruff format in CI and dev scripts. The first commit is just a run of `ruff format .` in the root directory.	2024-09-02 12:39:04 +01:00
Erik Johnston	23740eaa3d	Correctly mention previous copyright (#16820 ) During the migration the automated script to update the copyright headers accidentally got rid of some of the existing copyright lines. Reinstate them.	2024-01-23 11:26:48 +00:00
Patrick Cloke	8e1e62c9e0	Update license headers	2023-11-21 15:29:58 -05:00
Patrick Cloke	85e5f2dc25	Add a new module API to update user presence state. (#16544 ) This adds a module API which allows a module to update a user's presence state/status message. This is useful for controlling presence from an external system. To fully control presence from the module the presence.enabled config parameter gains a new state of "untracked" which disables internal tracking of presence changes via user actions, etc. Only updates from the module will be persisted and sent down sync properly.	2023-10-26 15:11:24 -04:00
Patrick Cloke	85bfd4735e	Return an immutable value from get_latest_event_ids_in_room. (#16326 )	2023-09-18 09:29:05 -04:00
Patrick Cloke	8b5013dcbc	Time out busy presence status & test multi-device busy (#16174 ) Add a (long) timeout to when a "busy" device is considered not online. This does not match MSC3026, but is a reasonable thing for an implementation to do. Expands tests for the (unstable) busy presence with multiple devices.	2023-09-05 10:39:38 -04:00
Patrick Cloke	ea75346f6a	Track presence state per-device and combine to a user state. (#16066 ) Tracks presence on an individual per-device basis and combine the per-device state into a per-user state. This should help in situations where a user has multiple devices with conflicting status (e.g. one is syncing with unavailable and one is syncing with online). The tie-breaking is done by priority: BUSY > ONLINE > UNAVAILABLE > OFFLINE	2023-09-05 09:58:51 -04:00
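The tie-breaking rule in #16066 can be sketched as taking the maximum over an ordered enum. This is an illustration only: `PresenceState` and `combine_device_states` are hypothetical names, and treating a user with no devices as offline is an assumption, not something the commit message states.

```python
from enum import IntEnum
from typing import Iterable


class PresenceState(IntEnum):
    """Hypothetical ordering matching BUSY > ONLINE > UNAVAILABLE > OFFLINE."""

    OFFLINE = 0
    UNAVAILABLE = 1
    ONLINE = 2
    BUSY = 3


def combine_device_states(device_states: Iterable[PresenceState]) -> PresenceState:
    """Collapse per-device states into one per-user state by priority.

    Assumption: a user with no known devices is treated as offline.
    """
    return max(device_states, default=PresenceState.OFFLINE)


# One device syncing as unavailable, another as online: the user is online.
combined = combine_device_states([PresenceState.UNAVAILABLE, PresenceState.ONLINE])
```

Using an ordered enum keeps the priority in one place, so the combining step stays a single `max` call.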
Erik Johnston	d35bed8369	Don't wake up destination transaction queue if they're not due for retry. (#16223 )	2023-09-04 17:14:09 +01:00
Patrick Cloke	40901af5e0	Pass the device ID around in the presence handler (#16171 ) Refactoring to pass the device ID (in addition to the user ID) through the presence handler (specifically the `user_syncing`, `set_state`, and `bump_presence_active_time` methods and their replication versions).	2023-08-28 13:08:49 -04:00
Patrick Cloke	1bf143699c	Combine logic about not overriding BUSY presence. (#16170 ) Simplify some of the presence code by reducing duplicated code between worker & non-worker modes. The main change is to push some of the logic from `user_syncing` into `set_state`. This is done by passing whether the user is setting the presence via a `/sync` with a new `is_sync` flag to `set_state`. If this is `true` some additional logic is performed: * Don't override `busy` presence. * Update the `last_user_sync_ts`. * Never update the status message.	2023-08-28 11:03:23 -04:00
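The three `is_sync` rules listed in #16170 can be sketched as follows. This is a simplified model, not the real `set_state` signature: the `UserPresence` dataclass, its fields other than `last_user_sync_ts`, and the `now_ms` parameter are assumptions made for the example.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class UserPresence:
    """Hypothetical, simplified presence record."""

    state: str = "offline"
    status_msg: Optional[str] = None
    last_user_sync_ts: int = 0


def set_state(
    prev: UserPresence,
    new_state: str,
    status_msg: Optional[str],
    now_ms: int,
    is_sync: bool = False,
) -> UserPresence:
    """Apply a presence update, with extra rules when it comes from /sync."""
    if is_sync:
        # Don't override `busy`, bump the sync timestamp, and never touch
        # the status message.
        state = prev.state if prev.state == "busy" else new_state
        return replace(prev, state=state, last_user_sync_ts=now_ms)
    # Explicit update (e.g. via the presence API): take state and message as given.
    return replace(prev, state=new_state, status_msg=status_msg)


busy = UserPresence(state="busy", status_msg="in a call")
after_sync = set_state(busy, "online", None, now_ms=1000, is_sync=True)
after_api = set_state(busy, "online", None, now_ms=2000)
```

Here `after_sync` stays `busy` with its status message intact, while the explicit update in `after_api` replaces both.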
Patrick Cloke	da162cbe4e	Add tests for restoring the presence state after a restart. (#16151 )	2023-08-23 07:31:00 -04:00
Patrick Cloke	3f17178728	Clean-up presence tests (#16158 ) Reduce duplicated code & remove unused variables.	2023-08-22 11:43:44 -04:00
Eric Eastwood	1c802de626	Re-introduce the outbound federation proxy (#15913 ) Allow configuring the set of workers to proxy outbound federation traffic through (`outbound_federation_restricted_to`). This is useful when you have a worker setup with `federation_sender` instances responsible for sending outbound federation requests and want to make sure all outbound federation traffic goes through those instances. Before this change, the generic workers would still contact federation themselves for things like profile lookups, backfill, etc. This PR allows you to set more strict access controls/firewall for all workers and only allow the `federation_sender`'s to contact the outside world.	2023-07-18 09:49:21 +01:00
Eric Eastwood	c9bf644fa0	Revert "Federation outbound proxy" (#15910 ) Revert "Federation outbound proxy (#15773)" This reverts commit `b07b14b494`.	2023-07-10 11:10:20 -05:00
Eric Eastwood	b07b14b494	Federation outbound proxy (#15773 ) Allow configuring the set of workers to proxy outbound federation traffic through (`outbound_federation_restricted_to`). This is useful when you have a worker setup with `federation_sender` instances responsible for sending outbound federation requests and want to make sure all outbound federation traffic goes through those instances. Before this change, the generic workers would still contact federation themselves for things like profile lookups, backfill, etc. This PR allows you to set more strict access controls/firewall for all workers and only allow the `federation_sender`'s to contact the outside world. The original code is from @erikjohnston's branches which I've gotten in-shape to merge.	2023-07-05 18:53:55 -05:00
Patrick Cloke	652d1669c5	Add missing type hints to tests.handlers. (#14680 ) And do not allow untyped defs in tests.handlers.	2022-12-16 11:53:01 +00:00
realtyem	854a6884d8	Modernize unit tests configuration settings for workers. (#14568 ) Use the newer foo_instances configuration instead of the deprecated flags to enable specific features (e.g. start_pushers).	2022-12-01 07:38:27 -05:00
Andrew Morgan	618e4ab81b	Fix an invalid comparison of `UserPresenceState` to `str` (#14393 )	2022-11-16 15:25:35 +00:00
David Baker	73d8ded0b0	Prevent a sync request from removing a user's busy presence status (#12213 ) In trying to use the MSC3026 busy presence status, the user's status would be set back to 'online' next time they synced. This change makes it so that syncing does not affect a user's presence status if it is currently set to 'busy': it must be removed through the presence API. The MSC defers to implementations on the behaviour of busy presence, so this ought to remain compatible with the MSC.	2022-04-13 16:21:07 +01:00
Dirk Klimpel	9e06e22064	Add type hints to more tests files. (#12240 )	2022-03-17 07:25:50 -04:00
Patrick Cloke	02d708568b	Replace assertEquals and friends with non-deprecated versions. (#12092 )	2022-02-28 07:12:29 -05:00
Richard van der Hoff	e24ff8ebe3	Remove `HomeServer.get_datastore()` (#12031 ) The presence of this method was confusing, and mostly present for backwards compatibility. Let's get rid of it. Part of #11733	2022-02-23 11:04:02 +00:00
Richard van der Hoff	1800aabfc2	Split `FederationHandler` in half (#10692 ) The idea here is to take anything to do with incoming events and move it out to a separate handler, as a way of making FederationHandler smaller.	2021-08-26 21:41:44 +01:00
reivilibre	642a42edde	Flatten the synapse.rest.client package (#10600 )	2021-08-17 11:57:58 +00:00
Dirk Klimpel	6b61debf5c	Do not remove `status_msg` when user going offline (#10550 ) Signed-off-by: Dirk Klimpel dirk@klimpel.org	2021-08-09 16:21:04 +00:00
Patrick Cloke	8d609435c0	Move methods involving event authentication to EventAuthHandler. (#10268 ) Instead of mixing them with user authentication methods.	2021-07-01 14:25:37 -04:00
Eric Eastwood	96f6293de5	Add endpoints for backfilling history (MSC2716) (#9247 ) Work on https://github.com/matrix-org/matrix-doc/pull/2716	2021-06-22 10:02:53 +01:00
Andrew Morgan	21bd230831	Add a test for update_presence (#10033 ) https://github.com/matrix-org/synapse/issues/9962 uncovered that we accidentally removed all but one of the presence updates that we store in the database when persisting multiple updates. This could cause users' presence state to be stale. The bug was fixed in #10014, and this PR just adds a test that failed on the old code, and was used to initially verify the bug. The test attempts to insert some presence into the database in a batch using `PresenceStore.update_presence`, and then simply pulls it out again.	2021-05-21 17:29:14 +01:00
Erik Johnston	37623e3382	Increase perf of handling presence when joining large rooms. (#9916 )	2021-05-05 17:27:05 +01:00
Erik Johnston	e4ab8676b4	Fix tight loop handling presence replication. (#9900 ) Only affects workers. Introduced in #9819. Fixes #9899.	2021-04-28 14:42:50 +01:00
Erik Johnston	de0d088adc	Add presence federation stream (#9819 )	2021-04-20 14:11:24 +01:00
Jonathan de Jong	4b965c862d	Remove redundant "coding: utf-8" lines (#9786 ) Part of #9744 Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as Python 3 automatically reads source code as utf-8 now. `Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`	2021-04-14 15:34:27 +01:00
Patrick Cloke	0b3112123d	Use mock from the stdlib. (#9772 )	2021-04-09 13:44:38 -04:00
Brendan Abolivier	405aeb0b2c	Implement MSC3026: busy presence state	2021-03-18 16:34:47 +01:00
Andrew Morgan	8bcfc2eaad	Be smarter about which hosts to send presence to when processing room joins (#9402 ) This PR attempts to eliminate unnecessary presence sending work when your local server joins a room, or when a remote server joins a room your server is participating in by processing state deltas in chunks rather than individually. --- When your server joins a room for the first time, it requests the historical state as well. This chunk of new state is passed to the presence handler which, after filtering that state down to only membership joins, will send presence updates to homeservers for each join processed. It turns out that we were being a bit naive and processing each event individually, and sending out presence updates for every one of those joins. Even if many different joins were users on the same server (hello IRC bridges), we'd send presence to that same homeserver for every remote user join we saw. This PR attempts to deduplicate all of that by processing the entire batch of state deltas at once, instead of only doing each join individually. We process the joins and note down which servers need which presence: * If it was a local user join, send that user's latest presence to all servers in the room * If it was a remote user join, send the presence for all local users in the room to that homeserver We deduplicate by inserting all of those pending updates into a dictionary of the form: ``` { server_name1: {presence_update1, ...}, server_name2: {presence_update1, presence_update2, ...} } ``` Only after building this dict do we then start sending out presence updates.	2021-02-19 11:37:29 +00:00
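The batching described in #9402 can be sketched like this. It is a minimal model, not the presence handler's code: `collect_presence_updates` and its parameters are hypothetical, a presence update is represented only by the user ID it belongs to, and skipping the local server as a destination is an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def collect_presence_updates(
    joined_user_ids: Iterable[str],
    local_server: str,
    hosts_in_room: Set[str],
    local_users_in_room: Set[str],
) -> Dict[str, Set[str]]:
    """Build {server_name: {user IDs whose presence to send}} for a batch of joins."""
    pending: Dict[str, Set[str]] = defaultdict(set)
    for user_id in joined_user_ids:
        host = user_id.split(":", 1)[1]
        if host == local_server:
            # Local join: send this user's presence to every other server in the room.
            for remote in hosts_in_room:
                if remote != local_server:
                    pending[remote].add(user_id)
        else:
            # Remote join: send all local users' presence to the joining user's server.
            pending[host].update(local_users_in_room)
    return dict(pending)


# Two bridged users from the same server produce a single set for that server.
updates = collect_presence_updates(
    ["@a:irc", "@b:irc", "@me:local"],
    local_server="local",
    hosts_in_room={"local", "irc", "other"},
    local_users_in_room={"@me:local", "@you:local"},
)
```

Because the values are sets, repeated joins from one homeserver collapse into one entry per update, and sending only starts once the whole dictionary is built.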
Eric Eastwood	0a00b7ff14	Update black, and run auto formatting over the codebase (#9381 ) - Update black version to the latest - Run black auto formatting over the codebase - Run autoformatting according to [`docs/code_style.md`](80d6dc9783/docs/code_style.md) - Update `code_style.md` docs around installing black to use the correct version	2021-02-16 22:32:34 +00:00
Patrick Cloke	30fba62108	Apply an IP range blacklist to push and key revocation requests. (#8821 ) Replaces the `federation_ip_range_blacklist` configuration setting with an `ip_range_blacklist` setting with wider scope. It now applies to: * Federation * Identity servers * Push notifications * Checking key validity for third-party invite events The old `federation_ip_range_blacklist` setting is still honored if present, but with reduced scope (it only applies to federation and identity servers).	2020-12-02 11:09:24 -05:00
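The general idea behind an IP range blacklist such as `ip_range_blacklist` can be sketched with the standard library. This is not how Synapse implements the check: `is_blacklisted` is a hypothetical helper, and the CIDR ranges below are example values chosen for illustration.

```python
import ipaddress
from typing import Iterable


def is_blacklisted(ip: str, blacklist: Iterable[str]) -> bool:
    """Return True if `ip` falls inside any CIDR range in `blacklist`."""
    addr = ipaddress.ip_address(ip)
    # Membership is False when the address and network are different IP versions.
    return any(addr in ipaddress.ip_network(cidr) for cidr in blacklist)


# Example ranges (assumed for this sketch): private IPv4 space and IPv6 loopback.
example_blacklist = ["10.0.0.0/8", "192.168.0.0/16", "::1/128"]

blocked = is_blacklisted("10.1.2.3", example_blacklist)
allowed = not is_blacklisted("8.8.8.8", example_blacklist)
```

A check of this shape would run on the resolved address before an outbound request is made, so that a hostname pointing at an internal range is refused.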
Richard van der Hoff	a34b17e492	Simplify `_locally_reject_invite` Update `EventCreationHandler.create_event` to accept an auth_events param, and use it in `_locally_reject_invite` instead of reinventing the wheel.	2020-10-13 23:58:48 +01:00
Patrick Cloke	c9c0ad5e20	Remove the deprecated Handlers object (#8494 ) All handlers now available via get_*_handler() methods on the HomeServer.	2020-10-09 07:24:34 -04:00
Patrick Cloke	ad6190c925	Convert stream database to async/await. (#8074 )	2020-08-17 07:24:46 -04:00
Erik Johnston	1f773eec91	Port PresenceHandler to async/await (#6991 )	2020-02-26 15:33:26 +00:00
Patrick Cloke	509e381afa	Clarify list/set/dict/tuple comprehensions and enforce via flake8 (#6957 ) Ensure good comprehension hygiene using flake8-comprehensions.	2020-02-21 07:15:07 -05:00
Richard van der Hoff	a5afdd15e5	Merge pull request #6806 from matrix-org/rav/redact_changes/3 Pass room_version into add_hashes_and_signatures	2020-01-31 10:57:03 +00:00
Richard van der Hoff	d7bf793cc1	s/get_room_version/get_room_version_id/ ... to make way for a forthcoming get_room_version which returns a RoomVersion object.	2020-01-31 10:06:21 +00:00
Richard van der Hoff	ef6bdafb29	Store the room version in EventBuilder	2020-01-30 22:15:50 +00:00
Erik Johnston	5859a5c569	Fix presence timeouts when synchrotron restarts. (#6212 ) * Fix presence timeouts when synchrotron restarts. Handling timeouts would fail if there was an external process that had timed out, e.g. a synchrotron restarting. This was due to a couple of variable name typos. Fixes #3715.	2019-10-18 06:42:26 +01:00
Amber Brown	b36c82576e	Run Black on the tests again (#5170 )	2019-05-10 00:12:11 -05:00
Erik Johnston	40e56997bc	Review comments	2019-03-28 13:48:41 +00:00

120 commits