This PR makes a few radical changes to media. This now stores the SHA256
hash of each file stored in the database (excluding thumbnails, more on
that later). If a set of media is quarantined, any additional uploads of
the same file contents or any other files with the same hash will be
quarantined at the same time.
Currently this does NOT:
- De-duplicate media, although a future extension could be to do that.
- Run any background jobs to identify the hashes of older files. This
could also be a future extension, though the value of doing so is
limited to combat the abuse of recent media.
- Hash thumbnails. It's assumed that thumbnails are parented to some
form of media, so you'd likely be wanting to quarantine the media and
the thumbnail at the same time.
This background DB delta removes the old state group deletion background
update from the `background_updates` table if it exists.
The `delete_unreferenced_state_groups_bg_update` update should only
exist in that table if a homeserver ran v1.126.0rc1/v1.126.0rc2, and
rolled back or forward to any other version of Synapse before letting
the update finish.
### Pull Request Checklist
<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->
* [X] Pull request is based on the develop branch
* [X] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
- Use markdown where necessary, mostly for `code blocks`.
- End with either a period (.) or an exclamation mark (!).
- Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [X] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct
(run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
This PR fixes#18154 to avoid de-deltaing state groups which resulted in
DB size temporarily increasing until the DB was `VACUUM`'ed. As a
result, less state groups will get deleted now.
It also attempts to improve performance by not duplicating work when
processing state groups it has already processed in previous iterations.
### Pull Request Checklist
<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->
* [X] Pull request is based on the develop branch
* [X] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
- Use markdown where necessary, mostly for `code blocks`.
- End with either a period (.) or an exclamation mark (!).
- Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [X] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct
(run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
---------
Co-authored-by: Erik Johnston <erikj@element.io>
After the [recent supply chain attack](https://www.wiz.io/blog/new-github-action-supply-chain-attack-reviewdog-action-setup)
in `tj-actions/changed-files` and actions based on it, it's become clear
that relying on git tags to pin our dependencies is not enough (as tags
can simply be replaced). Therefore we need to switch to hashes.
Dependabot should continue to update these dependencies for us.
Best reviewed commit-by-commit. Though if CI passes, we're *probably*
fine.
To address a performance problem due to the foreign key on the same
column.
cc @erikjohnston
---------
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
I got rid of the `SYNAPSE_USE_FROZEN_DICTS` environment variable because
it will be overridden by the Synapse worker apps anyway and if we want
to support `SYNAPSE_USE_FROZEN_DICTS`, it should be in
`synapse/config/server.py`. It's also not documented so I'm assuming no
one is using it anyway.
Spawning from looking at the frozen dict stuff during the review of
https://github.com/element-hq/synapse/pull/18103#discussion_r1935876168
We do a few things in this PR to better support caching:
1. Change `Cache-Control` header to allow intermediary proxies to cache
media *only* if they revalidate on every request. This means that the
intermediary cache will still send the request to Synapse but with a
`If-None-Match` header, at which point Synapse can check auth and
respond with a 304 and empty content.
2. Add `ETag` response header to all media responses. We hardcode this
to `1` since all media is immutable (beyond being deleted).
3. Check for `If-None-Match` header (after checking for auth), and if it
matches then respond with a 304 and empty body.
---------
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Bumps [anyhow](https://github.com/dtolnay/anyhow) from 1.0.96 to 1.0.97.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/dtolnay/anyhow/releases">anyhow's
releases</a>.</em></p>
<blockquote>
<h2>1.0.97</h2>
<ul>
<li>Documentation improvements</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="bfb89ef244"><code>bfb89ef</code></a>
Release 1.0.97</li>
<li><a
href="c7fca9b086"><code>c7fca9b</code></a>
Ignore elidable_lifetime_names pedantic clippy lint</li>
<li><a
href="427c0bb0f3"><code>427c0bb</code></a>
Point standard library links to stable</li>
<li>See full diff in <a
href="https://github.com/dtolnay/anyhow/compare/1.0.96...1.0.97">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Revert "Add background job to clear unreferenced state groups (#18154)"
This mechanism is suspected of inserting large numbers of rows into
`state_groups_state`,
thus unreasonably increasing disk usage.
See: https://github.com/element-hq/synapse/issues/18217
This reverts commit 5121f9210c (#18154).
---------
Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>