mirror of https://github.com/prometheus-operator/prometheus-operator.git synced 2025-04-08 10:04:09 +00:00
Commit graph

3785 commits

Author SHA1 Message Date
Khaled Elkhawaga
5ab8453b13
helm: update chart reference (#3501)
the stable/prometheus-operator chart has been deprecated and further
development has been moved to prometheus-community/kube-prometheus-stack

Signed-off-by: Khaled Elkhawaga <k.elkhawaga@gmail.com>
2020-09-15 14:34:28 +02:00
Paweł Krupa
ac44b6180e
Merge pull request #3502 from paulfantom/livenessProbe
pkg/prometheus: remove liveness probe
2020-09-15 13:03:38 +02:00
paulfantom
35b2954459
pkg/prometheus: remove liveness probe
Removing liveness probe to prevent killing prometheus pod during WAL
replay.

This should be reverted around the Kubernetes 1.21 release. At that point,
a startupProbe should be added.
2020-09-15 12:05:18 +02:00
Paweł Krupa
9bbd18a7c8
Merge pull request #3504 from paulfantom/go1.15
Switch to go 1.15
2020-09-15 11:35:34 +02:00
Paweł Krupa
0021eb04a1
Merge pull request #3486 from paulfantom/lock_kind 2020-09-15 10:08:22 +02:00
paulfantom
2676aa022c
switch to golang 1.15 2020-09-15 09:14:40 +02:00
paulfantom
bbf76b2085
.github/workflows: wait for k8s cluster bootstrap by checking if all containers are ready 2020-09-14 16:31:48 +02:00
paulfantom
fc44e7f072
scripts: fix pushing tagged images by using github ref and push mutable 'master' tag 2020-09-14 16:31:22 +02:00
Frederic Branczyk
dc7578c762
Merge pull request #3484 from benjaminhuo/master
Change thanos ruler's http port to the default web
2020-09-14 07:42:41 +02:00
Simon Pasquier
675d303ee0
pkg/prometheus: enable Thanos uploads only when needed (#3485)
When the Thanos spec doesn't configure object storage, there's no need to
configure the Thanos sidecar for block uploads and mount the
Prometheus data volume.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-09-11 16:16:19 +02:00
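The decision described in the commit above can be sketched as a single check on the object storage setting. This is a minimal illustration, not the operator's code: `thanosSpec`, `sidecarPlan`, and `planSidecar` are hypothetical names, and the real spec carries more fields than the one shown.

```go
package main

// thanosSpec is a simplified, hypothetical stand-in for the Thanos settings.
type thanosSpec struct {
	objectStorageConfig string // empty when no object storage is configured
}

// sidecarPlan records what the sidecar container would need.
type sidecarPlan struct {
	uploadBlocks    bool // configure the sidecar for block uploads
	mountDataVolume bool // mount the Prometheus data volume into the sidecar
}

// planSidecar enables uploads and the data volume mount only when
// object storage is configured; otherwise both stay off.
func planSidecar(spec thanosSpec) sidecarPlan {
	enabled := spec.objectStorageConfig != ""
	return sidecarPlan{uploadBlocks: enabled, mountDataVolume: enabled}
}
```

Deriving both flags from the same condition keeps the volume mount from being added when nothing will read it.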
Benjamin
cf2414edca Change thanos ruler's http port to the default web
Signed-off-by: Benjamin <benjamin@yunify.com>
2020-09-11 20:46:21 +08:00
Simon Pasquier
a818e30e24
go.mod: bump github.com/prometheus/prometheus (#3478)
Closes #3380

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-09-10 17:30:09 +02:00
Matthias Loibl
96094ad1ab
Merge pull request #3483 from metalmatze/changelog-0.42
Create release v0.42.0
2020-09-10 14:21:15 +02:00
Matthias Loibl
ecb6fc593f
Update changelog with latest feedback 2020-09-10 12:58:01 +02:00
Matthias Loibl
bfc0929264
Release v0.42.0 2020-09-10 12:57:58 +02:00
Paweł Krupa
93896327b0
Merge pull request #3475 from paulfantom/gh_actions 2020-09-10 12:41:54 +02:00
paulfantom
a1361008ff
.github/workflows: add image building stage and unify workflows
Additionally, added myself as GitHub Actions code owner for future PR
reviews and swapped the Travis CI badge in README.md for a GitHub Actions one.
2020-09-08 19:54:20 +02:00
paulfantom
94d0bd7e93
remove travis config and helper scripts 2020-09-08 16:02:17 +02:00
Sergiusz Urbaniak
289ee029ef
Merge pull request #3440 from s-urbaniak/remove-mlw
remove multilistwatcher and denylistfilter
2020-09-08 07:34:39 +02:00
paulfantom
9b146f87f4
switch to kind 2020-09-07 18:16:44 +02:00
paulfantom
52d9790329
initial GH actions 2020-09-07 17:07:02 +02:00
Sergiusz Urbaniak
c786d8ef2e pkg/informers: add missing godoc 2020-09-07 15:24:18 +02:00
Sergiusz Urbaniak
6d3aeef191 test/e2e/denylist_test: test deletion of resources 2020-09-07 15:22:57 +02:00
Sergiusz Urbaniak
c841309029 test/e2e: add e2e test for thanosruler denylisting 2020-09-07 11:52:04 +02:00
Sergiusz Urbaniak
e9b1081f10
Merge pull request #3465 from simonpasquier/instrument-k8s-client
Instrument client-go requests
2020-09-04 17:26:23 +02:00
Sergiusz Urbaniak
34ba8237f5 pkg/informers: fix stylistic nits
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
2020-09-04 17:08:33 +02:00
Sergiusz Urbaniak
ec3a83bae0 test/e2e: test allowlist against rolebindings, not cluster role bindings 2020-09-04 17:08:33 +02:00
Sergiusz Urbaniak
4f36b38e6c pkg/informers: add unit tests 2020-09-04 17:08:33 +02:00
Sergiusz Urbaniak
badeafdc36 pkg/informers: add godoc 2020-09-04 17:08:33 +02:00
Sergiusz Urbaniak
5e94344182 pkg/listwatch: remove multilistwatcher 2020-09-04 17:08:33 +02:00
Sergiusz Urbaniak
2379f59f6f pkg/prometheus: check error immediately after List 2020-09-04 17:08:33 +02:00
Sergiusz Urbaniak
27c1680975 pkg/*: renamings and reformatting 2020-09-04 17:08:33 +02:00
Sergiusz Urbaniak
0c9283465a pkg/thanos: remove multilistwatcher 2020-09-04 17:08:33 +02:00
Sergiusz Urbaniak
920f2490d9 pkg/alertmanager: remove multilistwatcher 2020-09-04 17:08:33 +02:00
Sergiusz Urbaniak
e9ad330bf8 pkg/prometheus: remove multilistwatcher 2020-09-04 17:08:33 +02:00
Sergiusz Urbaniak
f22fd2c7c0 pkg/listwatch: remove denylist ListerWatcher 2020-09-04 16:58:51 +02:00
Sergiusz Urbaniak
54bbe620bb pkg/informers: initial commit 2020-09-04 16:58:51 +02:00
Simon Pasquier
3b2e17d714 Instrument client-go requests
This change adds 3 metrics tracking client-go requests to the Kubernetes
API:

* `prometheus_operator_kubernetes_client_http_requests_total`, counter
  with a `status_code` label.
* `prometheus_operator_kubernetes_client_http_request_duration_seconds`,
  summary with an `endpoint` label.
* `prometheus_operator_kubernetes_client_rate_limiter_duration_seconds`,
  summary with an `endpoint` label.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-09-04 16:03:13 +02:00
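One common way to collect per-request data like the metrics listed in the commit above is to wrap the HTTP transport. The sketch below shows that idea using only the standard library: plain maps stand in for the counter and the summary, and every name in it (`requestStats`, `instrumentedTransport`, `stubTransport`, `countByStatus`, the `"error"` label value, the `kubernetes.invalid` URL) is an assumption made for illustration. It is not the code from the commit.

```go
package main

import (
	"errors"
	"net/http"
	"strconv"
	"sync"
	"time"
)

// requestStats is a hypothetical stand-in for the counter and summary metrics.
type requestStats struct {
	mu       sync.Mutex
	byStatus map[string]int           // request count per status code
	latency  map[string]time.Duration // cumulative latency per endpoint path
}

func newRequestStats() *requestStats {
	return &requestStats{
		byStatus: make(map[string]int),
		latency:  make(map[string]time.Duration),
	}
}

// instrumentedTransport wraps another RoundTripper and records every request.
type instrumentedTransport struct {
	next  http.RoundTripper
	stats *requestStats
}

func (t *instrumentedTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	start := time.Now()
	resp, err := t.next.RoundTrip(req)
	elapsed := time.Since(start)

	code := "error" // hypothetical label value for transport-level failures
	if err == nil {
		code = strconv.Itoa(resp.StatusCode)
	}

	t.stats.mu.Lock()
	defer t.stats.mu.Unlock()
	t.stats.byStatus[code]++
	t.stats.latency[req.URL.Path] += elapsed
	return resp, err
}

// stubTransport answers every request with a fixed status, or fails when status is 0.
type stubTransport struct{ status int }

func (s stubTransport) RoundTrip(*http.Request) (*http.Response, error) {
	if s.status == 0 {
		return nil, errors.New("connection refused")
	}
	return &http.Response{StatusCode: s.status, Body: http.NoBody}, nil
}

// countByStatus sends one request per stub status through the wrapper
// and returns the resulting per-status counts.
func countByStatus(statuses ...int) map[string]int {
	stats := newRequestStats()
	for _, s := range statuses {
		rt := &instrumentedTransport{next: stubTransport{status: s}, stats: stats}
		req, err := http.NewRequest(http.MethodGet, "http://kubernetes.invalid/api/v1/pods", nil)
		if err != nil {
			continue
		}
		if resp, err := rt.RoundTrip(req); err == nil {
			resp.Body.Close()
		}
	}
	return stats.byStatus
}
```

Calling `countByStatus(200, 200, 404, 0)` drives four requests through the wrapper without touching the network. Wrapping the transport means callers do not have to change how they issue requests.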
Frederic Branczyk
4b45d7d46b
Merge pull request #3466 from simonpasquier/use-context
*: pass context.Context to client-go functions
2020-09-04 12:50:53 +02:00
Simon Pasquier
053da63f0b *: pass context.Context to client-go functions
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-09-03 14:13:31 +02:00
Sergiusz Urbaniak
d1e9fc77e2
Merge pull request #3395 from matthiasr/mr/pkg-monitoring
Break the API types out into their own module
2020-09-02 09:35:50 +02:00
Sergiusz Urbaniak
909fc64585
Merge pull request #3445 from simonpasquier/fix-3327
pkg/prometheus: skip invalid service monitors
2020-08-31 16:56:45 +02:00
Sergiusz Urbaniak
608be1baec
Merge pull request #3436 from hwoarang/add-cluster-reconnect-timeout
pkg/alertmanager: Use lower value for --cluster.reconnect-timeout
2020-08-31 15:39:11 +02:00
Simon Pasquier
7ed47043ce Add tests for assetStore
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-08-31 14:51:30 +02:00
Simon Pasquier
a0a1816f4c Use cache.Store instead of custom stores 2020-08-31 10:51:09 +02:00
Simon Pasquier
caf6b9f3ce pkg/prometheus: skip invalid service monitors
Previously the operator would fail the reconciliation when a service
monitor was referencing a bad secret or configmap (either the object
didn't exist or the key was missing).

With this change, the operator will skip these service monitors.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-08-31 10:51:09 +02:00
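The behavior change in the commit above, skipping a bad service monitor instead of failing the whole reconciliation, can be sketched roughly as follows. The types and names (`serviceMonitor`, `selectValid`, the `secrets` lookup map) are hypothetical simplifications. The real operator resolves secrets and configmaps through the Kubernetes API and also checks that the referenced key exists.

```go
package main

import "fmt"

// serviceMonitor is a simplified, hypothetical stand-in for the real CRD type.
type serviceMonitor struct {
	name      string
	secretRef string // name of the secret the monitor depends on; empty means none
}

// selectValid returns the monitors whose secret reference resolves, plus one
// message per skipped monitor, instead of aborting on the first bad reference.
func selectValid(monitors []serviceMonitor, secrets map[string]bool) (valid []serviceMonitor, skipped []string) {
	for _, m := range monitors {
		if m.secretRef != "" && !secrets[m.secretRef] {
			skipped = append(skipped, fmt.Sprintf("skipping %s: secret %q not found", m.name, m.secretRef))
			continue
		}
		valid = append(valid, m)
	}
	return valid, skipped
}
```

Returning the skipped entries alongside the valid ones lets the caller log or report them while still generating configuration for everything else.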
Matthias Rampke
8d876ff138
Step into the API package for generating CRDs and code
`controller-gen` does not work across package boundaries. Run it from
inside the API package directory to work around this.

Signed-off-by: Matthias Rampke <matthias@rampke.de>
2020-08-28 13:41:47 +00:00
Matthias Rampke
2a67feba74
Break the API types out into their own module
This allows others to import them without incurring all the dependencies
of the operator transitively, and avoid version conflicts with other
dependencies as much as possible.

Fixed #3097.

Signed-off-by: Matthias Rampke <matthias@rampke.de>
2020-08-28 13:41:46 +00:00
Matthias Rampke
76d5211a6c
Avoid CI timeouts in TestConfigGeneration (#3432)
This test generates the same configuration many times, for each
Prometheus version, to see if it is deterministic. As the compatibility
matrix grows, test times increase. Now, this sometimes fails in CI
because Travis kills jobs after 10 minutes of no output.

Run each version as a subtest, and run tests with `-v`, so that output
is produced after each version. This avoids the no-output timeout.

Parallelize testing for each Prometheus version.

When the tests are run with `-short` (as in `make test-unit`), only try
one hundred iterations. With the race detector on, as in that target, this takes
around 5 seconds. Without the race detector, short tests on this
package now run quickly enough for fast iteration in an IDE.

Add an additional target and Travis job for running the long tests, but
without the race detector. This brings the run time for the full 1000
iterations per version to under a minute.

Signed-off-by: Matthias Rampke <matthias@rampke.de>
2020-08-28 14:53:32 +02:00
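The determinism check described in the commit above amounts to regenerating the same output many times and comparing the results. A minimal sketch, assuming a hypothetical generator `renderConfig` in place of the operator's real config generation; `iterations` and `isDeterministic` are also illustrative names. The counts of 100 and 1000 come from the commit message.

```go
package main

import (
	"sort"
	"strings"
)

// iterations returns how many times to regenerate the config:
// 100 in short mode, 1000 for the full run.
func iterations(short bool) int {
	if short {
		return 100
	}
	return 1000
}

// renderConfig is a hypothetical generator. It sorts the keys first because
// Go map iteration order is unspecified and would otherwise vary between calls.
func renderConfig(settings map[string]string) string {
	keys := make([]string, 0, len(settings))
	for k := range settings {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	for _, k := range keys {
		b.WriteString(k)
		b.WriteString(": ")
		b.WriteString(settings[k])
		b.WriteString("\n")
	}
	return b.String()
}

// isDeterministic reports whether gen returns the same output n times in a row.
func isDeterministic(gen func() string, n int) bool {
	first := gen()
	for i := 1; i < n; i++ {
		if gen() != first {
			return false
		}
	}
	return true
}
```

In an actual test file, each Prometheus version would get its own `t.Run` subtest that calls `t.Parallel()`, with `testing.Short()` supplying the argument to `iterations`. That way `go test -v` prints a result line as each version completes.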
Markos Chandras
86102e73e9
pkg/alertmanager: Use lower value for --cluster.reconnect-timeout
Alertmanager in cluster mode resolves the DNS name of each peer and
caches its IP address, which it uses at regular intervals to 'refresh'
the connection.

In a highly dynamic environment like Kubernetes, alertmanager pods may
come and go frequently. The default timeout
value of 6h is not suitable in that case, as alertmanager will keep
trying to reconnect to a non-existent pod over and over until it gives
up and removes that peer from the member list. During this period,
the cluster is reported to be in a degraded state due to the
missing member.

As such, it's best to use a lower value, which allows
alertmanager to remove the pod from the list of peers soon
after it disappears.

Related: https://github.com/prometheus/alertmanager/issues/2250
2020-08-26 13:02:35 +03:00