Remove the liveness probe to prevent the Prometheus pod from being
killed during WAL replay.
This should be reverted around the Kubernetes 1.21 release. At that
point a startupProbe should be added.
When the Thanos spec doesn't configure object storage, there's no need to
configure the Thanos sidecar for block uploads and mount the
Prometheus data volume.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Previously the operator would fail the reconciliation when a service
monitor was referencing a bad secret or configmap (either the object
didn't exist or the key was missing).
With this change, the operator will skip these service monitors.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
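The skip-instead-of-fail behaviour described above could look roughly
like the sketch below. The `monitor` and `store` types and the function
names are hypothetical stand-ins, not the operator's actual types.

```go
package main

import (
	"errors"
	"fmt"
)

// monitor is a hypothetical stand-in for a service monitor that references
// one key inside a Secret or ConfigMap.
type monitor struct {
	Name   string
	Object string // name of the referenced Secret or ConfigMap
	Key    string // key expected inside that object
}

// store maps object names to their key/value data (hypothetical).
type store map[string]map[string]string

var errBadReference = errors.New("bad reference")

// resolve returns the referenced value, or an error when either the object
// or the key is missing.
func resolve(s store, m monitor) (string, error) {
	data, ok := s[m.Object]
	if !ok {
		return "", fmt.Errorf("%w: object %q not found", errBadReference, m.Object)
	}
	v, ok := data[m.Key]
	if !ok {
		return "", fmt.Errorf("%w: key %q missing in %q", errBadReference, m.Key, m.Object)
	}
	return v, nil
}

// selectValid keeps the monitors whose references resolve and records the
// names of the skipped ones instead of aborting the whole reconciliation.
func selectValid(s store, monitors []monitor) (valid []monitor, skipped []string) {
	for _, m := range monitors {
		if _, err := resolve(s, m); err != nil {
			skipped = append(skipped, m.Name)
			continue
		}
		valid = append(valid, m)
	}
	return valid, skipped
}

func main() {
	s := store{"tls": {"ca.crt": "..."}}
	valid, skipped := selectValid(s, []monitor{
		{Name: "good", Object: "tls", Key: "ca.crt"},
		{Name: "no-object", Object: "absent", Key: "ca.crt"},
		{Name: "no-key", Object: "tls", Key: "token"},
	})
	fmt.Println(len(valid), skipped)
}
```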
This test generates the same configuration many times, for each
Prometheus version, to see if it is deterministic. As the compatibility
matrix grows, test times increase. Now, this sometimes fails in CI
because Travis kills jobs after 10 minutes of no output.
Run each version as a subtest, and run tests with `-v`, so that output
is produced after each version. This avoids the no-output timeout.
Parallelize testing for each Prometheus version.
When the tests are run with `-short` (as in `make test-unit`), only try
one hundred iterations. With the race detector on, as in that target,
this takes around 5 seconds. Without the race detector, short tests on
this package now run quickly enough for fast iteration in an IDE.
Add an additional target and Travis job for running the long tests, but
without the race detector. This brings the run time for the full 1000
iterations per version to under a minute.
Signed-off-by: Matthias Rampke <matthias@rampke.de>
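A minimal sketch of the test layout described above: one parallel
subtest per version, with fewer iterations under `-short`. The version
list, `generateConfig` and `iterations` are hypothetical placeholders
for the real generator and compatibility matrix. The file would live in
a `_test.go` file and run with `go test -v -short`.

```go
package main

import (
	"fmt"
	"testing"
)

// versions is a hypothetical compatibility matrix; the real list is not
// reproduced here.
var versions = []string{"v2.0.0", "v2.1.0", "v2.2.0"}

// generateConfig is a hypothetical stand-in for the configuration generator.
func generateConfig(version string) string {
	return fmt.Sprintf("# generated for Prometheus %s\nglobal: {}\n", version)
}

// iterations returns how often each version is regenerated: one hundred
// times in short mode, one thousand times otherwise.
func iterations(short bool) int {
	if short {
		return 100
	}
	return 1000
}

func TestConfigGenerationDeterministic(t *testing.T) {
	n := iterations(testing.Short())
	for _, v := range versions {
		v := v // capture the loop variable for the parallel subtest
		// Each version is its own subtest, so `go test -v` prints a line
		// per version instead of staying silent for the whole run.
		t.Run(v, func(t *testing.T) {
			t.Parallel()
			want := generateConfig(v)
			for i := 0; i < n; i++ {
				if got := generateConfig(v); got != want {
					t.Fatalf("iteration %d: config for %s is not deterministic", i, v)
				}
			}
		})
	}
}
```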
* pkg: add prometheus_operator_reconcile_operations_total metric
We already have the `prometheus_operator_reconcile_errors_total` metric
to track the number of reconciliation attempts that failed, but there is
no metric for the total number of attempts, which makes it harder to
alert on failures. With this change, we can compute the ratio of
reconciliations that failed.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
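The ratio mentioned above is the increase of the errors counter divided
by the increase of the operations counter over the same window. A small
sketch of that arithmetic, with a hypothetical function name and made-up
sample values:

```go
package main

import "fmt"

// reconcileFailureRatio returns the fraction of reconciliations that failed,
// given the increase of both counters over the same time window. It returns
// 0 when no reconciliation happened, which avoids a division by zero.
func reconcileFailureRatio(errorsIncrease, operationsIncrease float64) float64 {
	if operationsIncrease <= 0 {
		return 0
	}
	return errorsIncrease / operationsIncrease
}

func main() {
	// Hypothetical counter increases over one window.
	ratio := reconcileFailureRatio(1, 4)
	fmt.Printf("failed reconciliations: %.0f%%\n", ratio*100)
}
```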
* Update alert definition with new metric
This change adds a new `prometheus_operator_resources` metric that keeps
track of the number of resources currently managed by the operator. The
metric is broken down by controller and type of resource.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Now we can configure the operator to use mTLS RemoteWrite by referencing
the CA, cert and key directly from k8s Secrets/ConfigMaps.
If the key and the cert are both Secrets, they can exist as a single
Secret which contains both 'cert.pem' and 'key.pem'. Otherwise they can
exist as two different Secrets (or a Secret for the key and a ConfigMap
for the cert).
Signed-off-by: Yoni Bettan <ybettan@redhat.com>
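A rough sketch of the reference layouts described above. The `objectRef`
and `tlsRefs` types, the object names, and the rule that the key must
come from a Secret are assumptions made for illustration; they are not
the operator's API types.

```go
package main

import (
	"errors"
	"fmt"
)

// objectRef is a hypothetical reference to one key of a Secret or ConfigMap.
type objectRef struct {
	Kind string // "Secret" or "ConfigMap"
	Name string
	Key  string
}

// tlsRefs groups the three references needed for mTLS (hypothetical type).
type tlsRefs struct {
	CA, Cert, Key objectRef
}

// validate checks the assumed layout: the CA and the certificate may live in
// a Secret or a ConfigMap, while the private key is assumed to need a Secret.
func validate(t tlsRefs) error {
	if t.Key.Kind != "Secret" {
		return errors.New("private key must be referenced from a Secret")
	}
	for _, r := range []objectRef{t.CA, t.Cert} {
		if r.Kind != "Secret" && r.Kind != "ConfigMap" {
			return fmt.Errorf("unsupported kind %q for %s/%s", r.Kind, r.Name, r.Key)
		}
	}
	return nil
}

// sharesSecret reports whether the cert and the key come from one Secret.
func sharesSecret(t tlsRefs) bool {
	return t.Cert.Kind == "Secret" && t.Key.Kind == "Secret" && t.Cert.Name == t.Key.Name
}

func main() {
	single := tlsRefs{
		CA:   objectRef{Kind: "ConfigMap", Name: "remote-write-ca", Key: "ca.pem"},
		Cert: objectRef{Kind: "Secret", Name: "remote-write-tls", Key: "cert.pem"},
		Key:  objectRef{Kind: "Secret", Name: "remote-write-tls", Key: "key.pem"},
	}
	fmt.Println(validate(single), sharesSecret(single))
}
```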
Move logic for building image URLs into the operator package.
This improves the consistency for building image URLs from the
combination of default settings, operator CLI args, and config in the
custom resources.
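To illustrate the kind of precedence such a shared helper has to
settle, here is a minimal sketch. The function name, its parameters and
the precedence order (custom resource first, then CLI argument, then
default) are assumptions for illustration, not the operator's code.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// resolveImage picks the image URL for a workload. Assumed precedence:
//  1. a full image set in the custom resource wins as-is,
//  2. otherwise the base image from the CLI args is used,
//  3. otherwise the built-in default base image is used,
//
// and the version, when present, is appended as the tag.
func resolveImage(specImage, cliBaseImage, defaultBaseImage, version string) (string, error) {
	if s := strings.TrimSpace(specImage); s != "" {
		return s, nil
	}
	base := strings.TrimSpace(cliBaseImage)
	if base == "" {
		base = strings.TrimSpace(defaultBaseImage)
	}
	if base == "" {
		return "", errors.New("no base image configured")
	}
	if version == "" {
		return base, nil
	}
	return fmt.Sprintf("%s:%s", base, version), nil
}

func main() {
	img, err := resolveImage("", "", "quay.io/prometheus/prometheus", "v2.20.0")
	fmt.Println(img, err)
}
```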
PodMonitors already default to relabeling namespace, pod and container
into the target labels. ServiceMonitors should do the same to allow easy
correlation between signals.
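The effect of that default relabeling can be sketched as a mapping from
Kubernetes service-discovery meta labels to target labels. The function
below only mimics the outcome on a label set; it is not the generated
relabeling configuration, and its names are hypothetical.

```go
package main

import "fmt"

// defaultTargetLabels maps Kubernetes service-discovery meta labels to the
// target labels they are copied into.
var defaultTargetLabels = map[string]string{
	"__meta_kubernetes_namespace":          "namespace",
	"__meta_kubernetes_pod_name":           "pod",
	"__meta_kubernetes_pod_container_name": "container",
}

// applyDefaults returns the target labels produced from a discovered label
// set. Meta labels that are absent or empty produce no target label.
func applyDefaults(discovered map[string]string) map[string]string {
	out := make(map[string]string, len(defaultTargetLabels))
	for src, dst := range defaultTargetLabels {
		if v := discovered[src]; v != "" {
			out[dst] = v
		}
	}
	return out
}

func main() {
	// Hypothetical discovered labels for one scrape target.
	target := applyDefaults(map[string]string{
		"__meta_kubernetes_namespace":          "monitoring",
		"__meta_kubernetes_pod_name":           "example-app-0",
		"__meta_kubernetes_pod_container_name": "app",
		"__meta_kubernetes_service_name":       "example-app",
	})
	fmt.Println(target)
}
```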
* refactor: decouple pod labels from selector labels
Prometheus pods cannot be rolled out without downtime when labels are changed
Fixes #3120
* chore: run go fmt
* fix unit tests
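A minimal sketch of the decoupling idea, assuming the selector cannot
change once the workload exists: the selector holds only a small stable
set of labels, while the pod template carries the user labels on top.
The label keys and function names are hypothetical.

```go
package main

import "fmt"

// selectorLabels returns the stable identity labels used by the selector.
// The keys are hypothetical.
func selectorLabels(name string) map[string]string {
	return map[string]string{"app": "prometheus", "prometheus": name}
}

// podLabels merges user-supplied labels with the selector labels. Selector
// labels are applied last so that user labels can never break the match.
func podLabels(user, selector map[string]string) map[string]string {
	out := make(map[string]string, len(user)+len(selector))
	for k, v := range user {
		out[k] = v
	}
	for k, v := range selector {
		out[k] = v
	}
	return out
}

// matches reports whether every selector label is present in labels.
func matches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return true
}

func main() {
	sel := selectorLabels("k8s")
	before := podLabels(map[string]string{"team": "a"}, sel)
	after := podLabels(map[string]string{"team": "b"}, sel)
	// The selector stays the same while the pod labels change.
	fmt.Println(matches(sel, before), matches(sel, after))
}
```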
for each ServiceMonitor and/or PodMonitor. EnforcedSampleLimit takes
precedence over any SampleLimit set per ServiceMonitor and/or PodMonitor
resource. This is meant to let admins enforce a limit on the number of
samples/series for each target.
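The precedence described above can be sketched as follows, assuming the
enforced limit simply overrides the per-monitor value whenever it is
set. The function name is hypothetical and the operator's actual logic
may differ.

```go
package main

import "fmt"

// effectiveSampleLimit returns the sample limit applied to a target. When an
// enforced limit is configured (non-nil), it overrides the per-monitor limit;
// otherwise the per-monitor limit is used unchanged.
func effectiveSampleLimit(monitorLimit uint64, enforced *uint64) uint64 {
	if enforced != nil {
		return *enforced
	}
	return monitorLimit
}

func main() {
	enforced := uint64(1000)
	fmt.Println(effectiveSampleLimit(5000, &enforced)) // enforced limit applies
	fmt.Println(effectiveSampleLimit(5000, nil))       // per-monitor limit applies
}
```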
Add global configurable scrapeTimeout parameter to allow monitoring
targets on clusters consisting of slower hosts like Raspberry Pi and
many ARM boards used for labs.
Signed-off-by: Carlos de Paula <me@carlosedp.com>
* add ability to exclude rules from namespace label enforcement
* fixup! add ability to exclude rules from namespace label enforcement
* fixed TestEnforcedNamespaceLabelRule
* fixup! add ability to exclude rules from namespace label enforcement
* add tests for LabelEnforcementExcludeList
* fixup! add ability to exclude rules from namespace label enforcement
* fixup! add ability to exclude rules from namespace label enforcement
* moved enforceNamespaceLabel to shared pkg
* fixup! moved enforceNamespaceLabel to shared pkg
* fixup! moved enforceNamespaceLabel to shared pkg
* Trigger build once more
* fixup! add ability to exclude rules from namespace label enforcement
* fixup! moved enforceNamespaceLabel to shared pkg
* fixup! moved enforceNamespaceLabel to shared pkg