Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Start drafting weighted and timing histograms #109277

Merged

Conversation

MikeSpreitzer
Copy link
Member

@MikeSpreitzer MikeSpreitzer commented Apr 4, 2022

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

This PR introduces two new kinds of histograms, weighted histograms and timing histograms. A timing histogram has the interface of a gauge, but keeps track of the time that the variable spent in each of the ranges defined by the bucket boundaries. A timing histogram is built on a weighted histogram, which is like a regular histogram but its Observe method takes a weight as well as a value. Due to the limitations of the current model (in OpenMetrics) for histograms, the weight has to be an unsigned integer. The timing histograms are intended to replace the existing sample-and-watermark histograms in k/apiserver/pkg/util/flowcontrol with something less complex to consume and less costly at runtime to update.

These new histograms are hoped to eventually migrate into prometheus. Until then, they resides in k/component-base/metrics/prometheusextension .

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

This is like #109094 but:

This is part of addressing #108272 .

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@kubernetes/sig-api-machinery-misc
/cc @wojtek-t
@beorn7
@dgrisonnet
@logicalhan

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. labels Apr 4, 2022
)

// WeightedObserver
type WeightedObserver interface {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MikeSpreitzer - I think I'm not following the usecase of having weight here.
Can you point me to the usecase for it? Why is timingHistogram not enough for us?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 4, 2022
@serathius
Copy link
Contributor

cc @logicalhan

@MikeSpreitzer
Copy link
Member Author

/retest

@cici37
Copy link
Contributor

cici37 commented Apr 5, 2022

/remove-sig api-machinery

@k8s-ci-robot k8s-ci-robot removed the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Apr 5, 2022
type GaugeOps interface {
// Set sets the Gauge to the given value.
Set(float64)
// Add(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: comments should be prefixed by the method they are commenting by convention.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the comments here are just placeholders, yes they need to follow conventions and be useful.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make the relationship utterly clear, in my latest push I made the comments on the methods of GaugeOps state their equivalence to the corresponding method of Gauge.

Set(float64)
// Add(1)
Inc()
// Sub(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

Set(float64)
// Add(1)
Inc()
// Sub(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

}

func (th *timingHistogram) SetToCurrentTime() {
th.update(func(oldValue float64) float64 { return th.clock.Since(time.Unix(0, 0)).Seconds() })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really confused by this method, are you intending to set the gauge value to the unix timestamp or are you intending to set the th.lastSetTime to the current timestamp?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this method will just set the current time as the new value it will not override the lastSetTime. At least that would be the behavior of the normal SetToCurrentTime method of a gauge. I find it also a bit confusing, but I guess it was a shortcut made in client_golang to simplify the use of gauge with timestamp metrics.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is strange to me too, but effectively this method is coming from the original prometheus interface:
https://pkg.go.dev/github.com/prometheus/client_golang/prometheus#Gauge

// SetToCurrentTime sets the Gauge to the current Unix time in seconds.
SetToCurrentTime()

So it works as designed :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created GaugeOps from the existing Gauge interface, just removing the inclusion of Metric and Collector. Frankly, prometheus should have this interface (analogous to the way it has Observer = Histogram - {Metric, Collector}).

Comment on lines 24 to 25
"github.com/prometheus/client_golang/prometheus"

dto "github.com/prometheus/client_model/go"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these should be grouped

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 54 to 76
v0 := value0
t1 := t0.Add(time.Nanosecond)
var v1 float64 = 0.75
clk.SetTime(t1)
th.Set(v1)
t2 := t1.Add(time.Microsecond)
var d2 float64 = 0.5
v2 := v1 + d2
clk.SetTime(t2)
th.Add(d2)
t3 := t2
for i := 0; i < 1000000; i++ {
t3 = t3.Add(time.Nanosecond)
clk.SetTime(t3)
th.Set(v2)
}
var d3 float64 = -0.6
v3 := v2 + d3
th.Add(d3)
t4 := t3.Add(time.Second)
clk.SetTime(t4)

metch := make(chan prometheus.Metric)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test setup is really hard to read and it's not obvious to me what you are doing...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my latest push I added a func comment hopefully explaining it.

if want, got := uint64(t4.Sub(t0)), wroteHist.GetSampleCount(); want != got {
t.Errorf("Wanted %v but got %v", want, got)
}
if want, got := float64(t1.Sub(t0))*v0+float64(t2.Sub(t1))*v1+float64(t3.Sub(t2))*v2+float64(t4.Sub(t3))*v3, wroteHist.GetSampleSum(); want != got {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe worth adding a private method which does the Sub and casts to a float for readability just for test readability.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

)

// WeightedObserver
type WeightedObserver interface {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have tests for weighted histogram too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@logicalhan
Copy link
Member

/triage accepted
/assign @logicalhan @dgrisonnet

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 7, 2022
@kubernetes kubernetes deleted a comment from logicalhan Apr 8, 2022
@kubernetes kubernetes deleted a comment from logicalhan Apr 8, 2022
@kubernetes kubernetes deleted a comment from logicalhan Apr 8, 2022
@kubernetes kubernetes deleted a comment from logicalhan Apr 8, 2022
@dgrisonnet
Copy link
Member

New changes look good to me. I am fine with moving forward with this PR once wojtek's comments are addressed.

@MikeSpreitzer
Copy link
Member Author

MikeSpreitzer commented Apr 27, 2022

@dgrisonnet, I have three concerns with making the histogram code add text about the type into the help line:

  1. A scrape gets two comments per metric, one about the type and one about the instance. The proposal here is to put words about the type into the line about the instance. That's a mental mismatch.
  2. When the reader sees the words "EXPERIMENTAL METRIC TYPE", she will look for the type that is experimental. She will not see it. The # TYPE line remains ordinary.
  3. Adding fixed words onto the description provided by the client raises the question of whether the combination will parse and make sense.

@MikeSpreitzer
Copy link
Member Author

/retest

@dgrisonnet
Copy link
Member

I understand your concerns, but sadly we can't tweak the # TYPE comment since Prometheus is using it during ingestion to identify what is the type of the metrics that it is ingesting: https://github.com/prometheus/prometheus/blob/main/model/textparse/promparse.go#L291-L305. If we were to put something other than histogram which is the type we want the new metrics to be identified as, the parser would throw an error and the scrape would fail.

To avoid any confusion for the users, maybe just pretending the actual HELP text by EXPERIMENTAL: is enough. I don't think the consumers will really care about the underlying implementation concerns that we might have. Just knowing that this metric is experimental should make them aware that it should be used with care and it might change anytime.

@MikeSpreitzer
Copy link
Member Author

The force-push to 5f8b53ef38e makes the suggested addition to the # HELP line of timing histograms, and squashes to a single commit.

The following investigation occurred during development.

Add TimingHistogram impl that shares lock with WeightedHistogram

Benchmarking and profiling shows that two layers of locking is
noticeably more expensive than one.

After adding this new alternative, I now get the following benchmark
results.

```
(base) mspreitz@mjs12 kubernetes % go test -benchmem -run=^$ -bench ^BenchmarkTimingHistogram$ k8s.io/component-base/metrics/prometheusextension
goos: darwin
goarch: amd64
pkg: k8s.io/component-base/metrics/prometheusextension
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkTimingHistogram-16    	22232037	        52.79 ns/op	       0 B/op	       0 allocs/op
PASS
ok  	k8s.io/component-base/metrics/prometheusextension	1.404s

(base) mspreitz@mjs12 kubernetes % go test -benchmem -run=^$ -bench ^BenchmarkTimingHistogram$ k8s.io/component-base/metrics/prometheusextension
goos: darwin
goarch: amd64
pkg: k8s.io/component-base/metrics/prometheusextension
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkTimingHistogram-16    	22190997	        54.50 ns/op	       0 B/op	       0 allocs/op
PASS
ok  	k8s.io/component-base/metrics/prometheusextension	1.435s
```

and

```
(base) mspreitz@mjs12 kubernetes % go test -benchmem -run=^$ -bench ^BenchmarkTimingHistogramDirect$ k8s.io/component-base/metrics/prometheusextension
goos: darwin
goarch: amd64
pkg: k8s.io/component-base/metrics/prometheusextension
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkTimingHistogramDirect-16    	28863244	        40.99 ns/op	       0 B/op	       0 allocs/op
PASS
ok  	k8s.io/component-base/metrics/prometheusextension	1.890s
(base) mspreitz@mjs12 kubernetes %
(base) mspreitz@mjs12 kubernetes %
(base) mspreitz@mjs12 kubernetes % go test -benchmem -run=^$ -bench ^BenchmarkTimingHistogramDirect$ k8s.io/component-base/metrics/prometheusextension
goos: darwin
goarch: amd64
pkg: k8s.io/component-base/metrics/prometheusextension
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkTimingHistogramDirect-16    	27994173	        40.37 ns/op	       0 B/op	       0 allocs/op
PASS
ok  	k8s.io/component-base/metrics/prometheusextension	1.384s
```

So the new implementation is roughly 20% faster than the original.

Add overlooked exception, rename timingHistogram to timingHistogramLayered

Use the direct (one mutex) style of TimingHistogram impl

This is about a 20% gain in CPU speed on my development machine, in
benchmarks without lock contention.  Following are two consecutive
trials.

(base) mspreitz@mjs12 prometheusextension % go test  -benchmem -run=^$ -bench Histogram .
goos: darwin
goarch: amd64
pkg: k8s.io/component-base/metrics/prometheusextension
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkTimingHistogramLayered-16    	21650905	        51.91 ns/op	       0 B/op	       0 allocs/op
BenchmarkTimingHistogramDirect-16     	29876860	        39.33 ns/op	       0 B/op	       0 allocs/op
BenchmarkWeightedHistogram-16         	49227044	        24.13 ns/op	       0 B/op	       0 allocs/op
BenchmarkHistogram-16                 	41063907	        28.82 ns/op	       0 B/op	       0 allocs/op
PASS
ok  	k8s.io/component-base/metrics/prometheusextension	5.432s

(base) mspreitz@mjs12 prometheusextension % go test  -benchmem -run=^$ -bench Histogram .
goos: darwin
goarch: amd64
pkg: k8s.io/component-base/metrics/prometheusextension
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkTimingHistogramLayered-16    	22483816	        51.72 ns/op	       0 B/op	       0 allocs/op
BenchmarkTimingHistogramDirect-16     	29697291	        39.39 ns/op	       0 B/op	       0 allocs/op
BenchmarkWeightedHistogram-16         	48919845	        24.03 ns/op	       0 B/op	       0 allocs/op
BenchmarkHistogram-16                 	41153044	        29.26 ns/op	       0 B/op	       0 allocs/op
PASS
ok  	k8s.io/component-base/metrics/prometheusextension	5.044s

Remove layered implementation of TimingHistogram
@MikeSpreitzer
Copy link
Member Author

And the force-push to b4a40cd adds the # HELP tweak to WeightedHistograms too.

@wojtek-t
Copy link
Member

This LGTM. Based on the comments above from @dgrisonnet I'm going to approve it, but I will leave stamping as lgtm to @dgrisonnet or @logicalhan

@dgrisonnet - would you mind taking a look?

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MikeSpreitzer, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 29, 2022
@dgrisonnet
Copy link
Member

Great job @MikeSpreitzer!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 29, 2022
@k8s-ci-robot k8s-ci-robot merged commit cbb164e into kubernetes:master May 4, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.25 milestone May 4, 2022
@MikeSpreitzer MikeSpreitzer deleted the add-weighted-histogram branch May 4, 2022 04:16
beorn7 added a commit to prometheus/client_model that referenced this pull request Jun 14, 2022
Note that this is only an extension of the proto spec. Both generators
and consumers of the protobuf still need changes to make use of these
changes.

Gauge histograms measure current distributions. For one, they are
inspired by the GaugeHistogram type introducted by OpenMetrics, see
https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#gaugehistogram

They are also handled in the same way as OpenMetrics does it, by
using a new MetricType enum field GAUGE_HISTOGRAM, but not changing
anything else, i.e. for both regular and gauge histograms, the same
Histogram message type is used.

The other reason why we need gauge histograms comes from PromQL: If
you `rate` a histogram (which is possible with the new sparse
histograms as 1st class data type), the result is a gauge histogram. A
rate'd histogram can be created by a recording rule and then stored in
the TSDB. From there, it can be exposed by federation, so we need to
be able to represent it in the exposition format.

Float histograms are histograms where all counts (count of
observations, counts in each bucket, zero bucket count) are floating
point numbers rather than integer numbers. They are rarely needed for
direct instrumentation. Use cases are weighted histograms or timing
histograms, see kubernetes/kubernetes#109277
for a real-world example.

However, float histograms happen all the time as results of PromQL
expressions. Following the same line of argument as above, those float
histograms can end up in the TSDB via recording rules, which means
they can be exposed via federation.

Note that float histograms are implicitly supported by the original
Prometheus text format, as this format simply uses floating point
numbers for all sample values. OpenMetrics has avoided this ambiguity
and has specified integers for bucket counts and the count of
observations in a histogram, which means it needs to be extended to
support float histograms, similar to how this commit extends the
original Prometheus protobuf format.

Signed-off-by: beorn7 <beorn@grafana.com>
@brandond
Copy link

@MikeSpreitzer the TBD user-facing change from the PR description has gone out in the 1.25 alpha release notes. Can this be corrected for future releases?

@MikeSpreitzer
Copy link
Member Author

I am not sure I understand the question. This particular PR has no release note. It is part of a larger campaign that does have release notes and that I hope to finish in 1.25.

@brandond
Copy link

brandond commented Jun 16, 2022

The bit of text immediately after the Does this PR introduce a user-facing change? question in the PR template is scraped verbatim into the release notes. Because it just says TBD, that's the release note entry for this PR.

https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.25.md#:~:text=tbd

@MikeSpreitzer
Copy link
Member Author

MikeSpreitzer commented Jun 17, 2022

Oh, I see. Yes, that is not what I want. How can I fix that?

@brandond
Copy link

I'm not sure if changing it after the fact will effect release notes going forward, or if it'll take someone from sig-release fix?

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Jun 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants