fn-serverless

alihan/fn-serverless

mirror of https://github.com/fnproject/fn.git synced 2022-10-28 21:29:17 +03:00

Commit Graph

Author

SHA1

Message

Date

jan grant

edf2fc8831

Add a finer-grained view for placer latency metrics (#1085)

This is a small tweak to the placer latency stats. If we have a cluster of values
around the 1-2s mark, then having a single relatively broad bucket that captures
the (1s, 10s] range will obscure that. In particular, typical Prometheus quartile
estimates may be distorted by this bucket size.

2018-06-25 10:36:46 +01:00

Tolga Ceylan

f24172aa9d

fn: introducing lb placer basic metrics (#1058)

* fn: introducing lb placer basic metrics

This change adds basic metrics to the naive and consistent
hash LB placers. The stats show how many times we scanned
the full runner list, whether the runner pool failed to
return a runner list, and whether it returned an empty list.

Placed and not-placed outcomes are also tracked, along with
whether TryExec returned an error. The most common error
code, Too-Busy, is specifically tracked.

If the client cancels or times out, this is also tracked as
a client cancel metric.

For placer latency, we would like to know how much time
the placer spent searching for a runner until it
successfully placed a call. This includes round-trip
times for NACK responses from the runners until a successful
TryExec() call. By excluding the latency of the last,
successful TryExec(), we try to exclude function execution
and runner container startup time from this metric in an
attempt to isolate placer-only latency.

* fn: latency and attempt tracker

Removing the full-scan metric. Tracking the number of
runners attempted is a better metric for this
purpose.

Also, if rp.Runners() fails, this is an unrecoverable
error and we should bail out instead of retrying.

* fn: typo fix, ch placer finalize err return

* fn: enable LB placer metrics in WithAgentFromEnv if prometheus is enabled

2018-06-12 13:36:05 -07:00

2 Commits