We are growing our list of metrics by the day, and some of them we don't
always need. So we should provide a way to trim them.
* If metrics are not enabled we supply a no-op metrics
* If only select metric categories are enabled we supply real metrics
for the selected metrics and no-op metrics for all the rest
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
* Instantiate individual metrics where they're used
* Cache prometheus metrics to allow "duplicate" creation of metrics
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
Two changes:
* Stream chains now take up less vertical lines, only breaking on
stream operations.
* Long annotations that span multiple lines no longer have a dangling
parentesis and indent 4 spaces.
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
* Metrics Service configuration incorrect
--metrics-push-enabled is not eabling metrics push and --metrics-enabled
is enabling push instead of polling.
* formatting
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
* Don't overload `--metrics-host` and `--metrics-port` for push mode, instead add '--metrics-push-enabled' , `--metrics-push-host` and `--metrics-push-port`, rename `--metrics-prometheus-job` to `--metrics-push-prometheus-job` and remove `--metrics-mode`
* Show an error if both metrics and metrics-push are enabled.
* Allow shutdown if we cannot communicate with the Push Gateway.
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
Sometimes metrics are hard to poll (docker containers with varying ip
addresses). Because of that the push gateway exists. This extends the
metrics system to support push or pull mode for metrics (but not both
at the same time).
Three new flags
`--metrics-mode=`<`push`|`pull`> - Whether we are in pull mode (the default) where
prometheus is expected to poll or push mode where pantheon pushes to
a push gateway.
`--metrics-push-interval=`<_integer_> the frequency, in seconds, between pushes to
the push gateway. Only relevant in push mode
`--metrics-prometheus-job=`<_string_> The name of the job to report in the push gateway
Also, `--metrics-host=` and `--metrics-port=` gain new meaning in push mode. Instead of the
server they are opening up it is the host and the port of the push gateway it should push to.
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
A service like the JSON-RPC service is opened up, only serving /metrics
requests in a file format for prometheus.
New CLI flags are --metrics-enabled and --metrics-listen, just like the
--rpc and --ws variants of the same.
--host-whitelist is respected the same as the JSON-RPC endpoint.
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
* Time all tasks
This is fairly high touch consisting of 3 things:
* Moving to Prometheus's Summary for timers
* Timing at .2, .5, .8, .9, .99, and 1.0 (1.0 actually gets max I believe)
* Timing all abstract EthTasks
* The bulk of the changes: plumbing the timing context everywhere we need it
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
* Plumb in three more metrics
* add blockchain_height gauge
* add blockchain_difficulty_total gauge
* add blockchain_announcedBlock_ingest histogram
This involved some deep pluming such that the metrics system needs to be
created in the PantheonCommand, along with trickle down effects into other
consensus engines. This is likely where it should live anyway.
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
Metrics being captured initially:
Total number of peers ever connected to
Total number of peers disconnected, by disconnect reason and whether the disconnect was initiated locally or remotely.
Current number of peers
Timing for processing JSON-RPC requests, broken down by method name.
Generic JVM and process metrics (memory used, heap size, thread count, time spent in GC, file descriptors opened, CPU time etc).
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>