chore: reduce op-queue verbosity (#4092)

### Description

The GCP bill has been shooting up, most likely because of the
increased number of undeliverable messages being added to and removed
from the queue, which causes lots of `debug` level logs to be emitted
on `push` and `pop_many`.

This PR changes the verbosity to `trace` - we'll have lower visibility
over the lifetime of a message, but there should be sufficient logs
remaining to tell us what's going on (such as [operation
prepared](30630c4fa6/rust/agents/relayer/src/msg/op_submitter.rs (L240)),
[confirming a pre-submitted
one](30630c4fa6/rust/agents/relayer/src/msg/op_submitter.rs (L260)),
[submitting single
op](30630c4fa6/rust/agents/relayer/src/msg/op_submitter.rs (L313)),
[submitting
batch](30630c4fa6/rust/agents/relayer/src/msg/op_submitter.rs (L456)),
and [op
confirmed](30630c4fa6/rust/agents/relayer/src/msg/op_submitter.rs (L386))).

@paulbalaji suggested measuring log counts, so here they are:
- in the last 6h, we had 42M `hyperlane` relayer logs, of which 22M
were `push` logs and 1.1M were `pop_many` (no `pop`s)
- if we make the `push` logs `trace`, 5.5% of the remaining 20M logs
would be `pop_many`
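As a sanity check on the 5.5% figure, here is a minimal sketch of the arithmetic. The function names are hypothetical (they are not part of the relayer codebase), and the inputs are the rounded 6h counts quoted above, so the result is only as precise as those counts.

```rust
/// Percentage of `part` within `total`, or `None` when `total` is zero.
/// Hypothetical helper, used only to illustrate the arithmetic.
fn percent_of(part: u64, total: u64) -> Option<f64> {
    if total == 0 {
        None
    } else {
        Some(part as f64 / total as f64 * 100.0)
    }
}

/// Share of `pop_many` logs among the logs that remain once `push` logs
/// are no longer emitted. Returns `None` if `push` exceeds `total` or if
/// nothing remains.
fn pop_many_share_after_push_removed(total: u64, push: u64, pop_many: u64) -> Option<f64> {
    let remaining = total.checked_sub(push)?;
    percent_of(pop_many, remaining)
}

fn main() {
    // Rounded counts from the last 6h, as quoted in the description.
    let total: u64 = 42_000_000;
    let push: u64 = 22_000_000;
    let pop_many: u64 = 1_100_000;

    if let Some(share) = percent_of(push, total) {
        println!("push share of all logs: {:.1}%", share);
    }
    if let Some(share) = pop_many_share_after_push_removed(total, push, pop_many) {
        println!("pop_many share once push is trace: {:.1}%", share);
    }
}
```

The second figure divides by the 20M logs left after removing `push`, not by the original 42M, which is why it comes out higher than the 1.1M / 42M share `pop_many` has today.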


### Drive-by changes

<!--
Are there any minor or drive-by changes also included?
-->

### Related issues

<!--
- Fixes #[issue number here]
-->

### Backward compatibility

<!--
Are these changes backward compatible? Are there any infrastructure
implications, e.g. changes that would prohibit deploying older commits
using this infra tooling?

Yes/No
-->

### Testing

<!--
What kind of testing have these changes undergone?

None/Manual/Unit Tests
-->
pull/3136/merge
Daniel Savu 5 months ago committed by GitHub
parent f8b33d1270
commit d90d8f86b4
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
rust/agents/relayer/src/msg/op_queue.rs (4 lines changed)

@@ -25,7 +25,7 @@ impl OpQueue {
     /// - `op`: the operation to push onto the queue
     /// - `new_status`: optional new status to set for the operation. When an operation is added to a queue,
     /// it's very likely that its status has just changed, so this forces the caller to consider the new status
-    #[instrument(skip(self), ret, fields(queue_label=%self.queue_metrics_label), level = "debug")]
+    #[instrument(skip(self), ret, fields(queue_label=%self.queue_metrics_label), level = "trace")]
     pub async fn push(&self, mut op: QueueOperation, new_status: Option<PendingOperationStatus>) {
         if let Some(new_status) = new_status {
             op.set_status(new_status);
@@ -37,7 +37,7 @@ impl OpQueue {
     }

     /// Pop an element from the queue and update metrics
-    #[instrument(skip(self), ret, fields(queue_label=%self.queue_metrics_label), level = "debug")]
+    #[instrument(skip(self), ret, fields(queue_label=%self.queue_metrics_label), level = "trace")]
     pub async fn pop(&mut self) -> Option<QueueOperation> {
         let pop_attempt = self.pop_many(1).await;
         pop_attempt.into_iter().next()
