feat: Relayer heap profiling (#4180)
### Description

During the scalability audit we identified relayer memory usage as a cause for concern.

This PR adds a conditionally compiled heap memory profiler ([dhat-rs](https://docs.rs/dhat/latest/dhat/)), which is only included when both the `memory-profiling` feature is enabled and the `memory-profiling` cargo profile is used. The profiler runs each time e2e is run and writes a report called `dhat-heap.json` to the current directory. That file can be visualised with a dhat viewer, such as the one hosted by the dhat author [here](https://nnethercote.github.io/dh_view/dh_view.html).

The report can order the profiling results by total lifetime allocated bytes, peak allocated bytes, total allocation count, etc. Below is the result of sorting by total lifetime allocated bytes, which reveals that the biggest memory hogs are the hook indexing channel [senders](https://github.com/hyperlane-xyz/hyperlane-monorepo/blob/main/rust/hyperlane-base/src/contract_sync/mod.rs#L52) (one per origin chain), taking up 100+ MB. The channel size can most likely be reduced from 1M to 1k.

![Screenshot 2024-07-23 at 14 32 21](https://github.com/user-attachments/assets/f551ebfe-3ee0-40de-b53f-e5de1bc56dfa)

### Drive-by changes

Adds the number of IDs remaining in the channel as a field to the hook indexing log, to know whether it is safe to lower the channel size.

### Related issues

- Fixes https://github.com/hyperlane-xyz/issues/issues/1293

### Backward compatibility

Yes, since everything is conditionally compiled / isolated in a new cargo profile.

### Testing

Manual

dan/relayer-images-bump
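For illustration, the feature half of the gating described above could look roughly like the sketch below. It is not the code from this PR: the helper names `run_with_optional_profiling` and `profiling_enabled` are hypothetical, it assumes `dhat` is declared as an optional dependency switched on by the `memory-profiling` feature, and it does not cover the cargo profile half of the condition.

```rust
// Assumption: `dhat` is an optional dependency enabled by `memory-profiling`.
// dhat needs its allocator installed to record heap activity.
#[cfg(feature = "memory-profiling")]
#[global_allocator]
static ALLOC: dhat::Alloc = dhat::Alloc;

/// Hypothetical helper: runs `work` under the heap profiler when the
/// feature is enabled.
#[cfg(feature = "memory-profiling")]
fn run_with_optional_profiling<T>(work: impl FnOnce() -> T) -> T {
    // The profiler writes `dhat-heap.json` when this guard is dropped.
    let _profiler = dhat::Profiler::new_heap();
    work()
}

/// Same helper with the feature off: no profiler, no dhat symbols referenced.
#[cfg(not(feature = "memory-profiling"))]
fn run_with_optional_profiling<T>(work: impl FnOnce() -> T) -> T {
    work()
}

/// Reports at runtime which variant was compiled in.
fn profiling_enabled() -> bool {
    cfg!(feature = "memory-profiling")
}
```

With the feature off, the dhat-specific items are compiled out entirely, so a default build carries no profiling overhead.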
parent ed63e04c4a
commit 6140e6f397
@@ -0,0 +1,48 @@
use dhat::Profiler;
use eyre::Error as Report;

#[global_allocator]
static ALLOC: dhat::Alloc = dhat::Alloc;

fn initialize() -> Option<Profiler> {
    let profiler = Profiler::new_heap();
    Some(profiler)
}

fn termination_handler(profiler_singleton: &mut Option<Profiler>) {
    // only call drop on the profiler once
    if let Some(profiler) = profiler_singleton.take() {
        drop(profiler);
    }
}

pub(crate) async fn run_future<F, T>(fut: F) -> Result<T, Report>
where
    F: std::future::Future<Output = Result<T, Report>>,
    T: Default,
{
    let mut profiler_singleton = initialize();

    let (shutdown_tx, shutdown_rx) = tokio::sync::oneshot::channel::<()>();
    let mut shutdown_tx_singleton = Some(shutdown_tx);

    ctrlc::set_handler(move || {
        termination_handler(&mut profiler_singleton);

        // only send the shutdown signal once
        let Some(shutdown_tx) = shutdown_tx_singleton.take() else {
            return;
        };
        if let Err(_) = shutdown_tx.send(()) {
            eprintln!("failed to send shutdown signal");
        }
    })
    .expect("Error setting termination handler");

    // this `select!` isn't cancellation-safe if `fut` owns state
    // but for profiling scenarios this is not a risk
    tokio::select! {
        res = fut => { return res; },
        _ = shutdown_rx => { return Ok(T::default()) },
    };
}
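
The "only once" behaviour of `termination_handler` comes from `Option::take`, which leaves `None` behind after the first call. A minimal std-only sketch of that pattern follows. It makes several assumptions: `FakeProfiler` is a hypothetical stand-in for `dhat::Profiler`, `std::sync::mpsc` replaces the tokio oneshot channel so the example needs no external crates, and `finish_once` and `reports_after` are illustrative names that do not appear in this commit.

```rust
use std::sync::mpsc::{channel, Sender};

/// Hypothetical stand-in for `dhat::Profiler`, which writes its report on drop.
struct FakeProfiler {
    dropped: Sender<&'static str>,
}

impl Drop for FakeProfiler {
    fn drop(&mut self) {
        // Ignore send errors: the receiver may already be gone.
        let _ = self.dropped.send("report written");
    }
}

/// Drops the guard in `slot` on the first call; later calls are no-ops.
/// Returns whether this call was the one that dropped it.
fn finish_once<G>(slot: &mut Option<G>) -> bool {
    match slot.take() {
        Some(guard) => {
            drop(guard);
            true
        }
        None => false,
    }
}

/// Invokes the handler `calls` times and counts the reports produced.
fn reports_after(calls: usize) -> usize {
    let (tx, rx) = channel();
    let mut slot = Some(FakeProfiler { dropped: tx });
    for _ in 0..calls {
        finish_once(&mut slot);
    }
    // Release any guard still held so the receiver's iterator terminates.
    drop(slot);
    rx.iter().count()
}
```

Calling the handler repeatedly, as can happen when Ctrl-C is pressed more than once, still produces a single report, because every call after the first finds `None` in the slot.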