* Fast sync should traverse the world state depth first
1. The pending requests queue in the world state downloader is now a different data structure. We now use a list of priority queues.
2. The node requests now know about the parent request that was responsible for spawning them.
3. When a node has data available and is asked to persist before its children are persisted, the node will not do anything. Instead, it will wait for all its children to persist first and will persist together with the last child.
Storing children before parents gives us the following benefits:
* We don't need to store pending requests on disk any more.
* When restarting download from a new pivot, we do not need to walk and check the whole tree any more.
And the following drawbacks:
* We now have pending nodes in memory for which we already downloaded data, but we do not store them in the database yet.
Overall expectations on performance:
We still need to download every single state node at least once, so there is no improvement there. We will save a significant amount of time in case we change pivots. And we save lots of read/writes on filesystem because tasks are not needed to be written to disk any more.
We want to avoid having too many pending unsaved nodes in memory, not to run out of it. If we were always handling only one request to our peers at the same time, we would not need to be worried, and we would just use a simple depth first search. Because we batch our requests, we might produce too many pending unprocessed nodes in memory if we are not careful about the order of processing requests. That is where the priority on node request comes from. We want to always process nodes lower in the tree before nodes higher in the tree, and preferably we want to first process children from the same parent so that we can save the current unsaved parent as soon as possible.
At the moment, I still left in the code several artefacts that I use for debugging the behaviour. I am planning to get rid of most of these counters, feel free to point them out in the review. There is for instance a weird counter in the NodeDataRequest class that I am using to monitor the total amount of unsaved nodes. In case pending unsaved node count rises too high, there is a warning printed into the logs. At the moment of writing, I would expect the counter to stay below 10 000 generally and not rise above 20 000 nodes. If you saw the number rise to for instance 100 000 that would signify a bug.
Similarly, because of the order of processing of the nodes, we do not need to store huge number of requests on the disk any more and the whole list fits comfortably into the memory. Without batching, we would not have more than a thousand requests around waiting. Because of the batching, we can see the number of requests occasionally rises all the way up to 300 000, but usually should be under 200 000.
Note that at any time there should not be more pending unsaved blocks than pending requests. Such a situation would be a bug to be reported.
Signed-off-by: Jiri Peinlich <jiri.peinlich@gmail.com>
* Addressing review comments
Signed-off-by: Jiri Peinlich <jiri.peinlich@gmail.com>
* Fixed failing test
Signed-off-by: Jiri Peinlich <jiri.peinlich@gmail.com>
* Improving test coverage
Signed-off-by: Jiri Peinlich <jiri.peinlich@gmail.com>
* Addressing review comments
Signed-off-by: Jiri Peinlich <jiri.peinlich@gmail.com>
* lots of errorprone fixes
* some license updates
* some mockito updates
* upgrade the rocksdb version
* Prometheus left at 0.9.0 as 0.10.0+ introduces OpenMetrics
related changes that break unit tests.
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
* Upgrade to Apache Tuweni 2.0
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* Remove intermediate repository
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* Remove all occurrences of toBytes
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* Migrate to tuweni-bytes
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* add changelog
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* correct reference tests
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* Initial API changes
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* more changes
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* Change APIs for VM ops
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* Use constant UInt256.ONE
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* Optimize a bit address <> word transformation
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* spotless
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
Remove all but 4 log4j2.xml config files
* The main config for the besu CLI app
* The config for the evmTool CLI app
* The config for acceptance tests
* A config in testUtil
If any tests depend on a log4j file they should import testutil.
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
* Add tracing support for internals and JSON-RPC
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* Remove rocksdb tracing as it slows down execution too much
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* Add B3 headers extraction on JSON-RPC requests
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* Remove traces around trie tree as they slow down syncing significantly
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* Add tracing to fast sync pipeline
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* Add tracing for all pipelines
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* Address code review
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* Add acceptance tests and break out the shaded dependency
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* Fix tracer id
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* Revert changes to trie
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* Upgrade otel to latest, remove old tracing
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
* Code review comments
Signed-off-by: Antoine Toulme <antoine@lunar-ocean.com>
Adding the Log4j "jul" (java.util.logging) adapter resulted in many
messages like this at startup:
`main INFO Registered Log4j as the java.util.logging.LogManager.`
These come from the Log4j status logger. We can get rid of those by
setting the status attribute on all configurations to a higher logging
level. WARN is the next higher level.
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
This is a not-fully-functional prototype of Bonsai Tries.
Bonsai tries is a flat leaf storage, branch-by-location, and diff based reorgs
refactoring of the existing forest based trie storage mechanism aimed at
creating sustainable performance at mainnet loads.
* Since it is experimental a feature flag of --Xdata-storage-format=BONSAI
controls activation.
Some required changes have a long reach:
* To accommodate location based storage many Trie operations accept both a
location and hash value. Each data storage format is keyed off of only
one of the fields, so many tests will pass in null to the other field.
* MutableWorldStateUpdater.persist now takes an argument of a block hash.
If this is a natural progression of blocks the hash of the new block is
passed in. Otherwise null should be passed in.
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Add `memory` as an option for `--key-value-storage`. This is useful in
small network synchronization tests as memory is faster and is easier to
inspect via a debugger.
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Upgrade to ErrorProne 2.4.0
* public constructors on abstract classes are removed
* Javadoc must have meaningfull documentation
* lambdas should not be variables
* Added to the list of confusing inner class names (Entry and Type)
* no assert keyword in tests
* Obsolete JDK classes produce errors now
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
* add doomed key check (busy-waiting for now)
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* optional and logging
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* remove logging
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* sleeping and hardening
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* rename segments
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* move away from atomic references to regular vars
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* remove hardened segment parameter
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* increase sleep
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* spotless
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* remove unnecessary interface
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* rename
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* move commit waiting outside of timer
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* set default lock timeout to 1ms
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* add default lock timeout to tests
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* Revert "rename segments"
This reverts commit 184eefaaa0ccc857b0caff2b382f8338ff225d5d.
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* fix jmh compilation error
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* add documentation
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* bump up sleep to 1ms
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* (POC) Add lock to ensure that we don't prune while comitting
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* remove unnecessary persist (#569)
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* flesh out @mbaxter's idea and remove my code
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* iterator changes
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* hybridize with doomed key
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* comment
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* move doomed key unset to after node added listener
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* update instead of getting and setting doomedKeyRef in commit
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* comment
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* invert condition
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* remove locks
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* remove `removeAllKeysUnless`
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* more remove removeAllKeysUnless
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* reuse streamKeys
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* remove test
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* set default lock timeout to 1ms
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* add default lock timeout to tests
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* fix jmh compilation error
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* revert back to locks instead of doomedkey
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* change delete to not guarantee deletion
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* plugin hash
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* javadoc
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* Revert "change delete to not guarantee deletion"
This reverts commit 2289bb34cf.
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* skip key deletion on timeout
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* clear in rollback
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* Revert "fix jmh compilation error"
This reverts commit b64ecf8656.
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* Revert "add default lock timeout to tests"
This reverts commit aff6aa6065.
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* Revert "set default lock timeout to 1ms"
This reverts commit 267fe0a642.
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* use noSlowDown write option instead of global timeout
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* add back tests
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* close tryDeleteOptions
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* remove unnecessary lock
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* move increment inside try
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* use StorageException subclass instead of field
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* revert accidental deletion in javadoc
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* tryDelete javadoc
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* add trace for skipping key deletion
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* merge catch and finally try blocks
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* switch from exception to boolean return value
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* tweak
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* changelog changes
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* add api back
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* add back throws javadoc
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
Co-authored-by: Meredith Baxter <meredith.baxter@consensys.net>
Co-authored-by: MadelineMurray <43356962+MadelineMurray@users.noreply.github.com>
Removes as many Gradle 7.0 compatibility issues as possible
* `baseName` -> `archiveBaseName`
* `extension` -> `archiveExtension`
* `destinationDir` -> `destinationDirectory`
* `runtime` -> `runtimeOnly`
* Change some log4j-api and log4j-core dependencies
* Remove an unneeded and outdated plugin (`net.ltgt.apt`)
* tweak the plugin-api change detector's property annotations.
Warnings still exist with one external plugin used for license file
checking that we do not control the source code for.
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Update dependencies to most current version
- except picocli which is a major version update
Alphabetize dependencies
Signed-off-by: Danno Ferrin <danno.ferrin@gmail.com>
Adds integration tests to catch bug where the mark storage was being cleared before a mark began. Instead, the mark storage is now cleared upon preparing the mark sweep pruner
Fixes bug where the pending marks where not being checked while pruning was occurring. By removing the flush in sweepBefore, the existing tests catch the bug.
Signed-off-by: Ratan Rai Sur <ratan.r.sur@gmail.com>
* adding in spdx-license-identifier & updated check for the same; removing license check from spotless
Signed-off-by: Joshua Fernandes <joshua.fernandes@consensys.net>
* Change CheckSpdxHeader to a task.
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
* Ensure `plugin-api` module gets published at the correct maven path
* Move `plugins` to `plugin-api`
Signed-off-by: Edward Evans <edward.evans@consensys.net>
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
* Return plugin-api to the main repo
* Spotless
* Migrate all external plugin-api references to the project in this repo
* Add licence header
* Update repo reference for publish, even if commented
* Use real configuration for publishing plugin-api
This was tested with the
`:plugins:publishMavenJavaPublicationToMavenLocal` task and checking the
local Maven repo to make sure it was using the correct paths
Signed-off-by: Edward Evans <edward.evans@consensys.net>
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
* Rich Data for Events Plugin
* plugin-api -> api
* use BinaryData as a root and add getValue to UInt256Value
* bring rick data changes in line with proposed APIs.
Undo orthagonal naming changes.
* add size
* use log and transaction interfaces from rich data api
* update to released plugin api version and add new event listener.
* add tests
* fix acceptance tests
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
* sweep state roots before child nodes
* Adds long removeAccountStateTrieNode(key) method to WorldStateStorage.Updater
* remove assertj assertions from `AbstractKeyValueStorageTest`
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
Some auto-generated maven names cause collissions when used with the gradle
publish plugin, namely `core` and `util`. Rename two jars that have such
a colission so the colission doesn't occur.
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
* Database Versioning: The behavior is to load the database at the existing version if it
already exists or create the newest version if it doesn't
* multi-column by default: This makes the separated world state storage column required by mark sweep on by default
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
Windows has some slightly different Java NIO semantics, regarding size
after writing and whether files are deleted on close (they aren not).
The first issue is we shouldn't be using Integer.SIZE when we mean
Integer.BYTES.
The second issue is we cannot count on these work files showing up or
being deleted from the file system consistently across platforms. The
ordering is consistent within platforms but not across. The test was
re-written to check the read and write file numbers instead.
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
* Mark Sweep Pruner
* add `unload` method on Node interface, which is a noop everywhere but on `StoredNode`
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
* [PAN-2853] Create a metric tracking DB size
- create a Prometheus long gauge to track the database size
- `rocksdb.live-sst-files-size` : 872a261ffc/include/rocksdb/db.h (L685)
- use `rocksdb.live-sst-files-size` property rather than `rocksdb.total-sst-files-size` (performance purpose: see warning 872a261ffc/include/rocksdb/db.h (L680))
- add tests to check creation of metrics
* fix wildcard import
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
* Columnated storage to allow for iteration over world state
* change MetricsCategory to PantheonMetricsCategory
* consistency renaming of kvstores
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
Replaces the RocksDB based queue for pending world state download tasks with one that uses a simple file. Added tasks are appended to the file while the reader starts from the beginning of the file and reads forwards.
Periodically a new file is started to limit the disk space used. The reader deletes files it has completed reading.
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
Default values are currently unchanged but increasing both these values to 4 appears to improve sync performance for both ropsten and Mainnet fast syncs.
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>
Introduce a pipeline based full sync process. Currently toggled off but can be enabled via a --X option.
Signed-off-by: Adrian Sutton <adrian.sutton@consensys.net>