For view change, the view changing ID is the key that determines which node
will be the next leader. In our original algorithm, the view ID always increased one step
at a time, and the view change period grew with the square of
the gap between the current view changing ID and the view ID of the last known block.
However, this converges very slowly when multiple nodes are offline and
view change can't be reached. In particular, during a shard-down event, multiple nodes
are offline while the other nodes have already advanced their current view changing IDs.
The new time-based synchronous view change algorithm uses the local timestamp and
the timestamp of the block to calculate the expected view changing ID. In this case,
offline nodes can immediately catch up with the latest view changing ID as long as
they have the latest synced block and a relatively accurate local
clock. This algorithm converges the view change within one or two view change
durations.
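
A minimal sketch of the calculation, assuming a fixed view change duration; the constant
value and the function name below are illustrative rather than the actual
harmony-one/harmony code:

```go
package viewchange

import "time"

// viewChangeDuration is an assumed, fixed length of one view change period;
// the real value is a consensus parameter.
const viewChangeDuration = 45 * time.Second

// expectedViewChangingID derives the view changing ID from the view ID and
// timestamp of the latest synced block plus the local clock. A node with the
// latest block and a reasonably accurate clock lands on the same ID as the
// other honest nodes, so it can join the ongoing view change immediately
// instead of stepping the ID up one at a time.
func expectedViewChangingID(blockViewID uint64, blockTime, now time.Time) uint64 {
	elapsed := now.Sub(blockTime)
	if elapsed <= 0 {
		return blockViewID + 1
	}
	// One additional view change per viewChangeDuration elapsed since the block.
	return blockViewID + 1 + uint64(elapsed/viewChangeDuration)
}
```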
Signed-off-by: Leo Chen <leo@harmony.one>
* [rawdb] add error handling to all rawdb write. Add fdlimit module. Fix the node stuck
* [core] switch back the batch write condition in InsertReceiptChain
* [consensus] refactored and optimized tryCatchup logic
* [sync] added consensus last mile block in sync.
* [consensus] remove time wait for consensus inform sync. Make block low chan a buffered chan
* [consensus] fix rebase errors, and optimize one line code
* [consensus][sync] fix golint error and added prune logic in sync
* [consensus] move header verify after adding FBFT log in onPrepared
* [consensus] more change on block verification logic
* [consensus] fix the verified panic issue
* [consensus][sync] add block verification in consensus last mile, change it to iterator
* [consensus] fix two nil pointer references when running local node (Still cannot find the root cause for it)
* remove coverage.txt and add to gitignore
* [consensus] add leader key check. Move quorum check logic after tryCatchup so it can spin up state sync
* [consensus] remove the leader sender check for now. Will add later
* [consensus] refactor fbftlog to get rid of unsafe mapset module. Replace with map
* [consensus] move the isQuorumAchieved logic back. We surely need to check it before adding to FBFTLog
* [consensus] remove the redundant block nil check
* [test] fix the consensus test
* [consensus] rebase main and fix stuff. Removed isSendByLeader
* [consensus] added logic to spin up sync when received message is greater than consensus block number
* [consensus] more changes in consensus. Remove some spin sync logic.
* fix error in main
* [consensus] change the hash algorithm of the FBFTLog to get rid of rlp error
* [consensus] use separate mutex in FBFT message
* [consensus] change fbft log id to a shorter form. Added unit test case
[viewchange] new view change struct
move all view change data structure to a new viewchange struct
[viewchange] wrap functions in viewchange struct
[viewchange] newView process can not be reentrant
[viewchange] additional debug
[viewchange] always add m3 message to bitmap
[viewchange] shorten the duration for view change
[viewchange] further cleanup of onNewView func
[viewchange] add validpayloadlength const
[viewchange] only add m3 message if valid m1/m2 received
[viewchange] set next leader to any key
[viewchange] print more error message in log
[viewchange] get leader pubkey from coinbase
[viewchange] rename files back for easy review
[viewchange] fix nil pointer in getNextLeader
[viewchange] fix review comments
[viewchange] squash single used function
[viewchange] code re-org
[viewchange] clean up functions
[viewchange] minor typo fix
[viewchange] code clean up and more comments
Signed-off-by: Leo Chen <leo@harmony.one>
[viewchange] encapsulate view ID in the State struct
do NOT use consensus.current directly to set/get viewID
use the following functions
consensus.SetCurViewID, consensus.SetViewChangingID
consensus.GetCurViewID, consensus.GetViewChangingID
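
A minimal sketch of the encapsulation, using the accessor names listed above; the mutex
and field names are assumptions for illustration:

```go
package consensus

import "sync"

// State keeps the view IDs behind a lock; the rest of the consensus package
// goes through these accessors instead of touching consensus.current directly.
// (The Consensus methods of the same names simply delegate to its current *State.)
type State struct {
	mu             sync.RWMutex
	curViewID      uint64 // view ID of the current consensus round
	viewChangingID uint64 // target view ID while a view change is in progress
}

func (s *State) SetCurViewID(id uint64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.curViewID = id
}

func (s *State) GetCurViewID() uint64 {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.curViewID
}

func (s *State) SetViewChangingID(id uint64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.viewChangingID = id
}

func (s *State) GetViewChangingID() uint64 {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.viewChangingID
}
```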
Signed-off-by: Leo Chen <leo@harmony.one>
Check the following issue on the detailed explanation of why this commit makes sense.
https://github.com/harmony-one/harmony/issues/3225
The onPrepare and onCommit handlers already do signature verification of the message
payload, so there is no need for a duplicated sanity check when handling Prepare/Commit
messages. We keep the signature verification on other messages, as it is not duplicated
there. This PR reduces the CPU load of the leader during consensus.
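
A rough sketch of the idea with an illustrative message type (the field names are
assumptions; the BLS calls are from the herumi-style bindings that harmony wraps): once
the BLS signature over the payload verifies, the payload is authenticated, so no second
sanity pass over the same bytes is needed.

```go
package consensus

import (
	"errors"

	"github.com/harmony-one/bls/ffi/go/bls"
)

// prepareVote is an illustrative stand-in for a decoded Prepare/Commit message.
type prepareVote struct {
	SenderPubKey *bls.PublicKey // validator's BLS public key
	BlockHash    []byte         // payload the validator signed
	Signature    *bls.Sign      // BLS signature over BlockHash
}

// handlePrepareVote verifies the signature once and then records the vote.
// A valid signature already proves the payload is authentic, so no separate
// sanity check is run on the same payload here.
func handlePrepareVote(vote *prepareVote) error {
	if !vote.Signature.VerifyHash(vote.SenderPubKey, vote.BlockHash) {
		return errors.New("invalid BLS signature on prepare payload")
	}
	// ... add the vote to the FBFT log and check whether quorum is reached.
	return nil
}
```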
Signed-off-by: Leo Chen <leo@harmony.one>
This is a big PR merged many small commits together.
We add the message validation function in libp2p layer.
In the validation function, we check the following conditions
1) the p2p message is a valid consensus message
2) the p2p message sender has a valid public key
3) the sender's public key is in the current committee
4) log the number of invalid/valid messages
After the validation, the valid messages will be forwarded to the network,
while the invalid messages will be filtered out.
The messages intended for the validator will be handled in the consensus layer.
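
A rough sketch of such a validator built on go-libp2p-pubsub's RegisterTopicValidator;
the committee interface and the parse helper are assumptions standing in for the real
harmony-one/harmony code:

```go
package p2p

import (
	"context"
	"log"

	"github.com/libp2p/go-libp2p-core/peer"
	pubsub "github.com/libp2p/go-libp2p-pubsub"
)

// committee is an illustrative membership check against the current committee.
type committee interface {
	HasKey(blsPubKey []byte) bool
}

// RegisterConsensusValidator installs a validation callback on the consensus
// topic. parse is a hypothetical decoder that returns the sender's BLS public
// key bytes, or an error for malformed consensus messages.
func RegisterConsensusValidator(
	ps *pubsub.PubSub, topic string, shard committee,
	parse func(data []byte) (senderKey []byte, err error),
) error {
	return ps.RegisterTopicValidator(topic,
		func(_ context.Context, _ peer.ID, msg *pubsub.Message) bool {
			key, err := parse(msg.GetData()) // 1) must be a valid consensus message
			if err != nil {
				log.Printf("dropping invalid consensus message: %v", err) // 4) log invalid
				return false                                              // filtered out
			}
			if !shard.HasKey(key) { // 2) + 3) valid key that is in the current committee
				log.Printf("dropping message from key outside committee")
				return false
			}
			return true // forwarded to the network; consensus layer handles the rest
		})
}
```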
Signed-off-by: Leo Chen <leo@harmony.one>
* Update all sources of block.Transactions and do the corresponding work for block staking txns
* remove usages of uncle in accessors_chain_test
* fix bug in core/blockchain.go where incorrect receipt data was generated in SetReceiptsData
* [project] Dead test
* [node] If fields always same, then just use the constant
* [project] Remove overcomplicated ctxerror
* [project] Remove more dead tests, adjust error replacing ctxerror
* [project] More dead tests, fix travis complaint on Errorf
* [node.sh] Remove is-genesis
* Clean up logs
* remove log
* Refactor block period logic; wait for 100% sig before timeout
* logic fixes
* optimize now init
* always wait for 2s for grace period
* [rpc][validator] Extend hmy blockchain validator information
* [availability] Optimize bump count
* [staking][validator][rpc] Remove validator stats rpc, fold into validator information, make existing pattern default behavior
* [slash] Reimplement SetDifference
* [reward][engine][network] Remove bad API from fall, begin setup for Per validator awards
* [header] Custom Marshal header for downstream, remove dev code
* [effective][committee] Factor out EPoS round of computation, thereby unifying EPoS in the codebase
* [unit-test] Fix semantically wrong validator unit tests, punt on maxBLS key wrt tx-pool test
* [reward] Use excellent singleflight package for caching lookup of subcommittees
* [apr][reward] Begin APR package itself, iterate on interface signatures
* [reward] Handle possible error from singleflight
* [rpc][validator][reward] Adjust RPC committees, singleflight on votingPower, foldStats into Validator Information
* [apr] Stub out computation of APR
* [effective][committee] Upgrade SlotPurchase with named fields, provide marshal
* [effective] Update Tests
* [blockchain] TODO Remove the validators no longer in committee
* [validator][effective] More expressive string representation of eligibility, ValidatorRPC explicitly says if in committee now
* [rpc] Median-stake more semantic meaningful
* [validator] Iterate on semantic meaning of JSON representation
* [offchain] Make validator stats return explicit error
* [availability] Small typo
* [rpc] Quick visual hack until we fix deleting kicked-out validators
* [offchain] Delete validator from offchain that lost their slot
* [apr] Forgot to update interface signature
* [apr] Mul instead of Div
* [protocol][validator] Fold block reward accum per validator into validator-wrapper, off-chain => on-chain
* [votepower] Refactor votepower Roster, simplify aggregation of network wide rosters
* [votepower][shard] Adjust roster, optimize usage of BLSPublicKey as key, use MarshalText trick
* [shard] Granular errors
* [votepower][validator] Unify votepower data structure with off-chain usage
* [votepower][consensus][validator] Further simplify and unify votepower with off-chain, validator stats
* [votepower] Use RJ's naming convention: group, overall
* [votepower] Remove Println, do keep enforcing order
* [effective][reward] Expand semantics of eligibility as it was overloaded and confusing, evict old voting power computations
* [apr] Adjust json field name
* [votepower] Only aggregate on external validator
* [votepower] Mistake on aggregation, custom presentation network-wide
* [rpc][validator][availability] Remove parameter, take into account empty snapshot
* [apr] Use snapshots from two and one epochs ago. Still have a question on header
* [apr] Use GetHeaderByNumber for the header needed for time stamp
* [chain] Evict > 3 epoch old voting power
* [blockchain] Leave Delete Validator snapshot as TODO
* [validator][rpc][effective] Undo changes to Protocol field, use virtual construct at RPC layer for meaning
* [project] Address PR comments
* [committee][rpc] Move +1 to computation of epos round rather than hack mutation
* [reward] Remove entire unnecessary loop, hook on AddReward. Remove unnecessary new big int
* [votepower][rpc][validator] Stick with numeric.Dec for token involved with computation, expose accumulate block-reward in RPC
* [effective][committee] Track the candidates for the EPoS auction, RPC median-stake benefits
* [node] Add hack way to get real error reason of why cannot load shardchain
* [consensus] Expand log on current issue on nil block
* [apr] Do the actual call to compute for validator's APR
* [committee] Wrap SlotOrder with validator address, manifests in median-stake RPC
* [apr] Incorrect error handle order
* [quorum] Remove incorrect compare on bls key (typo), remove redundant error check
* [shard] Add log if stakedSlots is 0
* [apr] More sanity check on div by zero, more lenient on error when we don't have historical data yet
* [committee] Remove + 1 on seat count
* [apr] Use int64() directly
* [apr] Log when odd empty nil header
* [apr] Do not crash on empty header, figure out later