For view change, the view change ID is the key to determine which node
will be next leader. In our original algorithm, the view ID always increased one step
at a time. The view change period increased exponetially based on the square of
the gap between current view changing ID and previous known block view change ID.
However, it is very slow to converge the view change if multiple nodes are offline and
view change can't be reached. Especially, during the shard down event, multiple nodes
are offline and other nodes have advanced their current view changing ID.
The new time-baed synchronuous view change algorithm uses the local timestamp and
the timestamp of the block to calculate the expected view changing ID. In this case,
offline nodes can immediately catch up with the latest view changing ID as long as
the offline nodes have the latest sync'ed block and relatively acturate local
clock. This algorithm will converge the view change faster in one or two view change
duration.
Signed-off-by: Leo Chen <leo@harmony.one>
* [rawdb] add error handling to all rawdb write. Add fdlimit module. Fix the node stuck
* [core] switch back the batch write condition in InsertReceiptChain
* [rawdb] add error handling to all rawdb write. Add fdlimit module. Fix the node stuck
* [consensus] refactored and optimized tryCatchup logic
* [sync] added consensus last mile block in sync.
* [consensus] remove time wait for consensus inform sync. Make block low chan a buffered chan
* [consensus] fix rebase errors, and optimize one line code
* [consensus][sync] fix golint error and added prune logic in sync
* [consensus] move header verify after adding FBFT log in onPrepared
* [consensus] more change on block verification logic
* [consensus] fix the verified panic issue
* [consensus][sync] add block verification in consensus last mile, change it to iterator
* [consensus] fix two nil pointer references when running local node (Still cannot find the root cause for it)
* remove coverage.txt and add to gitignore
* [consensus] add leader key check. Move quorum check logic after tryCatchup and can spin state sync
* [consensus] remove the leader sender check for now. Will add later
* [consensus] refactor fbftlog to get rid of unsafe mapset module. Replace with map
* [consensus] move the isQuorumAchived logic back. We surely need to check it before add to FBFTlog
* [consensus] remove the redundant block nil check
* [test] fix the consensus test
* [consensus] rebase main and fix stuff. Removed isSendByLeader
* [consensus] added logic to spin up sync when received message is greater than consensus block number
* [consensus] more changes in consensus. Remove some spin sync logic.
* fix error in main
* [consensus] change the hash algorithm of the FBFTLog to get rid of rlp error
* [consensus] use seperate mutex in FBFT message
* [consensus] change fbft log id to a shorter form. Added unit test case
[viewchange] new view change struct
move all view change data structure to a new viewchange struct
[viewchange] wrap functions in viewchange struct
[viewchange] newView process can not be reentrant
[viewchange] additional debug
[viewchange] always add m3 message to bitmap
[viewchange] shorten the duration for view change
[viewchange] further cleanup of onNewView func
[viewchange] add validpayloadlength const
[viewchange] only add m3 message if valid m1/m2 received
[viewchange] set next leader to any key
[viewchange] print more error message in log
[viewchange] get leader pubkey from coinbase
[viewchange] rename files back for easy review
[viewchange] fix nil pointer in getNextLeader
[viewchange] fix review comments
[viewchange] squash single used function
[viewchange] code re-org
[viewchange] clean up functions
[viewchange] minor typo fix
[viewchange] code clean up and more comments
Signed-off-by: Leo Chen <leo@harmony.one>
[viewchange] encapsulate view ID in the State struct
do NOT ues consensus.current directory to set/get viewID
use the following functions
consensus.SetCurViewID, consensus.SetViewChangingID
consensus.GetCurViewID, consensus.GetViewChangingID
Signed-off-by: Leo Chen <leo@harmony.one>
Check the following issue on the detailed explanation of why this commit makes sense.
https://github.com/harmony-one/harmony/issues/3225
The onPrepare and onCommit will do signature verification of the message payload.
So, there is no need to do duplicated sanity check when handling Prepare/Commit messages.
We reserve the signature verification on other messages, as they are not duplicated.
This PR can reduce the CPU load of leader during consensus.
Signed-off-by: Leo Chen <leo@harmony.one>