Your Tasks


In this homework, you will build a distributed key-value store with two operations: GET and PUT. Data will be replicated across multiple follower servers to ensure data integrity, and a single leader server will coordinate actions across these follower servers. All nodes in this key-value store will utilize the Two-Phase Commit protocol.

Multiple clients (users) will communicate with a single leader server according to the given Key-Value gRPC API. The leader will forward all GET requests to a random follower. For PUT requests, the leader should follow the Two-Phase Commit (2PC) protocol to ensure that (1) operations are performed atomically across multiple follower servers, and (2) backup data is consistent across multiple follower servers.

The leader and the followers communicate over a bi-directional gRPC stream. The leader sends a Leader message, which the follower processes and answers with a Response message.
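To make the PUT path described above concrete, here is a minimal sketch of the two phases from the leader's point of view. It is not the staff solution and does not use the real gRPC stream: the Follower interface, the Vote type, the memFollower type, and the put function are all hypothetical names invented for this illustration, and each follower is an in-memory stand-in.

```go
package main

import "fmt"

// Vote is a follower's answer to a prepare request (hypothetical type).
type Vote int

const (
	VoteCommit Vote = iota
	VoteAbort
)

// Follower is a hypothetical stand-in for the gRPC stream to one follower.
type Follower interface {
	Prepare(key, value string) Vote
	Commit()
	Abort()
}

// memFollower is an in-memory follower used only for this illustration.
type memFollower struct {
	data       map[string]string
	pendingKey string
	pendingVal string
	hasPending bool
	refuse     bool // when true, the follower votes to abort
}

func newMemFollower(refuse bool) *memFollower {
	return &memFollower{data: map[string]string{}, refuse: refuse}
}

// Prepare stages the write and votes; nothing is visible to GET yet.
func (f *memFollower) Prepare(key, value string) Vote {
	if f.refuse {
		return VoteAbort
	}
	f.pendingKey, f.pendingVal, f.hasPending = key, value, true
	return VoteCommit
}

// Commit applies the staged write, if there is one.
func (f *memFollower) Commit() {
	if f.hasPending {
		f.data[f.pendingKey] = f.pendingVal
		f.hasPending = false
	}
}

// Abort discards the staged write.
func (f *memFollower) Abort() { f.hasPending = false }

// put runs both phases and reports whether the PUT was committed.
func put(followers []Follower, key, value string) bool {
	// Phase 1: collect a vote from every follower.
	commit := true
	for _, f := range followers {
		if f.Prepare(key, value) != VoteCommit {
			commit = false
		}
	}
	// Phase 2: send the same global decision to every follower.
	for _, f := range followers {
		if commit {
			f.Commit()
		} else {
			f.Abort()
		}
	}
	return commit
}

func main() {
	a, b := newMemFollower(false), newMemFollower(false)
	fmt.Println(put([]Follower{a, b}, "k", "v1"), a.data["k"], b.data["k"])

	// One follower refuses, so the old value must survive everywhere.
	c := newMemFollower(true)
	fmt.Println(put([]Follower{a, c}, "k", "v2"), a.data["k"])
}
```

The key property is that a single abort vote prevents the write on every follower, which is what gives the atomicity and consistency guarantees listed above. A real leader would also have to handle timeouts and retransmission, which this sketch leaves out.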

The staff solution makes the following changes:

tpc/tpcfollower.go | 55 +++++++++++++++++++++++++++++++++++++++++++++++-
tpc/tpcleader.go | 55 +++++++++++++++++++++++++++++++++++++++++++++---
4 files changed, 141 insertions(+), 4 deletions(-)

Getting Familiar

You may find it helpful to read through some of the pre-existing code before you start implementing. Using Important Existing Code as a guide, we suggest that you read through:

  • KVStore and Journal

  • TPC Leader and TPC Follower

  • MessageManager

  • tpc.proto (pkg/tpc/rpc) and kv.proto (api)

TODO

Here is a high-level list of things to do for this homework, in the recommended order of completion. Detailed specifications follow in the next pages.

  • Start on Follower TPC Handling (don't worry about journaling for persistence yet). The follower is a state machine that receives messages from the Leader and updates its internal state based on the messages it receives. After this step you should be passing TPC Follower Commit and TPC Follower Abort tests on the autograder.

  • Now work on the Leader TPC handling (again, don't worry about persistence just yet). The leader responds to API requests and issues TPC commands to the Follower. Once you implement this, you should be passing TPC Leader Commit and TPC Leader Abort along with the E2E tests.

  • Implement the logic for how the follower state machine responds to retransmitted messages. After this you should be able to pass the retransmit tests.

  • Think about where Journal entries need to be written and how the replay function should work for both the Follower and the Leader. We recommend you write to the journal upon receiving every message in the Follower and right before sending out every leader message in the Leader. At the end of this, you should be able to pass all TPC tests.
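The follower-side steps above (state machine, retransmission handling, journaling, and replay) can be sketched together as follows. This is a simplified illustration under stated assumptions, not the expected implementation: the Msg, State, and Follower types, the response strings "VOTE_COMMIT" and "ACK", and the Recover function are hypothetical, the journal is an in-memory slice rather than the provided Journal, and the follower always votes to commit.

```go
package main

import "fmt"

// MsgType enumerates the leader messages assumed by this sketch.
type MsgType int

const (
	Prepare MsgType = iota
	Commit
	Abort
)

// Msg is a hypothetical leader message; the real one is defined in tpc.proto.
type Msg struct {
	ID    int // transaction identifier (assumed to be unique per PUT)
	Type  MsgType
	Key   string
	Value string
}

// State is the follower's position in the protocol.
type State int

const (
	Idle  State = iota // no transaction in flight
	Ready              // voted to commit, waiting for the global decision
)

// Follower holds the state machine, the data, and a stand-in journal.
type Follower struct {
	state   State
	pending Msg
	store   map[string]string
	journal []Msg // in-memory stand-in for the persistent journal
}

func NewFollower() *Follower {
	return &Follower{store: map[string]string{}}
}

// Handle journals every incoming message first, then applies it.
func (f *Follower) Handle(m Msg) string {
	f.journal = append(f.journal, m)
	return f.apply(m)
}

// apply advances the state machine. It is idempotent, so a retransmitted
// message produces the same response without changing the data twice.
func (f *Follower) apply(m Msg) string {
	if m.Type == Prepare {
		f.state, f.pending = Ready, m
		return "VOTE_COMMIT"
	}
	// Commit or Abort: act only if it matches the transaction in flight.
	if f.state == Ready && f.pending.ID == m.ID {
		if m.Type == Commit {
			f.store[f.pending.Key] = f.pending.Value
		}
		f.state = Idle
	}
	return "ACK" // a retransmitted decision is acknowledged again
}

// Recover rebuilds a follower by replaying journaled messages in order.
func Recover(journal []Msg) *Follower {
	f := NewFollower()
	for _, m := range journal {
		f.apply(m) // replay without journaling the entry a second time
	}
	f.journal = append(f.journal, journal...)
	return f
}

func main() {
	f := NewFollower()
	fmt.Println(f.Handle(Msg{ID: 1, Type: Prepare, Key: "k", Value: "v"}))
	fmt.Println(f.Handle(Msg{ID: 1, Type: Commit}))
	fmt.Println(f.Handle(Msg{ID: 1, Type: Commit})) // retransmitted commit
	fmt.Println(f.store["k"])

	// Simulate a crash after voting on transaction 2, then recover.
	f.Handle(Msg{ID: 2, Type: Prepare, Key: "k", Value: "w"})
	r := Recover(f.journal)
	fmt.Println(r.state == Ready, r.store["k"])
}
```

Because the journal entry is written before the message is applied, replaying the journal after a crash brings the follower back to the state it was in when it last responded: here the recovered follower is still waiting for the decision on transaction 2 and still serves the value committed by transaction 1.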