Going from three nines to four nines using Kafka
Many organizations adopt a hybrid cloud architecture to get the best of both worlds: the scalability and ease of deployment of the cloud, and the security, latency, and egress benefits of local storage. Data persistence on such an architecture often follows a write-back mode, in which data is first written to local storage and then uploaded to the cloud asynchronously. In that mode, however, applications cannot rely on the cloud's availability and durability guarantees: the availability of storage is bounded by the availability SLA of the on-premise hardware, which is almost always lower than that of the cloud. By switching the order, that is, uploading to the cloud first and then hydrating on-premise storage, applications get the benefit of the cloud's availability SLA. In our case, this moved us from three nines of availability (99.9%) on local storage to four nines (99.99%). Instead of uploading in write-back mode, we duplicated the incoming stream so that it was uploaded to both the cloud and on-premise storage. For on-premise uploads that failed, we used Kafka's event processing to queue the objects that needed to be egressed from the cloud into local storage, which allowed us to hydrate local storage with objects already uploaded to the cloud. Furthermore, because local storage space is limited, we periodically purged data from local storage and created a secondary copy of that data in the cloud, again using Kafka event processing.
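The cloud-first write path with queued hydration can be illustrated with a minimal Python sketch. It is not the implementation described in the talk: `InMemoryStore`, `HybridWriter`, and `StoreUnavailable` are hypothetical names, the stores are plain dictionaries, and a `deque` stands in for the Kafka topic that would carry the hydration events in a real deployment.

```python
from collections import deque
from dataclasses import dataclass, field


class StoreUnavailable(Exception):
    """Raised when a store cannot accept a write (hypothetical error type)."""


@dataclass
class InMemoryStore:
    """Stand-in for an object store, either the cloud or on-premise storage."""
    objects: dict = field(default_factory=dict)
    available: bool = True

    def put(self, key: str, data: bytes) -> None:
        if not self.available:
            raise StoreUnavailable(key)
        self.objects[key] = data

    def get(self, key: str) -> bytes:
        return self.objects[key]


@dataclass
class HybridWriter:
    cloud: InMemoryStore
    local: InMemoryStore
    # Assumption: this deque plays the role of a Kafka topic of
    # "hydrate this key into local storage" events.
    hydration_topic: deque = field(default_factory=deque)

    def upload(self, key: str, data: bytes) -> None:
        # The cloud write decides whether the upload succeeds, so the
        # caller sees the cloud's availability, not the local store's.
        self.cloud.put(key, data)
        try:
            self.local.put(key, data)
        except StoreUnavailable:
            # Local write failed: queue the key so it can be copied
            # out of the cloud later.
            self.hydration_topic.append(key)

    def drain_hydration_events(self) -> int:
        """Process each queued event once; return how many objects were hydrated."""
        hydrated = 0
        for _ in range(len(self.hydration_topic)):
            key = self.hydration_topic.popleft()
            try:
                self.local.put(key, self.cloud.get(key))
                hydrated += 1
            except StoreUnavailable:
                # Still down: put the event back for a later pass.
                self.hydration_topic.append(key)
        return hydrated
```

In this sketch an upload succeeds whenever the cloud write succeeds, even while local storage is down; calling `drain_hydration_events()` after local storage recovers copies the missed objects from the cloud.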
Tejas Chopra is a Senior Software Engineer on the Data Storage Platform team at Netflix, where he is responsible for architecting storage solutions to support Netflix Studios and the Netflix Streaming Platform. Prior to Netflix, Tejas designed and implemented the storage infrastructure at Box, Inc. to support a cloud content management platform that scales to petabytes of storage and millions of users. Tejas has worked on distributed file systems and backend architectures, in both on-premise and cloud environments, at several startups over his career. Tejas is an international keynote speaker who periodically conducts seminars on software development and cloud computing. He has a master's degree in Electrical & Computer Engineering from Carnegie Mellon University, with a specialization in Computer Systems.
Senior Software Engineer, Netflix