Title: A Tiered Transactional Distributed Storage System
School of Computer Science
College of Computing
Georgia Institute of Technology
Date: Monday, Dec 11, 2017
Time: 3:00pm - 5:00pm EST
Location: KACB 2100
Dr. Umakishore Ramachandran (Advisor, School of Computer Science, Georgia Institute of Technology)
Dr. Sudipta Sengupta (Amazon.com, Inc.)
Dr. Moinuddin Qureshi (School of Electrical and Computer Engineering, Georgia Institute of Technology)
Dr. Santosh Pande (School of Computer Science, Georgia Institute of Technology)
Dr. Myoungsoo Jung (School of Integrated Technology, Yonsei University)
In recent years, researchers have proposed various in-memory distributed storage systems that promise extremely low latency and high throughput while supporting convenient semantics and ease of programming. Although the plummeting cost of DRAM and advances in technology have accelerated the trend of using memory as the primary data store, “data beyond memory capacity,” driven by continuing data growth and recent industry digitization, reintroduces a long-recurring theme: designing a cost-effective storage system.
Adding cheaper flash storage is the solution for accommodating such data volumes. The new challenge, however, is how to preserve the performance and semantics already achieved with DRAM as the primary data store, rather than how to improve overall storage performance by adding DRAM as a cache.
To address this challenge, we designed and implemented a tiered transactional distributed storage system (T2). The dissertation is composed of two parts.
1) We explore several issues in current memory-based distributed storage systems in the context of the modern datacenter. We then architect a tiered distributed storage system that provides performance comparable to that of a pure in-memory design when most data fits in memory, while preserving transaction semantics, availability, and ease of programming.
2) We revisit several techniques for achieving performance from the flash tier and design the flash-tier system to perform efficient transactions and recovery. The resulting performance is comparable to that of the pure in-memory system under optimal conditions and degrades gracefully as the workload deviates from them.