Efficient read monotonic data aggregation across shards on the cloud

Narayanan Venkateswaran, Anurag Shekhar, Suvamoy Changder, Rajib Kar, Narayan C Debnath

Abstract


Client-centric consistency models define the view of the data storage expected by a client in relation to the operations done by a client within a session. Monotonic reads is a client-centric consistency model which ensures that if a process has seen a particular value for the object, any subsequent accesses will never return any previous values. Monotonic reads are used in several applications like news feeds and social networks to ensure that the user always has a forward moving view of the data.
The idea of Monotonic reads over multiple copies of the data and for lightly loaded systems is intuitive and easy to implement. For example, ensuring that a client session always fetches data from the same server automatically ensures that the user will never view old data.
However, such a simplistic setup will not work for large deployments on the cloud, where the data is sharded across multiple high availability setups and there are several million clients accessing data at the same time. In such a setup it becomes necessary to ensure that the data fetched from multiple shards are logically consistent with each other. The use of trivial implementations, like sticky sessions, causes severe performance degradation during peak loads.
This paper explores the challenges surrounding consistent monotonic reads over a sharded setup on the cloud and proposes an efficient architecture for the same. Performance of the proposed architecture is measured by implementing it on a cloud setup and measuring the response times for different shard counts. We show that the proposed solution scales with almost no change in performance as the number of shards increases.

Keywords


Horizontal Scalability, Sharding, High Availability, Cloud, Consistency, Client-centric Consistency, Monotonic Reads, Version Vectors, Vector Clocks

Full Text:

PDF

References


P. S. Almeida, C. Baquero, and V. Fonte. “Version stamps-decentralized version vectors”. In: Proceedings

nd International Conference on Distributed Computing Systems. 2002, pp. 544–551.

D. Barbara and H. Garcia-Molina. “The case for controlled inconsistency in replicated data”. In: [1990]

Proceedings. Workshop on the Management of Replicated Data. Nov. 1990, pp. 35–38.

Mike Burrows. “The Chubby Lock Service for Loosely-coupled Distributed Systems”. In: Proceedings of

the 7th Symposium on Operating Systems Design and Implementation. OSDI ’06. Seattle, Washington:

USENIX Association, 2006, pp. 335–350.

Cassandra Architecture.

https://docs.datastax.com/en/cassandra/2.0/cassandra/architecture/architectureIntro_c.html.

Accessed:2016-02-16.

Cloud At Cost. https://www.cloudatcost.com/. Accessed: 2018-07-29.

W. Golab et al. “Client-Centric Benchmarking of Eventual Consistency for Cloud Storage Systems”. In:

IEEE 34th International Conference on Distributed Computing Systems. June 2014, pp. 493–502.

Deniz Hastorun et al. “Dynamo: amazon’s highly available key-value store”. In: In Proc. SOSP. 2007, pp.

–220.

Patrick Hunt et al. “ZooKeeper: Wait-free Coordination for Internet-scale Systems”. In: Proceedings of the

USENIX Conference on USENIX Annual Technical Conference. USENIXATC’10. Boston, MA:

USENIX Association, 2010, pp. 11–11.

BeomHeyn Kim, Sukwon Oh, and David Lie. “Consistency Oracles: Towards an Interactive and Flexible

Consistency Model Specification”. In: Proceedings of the 16th Workshop on Hot Topics in Operating

Systems. HotOS ’17. Whistler, BC, Canada: ACM, 2017, pp. 82–87.

Y. Liu, Y. Wang, and Y. Jin. “Research on the improvement of MongoDB Auto-Sharding in cloud

environment”. In: 2012 7th International Conference on Computer Science Education (ICCSE). July

, pp. 851–854.

C. Mohan, B. Lindsay, and R. Obermarck. “Transaction Management in the R* Distributed Database

Management System”. In: ACM Trans. Database Syst. 11.4 (Dec. 1986), pp. 378–396.

MySQL Employee Sample Database. https://dev.mysql.com/doc/employee/en/sakila-structure.html.

Accessed: June, 2018.

MySQL Version Tokens. https://dev.mysql.com/doc/refman/5.7/en/version-tokens.html. Accessed: 2018-

-29.

D. S. Parker et al. “Detection of Mutual Inconsistency in Distributed Systems”. In: IEEE Trans. Softw.

Eng. 9.3 (May 1983), pp. 240–247.

Yoav Raz. “The Dynamic Two Phase Commitment (D2PC) protocol”. In: Database Theory — ICDT ’95.

Ed. by Georg Gottlob and Moshe Y. Vardi. Berlin, Heidelberg: Springer Berlin Heidelberg, 1995, pp.

–176.

Qun Ren, M. H. Dunham, and V. Kumar. “Semantic caching and query processing”. In: IEEE

Transactions on Knowledge and Data Engineering 15.1 (Jan. 2003), pp. 192–210.

Riak Architecture. http://docs.basho.com/riak/latest/ops/mdc/v3/architecture/. Accessed: 2016-02-16.

L. Saino, I. Psaras, and G. Pavlou. “Understanding sharded caching systems”. In: IEEE INFOCOM 2016

- The 35th Annual IEEE International Conference on Computer Communications. Apr. 2016, pp. 1–9.

Rashed Salem, Safa’a S. Saleh, and Hatem Abd elkader. “Scalable Data-Oriented Replication with

Flexible Consistency in Real-Time Data Systems”. In: 15 (Mar. 2016).

Charles Bell; Mats Kindahl; Lars Thalmann. MySQL High Availability, 2nd Edition. O’Reilly Media,

Inc., Apr. 2014.

Z. Yang, K. Ma, and J. Zhong. “Toward a Semantic Cache Supporting Version-Based Consistency”. In:

10th International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS).

July 2016, pp. 367–372.

Chi Zhang and Zheng Zhang. “Trading replication consistency for performance and availability: an

adaptive approach”. In: 23rd International Conference on Distributed Computing Systems, 2003.

Proceedings. May 2003, pp. 687–695.

Irene Zhang et al. “Building Consistent Transactions with Inconsistent Replication”. In: Proceedings of

the 25th Symposium on Operating Systems Principles. SOSP ’15. Monterey, California: ACM, 2015, pp.

–278.

Yuqing Zhu and Jianmin Wang. “Client-centric Consistency Formalization and Verification for System

with Large-scale Distributed Data Storage”. In: Future Gener. Comput. Syst. 26.8 (Oct. 2010), pp. 1180–




DOI: http://dx.doi.org/10.21533/pen.v7i1.333

Refbacks

  • There are currently no refbacks.


Copyright (c) 2019 Narayanan Venkateswaran, Anurag Shekhar, Suvamoy Changder, Rajib Kar, Narayan C Debnath

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

ISSN: 2303-4521

Digital Object Identifier DOI: 10.21533/pen

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License