Efficient read monotonic data aggregation across shards on the cloud
Abstract
The idea of Monotonic reads over multiple copies of the data and for lightly loaded systems is intuitive and easy to implement. For example, ensuring that a client session always fetches data from the same server automatically ensures that the user will never view old data.
However, such a simplistic setup will not work for large deployments on the cloud, where the data is sharded across multiple high availability setups and there are several million clients accessing data at the same time. In such a setup it becomes necessary to ensure that the data fetched from multiple shards are logically consistent with each other. The use of trivial implementations, like sticky sessions, causes severe performance degradation during peak loads.
This paper explores the challenges surrounding consistent monotonic reads over a sharded setup on the cloud and proposes an efficient architecture for the same. Performance of the proposed architecture is measured by implementing it on a cloud setup and measuring the response times for different shard counts. We show that the proposed solution scales with almost no change in performance as the number of shards increases.
Keywords
Full Text:
PDFReferences
P. S. Almeida, C. Baquero, and V. Fonte. “Version stamps-decentralized version vectors”. In: Proceedings
nd International Conference on Distributed Computing Systems. 2002, pp. 544–551.
D. Barbara and H. Garcia-Molina. “The case for controlled inconsistency in replicated data”. In: [1990]
Proceedings. Workshop on the Management of Replicated Data. Nov. 1990, pp. 35–38.
Mike Burrows. “The Chubby Lock Service for Loosely-coupled Distributed Systems”. In: Proceedings of
the 7th Symposium on Operating Systems Design and Implementation. OSDI ’06. Seattle, Washington:
USENIX Association, 2006, pp. 335–350.
Cassandra Architecture.
https://docs.datastax.com/en/cassandra/2.0/cassandra/architecture/architectureIntro_c.html.
Accessed:2016-02-16.
Cloud At Cost. https://www.cloudatcost.com/. Accessed: 2018-07-29.
W. Golab et al. “Client-Centric Benchmarking of Eventual Consistency for Cloud Storage Systems”. In:
IEEE 34th International Conference on Distributed Computing Systems. June 2014, pp. 493–502.
Deniz Hastorun et al. “Dynamo: amazon’s highly available key-value store”. In: In Proc. SOSP. 2007, pp.
–220.
Patrick Hunt et al. “ZooKeeper: Wait-free Coordination for Internet-scale Systems”. In: Proceedings of the
USENIX Conference on USENIX Annual Technical Conference. USENIXATC’10. Boston, MA:
USENIX Association, 2010, pp. 11–11.
BeomHeyn Kim, Sukwon Oh, and David Lie. “Consistency Oracles: Towards an Interactive and Flexible
Consistency Model Specification”. In: Proceedings of the 16th Workshop on Hot Topics in Operating
Systems. HotOS ’17. Whistler, BC, Canada: ACM, 2017, pp. 82–87.
Y. Liu, Y. Wang, and Y. Jin. “Research on the improvement of MongoDB Auto-Sharding in cloud
environment”. In: 2012 7th International Conference on Computer Science Education (ICCSE). July
, pp. 851–854.
C. Mohan, B. Lindsay, and R. Obermarck. “Transaction Management in the R* Distributed Database
Management System”. In: ACM Trans. Database Syst. 11.4 (Dec. 1986), pp. 378–396.
MySQL Employee Sample Database. https://dev.mysql.com/doc/employee/en/sakila-structure.html.
Accessed: June, 2018.
MySQL Version Tokens. https://dev.mysql.com/doc/refman/5.7/en/version-tokens.html. Accessed: 2018-
-29.
D. S. Parker et al. “Detection of Mutual Inconsistency in Distributed Systems”. In: IEEE Trans. Softw.
Eng. 9.3 (May 1983), pp. 240–247.
Yoav Raz. “The Dynamic Two Phase Commitment (D2PC) protocol”. In: Database Theory — ICDT ’95.
Ed. by Georg Gottlob and Moshe Y. Vardi. Berlin, Heidelberg: Springer Berlin Heidelberg, 1995, pp.
–176.
Qun Ren, M. H. Dunham, and V. Kumar. “Semantic caching and query processing”. In: IEEE
Transactions on Knowledge and Data Engineering 15.1 (Jan. 2003), pp. 192–210.
Riak Architecture. http://docs.basho.com/riak/latest/ops/mdc/v3/architecture/. Accessed: 2016-02-16.
L. Saino, I. Psaras, and G. Pavlou. “Understanding sharded caching systems”. In: IEEE INFOCOM 2016
- The 35th Annual IEEE International Conference on Computer Communications. Apr. 2016, pp. 1–9.
Rashed Salem, Safa’a S. Saleh, and Hatem Abd elkader. “Scalable Data-Oriented Replication with
Flexible Consistency in Real-Time Data Systems”. In: 15 (Mar. 2016).
Charles Bell; Mats Kindahl; Lars Thalmann. MySQL High Availability, 2nd Edition. O’Reilly Media,
Inc., Apr. 2014.
Z. Yang, K. Ma, and J. Zhong. “Toward a Semantic Cache Supporting Version-Based Consistency”. In:
10th International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS).
July 2016, pp. 367–372.
Chi Zhang and Zheng Zhang. “Trading replication consistency for performance and availability: an
adaptive approach”. In: 23rd International Conference on Distributed Computing Systems, 2003.
Proceedings. May 2003, pp. 687–695.
Irene Zhang et al. “Building Consistent Transactions with Inconsistent Replication”. In: Proceedings of
the 25th Symposium on Operating Systems Principles. SOSP ’15. Monterey, California: ACM, 2015, pp.
–278.
Yuqing Zhu and Jianmin Wang. “Client-centric Consistency Formalization and Verification for System
with Large-scale Distributed Data Storage”. In: Future Gener. Comput. Syst. 26.8 (Oct. 2010), pp. 1180–
DOI: http://dx.doi.org/10.21533/pen.v7i1.333
Refbacks
- There are currently no refbacks.
Copyright (c) 2019 Narayanan Venkateswaran, Anurag Shekhar, Suvamoy Changder, Rajib Kar, Narayan C Debnath

This work is licensed under a Creative Commons Attribution 4.0 International License.
ISSN: 2303-4521
Digital Object Identifier DOI: 10.21533/pen
This work is licensed under a Creative Commons Attribution 4.0 International License