Using machine learning for intelligent shard sizing on the cloud

Authors

  • Narayanan Venkateswaran
  • Anurag Shekhar
  • Suvamoy Changder
  • Narayan C. Debnath

DOI:

https://doi.org/10.21533/pen.v7.i1.1460

Abstract

Sharding implementations typically rely on conservative approximations to
determine the number of cloud instances required and the size of the shards
stored on each of them. These approximations are often inaccurate and lead to
overloaded deployments that need reactive refinement. Reactive refinement, in
turn, demands additional resources from an already overloaded system and is
therefore counterproductive.
This paper proposes an algorithm that eliminates the need for conservative
approximations and reduces the need for reactive refinement. A machine
learning algorithm based on multiple linear regression is used to predict the
latency of requests for a given application deployed on a cloud machine. The
predicted latency is then used to decide whether the capacity of the cloud
machine will satisfy the service level agreement (SLA) required for effective
operation of the application. Applying the proposed method to a popular
database schema on the cloud resulted in highly accurate predictions. The
results of the deployment, and of the tests performed to establish this
accuracy, are presented in detail in support of these claims.
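
The approach described in the abstract (fit a multiple linear regression on
observed request latencies, then compare the predicted latency with the SLA)
can be illustrated with a minimal sketch. This is not the authors'
implementation. The two features (shard size in GB and requests per second),
the synthetic training rows, and the function names fit_linear_regression,
predict_latency and meets_sla are all assumptions made for the example; the
abstract does not state which predictors or data the paper uses.

```python
from typing import List, Sequence


def fit_linear_regression(X: Sequence[Sequence[float]],
                          y: Sequence[float]) -> List[float]:
    """Ordinary least squares with an intercept, solved through the
    normal equations (X^T X) b = X^T y by Gauss-Jordan elimination."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept term
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict_latency(coeffs: Sequence[float],
                    features: Sequence[float]) -> float:
    """Predicted latency = intercept + sum(coefficient * feature)."""
    return coeffs[0] + sum(c * f for c, f in zip(coeffs[1:], features))


def meets_sla(coeffs: Sequence[float], features: Sequence[float],
              sla_ms: float) -> bool:
    """True if the predicted latency is within the SLA latency bound."""
    return predict_latency(coeffs, features) <= sla_ms


# Synthetic, illustrative measurements (NOT data from the paper):
# (shard size in GB, requests per second) -> observed latency in ms.
TRAIN_X = [(10, 100), (20, 100), (10, 200), (40, 300), (30, 50)]
TRAIN_Y = [35.0, 55.0, 45.0, 115.0, 70.0]

COEFFS = fit_linear_regression(TRAIN_X, TRAIN_Y)
```

A candidate configuration can then be checked before deployment, for example
meets_sla(COEFFS, (25, 150), sla_ms=100.0). In a real deployment the model
would be fitted on measured latencies, and the prediction error would need to
be accounted for before relying on the decision.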

Published

2019-06-01

Section

Articles

How to Cite

Venkateswaran, N., Shekhar, A., Changder, S., & Debnath, N. C. (2019). Using machine learning for intelligent shard sizing on the cloud. Periodicals of Engineering and Natural Sciences, 7(1). https://doi.org/10.21533/pen.v7.i1.1460