Adaptive elasticity in scalable fog applications

The PrEstoCloud project aims to provide a platform which not only allows a one-time deployment of edge and cloud components offering a business service, but also assures that an adequate number of instances of the application components is running. Not too many – so as not to waste resources – nor too few, which would leave the application unable to function correctly.

To attain this goal in PrEstoCloud, scalability decisions are made by the DevOps with the aid of the Resources Adaptation Recommender (RARecom) component. The DevOps initially creates scalability rules which map the operational boundaries of the application. Each scalability rule consists of thresholds for a variety of metrics which, when surpassed, indicate that a scaling action should be implemented – a ‘situation’ in the terminology of PrEstoCloud. At the heart of the adaptation mechanism is the Situation Detection Mechanism (SDM) component, which interprets and enforces the scalability rules of the DevOps at runtime. The RARecom then acts on the situations detected by the SDM, recommending the actual adaptation which should be enacted by the platform.

Due to the latest developments in the SDM and RARecom components, adaptation decisions now benefit from additional knowledge, gathered from the raw scalability rules created by the DevOps at the initial phase of deployment. Furthermore, this knowledge is supplemented by information available at runtime. Thus, it is now possible to make different scaling decisions when the same scalability rule has been triggered – also taking into account past situations and the actual values of metrics over the thresholds.

To illustrate, let us consider that the DevOps has added the following scalability rule:

If avg(CPUcluster) > 70% and avg(RAMcluster) > 70% with Timewindow = 10 min then scale_out
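To make the rule concrete, here is a minimal Python sketch of how such a rule could be evaluated over a time window. The function and variable names are illustrative assumptions, not part of the PrEstoCloud APIs; metric samples are taken to be (CPU%, RAM%) pairs collected from the cluster during the window.

```python
from statistics import mean

# Illustrative thresholds from the example rule (assumed names, not
# PrEstoCloud identifiers).
CPU_THRESHOLD = 70.0
RAM_THRESHOLD = 70.0

def rule_triggered(samples):
    """Return True if the window averages of both metrics exceed
    their thresholds, i.e. the rule detects a 'situation'."""
    avg_cpu = mean(cpu for cpu, _ in samples)
    avg_ram = mean(ram for _, ram in samples)
    return avg_cpu > CPU_THRESHOLD and avg_ram > RAM_THRESHOLD

# A 10-minute window where both averages exceed 70%.
window = [(85.0, 90.0), (75.0, 80.0), (95.0, 88.0)]
print(rule_triggered(window))  # True: avg CPU = 85%, avg RAM = 86%
```

When both averages stay below a threshold, no situation is raised and no adaptation is recommended.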

PrEstoCloud now differentiates between a situation where the average values of the two metrics were 71% and one where the average values were 100%. The means to achieve this distinction is the notion of the severity of a situation. Severity is formally defined as

severity = ‖V_violating‖ / √n = √((v1² + v2² + … + vn²) / n)

where the vi are the individual monitoring attribute (metric) values comprising the particular situation, V_violating is the vector comprised of the vi's, and n is the number of vi's. Severity is used to segment the space of all possible situations into a number of areas (or zones), each containing an equal share of the situation space and grouping situations of similar severity. Situations belonging to the same zone result in the same adaptation action, and changes to the topology become more radical for situations with a higher severity value.
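The severity definition above can be sketched in a few lines of Python. The zone banding here is a simplification: it splits the severity range above the threshold into equal-width bands, standing in for the platform's equal-share segmentation of the situation space; the 0.70 lower bound follows from the example rule's 70% thresholds, and all names are assumptions for illustration.

```python
import math

def severity(violating):
    """Severity of a situation: the Euclidean norm of the violating
    metric values (as fractions in [0, 1]) divided by sqrt(n)."""
    n = len(violating)
    return math.sqrt(sum(v * v for v in violating)) / math.sqrt(n)

def zone(sev, sev_min=0.70, zones=3):
    """Map a severity value to one of `zones` bands between the
    threshold severity and 1.0 (equal-width simplification)."""
    frac = (sev - sev_min) / (1.0 - sev_min)
    return min(int(frac * zones) + 1, zones)

# Averages just above the thresholds fall in the mildest zone...
print(zone(severity([0.71, 0.71])))  # 1
# ...while fully saturated metrics land in the most severe zone.
print(zone(severity([1.00, 1.00])))  # 3
```

With this definition, a situation where both averages are 71% is clearly separated from one where both are 100%, even though the same rule fired in both cases.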

The division of the situation space for our example rule – in three zones – is depicted below. The x-axis indicates the average CPU percentage used by the cluster while the y-axis represents the average RAM percentage used by the cluster.


Figure 1: The severity zones in the case of our simple rule, and the situations (points) belonging to them. Zone 1 situations have the smallest severity values and provoke mild topology adaptations (e.g. add/remove 1 instance), while the most severe Zone 3 situations necessitate aggressive adaptations (e.g. add/remove 3 instances).

Information on the detected situation and its zone is sent by the SDM to RARecom. Using this information, RARecom is capable of producing different adaptation actions for situations of the same type. For example, a situation in zone 1 could lead to the addition of one instance, while a situation in zone 3 could add three instances. This capability is complemented by the feedback mechanism, which can dynamically modify its recommendation in response to the detected load by factoring in previous recommendations and assessing the current state of the topology. In the above example, if many situations occur in zone 1, the recommended number of instances can change to two, instead of the one instance initially specified.
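The interplay of zone-based actions and feedback can be illustrated with a small Python sketch. The zone-to-action mapping and the escalation threshold are illustrative assumptions, not the actual RARecom policy: the idea shown is simply that repeated situations in the same zone suggest the base action is too mild.

```python
from collections import Counter

# Assumed zone -> instances-to-add mapping for a scale_out action.
BASE_ACTION = {1: 1, 2: 2, 3: 3}
# Assumed number of recurrences before the recommendation is escalated.
ESCALATE_AFTER = 3

class Recommender:
    """Toy recommender: base action per zone, plus a feedback rule
    that adds one instance once a zone keeps recurring."""

    def __init__(self):
        self.history = Counter()

    def recommend(self, zone):
        """Return the number of instances to add for a situation in `zone`."""
        self.history[zone] += 1
        step = BASE_ACTION[zone]
        if self.history[zone] > ESCALATE_AFTER:
            # Feedback: the base action did not resolve the load,
            # so recommend one extra instance.
            step += 1
        return step

rec = Recommender()
print([rec.recommend(1) for _ in range(4)])  # [1, 1, 1, 2]
```

The real feedback mechanism also weighs the current topology state; this sketch only captures the escalation-on-recurrence aspect described above.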

Both RARecom and SDM will enable adaptation decisions in the use-case deployments of PrEstoCloud.