Azure instance memory hog

Azure instance memory hog disrupts the state of infrastructure resources.

  • This fault induces stress on the Azure Instance using the Azure Run command.
  • The Run command executes bash scripts that are built into the fault.
  • The bash script consumes excess memory on the Azure instance for a specified duration.

Usage

This fault determines the resilience of an Azure instance when its memory is unexpectedly consumed in excess. It helps determine how Azure scales memory to keep the application running under heavy resource consumption. It simulates memory leaks in deployed microservices, application slowness due to memory starvation, and noisy neighbour problems caused by memory hogging. It also verifies the pod priority and QoS settings used for eviction, and application restarts on OOM kills.

Prerequisites

  • Kubernetes >= 1.17
  • Azure Run Command agent is installed and running in the target Azure instance.
  • Use Azure file-based authentication to connect to the instance using the Azure Go SDK. To generate the auth file, run the Azure CLI command az ad sp create-for-rbac --sdk-auth > azure.auth.
  • Create a Kubernetes secret that has the auth file created in the previous step in the CHAOS_NAMESPACE. Below is a sample secret file:
apiVersion: v1
kind: Secret
metadata:
  name: cloud-secret
type: Opaque
stringData:
  azure.auth: |-
    {
      "clientId": "XXXXXXXXX",
      "clientSecret": "XXXXXXXXX",
      "subscriptionId": "XXXXXXXXX",
      "tenantId": "XXXXXXXXX",
      "activeDirectoryEndpointUrl": "XXXXXXXXX",
      "resourceManagerEndpointUrl": "XXXXXXXXX",
      "activeDirectoryGraphResourceId": "XXXXXXXXX",
      "sqlManagementEndpointUrl": "XXXXXXXXX",
      "galleryEndpointUrl": "XXXXXXXXX",
      "managementEndpointUrl": "XXXXXXXXX"
    }
  • If you change the secret key name (from azure.auth), ensure that you update the AZURE_AUTH_LOCATION environment variable in the chaos experiment with the new name.
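
The link between the secret key and AZURE_AUTH_LOCATION can be sketched as follows. This is an illustrative fragment under stated assumptions, not a definitive configuration: the key name custom-azure.auth is a hypothetical replacement for azure.auth, only four of the auth file fields are shown, and the instance and resource group names are placeholders.

```yaml
# Secret whose key was renamed from azure.auth (hypothetical key name)
apiVersion: v1
kind: Secret
metadata:
  name: cloud-secret
type: Opaque
stringData:
  # paste the full contents of the generated auth file under this key
  custom-azure.auth: |-
    {
      "clientId": "XXXXXXXXX",
      "clientSecret": "XXXXXXXXX",
      "subscriptionId": "XXXXXXXXX",
      "tenantId": "XXXXXXXXX"
    }
---
# ChaosEngine that points the fault at the renamed key
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-memory-hog
    spec:
      components:
        env:
        # must match the key under stringData in the secret above
        - name: AZURE_AUTH_LOCATION
          value: 'custom-azure.auth'
        # placeholder instance name
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1'
        # placeholder resource group
        - name: RESOURCE_GROUP
          value: 'rg-azure'
```

If the key is left as azure.auth, the AZURE_AUTH_LOCATION entry can be omitted, since that is its default.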

Default validations

Azure instance should be in a healthy state.

Fault tunables

Mandatory Fields

Variable | Description | Notes
--- | --- | ---
AZURE_INSTANCE_NAMES | Names of the target Azure instances. | Multiple values can be provided as a comma-separated string. For example, instance-1,instance-2.
RESOURCE_GROUP | Name of the Azure resource group that contains the target instances. | All the instances must be from the same resource group.

Optional Fields

Variable | Description | Notes
--- | --- | ---
TOTAL_CHAOS_DURATION | Total duration for which chaos is injected into the target resource (in seconds). | Defaults to 30s.
CHAOS_INTERVAL | Time interval between two successive chaos iterations (in seconds). | Defaults to 60s.
AZURE_AUTH_LOCATION | Name of the Azure secret credentials file. | Defaults to azure.auth.
SCALE_SET | Specifies whether the instance is part of a Scale Set. | Defaults to disable. Supports enable as well.
MEMORY_CONSUMPTION | Amount of memory to be consumed in the Azure instance (in megabytes). | Defaults to 500 MB.
MEMORY_PERCENTAGE | Amount of memory to be consumed in the Azure instance (in percentage). | Defaults to 0.
NUMBER_OF_WORKERS | Number of workers used to run the stress process. | Defaults to 1.
SEQUENCE | Sequence of chaos execution for multiple target instances. | Defaults to parallel. Supports serial sequence as well.
RAMP_TIME | Period to wait before and after injecting chaos (in seconds). | For example, 30s.
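
As an illustration of the optional fields, the fragment below sketches how an instance that belongs to a Scale Set might be targeted by setting SCALE_SET to enable. The instance name, resource group, and memory value are placeholder assumptions. The exact naming of scale set instances is not covered here, so adjust AZURE_INSTANCE_NAMES to match your environment.

```yaml
# sketch: target an instance that is part of a Scale Set
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-memory-hog
    spec:
      components:
        env:
        # mark the target as a Scale Set member (defaults to disable)
        - name: SCALE_SET
          value: 'enable'
        # memory to consume, in MB (illustrative value)
        - name: MEMORY_CONSUMPTION
          value: '1024'
        # placeholder instance name
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1'
        # placeholder resource group
        - name: RESOURCE_GROUP
          value: 'rg-azure'
```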

Fault examples

Common fault tunables

Refer to the common attributes to tune the common tunables for all the faults.
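
For instance, a run with a longer chaos duration and a ramp period could be sketched as below. The values are illustrative assumptions, given in seconds as described for TOTAL_CHAOS_DURATION and RAMP_TIME in the optional fields. The instance and resource group names are placeholders.

```yaml
# sketch: tune chaos duration and ramp time
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-memory-hog
    spec:
      components:
        env:
        # inject chaos for 120 seconds (default is 30)
        - name: TOTAL_CHAOS_DURATION
          value: '120'
        # wait 30 seconds before and after injecting chaos
        - name: RAMP_TIME
          value: '30'
        # placeholder instance name
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1'
        # placeholder resource group
        - name: RESOURCE_GROUP
          value: 'rg-azure'
```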

Memory consumption in megabytes

It defines the memory to be utilised (in MB) on the Azure instance. You can tune it using the MEMORY_CONSUMPTION environment variable.

Use the following example to tune it:

# memory in MB to utilize
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-memory-hog
    spec:
      components:
        env:
        - name: MEMORY_CONSUMPTION
          value: '1024'
        # name of the Azure instance
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1'
        # resource group for the Azure instance
        - name: RESOURCE_GROUP
          value: 'rg-azure'

Memory consumption by percentage

It defines the memory to be utilised (in percentage) on the Azure instance. You can tune it using the MEMORY_PERCENTAGE environment variable.

Use the following example to tune it:

# memory percentage to utilize
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-memory-hog
    spec:
      components:
        env:
        - name: MEMORY_PERCENTAGE
          value: '50'
        # name of the Azure instance
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1'
        # resource group for the Azure instance
        - name: RESOURCE_GROUP
          value: 'rg-azure'

Multiple Azure instances

Multiple Azure instances can be targeted in a single chaos run. You can tune it using the AZURE_INSTANCE_NAMES environment variable.

Use the following example to tune it:

# multiple instance targets
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-memory-hog
    spec:
      components:
        env:
        # names of the Azure instances
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1,instance-2'
        # resource group for the Azure instances
        - name: RESOURCE_GROUP
          value: 'rg-azure'
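
When several instances are listed, the SEQUENCE variable controls whether they are targeted in parallel (the default) or one after another. The following is a minimal sketch, assuming the same placeholder instance and resource group names, that switches to serial execution:

```yaml
# sketch: target multiple instances one after another
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-memory-hog
    spec:
      components:
        env:
        # run chaos on the instances serially (defaults to parallel)
        - name: SEQUENCE
          value: 'serial'
        # placeholder instance names, comma-separated
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1,instance-2'
        # placeholder resource group shared by all instances
        - name: RESOURCE_GROUP
          value: 'rg-azure'
```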

Multiple workers

It defines the number of workers (CPU threads) that run to spike memory utilisation. Increasing the number of workers makes memory consumption grow faster. You can tune it using the NUMBER_OF_WORKERS environment variable.

Use the following example to tune this:

# multiple workers to utilize resources
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-memory-hog
    spec:
      components:
        env:
        - name: NUMBER_OF_WORKERS
          value: '3'
        # name of the Azure instance
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1'
        # resource group for the Azure instance
        - name: RESOURCE_GROUP
          value: 'rg-azure'