DynamoDB replication pause

The DynamoDB replication pause fault pauses data replication between the replicas of a DynamoDB table in multiple locations for the chaos duration.

  • While the chaos experiment runs, changes to the DynamoDB table are not replicated to other regions, so the data in the replicas becomes inconsistent.
  • You can execute this fault only on a global DynamoDB table, that is, a table that has more than one replica.
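
The global-table requirement above can be checked before you run the fault. The following is a minimal sketch, not part of the fault itself: it assumes the table uses the current global tables version, for which the DynamoDB DescribeTable response lists replicas under Table.Replicas. The helper names (replica_regions, is_global_table, describe) are hypothetical.

```python
from typing import Any, Dict, List


def replica_regions(description: Dict[str, Any]) -> List[str]:
    """Return the replica regions listed in a DescribeTable response."""
    replicas = description.get("Table", {}).get("Replicas", [])
    return [r["RegionName"] for r in replicas if "RegionName" in r]


def is_global_table(description: Dict[str, Any]) -> bool:
    """Assumption: a table counts as global when at least one replica is listed."""
    return len(replica_regions(description)) >= 1


def describe(table_name: str, region: str) -> Dict[str, Any]:
    """Fetch the table description with boto3 (requires AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3

    client = boto3.client("dynamodb", region_name=region)
    return client.describe_table(TableName=table_name)
```

For example, is_global_table(describe("name-1", "us-east-1")) would report whether the table is a candidate for this fault. The call needs the dynamodb:DescribeTable permission.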

Use cases

DynamoDB replication pause helps determine the resilience of an application when replication of frequently updated data in a database is disrupted.

Prerequisites

  • Kubernetes >= 1.17
  • DynamoDB should be up and running.
  • Each target table should be a global table.
  • A Kubernetes secret that contains the AWS access configuration (key) should exist in the CHAOS_NAMESPACE. A sample secret file looks like:
    apiVersion: v1
    kind: Secret
    metadata:
      name: cloud-secret
    type: Opaque
    stringData:
      cloud_config.yml: |-
        # Add the cloud AWS credentials respectively
        [default]
        aws_access_key_id = XXXXXXXXXXXXXXXXXXX
        aws_secret_access_key = XXXXXXXXXXXXXXX

Tip: HCE recommends that you use the same secret name, that is, cloud-secret. Otherwise, you need to update the AWS_SHARED_CREDENTIALS_FILE environment variable in the fault template, and you can't use the default health check probes.

Below is an example AWS policy to execute the fault.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "fis:CreateExperimentTemplate",
        "fis:StartExperiment",
        "fis:StopExperiment",
        "fis:GetExperiment",
        "fis:ListExperiments"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:DescribeTable"
      ],
      "Resource": [
        "arn:aws:dynamodb:*:*:table/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "cloudwatch:DescribeAlarms"
      ],
      "Resource": "*"
    }
  ]
}

Mandatory tunables

Tunable | Description | Notes
TABLE_NAMES | Name of the table (or comma-separated names of tables) in DynamoDB on which you apply chaos. | For example, "table-name1,table-name-2,..".
COUNT | Number of iterations of the chaos experiment to run. | For example, 2.
STOP_CONDITION_CLOUDWATCH_ALARMS | Comma-separated ARNs (Amazon Resource Names) of the CloudWatch alarms that the fault uses as stop conditions. | For example, "arn-1,arn-2,..".
FIS_ACCOUNT_ID | Amazon FIS account used by DynamoDB. |
FIS_ROLE_ARN | Provide the role ARN that you want to update. | For example, "arn:aws:iam::234567901244:role/CustomFISRole".
REGION | Region name of the target tables. | For example, us-east-1.

Optional tunables

Tunable | Description | Notes
TOTAL_CHAOS_DURATION | Duration of chaos injection (in seconds). | Default: 30 s. For more information, go to duration of the chaos.
CHAOS_INTERVAL | Time interval between successive chaos iterations (in seconds). | Default: 30 s. For more information, go to chaos interval.
RAMP_TIME | Period to wait before and after injecting chaos (in seconds). | For example, 30 s. For more information, go to ramp time.
AWS_SHARED_CREDENTIALS_FILE | Path to the AWS secret credentials. | Default: /tmp/cloud_config.yml.

Table names

Name of the DynamoDB global table on which you apply chaos. You can provide multiple table names as comma-separated values. Tune it using the TABLE_NAMES environment variable.

The following YAML snippet illustrates the use of this environment variable:

kind: Workflow
apiVersion: argoproj.io/v1alpha1
metadata:
  name: fis-dynamodb-replication-pause
  namespace: hce
spec:
  - name: fis-dynamodb-replication-pause
    components:
      env:
        - name: TOTAL_CHAOS_DURATION
          value: "60"
        - name: TABLE_NAMES
          value: "name-1,name-2"
        - name: REGION
          value: "us-east-1"
        - name: AWS_SHARED_CREDENTIALS_FILE
          value: /tmp/cloud_config.yml
      secrets:
        - name: cloud-secret
          mountPath: /tmp/
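
Because TABLE_NAMES is a single comma-separated string, a stray space or trailing comma is easy to miss. The following is a minimal sketch of a check you could run on the value before you put it in the manifest. The function name parse_table_names is hypothetical, and trimming whitespace and rejecting duplicates are assumptions of this sketch rather than documented behavior of the fault.

```python
from typing import List


def parse_table_names(value: str) -> List[str]:
    """Split a comma-separated TABLE_NAMES value into clean table names."""
    names = [part.strip() for part in value.split(",")]
    names = [n for n in names if n]  # drop empty entries such as a trailing comma
    if not names:
        raise ValueError("TABLE_NAMES must contain at least one table name")
    duplicates = sorted({n for n in names if names.count(n) > 1})
    if duplicates:
        raise ValueError(f"duplicate table names: {duplicates}")
    return names


if __name__ == "__main__":
    print(parse_table_names("name-1,name-2"))
```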

Count

Number of iterations of the chaos experiment to execute. DynamoDB replication is paused this many times. Tune it using the COUNT environment variable.

The following YAML snippet illustrates the use of this environment variable:

kind: Workflow
apiVersion: argoproj.io/v1alpha1
metadata:
  name: fis-dynamodb-replication-pause
  namespace: hce
spec:
  - name: fis-dynamodb-replication-pause
    components:
      env:
        - name: TOTAL_CHAOS_DURATION
          value: "60"
        - name: COUNT
          value: "3"
        - name: AWS_SHARED_CREDENTIALS_FILE
          value: /tmp/cloud_config.yml
      secrets:
        - name: cloud-secret
          mountPath: /tmp/

Stop condition CloudWatch alarms

Comma-separated ARNs (Amazon Resource Names) of the CloudWatch alarms that the fault uses as stop conditions. Tune it by using the STOP_CONDITION_CLOUDWATCH_ALARMS environment variable.

The following YAML snippet illustrates the use of this environment variable:

kind: Workflow
apiVersion: argoproj.io/v1alpha1
metadata:
  name: fis-dynamodb-replication-pause
  namespace: hce
spec:
  - name: fis-dynamodb-replication-pause
    components:
      env:
        - name: TOTAL_CHAOS_DURATION
          value: "60"
        - name: STOP_CONDITION_CLOUDWATCH_ALARMS
          value: "arn-1,arn-2,.."
        - name: AWS_SHARED_CREDENTIALS_FILE
          value: /tmp/cloud_config.yml
      secrets:
        - name: cloud-secret
          mountPath: /tmp/
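
The "arn-1,arn-2" placeholders stand for full CloudWatch alarm ARNs, which have the shape arn:aws:cloudwatch:<region>:<account-id>:alarm:<alarm-name>. The following is a minimal sketch that checks each ARN against that shape and joins the list into the comma-separated value. The function name join_alarm_arns and the sample alarm names are hypothetical, and the pattern is deliberately loose.

```python
import re
from typing import Iterable

# Shape of a CloudWatch alarm ARN:
# arn:<partition>:cloudwatch:<region>:<12-digit account id>:alarm:<alarm name>
ALARM_ARN = re.compile(r"^arn:aws[a-z-]*:cloudwatch:[a-z0-9-]+:\d{12}:alarm:.+$")


def join_alarm_arns(arns: Iterable[str]) -> str:
    """Validate alarm ARNs and join them for STOP_CONDITION_CLOUDWATCH_ALARMS."""
    cleaned = [arn.strip() for arn in arns]
    bad = [arn for arn in cleaned if not ALARM_ARN.match(arn)]
    if bad:
        raise ValueError(f"not a CloudWatch alarm ARN: {bad}")
    return ",".join(cleaned)


if __name__ == "__main__":
    print(join_alarm_arns([
        "arn:aws:cloudwatch:us-east-1:123456789012:alarm:high-latency",
        "arn:aws:cloudwatch:us-east-1:123456789012:alarm:error-rate",
    ]))
```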

FIS details

  • FIS account ID: Amazon FIS account used by DynamoDB. Tune it by using the FIS_ACCOUNT_ID environment variable.
  • FIS role ARN: Role ARN that you want to update. Tune it by using the FIS_ROLE_ARN environment variable.

The following YAML snippet illustrates the use of these environment variables:

kind: Workflow
apiVersion: argoproj.io/v1alpha1
metadata:
  name: fis-dynamodb-replication-pause
  namespace: hce
spec:
  - name: fis-dynamodb-replication-pause
    components:
      env:
        - name: TOTAL_CHAOS_DURATION
          value: "60"
        - name: FIS_ACCOUNT_ID
          value: "234567901244"
        - name: FIS_ROLE_ARN
          value: "arn:aws:iam::234567901244:role/CustomFISRole"
        - name: AWS_SHARED_CREDENTIALS_FILE
          value: /tmp/cloud_config.yml
      secrets:
        - name: cloud-secret
          mountPath: /tmp/
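
IAM role ARNs leave the region field empty, so a well-formed value has two consecutive colons before the 12-digit account ID (arn:aws:iam::<account-id>:role/<role-name>). The following is a minimal sketch that checks FIS_ROLE_ARN against this shape and extracts the account ID so that you can compare it with FIS_ACCOUNT_ID. The function name role_account_id is hypothetical, and whether the two account IDs must match in your setup is not stated in this document.

```python
import re

# IAM role ARN: arn:<partition>:iam::<12-digit account id>:role/<role name or path>
ROLE_ARN = re.compile(r"^arn:aws[a-z-]*:iam::(\d{12}):role/(.+)$")


def role_account_id(role_arn: str) -> str:
    """Return the account ID embedded in an IAM role ARN, or raise ValueError."""
    match = ROLE_ARN.match(role_arn.strip())
    if match is None:
        raise ValueError(f"not an IAM role ARN: {role_arn!r}")
    return match.group(1)


if __name__ == "__main__":
    print(role_account_id("arn:aws:iam::234567901244:role/CustomFISRole"))
```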