AWS Cost Optimization

Day 5: A Simple Project Demonstrating Cost Optimization in AWS

🚀 AWS Cost 💰 Optimization Project

One of the key responsibilities of a DevOps/Cloud Engineer is optimizing cloud infrastructure costs. This includes identifying stale resources 🛑 and managing them in one of two ways:
1️⃣ Sending notifications 📧 to the concerned person.
2️⃣ Taking action 🔥 (e.g., deleting the resources).

This article presents a beginner-level AWS cost optimization project 🛠️. It includes a problem statement and demonstrates its implementation in AWS using:

  • CloudWatch 📊

  • Events & EventBridge 🔄

  • Lambda Functions 🖥️

  • Python Scripts 🐍

  • boto3 Library 📚


Problem Statement

(You can explain this problem statement to the interviewer while explaining this project)

"Automated Cost Optimization of AWS EBS Snapshots"

In cloud environments, managing storage resources efficiently is crucial to minimizing costs. AWS Elastic Block Store (EBS) snapshots, while essential for backups and disaster recovery, can accumulate over time and result in significant unnecessary expenses if not monitored and managed properly.

Unused snapshots, or snapshots associated with deleted volumes or unused resources, contribute to this cost overhead. Additionally, identifying and removing such snapshots manually can be time-consuming and error-prone, especially in environments with numerous snapshots.

This project aims to design and implement a serverless solution for automating the management of AWS EBS snapshots. The solution will:

  1. Identify stale snapshots that are either:

    • Not associated with any volume (the snapshot has no source volume ID).

    • Associated with volumes that are not attached to any running EC2 instance.

  2. Check if the snapshot was created more than 10 minutes ago (small threshold value to verify implementation) to ensure recent backups are retained.

  3. Automatically delete such snapshots or provide notifications for further action.
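The three rules above can be written as one small decision function, which makes them easy to check without touching AWS. This is only a sketch: the name `is_stale`, the `volume_attachments` mapping (volume ID to the list of instance IDs it is attached to), and the `min_age` parameter are assumptions made for illustration and are not part of the project script.

```python
from datetime import datetime, timedelta, timezone


def is_stale(snapshot, volume_attachments, running_instance_ids, now,
             min_age=timedelta(minutes=10)):
    """Return True if a snapshot meets the deletion criteria.

    snapshot: dict with 'StartTime' (timezone-aware datetime) and an
        optional 'VolumeId', shaped like a describe_snapshots entry.
    volume_attachments: hypothetical mapping of volume ID -> list of
        instance IDs; a volume missing from the mapping counts as deleted.
    running_instance_ids: set of IDs of instances in the running state.
    """
    # Rule 2: never touch snapshots younger than the threshold.
    if now - snapshot["StartTime"] < min_age:
        return False

    # Rule 1a: no source volume recorded on the snapshot.
    volume_id = snapshot.get("VolumeId")
    if not volume_id:
        return True

    # Source volume no longer exists.
    attached_to = volume_attachments.get(volume_id)
    if attached_to is None:
        return True

    # Rule 1b: stale unless the volume is attached to a running instance.
    return not any(i in running_instance_ids for i in attached_to)


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
one_hour_old = {"StartTime": now - timedelta(hours=1), "VolumeId": "vol-1"}
print(is_stale(one_hour_old, {"vol-1": ["i-1"]}, {"i-1"}, now))  # False
print(is_stale(one_hour_old, {"vol-1": ["i-1"]}, set(), now))    # True
```

Keeping the decision separate from the delete call also makes it simple to switch between "notify" and "delete" later, because only the caller changes.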


Here's the complete demonstration of this project

Steps to implement the problem statement

Step 1: Prepare Your Lambda Function

  1. Log in to the AWS Management Console and go to Lambda.

  2. If you don't have a Lambda function:

    • Click Create function.

    • Choose Author from scratch.

    • Provide a Function name.

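    • Select a Runtime (Python).

    • Set up an IAM Role with permissions for CloudWatch Logs. The script also needs EC2 permissions to describe snapshots, instances, and volumes and to delete snapshots.

    • Paste the delete_snapshot.py script below into the Lambda function: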

        # delete_snapshot.py

        import boto3
        from botocore.exceptions import ClientError
        from datetime import datetime, timezone, timedelta

        def lambda_handler(event, context):
            ec2 = boto3.client('ec2')

            # Get the current time in UTC
            current_time = datetime.now(timezone.utc)

            # Get all EBS snapshots owned by this account (paginated)
            snapshots = []
            for page in ec2.get_paginator('describe_snapshots').paginate(OwnerIds=['self']):
                snapshots.extend(page['Snapshots'])

            # Get the IDs of all running EC2 instances (paginated)
            active_instance_ids = set()
            instance_pages = ec2.get_paginator('describe_instances').paginate(
                Filters=[{'Name': 'instance-state-name', 'Values': ['running']}]
            )
            for page in instance_pages:
                for reservation in page['Reservations']:
                    for instance in reservation['Instances']:
                        active_instance_ids.add(instance['InstanceId'])

            # Iterate through each snapshot and apply the deletion conditions
            for snapshot in snapshots:
                snapshot_id = snapshot['SnapshotId']
                volume_id = snapshot.get('VolumeId')
                snapshot_time = snapshot['StartTime']  # Snapshot creation time (timezone-aware, UTC)

                # Skip snapshots created less than 10 minutes ago
                if current_time - snapshot_time < timedelta(minutes=10):
                    print(f"Skipping snapshot {snapshot_id} as it was created less than 10 minutes ago.")
                    continue

                if not volume_id:
                    # Delete the snapshot if it is not associated with any volume
                    ec2.delete_snapshot(SnapshotId=snapshot_id)
                    print(f"Deleted EBS snapshot {snapshot_id} as it was not associated with any volume.")
                    continue

                # Check if the source volume still exists
                try:
                    volume = ec2.describe_volumes(VolumeIds=[volume_id])['Volumes'][0]
                except ClientError as e:
                    if e.response['Error']['Code'] != 'InvalidVolume.NotFound':
                        raise  # Surface unexpected errors instead of silently ignoring them
                    # The volume associated with the snapshot is not found (it might have been deleted)
                    ec2.delete_snapshot(SnapshotId=snapshot_id)
                    print(f"Deleted EBS snapshot {snapshot_id} as its associated volume was not found.")
                    continue

                # Keep the snapshot only if its volume is attached to a running instance
                attached_to_running = any(
                    attachment.get('InstanceId') in active_instance_ids
                    for attachment in volume['Attachments']
                )
                if not attached_to_running:
                    ec2.delete_snapshot(SnapshotId=snapshot_id)
                    print(f"Deleted EBS snapshot {snapshot_id} as it was taken from a volume not attached to any running instance.")

      
  3. Test the Lambda function to ensure it works as expected.

    If the test fails with a permissions error, attach the required policies to the function's service role. If no AWS-managed policy covers the required actions, create a custom policy.
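For reference, the script calls four EC2 APIs, so a custom policy for the service role needs at least the four matching actions. The sketch below builds such a policy document as a Python dict and prints it as JSON that can be pasted into the IAM policy editor. The function name `build_snapshot_cleanup_policy` is hypothetical, and `"Resource": "*"` is a simplification you may want to narrow for the delete action.

```python
import json


def build_snapshot_cleanup_policy():
    """Return a minimal IAM policy document for the cleanup script.

    Assumption: the role already has CloudWatch Logs permissions
    (for example through the basic Lambda execution role).
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:DescribeSnapshots",
                    "ec2:DescribeInstances",
                    "ec2:DescribeVolumes",
                    "ec2:DeleteSnapshot",
                ],
                # Broad on purpose to keep the sketch short.
                "Resource": "*",
            }
        ],
    }


policy_json = json.dumps(build_snapshot_cleanup_policy(), indent=2)
print(policy_json)
```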

Step 2: Convert IST (4:00 PM) to UTC

Since IST is UTC+5:30, convert 4:00 PM IST to UTC:

  • 4:00 PM IST = 10:30 AM UTC
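If you want to double-check the conversion, or pick a different local time, a few lines of Python are enough. This is a small sketch using a fixed UTC+5:30 offset (IST has no daylight saving time); the helper name `ist_to_utc` is made up for this example.

```python
from datetime import date, datetime, time, timedelta, timezone

# IST is a fixed offset of +5:30 from UTC.
IST = timezone(timedelta(hours=5, minutes=30), name="IST")


def ist_to_utc(local_time):
    """Convert a wall-clock time in IST to the matching UTC time of day."""
    # The date is arbitrary; it is only needed to do the arithmetic.
    local_dt = datetime.combine(date(2024, 1, 1), local_time, tzinfo=IST)
    return local_dt.astimezone(timezone.utc).time()


print(ist_to_utc(time(16, 0)))  # 10:30:00
```

Note that early-morning IST times land on the previous UTC day (for example, 2:00 AM IST is 8:30 PM UTC), which matters if a schedule is restricted to specific days.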

Step 3: Create a Rule in EventBridge

  1. Go to the AWS Management Console and navigate to EventBridge.

  2. In the Rules section, click Create rule.

Step 4: Configure the Rule

  1. Name the Rule:

    • Provide a name for the rule (e.g., TriggerLambdaAt4PMIST).

    • Add an optional description (e.g., Triggers Lambda daily at 4:00 PM IST).

  2. Select Schedule Pattern:

    • Choose Cron expression.

    • Enter the following Cron expression:

        cron(30 10 * * ? *)
      

      This means:

      • 30: 30th minute.

      • 10: 10th hour (10:30 AM UTC).

      • *: Every day of the month.

      • *: Every month.

      • ?: No specific day of the week.

      • *: Every year.
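EventBridge cron expressions have six fields (minutes, hours, day of month, month, day of week, year), which is one more than the classic five-field Unix cron. The sketch below splits an expression into labeled fields so you can sanity-check it before saving the rule. The helper `parse_eventbridge_cron` is hypothetical and only checks the field count, not whether each value is valid.

```python
FIELD_NAMES = ("minutes", "hours", "day_of_month", "month", "day_of_week", "year")


def parse_eventbridge_cron(expression):
    """Split 'cron(...)' into a dict of its six named fields."""
    if not (expression.startswith("cron(") and expression.endswith(")")):
        raise ValueError("expected an expression of the form cron(...)")

    fields = expression[len("cron("):-1].split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields, got {len(fields)}")

    return dict(zip(FIELD_NAMES, fields))


for name, value in parse_eventbridge_cron("cron(30 10 * * ? *)").items():
    print(f"{name}: {value}")
```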

Step 5: Add a Target

  1. Under Target, select Lambda function.

  2. Choose the Lambda function you want to invoke.

  3. (Optional) Add input to send custom parameters to the function.

Step 6: Review and Add Permissions

  1. EventBridge needs permission to invoke your Lambda function.

  2. AWS will prompt you to add these permissions automatically when saving the rule. Click Create rule to finalize the setup.
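Steps 3 to 6 can also be done from code instead of the console. The following is a rough sketch, not the method used in the demonstration: it assumes you pass in boto3 clients created with `boto3.client("events")` and `boto3.client("lambda")`, and the function name `schedule_lambda`, the statement ID, and the target ID are placeholders chosen for this example.

```python
def schedule_lambda(events_client, lambda_client, rule_name,
                    function_name, function_arn,
                    schedule="cron(30 10 * * ? *)"):
    """Create a scheduled rule, allow it to invoke the function, add the target."""
    # Steps 3 and 4: create (or update) the rule with the cron schedule.
    rule_arn = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=schedule,
        State="ENABLED",
        Description="Triggers Lambda daily at 4:00 PM IST",
    )["RuleArn"]

    # Step 6: the console adds this permission for you; from code it is explicit.
    lambda_client.add_permission(
        FunctionName=function_name,
        StatementId=f"{rule_name}-invoke",  # placeholder statement ID
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )

    # Step 5: point the rule at the Lambda function.
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": function_arn}],
    )
    return rule_arn
```

Running it a second time with the same rule name would fail at `add_permission`, because a statement with that ID already exists on the function, so treat it as a one-time setup helper.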

Step 7: Verify the Rule

  1. Go to the Rules section in EventBridge.

  2. Check that the rule status is Enabled.
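The same check can be scripted. This is a minimal sketch that assumes an EventBridge client created with `boto3.client("events")` is passed in; `rule_is_enabled` is a made-up helper name.

```python
def rule_is_enabled(events_client, rule_name):
    """Return True if the named EventBridge rule is in the ENABLED state."""
    rule = events_client.describe_rule(Name=rule_name)
    return rule["State"] == "ENABLED"
```

For example, `rule_is_enabled(boto3.client("events"), "TriggerLambdaAt4PMIST")` should return True once the rule from Step 4 has been created.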

Step 8: Monitor the Execution

  1. Go to CloudWatch Logs.

  2. Find the log group associated with your Lambda function (e.g., /aws/lambda/YourFunctionName).

  3. Verify that the function runs at 4:00 PM IST daily.
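If you prefer to check the logs from a script, something like the sketch below can pull the most recent messages printed by the function. It assumes a CloudWatch Logs client created with `boto3.client("logs")` and the default `/aws/lambda/<function name>` log group; the helper name `recent_log_messages` and the 24-hour lookback are choices made for this example. It reads only the first page of results, which is usually enough for a once-a-day job.

```python
from datetime import datetime, timedelta, timezone


def recent_log_messages(logs_client, function_name, now=None,
                        lookback=timedelta(hours=24)):
    """Return log messages written by the function within the lookback window."""
    now = now or datetime.now(timezone.utc)
    # CloudWatch Logs expects timestamps in milliseconds since the epoch.
    start_ms = int((now - lookback).timestamp() * 1000)

    response = logs_client.filter_log_events(
        logGroupName=f"/aws/lambda/{function_name}",
        startTime=start_ms,
    )
    return [event["message"] for event in response["events"]]
```

Lines starting with "Deleted EBS snapshot" or "Skipping snapshot" in the output confirm that the scheduled run actually executed the script.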
