AWS Cost Optimization
Day 5: A Simple Project Demonstrating Cost Optimization in AWS
AWS Cost Optimization Project
One of the key responsibilities of a DevOps/Cloud Engineer is optimizing cloud infrastructure costs. This includes identifying stale resources and handling them in one of two ways:
1. Sending a notification to the resource owner.
2. Taking action directly (e.g., deleting the resources).
This article presents a beginner-level AWS cost optimization project. It includes a problem statement and demonstrates its implementation in AWS using:
CloudWatch
Events & EventBridge
Lambda Functions
Python Scripts
boto3 Library
Problem Statement
(You can present this problem statement to an interviewer when describing this project.)
"Automated Cost Optimization of AWS EBS Snapshots"
In cloud environments, managing storage resources efficiently is crucial to minimizing costs. AWS Elastic Block Store (EBS) snapshots, while essential for backups and disaster recovery, can accumulate over time and result in significant unnecessary expenses if not monitored and managed properly.
Unused snapshots, or snapshots associated with deleted volumes or unused resources, contribute to this cost overhead. Additionally, identifying and removing such snapshots manually can be time-consuming and error-prone, especially in environments with numerous snapshots.
This project aims to design and implement a serverless solution for automating the management of AWS EBS snapshots. The solution will:
Identify stale snapshots that are either:
Not associated with any existing volume (for example, the source volume has been deleted).
Associated with volumes that are not attached to any running EC2 instance.
Check that the snapshot was created more than 10 minutes ago, so that recent backups are retained (the threshold is deliberately small to make the implementation easy to verify).
Automatically delete such snapshots or provide notifications for further action.
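The selection rules above can be sketched as a small pure function, kept separate from any AWS calls so it is easy to unit-test. The name is_stale and its parameters are hypothetical and chosen only for illustration; they are not part of the original script.

```python
from datetime import datetime, timedelta, timezone


def is_stale(
    snapshot_start: datetime,
    now: datetime,
    volume_exists: bool,
    attached_to_running_instance: bool,
    min_age: timedelta = timedelta(minutes=10),
) -> bool:
    """Return True if a snapshot meets the deletion criteria listed above."""
    # Recent snapshots are always retained.
    if now - snapshot_start < min_age:
        return False
    # The source volume no longer exists: the snapshot is stale.
    if not volume_exists:
        return True
    # The volume exists but no running instance uses it: stale.
    return not attached_to_running_instance


# Example: a one-hour-old snapshot of a deleted volume.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(is_stale(now - timedelta(hours=1), now, False, False))  # True
```

Keeping the decision in one function also makes it simple to switch between "delete" and "notify" later, since both paths can reuse the same check.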
Here's the complete demonstration of this project.
Steps to implement the problem statement
Step 1: Prepare Your Lambda Function
Log in to the AWS Management Console and go to Lambda.
If you don't have a Lambda function:
Click Create function.
Choose Author from scratch.
Provide a Function name.
Select a Runtime (Python).
Set up an IAM Role with permissions for CloudWatch Logs.
Paste the following delete_snapshot.py script into this Lambda function:
# delete_snapshot.py
import boto3
from botocore.exceptions import ClientError
from datetime import datetime, timezone, timedelta


def lambda_handler(event, context):
    ec2 = boto3.client('ec2')

    # Get the current time in UTC
    current_time = datetime.now(timezone.utc)

    # Collect the IDs of all running EC2 instances
    active_instance_ids = set()
    instance_pages = ec2.get_paginator('describe_instances').paginate(
        Filters=[{'Name': 'instance-state-name', 'Values': ['running']}]
    )
    for page in instance_pages:
        for reservation in page['Reservations']:
            for instance in reservation['Instances']:
                active_instance_ids.add(instance['InstanceId'])

    # Get all EBS snapshots owned by this account before deleting anything
    snapshot_pages = ec2.get_paginator('describe_snapshots').paginate(OwnerIds=['self'])
    snapshots = [s for page in snapshot_pages for s in page['Snapshots']]

    # Iterate through each snapshot and apply the deletion conditions
    for snapshot in snapshots:
        snapshot_id = snapshot['SnapshotId']
        volume_id = snapshot.get('VolumeId')
        snapshot_time = snapshot['StartTime']  # Snapshot creation time (timezone-aware, UTC)

        # Skip snapshots created less than 10 minutes ago
        if current_time - snapshot_time < timedelta(minutes=10):
            print(f"Skipping snapshot {snapshot_id} as it was created less than 10 minutes ago.")
            continue

        if not volume_id:
            # Delete the snapshot if it has no associated volume
            ec2.delete_snapshot(SnapshotId=snapshot_id)
            print(f"Deleted EBS snapshot {snapshot_id} as it was not associated with any volume.")
            continue

        # Check whether the volume still exists
        try:
            volume = ec2.describe_volumes(VolumeIds=[volume_id])['Volumes'][0]
        except ClientError as e:
            if e.response['Error']['Code'] != 'InvalidVolume.NotFound':
                raise  # Unexpected error: surface it instead of silently ignoring it
            # The volume associated with the snapshot was not found (it might have been deleted)
            ec2.delete_snapshot(SnapshotId=snapshot_id)
            print(f"Deleted EBS snapshot {snapshot_id} as its associated volume was not found.")
            continue

        # Keep the snapshot only if its volume is attached to a running instance
        attached_to_running = any(
            attachment.get('InstanceId') in active_instance_ids
            for attachment in volume['Attachments']
        )
        if not attached_to_running:
            ec2.delete_snapshot(SnapshotId=snapshot_id)
            print(f"Deleted EBS snapshot {snapshot_id} as it was taken from a volume not attached to any running instance.")
Test the Lambda function to ensure it works as expected.
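If the test fails with a permissions error, attach the required policies to the function's execution role. The script calls ec2:DescribeSnapshots, ec2:DescribeInstances, ec2:DescribeVolumes, and ec2:DeleteSnapshot; if no AWS-managed policy fits, create a custom policy.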
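As an illustration of what such a custom policy might look like, here is a minimal sketch that builds the policy document in Python and prints it as JSON for pasting into the IAM console. The action list is an assumption based on the EC2 calls made by delete_snapshot.py, and "Resource": "*" is used for simplicity; tighten it for real environments where possible.

```python
import json

# Assumption: these four actions match the EC2 API calls in delete_snapshot.py.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeSnapshots",
                "ec2:DescribeInstances",
                "ec2:DescribeVolumes",
                "ec2:DeleteSnapshot",
            ],
            "Resource": "*",
        }
    ],
}

# Print the JSON document to copy into the IAM policy editor.
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```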
Step 2: Convert IST (4:00 PM) to UTC
Since IST is UTC+5:30, convert 4:00 PM IST to UTC:
- 4:00 PM IST = 10:30 AM UTC
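The same conversion can be double-checked with a few lines of Python. This is a small sketch using a fixed UTC+5:30 offset (IST has no daylight saving time); the date is arbitrary and only the time of day matters.

```python
from datetime import datetime, timedelta, timezone

# IST is a fixed offset of +5:30 from UTC.
IST = timezone(timedelta(hours=5, minutes=30), name="IST")

ist_time = datetime(2024, 1, 1, 16, 0, tzinfo=IST)  # 4:00 PM IST
utc_time = ist_time.astimezone(timezone.utc)

print(utc_time.strftime("%H:%M"))  # 10:30
```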
Step 3: Create a Rule in EventBridge
Go to the AWS Management Console and navigate to EventBridge.
In the Rules section, click Create rule.
Step 4: Configure the Rule
Name the Rule:
Provide a name for the rule (e.g., TriggerLambdaAt4PMIST).
Add an optional description (e.g., Triggers Lambda daily at 4:00 PM IST).
Select Schedule Pattern:
Choose Cron expression.
Enter the following Cron expression:
cron(30 10 * * ? *)
This means:
30: 30th minute.
10: 10th hour (10:30 AM UTC).
*: Every day of the month.
*: Every month.
?: No specific day of the week.
*: Every year.
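To avoid mixing up the field order (minute comes before hour), the expression can be generated from the UTC time. The helper daily_cron_utc below is a hypothetical name used for illustration; it only formats a string and makes no AWS calls.

```python
def daily_cron_utc(hour: int, minute: int) -> str:
    """Build an EventBridge cron expression that fires once a day at hour:minute UTC."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    # Field order: minute, hour, day-of-month, month, day-of-week, year.
    return f"cron({minute} {hour} * * ? *)"


print(daily_cron_utc(10, 30))  # cron(30 10 * * ? *)
```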
Step 5: Add a Target
Under Target, select Lambda function.
Choose the Lambda function you want to invoke.
(Optional) Add input to send custom parameters to the function.
Step 6: Review and Add Permissions
EventBridge needs permission to invoke your Lambda function.
AWS will prompt you to add these permissions automatically when saving the rule. Click Create rule to finalize the setup.
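If you prefer to script Steps 3 to 6 instead of using the console, a rough boto3 equivalent might look like the sketch below. The function name schedule_lambda is hypothetical, the clients are passed in as parameters, and the function ARN is something you must supply yourself. When working outside the console, the invoke permission is not added for you, so the sketch adds it explicitly.

```python
def schedule_lambda(events_client, lambda_client, rule_name, schedule, function_arn):
    """Create a scheduled EventBridge rule and point it at a Lambda function."""
    # Create (or update) the scheduled rule.
    rule_arn = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=schedule,
        State="ENABLED",
    )["RuleArn"]

    # Allow EventBridge to invoke the function for this specific rule.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )

    # Register the function as the rule's target.
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": function_arn}],
    )
    return rule_arn


# Usage (requires boto3 and AWS credentials; the ARN is a placeholder):
# import boto3
# schedule_lambda(
#     boto3.client("events"),
#     boto3.client("lambda"),
#     "TriggerLambdaAt4PMIST",
#     "cron(30 10 * * ? *)",
#     "<your-lambda-function-arn>",
# )
```

Note that add_permission fails if a statement with the same StatementId already exists, so rerunning this sketch unchanged would need that case handled.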
Step 7: Verify the Rule
Go to the Rules section in EventBridge.
Check that the rule status is Enabled.
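The same check can be done programmatically. A minimal sketch, assuming an EventBridge client created with boto3.client("events") is passed in; rule_is_enabled is a hypothetical helper name.

```python
def rule_is_enabled(events_client, rule_name: str) -> bool:
    """Return True if the named EventBridge rule exists and is enabled."""
    response = events_client.describe_rule(Name=rule_name)
    return response.get("State") == "ENABLED"


# Usage (requires boto3 and AWS credentials):
# import boto3
# print(rule_is_enabled(boto3.client("events"), "TriggerLambdaAt4PMIST"))
```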
Step 8: Monitor the Execution
Go to CloudWatch Logs.
Find the log group associated with your Lambda function (e.g., /aws/lambda/YourFunctionName).
Verify that the function runs at 4:00 PM IST daily.
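Instead of browsing the console, the function's recent log lines can be pulled with boto3. This is a rough sketch: recent_deletion_messages is a hypothetical helper, the CloudWatch Logs client is passed in, and only the first page of results is read. The filter pattern matches the "Deleted EBS snapshot" text printed by the script.

```python
from datetime import datetime, timedelta, timezone


def recent_deletion_messages(logs_client, log_group: str, hours: int = 24) -> list:
    """Return log messages about deleted snapshots from the last `hours` hours."""
    start = datetime.now(timezone.utc) - timedelta(hours=hours)
    response = logs_client.filter_log_events(
        logGroupName=log_group,
        startTime=int(start.timestamp() * 1000),  # milliseconds since the epoch
        filterPattern='"Deleted EBS snapshot"',
    )
    return [event["message"].strip() for event in response.get("events", [])]


# Usage (requires boto3 and AWS credentials):
# import boto3
# for line in recent_deletion_messages(boto3.client("logs"), "/aws/lambda/YourFunctionName"):
#     print(line)
```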