DevOops

Office Feedback

Someone asked me for feedback, and this is what I provided (I feel I was too harsh on them :)

 What are three words that you would use to describe me? Happy, Easy-going, Talkative
 What do you see as my greatest strength? Perseverance. I have noticed that you stick to tasks, even when things are not working for you
 When would you most want me on your team? When we need someone to understand us better
When would you least want me on your team? When we need a strongman to drive home a point
How do I add value to you or your work? (i.e.: How is my leadership supporting you in your work here?) Your leadership did not support me :) However, I understand that I worked with you at a time when you were yourself getting into the groove at [...], so I cannot complain. My recommendation for your future is to assert yourself from Day 1; don't let senior team members bully you around
 What do you want me to keep doing? Keep your relaxed persona and keep ticking off the tasks
 What do you wish I would stop doing? Not being a role model for the team

Note: Feedback at the office is truly important so you can keep improving; unfortunately, you don't usually get it.

Monitor IP addresses free in AWS VPC subnet (Python)

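Creates a custom CloudWatch metric for the free IPs in each subnet listed in the SubnetsToMonitor environment variable (comma-separated subnet IDs).

CloudWatch alarms can then be set on that metric to detect IP exhaustion. This is useful for VPC-deployed Lambda functions and for capacity monitoring.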

import logging
import boto3
import json
import os

logger = logging.getLogger()
logger.setLevel(logging.INFO)

cwclient = boto3.client('cloudwatch')
ec2 = boto3.client("ec2")

def lambda_handler(event, context):
    # Read the comma-separated subnet IDs from the environment variable
    subnetstomonitor = os.getenv('SubnetsToMonitor', '')
    # Remove any spaces, split by comma and drop empty entries
    subnetlist = [s for s in subnetstomonitor.replace(" ", "").split(",") if s]
    if not subnetlist:
        logger.error("SubnetsToMonitor is empty; nothing to monitor")
        return
    # Reuse the module-level EC2 client to look up the subnets
    response = ec2.describe_subnets(Filters=[{'Name': 'subnet-id', 'Values': subnetlist}])
    ver_subnets(response)

# Subnets payload evaluation: publish one metric data point per subnet
def ver_subnets ( data ):
    if 'Subnets' not in data:
        logger.error("unexpected: Subnets key not found in data")
        return
    for subnet in data['Subnets']:
        put_subnetmetric( subnet['SubnetId'],subnet['AvailableIpAddressCount'] )

# Writes the available IP address count, with a SubnetId dimension, to the custom metric
def put_subnetmetric ( subnetid, availaddresscount ):
    response = cwclient.put_metric_data(
        Namespace='AWSx/SubnetCapacity',
        MetricData=[
            {
                'MetricName': 'AvailableIPAddresses',
                'Dimensions': [
                    {
                        'Name': 'SubnetId',
                        'Value': subnetid
                    },
                ],
                'Value': availaddresscount,
                'Unit': 'None'
            },
        ],
    )
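
Once the metric is flowing, the alarm side could look something like the sketch below. It is not part of the original function: the helper names, the threshold, the 5-minute period and the SNS topic ARN are all my assumptions, so adjust them to how often you run the Lambda. The builder is kept separate from the boto3 call so you can inspect the parameters without touching AWS.

```python
def build_ip_exhaustion_alarm(subnet_id, min_free_ips, sns_topic_arn):
    """Return keyword arguments for CloudWatch put_metric_alarm.

    Hypothetical helper: the namespace, metric name and dimension match
    the function above; everything else is an assumed default.
    """
    return {
        "AlarmName": "Subnet %s low on free IPs" % subnet_id,
        "AlarmDescription": "Fewer than %d free IP addresses in %s"
                            % (min_free_ips, subnet_id),
        "Namespace": "AWSx/SubnetCapacity",
        "MetricName": "AvailableIPAddresses",
        "Dimensions": [{"Name": "SubnetId", "Value": subnet_id}],
        "Statistic": "Minimum",
        "Period": 300,             # assumed: Lambda runs at least every 5 minutes
        "EvaluationPeriods": 1,
        "Threshold": float(min_free_ips),
        "ComparisonOperator": "LessThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


def create_alarm(params):
    # boto3 is imported here so the builder above works without AWS access
    import boto3
    boto3.client("cloudwatch").put_metric_alarm(**params)
```

Calling create_alarm(build_ip_exhaustion_alarm("subnet-0abc", 20, topic_arn)) would then notify the topic when the subnet drops below 20 free addresses.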

Alert on a failed EC2 instance (Status Check Failed)

This Lambda function creates "Status Check Failed" alarms (System and Instance), with recover and reboot actions, for every EC2 instance in the region that has a Name tag.

# Before running the function please review the SNS Topic, Account ID (akid) and Region.
# The Lambda Role has full access to Ec2 and Cloudwatch via policy

import os
import boto3
import collections
import json

#SNS Topic Definition for EC2
ec2_sns = os.environ['ec2_sns']

#AWS Account and Region Definition for Reboot Actions
akid = os.environ['accountid']
region = os.environ['region']

#Create AWS clients
ec = boto3.client('ec2')
cw = boto3.client('cloudwatch')

def lambda_handler(event, context):

    #Enumerate EC2 instances
    reservations = ec.describe_instances().get('Reservations', [])
    instances = sum(
        [
            [i for i in r['Instances']]
            for r in reservations
        ], [])

    for instance in instances:
        try:
            for tag in instance['Tags']:
                if tag['Key'] == 'Name':
                    name_tag = tag['Value']
                    print("Found instance %s with name %s" % (instance['InstanceId'], name_tag))
                    #Create Alarm "Status Check Failed (System) for 5 Minutes"
                    response = cw.put_metric_alarm(
                        AlarmName="TMS - %s %s System Check Failed" % (name_tag, instance['InstanceId']),
                        AlarmDescription='TMS - Status Check Failed (System) for 5 Minutes',
                        ActionsEnabled=True,
                        AlarmActions=[
                            ec2_sns,
                            "arn:aws:automate:%s:ec2:recover" % region,
                        ],
                        MetricName='StatusCheckFailed_System',
                        Namespace='AWS/EC2',
                        Statistic='Average',
                        Dimensions=[{'Name': 'InstanceId', 'Value': instance['InstanceId']}],
                        Period=60,
                        EvaluationPeriods=5,
                        Threshold=1.0,
                        ComparisonOperator='GreaterThanOrEqualToThreshold'
                    )
                    #Create Alarm "Status Check Failed (Instance) for 5 Minutes"
                    response = cw.put_metric_alarm(
                        AlarmName="TMS - %s %s Instance Check Failed" % (name_tag, instance['InstanceId']),
                        AlarmDescription='TMS - Status Check Failed (Instance) for 5 Minutes',
                        ActionsEnabled=True,
                        AlarmActions=[
                            ec2_sns,
                            "arn:aws:swf:%s:%s:action/actions/AWS_EC2.InstanceId.Reboot/1.0" % (region, akid),
                        ],
                        MetricName='StatusCheckFailed_Instance',
                        Namespace='AWS/EC2',
                        Statistic='Average',
                        Dimensions=[{'Name': 'InstanceId','Value': instance['InstanceId']},],
                        Period=60,
                        EvaluationPeriods=5,
                        Threshold=1.0,
                        ComparisonOperator='GreaterThanOrEqualToThreshold'
                        )
        except Exception as e:
            print ("Error Encountered.")
            print (e)
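
One thing to note: the loop above relies on the try/except to skip instances that have no Tags key at all. If you prefer to make that explicit, a small lookup helper along these lines could be used instead. The get_name_tag name and the default argument are mine, not part of the original function.

```python
def get_name_tag(instance, default=None):
    """Return the value of the instance's Name tag, or default if absent.

    `instance` is one entry of Reservations[].Instances[] as returned by
    describe_instances; untagged instances simply have no 'Tags' key.
    """
    for tag in instance.get("Tags", []):
        if tag.get("Key") == "Name":
            return tag.get("Value", default)
    return default
```

With this, the handler could do name_tag = get_name_tag(instance) and continue to the next instance when it returns None, rather than catching a KeyError.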

Get rid of running/orphaned Packer instances

I had orphaned Packer instances running in my environment. Following is a solution to get rid of them:

from __future__ import print_function

import json
import urllib
import boto3
import datetime
import os

max_runtime = int(os.environ['max_runtime'])
# Available methods: stop or terminate, anything else means only notification
method = os.environ['method']
sns_topic = os.environ['sns_topic']

client = boto3.client('ec2')

def lambda_handler(event, context):
    try:
        response = client.describe_instances(
            Filters=[
                {
                    'Name': 'key-name',
                    'Values': [
                        'packer *'
                    ]
                },
                {
                    'Name': 'instance-state-name',
                    'Values': [
                        'running',
                    ]
                },
            ]
        )
        instances_to_terminate = []
        for reservation in response["Reservations"]:
            for instance in reservation["Instances"]:
                launchTime = instance["LaunchTime"]
                tz_info = launchTime.tzinfo
                now = datetime.datetime.now(tz_info)
                delta = datetime.timedelta(hours=max_runtime)
                the_past = now - delta
                # If the instance was launched more than the max_runtime ago,
                # get rid of it
                if the_past > instance["LaunchTime"]:
                    instances_to_terminate.append(instance["InstanceId"])

        if len(instances_to_terminate) > 0:
            print("These instances were running too long: ")
            print(instances_to_terminate)
            # Decide how to handle the instances
            if method == "stop":
                client.stop_instances(
                    InstanceIds=instances_to_terminate
                )
            elif method == "terminate":
                client.terminate_instances(
                    InstanceIds=instances_to_terminate
                )
            # Send an SNS message if the topic is defined
            if sns_topic != "":
                send_sns(instances_to_terminate)
    except Exception as e:
        print(e)
        raise e

def send_sns(instances):
    snsclient = boto3.client("sns")
    message = "The following instances were running too long:"
    for instance in instances:
        message += "\n* " + instance
    if method == "stop":
        message += "\n\nThey have been stopped"
    if method == "terminate":
        message += "\n\nThey have been terminated"
    snsclient.publish(TopicArn=sns_topic,
                      Message=message,
                      Subject="Packer instances running too long")
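
The age check is the part worth testing before you let this stop or terminate anything. Here is a minimal sketch of the same comparison pulled out into a function; the is_overdue name and the optional now argument are my additions so it can be exercised with fixed timestamps.

```python
import datetime


def is_overdue(launch_time, max_runtime_hours, now=None):
    """True if the instance has been running longer than max_runtime_hours.

    launch_time is a timezone-aware datetime, as boto3 returns for
    LaunchTime. `now` is only there to make the check testable.
    """
    if now is None:
        now = datetime.datetime.now(launch_time.tzinfo)
    return now - launch_time > datetime.timedelta(hours=max_runtime_hours)


if __name__ == "__main__":
    utc = datetime.timezone.utc
    launched = datetime.datetime(2020, 1, 1, 0, 0, tzinfo=utc)
    # Three hours after launch, with a two-hour limit
    print(is_overdue(launched, 2, now=datetime.datetime(2020, 1, 1, 3, 0, tzinfo=utc)))
```

This is the same test as the_past > instance["LaunchTime"] in the handler, just rearranged.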

Find manually created resources (AWS)

How many times have you come across AWS environments that grew organically, with resources created manually? Such projects usually identify the need for Infrastructure as Code (IaC) later on, but how do you find out what was created via script (CloudFormation) versus by hand?

# To assist in identifying resources which have been provisioned manually and not through cloudformation.
# This approach was used as opposed to pulling all resources with specific tags, as not all resource types support tagging at this stage.
#Initially load your AWS creds - replace values with your keys.
Import-Module AWSPowerShell
$AWSKEY = "Somekey"
$AWSSECRET = "SomeSecret"
Set-AWSCredential -AccessKey $AWSKEY -SecretKey $AWSSECRET -StoreAs Default
#Define output location
$outputlocation = $ENV:TEMP
#Get all the stacks
$stacksNames = Get-CFNStack | ForEach-Object -MemberName StackName
#iterate through stacks getting all resources
$totalstackresources = foreach ($stacksname in $stacksNames){
Get-CFNStackResourceList -StackName $stacksname
}
#Useful for working through the various different resource types.
$ResourceTypes = $totalstackresources | select -Property ResourceType -Unique | Sort-Object Resourcetype 

#unmanaged AutoScaleGroups
$CFASG = $totalstackresources | Where {$_.ResourceType -like "AWS::AutoScaling::AutoScalingGroup"} | select -Property PhysicalResourceId | Add-Member -MemberType AliasProperty -Name InstanceId -Value PhysicalResourceId -PassThru | select InstanceId
$ASGS = Get-ASAutoScalingGroup | Select-Object AutoScalingGroupName
$UnmanagedASG = Compare-Object $ASGS $CFASG | ForEach-Object { $_.InputObject}
$UnmanagedASG | Export-Csv $outputlocation/UnmanagedASG.csv

#unmanaged AutoScaleGroupLaunch Configuration
$CFASGLC = $totalstackresources | Where {$_.ResourceType -like "AWS::AutoScaling::LaunchConfiguration"} | select -Property PhysicalResourceId | Add-Member -MemberType AliasProperty -Name InstanceId -Value PhysicalResourceId -PassThru | select InstanceId
$ASGLC = Get-ASLaunchConfiguration | Select-Object LaunchConfigurationName
$UnmanagedASGLC = Compare-Object $ASGLC $CFASGLC | ForEach-Object { $_.InputObject}
$UnmanagedASGLC | Export-Csv $outputlocation/UnmanagedASGLC.csv

#Unmanaged EIPs
$CFEIP = $totalstackresources | Where {$_.ResourceType -like "AWS::EC2::EIP"} | select -Property PhysicalResourceId | Add-Member -MemberType AliasProperty -Name InstanceId -Value PhysicalResourceId -PassThru | select InstanceId
$EIPIds = Get-EC2Address | Select-Object NetworkInterfaceId, PublicIp, PrivateIpAddress
$UnmanagedEIPS = Compare-Object $EIPIds $CFEIP| ForEach-Object { $_.InputObject}
$UnmanagedEIPS | Export-Csv $outputlocation/UnmanagedEips.csv

#Unmanaged EC2 Instances
$CFInstance = $totalstackresources | Where {$_.ResourceType -like "AWS::EC2::Instance"} | select -Property PhysicalResourceId | Add-Member -MemberType AliasProperty -Name InstanceId -Value PhysicalResourceId -PassThru | select InstanceId
$InstanceIds = Get-EC2Instance | % { $_.RunningInstance } | Select-Object InstanceId,PublicDnsName,@{Name='TagValues'; Expression={($_.Tag | %{$_.Key + '=' + $_.Value}) -join ', '}}
$UnmanagedEC2Instances = Compare-Object $InstanceIds $CFInstance | ForEach-Object { $_.InputObject}
$UnmanagedEC2Instances| Export-Csv $outputlocation/UnmanagedEC2Instances.csv

#unmanaged SecurityGroups
$CFSG = $totalstackresources | Where {$_.ResourceType -like "AWS::EC2::SecurityGroup"} | select -Property PhysicalResourceId | Add-Member -MemberType AliasProperty -Name InstanceId -Value PhysicalResourceId -PassThru | select InstanceId
$SGIds = Get-EC2SecurityGroup | Select-Object GroupName,Description
$UnmanagedSG = Compare-Object $SGIds $CFSG| ForEach-Object { $_.InputObject}
$UnmanagedSG | Export-Csv $outputlocation/UnmanagedSG.csv

#Unmanaged ELB 
$CFELB = $totalstackresources | Where {$_.ResourceType -like "AWS::ElasticLoadBalancing::LoadBalancer"} | select -Property PhysicalResourceId | Add-Member -MemberType AliasProperty -Name InstanceId -Value PhysicalResourceId -PassThru | select InstanceId
$ELBIds = Get-ELBLoadBalancer | Select-Object DNSName,LoadBalancerName,Instances
$UnmanagedELBs = Compare-Object $ELBIDs $CFELB | ForEach-Object { $_.InputObject}
$UnmanagedELBs | Export-Csv $outputlocation/UnmanagedELBs.csv

#Unmanaged ALB (ELB v2)
$CFALB = $totalstackresources | Where {$_.ResourceType -like "AWS::ElasticLoadBalancingV2::LoadBalancer"} | select -Property PhysicalResourceId | Add-Member -MemberType AliasProperty -Name InstanceId -Value PhysicalResourceId -PassThru | select InstanceId
$ALBIds = Get-ELB2LoadBalancer | Select-Object LoadBalancerArn,DNSName,LoadBalancerName
$UnmanagedALBs = Compare-Object $ALBIDs $CFALB | ForEach-Object { $_.InputObject}
$UnmanagedALBs | Export-Csv $outputlocation/UnmanagedALBs.csv

#Unmanaged Instance Profiles
$CFProfile = $totalstackresources | Where {$_.ResourceType -like "AWS::IAM::InstanceProfile"} | select -Property PhysicalResourceId
$ProfileIds = Get-IAMInstanceProfileList | Select-Object Arn,InstanceProfileName,InstanceProfileId
$UnmanagedProfiles = Compare-Object $ProfileIds $CFProfile | ForEach-Object { $_.InputObject}
$UnmanagedProfiles | Export-Csv $outputlocation/UnmanagedProfiles.csv

#Unmanaged IAM Policies - Please note that there a bunch of AWS defined policies which have been excluded based on the filter provided below.
$CFPolicy = $totalstackresources | Where {$_.ResourceType -like "AWS::IAM::Policy"} | select -Property PhysicalResourceId
$PolicyIds = Get-IAMPolicyList | Where-Object {$_.Arn -like "arn:aws:iam::862017364710:policy/*"}| Select-Object Arn,PolicyName,PolicyId 
$UnmanagedPolicies = Compare-Object $PolicyIds $CFPolicy | ForEach-Object { $_.InputObject}
$UnmanagedPolicies | Export-Csv $outputlocation/UnmanagedPolicies.csv

#Unmanaged IAM Role 
$CFIAMROLE = $totalstackresources | Where {$_.ResourceType -like "AWS::IAM::Role"} | select -Property PhysicalResourceId
$IAMROLEIds = Get-IAMRoleList | Select-Object RoleId,RoleName 
$UnmanagedIAMRoles = Compare-Object $IAMROLEIds $CFIAMROLE | ForEach-Object { $_.InputObject}
$UnmanagedIAMRoles | Export-Csv $outputlocation/UnmanagedIAMRoles.csv

#Unmanaged Lambda Functions
$CFLambdaFN = $totalstackresources | Where {$_.ResourceType -like "AWS::Lambda::Function"} | select -Property PhysicalResourceId
$LambdaIDs = Get-LMFunctionList | Select-Object FunctionName,Runtime,RoleName 
$UnmanagedLambdaIDs = Compare-Object $LambdaIDs $CFLambdaFN | ForEach-Object { $_.InputObject}
$UnmanagedLambdaIDs | Export-Csv $outputlocation/UnmanagedLambdas.csv

#Unmanaged Route53 Resources need to convert R53 string into array for comparison
$CFR53 = $totalstackresources | Where {$_.ResourceType -like "AWS::Route53::RecordSet"} | select -Property PhysicalResourceId
$R53list = Get-R53ResourceRecordSet -HostedZoneId Z1JVZK10L1ND7P | Select-Object ResourceRecordSets
$R53s = $R53list.ResourceRecordSets | Select-Object Name
$UnmanagedR53s = Compare-Object $R53s $CFR53 | ForEach-Object { $_.InputObject}
$UnmanagedR53s | Export-Csv $outputlocation/UnmanagedR53s.csv

#Unmanaged RDS instances
$CFRDS = $totalstackresources | Where {$_.ResourceType -like "AWS::RDS::DBInstance"} | select -Property PhysicalResourceId | Add-Member -MemberType AliasProperty -Name InstanceId -Value PhysicalResourceId -PassThru | select InstanceId
$DBARNS = Get-RDSDBInstance | ForEach-Object -MemberName DBInstanceArn
$dbtotal = @()
$DBARNS | foreach {
    $DBARN = new-object PSObject -Property @{
    InstanceId= ($_.Split(":")[6])}
$dbtotal +=$DBARN}
$UnmanagedDBInstances = Compare-Object $dbtotal $CFRDS | ForEach-Object { $_.InputObject}
$UnmanagedDBInstances| Export-Csv $outputlocation/UnmanagedDBInstances.csv

#Unmanaged RDS Subnet Groups
$CFRDSSG = $totalstackresources | Where {$_.ResourceType -like "AWS::RDS::DBSubnetGroup"} | select -Property PhysicalResourceId | Add-Member -MemberType AliasProperty -Name InstanceId -Value PhysicalResourceId -PassThru | select InstanceId
$DBSGS = Get-RDSDBSubnetGroup | Select-Object DBSubnetGroupArn, DBSubnetGroupName, DBSubnetGroupDescription
$UnmanagedRDSSG = Compare-Object $DBSGS $CFRDSSG | ForEach-Object { $_.InputObject}
$UnmanagedRDSSG | Export-Csv $outputlocation/UnmanagedRDSSG.csv

#Unmanaged RDS Option Groups
$CFRDSOG = $totalstackresources | Where {$_.ResourceType -like "AWS::RDS::OptionGroup"} | select -Property PhysicalResourceId | Add-Member -MemberType AliasProperty -Name InstanceId -Value PhysicalResourceId -PassThru | select InstanceId
$DBOGS = Get-RDSOptionGroup | Select-Object OptionGroupArn
$UnmanagedRDSOG = Compare-Object $DBOGS $CFRDSOG | ForEach-Object { $_.InputObject}
$UnmanagedRDSOG | Export-Csv $outputlocation/UnmanagedRDSOG.csv

#Unmanaged SQS resources
$CFSQS = $totalstackresources | Where {$_.ResourceType -like "AWS::SQS::Queue"} | select -Property PhysicalResourceId
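#Get-SQSQueue returns queue URLs, which is what CloudFormation records as the PhysicalResourceId of a queue
$SQSIds = Get-SQSQueue | Select-Object @{Name='PhysicalResourceId'; Expression={$_}}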
$UnmanagedSQS = Compare-Object $SQSIds $CFSQS | ForEach-Object { $_.InputObject}
$UnmanagedSQS | Export-Csv $outputlocation/UnmanagedSQS.csv

#Unmanaged Volumes - Please note that there are a bunch of volumes not in use which are NOT included.
$CFVolume = $totalstackresources | Where {$_.ResourceType -like "AWS::EC2::Volume"} | select -Property PhysicalResourceId | Add-Member -MemberType AliasProperty -Name InstanceId -Value PhysicalResourceId -PassThru | select InstanceId
$VolumeIds = Get-ec2volume | Where {$_.State -eq "in-use"} | Select-Object VolumeId, Attachments,Size
$UnmanagedVolumes = Compare-Object $VolumeIds $CFVolume | ForEach-Object { $_.InputObject}
$UnmanagedVolumes | Export-Csv $outputlocation/Unmanagedvolumes.csv
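
Every block above follows the same pattern: take the physical IDs CloudFormation knows about for a resource type and subtract them from what actually exists. For anyone who prefers Python, here is a rough sketch of that idea. It is my own illustration, not a port of the script; the function names are made up, and you still have to collect the actual IDs per service yourself.

```python
def find_unmanaged(actual_ids, stack_resources, resource_type):
    """Return the IDs in actual_ids that no CloudFormation stack owns.

    stack_resources is a list of dicts with 'ResourceType' and
    'PhysicalResourceId' keys, the shape list_stack_resources returns.
    """
    managed = {
        r["PhysicalResourceId"]
        for r in stack_resources
        if r.get("ResourceType") == resource_type and r.get("PhysicalResourceId")
    }
    return sorted(set(actual_ids) - managed)


def fetch_stack_resources():
    """Collect resource summaries from every stack in the current region."""
    # Imported here so find_unmanaged can be used without AWS access
    import boto3
    cfn = boto3.client("cloudformation")
    resources = []
    for page in cfn.get_paginator("describe_stacks").paginate():
        for stack in page["Stacks"]:
            pages = cfn.get_paginator("list_stack_resources").paginate(
                StackName=stack["StackName"])
            for res_page in pages:
                resources.extend(res_page["StackResourceSummaries"])
    return resources
```

For example, find_unmanaged(instance_ids, fetch_stack_resources(), "AWS::EC2::Instance") would list the instances that were not created by any stack, assuming instance_ids came from describe_instances.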