Python Script to Identify IP Ranges for EC2 Instance Connect

Today I needed to identify which IP ranges need to be added to a security group so that EC2 Instance Connect can establish a connection to a node.

I absolutely did not want to open port 22 to the entire internet for obvious security reasons.

After reading through the AWS docs, I found that AWS maintains a JSON list of their IP ranges that can be periodically parsed in the event they add/remove/change any ranges associated with internal services.

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-connect-set-up.html

As a result, I wrote up a quick Python script so the engineer who is tasked with this doesn’t have to parse through thousands of lines of JSON to find the specific criteria needed.

Just change your desired region and service at the top of the file and run it – it will output the ranges you need to add to the specified security group.

from urllib.request import urlopen
import json

# Change these two values to target a different service/region
desired_service = 'EC2_INSTANCE_CONNECT'
desired_region = 'us-east-2'

# AWS publishes its full list of IP ranges at this URL
url = 'https://ip-ranges.amazonaws.com/ip-ranges.json'
response = urlopen(url)
json_obj = json.load(response)

# Walk every published prefix and print only the matches
for ip_range in json_obj['prefixes']:
    service = ip_range['service']
    ip_prefix = ip_range['ip_prefix']
    region = ip_range['region']

    if region.strip().upper() == desired_region.strip().upper():
        if service.strip().upper() == desired_service.strip().upper():
            print('[INFO] You should allow {ip_prefix} for {service} originating from {region}'.format(ip_prefix=ip_prefix, service=service, region=region))
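
As a follow-up thought, the same JSON can drive the security group change itself. Below is a minimal sketch of that idea (not part of the original script) – it assumes you have jq and a configured AWS CLI, and sg-0123456789abcdef0 is a placeholder group ID:

curl -s https://ip-ranges.amazonaws.com/ip-ranges.json \
  | jq -r '.prefixes[] | select(.service == "EC2_INSTANCE_CONNECT" and .region == "us-east-2") | .ip_prefix' \
  | while read -r CIDR; do
      # Each range gets its own port 22 ingress rule
      aws ec2 authorize-security-group-ingress \
        --group-id "sg-0123456789abcdef0" \
        --protocol tcp --port 22 --cidr "$CIDR"
    done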

Manually Truncating Docker Logs

Over the past couple of weeks I’ve been redesigning my office, and one thing I wanted to do once finished was make sure my home server was still up and operational.

I came to find out that it had run out of disk space, so processes weren’t starting, and naturally the only thing I was able to do was SSH into an environment with zero percent free disk space.

It took me a while to find out exactly what the issue was, but it turned out to be the Docker container logs (written by the json-file logging driver) for some microservices containers I have running.

The HD for the server is relatively small – like 12GB small – so once I figured this out and truncated the logs, everything booted like a charm.

Quick wildcard command I came up with:
truncate -s 0 /var/lib/docker/containers/*/*-json.log
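
As an aside, truncate is the right tool here because it zeroes each file in place and keeps the same inode, so the Docker daemon’s open file handles stay valid – deleting the log files outright would leave the daemon writing to unlinked files, and the space wouldn’t actually be freed until the containers restarted.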

Quick Cronjob:
Command – crontab -e
Entry to Insert – */30 * * * * truncate -s 0 /var/lib/docker/containers/*/*-json.log >/dev/null 2>&1

Here is the solution I followed from user kaushik on StackOverflow – the only difference is that I did “truncate -s 0 filename.log” instead of “cat /dev/null > filename.log”.

https://stackoverflow.com/a/62554053/2520289
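
Worth noting: Docker’s json-file logging driver can also rotate these logs by itself, which avoids the cron job entirely. A minimal sketch, assuming the default json-file driver is in use – the size/count values are just examples, and only newly created containers pick up the settings:

# Write the daemon config (back up any existing daemon.json first)
sudo tee /etc/docker/daemon.json > /dev/null <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
EOF

# Restart the daemon so newly created containers inherit the rotation settings
sudo systemctl restart docker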

OneDrive Hourly Refresh Utility For Syncing Large Quantities of Files

I am nearing the last day with my current job at FINEOS Corporation and transitioning into my new role at Benjamin in the coming weeks.

With my impending departure, I am making major efforts to offload >300GB worth of client data located in my FINEOS Office 365 OneDrive.

This data is from the various engagements I’ve worked on in the past three years – so it covers quite a bit of consulting work done to date.

I’m doing this in an effort to make sure my team members are able to continue day-to-day operations uninterrupted.

Nobody likes to be left with zero documentation or dead file links – it just sucks when you’re on the receiving end of this sort of thing.

Something that surprised me (and prompted this whole effort) was learning that 30 days after my departure, all files in my OneDrive will be deleted rather than archived – so that wasn’t good to hear.

Manual Backup Steps – With this background information, let’s step through the sequence of events that prompted this blog post and the subsequent utility script.

  1. Set OneDrive to download all files from the cloud (untick OnDemand, forcing all files to download locally)
  2. Create Storage Directories on Team SharePoint sites and click “Sync with OneDrive” on those directories
  3. Move the downloaded files into their associated SharePoint Sync folders on my computer
  4. OneDrive freaks out from >300GB of files, freezing up and periodically needing a restart of the client due to the indexing of >30,000 files

Automated Utility Script Sequence – So overall, I wrote a script that will automate Step 4 and restart OneDrive every hour:

  1. Start Script
  2. Enter Infinite While Loop
    1. Script attempts to Shut Down OneDrive
      1. Enter While Loop
      2. Detects how many instances of “OneDrive.exe” are running, if greater than or equal to 1 proceed to next step, if 0 break from while loop
      3. Initially will attempt to do “OneDrive.exe /shutdown” and sleep for 30 seconds
      4. Detects how many instances of “OneDrive.exe” are running, if greater than or equal to 1 proceed to next step, if 0 break from while loop
      5. After 30 seconds, will attempt to gracefully “taskkill” without the force flag – uses the term signal instead
      6. Detects how many instances of “OneDrive.exe” are running, if greater than or equal to 1 repeat loop, if 0 break from while loop
    2. Script attempts to Start Up OneDrive
    3. Script Sleeps for 1 hour before attempting to refresh OneDrive again
    4. Repeat Loop after Sleep timer ends

I was surprised to find that there are no formal methods for sending a periodic pause/resume command to OneDrive.exe, so I had to think outside the box on this one based on the limited information I was able to find on scripted usage of OneDrive.exe.

#!/bin/bash

# Time in Hours before attempting to Restart OneDrive
ONEDRIVE_REFRESH_INTERVAL_IN_HOURS="1"

# Global Variable for Function Usage that tracks if OneDrive is running
GLOBAL_ONEDRIVE_RUNNING="TRUE"

function onedrive_CheckIfRunning()
{
    echo "[INFO] [DetectIfRunning] Checking if OneDrive.exe is running..."

    ONEDRIVE_PROCESS_COUNT=$(tasklist | grep -i "OneDrive.exe" | wc -l)

    echo "[INFO] [DetectIfRunning] ONEDRIVE_PROCESS_COUNT is $ONEDRIVE_PROCESS_COUNT"

    if [ "$ONEDRIVE_PROCESS_COUNT" -ge "1" ];
    then
        GLOBAL_ONEDRIVE_RUNNING="TRUE"
    else
        GLOBAL_ONEDRIVE_RUNNING="FALSE"
    fi
}

function onedrive_ShutdownSleepTimer()
{
    # Create some self documenting variables to make the code
    # more descriptive on how the multiplication occurs
    SECONDS_IN_A_MINUTE="60"
    MINUTES_IN_A_HOUR="60"

    ONEDRIVE_REFRESH_INTERVAL_IN_SECONDS=$(($MINUTES_IN_A_HOUR * $SECONDS_IN_A_MINUTE * $ONEDRIVE_REFRESH_INTERVAL_IN_HOURS))

    echo "[INFO] [WAIT] Sleeping for $ONEDRIVE_REFRESH_INTERVAL_IN_HOURS hours or $ONEDRIVE_REFRESH_INTERVAL_IN_SECONDS seconds before attempting to restart OneDrive..."
    sleep $ONEDRIVE_REFRESH_INTERVAL_IN_SECONDS
}

function onedrive_Start()
{
    echo "[INFO] [START] Starting OneDrive..."
    start "" "$LOCALAPPDATA/Microsoft/OneDrive/OneDrive.exe"
}

function onedrive_Shutdown()
{
    echo "[INFO] [STOP] Refreshing OneDrive (Shutting Down) ..."

    GLOBAL_ONEDRIVE_RUNNING="TRUE"

    onedrive_CheckIfRunning

    while [ "$GLOBAL_ONEDRIVE_RUNNING" == "TRUE" ]
    do
        echo "[INFO] [STOP] Attempting Proper Shutdown of OneDrive.exe ..."
        start "" "$LOCALAPPDATA/Microsoft/OneDrive/OneDrive.exe" /shutdown

        echo "[INFO] [STOP] Sleeping for 30 seconds to allow OneDrive.exe time to shutdown ..."
        sleep 30

        onedrive_CheckIfRunning

        if [ "$GLOBAL_ONEDRIVE_RUNNING" == "TRUE" ];
        then
            echo "[INFO] [STOP] OneDrive.exe detected as running still, attempting a graceful task closure..."
            taskkill //im "OneDrive.exe"
            
            echo "[INFO] [STOP] Sleeping for 30 seconds while waiting for it to exit..."
            sleep 30
            
            onedrive_CheckIfRunning
        else
            GLOBAL_ONEDRIVE_RUNNING="FALSE"
        fi
    done
}

function main()
{
    while [ "TRUE" == "TRUE" ]
    do
        echo "[INFO] [HowToCloseScript] OneDrive Refresh Program is running in Endless While Loop, Press Ctrl+C to Stop Script if Desired ..."
        onedrive_Shutdown
        onedrive_Start
        onedrive_ShutdownSleepTimer
    done
}

main
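
For reference, this script is meant to run under Git Bash on Windows – it leans on tasklist, taskkill, start, and the $LOCALAPPDATA environment variable, none of which exist in a plain Linux shell. Saved under a hypothetical name, you’d run it like so:

bash ./onedrive_refresh.sh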

Bash Automation Script for Targeted Code Analysis

This is a restructure of the script I wrote the other day called “Bash Automation Script to Clone and Bulk Modify Files for Commit”.

In this version, I am using it for something of an impact analysis, to help identify all filenames that match a given Jira ID so I can write a patch that effectively reverses the changes for work that I’m doing.

This is helpful for situations where the final artifact you have is generated from a codebase you may not have direct access to.

The sequence is: check the code out, build it, search for the files/directories that match the pattern you’re looking for, recreate their directory structure under the target directory, then copy each file/directory into it.

#!/bin/bash
SCRIPT_DIRECTORY=$(pwd)

# Target SVN Path to Checkout and Modify
SVN_CHECKOUT_URL="https://svn.company.com/path/to/checkout"

# Target Checkout Directory
SVN_CHECKOUT_DIRECTORY="/c/Code/Dynamic_Checkout"

# Filenames or directories we want to find
declare -a PATTERN_TO_TARGET_LIST
PATTERN_TO_TARGET_LIST+=("JIRA-ID-1")
PATTERN_TO_TARGET_LIST+=("JIRA-ID-2")

function find_matching_files()
{
    #Spaces in a String will mess up a for loop - https://askubuntu.com/questions/344407/how-to-read-complete-line-in-for-loop-with-spaces
    IFS=$'\n' 

    for PATTERN_TO_TARGET in "${PATTERN_TO_TARGET_LIST[@]}"
    do
        echo "[INFO] Looking for $PATTERN_TO_TARGET ..."

        FILE_OR_DIRECTORY_LIST=$(find "$SVN_CHECKOUT_DIRECTORY" -iname "*$PATTERN_TO_TARGET*" ) 

        for FILE_OR_DIRECTORY in $FILE_OR_DIRECTORY_LIST
        do
            echo "[INFO] Found $FILE_OR_DIRECTORY - Copying to Script Directory..."

            # Get Preceding Directory of the File or Directory
            # Input Example - /my/directory/to/target
            # Target Output - /my/directory/to/

            DIRECTORY_OF_FILE_OR_DIRECTORY=$(performConditionalActionOnFile "GET_JUST_THE_PRECEDING_DIRECTORY" "$FILE_OR_DIRECTORY")
            
            # Get Directory without the SVN_CHECKOUT_DIRECTORY
            # so we can create the directory structure from the origin
            # Input Example - /c/Code/Dynamic_Checkout/SearchResult/FileResult
            # Target to Remove - /c/Code/Dynamic_Checkout/
            # Target Output - /SearchResult/FileResult

            DIRECTORY_OF_FILE_WITHOUT_UNDESIRED_PATH_PREFIX=$(misc_findReplace "$SVN_CHECKOUT_DIRECTORY" "" "$DIRECTORY_OF_FILE_OR_DIRECTORY")

            # Target Directory to copy the FILE_OR_DIRECTORY to
            # Ideally we are copying:
            # FROM /c/Code/Dynamic_Checkout/SearchResult/FileResult
            # TO /c/ThisScriptPathDirectory/SearchResult/FileResult

            TARGET_DIRECTORY_FOR_COPY="$SCRIPT_DIRECTORY/$DIRECTORY_OF_FILE_WITHOUT_UNDESIRED_PATH_PREFIX"
            
            mkdir -p "$TARGET_DIRECTORY_FOR_COPY"

            cp -R "$FILE_OR_DIRECTORY" "$TARGET_DIRECTORY_FOR_COPY"
        done
    done

    #Turn it off after - https://askubuntu.com/questions/344407/how-to-read-complete-line-in-for-loop-with-spaces
    unset IFS
}

function performConditionalActionOnFile()
{
    STRING_MODE="$1"

    FILE_NAME=$(basename -- "$2")
    FILE_EXTENSION="${FILE_NAME##*.}"
    FILE_NAME="${FILE_NAME%.*}"

    ARCHIVE_DIRECTORY=$(dirname "$2")
    ARCHIVE_OUTPUT_DIRECTORY="$ARCHIVE_DIRECTORY/$FILE_NAME"

    if [ "$STRING_MODE" == "GET_JUST_THE_FILENAME_WITH_EXTENSION" ]; then
        echo "$FILE_NAME.$FILE_EXTENSION"
    fi

    if [ "$STRING_MODE" == "GET_JUST_THE_FILENAME_NO_EXTENSION" ]; then
        echo "$FILE_NAME"
    fi

    if [ "$STRING_MODE" == "GET_JUST_THE_PRECEDING_DIRECTORY" ]; then
        echo "$ARCHIVE_DIRECTORY"
    fi
}

function misc_findReplace()
{
    VARIABLE_FIND="$1"
    VARIABLE_REPLACE="$2"
    VARIABLE_STRING="$3"

    echo "$VARIABLE_STRING" | sed --expression "s|${VARIABLE_FIND}|${VARIABLE_REPLACE}|g"
}

function build()
{
    cd "$SVN_CHECKOUT_DIRECTORY"
    chmod 777 ./gradlew
    ./gradlew assemble --no-daemon --console=plain 
}

function checkout()
{
    # rm -Rf "$SVN_CHECKOUT_DIRECTORY"
    mkdir -p "$SVN_CHECKOUT_DIRECTORY"
    svn checkout "$SVN_CHECKOUT_URL" "$SVN_CHECKOUT_DIRECTORY"
}

function main()
{
    checkout
    build
    find_matching_files
}

main

Bash Automation Script to Clone and Bulk Modify Files for Commit

I have this habit of writing scripts for simple but repetitive tasks.

Most of the time, I do this because I’m afraid of introducing a certain degree of human error depending on how many files I’m modifying.

In this case, the script’s purpose was to modify many environment files at a time and insert the same properties into each file if they matched whitelist conditions – and that is what the below does.

#!/bin/bash
SCRIPT_DIRECTORY=$(pwd)

# Target Directory to Modify Properties Files
DIRECTORY_DIST="$SCRIPT_DIRECTORY/path/to/folder/with/files/to/modify/specifically"

# Target SVN Path to Checkout and Modify
SVN_CHECKOUT_URL="https://subversion.mycompany.com/svn/path/to/folder/with/files/to/modify/specifically"

# Environments we want to target
declare -a ENVIRONMENT_TO_TARGET_LIST
ENVIRONMENT_TO_TARGET_LIST+=("DEV")
ENVIRONMENT_TO_TARGET_LIST+=("TEST")
ENVIRONMENT_TO_TARGET_LIST+=("PROD")

# File we want to modify minus the environment string
DEV_FILE_PATTERN_TO_FIND_PREFIX="file-dev-prefix-"
DEV_FILE_PATTERN_TO_FIND_SUFFIX=".yaml"

TEST_FILE_PATTERN_TO_FIND_PREFIX="file-test-prefix-"
TEST_FILE_PATTERN_TO_FIND_SUFFIX=".yaml"

PROD_CONFIG_FILE_PATTERN_TO_FIND_PREFIX="file-prod-prefix-"
PROD_CONFIG_FILE_PATTERN_TO_FIND_SUFFIX=".yaml"

function make_change()
{
    FILE_TO_MODIFY_PATH="$1"

    echo '' >> "$FILE_TO_MODIFY_PATH"
    echo '' >> "$FILE_TO_MODIFY_PATH"
    echo '# -- My-Change-Tracking-ID -- BEGIN --' >> "$FILE_TO_MODIFY_PATH"
    echo 'AdditionalPropertyToAppend=Property' >> "$FILE_TO_MODIFY_PATH"
    echo "# -- My-Change-Tracking-ID -- END --" >> "$FILE_TO_MODIFY_PATH"
    echo '' >> "$FILE_TO_MODIFY_PATH"
}

function identify_files_and_make_change()
{
    PROPERTY_FILE_TO_MODIFY_LIST=$(find "$DIRECTORY_DIST" \
                                         \( -iname "$DEV_FILE_PATTERN_TO_FIND_PREFIX*$DEV_FILE_PATTERN_TO_FIND_SUFFIX" \
                                            -o \
                                            -iname "$TEST_FILE_PATTERN_TO_FIND_PREFIX*$TEST_FILE_PATTERN_TO_FIND_SUFFIX" \
                                            -o \
                                            -iname "$PROD_CONFIG_FILE_PATTERN_TO_FIND_PREFIX*$PROD_CONFIG_FILE_PATTERN_TO_FIND_SUFFIX" \)) 

    #Spaces in a String will mess up a for loop - https://askubuntu.com/questions/344407/how-to-read-complete-line-in-for-loop-with-spaces
    IFS=$'\n' 

    for PROPERTY_FILE_TO_MODIFY in $PROPERTY_FILE_TO_MODIFY_LIST
    do
        PROPERTY_FILE_TO_MODIFY_JUST_FILENAME=$(performConditionalActionOnFile "GET_JUST_THE_FILENAME_WITH_EXTENSION" "$PROPERTY_FILE_TO_MODIFY")

        IS_TARGETTED_ENVIRONMENT="FALSE"

        for ENVIRONMENT_TO_TARGET in "${ENVIRONMENT_TO_TARGET_LIST[@]}"
        do
            IS_TARGETTED_ENVIRONMENT=$(misc_contains_string_insensitive "$ENVIRONMENT_TO_TARGET" "$PROPERTY_FILE_TO_MODIFY_JUST_FILENAME")

            if [ "$IS_TARGETTED_ENVIRONMENT" == "TRUE" ]; then
                echo "[INFO] [Modifying] $PROPERTY_FILE_TO_MODIFY_JUST_FILENAME - Entry Detected in the ENVIRONMENT_TO_TARGET_LIST ($ENVIRONMENT_TO_TARGET) ..."
                
                make_change "$PROPERTY_FILE_TO_MODIFY"
                break
            fi
        done

        if [ "$IS_TARGETTED_ENVIRONMENT" == "FALSE" ]; then
            echo "[INFO] [Skipping] $PROPERTY_FILE_TO_MODIFY_JUST_FILENAME - Entry Not Detected in the ENVIRONMENT_TO_TARGET_LIST ... "
        fi
    done

    #Turn it off after - https://askubuntu.com/questions/344407/how-to-read-complete-line-in-for-loop-with-spaces
    unset IFS
}

function performConditionalActionOnFile()
{
    STRING_MODE="$1"

    FILE_NAME=$(basename -- "$2")
    FILE_EXTENSION="${FILE_NAME##*.}"
    FILE_NAME="${FILE_NAME%.*}"

    ARCHIVE_DIRECTORY=$(dirname "$2")
    ARCHIVE_OUTPUT_DIRECTORY="$ARCHIVE_DIRECTORY/$FILE_NAME"

    if [ "$STRING_MODE" == "GET_JUST_THE_FILENAME_WITH_EXTENSION" ]; then
        echo "$FILE_NAME.$FILE_EXTENSION"
    fi

    if [ "$STRING_MODE" == "GET_JUST_THE_FILENAME_NO_EXTENSION" ]; then
        echo "$FILE_NAME"
    fi

    if [ "$STRING_MODE" == "GET_JUST_THE_PRECEDING_DIRECTORY" ]; then
        echo "$ARCHIVE_DIRECTORY"
    fi
}

function misc_contains_string_insensitive()
{
    PATTERN_TO_FIND="$1"
    STRING_TO_PROCESS="$2"

    echo "$STRING_TO_PROCESS" | grep -iq "$PATTERN_TO_FIND"

    EXIT_CODE=$?

    #Check to see if the grep match succeeded or not
    if [ "$EXIT_CODE" == "0" ]; then
        echo "TRUE"
    else
        echo "FALSE"
    fi
}

function checkout()
{
    rm -Rf "$DIRECTORY_DIST"
    mkdir -p "$DIRECTORY_DIST"
    svn checkout "$SVN_CHECKOUT_URL" "$DIRECTORY_DIST"
}

function main()
{
    checkout
    identify_files_and_make_change
}

main

Declaring a Populated HashMap in JavaScript

I was looking for a way to define a pre-populated HashMap today in JavaScript for a custom Chrome Extension I’m writing.

Personally, I’ve never liked the syntax of “define the array” and “add items to it one-at-a-time”.

It took me a bit to find some examples, but in the process I found some great documentation from developers across the web – below are their examples – please go check their content out!

Anyways, here is the final solution that I came up with via StackOverflow.

var inlineHashmap = {
  'cat' : 'asdf',
  'dog' : 'jkl;',
}

inlineHashmap['cat']
> "asdf"

inlineHashmap['dog']
> "jkl;"

The first is from Christoph on StackOverflow (https://stackoverflow.com/a/14711978/2520289):

JavaScript’s object literal syntax, which is typically used to instantiate objects (seriously, no one uses new Object or new Array), is as follows:
var obj = {
    'key': 'value',
    'another key': 'another value',
     anUnquotedKey: 'more value!'
};
For arrays it's:

var arr = [
    'value',
    'another value',
    'even more values'
];

If you need objects within objects, that's fine too:

var obj = {
    'subObject': {
        'key': 'value'
    },
    'another object': {
         'some key': 'some value',
         'another key': 'another value',
         'an array': [ 'this', 'is', 'ok', 'as', 'well' ]
    }
}
This convenient method of being able to instantiate static data is what led to the JSON data format.

JSON is a little more picky, keys must be enclosed in double-quotes, as well as string values:

{"foo":"bar", "keyWithIntegerValue":123}

The second set of examples comes from Vui Nguyen aka SunfishGurl (https://sunfishempire.wordpress.com/2014/08/19/5-ways-to-use-a-javascript-hashmap/):

Some time ago, I needed to use a JavaScript hashmap. A hashmap is useful for many reasons, but the main reason I needed one was to be able to find and modify an object, indexed by a unique string, without having to loop through an array of those objects every time.

In order words, I needed to search through my object collection using a unique key value. Key-Value collections are similar to dictionaries in Python, or hashmaps / hashtables in Java.

As far as I can tell, the standard JavaScript language does have a rather simple hashmap implementation, but the “keys” can only be string values. There are some good folks out there who have implemented more complex JS hashmaps. But the ol’ standby is good enough for me, so I’m using it here.

As a Titanium developer, I typically use “Ti.API.log” to print to the console. But since this topic applies to JavaScript in general, I will be using “console.log” for the print statements. For those Titanium developers out there, both function calls should work for you. 🙂

So here goes, 5 ways you can use a JavaScript hashmap:

5 – Create hashmap and add keys
// Create the hashmap
var animal = {};
// Add keys to the hashmap
animal['cat'] = { sound: 'meow', age:8 };
animal['dog'] = { sound: 'bark', age:10 };
animal['bird'] = { sound: 'tweet', age:2 };
animal['cow'] = { sound: 'moo', age:5 };

4 – Print all objects in the hashmap
for (var x in animal)
{
    console.log('Key:\n--- ' + x + '\n');
    console.log('Values: ');
    var value = animal[x];
    for (var y in value)
    {
        console.log('--- ' + y + ':' + value[y]);
    }
    console.log('\n');
}

Here’s a sample of the output:
> Key:
> --- cat
> Values:
> --- sound:meow
> --- age:8
>
> Key:
> --- dog
> Values:
> --- sound:bark
> --- age:10
>
> Key:
> --- bird
> Values:
> --- sound:tweet
> --- age:2
>
> Key:
> --- cow
> Values:
> --- sound:moo
> --- age:5

3 – Check for the existence of a key, and modify the key
Without a hashmap, you would have to do this:
for (i = 0; i < numObjects; i++)
{
    if (animal[i].type == 'cat')
    {
        animal[i].sound = 'hiss';
    }
}

But with a hashmap, you can just do this:
// check for the existence of 'cat' key
if ('cat' in animal)
{
     // modify cat key here
    animal['cat'].sound = 'hiss';
}
// Sweet, huh?

2 – Delete a key
// check to see if key already exists
if ('cat' in animal)
{
     // then, delete it
    delete animal['cat'];
}

1 – Count the number of keys
With JS hashmaps, you can’t just do this — animal.length — to get the number of keys, or objects in your hashmap. Instead, you’ll need a few more lines of code:

var count = 0;
for (x in animal)
{ count++; }
console.log('The number of animals are: ' + count + '\n');

Here’s a sample of the output:
> The number of animals are: 4

There you have it, 5 ways to use a JavaScript hashmap. If you have examples of other uses, or if you’ve implemented a JS hashmap yourself that you’d like to share, please feel free to drop the link to your code in the comments below.

And finally, I referenced the following articles to help me with writing this one. Many thanks to the authors! :
http://stackoverflow.com/a/8877719
http://www.mojavelinux.com/articles/javascript_hashes.html
http://www.codingepiphany.com/2013/02/26/counting-associative-array-length-in-javascript/

Thanks, and hope you find this article useful.

AWS Log Insights – Replace Expression Generator Using Bash

I drafted this quick script up to support the query logic I wrote up yesterday.

This also serves as a good baseline example for doing a for-loop over a string array, string comparison using if statements, and also checking the length of a string.

#!/bin/bash

declare -a VARIABLE_REPLACE_LIST=()

VARIABLE_REPLACE_LIST+=("0")
VARIABLE_REPLACE_LIST+=("1")
VARIABLE_REPLACE_LIST+=("2")
VARIABLE_REPLACE_LIST+=("3")
VARIABLE_REPLACE_LIST+=("4")
VARIABLE_REPLACE_LIST+=("5")
VARIABLE_REPLACE_LIST+=("6")
VARIABLE_REPLACE_LIST+=("7")
VARIABLE_REPLACE_LIST+=("8")
VARIABLE_REPLACE_LIST+=("9")

VARIABLE_ROOT_VALUE="@message"

function getReplaceString()
{
    VARIABLE_INPUT="$1"
    VARIABLE_VALUE_TO_FIND="$2"
    VARIABLE_VALUE_TO_REPLACE_WITH="$3"
    echo "replace($VARIABLE_INPUT, \"$VARIABLE_VALUE_TO_FIND\", \"$VARIABLE_VALUE_TO_REPLACE_WITH\")"
}

function main()
{
    VARIABLE_EXPRESSION_STRING=""

    for VARIABLE_REPLACE_ENTRY in "${VARIABLE_REPLACE_LIST[@]}";
    do
        LENGTH_OF_REPLACE_STRING="${#VARIABLE_EXPRESSION_STRING}"

        if [[ "$LENGTH_OF_REPLACE_STRING" == "0" ]]; then
            echo "VARIABLE_EXPRESSION_STRING is Empty - Setting to Initial Value..."
            VARIABLE_EXPRESSION_STRING=$(getReplaceString "$VARIABLE_ROOT_VALUE" "$VARIABLE_REPLACE_ENTRY" "")
        else
            echo "VARIABLE_EXPRESSION_STRING is Not Empty - Doing logic..."
            VARIABLE_EXPRESSION_STRING=$(getReplaceString "$VARIABLE_EXPRESSION_STRING" "$VARIABLE_REPLACE_ENTRY" "")
        fi

        echo "VARIABLE_EXPRESSION_STRING Current Value = $VARIABLE_EXPRESSION_STRING"
    done
}

main
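
To make the output concrete, here is what the accumulated expression looks like after the first two loop iterations (informational echo lines omitted):

replace(@message, "0", "")
replace(replace(@message, "0", ""), "1", "")

After all ten digits it becomes the fully nested replace chain used in the Log Insights query in the next post below.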

AWS Log Insight Query – Generate Count of Unique Errors in Log Stream with Subquery to Dig Down into Exceptions

This was a cool query to write.

It does the following in AWS CloudWatch using Log Insights query engine:

  1. Parses all @messages for exceptions/errors/etc. and generates unique error signatures via removal of numerics
  2. Generates a count of how many times each error type occurs
  3. Generates a subquery that can be copy-pasted to dive into the results behind that count

# INSTRUCTIONS FOR USAGE

# 1. ErrorCount Column shows the count for this unique error type across all log messages

# 2. LogMessage Column shows the unique error with numerics removed to show how many 
#    times this type of error is occurring across all logs

# 3. QueryString Column generates a query that can be copy-pasted into Log Insights
#    and used as a follow-up query to dig into the exceptions and allow for stack trace analysis 
#    across all occurrences of the errors 

# 3A. The query that is generated will work most of the time but in some instances will require 
#     that you search only part of it due to the lack of wildcard support in Log Insights.
#
#     Generated Query:
#     - fields @timestamp, @message, @logStream 
#       | filter @message like "Error with . asdf extra things but numerics have been botched"
#
#     Example to Fix from Above Filter:
#     - "Error with . asdf extra things but numerics have been botched"
#
#     Example of Better Query Syntax Revision:
#     - "asdf extra things but numerics have been botched"
#
#     Final Query for Usage:
#     - fields @timestamp, @message, @logStream 
#       | filter @message like "asdf extra things but numerics have been botched"

#Generate Count of Unique Errors - The replace below removes all numerics to generate a unique error
stats count(*) as ErrorCount by replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(@message, "0", ""), "1", ""), "2", ""), "3", ""), "4", ""), "5", ""), "6", ""), "7", ""), "8", ""), "9", "") as LogMessage, 

#Generate Query String for Diving into Results - Copy Pastable
concat(concat('fields @timestamp, @message, @logStream | filter @message like "', replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(@message, "0", ""), "1", ""), "2", ""), "3", ""), "4", ""), "5", ""), "6", ""), "7", ""), "8", ""), "9", ""), " ", " ")),'"') as QueryString_For_Log_Analysis

#Specify the Log Stream Environment if Multiple Environments Exist - (?i) makes it case insensitive
| filter @logStream like /(?i)MyCoolApplicationLogStream/ 

#Specify the Log Criteria - Example below covers exception, caused by, error
| filter @message like /(?i)exception/ or @message like /(?i)caused by/ or @message like /(?i)error/
| display ErrorCount, LogMessage, QueryString_For_Log_Analysis
| sort by ErrorCount desc

Stockholm Syndrome of Software

I’ve referenced this concept so many times in the past but have never put my thoughts down on paper – also, please read the quote at the bottom of the post, as it is another excellent representation of this concept.

The perspective I always like to give is about becoming comfortable/complacent with how things are in your job or in the application that you’re actively coding.

At the start, when you become a developer on a legacy application, you absolutely hate it and its lack of clear/concise documentation – not to mention its spaghetti/house-of-cards codebase.

Over time, you learn to appreciate and sympathize with its design paradigms despite your initial reaction that it was (and still is) a dumpster fire – hence the “Stockholm Syndrome of Software” idea.

This is how I feel about Google Cloud and Amazon AWS most of the time, with how disorganized, undocumented, and confusing they are – along with a few applications I’ve had to work on in my career.

I’ve linked people to this quote below so many times that I realized it’s potentially going to 404 one day, so I wanted to go ahead and save it for future coding generations.

I’ve talked about how customers get so attached to failed code, trying to save some form of cost from a failed software project and unwilling to part with the disaster, that I’ve come up with a term for it. I refer to it as the “Stockholm Syndrome of Software.” The basic idea is that customers get so attached to failed software projects, they will try to do anything to save the investment, including trying to sprinkle a new software project with failed pieces of software.

It is understandable. On the surface, this makes sense. Surely somewhere in this pile of code, there is something that it makes sense to keep. Or, another view of it is that we, the company, can just throw out the old developers, bring some newer/better developers in to solve our problems. These new developers, all they need to do is to cut the head off of a live chicken, perform a voodoo dance around a keyboard, presto changeo, and we have a fully running system.

This is a nightmare. The code failed for a reason. If the previous set of developers didn’t know what they were doing, why do you think the architecture that they started is worth a damn? Why run on top of the old software? Why would you want to infect good code with bad?

Sorry folks, software that doesn’t work and never reached the level of being acceptable for use by being deployed is not really suitable for use. Instead of spending good money on top of bad and trying to keep software on life support that should be shot, go ahead and admit that the software is a sunk cost. Throw the non working code away. Get a set of developers that are trustworthy and can deliver. Don’t micromanage them. Don’t tell them to just put a few tweaks on the non working code. Don’t cling to the old code, trust me, you will be better off.

I find that this problem is rampant. Everyone thinks that they can save a few bucks by going the cheap route. The cheap route doesn’t tend to work. The cheap route costs more with software that doesn’t quite work. It fails in weird places. It craps out with 5 users. It does all the wrong stuff at the wrong time. Trust me, you are better off without cheap, crappy code. Let it go, and do it right.

– Wallace B. McClure, April 9, 2018
https://weblogs.asp.net/wallym/stockholm-syndrome-of-software

AWS CLI Logs Tip – Converting Local Time to UTC using Bash for Time Ranges

I’d been assigned a task to do some log analysis via AWS and didn’t want to use the AWS UI every time I needed to do the analysis, as it would be a recurring task for a few days.

I wrote the below snippet to allow for quick variable substitution each time I needed to grab these logs – converting the times to the appropriate UTC values while keeping the inputs human-readable, since all email communication around time ranges was in EST.

Hope this may help someone out in the future, as I was very particular that I wanted to use built-in bash functionality instead of installing extra commands/libraries, for the sake of portability.

function generateEpochString()
{
    TIME_TO_CONVERT="$1"

    #We use >&2 to prevent messing the return value of the function up
    #This allows us to print the echo statements to stderr and final return value to stdout
    #See more - https://superuser.com/a/1320694/233708

    #Begin conversion of time to UTC
    echo "Converting $TIME_TO_CONVERT to UTC..." >&2

    #Convert Local Timezone String to UTC 
    VARIABLE_TIME_UTC_STRING=$(date -u -d "$TIME_TO_CONVERT")
    echo "$TIME_TO_CONVERT converted to UTC is $VARIABLE_TIME_UTC_STRING ..." >&2

    #Convert Local Timezone String to UTC Epoch
    echo "Converting $TIME_TO_CONVERT to UTC Epoch..." >&2
    VARIABLE_TIME_UTC_EPOCH=$(date -u -d "$TIME_TO_CONVERT" +"%s")
    echo "$TIME_TO_CONVERT converted to UTC Epoch is $VARIABLE_TIME_UTC_EPOCH ..." >&2
    echo "$VARIABLE_TIME_UTC_EPOCH"
}

function generateLocalTimezoneString()
{
    VARIABLE_MONTH="$1"
    VARIABLE_DAY="$2"
    VARIABLE_YEAR="$3"
    VARIABLE_TIME="$4"
    VARIABLE_TIMEZONE="$5"

    VARIABLE_LOCAL_TIMEZONE_STRING="$VARIABLE_MONTH/$VARIABLE_DAY/$VARIABLE_YEAR $VARIABLE_TIME $VARIABLE_TIMEZONE"
    echo "$VARIABLE_LOCAL_TIMEZONE_STRING"
}

function main()
{
    #Declare variables for Start Time in Local Timezone
    VARIABLE_START_MONTH="06"
    VARIABLE_START_DAY="30"
    VARIABLE_START_YEAR="2021"
    VARIABLE_START_TIME="10:21:22"
    VARIABLE_START_TIMEZONE="EST"

    VARIABLE_START_TIME_EST_STRING=$(generateLocalTimezoneString "$VARIABLE_START_MONTH" "$VARIABLE_START_DAY" "$VARIABLE_START_YEAR" "$VARIABLE_START_TIME" "$VARIABLE_START_TIMEZONE")
    VARIABLE_START_TIME_UTC_EPOCH=$(generateEpochString "$VARIABLE_START_TIME_EST_STRING")

    #Declare variables for End Time in Local Timezone
    VARIABLE_END_MONTH="06"
    VARIABLE_END_DAY="30"
    VARIABLE_END_YEAR="2021"
    VARIABLE_END_TIME="13:21:22"
    VARIABLE_END_TIMEZONE="EST"

    VARIABLE_END_TIME_EST_STRING=$(generateLocalTimezoneString "$VARIABLE_END_MONTH" "$VARIABLE_END_DAY" "$VARIABLE_END_YEAR" "$VARIABLE_END_TIME" "$VARIABLE_END_TIMEZONE")
    VARIABLE_END_TIME_UTC_EPOCH=$(generateEpochString "$VARIABLE_END_TIME_EST_STRING")

    #Confirm the Variables before Execution
    echo "AWS Log Request - Start Time - Local Time - $VARIABLE_START_TIME_EST_STRING"
    echo "AWS Log Request - Start Time - Epoch UTC Time - $VARIABLE_START_TIME_UTC_EPOCH"
    echo "AWS Log Request - End Time - Local Time - $VARIABLE_END_TIME_EST_STRING"
    echo "AWS Log Request - End Time - Epoch UTC Time - $VARIABLE_END_TIME_UTC_EPOCH"

    #Pull the Logs - Example Usage
    aws logs start-query --log-group-name /aws/batch/job --start-time "$VARIABLE_START_TIME_UTC_EPOCH" --end-time "$VARIABLE_END_TIME_UTC_EPOCH" --query-string 'fields @message | limit 10'
}

main
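
One caveat worth noting: date treats “EST” as a fixed UTC-5 offset, so for dates that fall inside daylight saving time you’d want “EDT” (or better, a zone name like “America/New_York”). Also, aws logs start-query only returns a query ID – you still have to poll for the results. A minimal sketch of that follow-up, under the same log group assumption as the example above:

#Kick the query off and capture just the queryId from the JSON response
QUERY_ID=$(aws logs start-query \
    --log-group-name /aws/batch/job \
    --start-time "$VARIABLE_START_TIME_UTC_EPOCH" \
    --end-time "$VARIABLE_END_TIME_UTC_EPOCH" \
    --query-string 'fields @message | limit 10' \
    --output text --query 'queryId')

#Give the query time to finish - production usage should poll until the
#returned status field reads "Complete" instead of sleeping blindly
sleep 5
aws logs get-query-results --query-id "$QUERY_ID"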