
Manual AWS Autoscaling without Elastic Beanstalk

Gone are the days of running over-provisioned hardware in anticipation of a traffic spike (and so are the days of running under-provisioned hardware that lets our sites down at crucial times). Cloud servers with autoscaling allow us to expand or shrink our server capacity on demand.

At KnackForge we make extensive use of Amazon Web Services (AWS) to offer effective solutions for our clients' businesses. In this post I would like to describe how we architected our dynamically scaling server environment, and how it has evolved since its inception.

AutoScaling with AWS

Amazon Web Services (AWS) is one of the pioneers of cloud technology and offers affordable autoscaling to everyone, including small and medium businesses. In brief, autoscaling is a way to step resources up or down on demand.

Though AWS has had autoscaling from its very early stages, it was particularly hard to set things up initially. They introduced Elastic Beanstalk to automate everything required for autoscaling; it essentially wraps the individual services involved, namely Elastic Compute Cloud (EC2), Elastic Load Balancer (ELB), Amazon CloudWatch, the AWS Autoscaling API, Amazon S3 and Amazon IAM [for HTTPS].
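
To make "step up or down on demand" concrete, here is a minimal Python sketch of the kind of decision a scaling policy makes. The function name, the CPU thresholds and the one-server step are illustrative assumptions of ours, not AWS defaults; on AWS the real decision is driven by CloudWatch alarms and scaling policies.

```python
def desired_capacity(current, cpu_percent, min_servers, max_servers,
                     scale_up_at=70.0, scale_down_at=30.0):
    """Return the next server count for a simple threshold policy.

    The thresholds and the one-server step are hypothetical values
    chosen for illustration only.
    """
    if cpu_percent >= scale_up_at:
        target = current + 1
    elif cpu_percent <= scale_down_at:
        target = current - 1
    else:
        target = current
    # The group never leaves the [min_servers, max_servers] range.
    return max(min_servers, min(max_servers, target))


if __name__ == "__main__":
    print(desired_capacity(2, 85.0, 1, 4))  # busy: grow by one server
    print(desired_capacity(1, 10.0, 1, 4))  # idle, but already at the minimum
    print(desired_capacity(3, 50.0, 0, 0))  # min = max = 0 drains the group
```

The last call shows why setting both limits to 0 is enough to empty a group, which is the trick the rest of this post relies on.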

Limitations of Elastic Beanstalk

In our experience so far, Elastic Beanstalk's front-end EC2 instances are loaded with a heavy set of Ruby scripts that eat up CPU cycles, so websites hosted on them at times perform considerably worse than they would on a stand-alone EC2 instance. Moreover, with Beanstalk we are limited to the AMIs (and the operating systems) provided by Amazon for running our front-end servers. So we chose to drop Elastic Beanstalk and go back to the old method of connecting the various components of autoscaling manually, with an AMI of our choice.

Git integration of AutoScaling

One Beanstalk feature that we wanted to preserve on our manually deployed servers is the git integration. It's very handy to publish new code to the autoscaled front-end servers with just a git command!

Autoscaling without Beanstalk - The Approach

The official documentation on autoscaling (http://aws.amazon.com/documentation/autoscaling) helped us set up most of what is required. On top of that, we added our own customizations for ease of setup and code updates, inspired by Elastic Beanstalk's architecture and functionality. The general overview is as follows:

  1. The front-end servers are configured with a boot-time script that downloads the latest available code from an S3 bucket.
  2. The usual setup of autoscaled instances is done as explained in the official documentation (link above).
  3. We do, however, create two autoscaling groups: one active, the other passive [no running instances in it]. Each group is connected to its own load balancer, but only the active group's load balancer is pointed to by the website's ROOT domain.
  4. When we need to push updated code to the front-end servers, we upload a new copy of the code to S3, then activate the passive autoscaling group [by raising its min-servers and max-servers above 0]. Once the passive group's servers are stable and ready to serve requests, we re-map our ROOT domain to the newly activated group and make the previously active group inactive by setting its min-servers = 0 and max-servers = 0. Beanstalk had scripts running inside the front-end servers that watched for and pulled the latest code, but we do this externally for performance reasons. Alternatively, instead of having two autoscaling groups, we could have just one and temporarily stop all front-end servers and restart them by adjusting the min-servers and max-servers configuration. But that could cause downtime, hence we did it this way.
  5. We automated the whole process (zipping the latest git code and pushing it to S3, identifying the currently active and passive autoscaling groups, adjusting min-servers / max-servers, and doing the DNS mapping) into a single git command, "git kf.switch". Once we've committed or pulled the latest code into a git working directory configured with kf.switch, all we have to do is run "git kf.switch" to make it available to the front-end servers, with virtually zero downtime.
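
The order of operations in steps 4 and 5 matters, so here is a minimal Python sketch of it. The three callables (`scale`, `count_in_service`, `point_dns`) are hypothetical stand-ins for the Auto Scaling and Route 53 calls, and the defaults are only example values; this illustrates the sequence and is not our production script.

```python
import time


def swap_groups(active, passive, *, scale, count_in_service, point_dns,
                min_servers=1, max_servers=4, attempts=25, delay=20,
                sleep=time.sleep):
    """Promote `passive` and retire `active`; return the new active group.

    `scale`, `count_in_service` and `point_dns` are hypothetical callables
    standing in for the Auto Scaling and Route 53 calls.
    """
    # 1) Start servers in the passive group.
    scale(passive, min_servers, max_servers)

    # 2) Wait until at least one server is in service, or give up.
    for _ in range(attempts):
        if count_in_service(passive) > 0:
            break
        sleep(delay)
    else:
        raise RuntimeError("no in-service servers in group %r" % passive)

    # 3) Point the root domain at the newly started group.
    point_dns(passive)

    # 4) Only now drain the old group.
    scale(active, 0, 0)
    return passive


if __name__ == "__main__":
    log = []
    new_active = swap_groups(
        "group-1", "group-2",
        scale=lambda group, lo, hi: log.append(("scale", group, lo, hi)),
        count_in_service=lambda group: 1,
        point_dns=lambda group: log.append(("dns", group)),
        sleep=lambda seconds: None,
    )
    print(new_active, log)
```

Because the old group is drained only after the DNS change, a failure while waiting for the new servers leaves the live site untouched.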

First-time Setup (creation of autoscaling groups, load balancers, etc.)

As mentioned earlier, we feel that the official documentation on autoscaling already has enough examples to set up a basic autoscaling system. Create one and then make a replica of it, as we will use the two interchangeably as active / passive groups. You can start with any AMI and quickly replace it with a different one later.

A standard autoscaling setup will work. We just assume that the code will be deployed to the /var/www/html directory (similar to Beanstalk), that a special init.d script will be added to the AMI (see next section), and that Route 53 will be used to point the ROOT domain to one of the load balancers created (remember: we create two sets).

We suggest choosing the CLI tools over PHP or any other language-specific API for setting this up, as we found it much easier this way.

Settings on the Front-End Instance (AMI):

-------------- /etc/init.d/kf-autoscale-frontend -------------
#! /bin/sh
# /etc/init.d/kf-autoscale-frontend
# give execute permissions & run sudo update-rc.d kf-autoscale-frontend defaults
# Install and Configure s3cmd on the frontend AMI  - it's assumed that the s3cmd configuration is at /root/.s3cfg

# Carry out specific functions when asked to by the system
case "$1" in
  start)
    echo "Starting kf-autoscale-frontend"
    rm -rf /var/www/html
    mkdir -p /var/www/html && cd /var/www/html
    s3cmd -c /root/.s3cfg get s3://code.yourwebsite.com/latest.zip /tmp/latest.zip
    unzip /tmp/latest.zip
    chmod a+w -R .
    rm /tmp/latest.zip
    /etc/init.d/apache2 restart
    ;;
  *)
    echo "Usage: /etc/init.d/kf-autoscale-frontend start"
    exit 1
    ;;
esac
exit 0
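
The init script removes /var/www/html before it downloads anything, so a failed download leaves the server with an empty document root. As a hedged alternative, here is a small Python sketch that unpacks into a staging directory first and replaces the document root only after the archive has extracted cleanly. The function name `deploy_archive` is our own invention for this example, and it assumes the zip has already been fetched from S3 to a local path.

```python
import os
import shutil
import tempfile
import zipfile


def deploy_archive(zip_path, doc_root):
    """Unpack `zip_path` into a staging directory, then swap it into `doc_root`.

    The old document root is removed only after the new archive has been
    extracted successfully.
    """
    doc_root = os.path.abspath(doc_root)  # also strips a trailing slash
    parent = os.path.dirname(doc_root)
    os.makedirs(parent, exist_ok=True)
    # Stage next to the target so the final rename stays on one filesystem.
    staging = tempfile.mkdtemp(prefix="deploy-", dir=parent)
    try:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(staging)
    except Exception:
        shutil.rmtree(staging, ignore_errors=True)
        raise
    # mkdtemp creates the directory readable by its owner only; open it up
    # so the web server can traverse it.
    os.chmod(staging, 0o755)
    if os.path.exists(doc_root):
        shutil.rmtree(doc_root)
    os.rename(staging, doc_root)
    return doc_root
```

On a front-end server this would be called as `deploy_archive("/tmp/latest.zip", "/var/www/html")` before restarting Apache. There is still a short gap between removing the old directory and renaming the new one, which is acceptable at boot time because the server is not yet taking traffic.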

Git config script (to make a git checkout act as the point of contact for publishing the latest code to the front end):

-------------- kf_autoscale_git_config_setup.sh ----------------
#!/bin/sh
# Run this script once after cd'ing to the required git checkout
# Once it has been run, cd to the git directory at any time and run "git kf.switch" to push the latest code to the front end
#
# -- prerequisites --
# Install and configure s3cmd on the system that has the git checkout
# The git checkout is expected to contain the kf_autoscale_switch.php script under the utils/ directory (more on this later)
# The system that has the git checkout should also have the AWS autoscaling CLI installed and its bin/ available in the $PATH
#
# config variables
s3_filename="latest.zip"
s3_bucket="code.yourwebsite.com"

git config alias.kf.autoscale.switch "!python -c 'import os; os.system(\"git archive --format zip --output /tmp/$s3_filename HEAD\"); os.system(\"s3cmd put /tmp/$s3_filename --rr -f s3://$s3_bucket/\"); os.system(\"php ./utils/kf_autoscale_switch.php\");'"
git config alias.kf.switch '!git kf.autoscale.switch $@'

Helper Script for the GIT command to SWAP the Active/Passive Autoscaling Groups:

We went with the PHP SDK in combination with the Autoscaling CLI tools, wrapped into a single PHP script which, when run, identifies the currently active group, brings up the currently passive group, switches the DNS to point to the new active group, and makes the previously active group passive. You could choose whichever language SDK suits you (or even use the plain CLI for everything). Make sure to update the last command in the git alias from the previous section to call your new swapping script. Our script is below:

   -----------------  kf_autoscale_switch.php ------------------
<?php

/* This is a stand-alone script; run it as: php kf_autoscale_switch.php */

require_once(dirname(__FILE__).'/phpsdk/sdk.class.php');

// config variables
$min_servers = 1;
$max_servers = 4;

$root_domain = "example.com";
$hosted_zone_id = "{FILL ME}"; // get this from ROUTE 53
$alias_hosted_zone_id = "{FILL ME}"; // get this from ELASTIC LOAD BALANCER interface of console.aws.amazon.com

$current_autoscale_set = 0;
$autoscale_sets = array(
  '1'=>array(
    'load_balancer_name'=>'{FILL ME}',
    'autoscaling_group_name'=>'{FILL ME}',
  ),
  '2'=>array(
    'load_balancer_name'=>'{FILL ME}',
    'autoscaling_group_name'=>'{FILL ME}',
  )
);

$cache = array();

function _find_loadbalancer_uri($set) {
  global $autoscale_sets, $cache;
  if (isset($cache[$set]['loadbalancer_uri'])) return $cache[$set]['loadbalancer_uri'];

  $elb = new AmazonELB();
  $response = $elb->describe_load_balancers(array('LoadBalancerNames'=>$autoscale_sets[$set]['load_balancer_name']));
  if ($response->status == 200 && isset($response->body->DescribeLoadBalancersResult->LoadBalancerDescriptions->member->DNSName)) {
    $uri = (string)$response->body->DescribeLoadBalancersResult->LoadBalancerDescriptions->member->DNSName;
    $cache[$set]['loadbalancer_uri'] = $uri;
    return $uri;
  }
  print "Unable to find Load Balancer URI!" . PHP_EOL;
  return '';
}

function find_active_load_balancer() {
  global $hosted_zone_id, $root_domain, $current_autoscale_set, $autoscale_sets;
  $current_loadbalancer_uri = '';
  $route53 = new AmazonRoute53();
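  // Fetches only the first record set in the zone; this assumes it is the
  // alias A record for the root domain.
  $response = $route53->list_rrset($hosted_zone_id, array(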
    'MaxItems'=>1,
  ));
  if ($response->status == 200 && isset($response->body->ResourceRecordSets->ResourceRecordSet->AliasTarget->DNSName)) {
    $current_loadbalancer_uri = (string)$response->body->ResourceRecordSets->ResourceRecordSet->AliasTarget->DNSName;
  }
  else {
    print "Unable to find Load Balancer URI" . PHP_EOL;
    print_r($response);
  }

  if ($current_loadbalancer_uri) {
    $autoscale_sets_keys = array_keys($autoscale_sets);
    if (($first_set_lb_uri = _find_loadbalancer_uri(reset($autoscale_sets_keys)))) {
      print "--debug: Current LB URI: {$current_loadbalancer_uri}" . PHP_EOL;
      print "--debug: First LB URI: {$first_set_lb_uri}" . PHP_EOL;
      if (strtolower(rtrim($first_set_lb_uri, '.')) == strtolower(rtrim($current_loadbalancer_uri, '.'))) {
        $current_autoscale_set = reset($autoscale_sets_keys);
      }
      else {
        $current_autoscale_set = end($autoscale_sets_keys);
      }
    }
    else {
      print "Unable to continue.. Cannot determine the current Autoscale SET!" . PHP_EOL;
      exit(1);
    }
  }
}

function _check_if_active_instances_exist($set) {
  global $autoscale_sets;
  $command = "as-describe-auto-scaling-groups {$autoscale_sets[$set]['autoscaling_group_name']} | grep INSTANCE | grep InService | wc -l";
  exec($command, $output, $status);
  $active_instances = reset($output);
  if (!$active_instances || $active_instances < 1) {
    print "-- debug: Active Instances: {$active_instances}" . PHP_EOL;
    return false;
  }
  return true;
}

function switch_load_balancer($from_set, $to_set) {
  global $autoscale_sets, $min_servers, $max_servers, $hosted_zone_id, $alias_hosted_zone_id, $root_domain;
  try {
    print "1) Bringing UP the currently inactive GROUP ($to_set)....." . PHP_EOL;
    $command = "as-update-auto-scaling-group {$autoscale_sets[$to_set]['autoscaling_group_name']} --min-size={$min_servers} --max-size={$max_servers}";
    exec($command, $output, $status);
    echo implode(PHP_EOL, $output) . PHP_EOL;
    if ($status) {
      throw new Exception("Unable to bring up the TO auto-scaling GROUP");
    }

    $sleepcount = 0;
    while (!($activeness=_check_if_active_instances_exist($to_set)) && $sleepcount < 25) {
      print "--Waiting for ACTIVE instances to be in-place on the newly started GROUP...." . PHP_EOL;
      sleep(20);
      $sleepcount++;
    }

    if (!$activeness) {
      throw new Exception("No Active servers found in the TO auto-scaling GROUP!");
    }

    print "                         ---------------------                       ". PHP_EOL;

    print "2) Performing DNS Switching to TO auto-scaling GROUP's Load Balancer..." . PHP_EOL;
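    $route53 = new AmazonRoute53();
    // The old alias is deleted and the new one created in two separate
    // requests, so the record is briefly absent between them.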
    $change_batch = array(
      'Comment' => 'Switching the Load Balancers - KF autoscaling',
      'Changes' => array(
        array(
          'Action' => 'DELETE',
          'ResourceRecordSet' => array(
            'Name' => $root_domain,
            'Type' => 'A',
            'AliasTarget' => array(
              'HostedZoneId' => $alias_hosted_zone_id,
              'DNSName' => _find_loadbalancer_uri($from_set),
            )
          )
        ),
      )
    );
    $response = $route53->change_rrset($hosted_zone_id, $change_batch);
    if ($response->status != '200') {
      throw new Exception("Unable to delete OLD Load Balancer config!" . print_r($response, true));
    }
    
    $change_batch = array(
      'Comment' => 'Switching the Load Balancers - KF autoscaling',
      'Changes' => array(
        array(
          'Action' => 'CREATE',
          'ResourceRecordSet' => array(
            'Name' => $root_domain,
            'Type' => 'A',
            'AliasTarget' => array(
              'HostedZoneId' => $alias_hosted_zone_id,
              'DNSName' => _find_loadbalancer_uri($to_set),
            ),
          ),
        ),
      ),
    );
    $response = $route53->change_rrset($hosted_zone_id, $change_batch);
    if ($response->status != '200') {
      throw new Exception("Unable to create NEW Load Balancer config!" . print_r($response, true));
    }
    print "--- Waiting for a few moments to allow DNS settings to take effect...." . PHP_EOL;
    sleep(30);

    print "                         ---------------------                       ". PHP_EOL;

    print "3) Bringing DOWN the OLD GROUP ($from_set)....." . PHP_EOL;
    $command = "as-update-auto-scaling-group {$autoscale_sets[$from_set]['autoscaling_group_name']} --min-size=0 --max-size=0";
    $output = array(); // exec() appends to an existing array, so clear step 1's output first
    exec($command, $output, $status);
    echo implode(PHP_EOL, $output) . PHP_EOL;
    if ($status) {
      throw new Exception("Unable to bring DOWN the FROM auto-scaling GROUP");
    }
  }
  catch (Exception $e) {
    print "-------- ERROR -----------" . PHP_EOL;
    print $e->getMessage() . PHP_EOL;
    exit(1);
  }
}

print "Finding Active Load-Balancer connected to the main domain.." . PHP_EOL;
find_active_load_balancer();
print "Found the Active Load Balancer Set as SET: " . $current_autoscale_set . PHP_EOL;
$autoscale_sets_keys = array_keys($autoscale_sets);
print "Switching Load Balancers... (Bringing up the inactive autoscale set, connecting its LB to domain, Bringing down the active)......" . PHP_EOL;
if ($current_autoscale_set == reset($autoscale_sets_keys)) {
  switch_load_balancer($current_autoscale_set, end($autoscale_sets_keys));
}
else {
  switch_load_balancer($current_autoscale_set, reset($autoscale_sets_keys));
}
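
Two parts of the script are easy to get subtly wrong, so here is a hedged Python sketch of both. The first part mirrors the script's DNS-name comparison (case-insensitive, ignoring a trailing dot) but raises an error instead of silently falling back to the last set. The second part is a swapped-in technique rather than what our script does: it builds a single Route 53 UPSERT change in place of the separate DELETE and CREATE requests. The dictionary follows the ChangeBatch shape accepted by boto3's change_resource_record_sets. We have not verified UPSERT against the PHP SDK used above, and all function names here are our own.

```python
def same_dns_name(a, b):
    """Compare two DNS names, ignoring case and a trailing dot."""
    return a.rstrip(".").lower() == b.rstrip(".").lower()


def find_active_set(current_alias_target, lb_dns_by_set):
    """Return the key of the set whose load balancer the domain points at."""
    for key, lb_dns in lb_dns_by_set.items():
        if same_dns_name(current_alias_target, lb_dns):
            return key
    raise LookupError("domain does not point at a known load balancer")


def alias_upsert_batch(root_domain, alias_hosted_zone_id, lb_dns):
    """Build one change that repoints the alias record in a single request."""
    return {
        "Comment": "Switching the Load Balancers - KF autoscaling",
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": root_domain,
                "Type": "A",
                "AliasTarget": {
                    "HostedZoneId": alias_hosted_zone_id,
                    "DNSName": lb_dns,
                    "EvaluateTargetHealth": False,
                },
            },
        }],
    }
```

With a single UPSERT there is no moment at which the root domain has no A record, which is the gap the two-request approach leaves open.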

With the above approach we're now running high-performance autoscaling with the same flexibility and ease of use as Elastic Beanstalk. We hope it helps someone who needs to implement something similar.

I would like to hear if you know a better way to achieve the same. Thanks for Reading!

Comments

brian (not verified)

Wed, 01/16/2013 - 16:17

i had drawn the same conclusions about beanstalk. its autoscaling let me down a couple times in a row during peak traffic times. i simply cannot trust a service that doesn't autoscale reliably.

i have the same requirements you do. i was bracing myself for the learning curve of how to craft something very similar to what you have here. i didn't expect to find exactly what i was looking for all laid out so nicely!

i like to maintain the active/passive production environments as well, which i also did inside beanstalk. not only does it give zero downtime for deployments, but i can direct my local host file to the passive production environment post-commit to give it one last check before i flip the dns.

there are several aspects of beanstalk's ruby scripts that seem way over-engineered to me. it makes them heavier than they need to be and more fragile at the same time. amazon could learn some things from you guys. using their amis after some of them were released in a broken state was another major blow to my confidence in the product. looking forward to using pure ubuntu again!

this is a much better solution on all fronts. thanks for sharing your work!
