The Server Labs Blog


Persistence Strategies for Amazon EC2

At The Server Labs, we often run into a need that comes naturally with the on-demand nature of Cloud Computing. Namely, we’d like to keep our team’s AWS bill in check by ensuring that we can safely turn off our Amazon EC2 instances when not in use. In fact, we’d like to take this practice one step further and automate the times when an instance should be operational (e.g. only during business hours). Stopping an EC2 instance is easy enough, but how do we ensure that our data and configuration persist across server restarts? Fortunately, there are a number of possible approaches to solve the persistence problem, but each one brings its own pitfalls and tradeoffs. In this article, we analyze some of the major persistence strategies and discuss their strengths and weaknesses.

A Review of EC2

Since Amazon introduced their EBS-backed AMIs in late 2009 [1], there has been a great deal of confusion around how this AMI type impacts the EC2 lifecycle operations, particularly in the area of persistence. In this section, we’ll review the often-misunderstood differences between S3 and EBS-backed EC2 instances, which will be crucial as we prepare for our discussion of persistence strategies.

Not All EC2 Instances are Created Equal

An Amazon EC2 instance can be launched from one of two types of AMIs: the traditional S3-backed AMI and the new EBS-backed AMI [2]. These two AMIs exhibit a number of differences, for example in their lifecycle [3] and data persistence characteristics [4]:

Lifecycle
  • Amazon EBS: supports stopping and restarting of the instance by saving state to EBS.
  • Instance Store (S3): the instance cannot be stopped; it is either running or terminated.

Data persistence
  • Amazon EBS: data persists in EBS on instance failure or restart. Data can also be configured to persist when the instance is terminated, although it does not do so by default.
  • Instance Store (S3): instance storage does not persist on instance shutdown or failure. It is possible to attach non-root EBS volumes for data persistence as needed.

As explained in the table above, EBS-backed EC2 instances introduce a new stopped state, unavailable for S3-backed instances. It is important to note that while an instance is in a stopped state, it will not incur any EC2 running costs. You will, however, continue to be billed for the EBS storage associated with your instance. The other benefit over S3-backed instances is that a stopped instance can be started again while maintaining its internal state. The following diagrams summarize the lifecycles of both S3 and EBS-backed EC2 instances:


Lifecycle of an S3-backed EC2 Instance.


Lifecycle of an EBS-backed EC2 Instance.

Note that while the instance ID of a restarted EBS-backed instance will remain the same, it will be dynamically assigned a new set of public and private IP and DNS addresses. If you would like to assign a static IP address to your instance, you can still do so by using Amazon's Elastic IP service [5].

Persistence Strategies

With an understanding of the differences between S3 and EBS-backed instances, we are well equipped to discuss persistence strategies for each type of instance.

Persistence Strategy 1: EBS-backed Instances

First, we’ll start with the obvious choice: EBS-backed instances. When this type of instance is launched, Amazon automatically creates an Amazon EBS volume from the associated AMI snapshot, which then becomes the root device. Any changes to the local storage are then persisted in this EBS volume, and will survive instance failures and restarts. Note that by default, terminating an EBS-backed instance will also delete the EBS volume associated with it (and all its data), unless explicitly configured not to do so [6].

Persistence Strategy 2: S3-backed Instances

In spite of their ease of use, EBS-backed instances present a couple of drawbacks. First, not all software and architectures are available out-of-the-box as EBS-backed AMIs, so the version of your favorite OS might not be available. Perhaps more importantly, the EBS volume is mounted as the root device, meaning that you will also be billed for storage of static data, such as operating system files, that is external to your application or configuration.

To circumvent these disadvantages, it is possible to use an S3-backed EC2 instance, which gives you direct control over which files to persist. This flexibility, however, comes at a price. Since S3-backed instances use local storage as their root device, you'll have to manually attach and mount an EBS volume for persisting your data. Any data you write directly to your EBS mount is automatically persisted. Configuration files, on the other hand, often live at standard locations outside your EBS mount, and you will still want to persist changes to them. In such situations, you would typically create a symlink on the root device pointing to your EBS mount.

For example, assuming you have mounted your EBS volume under /ebs, you would run the following shell commands to persist your apache2 configuration:

# first backup original configuration
mv /etc/apache2/apache2.conf{,.orig}
# use your persisted configuration from EBS by creating a symlink
ln -s /ebs/etc/apache2/apache2.conf /etc/apache2/apache2.conf

Once your S3-backed instance is terminated, any local instance storage (including symlinks) will be lost, but your original data and configuration will persist in your EBS volume. If you would then like to recover the state persisted in EBS upon launching a new instance, you will have to go through the process of recreating any symlinks and/or copying any pertinent configuration and data from your EBS mount to your local instance storage.
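To illustrate what that recreation step might look like, here is a sketch (our own illustration, not The Server Labs' actual tooling; the paths and file contents are invented, and scratch directories stand in for /ebs and the root filesystem):

```shell
# Recreate symlinks for every file persisted under the EBS mount, at the
# same relative path on the root device. On a real instance EBS_ROOT
# would be /ebs and SYS_ROOT would be /.
EBS_ROOT=$(mktemp -d)
SYS_ROOT=$(mktemp -d)

# simulate a configuration file previously persisted in EBS
mkdir -p "$EBS_ROOT/etc/apache2"
echo "ServerName example" > "$EBS_ROOT/etc/apache2/apache2.conf"

find "$EBS_ROOT/etc" -type f | while IFS= read -r src; do
  rel=${src#"$EBS_ROOT"/}
  dest="$SYS_ROOT/$rel"
  mkdir -p "$(dirname "$dest")"
  [ -e "$dest" ] && mv "$dest" "$dest.orig"   # keep any stock file as a backup
  ln -s "$src" "$dest"
done

# the path on the "root device" now resolves to the persisted copy
cat "$SYS_ROOT/etc/apache2/apache2.conf"
```

Running the loop at boot time makes the instance pick up its persisted configuration before the relevant services start.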

Synchronizing between the data persisted in EBS and that in the local instance storage can become complex and difficult to automate when launching new instances. In order to help with these tasks, there are a number of third-party management platforms that provide different levels of automation. These are covered in more detail in the next section.

Persistence Strategy 3: Third-party Management Platforms

In the early days of AWS, there were few third-party platforms available for managing and monitoring your AWS infrastructure, and their capabilities were limited. Moreover, in order to manage and monitor your instances for you, these platforms necessarily need access to your EC2 instance keys and AWS credentials. Although a reasonable compromise for some, this requirement could pose an unacceptable security risk for others, who must guarantee the security and confidentiality of their data and internal AWS infrastructure.

Given these limitations, The Server Labs developed its own Cloud Management Framework in Ruby for managing EC2 instances, EBS volumes and internal SSH keys in a secure manner. Our framework automates routine tasks such as attaching and mounting EBS volumes when launching instances, as well as providing hooks for the installation and configuration of software and services at startup based on the data persisted in EBS. It even goes one step further by mounting our EBS volumes using an encrypted file system to guarantee the confidentiality of our internal company data.

Today, companies need not necessarily develop their own homegrown frameworks, and can increasingly rely on third-party platforms. An example of a powerful commercial platform for cloud management is RightScale. For several of our projects, we rely on RightScale to automatically attach EBS volumes when launching new EC2 instances. We also make extensive use of scripting to install and configure software onto our instances automatically at boot time using RightScale's RightScript technology [7]. These features make it easy to persist your application data and configuration in EBS, while automating the setup and configuration of new EC2 instances associated with one or more EBS volumes.

Automating Your Instance Uptimes

Now that we have discussed the major persistence strategies for Amazon EC2, we are in a good position to tackle our original use case. How can we schedule an instance in Amazon so that it is only operational during business hours? After all, we’d really like to avoid getting billed for instance uptime during times when it is not really needed.

To solve this problem, we’ll have to address two independent considerations. First, we’ll have to ensure that all of our instance state (including data and configuration) is stored persistently. Second, we’ll have to automate the starting and stopping of our instance, as well as restoring its state from persistent storage at boot time.

Automation Strategy 1: EBS-backed Instances

By using an EBS-backed instance, we ensure that all of its state is automatically persisted even if the instance is restarted (provided it is not terminated). Since the EBS volume is mounted as the root device, no further action is required to restore any data or configuration. Finally, we'll have to automate the starting and stopping of the instance based on our operational times. For scheduling instance uptimes, we can take advantage of the Linux cron service. For example, in order to keep an instance operational during business hours (9am to 5pm, Monday to Friday), we could create the following two cron jobs:

0 9 * * 1-5 /opt/aws/bin/start-instance.sh i-10a64379
0 17 * * 1-5 /opt/aws/bin/stop-instance.sh i-10a64379

The first cron job starts the EBS-backed instance identified by instance ID i-10a64379 at 9am, Monday to Friday. Similarly, the second job stops the same instance at 5pm, Monday to Friday. The cron jobs invoke two helper scripts (we use the names start-instance.sh and stop-instance.sh here) that set up the AWS command-line tools for your particular environment. You could run these cron jobs from another instance in the cloud, or from a machine in your office.

The following snippet provides sample contents for the start script (we call it start-instance.sh), which does the setup necessary to invoke the AWS ec2-start-instances command. Note that this script assumes that your EBS-backed instance was previously launched manually and that you know its instance ID.

#!/bin/sh
# Name: start-instance.sh
# Description: this script starts the EBS-backed instance with the specified
# instance ID by invoking the AWS ec2-start-instances command
# Arguments: the instance ID of the EBS-backed instance that will be started
INSTANCE_ID=$1

export JAVA_HOME=/usr/lib/jvm/java-6-sun
export EC2_HOME=/opt/aws/ec2-api-tools-1.3
export PATH=$PATH:$EC2_HOME/bin
export EC2_PRIVATE_KEY=/opt/aws/keys/private-key.pem
export EC2_CERT=/opt/aws/keys/cert.pem
# uncomment the following line to use Europe as the default Zone
#export EC2_URL=

echo "Starting EBS-backed instance with ID ${INSTANCE_ID}"
ec2-start-instances ${INSTANCE_ID}

Similarly, the companion stop script (stop-instance.sh in our naming) would stop your EBS-backed instance by invoking ec2-stop-instances followed by the instance ID. Note that the instance ID of an EBS-backed instance remains the same across restarts.

Automation Strategy 2: S3-backed Instances

Amazon instances backed by S3 present the additional complexity that the local storage is not persistent and will be lost upon terminating the instance. In order to persist application data and configuration changes independently of the lifecycle of our instance, we’ll have to rely on EBS. Additionally, we’ll have to carefully restore any persisted state upon launching a new EC2 instance.

The Server Labs Cloud Manager allows us to automate these tasks. Among other features, it automatically attaches and mounts a specified EBS volume when launching a new EC2 instance. It also provides hooks to invoke one or more startup scripts directly from EBS. These scripts are specific to the application, and can be used to restore instance state from EBS, including any appropriate application data and configuration.
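The Cloud Manager itself is not public, but the attach-and-mount step it automates can be sketched with the standard EC2 API tools. Everything concrete below (the volume ID, device name, mount point and the startup.sh hook) is a hypothetical example:

```shell
#!/bin/sh
# Hypothetical boot sequence for an S3-backed instance: attach and mount a
# known EBS volume, then hand off to a startup hook persisted on it.
command -v ec2-attach-volume >/dev/null 2>&1 || {
  echo "EC2 API tools not installed; nothing to do"; exit 0; }

VOLUME_ID=vol-12345678   # example value
INSTANCE_ID=i-10a64379   # example value
DEVICE=/dev/sdh
MOUNT_POINT=/ebs

# attach the persistent volume to this instance
ec2-attach-volume $VOLUME_ID -i $INSTANCE_ID -d $DEVICE
# wait until the kernel exposes the attached device
while [ ! -e $DEVICE ]; do sleep 1; done
mkdir -p $MOUNT_POINT
mount $DEVICE $MOUNT_POINT
# run an application-specific startup hook persisted in EBS, if present
if [ -x $MOUNT_POINT/startup.sh ]; then $MOUNT_POINT/startup.sh; fi
```

The application-specific work (recreating symlinks, copying configuration, starting services) then lives entirely in the hook script stored on the volume.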

If you must use S3-backed instances for your solution, you’ll either have to develop your own framework along the lines of The Server Labs Cloud Manager, or rely on third-party management platforms like Rightscale. Otherwise, EBS-backed instances provide the path of least resistance to persisting your instance data and configuration.

Automation Strategy 3: Rightscale

RightScale provides a commercial platform with support for boot-time scripts (via RightScripts) and automatic attachment of EBS volumes. In addition, RightScale allows applications to define arrays of servers that grow and shrink based on a number of parameters. By using the server array schedule feature, you can define how an alert-based array resizes over the course of a week [8], and thus ensure a single running instance of your server during business hours. In addition, leveraging boot-time scripts and the EBS volume management feature enables you to automate the setup and configuration of new instances in the array, while persisting changes to your application data and configuration. Using these features, it is possible to build an automated solution for a server that operates during business hours, and that can be shut down safely when not in use.


This article describes the major approaches to persisting state in Amazon EC2. Persisting state is crucial to building robust and highly-available architectures with the capacity to scale. Not only does it promote operational efficiency by consuming resources only when a need exists; it also protects your application state, so that if your instance fails or is accidentally terminated, you can automatically launch a new one and continue where you left off. In fact, these same ideas can also enable your application to scale seamlessly by automatically provisioning new EC2 instances in response to a growth in demand.


[1] New Amazon EC2 Feature: Boot from Elastic Block Store. Original announcement from Amazon explaining the new EC2 boot from EBS feature.

[2] Amazon Elastic Compute Cloud User Guide: AMI Basics. Covers basic AMI concepts for S3 and EBS AMI types.

[3] The EC2 Instance Life Cycle: excellent blog post describing major lifecycle differences between S3 and EBS-backed EC2 instances.

[4] Amazon Elastic Compute Cloud User Guide: AMIs Backed by Amazon EBS. Learn about EBS-backed AMIs and how they work.

[5] AWS Feature Guide: Amazon EC2 Elastic IP Addresses. An introduction to Elastic IP Addresses for Amazon EC2.

[6] Amazon Elastic Compute Cloud User Guide: Changing the Root Volume to Persist. Learn how to configure your EBS-backed EC2 instance so that the associated EBS volume is not deleted upon termination.

[7] RightScale User Guide: RightScripts. Learn how to write your own RightScripts.

[8] RightScale User Guide: Server Array Schedule. Learn how to create an alert-based array to resize over the course of the week.

Eating our own Dog Food! – The Server Labs moves its Lab to the Cloud!


After all these years dealing with servers, switches, routers and virtualisation technologies, we think it's time to move our lab into the next phase: the Cloud, specifically the Amazon EC2 Cloud.

We are actively working in the Cloud now on different projects, as you've seen in previous blog posts. We believe this step is not only a natural one but also takes us in the right direction towards more effective management of resources and higher business agility. This fits the needs of a company like ours, and we believe it will also fit many others of different sizes and requirements. Cloud computing is not only a niche for special projects with very specific needs; it can be used by ordinary companies to run a more cost-effective IT infrastructure, at least in certain areas.

In our lab we had a mixture of server configurations, comprising Sun and Dell servers running all kinds of OSs, using VMware and Sun virtualisation technology. The purpose of our Lab is to provide an infrastructure for our staff, partners and customers to perform specific tests, prototypes, PoCs, etc. The Lab is also our R&D resource for creating new architecture solutions.

Moving our Lab to the cloud will provide an infrastructure that is more flexible, manageable, powerful, simple and definitely more elastic to set up, use and maintain, without removing any of the features we currently have. It will also allow us to concentrate more on this new paradigm, creating advanced cloud architectures and increasing the overall know-how that can be fed back to customers and the community.

In order to commence this small project, the first thing to do was a small feasibility study to identify the technologies to use inside the cloud, primarily to maintain confidentiality and secure access, but also to properly manage and monitor the infrastructure. Additionally, one of the main drivers of this activity was to reduce our monthly hosting cost, so we needed to calculate, based on current usage, the savings of moving to the cloud.

Cost Study

Looking at the cost of moving to the cloud, we performed an inventory of the required CPU power, server instances, storage (for both Amazon S3 and EBS) and the estimated data I/O. Additionally, we estimated the volume of data transferred between nodes, and between Amazon and the external world.

We initially planned to automatically shut down and bring up those servers that are only needed during working hours, to save more money. In the end, we will be using Amazon Reserved Instances, which give an even lower per-hour price, similar to the effective price we would get by running on-demand servers only part of the time.

Based on this inventory and these estimations, and with the help of the Amazon cost calculator, we reached a final monthly cost that was approximately one third of our hosting bill!

This cost purely considers the physical infrastructure. On top of this we need to add the savings on hardware renewal, pure system administration and system installation. Even using virtualisation technologies, we have sometimes had to rearrange things because our physical infrastructure was limited. All these extra costs mean savings in the cloud.

Feasibility Study

Moving to the cloud gives most IT managers the feeling that they lose control, most importantly control of their data. While the usage of hybrid clouds can keep data under your control, in our case we wanted to move everything to the cloud. We are certainly no different: we are quite paranoid about our data and how it would be stored in Amazon EC2. We also still require secure network communication between our nodes in the Amazon network, and the ability to give secure external access to our staff and customers.

There are a set of open-source technologies that have helped us to materialize these requirements into a solution that we feel comfortable with:

  • Filesystem encryption for securing data storage in Amazon EBS.
  • Private network and IP range for all nodes
  • Cloud-wide encrypted communication between nodes within a private IP network range, via an OpenVPN solution
  • An IPsec VPN solution for permanent external access to the Cloud Lab, for instance connecting a private cloud/network to the public EC2 Cloud
  • Use of RightScale to manage and automate the infrastructure
Overview of TSL Secure Cloud deployment

Implementation and Migration

The implementation of our Cloud Lab solution has gone very smoothly, and it is working perfectly.
One of the beneficial side effects of migrating different systems into the cloud is that it forces you to be much more organised, as the infrastructure is very focused on reuse and on the automatic recreation of the different servers.

We have all our Lab images standardised, taking prebuilt images available in Amazon and customising them to include the security hardening, standard services and conventions we have defined. We can deploy new images in a matter of seconds and include them in our secure VPN-based Cloud Lab network, ready to be used.

Our new Cloud Lab is giving us a very stable, cost-effective, elastic and secure infrastructure, which can be rebuilt in minutes using EBS snapshots.

The Server Labs @ Cloud Computing Expo 09 – Update

The presentation we gave last month at Cloud Computing Expo in Prague is now available online below.

Update 29 Jun 2009: Amazon Web Services

Amazon has just published this blog entry about The Server Labs' Proof of Concept for ESA, Scaling to the Stars.

Full Weblogic Load-Balancing in EC2 with Amazon ELB

This is the latest post in the series on deploying a Weblogic cluster in Amazon EC2. Previous posts have shown how to create and configure a weblogic cluster using either standard Amazon EC2 images or RightScale ServerTemplates and RightScripts.

In the first post in the series, I explained how to deploy a load-balanced two-node Weblogic cluster in Amazon EC2. If you want to deploy a cluster with more than two nodes in it, you need to introduce a proxy server into the mix. This will keep track of which Weblogic sessions are registered with which Weblogic nodes in the cluster (a maximum of two) and will therefore be capable of redirecting each request to a Weblogic node that has a session for the user.

Ideally, you want two proxy servers in case one goes down – we don’t want a single point of failure – and you want some kind of load balancer to direct requests to one of the two proxy server instances. In summary, you want an architecture such as that shown below, with an Amazon Elastic Load Balancer directing requests to one of two Apache instances acting as the proxy servers which, in turn, redirect requests to one of the nodes in the Weblogic cluster – I’ve shown 3 here but there could be many more nodes in the cluster.


Such an architecture is the ideal in terms of load balancing and high-availability and this post will explain how to achieve it in Amazon EC2.

Configuring the 3 Oracle Weblogic Cluster Nodes

Follow the steps outlined in the following sections of the first post in the series to create the Oracle cluster nodes. The only variation is that you should create 3 nodes instead of 2, naming them Server-0, Server-1 and Server-2:

  • Launching the instances
  • Creating the Weblogic domain
  • Starting the Weblogic Administration Server and configuring the cluster
  • Copying the managed server configuration
  • Checking that the servers work OK

Make sure you configure the cluster to work with the Weblogic proxy. To do this, in the Weblogic Administration console, click on Clusters and then on cluster-0. In the Configuration > General tab, expand the "Advanced" section and tick the "WebLogic Plug-In Enabled" option.

At this stage, you should have 3 EC2 machines running, which we will refer to as [machine-0], [machine-1] and [machine-2]. Each machine should be running a Weblogic managed server, and the Weblogic Administration Server should also be running on machine-0.

Configuring the Apache Servers

We will use two instances of the Apache Web Server as proxy servers for the weblogic cluster members. In EC2, start two machines based on the Alestic Ubuntu Server 9.04 32-bit (ami-bf5eb9d6) image. We will refer to these machines as [machine-3] and [machine-4].

SSH into machine-3 and run the following commands to install Apache 2.2:

apt-get update
apt-get install apache2

Download the Oracle Weblogic Apache plugin bundle to your local machine. To find it, go to this page and download the item marked "Apache Plug-ins zip". Open the ZIP file and extract just the Apache 2.2 Linux module, mod_wl_22.so, which you can find under /linux/i686/ in the archive. Upload this file from your local machine to machine-3 using SCP. How you do this differs between Windows and Linux users. Below is the Linux command, in which you will need to substitute [machine-3] with the public DNS address of machine-3:

scp mod_wl_22.so root@[machine-3]:/usr/lib/apache2/modules/

Now that we have uploaded the Weblogic proxy plugin, we need to configure it. On machine-3, create a new file called /etc/apache2/mods-enabled/weblogic.load and add the following content:

    LoadModule weblogic_module /usr/lib/apache2/modules/mod_wl_22.so

On machine-3, create a file called /etc/apache2/mods-enabled/weblogic.conf and add the following content, substituting [machine-*] with the relevant public DNS address:

<IfModule mod_weblogic.c>
      WebLogicCluster [machine-0]:7002,[machine-1]:7002,[machine-2]:7002
      MatchExpression /clustered-webapp/*
      Debug ON
      DebugConfigInfo ON
      WLLogFile /tmp/weblogic.log
</IfModule>

This is the core Apache configuration for the Weblogic proxy plugin. The WebLogicCluster and MatchExpression directives are the most important: they tell the plugin which nodes are in the cluster and which URL patterns it should match against to proxy requests to the cluster. Note that (I believe) you don't have to specify all the nodes in the cluster, since the proxy should be capable of finding them all by asking just one member, although here I've specified all 3.

Now reload Apache so it picks up the changes:

/etc/init.d/apache2 reload

Go to http://[machine-3]/clustered-webapp and check that you get a screen saying something like “Hello World! This machine’s IP Address is: ip-10-250-10-63/”. This shows that the Proxy running in the Apache server has connected to the cluster and is proxying requests to it. Given that we specified the “DebugConfigInfo” option in the configuration earlier, we can easily get debug information by accessing the following URL: http://[machine-3]/clustered-webapp/?__WebLogicBridgeConfig. The screen you should see is something like that shown below:

Weblogic proxy debug screen

Weblogic proxy debug screen

This screen is very useful for debugging problems with the proxy configuration but, as you can probably guess, it's not a good idea to make this information publicly available in a production system. The output above indicates that the proxy has correctly identified the cluster and the 3 servers available within it.
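One hedged way to lock this down for production (not shown in the original configuration) is to keep the same weblogic.conf but turn the two debug directives off, after which the __WebLogicBridgeConfig URL no longer returns the cluster details:

```
      Debug OFF
      DebugConfigInfo OFF
```

You can also remove the WLLogFile directive once you no longer need the plugin's log output.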

Now that one Apache server is working OK, configure machine-4 in exactly the same way, so that there are two Apache servers (machine-3 and machine-4) acting as proxies to the 3 Weblogic servers in the cluster.

Configuring the Elastic Load Balancer

To complete the architecture proposed in the introduction, we have to introduce an Amazon Elastic Load Balancer, which will balance requests between the two Apache web servers. If one goes down, the load balancer should redirect all requests to the remaining server, meaning no drop in availability for end users.

If you have not already installed the Amazon load balancer command line tools, follow the steps below:

  1. Configure your machine to use the Amazon command line tools, if you have not already done so.
  2. Download and unzip the Amazon Elastic Load Balancing API Tools.
  3. Set an environment variable $AWS_ELB_HOME to point to where you unzipped the tools.
  4. Add $AWS_ELB_HOME/bin to your path.

Run the following on your local machine to create the load balancer. My instances are in us-east-1b (you can find this out from the “zone” attribute when you click on a running instance in the AWS Management console). Note the DNS name that you are given when the command finishes – let’s call this [elb-dns]. This command configures the load balancer to listen on port 80 (standard HTTP port) and redirect all traffic to port 80 on each of the two Apache web servers which we will register with the load balancer.

local:-$ elb-create-lb Test --availability-zones us-east-1b --listener "protocol=http,lb-port=80,instance-port=80"

Register your instances with the Load Balancer. Use the AWS Management console to discover your instance IDs.

local:-$ elb-register-instances-with-lb Test --instances [instance-id-machine-3] [instance-id-machine-4]

Configure the health check. This specifies that the load balancer should access the URL /clustered-webapp/ on port 80 on each of the Apache Web Server instances registered with the load balancer every 5 seconds. It should wait up to 3 seconds for a response from each instance and if it gets 2 response failures, it won’t send any requests to that instance. When it gets 2 successful responses from an instance, it starts sending requests to it again.

local:-$ elb-configure-healthcheck Test --target "HTTP:80/clustered-webapp/" --interval 5 --timeout 3 --unhealthy-threshold 2 --healthy-threshold 2
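As a rough back-of-envelope sketch (our own arithmetic, not an official AWS formula), the worst-case time for a dead Apache instance to be taken out of rotation is bounded by the probe interval plus the response timeout, multiplied by the unhealthy threshold:

```shell
# worst-case detection time for the health-check settings used above
interval=5              # seconds between probes
timeout=3               # seconds to wait for each response
unhealthy_threshold=2   # consecutive failures before OutOfService
worst_case=$(( (interval + timeout) * unhealthy_threshold ))
echo "worst case: ${worst_case} seconds"
```

With the settings above, a failed Apache server stops receiving requests within roughly 16 seconds; tune the interval and thresholds to trade detection speed against probe traffic.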

Now, when you go to http://[elb-dns]/ you should get the same page as when you accessed each Apache web server separately. The IP address shown should change, depending on which Weblogic node ends up serving the request.

Proving High availability

This deployment architecture should enable the system to cope with the failure of an Apache proxy server and at least one Weblogic server with no apparent consequences for the end user. Let’s prove it!

Go to the URL http://[elb-dns]/clustered-webapp/sessionCounter and you should see an HTML page that says how many times you have accessed the page. Click refresh a few times so that the number is greater than 1, and keep this page open in your web browser.

On your local machine, run the elb-describe-instance-health command to show the health of the two instances that the load balancer is balancing requests against. The output should show both Apache servers “InService”:

$ elb-describe-instance-health Test
INSTANCE-ID  i-4320782a  InService
INSTANCE-ID  i-21346c48  InService

Now, in the SSH console for machine-3, shut down Apache by executing the following:

/etc/init.d/apache2 stop

Go back to the Session Counter web page in your browser and click refresh. The number of times you have accessed the page should not be reset to 1 – it should keep rising each time you click refresh. This proves that the failure of one Apache server does not affect the end user. To prove that the apache service is down, on your local machine, run the elb-describe-instance-health command again, which should give you results similar to these:

$ elb-describe-instance-health Test
INSTANCE-ID  i-4320782a  OutOfService
INSTANCE-ID  i-21346c48  InService

Leave the Apache service shut down and terminate the Weblogic server running on machine-1 by pressing ctrl-c in its console window. Return to the Session Counter web page displayed in your browser. Click refresh a few times to verify that the session counter is maintained, proving that there is no effect on an end user when a Weblogic node fails.

How does all of this work?

There is more detailed information in the Weblogic documentation, but basically, when you connect to a Weblogic cluster for the first time, the node that serves your request creates a session ID cookie that contains the session ID and the IDs of two nodes in the cluster, in the form [sessionID]![PrimaryNodeID]![SecondaryNodeID], e.g. 2VyWK6hdcJGjDHp3t11GKL8Pv2l6kK0V2p113Gc0Sp12Y2H0vcG8!1074248988!1767416532. The primary and secondary nodes will both contain a copy of your session.
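To make the cookie structure concrete, here is a small shell sketch (ours, not part of the plugin) that splits the example cookie above into its three fields, as the proxy does conceptually:

```shell
# split a Weblogic session cookie of the form sessionID!primary!secondary
cookie='2VyWK6hdcJGjDHp3t11GKL8Pv2l6kK0V2p113Gc0Sp12Y2H0vcG8!1074248988!1767416532'
session_id=${cookie%%!*}    # text before the first '!'
rest=${cookie#*!}           # everything after the first '!'
primary=${rest%%!*}         # between the first and second '!'
secondary=${rest#*!}        # after the second '!'
echo "primary=$primary secondary=$secondary"
```

For the example cookie this prints `primary=1074248988 secondary=1767416532`, which are the node IDs the proxy will try, in order.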

When the Weblogic proxy running in Apache receives a request, it looks at the session ID cookie to work out which nodes in the cluster are the primary and secondary nodes containing the user's session. It will then try to redirect the request to the primary node, only resorting to the secondary if the primary does not respond in a reasonable time. If both nodes fail at the same time, which should be an extremely rare occurrence, then the user's session is lost. Weblogic always tries to ensure that each session is stored on two nodes in the cluster, so if a node fails, Weblogic will move the sessions that were stored on that machine to other nodes.

Using RightScripts to create a Weblogic cluster in Amazon EC2

In my previous post, I described how to set up a Weblogic cluster in Amazon EC2 using the Oracle-supplied Amazon AMI image. In this post, I will describe how to create a cluster using RightScripts, an alternative technology offered by RightScale.

In Amazon EC2, you work on an AMI (installing and configuring software) until you are happy with it. Then you 'freeze' it, storing it in S3 so that you can create many different instances based on this AMI. Amazon give you the possibility to pass configuration data to each new instance using "user-supplied data", which allows you to differentiate one launched instance from another.

RightScale offer you an alternative. Instead of doing all the installation and configuration work on an AMI and then freezing it, you capture all the installation and configuration work in scripts – RightScripts. Each time you start up an instance in RightScale, you decide which RightScripts to execute against a base operating system to construct the complete machine that you wish to deploy.

For example, if you want to deploy an Apache Web Server, you write a RightScript that downloads, installs and configures Apache. You start up a new instance (with a base operating system e.g. CentOS) and run the RightScript. Once it has finished, you have an Apache server up and running. You can even associate one or more RightScripts with a base AMI to make a RightScale Server Template.

The benefits over the use of highly personalised AMIs are:

  • It is easier to change the configuration of your machines in the future – you can just execute another RightScript ‘on the fly’
  • You are not tied to one particular cloud vendor. RightScale allow you to execute RightScripts on machines in non-Amazon clouds

This article will show you how to create a Weblogic cluster using RightScale Server Templates made up of various RightScripts. You will need some familiarity with Amazon EC2 and S3 to get the full benefits from this article and I also assume that you’ve read my previous post on this theme.

You will need to sign up for a RightScale account to follow the steps in this post.


We will create two server templates: one for the primary cluster node, which runs the admin server and a managed server, and one that contains just a managed server. We will create these templates from RightScripts that completely automate the installation and configuration process and require only a few variables to be set. This means that we can deploy many managed server instances with just a few clicks.

Creating the RightScripts

Log into the RightScale service and click on the Design > RightScripts link in the navigation bar on the left. Click on the ‘new’ button and add each of the scripts described below, naming each script after the header of its section:

Install Weblogic

This script installs Weblogic on top of the base operating system. It relies on a tar-gzipped copy of the Weblogic installation being stored in an S3 bucket (folder) to which you have access. To create this copy (a one-off procedure), run an instance based on AMI “ami-6a917603” and execute the following commands, replacing “my-new-bucket-name” with an Amazon S3 bucket name that does not already exist, e.g. [my-company-name]-[weblogic]:

rpm -ivh s3cmd-0.9.9-1.el5.noarch.rpm
tar -czvf /tmp/oracle-weblogic-103.tgz /opt/oracle
s3cmd mb s3://my-new-bucket-name
s3cmd put /tmp/oracle-weblogic-103.tgz s3://my-new-bucket-name/oracle-weblogic-103.tgz

(For more information on s3cmd, see here).

You can check that the weblogic zip was uploaded correctly using a tool such as S3Fox.

Here is the RightScript. Replace “my-new-bucket-name” with the name of the bucket you created in the step above.


# NOTE: relies on AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables having been set.

# get WL zip out of S3 using the s3cmd tool that's installed by default
s3cmd get my-new-bucket-name:oracle-weblogic-103.tgz /tmp/wl.tgz

# untar WL to install it
tar -zxvf /tmp/wl.tgz -C /

# report success
exit 0
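In production you may want the extract step to fail loudly if the S3 download produced nothing. Here is a hedged sketch of a more defensive variant; the `safe_extract` helper and the locally-created demo archive are my own additions, standing in for the real S3 download:

```shell
#!/bin/bash -e
# safe_extract: refuse to extract if the archive is missing/empty, and
# validate it as a gzip tarball before touching the filesystem.
safe_extract() {
    local archive="$1" dest="$2"
    if [ ! -s "$archive" ]; then
        echo "ERROR: $archive missing or empty - did the S3 download fail?" >&2
        return 1
    fi
    tar -tzf "$archive" > /dev/null   # validate archive contents first
    tar -zxf "$archive" -C "$dest"
}

# demonstration with a locally created archive (stands in for the S3 download)
workdir=$(mktemp -d)
mkdir -p "$workdir/opt/oracle"
echo "weblogic" > "$workdir/opt/oracle/marker.txt"
tar -czf "$workdir/wl.tgz" -C "$workdir" opt

destdir=$(mktemp -d)
safe_extract "$workdir/wl.tgz" "$destdir"
cat "$destdir/opt/oracle/marker.txt"
```

Because the RightScript runs at boot, failing fast here gives you a clear error in the RightScale audit entries instead of a half-installed server.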

Install the Domain

When I say “install”, I really mean “extract”! Download the test_domain Domain from here and use S3Fox or another equivalent tool to copy it over to the “my-new-bucket-name” folder in S3 that you created earlier. The RightScript below retrieves this domain from S3, extracts it and places it in the /mnt/domains folder. You must substitute “my-new-bucket-name” for your S3 bucket name. This domain is a basic domain that has no servers or clusters configured – these will be configured later in RightScripts.


# NOTE: relies on $AWS_ACCESS_KEY_ID and $AWS_SECRET_ACCESS_KEY environment variables having been set.

# get domain zip out of S3 using the s3cmd tool that's installed by default
s3cmd get my-new-bucket-name:test_domain.tgz /tmp/test_domain.tgz

# untar domain to install it
tar -zxvf /tmp/test_domain.tgz -C /

# report success
exit 0

Start Weblogic Admin Server

This script simply runs the admin server and redirects output to a log file. This should probably be replaced with an init.d script in a production setup.


mkdir /mnt/logs/

# start up weblogic
nohup /mnt/domains/test_domain/ > /mnt/logs/weblogicAdmin.log 2>&1 &
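Since the text suggests replacing the bare nohup with an init.d script in production, here is a minimal sketch of what such a wrapper might look like. DAEMON_CMD, PID_FILE and LOG_FILE are illustrative stand-ins; in real use DAEMON_CMD would be the domain’s start script:

```shell
#!/bin/bash
# Minimal init.d-style wrapper sketch. DAEMON_CMD defaults to a harmless
# placeholder here; in production it would be the Weblogic start script.
DAEMON_CMD=${DAEMON_CMD:-"sleep 60"}
PID_FILE=${PID_FILE:-/tmp/weblogic-admin.pid}
LOG_FILE=${LOG_FILE:-/tmp/weblogic-admin.log}

start() {
    # launch the daemon detached, record its pid for later control
    nohup $DAEMON_CMD > "$LOG_FILE" 2>&1 &
    echo $! > "$PID_FILE"
    echo "started (pid $(cat "$PID_FILE"))"
}

stop() {
    if [ -f "$PID_FILE" ]; then
        kill "$(cat "$PID_FILE")" 2>/dev/null
        rm -f "$PID_FILE"
        echo "stopped"
    fi
}

case "${1:-}" in
    start) start ;;
    stop)  stop ;;
    *)     echo "usage: $0 {start|stop}" ;;
esac
```

The pid file is what lets a later RightScript (or an operator) stop or restart the server cleanly rather than hunting for the process.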

Create Cluster

This script contacts the Administration server on a defined hostname and port and creates a Weblogic cluster using the Weblogic Scripting Tool (WLST).

#!/bin/bash -e

PY_FILE="/tmp/create-cluster_"`date "+%s"`.py

cat > $PY_FILE <<EOF

EOF

/opt/oracle/weblogic/common/bin/ $PY_FILE

rm -rf $PY_FILE
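The script follows a generate-then-run pattern: write a WLST (Jython) script to a temp file with a heredoc, then hand it to the WLST launcher. The WLST commands, cluster name and admin URL below are illustrative assumptions rather than the exact values from the original script, and the launcher invocation is shown commented out:

```shell
#!/bin/bash -e
# Sketch: generate a WLST script that creates a cluster, then (in real use)
# run it with the WLST launcher. Cluster name and URL are illustrative.
PY_FILE="/tmp/create-cluster_$(date +%s).py"

cat > "$PY_FILE" <<'EOF'
connect('weblogic', 'weblogic', 't3://localhost:7001')
edit()
startEdit()
cd('/')
create('MyCluster', 'Cluster')
save()
activate()
disconnect()
EOF

# in the real RightScript, this file would be passed to the WLST launcher
# under /opt/oracle/weblogic/common/bin/

PY_CONTENT=$(cat "$PY_FILE")
rm -f "$PY_FILE"
```

Generating the script at run time is what lets the same RightScript be reused with different inputs on every boot.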

Add Server To Domain

This script uses WLST to connect to the admin server and create a new server in the domain. The server’s name will be the machine’s EC2 public DNS name.

#!/bin/bash -e

SERVER_NAME=$(eval "curl")
PY_FILE="/tmp/create-server_"`date "+%s"`.py

cat > $PY_FILE <<EOF

EOF

/opt/oracle/weblogic/common/bin/ $PY_FILE

rm -rf $PY_FILE

Start Managed Weblogic

This script starts up a Weblogic managed server instance, which connects to the admin server and downloads all the domain info it requires. It relies on being able to download a security file, SerializedSystemIni.dat, from your “my-new-bucket-name” bucket in S3. Download this file here and upload it to S3 using S3Fox or an equivalent tool. Change “my-new-bucket-name” in the script below to your bucket name. In a production setup, this script should probably be replaced with an init.d script.



SERVER_NAME=$(eval "curl")

# create the server domain directory structure
mkdir -p $DOMAIN_HOME
mkdir -p /mnt/logs

# create the boot.properties file
echo "username=$SERVER_USERNAME" >> $BOOT_FILE
echo "password=$SERVER_PASSWORD" >> $BOOT_FILE

echo "using admin URL: http://$ADMIN_SERVER_DNS_NAME:$ADMIN_SERVER_PORT"

mkdir -p $START_ROOT/security
s3cmd get my-new-bucket-name:SerializedSystemIni.dat $START_ROOT/security/SerializedSystemIni.dat

# start up weblogic
nohup /opt/oracle/weblogic/common/bin/ $SERVER_NAME http://$ADMIN_SERVER_DNS_NAME:$ADMIN_SERVER_PORT > /mnt/logs/weblogicManaged-$SERVER_NAME.log 2>&1 &
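The file the script writes is a Weblogic boot.properties file, which lets the managed server authenticate to the admin server without prompting for credentials. The sketch below fills in the script’s input variables (DOMAIN_HOME, SERVER_NAME, the weblogic/weblogic credentials) with illustrative stand-in values:

```shell
#!/bin/bash -e
# Sketch of the boot.properties step with stand-in values for the
# RightScript's input variables.
DOMAIN_HOME=${DOMAIN_HOME:-/tmp/domains/test_domain}
SERVER_NAME=${SERVER_NAME:-demo-managed-1}

# Weblogic looks for boot.properties in the server's security directory
BOOT_FILE="$DOMAIN_HOME/servers/$SERVER_NAME/security/boot.properties"
mkdir -p "$(dirname "$BOOT_FILE")"

echo "username=weblogic" > "$BOOT_FILE"
echo "password=weblogic" >> "$BOOT_FILE"

cat "$BOOT_FILE"
```

Note that Weblogic encrypts the credentials in this file the first time the server boots, so the plain-text values only exist briefly.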

Create the ServerTemplates

Now that we have the basic RightScripts to construct Weblogic instances, we can create the ServerTemplates that logically group the scripts together.

Click on the Design > ServerTemplates link in the navigation bar in RightScale. Click on ‘new’ to create a new Server Template and call the template “Weblogic Admin”. Choose “EC2 US” and “m1.small” for the cloud and instance type attributes respectively. Use the browser tool to select the image “RightImage CentOS5_2V4_1_10”. Leave the rest of the attributes as their defaults and click ‘save’.

Click on the ‘scripts’ tab in the newly-created server template and add the RightScripts that you created earlier as boot scripts, in the following order: Install Weblogic, Install the Domain, Start Weblogic Admin Server, Create Cluster, Add Server To Domain, Start Managed Weblogic. As you can see, this should install a Weblogic server, create the domain and start the admin server. It should also create a cluster and a server in the domain and then finally start up that server.


Create another server template called “Weblogic Managed” with the same cloud, instance type and image attributes as “Weblogic Admin”. Add the following scripts as boot scripts in this order to the template: Install Weblogic, Add Server To Domain, Start Managed Weblogic.


Start the instances

In the RightScale navigation bar, go to Manage > Deployments and click on ‘default’. Click on “Add EC2 US server”. For the Server Template, select Private > Weblogic Admin. Enter “Weblogic Admin 1” as the nickname and choose the SSH key that you want to use. Use a security group that has ports 7001 and 7002 open. Leave the rest of the attributes as defaults and click “Add”.

Repeat these steps to create “Weblogic Managed 1” which uses the Weblogic Managed template. You should now have two servers configured in the “Default” deployment and it should look something like the screenshot below (note that my servers are in the EU):

Servers for "Default" deployment

Launching the servers

Click on the launch icon next to the “Weblogic Admin – 1” server and you should see a screen prompting you for some parameters. These are all the input parameters required by the RightScripts that will run at boot time. Enter the information as shown in the screenshot below and click on the ‘Launch’ button:


Wait until RightScale shows the status of the server as “operational” – this can take a while (8 minutes or so), so be patient! Once it has started, click on the server name and then on the ‘audit entries’ tab. The latest audit entry should have the status “operational”. Click on it and you should see a list of the scripts that ran at startup and their outcomes:

Audit entries

On the ‘info’ tab of the server, you should see the public DNS name of the server. Go to the URL http://[public-dns-name]:7001/console and log in with the username/password weblogic/weblogic.

If you click on the Environment > Servers link, you should see an admin server and a managed server configured in a cluster. The managed server’s name should correspond to the public DNS name of the machine:

Weblogic console servers

Now, in the RightScale console, start up the other server – “Weblogic Managed – 1” using the following inputs (substituting “Weblogic Admin – 1” for “Weblogic EU Admin – 1”) :

Inputs for Weblogic Managed

Once the managed server has started up, hit refresh in the Weblogic console. You should see the second managed server appear and the screen should look something like mine below:

Weblogic console showing all connected servers


Using the RightScripts provided in this post, you can easily deploy a Weblogic cluster using just two server templates. Adding new nodes is simple – just create another server based on the “Weblogic Managed” template and when the node starts up, provide it with the parameters it needs to connect to the admin server. The RightScripts take care of the rest for you.