SCA Async/Conversational services Part 1: Internals of Tuscany SCA

Some time ago I wrote about developing applications with SCA on a JBI-based infrastructure, using simple SCA services.

I’m coming back with two separate SCA blog posts discussing the use of more complex services, namely asynchronous and conversational services:

  • In this post, I provide an example client-server project that implements a conversational and asynchronous service in Tuscany SCA, digging into the internals of the implementation that handles these services over the Web Service and JMS bindings, information that I found quite difficult to come by.
  • The second post looks at the situation where using a Tuscany SCA runtime on the client side is not possible, but we still need a way to make use of these types of services from a more standard web service client. For the specific case of the web service binding, and knowing the internals of Tuscany, I will use a standard JAX-WS client to consume a Tuscany SCA conversational and asynchronous service.

Both posts use a common example project: a Counter service that notifies the client each time the count is updated (every second) via an asynchronous callback. The service is conversational and allows clients to stop the count and restart it through separate, independent service calls. Multiple clients can also run in parallel, each with its own counter service instance. Version 1.6 of Tuscany SCA is used in this sample.

The source code of the sample counter service for this post can be found here.

SCA asynchronous and conversational service definition

Let’s refresh how to define a service in Tuscany SCA that is both conversational and defines a callback interface for asynchronous calls to the client. Good references for this are the SCA specifications, the Tuscany SCA documentation and the sample projects included in the distribution. However, while I could find several callback projects among the Tuscany SCA samples, none of them exercise conversations.

Below you can find the Java interface definition of the SCA service, CounterService (counter-callback-ws-service Maven project).

I use the @Callback and @Conversational annotations to define our service. The @Callback annotation also requires the callback interface class. Therefore, our counter service has two interfaces: one for the service, whose implementation lives on the server side, and one for the callback, whose implementation lives on the client side.

/**
 * The remote service that will be invoked by the client
 */
@Remotable
@Callback(CounterServiceCallback.class)
@Conversational
public interface CounterService {
 
    @OneWay
    void startCounting(int maxcount);
 
    @OneWay
    void stopCounting();
 
    @OneWay
    void restartCounting();    
 
    @OneWay
    void shutdownCounting();     
}

The Callback interface is as simple as:

/**
 * The callback interface for {@link CounterService}.
 */
@Remotable
public interface CounterServiceCallback {
 
    void receiveCount(int count, int end);
}

The server side implementation of the service looks like:

/**
 * This class implements CounterService and uses a callback.
 */
@Service(CounterService.class)
@Scope("CONVERSATION")
public class CounterServiceImpl implements CounterService {
	@ConversationID
	protected String conversationID;

	private CounterServiceCallback counterServiceCallback;
 
	/**
	 * The setter used by the runtime to set the callback reference
	 * 
	 * @param counterServiceCallback
	 */
	@Callback
	public void setMyServiceCallback(CounterServiceCallback counterServiceCallback) {
		System.out.println("setCounterServiceCallback on thread "
				+ Thread.currentThread());
		this.counterServiceCallback = counterServiceCallback;
	}
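
The rest of the implementation is omitted from the listing above. As a rough sketch, under the assumption of a simple background counting thread (the field names count and counting are illustrative, not taken from the sample sources), startCounting can look like this:

	// assumed fields (illustrative names): int count; volatile boolean counting;
	public void startCounting(final int maxcount) {
		System.out.println("ConversationID = " + conversationID + ". Starting count");
		counting = true;
		new Thread(new Runnable() {
			public void run() {
				try {
					while (count < maxcount) {
						if (counting) {
							// push the current count to the client via the injected callback
							counterServiceCallback.receiveCount(count++, maxcount);
						}
						System.out.println("Sleeping ...");
						Thread.sleep(1000);
					}
				} catch (InterruptedException e) {
					System.out.println("Thread interrupted.");
				}
			}
		}).start();
	}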

In the implementation, the @Service SCA annotation identifies the SCA service being implemented, and @Scope specifies the visibility and lifecycle contract the implementation has with the SCA runtime. Possible values of @Scope are STATELESS, REQUEST, CONVERSATION and COMPOSITE. For this specific service we need CONVERSATION, instructing Tuscany that conversation state must be kept in order to correlate interactions between a client and the service. For a description of the remaining scopes, refer to the SCA Java Annotations and APIs specification.

Lastly, we use the @Callback annotation to tell Tuscany SCA where to inject the callback reference to be used by the service.

The server-side SCA composite is very similar to the official Tuscany SCA callback-ws-client one, and you can find it under “src/main/resources/counterws.composite”. The binding used for the sample is a web service binding.

Tuscany SCA Counter Service Client

On the client side (counter-callback-ws-client-SCA Maven project) we need to define a client interface that makes use of the counter service. To keep it simple, the interface has a method that returns the reference to the counter service and an additional helper method to get the current count recorded in the client from the service callbacks. As on the server side, we have to make sure to define the interface as @Conversational and the implementation with @Scope(“CONVERSATION”) so that Tuscany populates the service calls with conversation information.

Also, to simplify the sample code, the interfaces shared between client and server (i.e. CounterService and CounterServiceCallback) are included in both projects separately. Ideally, these interfaces would live in a separate interface library used by both projects.

/**
 * The client component interface.
 */
@Conversational
public interface CounterServiceClient {
 
    public CounterService getCounterService();
 
    public int getCount();
 
}
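
The implementation class wires everything together: it implements both the client interface and the callback interface, so Tuscany can deliver the server’s callbacks to the same component. The sketch below is a plausible reconstruction, not the verbatim sample class (field names and method bodies are illustrative):

import org.osoa.sca.annotations.Reference;
import org.osoa.sca.annotations.Scope;
import org.osoa.sca.annotations.Service;

@Service(interfaces = { CounterServiceClient.class, CounterServiceCallback.class })
@Scope("CONVERSATION")
public class CounterServiceClientImpl implements CounterServiceClient, CounterServiceCallback {

    @Reference
    protected CounterService counterService;

    private int count;

    public CounterService getCounterService() {
        return counterService;
    }

    public int getCount() {
        return count;
    }

    // invoked by the server through the callback binding on every count update
    public void receiveCount(int count, int end) {
        System.out.println("CounterServiceCallback --- Received Count: " + count + " out of " + end);
        this.count = count;
    }
}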

The client composite is listed below. It defines the reference to the counter service and the associated callback. The URL defined in the callback element is used by Tuscany SCA on the client side to start the web service that listens for callbacks:

<composite xmlns="http://www.osoa.org/xmlns/sca/1.0"
	targetNamespace="http://counterws"
	name="counterws-client">
 
    <component name="CounterServiceClient">
        <implementation.java class="com.tsl.sca.tuscany.demo.client.SCA.CounterServiceClientImpl" />
        <reference name="counterService">
	        <interface.java interface="com.tsl.sca.tuscany.demo.server.CounterService"
	            callbackInterface="com.tsl.sca.tuscany.demo.server.CounterServiceCallback" />
	        <binding.ws uri="http://localhost:8086/CounterServiceComponent"/>
	        <callback>
	            <binding.ws uri="http://localhost:1999/callback"/>
            </callback>
        </reference>
    </component>
</composite>

For testing, a unit test (WebServiceSCAClientTest.java) is set up, exercising the counter service, callbacks and conversations; its flow is roughly as sketched below.
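
In outline, and assuming the standard Tuscany 1.x node API (org.apache.tuscany.sca.node) and the composite file name used in the client project, the test does something like this:

// illustrative reconstruction of the test flow, not the verbatim test code
SCANode node = SCANodeFactory.newInstance()
        .createSCANodeFromClassLoader("counterws-client.composite", null);
node.start();

CounterServiceClient client =
        ((SCAClient) node).getService(CounterServiceClient.class, "CounterServiceClient");

System.out.println("Starting Count and waiting 5 seconds for counts...");
client.getCounterService().startCounting(30);
Thread.sleep(5000);

System.out.println("Stopping Count and waiting 5 seconds for no counts...");
client.getCounterService().stopCounting();
Thread.sleep(5000);

System.out.println("Restarting Count and waiting 5 seconds for counts...");
client.getCounterService().restartCounting();
Thread.sleep(5000);

node.stop();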

Running the counter service sample project

To run the counter service sample, extract the source code and:

  • Run the server side, executing the run-server.sh script in the counter-callback-ws-service directory. The server component should start and wait for client requests.
    ...
    Nov 8, 2010 4:11:23 PM org.apache.coyote.http11.Http11Protocol start
    INFO: Starting Coyote HTTP/1.1 on http-8086
    Callback server started (press enter to shutdown)
    Nov 8, 2010 4:11:23 PM org.apache.tuscany.sca.http.tomcat.TomcatServer addServletMapping
    INFO: Added Servlet mapping: http://localhost:8086/CounterServiceComponent
  • Go to the client project, counter-callback-ws-client-SCA and execute the tests with:
    mvn clean test
  • On the client side you should see how the client starts the counting service, receives the callbacks, and how the counting is stopped and later restarted, continuing the count where it left off.
    ...
    Nov 8, 2010 4:11:28 PM org.apache.tuscany.sca.http.tomcat.TomcatServer addServletMapping
    INFO: Added Servlet mapping: http://localhost:1999/callback
    Starting Count and waiting 5 seconds for counts...
    CounterServiceCallback --- Received Count: 0 out of 30
    CounterServiceCallback --- Received Count: 1 out of 30
    CounterServiceCallback --- Received Count: 2 out of 30
    CounterServiceCallback --- Received Count: 3 out of 30
    CounterServiceCallback --- Received Count: 4 out of 30
    Stopping Count and waiting 5 seconds for no counts...
    Restarting Count and waiting 5 seconds for counts...
    CounterServiceCallback --- Received Count: 5 out of 30
    CounterServiceCallback --- Received Count: 6 out of 30
    CounterServiceCallback --- Received Count: 7 out of 30
    CounterServiceCallback --- Received Count: 8 out of 30
    Stopping the Client Node
    Nov 8, 2010 4:11:44 PM org.apache.tuscany.sca.node.impl.NodeImpl stop
    ...
  • On the server side, you should see the client call and the associated conversationID.
    ...
    INFO: Added Servlet mapping: http://jmatute-laptop:8086/CounterServiceComponent
    setMyServiceCallback on thread Thread[Thread-2,5,main]
    ConversationID = d8a05f47-064b-4832-9cca-7ad035bd36ee. Starting count
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    ConversationID = d8a05f47-064b-4832-9cca-7ad035bd36ee. Stopping count
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    ConversationID = d8a05f47-064b-4832-9cca-7ad035bd36ee. Restarting count
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Thread interrupted.

Having reached this point, let’s look into how Tuscany SCA handles the conversations and callbacks, and how the required information is included in the underlying transport bindings (WS or JMS).

Internals of conversational and asynchronous services in Tuscany SCA

Tuscany SCA conversational services (i.e. those marked with @Conversational) use conversation IDs in order to keep track of, and correctly correlate, the multiple invocations associated with a given conversation.

SCA and Tuscany asynchronous services are similar to those defined, for instance, in BPEL, where the forward and callback calls are two completely separate service invocations. Therefore, Tuscany also requires additional information to know where the callback needs to be sent and, optionally, an application correlation ID for the callback call.

In summary, the list below describes the information used by Tuscany SCA for conversational and callback services:

  • conversationID: a UUID identifying the current conversation. Tuscany SCA uses it to associate the corresponding service instance with that conversation.
  • callbackLocation: the location of the callback service. Depending on the binding used, this may be a URI (web service binding) or a queue name (JMS binding).
  • callbackID: an application-specified callback ID, used by the application to correlate callbacks with the related application state.

The above information needs to be mapped onto the specific binding, and this is where no standardisation exists, making it difficult to build interoperable solutions:

  • Web Service binding: for the web service binding, WS-Addressing headers are used to carry the information Tuscany SCA requires for conversational/asynchronous services. Below is an example of a Tuscany web service conversational and asynchronous/bidirectional invocation:
    • The WS-Addressing Address element contains the callback URI to be used for the callback invocation. This is normally set by the client to tell the server where to send the callbacks.
    • The WS-Addressing ReferenceParameters element contains the other two fields under a Tuscany-specific namespace: the CallbackID and the ConversationID.
    <?xml version='1.0' encoding='UTF-8'?>
    <S:Envelope xmlns:S="http://schemas.xmlsoap.org/soap/envelope/">
        <S:Header>
            <From xmlns="http://www.w3.org/2005/08/addressing" xmlns:ns2="http://tuscany.apache.org/xmlns/sca/1.0">
                <Address>http://localhost:1999/callback</Address>
                <ReferenceParameters>
                    <ns2:CallbackID>45a963da-2074-4bb2-b9ee-f721e2ec753b</ns2:CallbackID>
                    <ns2:ConversationID>309b8322-1dc2-4c51-a4db-73d65edae391</ns2:ConversationID>
                </ReferenceParameters>
            </From>
        </S:Header>
        <S:Body>
            <ns2:startCounting xmlns:ns2="http://server.demo.tuscany.sca.tsl.com/">
                <arg0>30</arg0>
             </ns2:startCounting>
        </S:Body>
    </S:Envelope>
  • JMS binding: for JMS, Tuscany SCA uses JMS message properties to store the required information. A screenshot of the ActiveMQ console (see below) shows a Tuscany SCA JMS message for a conversational and asynchronous/bidirectional service. CallbackID, scaConversationID and scaCallbackQueue are the proprietary Tuscany SCA JMS message properties that hold this information.

    Tuscany SCA JMS Message
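
If you want to verify these properties yourself, they can be read like any other JMS message properties. A small sketch, using the property names as they appear in the ActiveMQ console above:

import javax.jms.JMSException;
import javax.jms.Message;

public class TuscanyJmsDebug {

    // dump the Tuscany SCA correlation properties carried by a JMS message
    public static void dumpScaProperties(Message message) throws JMSException {
        System.out.println("CallbackID        = " + message.getStringProperty("CallbackID"));
        System.out.println("scaConversationID = " + message.getStringProperty("scaConversationID"));
        System.out.println("scaCallbackQueue  = " + message.getStringProperty("scaCallbackQueue"));
    }
}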

Conclusion

In this first post I have presented a Tuscany SCA example that covers both conversational and asynchronous scenarios, something not currently available in the official Tuscany SCA samples, and looked into the internals Tuscany SCA uses to handle these services. This provides the basis for the next post, where I will use this information to extend the project code with a non-SCA, JAX-WS-based web service client as an approach to consuming these services without a Tuscany SCA runtime.

Persistence Strategies for Amazon EC2

At The Server Labs, we often run into a need that comes naturally with the on-demand nature of Cloud Computing. Namely, we’d like to keep our team’s AWS bill in check by ensuring that we can safely turn off our Amazon EC2 instances when not in use. In fact, we’d like to take this practice one step further and automate the times when an instance should be operational (e.g. only during business hours). Stopping an EC2 instance is easy enough, but how do we ensure that our data and configuration persist across server restarts? Fortunately, there are a number of possible approaches to solve the persistence problem, but each one brings its own pitfalls and tradeoffs. In this article, we analyze some of the major persistence strategies and discuss their strengths and weaknesses.

A Review of EC2

Since Amazon introduced their EBS-backed AMIs in late 2009 [1], there has been a great deal of confusion around how this AMI type impacts the EC2 lifecycle operations, particularly in the area of persistence. In this section, we’ll review the often-misunderstood differences between S3 and EBS-backed EC2 instances, which will be crucial as we prepare for our discussion of persistence strategies.

Not All EC2 Instances are Created Equal

An Amazon EC2 instance can be launched from one of two types of AMIs: the traditional S3-backed AMI and the new EBS-backed AMI [2]. These two AMIs exhibit a number of differences, for example in their lifecycle [3] and data persistence characteristics [4]:

  • Lifecycle. Amazon EBS: supports stopping and restarting the instance by saving state to EBS. Instance (S3) store: the instance cannot be stopped; it is either running or terminated.
  • Data persistence. Amazon EBS: data persists in EBS on instance failure or restart, and can also be configured to persist when the instance is terminated, although it does not do so by default. Instance (S3) store: instance storage does not persist on instance shutdown or failure; it is possible to attach non-root EBS devices for data persistence as needed.

As explained above, EBS-backed EC2 instances introduce a new stopped state, unavailable for S3-backed instances. It is important to note that while an instance is in the stopped state, it will not incur any EC2 running costs. You will, however, continue to be billed for the EBS storage associated with your instance. The other benefit over S3-backed instances is that a stopped instance can be started again while maintaining its internal state. The following diagrams summarize the lifecycles of both S3 and EBS-backed EC2 instances:

Lifecycle of an S3-backed EC2 Instance.

Lifecycle of an EBS-backed EC2 Instance.

Note that while the instance ID of a restarted EBS-backed instance will remain the same, it will be dynamically assigned a new set of public and private IP and DNS addresses. If you would like to assign a static IP address to your instance, you can still do so by using Amazon’s Elastic IP service [5].

Persistence Strategies

With an understanding of the differences between S3 and EBS-backed instances, we are well equipped to discuss persistence strategies for each type of instance.

Persistence Strategy 1: EBS-backed Instances

First, we’ll start with the obvious choice: EBS-backed instances. When this type of instance is launched, Amazon automatically creates an Amazon EBS volume from the associated AMI snapshot, which then becomes the root device. Any changes to the local storage are then persisted in this EBS volume, and will survive instance failures and restarts. Note that by default, terminating an EBS-backed instance will also delete the EBS volume associated with it (and all its data), unless explicitly configured not to do so [6].

Persistence Strategy 2: S3-backed Instances

In spite of their ease of use, EBS-backed instances present a couple of drawbacks. First, not all software and architectures are supported out-of-the-box as EBS-backed AMIs, so the version of your favorite OS might not be available. Perhaps more importantly, the EBS volume is mounted as the root device, meaning that you will also be billed for storage of all static data, such as operating system files, that is external to your application or configuration.

To circumvent these disadvantages, it is possible to use an S3-backed EC2 instance that gives you direct control over what files to persist. However, this flexibility comes at a price. Since S3-backed instances use local storage as their root device, you’ll have to manually attach and mount an EBS volume for persisting your data. Any data you write directly to your EBS mount will be automatically persisted. At other times, configuration files live at standard locations outside your EBS mount, and you will still want to persist changes to them. In such situations, you would typically create a symlink on the root device to point to your EBS mount.

For example, assuming you have mounted your EBS volume under /ebs, you would run the following shell commands to persist your apache2 configuration:

# first backup original configuration
mv /etc/apache2/apache2.conf{,.orig}
# use your persisted configuration from EBS by creating a symlink
ln -s /ebs/etc/apache2/apache2.conf /etc/apache2/apache2.conf

Once your S3-backed instance is terminated, any local instance storage (including symlinks) will be lost, but your original data and configuration will persist in your EBS volume. If you would then like to recover the state persisted in EBS upon launching a new instance, you will have to go through the process of recreating any symlinks and/or copying any pertinent configuration and data from your EBS mount to your local instance storage.

Synchronizing between the data persisted in EBS and that in the local instance storage can become complex and difficult to automate when launching new instances. In order to help with these tasks, there are a number of third-party management platforms that provide different levels of automation. These are covered in more detail in the next section.

Persistence Strategy 3: Third-party Management Platforms

In the early days of AWS, there were few third-party platforms available for managing and monitoring your AWS infrastructure, and those that existed were limited. Moreover, in order to manage and monitor your instances for you, these platforms necessarily need access to your EC2 instance keys and AWS credentials. Although a reasonable compromise for some, this requirement could pose an unacceptable security risk for others, who must guarantee the security and confidentiality of their data and internal AWS infrastructure.

Given these limitations, The Server Labs developed its own Cloud Management Framework in Ruby for managing EC2 instances, EBS volumes and internal SSH keys in a secure manner. Our framework automates routine tasks such as attaching and mounting EBS volumes when launching instances, as well as providing hooks for the installation and configuration of software and services at startup based on the data persisted in EBS. It even goes one step further by mounting our EBS volumes using an encrypted file system to guarantee the confidentiality of our internal company data.

Today, companies need not necessarily develop their own homegrown frameworks, and can increasingly rely on third-party platforms. An example of a powerful commercial platform for cloud management is RightScale. For several of our projects, we rely on RightScale to automatically attach EBS volumes when launching new EC2 instances. We also make extensive use of scripting to install and configure software onto our instances automatically at boot time using RightScale’s RightScript technology [7]. These features make it easy to persist your application data and configuration in EBS, while automating the setup and configuration of new EC2 instances associated with one or more EBS volumes.

Automating Your Instance Uptimes

Now that we have discussed the major persistence strategies for Amazon EC2, we are in a good position to tackle our original use case. How can we schedule an instance in Amazon so that it is only operational during business hours? After all, we’d really like to avoid getting billed for instance uptime during times when it is not really needed.

To solve this problem, we’ll have to address two independent considerations. First, we’ll have to ensure that all of our instance state (including data and configuration) is stored persistently. Second, we’ll have to automate the starting and stopping of our instance, as well as restoring its state from persistent storage at boot time.

Automation Strategy 1: EBS-backed Instances

By using an EBS-backed instance, we ensure that all of its state is automatically persisted even if the instance is restarted (provided it is not terminated). Since the EBS volume is mounted as the root device, no further action is required to restore any data or configuration. Lastly, we’ll have to automate starting and stopping of the instance based on our operational times. For scheduling our instance uptimes, we can take advantage of the Linux cron service. For example, in order to schedule an instance to be operational during business hours (9am to 5pm, Monday-Friday), we could create the following two cron jobs:

0 9 * * 1-5 /opt/aws/bin/ec2-start.sh i-10a64379
0 17 * * 1-5 /opt/aws/bin/ec2-stop.sh i-10a64379

The first cron job will schedule the EBS-backed instance identified by instance ID i-10a64379 to be started daily from Monday to Friday at 9am. Similarly, the second job schedules the same instance to be stopped at 5pm Monday through Friday. The cron service invokes the helper scripts ec2-start.sh and ec2-stop.sh to facilitate the configuration of the AWS command-line tools according to your particular environment. You could run this cron job from another instance in the cloud, or you could have a machine in your office launch it.

The following snippet provides sample contents for ec2-start.sh, which does the setup necessary to invoke the AWS ec2-start-instances command. Note that this script assumes that your EBS-backed instance was previously launched manually and you know its instance ID.

#!/bin/bash
# Name: ec2-start.sh
# Description: this script starts the EBS-backed instance with the specified Instance ID
# by invoking the AWS ec2-start-instances command
# Arguments: the Instance ID for the EBS-backed instance that will be started.

export JAVA_HOME=/usr/lib/jvm/java-6-sun-1.6.0.20
export EC2_HOME=/opt/aws/ec2-api-tools-1.3
export EC2_PRIVATE_KEY=/opt/aws/keys/private-key.pem
export EC2_CERT=/opt/aws/keys/cert.pem
# uncomment the following line to use Europe as the default Zone
#export EC2_URL=https://ec2.eu-west-1.amazonaws.com
PATH=$PATH:${EC2_HOME}/bin
INSTANCE_ID=$1

echo "Starting EBS-backed instance with ID ${INSTANCE_ID}"
ec2-start-instances ${INSTANCE_ID}

Similarly, ec2-stop.sh would stop your EBS-backed instance by invoking ec2-stop-instances followed by your instance ID. Note that the instance ID of EBS-backed instances will remain the same across restarts.
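
If you would rather avoid wrapping the command-line tools, the same start/stop calls can be made programmatically. The following is a minimal sketch using the AWS SDK for Java; the credentials and instance ID are placeholders you would replace with your own:

import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.ec2.AmazonEC2Client;
import com.amazonaws.services.ec2.model.StartInstancesRequest;
import com.amazonaws.services.ec2.model.StopInstancesRequest;

public class Ec2Scheduler {

    public static void main(String[] args) {
        // placeholder credentials and instance ID: replace with your own
        AmazonEC2Client ec2 = new AmazonEC2Client(
                new BasicAWSCredentials("ACCESS_KEY", "SECRET_KEY"));
        String instanceId = "i-10a64379";

        if (args.length > 0 && "stop".equals(args[0])) {
            ec2.stopInstances(new StopInstancesRequest().withInstanceIds(instanceId));
        } else {
            ec2.startInstances(new StartInstancesRequest().withInstanceIds(instanceId));
        }
    }
}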

Automation Strategy 2: S3-backed Instances

Amazon instances backed by S3 present the additional complexity that the local storage is not persistent and will be lost upon terminating the instance. In order to persist application data and configuration changes independently of the lifecycle of our instance, we’ll have to rely on EBS. Additionally, we’ll have to carefully restore any persisted state upon launching a new EC2 instance.

The Server Labs Cloud Manager allows us to automate these tasks. Among other features, it automatically attaches and mounts a specified EBS volume when launching a new EC2 instance. It also provides hooks to invoke one or more startup scripts directly from EBS. These scripts are specific to the application, and can be used to restore instance state from EBS, including any appropriate application data and configuration.

If you must use S3-backed instances for your solution, you’ll either have to develop your own framework along the lines of The Server Labs Cloud Manager, or rely on third-party management platforms like Rightscale. Otherwise, EBS-backed instances provide the path of least resistance to persisting your instance data and configuration.

Automation Strategy 3: RightScale

RightScale provides a commercial platform with support for boot-time scripts (via RightScripts) and automatic attachment of EBS volumes. In addition, RightScale allows applications to define arrays of servers that grow and shrink based on a number of parameters. By using the server array schedule feature, you can define how an alert-based array resizes over the course of a week [8], and thus ensure a single running instance of your server during business hours. In addition, leveraging boot-time scripts and the EBS volume management feature enables you to automate the setup and configuration of new instances in the array, while persisting changes to your application data and configuration. Using these features, it is possible to build an automated solution for a server that operates during business hours and can be shut down safely when not in use.

Conclusion

This article describes the major approaches to persisting state in Amazon EC2. Persisting state is crucial to building robust and highly-available architectures with the capacity to scale. Not only does it promote operational efficiency by consuming resources only when a need exists; it also protects your application state, so that if your instance fails or is accidentally terminated you can automatically launch a new one and continue where you left off. In fact, these same ideas can also enable your application to scale seamlessly by automatically provisioning new EC2 instances in response to a growth in demand.

References

[1] New Amazon EC2 Feature: Boot from Elastic Block Store. Original announcement from Amazon explaining the new EC2 boot from EBS feature.

[2] Amazon Elastic Compute Cloud User Guide: AMI Basics. Covers basic AMI concepts for S3 and EBS AMI types.

[3] The EC2 Instance Life Cycle: excellent blog post describing major lifecycle differences between S3 and EBS-backed EC2 instances.

[4] Amazon Elastic Compute Cloud User Guide: AMIs Backed by Amazon EBS. Learn about EBS-backed AMIs and how they work.

[5] AWS Feature Guide: Amazon EC2 Elastic IP Addresses. An introduction to Elastic IP Addresses for Amazon EC2.

[6] Amazon Elastic Compute Cloud User Guide: Changing the Root Volume to Persist. Learn how to configure your EBS-backed EC2 instance so that the associated EBS volume is not deleted upon termination.

[7] RightScale User Guide: RightScripts. Learn how to write your own RightScripts.

[8] RightScale User Guide: Server Array Schedule. Learn how to create an alert-based array to resize over the course of the week.

Dynamic LDAP-based authentication and authorization in Servicemix 3

Recently, we have been working quite extensively with Apache Servicemix, a JBI-compliant ESB.
One of the areas we have been looking into is securing services in general, and how to perform LDAP-based authentication and authorization for those services in particular.

A good starting point to understand ServiceMix (SMX) security features can be found here. I will give a brief overview of the security features of JBI and SMX.

JBI does not define authentication and authorization mechanisms very precisely, but it does define the following:

  • JBI relies on JAAS definitions for security features (i.e. security Subject).
  • The Binding Components (e.g. servicemix-cxfbc) are expected to transform protocol-specific (e.g. WS-Security) credentials into a normalized format, a security Subject, and inject them into the normalized message.
  • The ESB will then apply the authentication and authorization mechanism based on Subject information.

In particular for SMX we have:

  • For authentication, SMX relies on JAAS and can use custom login modules to integrate different technologies (e.g. LDAP) and implement specific authentication logic. The SMX configuration for JAAS is located under $SERVICEMIX_HOME/conf/login.properties.
    • SMX comes by default with a file based authentication mechanism, which is not suitable for use in many environments since it requires an SMX restart after any change.
  • For authorization, SMX does not use JAAS, due to the different nature of the items to be authorized (i.e. service endpoints instead of Java code). It defines a specific ACL AuthorizationMap, used by the NormalizedRouter (SecureBroker) to allow or deny the invocation. The configuration file for this is located in $SERVICEMIX_HOME/conf/security.xml.
    • Also in this case, SMX comes by default with a file based ACL definition, included in the security.xml file. Any change in the ACLs requires an SMX restart.

So the idea behind this post is to provide a dynamic authentication/authorization mechanism in SMX that does not require an ESB restart after changes whilst also making use of LDAP.

In this post I will go through a complete example that requires the following:

  1. Apache ServiceMix 3.3.2.
  2. OpenDS 2.2 as a test LDAP Server. However you can use any other LDAP server you prefer.
  3. Maven 2.0.9 or higher to build the projects.

The complete source code of this post can be found here. It contains:

  1. An LDAP LDIF file with the sample directory structure used.
  2. As a test service, a secured version of SMX’s cxf-wsdl-first example, now called cxf-wsdl-first-secure.
  3. A smx-ldap maven project with the classes we need to deploy in SMX in order to enable LDAP.
  4. A smxConfig directory with the required SMX configuration.

With that in mind, let’s go through the details:

Install Servicemix and OpenDS

You can skip the OpenDS section if you have your own LDAP server, just make sure the relevant LDAP configuration parameters are used in the coming sections.

  1. Download and install Servicemix. No additional modifications are required at this stage.
  2. Download and install OpenDS. If you want to use the sample LDAP ldif file, setup the directory server with the following configuration:
    • Select as “Directory Base DN”: dc=theserverlabs,dc=com.
    • Select import data from LDIF and choose the file included in the source package.
    • Complete installation.

The users and passwords defined in the sample ldif are:

  • reader:reader
  • smx:smx
  • testuser:testuser

Create a test service that requires authentication

To show the authentication and authorization of a service deployed in SMX, I took the “cxf-wsdl-first” example that is included in the SMX distribution and enabled WS-Security authentication headers. The resulting project is included in the source files as “cxf-wsdl-first-secure”.

For that, I just added the following interceptor configuration to the xbean.xml file of the CXFBC service unit “wsdl-first-cxfbc-secure-su”:

 
<cxfbc:consumer wsdl="classpath:person.wsdl"
                        targetService="person:PersonService"
                        targetInterface="person:Person">
     <cxfbc:inInterceptors>
             <bean class="org.apache.cxf.interceptor.LoggingInInterceptor" />
             <ref bean="UserName_RequestIn" />
      </cxfbc:inInterceptors>
</cxfbc:consumer>
 
 
<bean class="org.apache.cxf.ws.security.wss4j.WSS4JInInterceptor"  id="UserName_RequestIn">
     <constructor-arg>
         <map>
		<entry key="action" value="UsernameToken" />
		<!--
			No real implementation of PasswordCallbackClass is required, thus
			the dummy class. However, it must be defined. This is because SMX
			takes care of the authentication, not WSS4J.
		-->
		<entry key="passwordCallbackClass" value="com.tsl.jbi.smx.security.DummyCallbackHandler" />
		<entry key="passwordType" value="PasswordText"/>
	</map>
    </constructor-arg>
</bean>

As you can see, in this case the required passwordCallbackClass just points to a DummyCallbackHandler (also included in the CXFBC SU) that does nothing at all, delegating the whole authentication/authorization to SMX.
This is the simplest case; obviously you could have a class that performs additional checks or deals with keystore passwords in case encryption is used. I will not get into the details of WSS4J here, as it is a subject worth a whole entry by itself.
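
For reference, such a handler is little more than an empty implementation of the standard JAAS CallbackHandler interface, along these lines (the class in the sources may differ in details):

import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;

/**
 * Dummy WSS4J password callback handler. Authentication is delegated
 * entirely to the SMX JAAS layer, so there is nothing to do here.
 */
public class DummyCallbackHandler implements CallbackHandler {

    public void handle(Callback[] callbacks) {
        // intentionally empty: SMX performs the real credential check
    }
}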

Also, for a very good background reading of WS-Security and SMX check this blog post by Torsten Mielke.

To install the example project into SMX, just build the maven project and copy the service assembly to the hotdeploy directory of SMX:

cd Source/cxf-wsdl-first-secure/
mvn clean install
cp  wsdl-first-cxf-secure-sa/target/wsdl-first-cxf-secure-sa-3.3.2.zip $SERVICEMIX_HOME/hotdeploy

An example soap request with the WS-Security headers would be:

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:typ="http://servicemix.apache.org/samples/wsdl-first/types">
   <soapenv:Header>
       <wsse:Security xmlns:wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd">
            <wsse:UsernameToken wsu:Id="UsernameToken-9" xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd">
                   <wsse:Username>testuser</wsse:Username>
                   <wsse:Password Type="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-username-token-profile-1.0#PasswordText">testuser</wsse:Password>
            </wsse:UsernameToken>
       </wsse:Security>
   </soapenv:Header>
   <soapenv:Body>
      <typ:GetPerson>
         <typ:personId>world</typ:personId>
      </typ:GetPerson>
   </soapenv:Body>
</soapenv:Envelope>

Setup the required LDAP classes for JAAS and SMX AuthorizationMap

The required JAAS LDAP login module and the SMX AuthorizationMap contain a lot of common LDAP connectivity logic that can be found in several open-source implementations.
I took the ActiveMQ classes as a reference and modified them to fit the SMX needs.

You can find the two classes in the smx-ldap maven project:

  • LdapLoginModule: This class implements the JAAS LoginModule interface and is responsible for performing the authentication (login method) based on the passed credentials. It is also responsible for extracting the user’s roles once the user is authenticated and injecting them into the Subject as additional principals. The provided class is basically the same implementation used in ActiveMQ; no changes are required.
  • LdapAuthorizationMap: This class implements the AuthorizationMap interface offered by SMX. The interface has just one method:
    Set<Principal> getAcls(endpoint, operation)

    The class simply returns the set of principals allowed to invoke the given endpoint and operation. In this case too, I took the LdapAuthorizationMap included in ActiveMQ as a baseline; it needed adapting because the ActiveMQ maps work on queues and topics, not endpoints.

Having reached this point, I needed to define the level of granularity and the format of the LDAP entries defining the ACLs. I provide one possible implementation here, and it is definitely not perfect. Depending on your needs, you can take this as a baseline and implement more complex logic, add caching of endpoint ACLs, etc…

So this is the format I defined:

  • Endpoint ACLs are located under “ou=endpoints,ou=smx” in the LDAP tree.
  • The entries defining the ACLs will be “GroupOfUniqueNames”, having an entry per allowed group.
  • The “cn” of the ACLs has the form:
    cn=<Namespace>:<ServiceName>:<Operation>

    In the defined schema you can also use wildcards for the service and operation. Examples of ACL definitions for the cxf-wsdl-first-secure project would be:

    cn=http://servicemix.apache.org/samples/wsdl-first:*:*   
           Grant permissions to all services on that ns.    
     
    cn=http://servicemix.apache.org/samples/wsdl-first:PersonService:*   
           Grant permissions to all operations of PersonService.    
     
    cn=http://servicemix.apache.org/samples/wsdl-first:*:GetPerson   
           Grant permissions to GetPerson operations of all services in that ns.

    Another example in case you use dynamic SMX ftp endpoints would be:

    cn=urn:servicemix:ftp:* 
           Grant permissions to all dynamic endpoints of the servicemix-ftp BC.

This logic is implemented in the LdapAuthorizationMap class.
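
To make that mapping concrete, here is a simplified sketch of how getAcls can build the filter and query the directory. It assumes an already-connected DirContext field (context), the endpointSearchBase, roleAttribute and endpointRoleFilterFormat properties configured in the next section, and SMX's GroupPrincipal class; the real implementation is more configurable and handles errors and reconnection properly:

// simplified sketch: assumed fields are context (DirContext),
// endpointSearchBase (String), roleAttribute (String) and
// endpointRoleFilterFormat (java.text.MessageFormat)
public Set<Principal> getAcls(ServiceEndpoint endpoint, QName operation) {
    Set<Principal> principals = new HashSet<Principal>();
    try {
        // build the LDAP filter from the configured MessageFormat:
        // {0} = namespace, {1} = service name, {2} = operation
        QName service = endpoint.getServiceName();
        String filter = endpointRoleFilterFormat.format(new Object[] {
                service.getNamespaceURI(), service.getLocalPart(),
                operation.getLocalPart() });

        SearchControls controls = new SearchControls();
        controls.setSearchScope(SearchControls.SUBTREE_SCOPE);

        NamingEnumeration<SearchResult> results =
                context.search(endpointSearchBase, filter, controls);
        while (results.hasMore()) {
            // every value of the role attribute (e.g. uniqueMember) becomes a granted principal
            Attribute roles = results.next().getAttributes().get(roleAttribute);
            for (int i = 0; roles != null && i < roles.size(); i++) {
                principals.add(new GroupPrincipal(roles.get(i).toString()));
            }
        }
    } catch (NamingException e) {
        throw new RuntimeException("LDAP ACL lookup failed", e);
    }
    return principals;
}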

Both classes are highly configurable via properties; we will see how to set them up in SMX below.

To install the classes, build the maven project and copy the jar file to the lib directory of SMX:

cd Source/smx-ldap/
mvn clean install
cp target/smx-ldap-1.0-SNAPSHOT.jar $SERVICEMIX_HOME/lib
 
<Restart ServiceMix>

Configure SMX

To make use of the LDAP authentication and authorization two SMX files need to be modified:

  • $SERVICEMIX_HOME/conf/login.properties: This is the JAAS authentication configuration and must define the LdapLoginModule. The parameters below are for a locally configured OpenDS LDAP server; modify them to match your configuration if required:
    servicemix-domain {
        com.tsl.jbi.smx.security.login.LDAPLoginModule REQUIRED
            debug=true
            initialContextFactory=com.sun.jndi.ldap.LdapCtxFactory
            connectionURL="ldap://localhost:389"
            connectionUsername="cn=reader,ou=people,dc=theserverlabs,dc=com"
            connectionPassword=reader
            connectionProtocol=s
            authentication=simple
            userBase="ou=people,dc=theserverlabs,dc=com"
            userRoleName=dummyUserRoleName
            userSearchMatching="(uid={0})"
            userSearchSubtree=false
            roleBase="ou=groups,dc=theserverlabs,dc=com"
            roleName=cn
            roleSearchMatching="(member={0})"
            roleSearchSubtree=true
            ;
    };

    You can see that there are two parameters, userSearchMatching and roleSearchMatching, which allow us to define user and role search filters that fit our environment.

  • $SERVICEMIX_HOME/conf/security.xml. This file contains general SMX security configuration and here we will define the bean for the LdapAuthorizationMap:
      <!-- Custom, LDAP based authorisation map -->
      <!-- MessageFormat used as LDAP filter to search for the roles of a specific endpoint -->
      <!-- {0} will be replaced by Endpoint Namespace, {1} by Endpoint Name and -->
      <!-- {2} by Endpoint operation -->
      <bean id="roleFilterMessageFormat" class="java.text.MessageFormat">
     <constructor-arg value="(|(|(|(cn={0}:{1}:{2})(cn={0}:{1}:\*))(cn={0}:\*:{2}))(cn={0}:\*:\*))" />
      </bean>
      <bean id="LdapAuthorizationMap" class="com.tsl.jbi.smx.security.LDAPAuthorizationMap">
        <property name="initialContextFactory" value="com.sun.jndi.ldap.LdapCtxFactory"/>
        <property name="connectionURL" value="ldap://localhost:389"/>
        <property name="connectionUsername" value="cn=reader,ou=people,dc=theserverlabs,dc=com"/>
        <property name="connectionPassword" value="reader"/>
        <property name="connectionProtocol" value="s"/>
        <property name="endpointSearchBase" value="ou=endpoints,ou=smx,dc=theserverlabs,dc=com"/>
        <property name="endpointRoleFilterFormat" ref="roleFilterMessageFormat"/>
        <property name="roleAttribute" value="uniqueMember"/>
      </bean>

    Also in this case, we have an endpointRoleFilterFormat that defines the ACL’s search filter. This can be modified to match whatever other LDAP directory setup you want to define for ACLs.

  • $SERVICEMIX_HOME/conf/servicemix.xml. We need to modify the authorizationMap used by the secured broker and point it to the new LDAP-based map.
        <sm:broker>
          <sm:securedBroker authorizationMap="#LdapAuthorizationMap">
     
           .....
        </sm:broker>


Restart ServiceMix to apply the changes and you are ready to go!

Test the service and play around

You can test the configuration with SoapUI (example project included in sources) or make use of the client.html (thanks SMX guys for this!!!) included in the cxf-wsdl-first-secure project.

The simplest one is the client.html, just load it in your web browser and fire away.

Test Web Service Client

The test should run successfully and return the Person information.

Now, it’s time to start playing around with the LDAP and see dynamic authentication/authorization:

  • For instance, you can change the ACL group in the endpoint LDAP entry to SMX-GROUP2 and see how you get an “Endpoint is not authorized for this user” exception.
  • Create a more fine grained endpoint ACL that only assigns SMX-GROUP1 to:
    cn="http://servicemix.apache.org/samples/wsdl-first:PersonService:GetPerson"
  • Change the password used for authentication of testuser in the SOAP request to get an LDAP login error.

Conclusions

Dynamic authentication/authorization configuration in an ESB with mechanisms like LDAP is mandatory in any real deployment scenario.

You have seen that adding LDAP-based authentication/authorization to SMX is not difficult but to have a robust model some implementation work is required. The implementation shown in this post can be taken as a good starting point but there are areas of improvement both in terms of performance (e.g. cache, etc…) and in the authorization model itself (e.g. precedence of ACLs, etc…).

Also, the SMX authentication/authorization mechanisms described here are for version 3.x. They should also be applicable to SMX 4 and its new OSGi-based architecture, as the same type of NMR sits on top of the OSGi runtime.

HTML5 Websocket vs Flex Messaging

Nearly 18 months ago, I wrote an article on this blog which showed how to develop a simple Flex web application using ActiveMQ as a JMS server. It has proved a popular article, so there must be a fair amount of interest in this topic.

With all the hype surrounding HTML5 at the moment, I figured I’d try to update the application for the HTML5 world, at the same time seeing how easy it would be to replace Adobe Flex. I’m sure you’re all familiar with the fact that Steve Jobs won’t let Flash (and therefore Flex) onto either the iPhone or iPad, so it’s worth investigating if the alternatives (in this case HTML5 websocket) are really up to scratch.

The example shown in the article uses HTML 5 websocket, Apache ActiveMQ, the Stomp messaging protocol and (indirectly) Jetty 7. It currently only runs in Google Chrome 5 since this is the only browser with support for websocket available as of May 2010.

What is Websocket?

In short, it’s a way of opening up a firewall-friendly bi-directional communications channel with a server. The channel stays open for a long period of time, allowing the server and client web browser to interchange messages without having to do polling which consumes bandwidth and a lot of server resources.

This article describes websocket in much more depth.

Websocket forms part of the (still unapproved) HTML5 specification which all browsers will eventually have to implement.

Updating the Trader app

In order to get the best out of this section, it is probably best to have read the original Flex-JMS article.

Here is the code for the updated trader app. Instead of having a server webapp component with BlazeDS acting as a message proxy and the flex clients, we have a bunch of HTML/CSS/JS files (in the /html folder) and some code to put messages in the JMS queues published by Apache ActiveMQ (in the /server folder).

Setting up Apache ActiveMQ

Apache ActiveMQ now comes with support for Websocket (implemented behind the scenes with Jetty 7). We will use ActiveMQ both as our messaging server and web server in this example, though that is not perhaps the best production configuration.

This article explains how to configure ActiveMQ and Websocket. I will repeat the key instructions here for the sake of simplicity:

  1. Download the latest snapshot of Apache ActiveMQ 5.4 from here and unzip it somewhere on your filesystem that we will call $ACTIVEMQ_HOME
  2. Edit $ACTIVEMQ_HOME/conf/activemq.xml and change the transportConnectors section so that it looks like the example below:
            <transportConnectors>
                <transportConnector name="openwire" uri="tcp://0.0.0.0:61616"/>
                <transportConnector name="websocket" uri="ws://0.0.0.0:61614"/>
                <transportConnector name="stomp" uri="stomp://0.0.0.0:61612?transport.closeAsync=false"/> 
                <transportConnector name="stomp+nio" uri="stomp+nio://0.0.0.0:61613?transport.closeAsync=false"/> 
            </transportConnectors>
  3. Start ActiveMQ by running bin/activemq from the $ACTIVEMQ_HOME directory. Go to http://localhost:8161/admin/ and log in with the username/password admin/admin to check that everything is working ok.

Configure and install the application

Download the trader app code and copy the /html folder to $ACTIVEMQ_HOME/webapps/demo, renaming it to /trader-app-websocket (i.e. the full path should be $ACTIVEMQ_HOME/webapps/demo/trader-app-websocket).

Edit stock-quote.js and note the following section:

    var url = 'ws://localhost:61614/stomp';
    var login = 'guest';
    var passcode = 'guest';
    destination = '/topic/stockQuoteTopic';

The url variable refers to the URL of the ActiveMQ server. Due to a bug in Chrome running on Ubuntu 9.10, I had to put the IP address of my machine here, but if you’re running on another Linux flavour, OS X or Windows, I would imagine that leaving it as localhost should be OK. The username and password are guest/guest, which is standard for ActiveMQ.

Download the latest version of Google Chrome (I have 5.0.375.55 which was released a few days ago) and open the URL http://localhost:8161/demo/trader-app-websocket/. You should see a UI that is similar to the Flex app developed in the original article:

Open up a terminal window and go to the location to which you extracted the /server part of the code download. Run the following (assumes Maven is installed):

mvn clean compile exec:java -Dexec.mainClass="com.theserverlabs.flex.trader.JSONFeed"

You should see a whole bunch of stock quote information like that shown below scrolling in the terminal:

{"symbol": "XOM",
"name": "Exxon Mobile Corp",
"low": 61.317410451219146,
"high": 61.56,
"open": 61.56,
"last": 61.317410451219146,
"change": 61.317410451219146}

This Java program is the same as that used in the original post. It generates random stock price information for a variety of stocks and publishes it to the stock quote topic in ActiveMQ as JMS text messages containing JSON-formatted data.
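
At its core, that feed is a plain JMS topic publisher. A stripped-down sketch of the idea, with the random quote generation reduced to a fixed message and using the broker URL and topic name from this article, looks like this:

import javax.jms.Connection;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;
import org.apache.activemq.ActiveMQConnectionFactory;

public class JsonFeedSketch {

    public static void main(String[] args) throws Exception {
        ActiveMQConnectionFactory factory =
                new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection connection = factory.createConnection();
        connection.start();

        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageProducer producer =
                session.createProducer(session.createTopic("stockQuoteTopic"));

        while (true) {
            // the real feed generates random quotes; here the payload is fixed
            String json = "{\"symbol\": \"XOM\", \"name\": \"Exxon Mobile Corp\", "
                    + "\"low\": 61.31, \"high\": 61.56, \"open\": 61.56, "
                    + "\"last\": 61.31, \"change\": -0.24}";
            TextMessage message = session.createTextMessage(json);
            producer.send(message);
            Thread.sleep(1000);
        }
    }
}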

If you go back to the Chrome browser window, you should see the stock quotes update. If they don’t update, click on refresh:

This is pretty much exactly how the original Flex application worked. The UI colours etc. are slightly different and I’ve not implemented the functionality for subscribing/unsubscribing from a stock price, but that was purely due to time constraints, not because it is difficult.

How does it work?

When the browser opens the page, it executes the following code in stock-quote.js, which subscribes to the stock quote service:

$(document).ready(function(){
 
    var client, destination;
 
    ... 
 
    var url = 'ws://localhost:61614/stomp';
    var login = 'guest';
    var passcode = 'guest';
    destination = '/topic/stockQuoteTopic';
 
    client = Stomp.client(url);
    client.connect(login, passcode, onconnect);
 
});

Here we use the JavaScript Stomp library provided by Jeff Mesnil, which enables us to access ActiveMQ using the Stomp protocol instead of JMS. We use it because it is simple and cross-platform; there is no way to subscribe to a JMS server directly from JavaScript, since no JMS client is available for the browser.

In the same block of code, you can see the code that we execute when we receive a message:

 
    var onconnect = function(frame) {
 
      client.subscribe(destination, function(message) {
        var quote = JSON.parse(message.body);
        $('.' + quote.symbol).replaceWith("<tr class=\"" + quote.symbol + "\">" + 
            "<td>" + quote.symbol + "</td>" +  
            "<td>" + quote.open.toFixed(2) + "</td>" + 
            "<td>" + quote.last.toFixed(2) + "</td>" + 
            "<td>" + quote.change.toFixed(2) + "</td>" + 
            "<td>" + quote.high.toFixed(2) + "</td>" + 
            "<td>" + quote.low.toFixed(2) + "</td>" +  
            "</tr>");
      });
    };

When we receive a message, we parse it with JSON.parse and then use jQuery to find the HTML element whose class attribute equals the stock symbol (e.g. if the symbol is IBM, we look for the HTML element with class=”IBM”) and replace its contents with the HTML table row generated in the method. Simple really.

The rest of the code is just HTML and CSS and is not really that interesting for this article.

Conclusions

It is pretty easy to develop an application that uses Websocket: you only have to look at how little real code there is in this example. I’d say it is as easy as developing the original Flex app, so from a development point of view there’s little to choose between these technologies.

Unfortunately the only browser that currently supports Websocket is Google Chrome (and the implementation is a bit buggy). Other browsers (especially Firefox and Safari) should have this functionality soon though. One of the arguments used in support of Flash/Flex has always been the large installed base. Given that the iPhone and iPad are not part of this installed base, it is questionable whether this can still be used as a justification for using Flash/Flex. Sure, there aren’t that many browsers that support Websocket, but in 6 months’ time they probably all will, and you won’t need any proprietary plugin to access the apps built using them. I’d definitely recommend that people developing internal corporate apps, who can force their end users to use a browser with Websocket support, take a look at this technology. People developing publicly accessible webapps are probably going to have to wait until it is more widely implemented in browsers, and provide graceful fallback for the case in which it isn’t.

I should point out that I have an iPhone and use Ubuntu 9.10 64-bit at work; I hate not being able to see content on my iPhone because it has been implemented in Flash, and my entire Firefox browser often crashes completely due to the 64-bit Linux Flash plugin.

I’ve gone from being a fan of Flash/Flex to not being so sure about it. It’s hard to escape the feeling that HTML5 will offer much of the functionality currently offered by Flash/Flex in a short period of time. I think the future of these technologies is going to depend on Adobe innovating and offering stuff that is not possible in HTML5, otherwise there is little reason to use it.

Human readable JVM GC timestamps

When we are diagnosing problems in a Java (EE or otherwise) application, it is often a good idea to check how garbage collection is performing. One of the most basic and unobtrusive actions is to enable garbage collection logging.

As you may know, if we add the following arguments to the java start command…

-Xloggc:<file_name> -XX:+PrintGCDetails -XX:+PrintGCDateStamps

… the JVM will start writing garbage collection messages to the file we set with the -Xloggc parameter. The messages should look something like:


2010-04-22T18:12:27.796+0200: 22.317: [GC 59030K->52906K(97244K), 0.0019061 secs]
2010-04-22T18:12:27.828+0200: 22.348: [GC 59114K->52749K(97244K), 0.0021595 secs]
2010-04-22T18:12:27.859+0200: 22.380: [GC 58957K->53335K(97244K), 0.0022615 secs]
2010-04-22T18:12:27.890+0200: 22.409: [GC 59543K->53385K(97244K), 0.0024157 secs]

The leading part of each line is simply the date and time at which the reported garbage collection event starts.

Unfortunately -XX:+PrintGCDateStamps is available only for Java 6 Update 4 and later JVMs. So, if we are unlucky and our application is running on older JVMs we are forced to use…

-Xloggc:<file> -XX:+PrintGCDetails

… and the messages will be like:


22.317: [GC 59030K->52906K(97244K), 0.0019061 secs]
22.348: [GC 59114K->52749K(97244K), 0.0021595 secs]
22.380: [GC 58957K->53335K(97244K), 0.0022615 secs]
22.409: [GC 59543K->53385K(97244K), 0.0024157 secs]

Now, the leading numbers (also present in the previous format) are the seconds elapsed since JVM start time.

Mmm… way harder to correlate GC events with information from other log files in this case :/

Wouldn’t it be easier to process the GC log file and calculate the date and time from the seconds elapsed? It seems so, but seconds elapsed from… when? Or, to put it another way, where do we extract the JVM startup date and time from?

In order to be as unobtrusive as possible, we should try to calculate the start date and time from the same GC log file. That brings us to the file attributes. We have different options:

  • Unix: access time, change time, modify time
  • Windows: access time, creation time, modify time

We discard access time (for obvious reasons), and also change time and creation time as they are not available on both platforms, so we are left with modification time, which represents the time when the file was last modified.

In Windows, modification time is preserved when the file is copied elsewhere, but in Unix we should use the -p flag to preserve timestamp attributes if we want to copy the GC log file prior to processing it.

The last modification time of the GC log file should match the last timestamp recorded for a GC event in the log file. Well… for the purists, it should exactly match the last elapsed time plus the GC execution time (the two numbers shown in the line below), as each log line is written piece by piece while the collection executes.


22.409: [GC 59543K->53385K(97244K), 0.0024157 secs]

In our approach, we discard the execution time, as we don’t need that much precision to get a rough idea of what time each garbage collection event occurred. Nevertheless, keep in mind that GC execution time can sometimes be as long as several seconds on large heaps.

When we ran into this situation at a client recently, we needed to quickly develop a simple and portable script, so we used Python for the task. You already knew we don’t do just Java, didn’t you? :P

#!/usr/bin/env python
 
import sys, os, datetime
 
# true if string is a positive float
def validSeconds(str_sec):
    try:
        return 0 < float(str_sec)
    except ValueError:
        return False
 
# show usage                
if len(sys.argv) < 2:
    print "Usage: %s <gc.log>" % (sys.argv[0])
    sys.exit(1)
 
file_str = sys.argv[1]
lastmod_date = datetime.datetime.fromtimestamp(os.path.getmtime(file_str))
 
file = open(file_str, 'r')
lines = file.readlines()
file.close()
 
# get last elapsed time
for line in reversed(lines):
    parts = line.split(':')
    if validSeconds(parts[0]):
        break
 
# calculate start time
start_date = lastmod_date - datetime.timedelta(seconds=float(parts[0]))
 
# print file, prepending human readable time where appropriate
for line in lines:
    parts = line.split(':')
    if not validSeconds(parts[0]):
        print line.rstrip()
        continue
    line_date = start_date + datetime.timedelta(seconds=float(parts[0]))
    print "%s: %s" % (line_date.isoformat(), line.rstrip())

The script output can be redirected to another file, where we’ll have:

2010-04-22T18:12:27.796375: 22.317: [GC 59030K->52906K(97244K), 0.0019061 secs]
2010-04-22T18:12:27.828375: 22.348: [GC 59114K->52749K(97244K), 0.0021595 secs]
2010-04-22T18:12:27.859375: 22.380: [GC 58957K->53335K(97244K), 0.0022615 secs]
2010-04-22T18:12:27.890375: 22.409: [GC 59543K->53385K(97244K), 0.0024157 secs]

You may note that the date format is not 100% the same as the one produced by the -XX:+PrintGCDateStamps argument, but it should be enough to get an idea of when each GC event happened (timezone management in Python is way out of the scope of this blog entry).

This has been my first blog entry for The Server Labs and I hope some of you find it useful. Of course, all comments, suggestions and feedback are very welcome.