The Server Labs Blog Rotating Header Image

Java

Smooks processing recipies

Introduction

In one of our customer projects we had a requirement to import CSV, fixed length and Excel files in different formats and store records in the database. We chose Smooks to accomplish this task.

Smooks is a Java framework to read, process and transform data from various sources (CSV, fixed length, XML, EDI, …) to various destinations (XML, Java objects, database). It convinced me because:

  • it brings out-of-the-box components to read CSV and fixed length files
  • it integrates smoothly with an ORM library (Hibernate, JPA)
  • processing is configured using an XML configuration file – you need only few lines of code to do the transformations
  • extensibility – implementing a custom Excel reader was relatively easy
  • low added filtering overhead – reading 100.000 CSV lines and storing them in the database using Hibernate took us less than 30 seconds

During the development we had to overcome some hurdles imposed by Smooks processing model. In this post I would like to share our practical experience we gained working with Smooks. First, I’m going to present a sample transformation use case with requirements similar to a real-world assignment. Then I will present solutions to these requirements in a ‘how-to’ style.

Use case

We are developing a ticketing application. The heart of your application is Issue class:

We have to write an import and conversion module for an external ticketing system. Data comes in the CSV format (for the sake of simplicity). The domain model of the external system is slightly different than ours; however, issues coming from the external issue tracker can be mapped to our Issues.

External system exchange format defines the following fields: description, priority, reporter, assignee, createdDate, createdTime, updatedDate, updatedTime. They should be mapped to our Issue in the following manner:

  • description property – description field
  • This is a simple Smooks mapping. No big issue.

  • project property – there is no project field. Project should be assigned manually
  • A constant object (from our domain model) must be passed to Smooks engine to be used in Java binding.
    See Assign constant object to a property.

  • priority property – priority field; P1 and P2 priorities should be mapped to Priority.LOW, P3 to Priority.MEDIUM, P4 and P5 to Priority.HIGH
  • This mapping could be done using an MVEL expression. However, we want to encapsulate this logic in a separate class that can be easily unit-tested. See Use external object to calculate property value

  • involvedPersons property – reporter field plus assignee field if not empty (append assignee using ‘;’ separator)
  • Set compound properties in Java binding will show how to achieve it.

  • created property – merge createdDate and createdTime fields
  • updated property – merge updatedDate and updatedTime fields
  • In Set datetime from date and time fields, two strategies will be presented.

    Before diving into details, I’m going to present the final Smooks configuration and the invocation code of the transformation (as JUnit 4 test). Later on, in each recipe, I will explain the XML configuration and Java code fragments relevant to that recipe.

    The remaining classes (Issue, Priority, Project, IssuePrioritizer) are not included in the text. You can browse online the source code in GitHub. To get your local copy, clone the Git repository:


    git clone git://github.com/mgryszko/blog-smooks-recipies.git


    smooks-config.xml

    
    
    
        
            SAX
        
    
        
    
        
            
            
            
            
        
    
        
            
        
    
        
            
            
            
                prioritizer.assignPriorityFromCode(_VALUE)
            
            
                transformedProps["reporter"]
                    + (org.apache.commons.lang.StringUtils.isNotBlank(transformedProps["assignee"]) ? ";" + transformedProps["assignee"] : "")
            
            
                yyyy-MM-dd
            
            
                HH:mm
            
            
                updated = transformedProps["updatedDate"] + " " + transformedProps["updatedTime"];
                new java.text.SimpleDateFormat("yyyy-MM-dd HH:mm").parse(updated)
            
        
    
    


    SmooksRecipiesTest

    package com.tsl.smooks;
    
    // imports hidden
    
    public class SmooksRecipiesTest {
    
        private static final SimpleDateFormat DATETIME_FORMAT = new SimpleDateFormat("yyyy-MM-dd HH:mm");
    
        private Source importedIssues = new StringSource(
            "description,priority,reporter,assignee,createdDate,createdTime,updatedDate,updatedTime\n"
                + "Added phased initialization of javabean cartridge,P1,Ataulfo,Teodorico,2010-10-01,13:10,2010-10-10,20:01\n"
                + "Processing recursive tree like structures with the Javabean Cartridge,P3,Eurico,,2010-10-02,07:15,2010-11-15,09:45"
        );
    
        private Smooks smooks;
        private ExecutionContext executionContext;
    
        private List expIssues;
        private Project expProject = new Project("Smooks");
    
        @Before
        public void initSmooks() throws Exception {
            smooks = new Smooks(getResourceFromClassLoader("smooks-config.xml"));
            executionContext = smooks.createExecutionContext();
            executionContext.getBeanContext().addBean("project", expProject);
            executionContext.getBeanContext().addBean("prioritizer", new IssuePrioritizer());
        }
    
        private InputStream getResourceFromClassLoader(String name) {
            return getClass().getClassLoader().getResourceAsStream(name);
        }
    
        @Before
        public void createExpIssues() throws Exception {
            expIssues = Arrays.asList(
                new Issue("Added phased initialization of javabean cartridge", expProject, Priority.LOW,
                    "Ataulfo;Teodorico", DATETIME_FORMAT.parse("2010-10-01 13:10"), DATETIME_FORMAT.parse("2010-10-10 20:01")
                ),
                new Issue(
                    "Processing recursive tree like structures with the Javabean Cartridge", expProject, Priority.MEDIUM,
                    "Eurico", DATETIME_FORMAT.parse("2010-10-02 07:15"), DATETIME_FORMAT.parse("2010-11-15 09:45")
                )
            );
        }
    
        @Test
        public void process() throws Exception {
            smooks.filterSource(executionContext, importedIssues);
    
            List issues = (List) executionContext.getBeanContext().getBean("issues");
            assertEquals(expIssues, issues);
        }
    }
    

    Assign a constant object (from your domain model) to a property

    According to the Smooks manual, bean context is the place where JavaBean cartridge puts newly created beans. We can add our own bean (Project):

    executionContext = smooks.createExecutionContext();
    executionContext.getBeanContext().addBean("project", new Project("Smooks"));
    

    … and reference it in the Java binding configuration:

    
        ....
        
        ...
    
    

    Use an external object to calculate property value

    Similar to the previous tip we add an additional bean (IssuePrioritizer) to the bean context:

    executionContext = smooks.createExecutionContext();
    executionContext.getBeanContext().addBean("prioritizer", new IssuePrioritizer());
    

    … and define an MVEL expression for the property. The MVEL expression uses the bean and references the value being processed (in this case coming from the CSV reader) by the implicit _VALUE variable:

    
        ....
        
            prioritizer.assignPriorityFromCode(_VALUE)
        
        ...
    
    

    Set compound properties in Java binding

    It is not possible to map directly two source fields to a Java bean property. Java bindings with and are executed on a SAX visitAfter event bound to to a single XML element/CSV field. We have to define a binding for a helper Map bean with the fields we want to merge:

    
        
        
        ...
    
    

    … and use an MVEL expression that concatenates two fields using the helper map bean (transformedProps):

    
        ...
        
            transformedProps["reporter"]
                + (org.apache.commons.lang.StringUtils.isNotBlank(transformedProps["assignee"]) ? ";" + transformedProps["assignee"] : "")
        
        ...
    
    

    Set datetime from date and time fields

    In this transformation we have to both merge and convert values of two fields.

    In the first solution, we create a separate setter for the date and time part in the target Issue class (Smooks uses setters in Java binding):

    public class Issue {
        ...
        public void setCreatedDatePart(Date createdDatetime) {
            createCreatedIfNotInitialized();
            copyDatePart(createdDatetime, created);
        }
    
        public void setCreatedTimePart(Date createdDatetime) {
            createCreatedIfNotInitialized();
            copyTimePart(createdDatetime, created);
        }
        ...
    }
    

    … and then use a standard value binding with date decoder:

    
        ...
        
            yyyy-MM-dd
        
        
            HH:mm
        
        ...
    
    

    The advantage of this approach is that you make use of Smooks decoder infrastructure. You can configure the transformation with your own decoders (e.g. custom java.util.Date allowing to specify multiple date formats). If you are using the built-in DateDecoder, you can catch and handle a standard DataDecodeException.

    The disadvantage is that you have to change your domain model code. New methods add complexity and must be unit tested, especially in cases when only one of partial setter is called.

    In the second solution, you define a binding for a helper Map bean with the date and time fields. In the right binding you use an MVEL expression concatenating date and time strings and converting them to Date (e.g. using a java.text.SimpleDateFormat instance):

    
        ...
        
            updated = transformedProps["updatedDate"] + " " + transformedProps["updatedTime"];
            new java.text.SimpleDateFormat("yyyy-MM-dd HH:mm").parse(updated)
        
        ...
    
    

    The advantages of the first solution are disadvantages of the second one. You don’t touch your Java classes. It is simple – you have to specify only the Smooks configuration. In case of handling of many date/time formats and their combinations, the MVEL expression defining the conversion can become complicated. In case of an exception, you won’t get DataDecodeException, but an ugly, generic ExpressionEvaluationException.

    Conclusions

    Smooks is a great library that will save you writing a lot of code in case of processing many different formats. With a few lines of code and the XML configuration you will be able to read a file and persist its contents in the database using your favourite ORM framework.

    However, Smooks processing model and its usage in built-in cartridges make sometimes difficult to configure the transformation for a real world requirement. The information provided in the user guide is sometimes scarce and unclear. I hope these practical cases will help you use Smooks Java bean mappings and MVEL expressions more effectively.

SCA Async/Conversational services Part 2: Non-SCA Web Service client

Following my previous post on the internals of asynchronous and conversational services in Tuscany SCA, which options are available for consuming these services when you cannot use a Tuscany SCA runtime on the client?.

Depending on the transport binding used you would expect to find a certain level of standarisation on conversational/asynchronous services implementation, allowing interoperable solutions. However, it is difficult, if not impossible, to find interoperable solutions for these types of services, even for highly standarised bindings such as SOAP Web Services. We have seen in the previous post how Tuscany SCA handles these services for WS and JMS bindings. We saw that at least for Web Services, some WS-* standards were used (i.e. WS-Addressing), but there are still solution-specific details that do not allow interoperability. This reflects how difficult it is for the standarisation community to define detailed standards and mechanisms to enable interoperable solutions for these kind of services.

In this post I present an option for this situation, using an standard JAX-WS client, attaching the appropriate WS-Addressing headers using JAXB. For that, I extend the pure Tuscany SCA client-server sample of the previous post with a non-SCA client.

You can find the extended source code of the maven projects here.

Overview

The two main pieces to setup on a JAX-WS client are:

  1. Proper WS-Addressing headers containing the Tuscany’s specific information (i.e callbackLocation, conversationID and optionally callbackID).
  2. A web service implementing the callback interface, listening in callbackLocation URI.

JAX-WS WS-Addressing headers setup

As for a normal WSDL-first client development, the first step is to generate the JAX-WS client classes from the Counter Service WSDL. This is done using the maven jaxws-maven-plugin on the wsdl file generated for the service (i.e. http://localhost:8086/CounterServiceComponent?wsdl).

To setup the SOAP WS-Addressing headers, I follow the recommendations from the JAX-WS guide and use WSBindingProvider interface, which offers a much better control on how headers can be added.

Therefore, in the constructor of CounterServiceClient.java, the WS-Addressing “TuscanyHeader” is added:

     public CounterServiceClient() {
        super();
        
        // Initialise JAX-WS Web Service Client.
        service = new CounterServiceService();
        servicePort = service.getCounterServicePort();
  
        // Set the WS-Addressing headers to use by the client.
        WSBindingProvider bp = (WSBindingProvider) servicePort;
        // Generate UUID for Tuscany headers
        conversationId = UUID.randomUUID().toString();
        callbackId = UUID.randomUUID().toString();
        ReferenceParameters rp = new ReferenceParameters(conversationId);
        rp.setCallbackId(callbackId);
        TuscanyHeader header = new TuscanyHeader(CALLBACKURI, rp);        
        bp.setOutboundHeaders(Headers.create((JAXBRIContext) jaxbContext,
                header));
        
        ...
    }        

The ReferenceParameters and TuscanyHeader classes are JAXB classes with the required information to map the TuscanyHeader Java object to the WS-Addressing XML header. For instance, the ReferenceParameters class, which includes the Tuscany SCA parameters, has the following JAXB definitions:

public class ReferenceParameters {
    private static final String TUSCANYSCA_NS = "http://tuscany.apache.org/xmlns/sca/1.0";
    /** The conversation id. */
    private String conversationId = "";
    /** The callback id. */
    private String callbackId = null;

    ...
    public void setCallbackId(String callbackId) {
        this.callbackId = callbackId;
    }

    public void setConversationId(String conversationId) {
        this.conversationId = conversationId;
    }

    @XmlElement(name = "CallbackID", namespace = TUSCANYSCA_NS, required = false)
    public String getCallbackId() {
        return callbackId;
    }

    @XmlElement(name = "ConversationID", namespace = TUSCANYSCA_NS, required = true)
    public String getConversationId() {
        return conversationId;
    }
}

JAX-WS callback service setup

The callback service needs to be created on the client side. For that we define a simple web service (CounterServiceCallbackImpl.java) with the CounterServiceCallback as the contract.

/**
 * The CounterServiceCallback implementation.
 * The Web service namespace must be the same as the one defined in the Server Side.
 */
@WebService(targetNamespace = "http://server.demo.tuscany.sca.tsl.com/")
public class CounterServiceCallbackImpl implements CounterServiceCallback {
	private int count;

	@Override
	public void receiveCount(int count, int end) {
		System.out.println("CounterServiceCallback --- Received Count: " + count + " out of " + end);
		this.count = count;
	}
        ...
}

In this sample project, the web service is also started during the instantiation of the JAX-WS Client. We use the embedded Sun httpserver to publish the web service instead of relying in other web application servers as Jetty or Tomcat:

     public CounterServiceClient() {
        super();
        ...
        
        // Setup Receiver Callback Web Service
        System.out.println("Starting Callback service in " + CALLBACKURI);
        callback = new CounterServiceCallbackImpl();
        callbackServiceEp = Endpoint.create(callback);
        callbackServiceEp.publish(CALLBACKURI);
    }        

Testing the client

As in the previous post, in order to run the client tests you need to:

  • Run the server side, executing the run-server.sh under the counter-callback-ws-service directory. The server component should start and wait for client requests.
  • Go to the client project, counter-callback-ws-client-SCA and execute the tests with:
    mvn clean test
    

    This runs both the SCA and non-SCA client tests.

  • In case you want to run two clients at the same time, you need to define two separate CallbackURIs. For that purpose the property sca.callbackuri has been defined to configure the URI to use. Therefore to run two clients in parallel execute
    mvn  test -Dsca.callbackuri="http://localhost:1999/callback" -Dtest=com.tsl.sca.tuscany.demo.client.WebServiceNonSCAClientTest 
    and another client with 
    mvn  test -Dsca.callbackuri="http://localhost:2999/callback" -Dtest=com.tsl.sca.tuscany.demo.client.WebServiceNonSCAClientTest 
    

    Once running you should see the different service calls and conversations IDs interleaved in the server side console:

    setCounterServiceCallback on thread Thread[Thread-2,5,main] 
    ConversationID = 81e257cd-af3b-4ea6-ae69-4b8e0d9db8a9. Starting count 
    Sleeping ...
    setCounterServiceCallback on thread Thread[Thread-5,5,main]                 
    ConversationID = 70e86dcf-436c-4df1-ab6b-165b3f9070a4. Starting count   
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    ConversationID = 81e257cd-af3b-4ea6-ae69-4b8e0d9db8a9. Stopping count
    Sleeping ...
    ConversationID = 70e86dcf-436c-4df1-ab6b-165b3f9070a4. Stopping count
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    ConversationID = 81e257cd-af3b-4ea6-ae69-4b8e0d9db8a9. Restarting count
    Sleeping ...
    ConversationID = 70e86dcf-436c-4df1-ab6b-165b3f9070a4. Restarting count
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Thread interrupted.
    Thread interrupted.
    

Conclusions

I hope that these posts have shed some light on how SCA and conversational/async services are implemented taking Tuscany as SCA runtime reference. I also believe that it is important to know the available options for consuming a SCA service without an SCA runtime, and how we can do it in a simple manner.

SCA Async/Conversational services Part 1: Internals of Tuscany SCA

Sometime ago I wrote about developing applications with SCA on a JBI-based infrastructure, using simple SCA services for that.

I’m coming back again with two separate SCA blog posts discussing the usage of more complex services, asynchronous and conversational services:

  • In this post, I provide an example client-server project that implements a conversational and asynchronous service in Tuscany SCA, digging into the internals of the implementation for handling these services using Web Services and JMS bindings. I found this information quite difficult to find.
  • The second post looks into the situation where the usage of a Tuscany SCA runtime is not possible in the client side, but we still need a way to make use of these type of services from a more standard web service client. For the specific case of a web service binding, and knowing the internals of Tuscany I will use a standard jax-ws client to consume a Tuscany SCA conversational and asynchronous service.

Both posts use a common example project, a Counter service that notifies to the client each time the count is updated (every second) via an asynchronous callback. The service is conversational and allows clients to stop the count and restart it through separate independent service calls. Also, multiple clients can be run in parallel, each one with its own counter service instance. Version 1.6 of Tuscany SCA is used in this sample.

The source code of the sample counter service for this post can be found here.

SCA asynchronous and conversational service definition

Let’s refresh how to define a service in Tuscany SCA that is both conversational and defines a callback interface for asynchronous calls to the client. Good references for this are the SCA specifications, the Tuscany SCA documentation and the sample projects included in the distribution. However, while I could find several callback projects on the Tuscany SCA samples there are none that exercise conversations.

Below, you find the interface definition of the SCA service in Java for the CounterService (counter-callback-ws-service maven project).

I use the @Callback and @Conversational annotations to define our service. For the @Callback annotation we need also to define the callback interface class. Therefore, for our counter service we have two interfaces, one for the service which implementation is done in the server side, and another one for the callback which implementation is done in the client side.

/**
 * The remote service that will be invoked by the client
 */
@Remotable
@Callback(CounterServiceCallback.class)
@Conversational
public interface CounterService {

    @OneWay
    void startCounting(int maxcount);
    
    @OneWay
    void stopCounting();

    @OneWay
    void restartCounting();    
    
    @OneWay
    void shutdownCounting();     
}

The Callback interface is as simple as:

/**
 * The callback interface for {@link CounterService}.
 */
@Remotable
public interface CounterServiceCallback {

    void receiveCount(int count, int end);
}

The server side implementation of the service looks like:

/**
 * This class implements CounterService and uses a callback.
 */
@Service(CounterService.class)
@Scope("CONVERSATION")
public class CounterServiceImpl implements CounterService {
	@ConversationID
	protected String conversationID;

	/**
	 * The setter used by the runtime to set the callback reference
	 * 
	 * @param counterServiceCallback
	 */
	@Callback
	public void setMyServiceCallback(CounterServiceCallback counterServiceCallback) {
		System.out.println("setCounterServiceCallback on thread "
				+ Thread.currentThread());
		this.counterServiceCallback = counterServiceCallback;
	}

On the implementation, the @Service SCA annotation is used to identify the SCA service being implemented and the @Scope which specifies a visibility and lifecycle contract an implementation has with the SCA runtime. Possible values of @Scope are STATELESS, REQUEST, CONVERSATIONAL and COMPOSITE. For this specific service we need to use CONVERSATIONAL, instructing Tuscany that conversation state must be kept in order to correlate interactions between a client and the service. For a description of the rest of scopes, refer to the SCA Java Annotations and API reference specification.

Last, we use the @Callback annotation to instruct Tuscany SCA where to inject the Callback reference to be used by the service.

The server side SCA composite is very similar than the Tuscany SCA official callback-ws-client and you can find it under the “src/main/resources/counterws.composite“. The binding used for the sample is a Web Service binding.

Tuscany SCA Counter Service Client

On the client side (counter-callback-ws-client-SCA maven project) we need to define a client interface that makes use of the counter service. To make it simple, the interface has a method that returns the reference to the Counter Service and an additional helper method to get the current count recorded in the client based on the service callbacks. As in the server side, we have to make sure to define the interface as @Conversational and the implementation with @Scope(“CONVERSATION”) so Tuscany populates the service call with conversation information.

Also, to simplify the sample code, the shared interfaces between client and server (i.e. CounterService and CounterServiceCallback) are included in both projects separately. Ideally, these interfaces would be on a separate interfaces library used by both projects.

/**
 * The client component interface.
 */
@Conversational
public interface CounterServiceClient {
    
    public CounterService getCounterService();
    
    public int getCount();
    
}

The client composite is as listed below. It defines the reference to the counter Service and the associated callback. The url defined in the callback is used by Tuscany SCA in the client side to start the listening callback web service:



    
        
        
	        
	        
	        
	            
            
        
    

For the testing, a unit test is setup (WebServiceSCAClientTest.java), exercising the counter service, callbacks and conversations.

Running the counter service sample project

To run the counter service sample, extract the source code and:

  • Run the server side, executing the run-server.sh script in the counter-callback-ws-service directory. The server component should start and wait for client requests.
    ...
    Nov 8, 2010 4:11:23 PM org.apache.coyote.http11.Http11Protocol start
    INFO: Starting Coyote HTTP/1.1 on http-8086
    Callback server started (press enter to shutdown)
    Nov 8, 2010 4:11:23 PM org.apache.tuscany.sca.http.tomcat.TomcatServer addServletMapping
    INFO: Added Servlet mapping: http://localhost:8086/CounterServiceComponent
    
  • Go to the client project, counter-callback-ws-client-SCA and execute the tests with:
    mvn clean test
    
  • On the client side you should see how the client starts the counting service, receives the callbacks and how the counting is stopped and later restarted, continuing the count where it was left.
    ...
    Nov 8, 2010 4:11:28 PM org.apache.tuscany.sca.http.tomcat.TomcatServer addServletMapping
    INFO: Added Servlet mapping: http://localhost:1999/callback
    Starting Count and waiting 5 seconds for counts...
    CounterServiceCallback --- Received Count: 0 out of 30
    CounterServiceCallback --- Received Count: 1 out of 30
    CounterServiceCallback --- Received Count: 2 out of 30
    CounterServiceCallback --- Received Count: 3 out of 30
    CounterServiceCallback --- Received Count: 4 out of 30
    Stopping Count and waiting 5 seconds for no counts...
    Restarting Count and waiting 5 seconds for counts...
    CounterServiceCallback --- Received Count: 5 out of 30
    CounterServiceCallback --- Received Count: 6 out of 30
    CounterServiceCallback --- Received Count: 7 out of 30
    CounterServiceCallback --- Received Count: 8 out of 30
    Stopping the Client Node
    Nov 8, 2010 4:11:44 PM org.apache.tuscany.sca.node.impl.NodeImpl stop
    ...
    
  • On the server side, you should see the client call and the associated conversationID.
    ...
    INFO: Added Servlet mapping: http://jmatute-laptop:8086/CounterServiceComponent
    setMyServiceCallback on thread Thread[Thread-2,5,main]
    ConversationID = d8a05f47-064b-4832-9cca-7ad035bd36ee. Starting count
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    ConversationID = d8a05f47-064b-4832-9cca-7ad035bd36ee. Stopping count
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    ConversationID = d8a05f47-064b-4832-9cca-7ad035bd36ee. Restarting count
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Sleeping ...
    Thread interrupted.
    

Having reached this point, let’s look into how Tuscany SCA is handling the conversations and callbacks and how the required information is included in the underling transport bindings (WS or JMS).

Internals of conversational and asynchronous services in Tuscany SCA

Tuscany SCA conversational services (i.e. marked with @Conversational) make use of conversation IDs in order to keep track and correctly map multiple invocations associated with a certain conversation.

SCA and Tuscany asynchronous services are similar to those defined, for instance, in BPEL where the forward and callback service calls are two complete separate service invocation calls. Therefore, Tuscany also requires additional information to know where the callback needs to be send and optionally provide a application correlation id for the callback call.

Summarising, the below table contains the information used by Tuscany SCA for conversation and callback services:

Tuscany information Description
conversationID UUID identifying the current conversation. This is used by Tuscany SCA to associate accordingly the service instance associated to that conversation.
callbackLocation The location of the callback service. This, depending on the binding used might be a URI for Web Service binding or a Queue name for JMS.
callbackID Application-specified callback ID, that is used by the application to correlate callbacks with the related application state.

The above information needs to be mapped to the specific binding and here is where no standarisation exists, making difficult to have interoperable solutions:

  • WebService Binding : for the Web Service binding, WS-Addressing headers are used in order to store the Tuscany SCA conversational/async required information. Below it is shown an example of a Tuscany Web Service conversational and asynchronous/bidrectional invocation:
    • The WS-Addressing Address contains the Callback URI to be used for the callback invocation. This is normally setup by the client to notify the server where to send the callbacks.
    • The WS-Addressing ReferenceParameters contains the other two information fields under specific Tuscany Namespace, the CallbackID and the ConversationID.
    
    
        
            
                
    http://localhost:1999/callback
    45a963da-2074-4bb2-b9ee-f721e2ec753b 309b8322-1dc2-4c51-a4db-73d65edae391
    30
  • JMS Binding : For JMS, Tuscany SCA uses JMS Message properties to store the required information. A screenshot (see below) of ActiveMQ console shows an Tuscany SCA JMS Message for conversational and async/bidrectional service.
    1. CallbackID, scaConversationID and scaCallbackQueue, Tuscany SCA propietary JMS message properties to hold the information.

    Tuscany SCA JMS Message

    Tuscany SCA JMS Message

Conclusion

In this first post I have presented a Tuscany SCA example that covers both conversational and asynchronous scenarios, not currently available in the official Tuscany SCA samples, and looked into the internals of Tuscany SCA used to handle these services. This provides the basis for the next post, where I will be using this information to extend the project code with a non-SCA JAXWS-based web service client as an approach to consume these services without a Tuscany SCA runtime.

Human readable JVM GC timestamps

When we are diagnosing problems in a Java (EE or otherwise) application, is often a good idea to check how garbage collection is performing. One of the basic and most unobtrusive actions is to enable garbage collection logging.

As you may know, if we add the following arguments to the java start command…

-Xloggc:<file_name> –XX:+PrintGCDetails -XX:+PrintGCDateStamps

… the JVM will start writing garbage collection messages to the file we set with the parameter -Xlogcc. The messages should be something like:


2010-04-22T18:12:27.796+0200: 22.317: [GC 59030K->52906K(97244K), 0.0019061 secs]
2010-04-22T18:12:27.828+0200: 22.348: [GC 59114K->52749K(97244K), 0.0021595 secs]
2010-04-22T18:12:27.859+0200: 22.380: [GC 58957K->53335K(97244K), 0.0022615 secs]
2010-04-22T18:12:27.890+0200: 22.409: [GC 59543K->53385K(97244K), 0.0024157 secs]

The bold part is simply the date and time when reported garbage collection event starts.

Unfortunately -XX:+PrintGCDateStamps is available only for Java 6 Update 4 and later JVMs. So, if we are unlucky and our application is running on older JVMs we are forced to use…

-Xloggc:<file> –XX:+PrintGCDetails

… and the messages will be like:


22.317: [GC 59030K->52906K(97244K), 0.0019061 secs]
22.348: [GC 59114K->52749K(97244K), 0.0021595 secs]
22.380: [GC 58957K->53335K(97244K), 0.0022615 secs]
22.409: [GC 59543K->53385K(97244K), 0.0024157 secs]

Now, the bold numbers (also present in previous format) are the seconds elapsed from JVM start time.

Mmm… way harder to correlate GC events with information from other log files in this case :/

Wouldn’t it be easier to process the gc log file and calculate date and time from seconds elapsed? It seems so, but seconds elapsed from… when? Or, putting it in other words, where do we extract the JVM startup date and time from?

In order to be as unobtrusive as possible, we should try to calculate the start date and time from the same GC log file. That brings us to the file attributes. We have different options:

Unix Windows
Access time Access time
Change time Creation time
Modify time Modify time

We discard access time (for obvious reasons) and change time and creation time as they are not available in both platforms, so we are left with modification time, which represents the time when the file was last modified.

In Windows, modification time is maintained when the file is copied elsewhere, but in Unix we should use the -p flag to preserve timestamp attributes if we want to copy the GC log file prior to our processing.

The last modification time of the GC log file should match the last timestamp recorded for a GC event in the log file. Well… for the purists, it should match exactly the last elapsed time plus the execution time (both in bold) as each log line is written piece by piece as it executes.


22.409: [GC 59543K->53385K(97244K), 0.0024157 secs]

In our approach, we discard the execution time as we don’t need accurate precision to have a rough idea of what time each garbage collection event occurred. Nevertheless, keep in mind that GC execution time could sometimes be as long as several seconds in large heaps.

When we experienced this situation in a client recently, we needed to quickly develop a simple and portable script, so we used Python for the task. You already knew we don’t do just Java, didn’t you? 😛

#!/usr/bin/env python

import sys, os, datetime

# true if string is a positive float
def validSeconds(str_sec):
    try:
        return 0 < float(str_sec)
    except ValueError:
        return False
                
# show usage                
if len(sys.argv) < 2:
    print "Usage: %s " % (sys.argv[0])
    sys.exit(1)
    
file_str = sys.argv[1]
lastmod_date = datetime.datetime.fromtimestamp(os.path.getmtime(file_str))

file = open(file_str, 'r')
lines = file.readlines()
file.close()

# get last elapsed time
for line in reversed(lines):
    parts = line.split(':')
    if validSeconds(parts[0]):
        break

# calculate start time
start_date = lastmod_date - datetime.timedelta(seconds=float(parts[0]))
  
# print file prepending human readable time where appropiate  
for line in lines:
    parts = line.split(':')
    if not validSeconds(parts[0]):
        print line.rstrip()
        continue
    line_date = start_date + datetime.timedelta(seconds=float(parts[0]))
    print "%s: %s" % (line_date.isoformat(), line.rstrip())

The script output can be redirected to another file, where we’ll have

2010-04-22T18:12:27.796375: 22.317: [GC 59030K->52906K(97244K), 0.0019061 secs]
2010-04-22T18:12:27.828375: 22.348: [GC 59114K->52749K(97244K), 0.0021595 secs]
2010-04-22T18:12:27.859375: 22.380: [GC 58957K->53335K(97244K), 0.0022615 secs]
2010-04-22T18:12:27.890375: 22.409: [GC 59543K->53385K(97244K), 0.0024157 secs]

You may note the date format is not 100% the same as the one with -XX:+PrintGCDateStamps argument, but it should be enough to get an idea of when each GC event happened (Timezone management in Python is way out of the scope of this blog entry).

This has been my first blog entry for The Server Labs and I hope some of you find it useful. Of course, all comments, suggestions and feedback are very welcome.

Dynamically changing log level with Weblogic, log4j, JMX and WLST

Logging is an uninteresting but nevertheless fundamental part of application development. It is useful not just when you are coding an application but in diagnosing problems of any nature once an application passes into production.

Unfortunately, it is not that clear how to do logging well when developing JEE applications that are to be deployed on a Weblogic server. With this article, I hope to clear up some of the confusion or, at the very least, explain a method that works for me!

The article is based on the use of Apache log4j since that is the most commonly used logging library. The article explains:

  • how to configure log4j properly in a JEE app to be run on Weblogic
  • how to dynamically configure the log level using JMX and Weblogic Scripting Tool

The article assumes some familiarity with Apache Maven, Eclipse and Weblogic.

The code is available here, although you may just prefer to download the EAR file which saves you having to compile any code with Maven. The code.zip file contains the code for the JEE application, the code for the JMX library developed by The Server Labs and used in this article and the scripts for WLST.

Configuring log4j in Weblogic

Weblogic 10 comes with a modified version of log4j by default so you can use log4j in your application without having to deploy the log4j library in your application. Unfortunately, my experience (in WLS 10.0) is that if you do this, you will not be able to configure log4j for your application because the classloader in which log4j is loaded is one that is separate to that of your application. Any log4j.properties that you include will be ignored.

Instead, what you have to do is the following:

1) Include the log4j JAR in your application. If your app is an EAR (like mine), then the log4j JAR should go in the .ear file (in the root directory or in /APP-INF/lib). In my project, I have used a Maven multi-project setup with 3 projects – 1 for the EAR, 1 for the EJBs and 1 webapp. In this case, I have to add the log4j JAR to the EAR project’s pom.xml:

	
		
			log4j
			log4j
			1.2.12
		
	 

2) Tell Weblogic to use your version of the log4j library and not it’s own. To do this, you specify the following in the weblogic-application.xml file which is in the EAR (/META-INF/weblogic-application.xml):

	
		org.apache.log4j.*
	

This basically says that when the app asks for a class in the package org.apache.log4j or it’s sub-packages, that it should look for it on the app’s classloader and not in the Weblogic server classloader.

3) Configure log4j using log4j.properties

Log4j requires that you configure it in order to receive log messages. I include the following contents in the /APP-INF/classes/log4j.properties file which is in the EAR file:

log4j.rootLogger=INFO, stdout, file

log4j.debug=true

log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=[%d] %-5p %c %x - %m%n

log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.File=/bea/logs/aplicaciones/blog-logging/logging_${weblogic.Name}.log
log4j.appender.file.Append = false
log4j.appender.file.layout = org.apache.log4j.PatternLayout
log4j.appender.file.MaxFileSize = 20000Kb
log4j.appender.file.MaxBackupIndex = 5
log4j.appender.file.layout.ConversionPattern=[%d] %-5p %c %x - %m%n


log4j.category.com.theserverlabs.blog.wls=DEBUG

This configures log4j to send output both to the console in which Weblogic was started and to a file (/bea/logs/aplicaciones/blog-logging/logging_${weblogic.Name}.log). You will probably want to change the file path to something that works on your machine.

Testing that log4j is working OK

In order to test these changes, download the EAR file and deploy it to a Weblogic server. I tested on Weblogic 10.3 but it should work on any Weblogic 10.x server or later I believe.

Go to the URL http://localhost:7001/logging/test and you should see a very basic HTML page that tells you to check the logs:

servlet

Take a look at the console in which weblogic is running (or the Server.out file if you have the output redirected) and you should see output like that shown below:

[2010-04-21 15:29:47,888] WARN  com.theserverlabs.blog.wls.logging.view.servlet.LoggingTestServlet  - warn
[2010-04-21 15:29:47,888] INFO  com.theserverlabs.blog.wls.logging.view.servlet.LoggingTestServlet  - info
[2010-04-21 15:29:47,888] DEBUG com.theserverlabs.blog.wls.logging.view.servlet.LoggingTestServlet  - debug
[2010-04-21 15:29:47,889] ERROR com.theserverlabs.blog.wls.logging.view.servlet.LoggingTestServlet  - error
[2010-04-21 15:29:47,889] FATAL com.theserverlabs.blog.wls.logging.view.servlet.LoggingTestServlet  - fatal
[2010-04-21 15:29:48,056] WARN  com.theserverlabs.blog.wls.logging.services.ejb.SessionLoggingTestService  - warn
[2010-04-21 15:29:48,056] INFO  com.theserverlabs.blog.wls.logging.services.ejb.SessionLoggingTestService  - info
[2010-04-21 15:29:48,056] DEBUG com.theserverlabs.blog.wls.logging.services.ejb.SessionLoggingTestService  - debug
[2010-04-21 15:29:48,056] ERROR com.theserverlabs.blog.wls.logging.services.ejb.SessionLoggingTestService  - error
[2010-04-21 15:29:48,056] FATAL com.theserverlabs.blog.wls.logging.services.ejb.SessionLoggingTestService  - fatal

If you examine the application, you can see that accessing the URL causes a Servlet to get executed. The servlet writes it’s messages to the log (those recorded with the class “com.theserverlabs.blog.wls.logging.view.servlet.LoggingTestServlet”) and invokes an EJB3 session bean which also writes messages to the log (those tagged with “com.theserverlabs.blog.wls.logging.services.ejb.SessionLoggingTestService”).

This shows that this log solution works equally for web as for EJB and that there is no interference with the weblogic logging system.

Making it dynamic

This is a great solution when you’re still developing the app because you can change the debug level just by modifying the log4j.properties file and redeploying the app. However, this is not something you can feasibly do in a production environment – not at the companies I’ve worked at anyway – due to the interruption in service this causes to the end users.

Why would you want to change the log level? Well, if you have to diagnose a problem in the application, it often makes sense to increase the log level of either the whole application or a subset of it such that you receive more information which helps you to diagnose the problem. Once you have diagnosed the problem, you return the logging to it’s original level so as not to affect the performance of the application unnecesarily.

What we really want is to be able to change the log level dynamically (i.e. without redeploying the application), ideally using administrator tools (since that is what the people who manage production systems tend to use) – not developer tools.

First I’ll show you how to change the log level and then I’ll explain how it works.

Using WLST and JMX to change the log level

If you have deployed the example application, go to your domain home in a terminal window and execute the setDomainEnv script. The idea is to update your shell with the domain environment variables such as JAVA_HOME, CLASSPATH etc. On my linux box, I execute the following:

$ cd $DOMAIN_HOME/bin
$ source setDomainEnv.sh

To check it has worked, run “echo $CLASSPATH” on unix or “echo %CLASSPATH%” on windows and you should see a long classpath.

Once you have set up the environment, execute the changeAppLogLevel.py script that is included in the /scripts directory of the code ZIP:

java -classpath $CLASSPATH  $JAVA_OPTIONS  weblogic.WLST changeAppLogLevel.py

You should see the following output:

$ java -classpath $CLASSPATH  $JAVA_OPTIONS  weblogic.WLST changeAppLogLevel.py

Initializing WebLogic Scripting Tool (WLST) ...

Welcome to WebLogic Server Administration Scripting Shell

Type help() for help on available commands

==>Insufficient arguments
==>Syntax: changeAppLogLevel.py [APP NAME] [LOGGER] [LEVEL]
==> where: APP NAME=context root of the app e.g. xxx for /xxx
==>        LOGGER=log4j logger name
==>        LEVEL=TRACE/DEBUG/WARN/INFO/ERROR/FATAL
==>   e.g: changeAppLogLevel.py xxx com.theserverlabs.blog.logging DEBUG

As the instructions indicate, you have to call the script with the name of the app (which is the context root of the application web), the Log4j logger that you wish to modify (normally a Java package name) and the level that you want to set (TRACE/DEBUG/WARN/INFO/ERROR/FATAL).

In the example below, we set the log level to INFO for the package com.theserverlabs.blog.wls in the application “logging”, which is what the example application is called. The output is the following:

java -classpath $CLASSPATH  $JAVA_OPTIONS  weblogic.WLST changeAppLogLevel.py logging com.theserverlabs.blog.wls ERROR 

Initializing WebLogic Scripting Tool (WLST) ...

Welcome to WebLogic Server Administration Scripting Shell

Type help() for help on available commands

Connecting to t3://localhost:7001 with userid weblogic ...
Successfully connected to Admin Server 'AdminServer' that belongs to domain 'log4j'.

Warning: An insecure protocol was used to connect to the 
server. To ensure on-the-wire security, the SSL port or 
Admin port should be used instead.

Location changed to custom tree. This is a writable tree with No root.
For more help, use help(custom)

changing log level to ERROR for app logging on server t3://localhost:7001
  ... done. New log levels are:
{com.theserverlabs.blog.wls=ERROR}


Exiting WebLogic Scripting Tool.

As you can see from the line “New log levels are:{com.theserverlabs.blog.wls=ERROR}”, the log level was changed from the default in the log4j.properties file (DEBUG) to ERROR.

If you go to the logging test page again (http://localhost:7001/logging/test), you should see that there are no longer entries for the DEBUG, INFOR or WARN levels:

[2010-04-21 17:47:09,728] ERROR com.theserverlabs.blog.wls.logging.view.servlet.LoggingTestServlet  - error
[2010-04-21 17:47:09,728] FATAL com.theserverlabs.blog.wls.logging.view.servlet.LoggingTestServlet  - fatal
[2010-04-21 17:47:09,728] ERROR com.theserverlabs.blog.wls.logging.services.ejb.SessionLoggingTestService  - error
[2010-04-21 17:47:09,728] FATAL com.theserverlabs.blog.wls.logging.services.ejb.SessionLoggingTestService  - fatal

So, we can now dynamically change the log level of a logger in our application using an administrator tool (WLST). Pretty cool, huh?

Here I have specified a single logger “com.theserverlabs.blog.wls” which is a high-level logger in my application. You can only pass 1 logger as arguments to the script but there is nothing to stop you from running the same script different times with different loggers e.g. once to set com.theserverlabs.blog.wls.logging.services.ejb to DEBUG to get detailed logging of EJB session beans and another time to set org.hibernate to INFO to get more info on what hibernate is doing.

How do I get this to work in my application?

Simple. Include our JMX-logging JAR in your WEB application and add the following lines to your web.xml file:

	
		com.theserverlabs.jmx.web.JmxServletContextListener
	

At startup time, a JMX Mbean will be registered with the Weblogic JMX subsystem. The Mbean will be registered under a name which includes your application name, meaning there will be one MBean per application deployed on the server.

Once deployed, you can use the WLST script discussed in the previous section to interact with the MBean for your application and thereby change the log level of your application.

OK, how does all this really work?

If you look at the source code for the jmx-logging JAR (included in the source code bundle, you will find a class called com.theserverlabs.jmx.ApplicationLogManager. This class implements the interface ApplicationLogManagerMBean, whose methods are JMX MBean-compatible. It provides methods that allow a caller to set the log level for a given log4j logger:

	public void setLogLevel(String logger, String level) {
		Logger l = Logger.getLogger(logger);
		l.setLevel(Level.toLevel(level));
	}

and to interrogate log4j for the currently-configured loggers and their levels, returning a map of Strings:

	@SuppressWarnings("unchecked")
	public Map getLogLevels() {
		HashMap result = new HashMap();
		Enumeration e = Logger.getRootLogger().getLoggerRepository().getCurrentLoggers();
		while (e.hasMoreElements()) {
			Logger l = e.nextElement();
			if (l.getLevel() != null) {
				result.put(l.getName(), l.getLevel().toString());
			}
		}
		return result;
	}

This MBean can be registered with any JMX system, allowing it’s methods to be invoked via JMX calls. In this case, we really want to register it with the Weblogic JMX system when the application starts up. To accomplish this, we use the class com.theserverlabs.jmx.web.JmxServletContextListener, which is registered in the web.xml of the web application:

	
		com.theserverlabs.jmx.web.JmxServletContextListener
	

A modified version of the source code for JmxServletContextListener is shown below which error-handling and logging removed for clarity. It is pretty clear that when Weblogic starts our application and calls the contextInitialized() method, we register a new instance of ApplicationLogManager with the object name “com.theserverlabs.jmx:type=ApplicationLogManager,name=[NAME OF APPLICATION]LogManager”. This name is generated in the getObjectName() method. When Weblogic stops the web application, it calls the contextDestroyed() method which un-registers the instance of ApplicationLogManager from the JMX MBean server.

public class JmxServletContextListener implements ServletContextListener {

	private InitialContext ctx;
	private MBeanServer server;

	{
	    ctx = new InitialContext();
	    server = MBeanServer.class.cast(ctx.lookup("java:comp/env/jmx/runtime"));
	}

	public void contextInitialized(ServletContextEvent sce) {
		server.registerMBean(new ApplicationLogManager(), getObjectName(sce));
	}

	public void contextDestroyed(ServletContextEvent sce) {
		if (server.isRegistered(getObjectName(sce))) {
		    server.unregisterMBean(getObjectName(sce));
		} 
	}

	private ObjectName getObjectName(ServletContextEvent sce) throws MalformedObjectNameException, NullPointerException { 
		String appName = sce.getServletContext().getContextPath().substring(1);
		return new ObjectName("com.theserverlabs.jmx:type=ApplicationLogManager,name=" + appName + "LogManager");
	}

}

The WLST script

That’s pretty much all there is to it in terms of application code. The only thing left to discuss is the Weblogic Scripting Tool (WLST) script that modifies the log level. Below is the source code (it is Python code):

def changeServerAppLogLevel(app, logger, level, serverUrl):

    ## connect to the server
    connect("weblogic", "weblogic", url=serverUrl)

    ## go to the custom MBean tree 
    custom()

    ## go to the place where our app log level mbeans are stored. 
    cd('com.theserverlabs.jmx/com.theserverlabs.jmx:type=ApplicationLogManager,name=' + app + 'LogManager')

    ## horrible code necessary to invoke a Java method that changes the log level
    args = jarray.array([java.lang.String(logger), java.lang.String(level)],java.lang.Object)
    sig = jarray.array(['java.lang.String', 'java.lang.String'],java.lang.String)
    print 'changing log level to ' + level + ' for app ' + app + ' on server ' + serverUrl
    invoke('setLogLevel',args,sig)

    print '  ... done. New log levels are:'
    print(get('LogLevels'))

# "main method" of the python script
argslength = len(sys.argv)

if argslength < 3 :

    print '==>Insufficient arguments'
    print '==>Syntax: changeAppLogLevel.py [APP NAME] [LOGGER] [LEVEL]'
    print '==> where: APP NAME=context root of the app e.g. xxx for /xxx'
    print '==>        LOGGER=log4j logger name'
    print '==>        LEVEL=TRACE/DEBUG/WARN/INFO/ERROR/FATAL'
    print '==>   e.g: changeAppLogLevel.py xxx com.theserverlabs.blog.logging DEBUG'
    exit()

else:

    app = sys.argv[1]
    logger = sys.argv[2]
    level = sys.argv[3]

    # change the log level for the server. if there was more than one server
    # in the cluster, you could add multiple calls here, each with a different 
    # server URL
    changeServerAppLogLevel(app, logger, level, 't3://localhost:7001')
    
    exit()

Hopefully the majority of this code will be clear, even if you don’t understand python. The method/function changeServerAppLogLevel() declared at the start modifies the log level for a defined logger in an application running on a specific Weblogic server. To do this it:

  • Connects to the Weblogic server at the specified URL (note that this example uses the hardcoded username and password weblogic/weblogic
  • Moves to the custom JMX MBean tree which is where Weblogic puts all non-Weblogic MBeans
  • Moves to the place where our MBean is stored in the tree
  • Executes the setLogLevel() method of the MBean with the required arguments to change the log level. This translates to an invocation of the ApplicationLogManager.setLogLevel() method.
  • Lists the new log levels

In the “main” method of the script (i.e. everything that is not in a method/function), there is some basic code to make sure the script is properly invoked and, if it is, a call to changeServerAppLogLevel() with the supplied arguments and the hard-coded URL of ‘t3://localhost:7001’. Obviously you should modify this value to reflect your Weblogic environment.

One interesting aspect of using a script to do this is that you can include a call to changeServerAppLogLevel() for each server in your cluster, if you have a multi-node setup. MBeans are published at server level so there is no concept of publishing a cluster-wide MBean (so far as I know anyway).

An interesting extension of this script would be to interrogate a Weblogic Admin server to find all the nodes in the cluster that were up and then execute the code to change the log level against each available node.

Conclusion

Hopefully this article has cleared up any confusion as to how to perform logging with log4j in a JEE app deployed on Weblogic. I think that the ability to modify dynamically the log level for a given log4j logger without having to redeploy the application is extremely useful and hopefully will be of use to people looking to do a similar thing either with Weblogic or any other JEE server that supports JMX.