ESB - The Swiss Army knife for cloud integration

Few applications are entirely self-contained, and even fewer in the Cloud. When an application has more than one piece, the pieces have to talk to each other. Unfortunately it is not always quick and easy to implement a direct connection between two pieces. Legacy code, long development cycles, obscure communication protocols, third-party solutions, and political obstacles can conspire to isolate components. To help connect and manage communications between and within applications you can construct one or more "Enterprise Service Buses", or ESBs.

The ESB’s job in the infrastructure is to facilitate communications between all sorts of things. It will have a Swiss Army Knife’s worth of connectors, adapters, communication protocols, and translation and transformation capabilities. It should be relatively easy to connect these parts and wire practically anything to anything else. ESBs usually include a menu of data flow controls to route, trigger, intercept, and log the message traffic.

You can also leverage an ESB as a tool for keeping track of where everything is. When something in the infrastructure moves, and a number of other things communicate with it, you could update each of them one at a time and hope you got them all, or you could just update the ESB. Simpler, centralized administration can be a big win when you are in a highly dynamic cloud environment.
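To make the "just update the ESB" idea concrete, here is a minimal sketch in Python. It is not taken from any particular ESB product; the `EndpointRegistry` class, the `send` helper, the service name, and the addresses are all hypothetical. The only point is that callers look up a logical name at send time, so a move becomes a single registry update.

```python
class EndpointRegistry:
    """Maps logical service names to their current network locations."""

    def __init__(self):
        self._endpoints = {}

    def register(self, name, address):
        # Registering an existing name again simply replaces its address.
        self._endpoints[name] = address

    def resolve(self, name):
        try:
            return self._endpoints[name]
        except KeyError:
            raise LookupError(f"no endpoint registered for {name!r}") from None


def send(registry, service_name, payload):
    """Resolve the address at send time instead of hard-coding it in the caller."""
    address = registry.resolve(service_name)
    # A real bus would open a connection here; this sketch only reports the route.
    return f"{payload} -> {address}"


registry = EndpointRegistry()
registry.register("billing", "10.0.1.15:8080")  # hypothetical address
first = send(registry, "billing", "invoice-42")

# The billing service moves to a new host: one update, no caller changes.
registry.register("billing", "10.0.7.3:8080")  # hypothetical address
second = send(registry, "billing", "invoice-43")
```

The callers never see the address, which is what keeps the administration in one place.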

You can configure a set of ESB servers behind a load balancer, so you can move them too.  You can even use an ESB as a sort of load balancer for some odd protocol or another.   There are a number of possible architecture configurations and you may want to implement more than one of them within your cloud.

Some ESB technology is embeddable in your application.  Rather than connecting to a stand-alone ESB server, you could set up an ESB service within your application for internal and external data exchanges.

The advantage of an Open Source ESB solution is that you can extend and customize the tool as needed. The very nature of an ESB, which must be a flexible interconnector and data router, means that you will probably want to extend and customize it to glue together some odd connection here or there. There are several Open Source ESB solutions to choose from. Here are a few, all of which are backed with optional commercial support:

Mule ESB

JBoss ESB

[GlassFish] OpenESB

Apache ServiceMix

Apache Synapse

OpenESB just put out their first release candidate for their first version of an ESB. They have a really impressive website and feature set. These guys are going to be major players in this space in the not too distant future.

JBoss ESB seems to be somewhat disorganized and struggling to gain traction.  It is difficult to approach without a good understanding of JBoss.  Data routing and transformations can get surprisingly complex and convoluted very quickly.  I think it is important that an ESB implementation be as clear and simple as possible so as to not make things even more confusing.  My first impression was that the JBoss solution would not lend itself well to clear and simple implementations.

Why are there two Apache ESBs? I have no idea. Both seemed somewhat dated and not as comprehensive as OpenESB or Mule. I get the sense there was a split in the community and that the rivalry did neither product justice. They were sufficiently lacking in obvious support for a few key communication protocols that I spent little time evaluating them further.

Mule is fairly mature (the whole concept of an "ESB", as such, is still fairly new). Some of the key brains behind Mule are some of the pioneers of ESB development. Mule recently released version 2 (and quickly buried version 1). Version 2 is still a little buggy but it is getting better every day and the bugs are increasingly obscure. (Version 2.1.2 was just released this week.) The documentation varies from extremely sparse and obtuse to very detailed and clear. Free support from the user forums is not bad. Owners of business-critical ESBs should probably leverage one of the commercial support options.

Terminology varies depending on the ESB solution you are working with.  Not only are they likely to use different terms to describe the same function but they sometimes use the same terms to describe different things. (!)

All of them follow the same basic flow structure.  Data comes in, runs through some business logic, and goes out.  Flow variations can include self-triggered data (via timers), externally triggered data, data flow triggered data, multi-casting, inbound data aggregation, outbound data aggregation, sequential processing, reverse or custom reply paths, combinations of things, and fragments.
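The basic flow is easy to sketch in a few lines of Python. This is only an illustration of "data comes in, runs through some business logic, and goes out", with one multicast variation. The `Flow` class, the endpoint names, the order fields, and the 1000 threshold are all made up for the example and do not correspond to any ESB product's API.

```python
import json


def transform(raw):
    """Business logic step: parse an inbound JSON order and compute its total."""
    order = json.loads(raw)
    order["total"] = round(order["quantity"] * order["unit_price"], 2)
    return order


def route(order):
    """Routing step: every order goes to fulfillment; large ones are multicast to audit too."""
    targets = ["fulfillment"]
    if order["total"] >= 1000:  # hypothetical threshold
        targets.append("audit")
    return targets


class Flow:
    """Inbound message -> transform -> route -> outbound queues, with a traffic log."""

    def __init__(self):
        self.outbound = {"fulfillment": [], "audit": []}
        self.log = []

    def handle(self, raw):
        order = transform(raw)
        targets = route(order)
        for target in targets:
            self.outbound[target].append(order)
        self.log.append((order["id"], targets))
        return targets


flow = Flow()
small = flow.handle('{"id": "A1", "quantity": 2, "unit_price": 9.5}')
large = flow.handle('{"id": "B2", "quantity": 10, "unit_price": 150.0}')
```

Here `small` is delivered to one endpoint and `large` to two. Timers, aggregation, and custom reply paths are variations on the same three steps.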

The key to a successful ESB implementation is to keep things as clear and simple and easy to understand as possible. Lots of drawings and supporting documentation can really help, not just to communicate what is happening to others, but to get it straight in your own head. As this blog matures I’ll post a few ESB tricks and tips, links to new and interesting developments in the ESB world, and links to articles with particularly relevant and thought-provoking discussions. Stay tuned!

Posted on 09 Dec 2008 20:44 by rotten | 7 comments

Splunk, a window into the cloud

One of the greatest challenges with implementing and supporting a cloud infrastructure is keeping track of what is happening with all of the pieces. An excellent tool for aggregating activity and event reports from a very wide variety of applications, infrastructure components, monitors, and more is Splunk. Most log aggregation and management tools are focused on a narrow segment of the cloud infrastructure, or on a narrow audience (such as PCI auditors). Not so with Splunk, which can do the security audit function, and still really excel at log management for operational purposes at the same time.

Splunk bills their product as a "search tool for IT".  What they do is gather events and log files from across your Cloud and then index them in a database.  They provide a variety of pre-built views into that database, reporting, graphing, and alerting capabilities, a comprehensive search language, command line interface, and even a browser plug-in that will pop up alerts while you are surfing the web.

Log data can be collected through an impressive suite of options. Probably the most common is to read existing log files where they are and then forward the data to a Splunk database. Sometimes you want to avoid all that disk I/O. With Splunk you can feed data directly to it via syslog, TCP, SNMP traps, and a variety of other IPC mechanisms. You can even write your own custom data feed.
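As a rough illustration of feeding events over the network instead of through files on disk, here is a sketch using Python's standard `logging.handlers.SysLogHandler` to send a log record as a UDP syslog datagram. Nothing here is Splunk-specific: the logger name, the message, and the helper functions are hypothetical, and a throwaway local UDP socket stands in for whatever syslog input you would actually configure on the receiving side.

```python
import logging
import logging.handlers
import socket


def build_forwarding_logger(name, host, port):
    """Return a logger (and its handler) that forwards records via UDP syslog."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger, handler


def demo():
    """Send one event to a local stand-in listener and return what arrived."""
    # Stand-in for the log collector: a UDP socket on an OS-assigned local port.
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(2.0)
    port = receiver.getsockname()[1]

    logger, handler = build_forwarding_logger("orders-app", "127.0.0.1", port)
    try:
        logger.info("order 1234 accepted")
        datagram, _ = receiver.recvfrom(4096)
    finally:
        logger.removeHandler(handler)
        handler.close()
        receiver.close()
    return datagram.decode("utf-8")


received = demo()
```

In a real deployment you would point `build_forwarding_logger` at the host and port of your collector's syslog input, and the application would never touch a local log file.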

The Splunk database can be divided over several servers, but still searched from a single command.  This lets you organize your data as best fits your cloud, straddling firewalled network zones, grouped for audit, access or other administrative purposes, or perhaps part of a global distribution with local data collection points around the world.
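The "several servers, one search" idea can be sketched as a simple fan-out and merge. To be clear, this is a toy model and not how Splunk implements distributed search; the `Indexer` class, the `federated_search` function, and the sample events are all invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


class Indexer:
    """Toy stand-in for one log server holding (timestamp, text) events."""

    def __init__(self, name, events):
        self.name = name
        self.events = events

    def search(self, term):
        # Tag each hit with the server it came from.
        return [(ts, self.name, text) for ts, text in self.events if term in text]


def federated_search(indexers, term):
    """Run the same search on every indexer in parallel and merge hits by timestamp."""
    with ThreadPoolExecutor(max_workers=len(indexers)) as pool:
        partials = list(pool.map(lambda indexer: indexer.search(term), indexers))
    merged = [hit for partial in partials for hit in partial]
    return sorted(merged)  # tuples sort by timestamp first


dmz = Indexer("dmz", [(100, "sshd login failed for root"), (130, "nginx 200 GET /")])
internal = Indexer("internal", [(90, "db backup ok"), (120, "app login failed for alice")])

hits = federated_search([dmz, internal], "login failed")
```

Each server only searches its own data, which is what lets the collection points sit in different network zones or regions while the person searching still issues one command.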

Search commands can be simple or sophisticated, chained together, saved, shared, and even run on automatic intervals.

Splunk is not without competition.  Two other major players in this space include XPLG and Loglogic.  What sets Splunk apart is that it is customizable, [creative commons - free license] extensible, flexible, and (mostly) intuitive to administer.  It doesn’t require specialized appliances.  Instead it can be deployed and managed on your favorite platform by your favorite system administration team.

That power does come at a price. It takes some pretty good (but not outrageous) hardware to store and search the log database if you start scaling into many gigabytes of data per day. Splunk’s flexibility also means that it may take some effort to customize and deploy it within your particular cloud. (On the other hand you can engineer a close fit to your specific requirements.)

XPLG is also a software solution. It comes with some pretty fancy auto-discover tools, which means you may not have to know much about your infrastructure and applications (as long as they are reasonably standard) to start collecting log files. Their licensing is a little odd, and it appears that implementing and sharing extensions, plugins, or customizations may be more challenging as a result. Initial pricing quotes obtained from this vendor for a recent project I was working on were significantly higher than Splunk’s.

Loglogic sells a preconfigured appliance. It is a turnkey solution with a suite of agents you can deploy to collect data. Again the closed nature of the solution makes it hard to leverage ideas from the community, and more challenging to deploy customizations than the more open Splunk. It is probably a great solution for an organization looking for a quick plug-and-play option that will collect most of the interesting logs in one swoop. Their product is not inexpensive, and it wouldn’t be practical to deploy Loglogic for small volumes of log aggregation (whereas Splunk, for small volumes, is free). Additionally, a complex environment, which might require several appliances, could drive the cost up very quickly.

Every now and then in the IT world you come across a company that appears to be "doing it right". Attractive website, excellent (and cool) product, sensible licensing, active and vibrant community, accommodating of the big guys AND the little guys, responsive and friendly support, clear road maps, and that je ne sais quoi magic. Splunk is one of them.

Posted on 02 Dec 2008 21:41 by rotten | no comments

Copyright © CloudNavigator
