Saturday, September 26, 2009

Proxying large Amounts of Data using UrlRewriteFilter

Finally, I found some time to submit a feature request with patch to the UrlRewriteFilter project.

http://code.google.com/p/urlrewritefilter/issues/detail?id=53

In one of my projects we needed to provide functionality to post (upload) and download data via a servlet container to/from another url/port - basically we needed to implement proxying.

A great library out there is UrlRewriteFilter, a Java library that provides functionality similar to Apache's mod_rewrite. Not only can you use it to make complex URLs more user-friendly or to remap old URLs to new ones, it also provides proxying capabilities.

UrlRewriteFilter uses Apache HttpClient for proxying. Unfortunately, I ran into memory issues when proxying large amounts of data. The issue is that the current version of UrlRewriteFilter (3.2) buffers requests while proxying. This probably works fine for 90% of all use cases, but for the project I am working on we need to support proxying of essentially unlimited amounts of data (several hundred MB at a time).

Thus I provided a patch that worked really well in my project without increasing memory consumption.
In Apache HttpClient you can implement a custom class using the RequestEntity interface that allows you to stream the data directly.
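To illustrate the idea, here is a minimal sketch of such a streaming entity. It is not the actual patch. So that it compiles without HttpClient on the classpath, it declares a stand-in interface (RequestEntityLike, a made-up name) that mirrors the four methods of HttpClient 3.x's RequestEntity; the class name StreamingRequestEntity and the 8 KB buffer size are assumptions as well.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/**
 * Stand-in mirroring the methods of HttpClient 3.x's
 * org.apache.commons.httpclient.methods.RequestEntity, so that this
 * sketch compiles on its own. In a real project you would implement
 * the library interface instead.
 */
interface RequestEntityLike {
    boolean isRepeatable();
    void writeRequest(OutputStream out) throws IOException;
    long getContentLength();
    String getContentType();
}

/** Hypothetical entity that copies an InputStream through a small fixed buffer. */
class StreamingRequestEntity implements RequestEntityLike {
    private static final int BUFFER_SIZE = 8 * 1024; // assumed size, tune as needed

    private final InputStream source;
    private final long contentLength; // negative when the length is not known up front
    private final String contentType;

    StreamingRequestEntity(InputStream source, long contentLength, String contentType) {
        this.source = source;
        this.contentLength = contentLength;
        this.contentType = contentType;
    }

    @Override
    public boolean isRepeatable() {
        return false; // the underlying stream can only be consumed once
    }

    @Override
    public void writeRequest(OutputStream out) throws IOException {
        // Only BUFFER_SIZE bytes are held in memory at any time,
        // no matter how large the payload is.
        byte[] buffer = new byte[BUFFER_SIZE];
        int read;
        while ((read = source.read(buffer)) != -1) {
            out.write(buffer, 0, read);
        }
        out.flush();
    }

    @Override
    public long getContentLength() {
        return contentLength;
    }

    @Override
    public String getContentType() {
        return contentType;
    }
}
```

In the proxy, the source would be the incoming servlet request's input stream and the entity would be handed to the outgoing HttpClient request, so the data flows through without ever being collected in memory.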

Thursday, September 24, 2009

Camellos - Discovering Apache Camel II

As indicated in my last blog post, here is a small example application built with Apache Camel.

You can pick up the source code from:
Steps to get it running:
  1. Check out the source
  2. Using Maven run: mvn camel:run
  3. The application should compile and start up correctly.
  4. You can now drop files into the camellos/inbox directory
  5. The files should get uploaded to the FTP server running at localhost:3333
  6. The uploaded files should show up under camellos/ftp
Back to my example: for this little blog post I want to provide the following very simple functionality:
  1. pick up files from a directory
  2. make sure that you pick up no more than 3 files per 30 seconds
  3. store them into a JMS queue
  4. have a listener on that queue that picks up those files
  5. and upload files to a remote FTP site
What do you think, how many lines of Java code does it take?

With Apache Camel you can get this simple task done with ZERO lines of Java code. Well, I needed one Main class with a few lines of code to load the Spring context and the embedded FTP server. Nevertheless, I think that is quite impressive. In a sense, all the heavy lifting is done in the Spring application context file:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:camel="http://camel.apache.org/schema/spring"
       xmlns:ftpserver="http://mina.apache.org/ftpserver/spring/v1"
       xmlns:amq="http://activemq.apache.org/schema/core"
       xsi:schemaLocation="
         http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
         http://camel.apache.org/schema/spring http://camel.apache.org/schema/spring/camel-spring.xsd
         http://mina.apache.org/ftpserver/spring/v1 http://mina.apache.org/ftpserver/ftpserver-1.0.xsd
         http://activemq.apache.org/schema/core http://activemq.apache.org/schema/core/activemq-core.xsd">

  <!-- Embedded ActiveMQ broker: no JMX, no persistence -->
  <amq:broker useJmx="false" persistent="false">
    <amq:transportConnectors>
      <amq:transportConnector uri="tcp://localhost:0" />
    </amq:transportConnectors>
  </amq:broker>

  <!-- Camel component that talks to the broker above -->
  <bean id="activemq" class="org.apache.activemq.camel.component.ActiveMQComponent">
    <property name="brokerURL" value="vm://localhost"/>
  </bean>

  <!-- Embedded FTP server listening on localhost:3333 -->
  <ftpserver:server id="ftpServer" max-logins="10"
                    anon-enabled="true" max-anon-logins="5" max-login-failures="3"
                    login-failure-delay="20">
    <ftpserver:listeners>
      <ftpserver:nio-listener name="default" port="3333" local-address="localhost"/>
    </ftpserver:listeners>
    <ftpserver:file-user-manager file="users.properties" encrypt-passwords="clear" />
  </ftpserver:server>

  <camel:camelContext shouldStartContext="true" trace="true">
    <camel:package>com.hillert.camellos</camel:package>
    <!-- route1: inbox directory to JMS queue, throttled to 1 message per 10 seconds -->
    <camel:route id="route1">
      <camel:from uri="file:camellos/inbox?move=.done" />
      <camel:throttle maximumRequestsPerPeriod="1" timePeriodMillis="10000" >
        <camel:to uri="activemq:queue:camellos"/>
      </camel:throttle>
    </camel:route>
    <!-- route2: JMS queue to FTP upload -->
    <camel:route id="route2">
      <camel:from uri="activemq:queue:camellos" />
      <camel:to uri="ftp://admin@localhost:3333?password=secret"/>
    </camel:route>
  </camel:camelContext>
</beans>


Apache Camel provides its own Maven plugin: http://camel.apache.org/camel-maven-plugin.html

Getting started with Apache Camel is simple. I recommend using m2Eclipse, which is a Maven plugin for Eclipse. Camel provides its own Maven archetype, which creates a simple project structure that you can run immediately using mvn camel:run.


For my implementation I followed Camel's tutorial for creating a Spring based Camel route.
As my example uses not only a few more Camel components but also ActiveMQ for JMS and the Apache MINA FtpServer, I needed some additional Maven dependencies.

Getting all the Maven dependencies right actually took longer than implementing the application logic itself. Anyway, I hope this gives you a quick overview of some basic Apache Camel features. As time permits, I will blog more about it soon. See you then!

PS: Apache MINA FtpServer is itself a nifty little package. So if you, for example, need to start up FTP servers dynamically from within your application, check it out.

Wednesday, September 23, 2009

Camellos - Discovering Apache Camel I

Over the next couple of weeks or months (depending on how much spare time I am able to allocate), I will dive into the world of Apache Camel. (Also take a look at my second blog post.)


Apache Camel is somewhat like a Swiss-army knife. As an integration framework (Message Routing API, Mediation Router), it implements all the Enterprise Integration Patterns from the book with the same name.


Next, Apache Camel provides a quite extensive component library supporting an impressive number of communication protocols. It also supports a wide range of data formats, as well as integration with other frameworks such as Spring, Guice, ServiceMix, ActiveMQ et cetera.

Well, that's all nice and dandy. However, what does this mean for me as a software developer?

When you develop typical Java enterprise applications, you will sooner or later come across the requirement to connect to other systems or to add other asynchronous services to your applications. Something like:

"Hey, we need to pick up this file that comes in every night, parse it, process it and then stuff its data into our database and, while we're at it, also make sure the original file is archived somewhere in the file system."

Or maybe you just have a requirement where your application needs to send data to another server, but you have to make sure that the flow of data is "throttled" in order not to overburden your destination server during peak processing times.
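Done by hand, that kind of throttling could look roughly like the following plain-Java sketch. It is a simple fixed-window counter and is not meant to show how Camel implements its throttler; the class name FixedWindowThrottle and the injectable clock are made up for illustration.

```java
import java.util.function.LongSupplier;

/** Hypothetical throttle: allows at most maxPerPeriod sends per time window. */
class FixedWindowThrottle {
    private final int maxPerPeriod;
    private final long periodMillis;
    private final LongSupplier clock; // supplies "now" in milliseconds
    private long windowStart;
    private int usedInWindow;

    FixedWindowThrottle(int maxPerPeriod, long periodMillis, LongSupplier clock) {
        this.maxPerPeriod = maxPerPeriod;
        this.periodMillis = periodMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    /** Returns true if the caller may send now, false if the current window is full. */
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        if (now - windowStart >= periodMillis) {
            // The previous window has expired: start a fresh one.
            windowStart = now;
            usedInWindow = 0;
        }
        if (usedInWindow < maxPerPeriod) {
            usedInWindow++;
            return true;
        }
        return false;
    }
}
```

A sender would create it with something like new FixedWindowThrottle(3, 30000, System::currentTimeMillis) and call tryAcquire() before each send, waiting and retrying whenever it returns false. And that is only the throttling part, without any of the queueing, retrying or error handling around it.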

In all those cases Apache Camel can greatly simplify the implementation effort. It is basically a mini ESB in the form of a simple Java API. But it is extremely modular, so you can use just bits and pieces of it. In my next blog post I will show you an example of how dead simple it is to get started with Apache Camel.

See the second Camel-related blog post.

Saturday, September 19, 2009

GWT - Hosted Mode Gotcha in Windows

I ran into a small gotcha while running a GWT application I am working on in hosted mode. Usually I develop my application in hosted mode, but for quick stand-alone deployments I also need to deploy the application to a dedicated servlet container. Since the default settings cause the application to compile rather slowly, I have been trying to speed up GWT compilation times (using GWT 1.7) by compiling against Firefox only (my browser of choice).

To achieve this I set the following property in my GWT module's *.gwt.xml file:



Well, as it turns out, this caused me quite some pain in hosted mode. I did not realize that the aforementioned setting affects hosted mode. I knew that under Windows, Internet Explorer (IE) is the default browser for hosted mode, but I always assumed the setting only affects fully compiled GWT code, not code running in hosted mode. That was a painful lesson. It caused some rather obscure errors in hosted mode, and nowhere was I able to find explicit information regarding this issue.