What is Genetic Programming?

Welcome to part one of a multipart blog series discussing some background and motivation for Evolutionary Signals.

Before describing exactly what it is, let me state the goal of genetic programming, hereafter referred to as GP, as “to get a computer to do something without telling it how do it”(Koza 2013). Instead of the procedural algorithm to solve a problem, we tell the computer how to evaluate a good solution or compare competing solutions.

Any domain where you can describe how to evaluate a good solution is fertile ground for GP. I will give some concrete examples in this and other articles, but we can immediately see that in finance, an easy way to evaluate a solution is by how much money it makes. Alternatively, we might include risk in the equation, such as by considering the Sharpe ratio, or incorporating penalties for large draw downs.

The goal stated above applies to most of AI, not just GP.  The differences are in the representation used. Other forms of AI represent problems as tables, networks, trees, or other structures. Such schema only represent the true solution and must be interpreted or translated into the actual problem domain, often through an intermediary computer program. GP represents the solution as a computer program. Hence representation and solution are one. As GP’s creator, John Koza states, “the best representation for a computer program is a computer program” (Koza 2013). GP is essentially limited only by what can be represented as a computer program (which is essentially anything).

GP achieves this goal by “breeding” a population of computer programs that evolve and improve over time, producing better and more accurate solutions to the target problem. Breeding? Can you really breed computer programs? Well actually, yes. That is the crux of GP. Using techniques inspired by Darwinian evolution and population genetics, GP builds an evolving population of computer programs that improve over time, similar to how the human (or more broadly, animal) populations develop beneficial traits over generations. Based on the principle of survival of the fittest, better performing programs are similarly favored for inclusion in subsequent generations. Evolution provides further inspiration for GP’s inclusion of biological inspired genetic operations such as crossover and mating.

A key attribute of GP is its non-greedy and stochastic (random) nature.  This feature enables the development of novel and creative solutions to problems. Where a more deterministic approach might converge on obvious, and perhaps non-optimal solutions (local optima in machine learning parlance), GP aims to find novel or creative solutions (global optima).  This is accomplished by allowing less fit individuals a chance of being selected for future generations instead of always picking the absolute best. GP also incorporates massive parallelism, as does biological evolution. Instead of a single search path, each individual in a large population is itself a search. A population of one million individuals (programs) is therefore equivalent to one million simultaneous searches (though the actual underlying computer implementation may not incorporate such massive parallelism).

One of my favorite examples of such creativity is the design of a satellite boom. Satellite booms are not my domain of expertize, by I believe this NASA image is a typical example in action

 

Using a genetic algorithm (another type of evolutionary algorithm with much commonality to genetic programming) researchers were able to achieve a 20,000+% improvement in frequency averaged energy levels over a conventional design (Keane 2006). The parameters determining fitness were optimal vibration control as measured by energy transmitted through the structure.  A typical boom design and the optimized, GA designed boom, are shown below:

 

It is evident to me that only a madman would come up with such a design, or even trying it out. Evolutionary algorithms can search numerous solutions in parallel and come up with such novel solutions.

Such an approach is applicable to problems where the size and shape of the solution is unknown. Market prediction is certainly such a domain, as there is inarguably no deterministic algorithm to consistently provide market beating performance.

The next article will examine some of the mechanics of genetic programming and provide some illustrative examples.

References:

Keane, A. J. (1996). THE DESIGN OF A SATELLITE BOOM WITH ENHANCED VIBRATION PERFORMANCE USING GENETIC ALGORITHM TECHNIQUES. The Journal of the Acoustical Society of America, 99(4), 2599–2603.

Koza, J.(2013, June 21). Automated Design Using Darwinian Evolution and Genetic Programming [Video file]. Retrieved from https://www.youtube.com/watch?v=xIoytwJWJP8

Cleaning Bits

I’ve been meaning for some time to post some articles I’ve done as part of my Doctoral studies. I don’t think any of these will turn into a dissertation, but I believe all are interesting and hopefully of worth to someone.

JIMQL began as a critique of a research paper and developed into an open source project implementing some of my ideas about this domain.

I also looked into Client Side Encryption for HTML Form Fields. I believe this paper offers insight into improving web application security. In the appendix I propose additional security standards for implementation in a future version of HTML.

Finally, I’ve posted a philosophy paper I wrote a Carnegie Mellon in 1986. As it is one of the only things I’ve kept (other than my Master thesis) I must have though it wasn’t too bad. I haven’t reread it. Let me know what you think. the paper is posted on Professor V’s Teaching Cafe.

On Pittsburgh and Progressive Enhancement

First off, Pittsburgh and Progressive Enhancement have nothing in common. However, I did just take a trip to Pittsburgh this past week and did some thinking on one of my favorite topics: Progressive Enhancement.

I’ve been thinking in recent months that progressive enhancement is an obsolete concept, at least in most circumstances and where it is not specifically dictated by requirements or regulations.

Certainly, JavaScript and Ajax are all the rage. Mobile devices are now as powerful as desktops with browsers equal to their desktop counterparts. Users expect rich functionality in web applications, which means JavaScript and DHTML. It is a waste of time and effort to build a non- JavaScript version of an application if 99% of your users will not see it. Supporting non JavaScript clients will also require considerable additional work.

But none of the above statements are true.

I took the trip Pittsburgh without a laptop, only a Blackberry Pearl and an iPod Touch. The Pearl is a bit old and always ran JavaScript questionably if at all and it only runs on a slow EDGE network. I therefore did not expect much from it.  However the Touch is an incarnation of the greatest device ever invented , the iPhone (please see the sarcasm here). I expected to be able to  do simply web-based tasks on the Touch, especially over its WiFi connection.

The only task I really needed to do that week was maintain my Ning site; essentially approve photos that are posted by users  and answering  messages from users if needed.

Unfortunately, neither device handled this simple task. While the devices share some of the blame, the fault is mostly Ning’s for not providing a simple non- JavaScript implementation of the needed functionality. Approving photos on Ning is as simple as opening the photos to approve page and clicking the “approve all” button (assuming you  want to approve all ). A pop up confirmation box is then displayed where you can confirm or cancel. When navigating to the page with the Blackberry, with JavaScript turned off by default, I was  informed  that JavaScript was required to view this page. After turning on JavaScript and clicking “approve all”, I was brought back to the top of the page with no apparent action or result. I reproduced this several times without successfully approving any photos. Of course, as I said before, I didn’t expect much from the Blackberry. That is why I had the iPod. However, it did the exact same thing and I was unable to approve the photos. I am not sure why the Touch didn’t work, but the functionality here is so basic, Ning could easily provide a static confirmation page for non JavaScript enabled browsers, and through progressive enhancement, enable the pop up for JavaScript enabled browsers.

While in Pittsburgh, I also tried to check my remaining  T-Mobile  minutes. I couldn’t connect to the T-Zones application ( a native app that comes installed) so I tried to access tmobile.com. I was curtly   informed the site didn’t support blackberry. This from the company that sold me the Blackberry and on the site where I can manage my Blackberry.

On a positive note, let me give kudos to Google’s excellent mobile Blackberry Apps. Gmail and Goggle Maps are extremely usable on the Pearl. I removed my native blackberry email account after accessing it through the Gmail mobile app.  Google Maps is also excellent, complete with traffic and satellite view and directions, but the T-Mobile connection speed made the experience a bit painful.

Right now, native applications are definitely the state of the art for mobile devices. Too bad they require custom development for each platform. I certainly imagine that a model driven approach than generates runtimes for multiple platforms may be more feasible in the future. The other option is that HTML sites, specifically HTML 5, allow browser based apps to provide a comparable experience and  supplant native apps. I would put my money in that category (actually I am). Maybe I am just too lazy, or too efficient, to want to rewrite apps for every platform I want to support.

I wrote a blog a while back about stumbleupon.com and how it didn’t support Oopera at the time and did not even let me proceed with that browser. I am not sure if it currently supports Opera (I have to think it does by now) but I still see sites that don’t support all browsers. HBO.com recommended to me only last night that I ditch Opera and I use IE or Mozilla. At least it gave me the option to proceed at my own risk. I took the risk and did not have a problem.

Progressive enhancement is still in my opinion the best way to develop web applications. The preferred approach is to build the app in three stages (for the initial release/revision). First build out the content layer (XHTML).  This step allows you to develop the necessary components that make up the core of the application, the guts of the M, the V, and the C. At this point, I usually am working with a raw HTML interface, perhaps with a bit of CSS just to make it usable. There may be some pages where the effort to do a non- JavaScript page is much more difficult and cumbersome to the user. In this case, It may make sense to require JavaScript for that specific module

Secondly, Style it (with CSS).  Add colors, positioning, etc.

Finally, and only where needed, add dynamic behavior with  JavaScript. I highly recommend jQuery for this.

The versions produced in each of these steps should be fully functional to the extent possible. This approach lets you focus on the basics of the application , foundation up, and not get caught up in what you think the behavior will be needed. Some of that may not be know early on  in the project.   Please understand this is only a very brief overview of my development process. I am not advocating a waterfall approach, only progressive development through progressive enhancement.

Looking back at BidBlazer

BidBlazer was developed and marketed by AD2001 Computer Services (which later became Infoblazer LLC) from 1999 through 2001. BidBlazer was one of the first in a new breed of online auction search tools and was the first client based tool in this niche. At it’s peak, BidBlazer searched eight online auction sites, including Ebay, Yahoo, Amazon, and MSN

During this time period, Ebay greatly consolidated it’s share of the online market. Therefore, other auction sites were left with few listings and most were eventually phases out. Due to the overwhelming market share of Ebay, cross auction searches became less useful. Developers moved to building tools targeted towareds Ebay specifically, such as sniping and submission utilities. In addition, development of online searching during that time period was very prone to changes in online auction formats. Since there was generally no published API or XML communication tools, such as later popularized by Amazon and Google, keeping up to date with auction site changes was a constant process.Therefore, BidBlazer was phased out in mid 2001.

For posterity, here are some screen captures from the software and a review of the product from Rocketdownload.com

Pigs Don’t Fly

A few posts back, I discussed my new project, an online quiz application, and the initial choice of GWT and GXT as the framework. The article was entitled “Pigs are Flying”, due to my previous convictions and statements that were not always kind to so called Rich Internet Applications.

Well it turns out after all that pigs do not fly and GWT/GXT does not fly.

A few caveats here. I use GWT daily in my consulting gig, so I have some experience in the area. I just remain a firm believer in applying Ajax judiciously. GWT (and certainly GXT) is full blown Ajax, almost HTML-less Ajax, if that is possible.

I still stand by the heuristic that an RIA will take twice as long to develop as the comparably functional Ajax-enhanced HTML application. Does this type of application have a place? Most definitely. Just be certain you requirements dictate such a tool and you have the funding and or time for its implement. In my situation, I just did not have such a luxury, particularly to wait around for applications to reload changes and to search out documentation on a less than fully documented third party framework.

Nuff said on that issue.

Since I’ve always been a user of Spring, and I initially started prototyping the quiz app in Spring MVC, I returned to SpringSource technologies, but not Spring but Grails.

I may have a little to say about Grails in the future (and I don’t like to repeat what is already said out there) but Grails is truly excellent.

While pigs are not flying anymore (although they are trying to take off, since I’ve also had some less than kind things to say about dynamic languages) there is still much to get used to in a language like Groovy. I specifically mean the lack of static type checking. However, IntelliJ does a decent job with things like code completion. Plus the dynamic application reloading is great. It’s nice not to hit the restart button after every code change.

That’s it for the status update. I’ll return to discussing performance issues in future articles. I find it hard not to get sidetracked when the topic of framework selection and evaluation arises.

 

HttpSession and GWT

Most anyone who has done Java server programming has dealt with the Session object, as well as its siblings (and I use that term in essence, not in fact) objects: ServletContext, HttpRequest and HttpResponse. Of course, if you are using a newfangled “Component” framework that tries to “shield” you from the intricacies of HTTP, you may not be familiar with these babies.

GWT is somewhat of a mélange of both Component frameworks and Request/Response frameworks. It is certainly on the component side of the spectrum, in that you are programming mainly in Java objects and using Swing-like constructs and events. But that is mainly on the client side of things. On the server side, GWT is strictly Servlet based, with some excellent marshalling thrown. The marshalling is so good that sometimes can make you forget the HTTP underpinnings. A dead giveaway, other than of course reading the documentation, is that any remote service must be defined in web.xml as a Servlet. I always found this odd, since most newer request /response frameworks(that gladly expose HTTP constructs) tend to use a single “Base” Servlet that you then must extend. Typically, you would create a single Servlet Mapping for this Base Servlet, which would handle essentially all requests. The Base Servlet, based on some xml, annotated, convention based, or a combination of the above, would pass control to a custom Java class which would handle the actual implementation. The Base Servlet and related framework will also provide much of the “plumbing” and other features, such as parameter object mapping, view integration, etc.

With GWT, every remote service you make available to the client must be defined as a distinct Servlet in web.xml (I feel like I’ve already said this, but it is worth repeating). The Servlet class defined is actually the developer’s specific java class, not a base Servlet. This class must extend RemoteServiceServlet, which is a descendent of HttpServlet. This is a key point in the discussion that will follow. Beware of a framework that forces you to extend its own classes (Ok, for the few of you in the know, XX Framework makes you do this as well, but I plead ignorance since I didn’t know better at the time. And I hadn’t seen Spring MVC yet).

Begin a Servlet, our GWT remote service implementation has access to all the HTTP goodies, specifically HttpSession, which I at one time said this post was about. To access the HTTP Session, you use the method

getThreadLocalRequest().getSession()

If this was a HttpServlet Sevice() method, you’d have access to this directly via the HttpServletRequest parameter . But since that method is long gone by the time your code is called, GWT stores this information and makes it available in a ThreadLocal object. ThreadLocal is a construct that lets each JVM thread have its own version of any class level variables. GWT needs ThreadLocal since a Servlet, as in HTTPServlet, needs to serve multiple threads. TheThreadLocal makes data available to all threads that may be accessing that Servlet.

Now remember the following point:

Each Servlet maintains its own ThreadLocal storage

There is nothing unusual with this. How else could multiple threads be served?

The problem arises when you decide to treat your Remote Service Implementation as just another Java class (POJO anyone) and not what it truly is, an HTTPServlet.

So the problem arises when any server side GWT code instantiates a GWT Remote service implementation directly.

Why is this a problem? Well in most cases, it may not be. We won’t include bad architecture in this discussion, which that most certainly is. The problem occurs when you try to access our old friend, getThreadLocalRequest().

The Remote Service Serlvet’s, which our Implemenation must extend, private ThreadLocal storage is initialized when the HTTP Service (POST, GET) happens. The HttpServletRequest, which includes the HttpSession is stored in the Servlet’s Threadlocal storage for access by the current thread.

When that Remote Service Implementation, or any other subsequent code, instantiates another Remote Service Implementation, the new Service’s ThreadLocal is created (it has to be as we are creating a new object) but it is not initialized, since no HTTP Get or Post has occurred. Therefore, any access to getThreadLocalRequest() will throw a NullPointerException, as this information has not yet been stored in ThreadLocalStorage.

I recently hit this problem while working with a client’s code. Doing an internet search and looking at some GWT books didn’t really shed any light on the problem. The only mention was a simple description, useful nonetheless, of the basic method for accessing the HttpSession and related objects. No warning or discussion of what happens when you instantiate a new Remote Service Servlet implementation object.

This is another reason why the SpringSource guys are being proved right time and time again. Please check out Spring MVC for an example of a web framework that can use pure POJO’s for most functionality. These POJO’s can be subsequently instantiated and used, as they have no dependency on the framework.