In the last post, we saw the difference between a Rule Engine, Event Stream and Complex Event Processing (CEP). In this post, we will dig a bit deeper into what CEP is.
There are a lot of events happening in an enterprise. If these events are routed through an intelligent system that can find correlations between them and take proactive action based on those correlations, it can be of immense use to the enterprise. One use of a CEP system is fraud detection based on the distance between the locations of events. For example, if two credit card transactions on the same card happen within a 30-minute window from two locations that are 500 km apart, then you know that something is wrong.
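The credit card example can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular CEP product implements it. The names Transaction, FraudDetector and haversine_km are made up for this post, and the sketch assumes that events arrive in timestamp order and that "500 km apart" means at least 500 km.

```python
import math
from collections import defaultdict, deque
from dataclasses import dataclass

WINDOW_SECONDS = 30 * 60   # the 30-minute window from the example
MIN_DISTANCE_KM = 500.0    # the distance threshold from the example


@dataclass(frozen=True)
class Transaction:
    """Hypothetical event type: one card transaction at one location."""
    card_id: str
    timestamp: float  # seconds; events are assumed to arrive in time order
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


class FraudDetector:
    """Keeps a sliding 30-minute window of transactions per card."""

    def __init__(self):
        self._recent = defaultdict(deque)

    def on_event(self, tx):
        """Return earlier transactions that make this one look suspicious."""
        window = self._recent[tx.card_id]
        # Evict transactions that have fallen out of the time window.
        while window and tx.timestamp - window[0].timestamp > WINDOW_SECONDS:
            window.popleft()
        suspicious = [
            old for old in window
            if haversine_km(old.lat, old.lon, tx.lat, tx.lon) >= MIN_DISTANCE_KM
        ]
        window.append(tx)
        return suspicious
```

Each call to on_event correlates the new transaction only with recent transactions on the same card, so the detector never has to look at the full history.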
CEP Runtime Architecture
The following diagram from IBM explains the runtime architecture of a CEP system.
There are three major parts of a CEP system. The first is Input, which is composed of several input processors that receive events from multiple sources. The second is Processing, which filters, aggregates and segregates the input on the basis of defined rules. The last is Output, which is either an action that needs to be taken or a hand-off of the processed input to an external system.
First, definitions are loaded through the Input Adapters into the Definition Manager. The definitions are parsed and, if they are consistent and complete, loaded into the Rule Engine through the Routing Manager. Success or failure messages are sent to the Output listeners.
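A rough sketch of this loading step is shown below. The component names come from the description above, but everything inside them is an assumption made for illustration: the REQUIRED_FIELDS schema is hypothetical, "consistent" is reduced to "no duplicate rule names", and the Routing Manager is collapsed into a direct call to the rule engine.

```python
# Hypothetical schema: the fields a definition needs to count as "complete".
REQUIRED_FIELDS = {"name", "event_type", "condition", "action"}


class RuleEngine:
    """Holds the definitions that passed validation."""

    def __init__(self):
        self.rules = {}

    def load(self, definition):
        self.rules[definition["name"]] = definition


class DefinitionManager:
    """Validates definitions and reports the outcome to output listeners."""

    def __init__(self, rule_engine, output_listeners):
        self._rule_engine = rule_engine
        self._output_listeners = output_listeners

    def submit(self, definition):
        name = definition.get("name", "<unnamed>")
        # Completeness check: every required field must be present.
        missing = REQUIRED_FIELDS - definition.keys()
        if missing:
            return self._report(name, False, "missing fields: " + ", ".join(sorted(missing)))
        # Consistency check (simplified): rule names must be unique.
        if name in self._rule_engine.rules:
            return self._report(name, False, "duplicate rule name")
        # Stand-in for the Routing Manager: hand the definition to the engine.
        self._rule_engine.load(definition)
        return self._report(name, True, "loaded")

    def _report(self, name, ok, detail):
        message = {"definition": name, "status": "success" if ok else "fail", "detail": detail}
        for listener in self._output_listeners:
            listener(message)
        return ok
```

A caller would pass plain dictionaries to submit and register any callable (for example, the append method of a list) as an output listener.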
The Event Flow
Once the definitions are loaded, events can start flowing through the system. Multiple events from disparate sources are routed through the system and, on the basis of their correlation to each other, irrespective of their origin, the rule engine triggers the necessary rules to take the appropriate action. That action falls into one of two categories. The first informs a CEP listener, which observes the occurrence of an output (which is an event for the listener) and then triggers the necessary logic. The second is a standalone action that is triggered directly and not through a listener, such as sending an email or performing a calculation.
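To make the flow concrete, here is a toy engine that correlates events from two different sources and supports both kinds of action. It is a sketch under simplifying assumptions, not a real CEP runtime: CepEngine and the two rule functions are invented for this post, events are plain dictionaries, and the history is an unbounded list where a real engine would use a window.

```python
class CepEngine:
    """Toy engine: rules are predicates over the new event and past events."""

    def __init__(self):
        self._rules = []      # (name, predicate, standalone_action or None)
        self._listeners = []  # callables notified about rule outputs
        self._history = []    # unbounded here; a real engine would window this

    def add_listener(self, listener):
        self._listeners.append(listener)

    def add_rule(self, name, predicate, standalone_action=None):
        self._rules.append((name, predicate, standalone_action))

    def send(self, event):
        for name, predicate, standalone_action in self._rules:
            if predicate(event, self._history):
                output = {"rule": name, "trigger": event}
                if standalone_action is not None:
                    standalone_action(output)      # standalone action
                else:
                    for listener in self._listeners:
                        listener(output)           # output is an event for listeners
        self._history.append(event)


def failed_payment_for_known_order(event, history):
    """Correlates a billing event with an earlier order from any source."""
    return event["type"] == "payment_failed" and any(
        past["type"] == "order" and past["order_id"] == event["order_id"]
        for past in history
    )


def large_order(event, history):
    return event["type"] == "order" and event["amount"] > 1000


engine = CepEngine()
alerts, emails = [], []
engine.add_listener(alerts.append)
engine.add_rule("failed-payment", failed_payment_for_known_order)
engine.add_rule(
    "large-order",
    large_order,
    standalone_action=lambda out: emails.append(
        "Large order %d" % out["trigger"]["order_id"]
    ),
)

engine.send({"type": "order", "source": "web", "order_id": 42, "amount": 1500})
engine.send({"type": "payment_failed", "source": "billing", "order_id": 42})
```

After the two send calls, the listener has received one output from the failed-payment rule, and the standalone action has recorded one message for the large order without involving any listener.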
CEP tries to invert what the database does. In the RDBMS world, the data is stored first, and then queries are executed on the database to ascertain trends or to find out which rules need to be triggered on the basis of the existing data. So the data already exists, and the queries work on that data. Now invert that scenario. There is no stored data, only queries or rules that already exist. The data flows through the CEP system instead. As the data flows, the different rules try to make sense of it and take action in real time.
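The inversion is easy to see when the same question is asked both ways. In the sketch below, query_stored and StandingQuery are illustrative names, and the "amount above a threshold" condition is just a placeholder for whatever a real rule would check.

```python
def query_stored(rows, threshold):
    """Database style: the data exists first, the query runs afterwards."""
    return [row for row in rows if row["amount"] > threshold]


class StandingQuery:
    """CEP style: the query exists first, the data flows through it."""

    def __init__(self, threshold, on_match):
        self._threshold = threshold
        self._on_match = on_match

    def push(self, event):
        # Each event is evaluated as it arrives and is not stored afterwards.
        if event["amount"] > self._threshold:
            self._on_match(event)


events = [{"id": 1, "amount": 50}, {"id": 2, "amount": 900}, {"id": 3, "amount": 1200}]

# Database style: collect everything, then ask.
stored_matches = query_stored(events, 500)

# CEP style: register the question, then let the events flow.
stream_matches = []
query = StandingQuery(500, stream_matches.append)
for event in events:
    query.push(event)
```

Both approaches find the same events, but the standing query reacts to each match at the moment the event arrives, while the stored query only answers when someone runs it.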
Most CEP tools provide a SQL-like query language to query the stream of data or events that passes through the system.
In the next post, we will look at Esper, an open-source CEP engine written entirely in Java and fully embeddable into any Java process (custom, JEE, ESB, BPM, etc.). We will build a small application with Esper to highlight some of its features. Stay tuned.