5 pitfalls to avoid when implementing an Event-Driven Architecture
Event-Driven Architecture (EDA) is far from new, and if you search around you’re bound to find a lot of information on this architectural concept. You will find ample material on almost every technical detail and even thoughts on how to evolve towards EDA. You’ll read about the differences with Enterprise Service Busses (ESB) or queues, the importance of Domain Driven Design (DDD) and even find guidance on how to code streams, implement connectors, use schema registry and so on.
So you should be able to start immediately and set up the foundations for many years to come, right?
Not quite, in our experience. Little has been written about EDA in practice: how to start implementing it across a full microservices landscape, how to integrate with both legacy and new components, and how to govern that landscape.
At Cymo, we have helped several customers to overcome these challenges and we have identified many pitfalls along the way. In this blog we list what we think are the 5 most important ones:
1. Align your topics with your business domains
You might be familiar with queues or Enterprise Service Busses (ESBs), and have experience with producing events on those systems. You might even have put a lot of thought into topic naming. Often, though, that is not the case: you know who will consume the data, so you quickly agree on a topic name with the interested party.
Using EDA, you store events using an event hub. Events are stored for a longer period of time, some even configured with infinite retention. This leads to events being consumed by new applications which didn’t exist when the event was originally produced. Organizing the events in topics with clear naming conventions therefore becomes extra important.
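At the very least you should include the business domain in the topic name, not only for clarity but also because of ordering. Business events that relate to the same domain entity must be stored in the same topic to guarantee ordering. On partitioned event hubs such as Apache Kafka, ordering is only guaranteed within a partition, so those events should also share a message key.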
This is where DDD (Domain Driven Design) becomes crucial. If you know the different bounded contexts that live in your organization, along with the domains that are included in those bounded contexts, you can define topic names in the correct way, much like naming microservices or resources for REST services.
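To make this concrete, here is a minimal sketch of how a naming convention could be enforced in code. The `<bounded-context>.<domain>.events` pattern and the function names are assumptions for illustration, not a standard; pick whatever convention fits your own bounded contexts.

```python
import re

# Hypothetical convention: <bounded-context>.<domain>.events, lowercase with hyphens.
TOPIC_PATTERN = re.compile(r"^[a-z][a-z0-9-]*\.[a-z][a-z0-9-]*\.events$")


def _slug(text: str) -> str:
    """Lowercase the text and replace runs of other characters with a hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.strip().lower()).strip("-")


def topic_name(bounded_context: str, domain: str) -> str:
    """Build a topic name from a bounded context and a domain within it."""
    name = f"{_slug(bounded_context)}.{_slug(domain)}.events"
    if not TOPIC_PATTERN.match(name):
        raise ValueError(f"not a valid topic name: {name!r}")
    return name
```

Calling `topic_name("Sales", "Customer Order")` yields `sales.customer-order.events`, so every event about a customer order ends up in one predictable place.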
2. Think about versioning and backwards compatibility
Once you start producing events with a specific schema you have defined a contract. Other microservices can start consuming the events whenever and however they want. For example, a consumer can create a stream and join the events with the data in their domain.
As a producer of events you want to make sure you send out all the information that might be of use. However, applications evolve, data is added and sometimes business processes change.
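In our experience most updates to the payload schema are backwards compatible. With Apache Avro and a schema registry compatibility mode such as ‘forward transitive’, a producer can add a field without breaking existing consumers. If the new field also has a default value, consumers using the new schema can simply fall back to that default when reading older events.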
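The idea behind default values can be illustrated with a small sketch. This is plain Python that mimics how a reader schema fills in defaults; it is not the Avro library, and the `OrderPlaced` schemas and field names are made up for the example.

```python
# Two versions of a hypothetical schema; v2 adds a field with a default value.
ORDER_V1 = {"name": "OrderPlaced", "fields": [
    {"name": "order_id"},
    {"name": "amount"},
]}
ORDER_V2 = {"name": "OrderPlaced", "fields": [
    {"name": "order_id"},
    {"name": "amount"},
    {"name": "currency", "default": "EUR"},
]}


def read_with_schema(record: dict, reader_schema: dict) -> dict:
    """Project a record onto the reader's schema, filling defaults where needed."""
    result = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            result[name] = record[name]
        elif "default" in field:
            result[name] = field["default"]  # older event: use the default
        else:
            raise ValueError(f"missing field without default: {name}")
    return result  # fields unknown to the reader are dropped
```

A consumer on `ORDER_V2` reading an old event gets `currency` set to `"EUR"`, while a consumer still on `ORDER_V1` reading a new event simply ignores the extra field.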
But what happens when the change is not backwards compatible?
First off, you can’t produce conflicting events with the same event name on the same topic. This could only work if all consuming applications deployed their code at the same time, which goes against all principles of loose coupling and microservices. Furthermore, you would lose the ability to replay events, along with streaming features.
You could however create a new event for this.
But even better, you can make sure consumers get the time they need to migrate to the new event structure and agree on a date to phase out the old version. We found the following procedure works pretty well:
First you migrate all the events of the specific topic to a new topic.
Then you start producing the new event (along with the other event types) on the new topic.
Now you create a stream that reads from the new topic and produces on the old topic.
When all consumers have migrated to the new topic you can easily phase out the stream and delete the old topic.
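The stream in the third step is essentially a small translation layer. Below is a rough sketch of what it could do, using plain Python iterables instead of a real streaming library. The `CustomerMoved` event, its two versions and the function names are hypothetical.

```python
from typing import Callable, Iterable


def downgrade(event: dict) -> dict:
    """Convert a hypothetical v2 'CustomerMoved' event back to its v1 shape."""
    if event.get("type") != "CustomerMoved" or event.get("version") != 2:
        return event  # other event types pass through unchanged
    address = event["address"]
    return {
        "type": "CustomerMoved",
        "version": 1,
        "customer_id": event["customer_id"],
        # Assumption: v1 carried the address as a single line of text.
        "address": f'{address["street"]}, {address["city"]}',
    }


def bridge(new_topic: Iterable[dict], publish_old: Callable[[dict], None]) -> None:
    """Read events from the new topic and republish them on the old topic."""
    for event in new_topic:
        publish_old(downgrade(event))
```

Because the bridge preserves the order in which it reads events, consumers on the old topic keep seeing the same sequence until they migrate.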
3. Synchronous APIs can still be useful, for specific use cases
In your organization there might be some confusion about specific integrations in an Event-Driven Architecture. What if we need to send out a command for which we need a response? Why can’t we use synchronous APIs such as REST?
Here are two guidelines you can introduce. Neither of them leads to a loss of event data, which is what you want to avoid when implementing EDA:
- You can use REST within a specific bounded context, as long as you also send out the necessary business events. Especially for frontend / backend integration REST is often used.
- You can consider using REST POST for commands that don’t have a high load, for which you need a response immediately and for which clients have implemented a circuit breaker. However, when doing so, clients need to have a fallback scenario in case of failure. In some cases this still might be easier than sending out an asynchronous command. You also want a REST command to be followed by an event. The event will express what happened after the command was processed. This event is what you want to keep on the event hub, not the command itself.
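As an illustration of a REST command that is followed by an event, here is a minimal sketch of what the body of such an endpoint could look like. The in-memory event hub, the `handle_reserve_stock` function, the topic name and the status codes are all assumptions for the example; a real service would use its own web framework and event hub client.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class InMemoryEventHub:
    """Stand-in for a real event hub; keeps published events in a list."""
    events: list = field(default_factory=list)

    def publish(self, topic: str, event: dict) -> None:
        self.events.append((topic, event))


def handle_reserve_stock(command: dict, hub: InMemoryEventHub, stock: dict) -> dict:
    """Body of a hypothetical POST /stock-reservations endpoint."""
    sku, quantity = command["sku"], command["quantity"]
    available = stock.get(sku, 0)
    accepted = quantity <= available
    if accepted:
        stock[sku] = available - quantity
    # The event describes what happened; the command itself is not stored.
    event = {
        "id": str(uuid.uuid4()),
        "type": "StockReserved" if accepted else "StockReservationRejected",
        "sku": sku,
        "quantity": quantity,
    }
    hub.publish("warehouse.stock.events", event)
    # The caller still gets its immediate, synchronous response.
    return {"status": 201 if accepted else 409, "event_id": event["id"]}
```

The client receives its answer right away, and any other service can still learn about the reservation later by reading the event hub.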
4. Create standards for headers and payloads
Your events consist of headers and payload. If you standardize on those from the start, you can make it a lot easier to consume events for different purposes.
The headers can be useful to indicate event type, correlation ids and much more. Some of them, such as correlation ids and reply topics for commands, are also set by libraries such as Spring.
You can even use these headers to add important metadata such as domain entity ids. Especially for generic consumers this opens up a lot of opportunities. Without knowing the payload structure, they can still index those fields. This makes tools such as the Cymo Event Explorer very powerful.
If you agree on defaults and document them properly, you will make life a lot easier later.
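To show why standard headers help generic consumers, here is a small sketch of a consumer that indexes events by domain entity without ever parsing the payload. The header names (`event-type`, `correlation-id`, `entity-type`, `entity-id`) are an assumed standard made up for this example.

```python
from collections import defaultdict

# Hypothetical header standard; agree on your own names and document them.
REQUIRED_HEADERS = ("event-type", "correlation-id", "entity-type", "entity-id")


def missing_headers(headers: dict) -> list:
    """Return the required header names that are absent from an event."""
    return [name for name in REQUIRED_HEADERS if name not in headers]


class HeaderIndex:
    """Generic consumer that indexes events by entity, using headers only."""

    def __init__(self) -> None:
        self._by_entity = defaultdict(list)

    def consume(self, headers: dict, payload: bytes) -> None:
        missing = missing_headers(headers)
        if missing:
            raise ValueError(f"event lacks standard headers: {missing}")
        key = (headers["entity-type"], headers["entity-id"])
        # The payload is stored as opaque bytes; its schema is never inspected.
        self._by_entity[key].append((headers["event-type"], payload))

    def events_for(self, entity_type: str, entity_id: str) -> list:
        """Return all indexed events for one domain entity, in arrival order."""
        return list(self._by_entity.get((entity_type, entity_id), []))
```

Any event that follows the header standard can be found by entity id, regardless of which team produced it or how its payload is encoded.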
The same applies for payloads. You can introduce a common way of describing a payload in your events, using CloudEvents (https://github.com/cloudevents/spec) for example.
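As a rough sketch, an event envelope in the style of the CloudEvents 1.0 JSON format could be built as follows. The required attributes `specversion`, `id`, `source` and `type` come from the specification; the helper name and the example values used with it are assumptions.

```python
import uuid
from datetime import datetime, timezone
from typing import Optional


def cloud_event(event_type: str, source: str, data: dict,
                subject: Optional[str] = None) -> dict:
    """Wrap a payload in a CloudEvents-style envelope (structured JSON mode)."""
    event = {
        # Required context attributes.
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        # Optional context attributes.
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        # The business payload itself.
        "data": data,
    }
    if subject is not None:
        event["subject"] = subject  # e.g. the id of the domain entity
    return event
```

For example, `cloud_event("com.example.order.placed", "/sales/orders", {"order_id": "o-1"}, subject="o-1")` returns a dictionary that can be serialized with `json.dumps` and sent as the event body.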
5. Introduce monitoring, also for business people
Expressing your events in business language is important. It matters not only for developers, who need to interpret the event correctly in the code, but also for analysis and monitoring. Business events express what happened in the different processes. Everybody in the organization should be able to understand what they express.
Business event exploration is often confused with traditional tracing solutions.
Tracing is important to find out what happened for a specific event. When was it produced, when was it consumed, etc. This data is often available for about a week and added via interceptors in the code. OpenTelemetry has become a good standard for adding this tracing information. And there are many tools available to visualize such tracing.
Business event exploration goes a lot further. It is not only intended for support: it is used by technical and business users to easily find and display related business events over a timeline, and it includes security for specific data. This becomes possible by indexing the data and making it available for as long as the business wants insight into it, which can be years. From our experience, this is a service that becomes very popular, very quickly. It helps people understand what it means to capture behavior, rather than only work with state. The Cymo Event Explorer offers all this functionality.
If you are looking to implement Event-Driven Architecture in your organization, be sure to get in touch. We have a lot of experience in this area and can help guide you through the process. While there is a lot of information available on EDA, little is written about how to apply it in practice. We’ve experienced many pitfalls along the way and would like to help you avoid them. Contact us today for more information!
Written by Kris Van Vlaenderen