In the previous blog post, we shared the lessons we learned while building reactive microservices. In this article, we want to show you the methods we have seen that provide proper isolation and decoupling between microservices.
Avoiding coupling
First, let us start from an assumption most software engineers make: if two software components (e.g., two functions, two classes, two microservices, two systems) do not share anything and are not integrated in any way, then the perfect construction has been achieved. Scaling this statement from two components up to, let's say, tens, hundreds, or even thousands of them leads to the conclusion that a properly designed system is one in which the building blocks are not coupled to each other in any way. Nothing could be further from the truth! Rationally speaking, no system in this world is built from multiple components that are not connected to each other. This is why, when building distributed systems, we should not try to avoid coupling, but instead think about how to couple the components in a way that gives our systems reactive characteristics.
A little about physics …
In one of the most wonderful talks we have ever seen, “Uncoupling”, delivered by Michael Nygard, he presents an original and rational point of view on what coupling means in systems in general. As examples of well-coupled constructions, Michael Nygard uses mechanical linkages such as Watt's Linkage and the Chebyshev linkage, two theoretical models that allowed people to build things like the steam engine. It should be obvious from this that building distributed systems should by no means imply avoiding coupling between their components.
Coupling-inside, coupling-outside
When talking about coupling between software components, it is almost impossible not to think about Software package metrics, discussed by Uncle Bob in his book “Agile Software Development: Principles, Patterns, and Practices”. The concepts in this book were introduced for building object-oriented programs, but they have since been scaled up from that level to the system-architecture view. Uncle Bob treats coupling as a measure of dependency and introduces several metrics, of which I will consider only three:
- Efferent coupling (Ce) – the set of components that a given component depends on; this can be viewed as the outgoing coupling from that component's point of view;
- Afferent coupling (Ca) – the set of components that depend on a given component; this can be viewed as the incoming coupling from that component's point of view;
- Instability (I) – the ratio of efferent coupling to total coupling: I = Ce / (Ce + Ca).
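As a quick worked example: a component with Ce = 3 and Ca = 1 has I = 3 / (3 + 1) = 0.75, so it is quite unstable (it depends on others far more than they depend on it), while a component with Ce = 0 has I = 0 and is maximally stable.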
In his “Advanced Distributed Systems Design” course, Udi Dahan presents a table like the following one:
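(The original table is an image; the values below for components A, B, D, and E are illustrative, while the zero values for component C follow from the discussion.)

| Component | Ca (afferent) | Ce (efferent) | I = Ce / (Ce + Ca) |
|-----------|---------------|---------------|--------------------|
| A         | 0             | 4             | 1.0                |
| B         | 3             | 1             | 0.25               |
| C         | 0             | 0             | –                  |
| D         | 1             | 3             | 0.75               |
| E         | 4             | 0             | 0.0                |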
Looking at the above table, we can see the instability degree of each of the five components given its afferent and efferent coupling. We will not discuss all of them, only the one in the middle.
In the case of component C, we can see that there are no outgoing or incoming integration points, so we can basically consider it perfectly isolated from the rest of the system. But, rationally thinking, what kind of component is this?! One that does not depend on any other component and, at the same time, that no other component depends on? This is exactly the scenario we want to show you. In a microservices-based architecture, striving for microservices that are completely decoupled from each other will, in the end, lead us to a do-nothing system.
Going through dimensions
It seems that when talking about coupling in distributed systems we count just two dimensions: afferent coupling and efferent coupling. Well, there are more dimensions than just these two:
- platform coupling, also known as “interoperability”, which describes how interoperable the components of the system are;
- spatial coupling, already discussed in the previous post; the idea is that the components of the system should not be physically coupled to each other in any way, so that we can deploy them independently;
- temporal coupling, also introduced previously, which describes the impact that the (un)availability of one component has on the rest of the system.
We will not dive deeper into these concepts here, but we will mention them throughout this article when discussing various integration patterns. What is important is that we are dealing with five coupling dimensions, not just two.
Reconsidering our ideas
A system with no integrations inside is not a system at all! When designing one, we should not try to avoid coupling at any price and isolate each of its parts in its own cage; instead, we must keep in mind how the system's components should interact with each other. Now, coming back to microservices, we already discussed that a reactive system is built on top of a message-driven integration model. This is what we want to cover next, by looking together at some message-driven/event-driven integration patterns between microservices. First of all, we do not want to discuss REST APIs and how to build a microservices-based architecture using that architectural style. That topic deserves a separate article with a deep discussion of its pros and cons, because the industry has promoted it as the right choice when working with microservices.
Simple application publisher
The first type of message-driven/event-driven integration pattern, and probably one of the most used in production, is the event publisher. The scenario involves a microservices-based architecture where each microservice has its own responsibility and owns a database that is not shared with any other microservice. Also, in such an architecture, after performing some business processing, a microservice usually stores data by performing transactions in its local database.
The primary idea here is that during business processing, at certain specific points or at the end of the processing, an event can be published to the rest of the world via a message broker such as Kafka. In this way, other microservices can be triggered to perform their tasks and join a complex business flow.
The above diagram is a high-level representation of this implementation. This kind of design leaves our microservice coupled to nothing but its local datastore and the message broker, both of which are infrastructure components. The efferent coupling of our microservice is therefore 2, while the afferent coupling can be any number; if we apply this pattern across the entire architecture, the afferent and efferent coupling values of each microservice will be similar to those of the others.
From the spatial and temporal coupling point of view, we are decoupled: the message broker allows us to be unavailable, buffering the messages if needed, and the interaction is clearly asynchronous, which eliminates any time-related issues.
The major downside of such a design is that we lose atomicity between committing the transaction in the database and publishing the event to the message broker. In certain cases, one of the two operations can fail while the other succeeds.
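To make the problem concrete, here is a minimal sketch of this pattern, assuming a hypothetical orders service with a JDBC DataSource and the standard Apache Kafka producer (table, topic, and class names are illustrative); the gap between the database commit and the send is exactly where atomicity is lost:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import javax.sql.DataSource;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class OrderService {
    private final DataSource dataSource;                  // local database, owned by this service
    private final KafkaProducer<String, String> producer; // message broker client

    public OrderService(DataSource dataSource, KafkaProducer<String, String> producer) {
        this.dataSource = dataSource;
        this.producer = producer;
    }

    public void placeOrder(String orderId, String payload) throws Exception {
        // 1. Commit the business data in the local database.
        try (Connection conn = dataSource.getConnection();
             PreparedStatement stmt = conn.prepareStatement(
                     "INSERT INTO orders (id, payload) VALUES (?, ?)")) {
            stmt.setString(1, orderId);
            stmt.setString(2, payload);
            stmt.executeUpdate();
        }
        // 2. Publish the event. If the process crashes between step 1 and
        //    step 2, the order is stored but no event is ever published.
        producer.send(new ProducerRecord<>("orders-events", orderId, payload));
    }
}
```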
Put your messages inside the outbox
The issue encountered in the previous pattern is solved by the so-called outbox pattern. In this scenario, the infrastructure is preserved: we still have a local database and a message broker.
This pattern can be implemented by leveraging some additional software components, which can also be viewed as infrastructure. We can use two different solutions for implementing the outbox pattern: Kafka Connect from Confluent or Debezium.
These software constructs work by polling one or multiple tables in a specified database and publishing all the changes that have occurred to Kafka, at a specific time interval (although not necessarily based only on time intervals).
Coming back to our previous implementation, what we should change is that instead of publishing the event directly to Kafka, we store it, in a very specific format, in a table in the same database as the one where we store business data. Usually, this table is called Outbox. A high-level diagram showing this pattern follows.
It is worth noticing that the business tables and the outbox table live in the same transaction context, so the atomicity is not lost. Committing the data in both tables ensures that an event is published to the outside world only if storing the data in the business table succeeds.
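Here is a minimal sketch of the write side of the outbox pattern, again with hypothetical table and class names and plain JDBC; both inserts share one transaction, and a connector such as Debezium later picks the row up from the outbox table and publishes it to Kafka:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import javax.sql.DataSource;

public class OutboxOrderService {
    private final DataSource dataSource;

    public OutboxOrderService(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public void placeOrder(String orderId, String payload) throws Exception {
        try (Connection conn = dataSource.getConnection()) {
            conn.setAutoCommit(false);
            try (PreparedStatement order = conn.prepareStatement(
                         "INSERT INTO orders (id, payload) VALUES (?, ?)");
                 PreparedStatement outbox = conn.prepareStatement(
                         "INSERT INTO outbox (aggregate_id, event_type, payload) VALUES (?, ?, ?)")) {
                order.setString(1, orderId);
                order.setString(2, payload);
                order.executeUpdate();

                outbox.setString(1, orderId);
                outbox.setString(2, "OrderPlaced");
                outbox.setString(3, payload);
                outbox.executeUpdate();

                // Single commit: either both rows are stored or neither is,
                // so the event exists exactly when the business data does.
                conn.commit();
            } catch (Exception e) {
                conn.rollback();
                throw e;
            }
        }
    }
}
```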
Now, thinking about how the coupling is affected, we see that our microservice is no longer coupled to the message broker, as it was in the previous case; instead, we can consider that we have introduced some platform coupling into our system.
We must design with great care what the records in the outbox table look like and how they change over time, without breaking interoperability. To ensure this works well, we can use schema-based serialization mechanisms such as Avro or Protocol Buffers.
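As an illustration, the Avro Java API can be used to pin down the shape of an outbox record's payload so that it evolves deliberately; the record and field names below are hypothetical:

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

public class OutboxEventSchema {
    // Version 1 of a hypothetical OrderPlaced event. New fields should be
    // added with defaults so that existing consumers keep working.
    public static final Schema ORDER_PLACED_V1 = SchemaBuilder
            .record("OrderPlaced").namespace("com.example.events")
            .fields()
            .requiredString("orderId")
            .requiredString("customerId")
            .requiredLong("occurredAt")
            .endRecord();
}
```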
Capture the data changes
We can reuse the infrastructure presented for the outbox implementation, adding some additional components, to build a change-data-capture (CDC) streaming solution.
This pattern is acceptable in certain cases where the previous two patterns cannot be implemented, but it should still be avoided as much as possible, because it somewhat hides the coupling between our system's components.
The main idea here is that, using a Kafka Connect or Debezium instance, we can poll the database transaction log to check whether the data has changed. If so, we publish the changes to one or more Kafka topics (preferably one topic per table). The topics are then ingested by a specialized microservice implemented with a streaming solution such as Kafka Streams or Akka Streams. This microservice has the task of materializing the events into an aggregate that is subsequently published back to Kafka, to be used by the rest of the world.
The following diagram is meant to clarify a little what this implementation looks like.
We can observe the additional “Streaming Microservice” in the above diagram. From the operational point of view this is not very pleasant, but it can be very useful in certain cases, such as migrating from a monolith (a big ball of mud) to a reactive architecture.
Looking at the Streaming Microservice, we see that it subscribes to some message broker topics and pulls data from them. It is worth noticing that the messages it receives are of different types, but, by using a common identifier, they can still be joined into a single message, which is our aggregate. The aggregate is subsequently published back to Kafka.
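As an illustration, here is a minimal Kafka Streams sketch of such a streaming microservice. The topic names are hypothetical, and the string-concatenation “aggregate” is a stand-in; a real implementation would deserialize the CDC payloads and build a proper aggregate type:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KTable;

public class StreamingMicroservice {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "aggregating-microservice");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Two CDC topics, one per business table, keyed by the same identifier.
        KTable<String, String> orders = builder.table("cdc.public.orders");
        KTable<String, String> shipments = builder.table("cdc.public.shipments");

        // Join the latest state of both tables by key into a single aggregate
        // and publish it back to Kafka for the rest of the world.
        orders.join(shipments, (order, shipment) -> order + "|" + shipment)
              .toStream()
              .to("order-aggregates");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```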
Even with a more complicated design like this, our microservice is still decoupled from the rest. Spatial and temporal decoupling are still in place, and the only dimension affected is platform coupling, though no more so than in the outbox case.
Eagerly aggregating data
A somewhat simpler solution than CDC is to aggregate the data at the database level, instead of capturing and publishing granular changes to Kafka that are subsequently aggregated in a specialized streaming microservice.
How can this be done? Most RDBMSs have something called views. Yes, views! A view is nothing more than a virtual table backed by a SELECT statement that queries one or multiple tables each time the view itself is queried.
We can create such a view with a SELECT statement that joins all the tables necessary for producing the desired aggregate. Then, using the same Kafka Connect or Debezium instance, we can poll this view at specified time intervals to see whether any changes were produced in the meantime and, if so, publish them to Kafka.
The above diagram shows a high-level view of this way of publishing aggregates to the outside world. The coupling degrees remain in place, but this approach hides the coupling between the microservice's database and the external world more and more, which is not a good thing. Also, it is not best practice to publish the business data owned by the microservice directly, although it can sometimes be useful during a transition stage.
We can also perform specific transformations at the database level, using stored procedures or specialized functions, to aggregate the data into a specific format such as JSON or XML.
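As a sketch, assuming a PostgreSQL database with hypothetical orders and order_items tables, the aggregate view could be created like this (through plain JDBC, to stay consistent with the earlier examples); json_agg and json_build_object are PostgreSQL-specific:

```java
import java.sql.Connection;
import java.sql.Statement;
import javax.sql.DataSource;

public class AggregateViewSetup {
    public static void createView(DataSource dataSource) throws Exception {
        try (Connection conn = dataSource.getConnection();
             Statement stmt = conn.createStatement()) {
            // A virtual table joining the business tables into one aggregate
            // row per order; the line items are packed into a JSON column.
            stmt.execute(
                "CREATE VIEW order_aggregates AS " +
                "SELECT o.id, o.customer_id, o.updated_at, " +
                "       json_agg(json_build_object('sku', i.sku, 'qty', i.quantity)) AS items " +
                "FROM orders o " +
                "JOIN order_items i ON i.order_id = o.id " +
                "GROUP BY o.id, o.customer_id, o.updated_at");
        }
    }
}
```

A JDBC-based source connector can then poll order_aggregates incrementally, for example using the updated_at column, and publish any changed rows to Kafka.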
Keep in mind that this is not the kind of pattern to use in the long run. It is sometimes useful in specific cases, such as transitioning from a big ball of mud to a microservices-based architecture, but as soon as possible we should evolve this implementation into one of the approaches described earlier.
Pull the trigger!
The major downside of the above solution is that executing the query behind the aggregate view every time the view is queried can cause performance issues. If that query takes too long to execute, then we should think of another way to solve our problem.
One solution is to use a table instead of a view; let's call it the EventStore table. We can use triggers that insert change events into this EventStore whenever data changes in the business tables of our microservice. In this way, by polling the EventStore, we can obtain all the changes produced in a specific time interval without ever executing a costly query.
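A minimal sketch of such a trigger, again assuming PostgreSQL and the hypothetical orders table (PostgreSQL triggers delegate to a function; other databases use slightly different syntax):

```java
import java.sql.Connection;
import java.sql.Statement;
import javax.sql.DataSource;

public class EventStoreSetup {
    public static void install(DataSource dataSource) throws Exception {
        try (Connection conn = dataSource.getConnection();
             Statement stmt = conn.createStatement()) {
            // The table we poll instead of a costly aggregate view.
            stmt.execute(
                "CREATE TABLE IF NOT EXISTS event_store (" +
                "  id BIGSERIAL PRIMARY KEY, " +
                "  aggregate_id TEXT NOT NULL, " +
                "  payload JSONB NOT NULL, " +
                "  created_at TIMESTAMPTZ NOT NULL DEFAULT now())");

            // The trigger function records every change as a row in event_store.
            stmt.execute(
                "CREATE OR REPLACE FUNCTION record_order_change() RETURNS trigger AS $$ " +
                "BEGIN " +
                "  INSERT INTO event_store (aggregate_id, payload) " +
                "  VALUES (NEW.id::text, to_jsonb(NEW)); " +
                "  RETURN NEW; " +
                "END; $$ LANGUAGE plpgsql");

            // Fire on every change to the business table (PostgreSQL 11+ syntax).
            stmt.execute(
                "CREATE TRIGGER orders_change AFTER INSERT OR UPDATE ON orders " +
                "FOR EACH ROW EXECUTE FUNCTION record_order_change()");
        }
    }
}
```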
The diagram below makes things more explicit.
The picture is somewhat similar to the one for the aggregate-view solution, but in this scenario we are no longer using a view but a real table, and we are not querying any table other than the EventStore.
This solution is very efficient in terms of performance, but, like the one in the previous scenario, it should be avoided as much as possible.
Conclusions
We have seen various ways of integrating microservices in a microservices-based architecture. We have also seen how the foundation of the Reactive Manifesto (asynchronous, message-driven communication) can be created by using multiple patterns for well-designed coupling.
All the presented patterns have been used in production-ready projects and worked as expected. The downsides mentioned for each pattern are worth taking into consideration before implementing any of them.
If you want more details about these solutions or how to evolve them, I suggest looking into “Designing Data-Intensive Applications” by Martin Kleppmann.