Wednesday, 11 November 2015

Micro Services - An Introduction

Micro services aren't a brand new concept, but rather the evolution of something that's been around for years - the distributed architecture. I've worked on a number of SOAs over the past 8 years and am currently working on a cloud-based product, so I was keen to see what micro services have to offer. This post is a high-level introduction to micro services, but some of the content is just as relevant to modern cloud-based distributed architectures in general.

Replacing the Monolith

Before we look at micro services, let's first take a look at the more traditional monolithic architecture, where an entire solution is built and deployed as a single artifact. The server side monolith should look familiar to most developers and typically consists of
  • HTML, JavaScript, CSS web client
  • A server side component that exposes a set of services to the client, interacts with the database and perhaps integrates with other external systems
  • A relational database or some other external data store
Fig 1.0 - Monolithic Architecture
While this architectural style has been around for years and served most of us well, it does have a number of downsides.
  • Tight Coupling - While we endeavor to keep services as loosely coupled as possible, this can be more difficult in a monolithic application. Even if we modularize our services into distinct units, it's all too easy to end up with components that are tightly coupled. A good example is the database schema that our services talk to. If multiple unrelated services talk to the same database schema then a certain level of coupling is inevitable. Model changes required by one service have a direct impact on all other services using that schema. This results in changes to a larger number of components and introduces additional regression effort for QA.
  • Change is Slow - Imagine we want to make a small change to one service in our application. We have to build and deploy the entire solution even though our change was isolated to a single service. This slows our ability to get new functionality live.
  • Harder to Scale - In order to horizontally scale a monolithic application we need to deploy multiple instances of the entire application. This is potentially inefficient as we may require extra capacity for just one service, but are forced to deploy multiple instances of the entire application. In an ideal world we'd scale only the parts of the application that need to scale.
  • Less Flexibility - We may have multiple development teams, each working on different functional areas of the application. If one particular team wants to release new functionality they must co-ordinate with all other teams and can only release when all parties are in a fit state to do so. This results in less frequent release cycles and often delays new functionality going live.

Micro Services

Micro services is an architectural style where a system is composed of a number of distinct services, each typically running on its own host and communicating with other services using lightweight integration technologies.
A micro services architecture helps address the limitations described above by splitting our system into autonomous services that can be deployed independently. Services are typically deployed to a dedicated host and communicate with one another using lightweight integration mechanisms such as REST or remote procedure calls. There is nothing to stop multiple services residing on the same host, but it makes sense to deploy them separately so that in the event of a host failure only a single service is affected.
The key benefit of splitting our application into distinct services is that it allows us to develop, deploy and manage the life cycle of each service independently. Code changes to a single service result in the deployment of just that service while the rest of the application remains unaffected. This allows a team to make changes to its service and deploy new functionality quickly, in contrast to the large coordinated effort required for a monolithic application.
Fig 1.1 - Micro Services Architecture

Loose Coupling

In order to achieve the level of autonomy described above, loose coupling between services is essential. If services are tightly coupled, a change in one service may have a direct impact on a consuming service and result in both services having to be redeployed at the same time. A micro service architecture seeks to avoid this kind of lock-step deployment by ensuring services remain as loosely coupled as possible. Loose coupling can be difficult to achieve and requires careful consideration by both service providers and consumers. Below are a number of key considerations that are fundamental to loose coupling:
  • Database
    • Services should avoid sharing a database schema with other services outside of their domain. A shared database schema means a shared data model and results in tightly coupled components. If we decide to evolve our data model we'll impact every other service that uses it. This means we can no longer evolve our service independently of other parts of the system and forces us to coordinate with the owners of other services in order to get our changes applied.
    • A dedicated database doesn't necessarily mean that each service needs its own database server. We could use a single database server, with each service having a dedicated schema.
    • Each service having its own data store provides greater flexibility and allows teams to choose the technologies that are right for them. For some teams that may be a traditional relational database, while for others a NoSQL data store might make more sense.
  • Service Interface 
    • Service interfaces should expose only the parts of the data model that are required. Exposing some form of data model to the client is essential, but it's important that this remains as lean as possible and does not contain any more information than is absolutely necessary.
    • Ensure that the data model you expose to clients is decoupled from your internal data model. Exposing your internal data model means exposing unnecessary implementation detail. For example, if your service deals with Customer information, the entity that represents a customer inside your service should not be exposed to clients. Instead you should expose a model with just the data required by the client and no more. This way you can evolve your internal data model without breaking the client. Keeping internal implementation detail hidden is key to ensuring loose coupling.
  • Integration Technology 
    • Choose an integration technology that lends itself to loosely coupled integration.
    • REST integration uses well defined web standards and is a popular choice for loosely coupled integration. REST is platform agnostic, allowing services written in different technologies to easily exchange data. It doesn't mandate specific messaging formats, giving us the flexibility to choose a format that suits us best. From experience, most RESTful services expose XML or JSON.
    • Avoid integration technologies that tightly couple client and service through a shared model. Exposing services using WSDL, for example, typically requires the consumer to generate a client-side stub based on the exposed interface. This can make it more difficult to evolve the service without breaking clients. Changes to the service WSDL often require clients to regenerate their client-side code in order to realign with the new service interface.
    • RESTful integration with plain XML or JSON over HTTP allows services to evolve their interface without necessarily breaking consumers.  
  • Clients
    • Achieving loose coupling requires discipline from service clients too. Obviously we won't always have control of the client applications calling our services, but where we do, the following points are worth considering.
    • Clients should consume services in a way that is tolerant to change, implementing what is known as a tolerant reader.
    • It's preferable that clients apply minimal validation and extract only the data they need from the service response, ignoring the rest.
    • If a service interface evolves and adds two new fields to an XML response, the client code can simply ignore the extra data. If required, the client code can be updated at some point in the future to read the new fields.
    • Clients that implement tolerant readers allow the services they consume to evolve without breaking changes.
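A tolerant reader only takes a few lines to implement. The sketch below is illustrative (the `read_order_summary` function, field names and JSON payloads are invented for the example): the client extracts just the fields it needs and silently ignores everything else, so new fields added by the service don't break it.

```python
import json

def read_order_summary(payload):
    """Tolerant reader: pull out only the fields we need, ignore the rest."""
    doc = json.loads(payload)
    # Extract just the data this client cares about; unknown fields are ignored.
    return {
        "order_id": doc.get("orderId"),
        "total": doc.get("total"),
    }

# Version 1 of the service response.
v1 = '{"orderId": "123", "total": 9.99}'
# Version 2 adds two new fields; the reader is unaffected.
v2 = '{"orderId": "123", "total": 9.99, "currency": "GBP", "discount": 0.0}'

assert read_order_summary(v1) == read_order_summary(v2)
```

The same idea applies to XML: read the elements you need by name rather than binding the whole document to a generated schema.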

Modeling Services on Business Concepts

Splitting our system into a set of loosely coupled services and defining the responsibilities and boundaries of those services is fundamental to successfully implementing a micro services architecture. We should start by identifying any natural boundaries in the business domain. Most organizations are split into a number of distinct business areas, each responsible for performing some specific function. Consider an on-line retailer as an example. We could break this type of business into the following areas (obviously such a business could have many more distinct areas, but for the sake of simplicity we'll go with the list below):
  • public facing web app where customers can browse products and place orders
  • payments processing 
  • warehouse order processing
  • sales and marketing department
  • finance department 
While each of these areas is responsible for performing a specific business task they cannot exist in isolation, and rely on interactions with other parts of the business. The point at which one business area interfaces with another can be thought of as a domain boundary. By mapping out distinct business areas and their boundaries with other parts of the system, we start to get a picture of how we might model the services in our micro service architecture.         

Warehouse order processing from the list above is an example of a business domain that could be modeled by a set of dedicated services. Such an approach would allow these services to evolve independently of services in other business domains. In theory a development team could build, test and deploy new functionality for its business domain without disrupting the wider application.

Dealing With Change

Of course it's the type of change that dictates the level of disruption to the wider application. If service boundaries/interfaces don't change and the service updates are internal to the business domain, the rest of the system should be insulated from the change. For example, a service may decide to change the way it persists data, by moving from a relational data store to NoSQL. While this may involve considerable change within the service, it remains an internal implementation detail and should not impact other parts of the application.

Service boundary change on the other hand involves altering the way a service integrates with external components and typically means the evolution of existing interfaces. This type of change has the potential to be more disruptive because changes to an exposed contract may break service clients. An example might be adding new fields to a REST endpoint. Such a change will require coordination with other parts of the business and the new interface will have to be tested to ensure it hasn't broken consumers. If an updated interface can't be handled gracefully by all clients, time will need to be set aside to allow client applications to make the required changes. Changes to a service interface that break clients are more painful because they require components from multiple business domains to be tested and deployed together.  This type of coupling is what we're trying to minimize with a micro services architecture. 

Robust Integration

An application consisting of many distinct services poses a number of challenges when it comes to component integration. The more granular we make our services, the more integration points we have to deal with, so it's important our service integrations are as robust as possible. The techniques mentioned below are applicable to any distributed architecture, but are particularly important for micro services where we're potentially dealing with a large number of remote components.
  • Retry Failed Remote Calls
    • Network outages are common even if our solution is deployed on robust cloud infrastructure. We must assume that remote calls will fail from time to time and put measures in place to deal with these failures.
    • Implementing a retry mechanism allows us to re-execute remote calls in the event of a network failure. This is especially useful for dealing with short term network glitches. A typical approach is to retry a call 3 times (configurable) with a short back-off period between each call. 
  • Circuit breakers 
    • Circuit breakers limit the number of times we attempt to call a slow or unresponsive service by monitoring previous failed attempts. 
    • After a predefined threshold has been reached the circuit breaker will trip; any further attempt to call the remote service will result in the circuit breaker skipping the remote call and immediately returning an error response.
    • From a client's perspective this means reduced latency, as we no longer have to wait for multiple attempts to call a service that will likely fail. An immediate error response from the circuit breaker allows the client to deal with the error right away.
    • It also benefits the target service by reducing the number of incoming requests and may provide a much needed opportunity for the service to recover if struggling under heavy load.  
  • Connection Pooling
    • A single connection pool can be quickly exhausted by multiple calls to one slow or unresponsive service. 
    • Exhausting the connection pool will result in other processes being unable to make remote calls at the same time.
    • A sensible approach is to have dedicated connection pools for certain outbound service calls. This means that in the event of one set of service calls running slowly, all available connections won't be monopolized and other remote calls can proceed.
  • Connection Timeouts
    • Sensible timeouts are important for ensuring we manage slow or unresponsive remote calls.
    • Timeouts should be fine-tuned for each remote call depending on expected performance or response times.
    • Using timeout values that are too long, or worse still no timeouts at all, can lead to unacceptable latency for client applications.
    • Timeout values that are too short result in failed calls that might have otherwise succeeded had the target service been given more time to process the request.
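The retry and circuit breaker techniques above can be sketched in a few lines. This is a minimal illustration rather than production code: the threshold and back-off values are arbitrary, and a real implementation (a library such as Netflix's Hystrix, for example) would also handle a half-open state, timeouts and thread isolation.

```python
import time

class CircuitBreakerOpen(Exception):
    """Raised when the breaker is open and the remote call is skipped."""
    pass

class CircuitBreaker:
    """Trips after `threshold` consecutive failures and fails fast thereafter."""
    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0

    def call(self, fn):
        if self.failures >= self.threshold:
            # Skip the remote call entirely and return an error immediately.
            raise CircuitBreakerOpen("circuit open, not calling remote service")
        try:
            result = fn()
            self.failures = 0  # a success resets the failure count
            return result
        except Exception:
            self.failures += 1
            raise

def call_with_retries(fn, retries=3, backoff=0.1):
    """Retry a remote call with a short back-off between attempts."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries, propagate the failure
            time.sleep(backoff)
```

In practice the two are combined: the retry wrapper handles short-term glitches, while the circuit breaker protects the client (and the struggling service) from repeated calls that are likely to fail.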


Monitoring

We know from experience that things can and will go wrong in a production environment. We should proactively monitor application health so that we can identify issues as soon as they happen and react accordingly. Once we've identified that there's an issue we need access to application metrics so that we can quickly identify the root cause and do something about it. Application metrics are important in any production environment but are of particular significance in a distributed architecture like micro services. An application consisting of many distinct services has the potential to fail at any point, so it's imperative we have a fine-grained view of each component's health so that we can quickly identify issues and move to resolve them.
  • Health Checks
    • Monitoring service health on AWS can be achieved by setting up health checks using Amazon's Elastic Load Balancer. We can configure the Elastic Load Balancer to send a periodic HTTP request to an endpoint exposed by our application. If the service responds successfully within a predefined period (say 3 seconds) we can assume that the server instance is healthy.
    • We can set different tolerances for different services or environments. For example, on a pre-production environment we might configure a response time of 8 seconds for health checks before deciding an instance is unresponsive. On a production environment we may have lower tolerances and decide an instance is unresponsive after 3 seconds.
  • Metrics
    • Real time application and infrastructure metrics are key to fault finding in a production environment and should be available for every service.
    • Below are some metrics that I've found useful troubleshooting issues in the past:
      • JVM metrics (heap and non-heap memory usage)
      • Thread pool metrics
      • Database connection pool metrics
      • Remote service call times
      • DB query times
      • Cache metrics
      • Host CPU usage
      • Host memory usage
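The health check endpoint itself is usually trivial to expose. The sketch below uses Python's standard library purely for illustration; the `/health` path, the response body and the `make_health_server` helper are conventions invented for the example, not anything mandated by the Elastic Load Balancer.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthCheckHandler(BaseHTTPRequestHandler):
    """Responds 200 OK on /health so a load balancer can poll the instance."""
    def do_GET(self):
        if self.path == "/health":
            # Cheap internal checks (e.g. database connectivity) could go here.
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"OK")
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        pass  # keep the example quiet

def make_health_server(port=8080):
    # The load balancer polls http://host:<port>/health periodically;
    # call serve_forever() on the returned server to start handling requests.
    return HTTPServer(("", port), HealthCheckHandler)
```

A slow or failing response here is what causes the load balancer to mark the instance unhealthy and stop routing traffic to it.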

Scaling Micro Services

It's probably safe to say that most micro service architectures are deployed on scalable cloud infrastructure. The product I'm currently working on runs on AWS, so I've seen first-hand how scalable cloud infrastructure can be used to scale an application to handle increased demand.
  • Scaling Vertically
    • Vertical scaling is when we increase host resources such as CPU, memory or disk storage. On cloud infrastructure this is typically the easiest way to scale a service to handle increased demand.
    • Scaling vertically is useful but is ultimately limited by the resources available on a single host instance. To achieve real scalability we need to look beyond single instances and consider running our components across multiple hosts.
  • Scaling Horizontally
    • Scaling horizontally is when we deploy multiple instances of a service across multiple hosts.
    • Cloud services like AWS allow us to route HTTP service requests through an Elastic Load Balancer, which distributes those requests across the various service instances.
    • In the event of an instance failure, the load balancer will continue to route requests to healthy instances, allowing clients to remain oblivious to the failure. The Elastic Load Balancer uses the health check mechanism we discussed earlier to decide whether or not an instance is healthy enough to accept requests.
  • Auto Scaling
    • Auto scaling is where we use events or infrastructure metrics to trigger a change in our infrastructure profile.
    • A failed health check is an example of an event that can trigger the provisioning of a new server instance, in this case to replace an instance that is no longer responsive.
    • CPU and memory metrics can also be used to trigger the provisioning of new instances.
    • Alarms can be created that, if triggered, will result in new server instances being started and registered with the load balancer. This is a great way to respond automatically to increased load on a service.
    • We can also use auto scaling to scale in when load on our services decreases. We could trigger a scale-in event by responding to server CPU or memory usage dropping below a predefined threshold. The ability to scale back in is important for managing cost in the cloud.
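The scale-out/scale-in logic above boils down to comparing a metric against two thresholds. The sketch below is purely illustrative: the `desired_instances` function and its threshold values are invented for the example, and on AWS this decision actually lives in CloudWatch alarms and an Auto Scaling group rather than in your own code.

```python
def desired_instances(current, avg_cpu, scale_out_at=70.0, scale_in_at=30.0,
                      min_instances=2, max_instances=10):
    """Decide how many instances we want based on average CPU across the fleet."""
    if avg_cpu > scale_out_at:
        desired = current + 1   # add an instance under heavy load
    elif avg_cpu < scale_in_at:
        desired = current - 1   # scale back in to manage cost
    else:
        desired = current       # within the comfortable band, do nothing
    # Never drop below the minimum fleet size or exceed the maximum.
    return max(min_instances, min(max_instances, desired))

# CPU at 85% -> scale out; CPU at 10% -> scale in, but never below the minimum.
assert desired_instances(4, 85.0) == 5
assert desired_instances(2, 10.0) == 2
```

The gap between the two thresholds matters: if they're too close together the fleet will oscillate, repeatedly scaling out and back in.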


Conclusion

Micro services is an evolution of the distributed architectures I've worked with in the past. It takes things a step further by encouraging a greater level of service granularity and therefore a larger number of distinct components. This introduces new complexity in terms of testing, deployment, integration and monitoring. This complexity can be offset to some degree with extensive automation, especially around testing and the provisioning of infrastructure.
As well as greater service granularity, micro services put a strong emphasis on splitting services by business domain and decoupling those domains as much as possible. This allows teams to develop, test and deploy new functionality independently of other parts of the system. This level of autonomy doesn't come easily and requires adherence to strict design principles such as dedicated data stores and data models.

A micro services architecture isn't something I'd adopt without very careful consideration. While micro services offer a number of significant benefits, they come at a cost in terms of complexity. Before deciding to go down the micro services route I'd need to be confident that the long term benefits justified the additional complexity.