Adekoyejo Akinhanmi

Posted on Mar 2, 2018

Microservice data sharing and communication

#microservices #discuss #architecture

Hello,

I would really like to hear thoughts on microservice architecture from anyone willing to share. It can be language agnostic, but it's probably noteworthy that I am using SailsJS on Node Express. I would really appreciate some personal experience as well.

I would like to know:

1. When a service provides information to other services, how does one manage communication between them? Do you use REST or message queuing? What's best?
2. When data needs to be shared across services, for example to prevent sharing of tables, is it good practice to create a relationship table?

Thank you.

Top comments (14)

rhymes • Mar 3 '18

I think I would consider synchronising them with a messaging queue. It's very tempting to use HTTP/REST (easier and faster) and I do use it.

It scales well for a while but I think it gets more difficult when you have A LOT of microservices or you want to test the system and all the interactions between them.

Anyhow, careful design is paramount, regardless of the communication system.

Definitely do not share the database between them. Don't use the DB as the synchroniser: send messages and keep storage local to each service.
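As a rough illustration of "send messages and keep storage local", here is a minimal in-memory sketch. Everything in it is an assumption made for the example: the `InMemoryBus` class stands in for a real broker, and the `user.registered` topic, `registerUser` function and the two `Map` stores are hypothetical names, not part of any library.

```typescript
// Stand-in for a real message broker; delivers messages synchronously in-process.
class InMemoryBus<T> {
  private readonly handlers = new Map<string, Array<(message: T) => void>>();

  subscribe(topic: string, handler: (message: T) => void): void {
    const list = this.handlers.get(topic) ?? [];
    list.push(handler);
    this.handlers.set(topic, list);
  }

  publish(topic: string, message: T): void {
    for (const handler of this.handlers.get(topic) ?? []) {
      handler(message);
    }
  }
}

// Hypothetical message shape shared by both services.
interface UserRegistered {
  userId: number;
  email: string;
}

const bus = new InMemoryBus<UserRegistered>();

// User service: owns its own storage and announces changes on the bus.
const userStore = new Map<number, string>();
function registerUser(userId: number, email: string): void {
  userStore.set(userId, email);
  bus.publish("user.registered", { userId, email });
}

// Mailing service: keeps a private copy and never reads the user service's storage.
const mailingList = new Map<number, string>();
bus.subscribe("user.registered", (message) => {
  mailingList.set(message.userId, message.email);
});

registerUser(1, "ada@example.com");
console.log(mailingList.get(1));
```

With a real broker, delivery would be asynchronous and could fail or repeat, so handlers would normally need to tolerate duplicates.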

Adekoyejo Akinhanmi • Mar 3 '18 • Edited

Hey rhymes, thank you for your answer. I am definitely concerned about scaling and I am really trying to make well-informed startup decisions. Also, you've mentioned something I have been pondering, which is testing. If you don't mind, how would one go about testing? For example, I want to unit test each function within a microservice, but then I run into challenges and end up going the whole nine yards, i.e. integration testing, because a function might depend on another microservice for data. As such, I get confused about how to isolate it. I would appreciate it if you could provide insights on this as well.

Cheers.

rhymes • Mar 3 '18 • Edited

Hi Adekoyejo, I'm honestly not an expert on this topic but I think the hardest thing is testing interaction between multiple services.

As you mentioned, each service (which is a self-contained app) should have its own unit and functional tests, but you still need to test service A talking with B.

I would research the literature on it. Check Martin Fowler's website and these resources:
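One common way to keep a unit test isolated is to have the function depend on a small interface rather than on the remote service itself, and to pass in a stub during tests. The sketch below assumes hypothetical names throughout (`CustomerClient`, `buildGreeting`, `stubCustomers`), and the client is kept synchronous for brevity; a real client would most likely return a Promise.

```typescript
// Hypothetical contract that service A expects from service B.
interface CustomerClient {
  getCustomerName(customerId: number): string;
}

// Function under test: it lives in service A and only knows the interface.
function buildGreeting(customerId: number, customers: CustomerClient): string {
  const name = customers.getCustomerName(customerId);
  return `Hello, ${name}!`;
}

// Stub used only in unit tests: no network and no running service B.
const stubCustomers: CustomerClient = {
  getCustomerName: (customerId: number): string =>
    customerId === 42 ? "Ada" : "Unknown",
};

console.log(buildGreeting(42, stubCustomers)); // prints "Hello, Ada!"
```

This only covers the unit level. Whether the real service B actually honours the interface is what contract and integration tests are for.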

Pattern: Service Integration Contract Test

Spotify's Testing of Microservices

Setting up integration testing also seems like a really good way to design and discuss service boundaries. How would the client consume service A and B? Do you need A and B to then call service C? Do you have all services behind a single gateway so that it's transparent to the client?

There's no easy way to do microservices, that's for sure.

Frank Carr • Mar 3 '18

On #1, it kind of depends on the location and purpose of the service and the relationship between them. In C# Web API services I've done both REST and MSMQ. The goal was to make how it was used relatively transparent and switchable via a repository pattern. If I needed to, I could even swap in a plain old DLL to do the work.

On #2, when I was working with an Oracle backend, services had their own schemas with matching stored procedure packages. They might be accessing the same data tables underneath (usually legacy or common data ones) but the services and their callers wouldn't know that. Everything would appear separate at their level. In SQL Server, you can do a similar thing with separate databases.

Some tables do need to be shared. DRY should apply to databases as well. You (or your DBA) don't want to find yourself in the position of having to maintain dozens of customer tables, zip/postal code tables, tax tables and so forth. Data will get out of sync and cause you considerable headaches. Trying to do this is like cutting and pasting code.

Patrice Gauthier • Mar 3 '18

REST is for public APIs, but you will spend a lot of time getting it right, and you have to generate the docs when it's public.

Message queuing, you mean like RabbitMQ? It better fills the need of a distributed event architecture with the help of pub/sub and other patterns, but you can make direct calls with the RPC approach. I'm using that approach for 3 microservices at work, also because pub/sub will come in handy when we want to get notified by the payment microservice.

For APIs, GraphQL is coming as a game changer for public ones because the schema becomes the documentation. gRPC is something to watch for private APIs too.

Also, I suggest you only send IDs between microservices. They should have their own databases. You should be able to deploy/update/change a microservice without impacting the others much. At work, with this approach, one of the microservices I've created communicates with a 3rd party, keeps some data in its own database and needs very little maintenance. The initial development took me some time though.
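To make the "only send IDs" idea concrete, here is a small sketch of a message that carries identifiers only, with the receiving service updating its own data. The `PaymentCompleted` shape, the `orders` map and `handlePaymentCompleted` are all hypothetical names chosen for the example, and the `Map` stands in for the order service's private database.

```typescript
// Message from a hypothetical payment service: identifiers only, no payment details.
interface PaymentCompleted {
  type: "PaymentCompleted";
  paymentId: number;
  orderId: number;
}

type OrderStatus = "pending" | "paid";

// The order service's own storage (an in-memory stand-in for its private database).
const orders = new Map<number, { status: OrderStatus }>([
  [7, { status: "pending" }],
]);

// Handler inside the order service: resolves the ID against its own data.
// Returns false when the order is unknown to this service.
function handlePaymentCompleted(event: PaymentCompleted): boolean {
  const order = orders.get(event.orderId);
  if (order === undefined) {
    return false;
  }
  order.status = "paid";
  return true;
}

const handled = handlePaymentCompleted({
  type: "PaymentCompleted",
  paymentId: 1001,
  orderId: 7,
});
console.log(handled, orders.get(7)?.status);
```

Because the message holds no copied fields, the payment service can change its internal schema without the order service noticing, as long as the IDs stay stable.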

Adekoyejo Akinhanmi • Mar 3 '18

I am definitely considering GraphQL; however, my thoughts are: regardless of the technology being used, the architecture must be well designed for scaling. I haven't implemented GraphQL because the design of my architecture is far from complete, but I agree that it is going to be a game changer (I almost can't wait). I haven't really heard of gRPC, but I'll read up on it and see if it aligns with my situation.

Speaking of sharing IDs between microservices, I quite agree with this, but then I'd like to know: would you rather use incremental IDs or unique alphanumeric IDs, since data sharing is involved? Please provide other options you consider better.

Cheers.

Patrice Gauthier • Mar 3 '18

You scale according to the market you are targeting, trying to plan for what will come. "Well designed for scaling" is an imprecise goal. "Well designed to handle 10k concurrent users": now you have an idea of what you are looking for. Then you choose the technologies to reach that goal and a little above.

The designs for scaling to 10k and 500k users aren't the same. See this:
slideshare.net/AmazonWebServices/s...

As for IDs, in a database it's faster to look up numeric IDs than strings. Using unique incremental IDs is fine. The problems with microservices are many: latency, input validation, error management, concurrency, transactions, caching... What happens when requests are taking too long? What errors do you return and how do you handle them? How long and where do you cache the data when trying to solve the first problem?
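The "how long do you cache" question can be made explicit with a time-to-live. The sketch below is one possible shape, not a recommendation of specific values: `TtlCache` and `getOrLoad` are hypothetical helpers, the clock is injectable so the expiry can be tested, and the `load` callback stands in for a call to another service (kept synchronous here for brevity).

```typescript
// Small time-to-live cache; `now` is injectable so tests can control the clock.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) {
      return undefined;
    }
    if (this.now() >= entry.expiresAt) {
      // Expired: drop the entry so the caller reloads it.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Cache-aside lookup: use the cached value if it is fresh, otherwise call `load`.
function getOrLoad<V>(cache: TtlCache<V>, key: string, load: () => V): V {
  const cached = cache.get(key);
  if (cached !== undefined) {
    return cached;
  }
  const fresh = load();
  cache.set(key, fresh);
  return fresh;
}
```

A short TTL keeps data fresher at the cost of more calls to the slow service. Deciding what to return when `load` itself fails or times out is a separate choice that this sketch leaves open.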

Kasey Speakman • Mar 3 '18

Microservices talking directly to one another or sharing databases should probably be an exception and not a rule. Otherwise, the microservices are coupled together and cannot realize even the basic promise of microservices -- independent deployment.

See this Software Engineering answer I posted in a similar vein.

Manoj Ahir • Mar 7 '18 • Edited

1. I have used both REST and MQ (RabbitMQ) within a project. The architecture was like the facade pattern, so calls from the gateway to individual services were REST and notifications between services were on MQ. Now I would really like to use gRPC, as it's fast and supports versioning and polyglot architectures.
2. The question I would ask myself is: will this be easy to maintain over time?
As far as reads are concerned, I would populate a cache and allow other services to read directly, and keep ownership with one service for all other transactions (assuming each service has its own DB).

Mike Lowen • Mar 3 '18

In terms of inter-service communication, I don't believe it is an either/or scenario; rather, it depends on the use case you find yourself in. In my experience, the rules of thumb I've used are:

1. If the data from the other microservice is required to complete the transaction, then you should go for the synchronous communication method; otherwise go for asynchronous.
2. Go for the method of least coupling first, which is usually asynchronous messaging.

In the case of the data being required to complete the transaction, some thought should be given as to whether these should be separate services. While it is not always the case, it can be a smell that you may have gone too fine-grained on your service separation, and it is just something to be wary of.

When it comes to sharing data across services, I echo other commenters that you should avoid your services sharing a database. Aside from that, how you share data across your services depends on your performance profile. The options I've seen used are (in order of increasing complexity and performance):

1. Querying the other service's API when needed.
2. Querying the other service's API and caching the result as needed.
3. Building up a cache of the data in the service's datastore, populated by events emitted by the other service (in an event-driven system).

Again, the rule of thumb that I go by is to start with the least complex option and do some performance testing, only moving on to the next option when your performance dictates that it's necessary.
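The last option above, a local cache populated by events, might look roughly like the following. The event names and the `CustomerNameProjection` class are assumptions made for this sketch, and the `Map` stands in for the consuming service's own datastore.

```typescript
// Hypothetical events emitted by the service that owns customer data.
type CustomerEvent =
  | { type: "CustomerCreated"; id: number; name: string }
  | { type: "CustomerRenamed"; id: number; name: string }
  | { type: "CustomerDeleted"; id: number };

// Local read-only copy kept by a consuming service and updated only by events.
class CustomerNameProjection {
  private readonly names = new Map<number, string>();

  apply(event: CustomerEvent): void {
    switch (event.type) {
      case "CustomerCreated":
      case "CustomerRenamed":
        this.names.set(event.id, event.name);
        break;
      case "CustomerDeleted":
        this.names.delete(event.id);
        break;
    }
  }

  nameOf(id: number): string | undefined {
    return this.names.get(id);
  }
}

const projection = new CustomerNameProjection();
projection.apply({ type: "CustomerCreated", id: 1, name: "Ada" });
projection.apply({ type: "CustomerRenamed", id: 1, name: "Ada L." });
console.log(projection.nameOf(1));
```

Reads become local and fast, but the copy is only eventually consistent with the owning service, which is part of the added complexity mentioned above.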