Deploying an AWS PrivateLink for a Kafka Cluster
Kafka is a massively scalable way of delivering events to a multitude of systems. In th...
Hi Jonathan, I read your article and found it very helpful! Thank you for posting this method. I am new to AWS and Kafka, though, and I still have a few questions that I hope you can answer.
1. I suspect you set listeners, advertised.listeners, etc. in server.properties; please correct me if I am wrong. However, in AWS, where is server.properties located? On the EC2 instance, or somewhere else?
Thank you in advance and looking forward to your reply.
-hom
The properties file lives on the EC2 instance. It's part of the Kafka configuration files.
VPCE stands for virtual private cloud endpoint, which is synonymous with PrivateLink. Sorry, I should avoid acronyms unless I explain them. AWS uses PrivateLink and VPCE interchangeably.
kafka.vpce.dev is the Route 53 hosted zone I made up. Because of how Kafka works, both the provider (the VPC that hosts Kafka) and the consumer (the VPC that uses Kafka) must agree on a hostname pattern so that IP resolution through DNS works. The agreed-upon zone is the zone you put the consumer IPs in on the consumer VPC, and that's the zone Kafka must use in advertised.listeners.
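To make the "agreed hostname pattern" concrete, here is a minimal Python sketch that prints the advertised.listeners line each broker would carry. It is only an illustration: the zone kafka.vpce.dev and port 9092 come from this thread, but the per-broker hostname pattern b-<id>, the PLAINTEXT listener name, and the function name are assumptions made up for the example, not necessarily what the article uses.

```python
def advertised_listeners(broker_ids, zone="kafka.vpce.dev", port=9092,
                         listener_name="PLAINTEXT"):
    """Build one advertised.listeners line per broker.

    The "b-<id>.<zone>" hostname pattern is hypothetical; what matters is
    that the provider and consumer VPCs agree on the same names.
    """
    return {
        broker_id: (
            f"advertised.listeners={listener_name}://"
            f"b-{broker_id}.{zone}:{port}"
        )
        for broker_id in broker_ids
    }


# Print the line that would go into each broker's server.properties.
for broker_id, line in advertised_listeners([1, 2, 3]).items():
    print(f"# broker {broker_id}")
    print(line)
```

Whatever pattern you choose, the consumer VPC then needs DNS records in the same zone that map those hostnames to the IPs of its endpoint.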
Thank you for the swift reply! I have a few follow-ups.
I installed Kafka on my EC2 instance and there is a server.properties file located in /home/kafka_2.12-2.2.1/config. Is this the properties file to change? And how do I connect this server.properties with the Kafka I want to configure, since the file does not specify which Kafka it corresponds to? By the way, I created the Kafka cluster through the console, not the command line.
In terms of kafka.vpce.dev, can I put "DNS of endpoint:9092" in place of "kafka.vpce.dev:9092" in server.properties, so that I can do less work by skipping the Route 53 setup?
-hom
Yes, that is the file you would be editing. I am not entirely sure, though, what you mean by "which kafka" it corresponds to. You can start multiple Kafka processes on a server and they should all use that same config file by default. You'd have to tell a process to use a different one at startup, but I'm not entirely sure of your setup.
You COULD put "endpoint:9092" as the advertised listener. However, your clients will then receive "endpoint:9092" and try to resolve "endpoint" in their own VPCs. Unless they have "endpoint" configured in DNS to resolve to an IP of the PrivateLink, they will fail with an unknown host exception.
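A quick way to check this from a machine in the consumer VPC is to see whether the advertised hostname resolves at all. The following is a minimal Python sketch using only the standard library; the helper names split_host_port and can_resolve are made up for the example. In Python a failed lookup raises socket.gaierror, which corresponds to the unknown host exception a Java client would report.

```python
import socket


def split_host_port(listener):
    """Split 'host:port' or 'NAME://host:port' into (host, port)."""
    address = listener.split("://", 1)[-1]   # drop an optional listener name
    host, _, port = address.rpartition(":")  # split on the last colon
    return host, int(port)


def can_resolve(host, resolver=socket.getaddrinfo):
    """Return True if DNS resolves the host from where this code runs."""
    try:
        resolver(host, None)
        return True
    except socket.gaierror:
        # No DNS record is visible from this VPC for the advertised name.
        return False
```

For example, can_resolve(split_host_port("PLAINTEXT://kafka.vpce.dev:9092")[0]) should only return True on a machine whose VPC has the agreed zone configured.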
Hi Jonathan, sorry for not making my problem clear. On my EC2 instance, I only have one server.properties file, but there are many MSK clusters in this VPC. If I change server.properties, which cluster's properties am I changing, given that we cannot specify a cluster in server.properties?
Also, in the advertised.listeners=xxxx line, I have seen names like PLAINTEXT, INTERNAL_PLAINTEXT, VPCE, CLIENT, and CLIENT_SECURE. I'm confused by these names. Where can I set them? Or is there a list of names for different purposes that we should stick to?
Not many people have tried this approach, so I really appreciate your help, Jonathan!