Description
This was reported previously in #1899, but that issue was auto-closed. Creating a new issue because the problem still exists.
What happened:
When a Service is annotated for both external-dns and the aws-load-balancer-controller, the DNS records are created with the wrong IPs: the FQDN resolves to the pod IPs rather than the NLB's IPs.
What you expected to happen:
Ideally, external-dns would publish the right IPs the first time. Personally, I'd be happy with an "eventual consistency" behavior where the pod IPs are used while the NLB is coming online, and then updated to the NLB's IP addresses when the NLB is fully provisioned.
How to reproduce it (as minimally and precisely as possible):
Assumptions:
- a Route 53 hosted zone; I will be using "example.com" as a placeholder
- EKS cluster with external-dns and aws-load-balancer-controller installed.
- we experienced this issue with internal load balancers; the steps below assume you have some means (e.g. a VPN) of reaching the VPC DNS server. Standing up and connecting to an EC2 jumpbox is left as an exercise for the reader. I have not tested with a public NLB.
- follow the "Deploy the echoserver resources" instructions only
- apply the yaml below (don't forget to replace `example.com` with your actual domain):
```yaml
---
apiVersion: v1
kind: Service
metadata:
  name: external-dns-demo
  namespace: echoserver
  annotations:
    external-dns.alpha.kubernetes.io/hostname: demo.example.com
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
spec:
  type: LoadBalancer
  allocateLoadBalancerNodePorts: true
  externalTrafficPolicy: Cluster
  ports:
    - name: http
      port: 80
      targetPort: 8080
  selector:
    app: echoserver
```
- immediately after applying the Service yaml, run `kubectl get service -n echoserver external-dns-demo` and note the NLB hostname in the `EXTERNAL-IP` column. Attempt to resolve this hostname via nslookup; you should get an NXDOMAIN response.
- repeat the lookup every minute or so until you no longer get an NXDOMAIN response, and note the IPs the NLB hostname resolves to
- do an nslookup of `demo.example.com` and note the IPs it resolves to
Expected output of a successful lookup: `demo.example.com` resolves to the same set of IPs as the NLB hostname.
Actual output: `demo.example.com` resolves to the pod IPs (compare against `kubectl get pod -n echoserver -o wide`).
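The final comparison in the steps above can be sketched as a small check. This is illustrative only: the `resolve_ips` and `records_match` helpers and the sample IP sets are hypothetical, not part of external-dns.

```python
import socket

def resolve_ips(hostname):
    """Return the set of IPv4 addresses a hostname currently resolves to."""
    try:
        return {info[4][0]
                for info in socket.getaddrinfo(hostname, None, socket.AF_INET)}
    except socket.gaierror:
        return set()  # NXDOMAIN / record not yet propagated

def records_match(nlb_ips, fqdn_ips):
    """The record is only correct once the FQDN resolves to exactly the NLB's IPs."""
    return bool(nlb_ips) and nlb_ips == fqdn_ips

# Illustrative values only: with the bug present, the FQDN resolves to pod IPs
# rather than the NLB's IPs, so the two sets do not match.
nlb_ips = {"10.0.1.10", "10.0.2.10"}    # e.g. resolve_ips(nlb_hostname)
fqdn_ips = {"10.0.40.5", "10.0.41.7"}   # e.g. resolve_ips("demo.example.com")
print(records_match(nlb_ips, fqdn_ips))  # False while the bug is present
```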
Anything else we need to know?:
My experience mirrors @gavinandermerwe's reply here:
After a bit more digging, it feels like a race condition in how the Route 53 cache gets updated internally: the IP addresses become available before the A record for the load balancer does.
For me, when the NLB was provisioned by the aws-load-balancer-controller, I could see the hostname for the NLB, but it was several minutes before the hostname resolved.
Like Gavin, removing the external-dns annotation and re-adding it caused the DNS update to work as expected.
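For reference, that workaround can be sketched as below. The service name, namespace, and dry-run wrapper are assumptions taken from the repro above, not project tooling; the kubectl commands need cluster access to actually run.

```shell
#!/bin/sh
# Sketch of the workaround: remove and re-add the external-dns annotation so
# external-dns re-publishes the record with the NLB's IPs.
# DRY_RUN=1 (the default here) prints the kubectl commands instead of running
# them; set DRY_RUN=0 to execute against a real cluster.
DRY_RUN=${DRY_RUN:-1}
SVC="external-dns-demo"
NS="echoserver"
KEY="external-dns.alpha.kubernetes.io/hostname"

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$@"; else "$@"; fi
}

# A trailing '-' on the annotation key tells kubectl to remove it.
run kubectl annotate service -n "$NS" "$SVC" "${KEY}-"
# Give external-dns one sync interval (default 1m) to drop the stale record.
run sleep 60
# Re-adding the annotation causes external-dns to publish the record again.
run kubectl annotate service -n "$NS" "$SVC" "${KEY}=demo.example.com"
```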
Versions involved:
Environment:
- external-dns: 0.19.0
- kubernetes: 1.33
- DNS provider: route53
- aws-load-balancer-controller: 3.0