In a Kubernetes cluster I'm building, I was quite puzzled when setting up Ingress for one of my applications—in this case, Jenkins.
I had created a Deployment for Jenkins (in the jenkins namespace), and an associated Service, which exposed port 80 on a ClusterIP. Then I added an Ingress resource which directed the URL jenkins.example.com at the jenkins Service on port 80.
Inspecting both the Service and Ingress resource with kubectl get svc -n jenkins and kubectl get ingress -n jenkins, respectively, showed everything seemed to be configured correctly:
$ kubectl get svc -n jenkins
NAME      TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
jenkins   ClusterIP   172.20.3.104   <none>        80/TCP    17m
$ kubectl get ing -n jenkins
NAME      HOSTS                 ADDRESS   PORTS   AGE
traefik   jenkins.example.com             80      17m
But when I visited the URL, I would get a 503:
$ curl -I http://jenkins.example.com/
HTTP/1.1 503 Service Unavailable
Vary: Accept-Encoding
Date: Wed, 24 Oct 2018 18:23:42 GMT
Content-Length: 19
Content-Type: text/plain; charset=utf-8
The Traefik logs weren't all that helpful (I have Traefik running as a DaemonSet), but did point to some sort of disconnect between the jenkins Service and the jenkins Deployment:
$ kubectl logs -l app=traefik -n ingress-controller
...
{"level":"warning","msg":"Endpoints not available for jenkins/jenkins","time":"2018-10-24T18:33:11Z"}
{"level":"warning","msg":"Endpoints not available for jenkins/jenkins","time":"2018-10-24T18:33:13Z"}
{"level":"warning","msg":"Endpoints not available for jenkins/jenkins","time":"2018-10-24T18:33:13Z"}
Eventually my Googling led me to this GitHub issue comment, which stated:
The likely culprit is that your Service's selector doesn't match any Pod's labels.
Sure enough, when I described the full jenkins Service, I noticed it had no associated Endpoints!
$ kubectl describe svc jenkins -n jenkins
Name: jenkins
Namespace: jenkins
Labels: app=jenkins
Annotations: <none>
Selector: app=jenkins,tier=frontend
Type: ClusterIP
IP: 172.20.3.104
Port: jenkins 80/TCP
TargetPort: 8080/TCP
Endpoints: <none>
Session Affinity: None
Events: <none>
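The empty Endpoints line is the symptom to look for. As a minimal sketch, the same check can be scripted. The has_endpoints helper below is hypothetical (it is not part of the original debugging session), and it assumes kubectl is already configured for the cluster in question:

```shell
#!/bin/sh
# has_endpoints SERVICE NAMESPACE
# Hypothetical helper: succeeds only if the Service's Endpoints object
# lists at least one ready address.
has_endpoints() {
  # Prints the ready Pod IPs, or nothing when the Service has no Endpoints.
  ips=$(kubectl get endpoints "$1" -n "$2" \
    -o jsonpath='{.subsets[*].addresses[*].ip}')
  [ -n "$ips" ]
}

# Example usage (assumed names, matching the Service described above):
#   has_endpoints jenkins jenkins || echo "no endpoints: check the Service selector"
```

A Service that fails this check is usually worth comparing against its Pods' labels before looking anywhere else.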
I realized the Selector labels I had defined did not match the jenkins Deployment labels I had defined. I changed the labels to match by editing the Service definition (kubectl edit svc -n jenkins), and then Traefik immediately started serving the traffic, and the Endpoints value was filled in with the Jenkins pod's IP address!
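The rule behind this fix is that a Service only picks up Pods whose labels include every key=value pair in its selector. The sketch below illustrates that rule in plain shell. selector_matches is a hypothetical helper, and the label strings in the usage comment are made-up examples rather than the actual values from the cluster above. The comma-separated inputs follow the format kubectl prints in the SELECTOR column of kubectl get svc -o wide and the LABELS column of kubectl get pods --show-labels.

```shell
#!/bin/sh
# selector_matches SELECTOR LABELS
# Hypothetical helper: both arguments are comma-separated key=value lists.
# Succeeds only if every selector pair also appears in the label list.
# The body runs in a subshell so the IFS and globbing changes do not leak.
selector_matches() (
  # A Service with no selector gets no automatically managed Endpoints,
  # so treat an empty selector as "no match".
  [ -n "$1" ] || exit 1
  labels=",$2,"
  IFS=,
  set -f  # disable globbing while the selector is split on commas
  for pair in $1; do
    case "$labels" in
      *",$pair,"*) ;;   # this selector pair is present on the Pod
      *) exit 1 ;;      # one missing pair is enough to exclude the Pod
    esac
  done
  exit 0
)

# Example usage (illustrative values only):
#   selector_matches "app=jenkins,tier=frontend" "app=jenkins,pod-template-hash=abc123" \
#     || echo "selector does not match this Pod"
```

Extra labels on the Pod are harmless; only a selector pair that is missing from the Pod prevents the match.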
Comments
SO glad I documented this when I ran into it. I ran into the exact same issue on another cluster today and was at my wit's end... googled the error from Traefik, came here, and BOOM, exact same problem :)
Maybe I should get my head checked out for making the same mistakes in multiple clusters through the years?
Definitely helpful. Thanks for sharing this even if you thought you were the only one it helped!
Thanks for your post. I got a 503 and I was missing an app selector in the service object.
spec:
  selector:
    app: grafana
Thanks a bunch, this pointed me to my exact issue.
What a relief!!!! I don't know how many odd things I have been thru now, and this was just it. Thank you for posting!
Thank you for this article Jeff, I finally could find out how to debug my ingress controller :)
Thanks Jeff, still a useful post after all these years.
For me the issue was a little different: I basically turned off my Raspberry Pi cluster and spun it back up, and everything came back online, but the service had no endpoints even though my selectors were correct.
It could be that the apps/pods came online before Traefik could initialize (not sure); deleting the Pod(s) forced the service endpoints to be updated.