Nomad is a workload orchestrator. It uses a generic specification for tasks, which can run as containers, VMs, Java apps, or plain processes on a server. The Docker driver lets you use Nomad as a pure container orchestrator.
Nomad is quite different to Docker Swarm and Kubernetes, and it’s not a complete solution on its own. In a production environment it becomes the compute part of a full HashiCorp stack, with Consul for service discovery and Vault for sensitive data storage.
In this episode we’ll see a couple of options for running Nomad, and deploying containerized applications.
Here it is on YouTube - ECS-O4: Containers in Production with Nomad
is.gd/idobeh - Black Friday, 40% off Pluralsight subscriptions!
is.gd/icitic - Docker for .NET Apps, sign up for notifications
is.gd/wipaxi - Elton on the Semaphore podcast, talking about the challenges of learning Kubernetes
Docker’s Voting App on Swarm, Kubernetes and Nomad - comparison by Docker Captain Luc Juggery
I’m running Linux VMs for the Nomad cluster using Vagrant.
You can set the VMs up with:
cd episodes/ecs-o4/vagrant
vagrant up
You can also try Nomad in an interactive lab.
The setup.sh script installs Docker and Nomad in the simplest way.
Connect to the dev VM and start the Nomad agent in dev mode:
vagrant ssh dev
nomad agent -dev
Dev mode starts a single agent in server + client mode, suitable for lab environments.
The drivers list shows the available workload types - docker for containers and raw_exec for plain binaries. Others are available, e.g. for Java apps, QEMU VMs, and isolated binaries.
Apps are modelled for Nomad using HCL. This simple app spec in whoami.nomad models a REST API running in a Docker container.
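The full spec is in the repo; as a rough sketch, the shape is something like this (the image name and task name are my assumptions, not the real values - the static port matches the curl command below):

# whoami.nomad - a minimal sketch of the shape, not the actual file
job "whoami" {
  datacenters = ["dc1"]

  group "whoami" {
    count = 1

    task "api" {
      # the Docker driver runs the task as a container
      driver = "docker"

      config {
        image = "sixeyed/whoami"   # hypothetical image name
        port_map {
          http = 80
        }
      }

      resources {
        network {
          # static port published on the host - hence curl localhost:8080
          port "http" {
            static = 8080
          }
        }
      }
    }
  }
}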
Restart Nomad in the background and deploy the app:
[Ctrl-C]
nomad agent -dev > /dev/null &
nomad node status
cd /ecs-o4
nomad job run whoami.nomad
Check the status of the job and the container:
nomad status whoami
docker ps
curl localhost:8080
Nomad servers are the control plane, monitoring jobs and tasks. Clients run user workloads. If a task fails on the client, the server will replace it.
Remove the container and check the job status:
docker ps -lq
docker rm -f $(docker ps -lq)
nomad status whoami
docker ps
curl localhost:8080
Networking in this setup is just port publishing - like host mode in Docker Swarm. You can’t scale up to more tasks than there are client nodes, because each task binds a static port on the host.
Edit the count in whoami.nomad to 2 and update:
nomad job run whoami.nomad
nomad status whoami
curl localhost:8080
Nomad doesn’t provide service discovery on its own - it expects to integrate with a Consul cluster. Nomad can use Consul for node discovery and for service discovery.
Try to deploy a distributed app like todo-list-dev.nomad and it will run, but the containers can’t find each other.
Run the to-do list app with db and web tasks:
nomad job run todo-list-dev.nomad
nomad status todo
docker ps
Check the status of the group:
nomad alloc status [ID]
nomad alloc logs [ID] web
Test the app:
curl -v http://localhost:8010
nomad alloc logs [ID] web
You can run Consul in dev mode too, but we’ll skip ahead to running a more production-like cluster.
Consul is a service catalogue and a DNS server. It works with Nomad to register services as DNS endpoints, and resolve DNS queries to task addresses - e.g. container IP addresses.
You can run a clustered Consul setup with server and client nodes. Each node runs a local agent, and that agent gets used for lookups.
The setup-prod.sh script installs Consul and runs it as a service. It also configures the VM to use Consul for DNS lookups.
Connect to the server and verify Consul is running:
vagrant ssh server
consul members
Get the IP address of the server, for the other nodes to join.
Join the client node to Consul:
vagrant ssh client
consul members
consul join [SERVER-IP]
And the second client node:
vagrant ssh client2
consul join [SERVER-IP]
consul members
Now all the nodes are in a Consul cluster, providing shared service discovery.
A Nomad cluster runs as server and client nodes - typically 3 servers for production. We’ll use a basic setup in server.hcl which uses a single server node.
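As a sketch, a single-server config like server.hcl needs little more than this (the data_dir path is an assumption, not the actual file):

# server.hcl - minimal sketch of a single-server setup
data_dir = "/var/lib/nomad"

server {
  enabled          = true
  bootstrap_expect = 1   # single server here; use 3 for production
}

client {
  enabled = false   # the server is control plane only, no user workloads
}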
Start Nomad on the server:
vagrant ssh server
cd /ecs-o4
nomad agent -config server.hcl > /dev/null &
nomad node status
The server node is the control plane. Server nodes are only for management and won’t run any user workloads.
We’ll add the other VMs as clients to the cluster using the configuration in client.hcl. Nomad will use Consul to find the Nomad server.
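A client config like client.hcl can be as small as this sketch (again the data_dir path is an assumption; with a local Consul agent, Nomad discovers the server automatically):

# client.hcl - minimal sketch; no server address needed,
# Nomad uses the local Consul agent to find the Nomad server
data_dir = "/var/lib/nomad"

client {
  enabled = true
}

consul {
  address = "127.0.0.1:8500"   # default local Consul agent address
}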
Join the client to the Nomad cluster:
vagrant ssh client
consul catalog services
cd /ecs-o4
nomad agent -config client.hcl > /dev/null &
And the second client:
vagrant ssh client2
cd /ecs-o4
nomad agent -config client.hcl > /dev/null &
Check the cluster status:
vagrant ssh server
nomad node status
Now we have a Nomad cluster with multiple nodes and Consul integration.
The same whoami.nomad spec will work on the cluster. With two nodes we can run two instances of the group.
Deploy the job:
cd /ecs-o4
nomad job run whoami.nomad
nomad status whoami
nomad alloc status [ID]
Browse to the allocation address.
Port publishing is like host mode in Swarm - the port is only available on the node hosting the task. You don’t get a routing mesh like Swarm’s ingress mode or Kubernetes NodePorts, where any node can receive incoming traffic and send it to the container.
With Consul integration we can publish services which route to tasks, so applications can use DNS to reach other components. The app spec in todo-list.nomad uses a service for the database, and configures DNS in the web container to use Consul.
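The relevant fragments look roughly like this sketch (image names are my assumptions; [SERVER-IP] is the placeholder you edit in the next step):

# fragments from a spec like todo-list.nomad - a sketch, not the real file
task "db" {
  driver = "docker"

  config {
    image = "postgres:11-alpine"   # hypothetical image
  }

  service {
    # registered in Consul - resolves as todo-db.service.consul
    name = "todo-db"
  }
}

task "web" {
  driver = "docker"

  config {
    image       = "sixeyed/todo-web"   # hypothetical image
    dns_servers = ["[SERVER-IP]"]      # send the container’s DNS queries to Consul
  }
}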
Edit todo-list.nomad to add the server IP for DNS lookup.
Deploy the app as one group:
nomad job run todo-list.nomad
nomad job status todo
nomad alloc status [ID]
All the tasks in the group run on the same node.
Check the database service is registered in DNS:
consul catalog services
dig @localhost SRV todo-db.service.consul
dig @localhost todo-db.service.dc1.consul
Browse to the allocation address for the app and test it.
Defining a distributed application as tasks in one group doesn’t give you a scalable solution. We’ll remove this deployment and run a more production-like set of jobs.
Stop the application job:
nomad job stop todo
There’s no delete functionality - old jobs get garbage-collected by the server.
The new specs break the database and web components into separate jobs, so each can be scaled and managed independently:
Deploy the database job:
cd /ecs-o4/todo-list-v2
nomad job run todo-db.nomad
nomad job status todo-db
Edit todo-web.nomad and add the server IP for DNS.
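For reference, a sketch of what the web job could look like on its own (the image, port and count are my assumptions, not the real spec):

# todo-web.nomad - a sketch of the web job on its own
job "todo-web" {
  datacenters = ["dc1"]

  group "web" {
    count = 1   # the web group can now scale independently of the database

    task "web" {
      driver = "docker"

      config {
        image       = "sixeyed/todo-web"   # hypothetical image
        dns_servers = ["[SERVER-IP]"]      # Consul DNS, via the server VM
        port_map {
          http = 80
        }
      }

      resources {
        network {
          port "http" {
            static = 8010   # assumed published port
          }
        }
      }
    }
  }
}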
Deploy the web job:
nomad job run todo-web.nomad
nomad job status todo-web
Browse to the allocation address and test the app.