Advanced job scheduling

By default, the Nomad scheduler uses a bin-packing algorithm to optimize the resource utilization and density of applications in your Nomad cluster. When you need more fine-grained control over allocation placement, the job specification provides additional blocks. These enable use cases such as the following:

Expressing preference for a certain class of nodes for a specific application with the job specification affinity block.
Spreading allocations across a datacenter, rack or any other node attribute or metadata with the job specification spread block.

Scheduling performance

For each allocation in a service or batch job, the Nomad scheduler iterates over nodes until it finds a small number of feasible nodes. The scheduler then scores those feasible nodes to find the best placement. The exact number of nodes scored depends on the job specification. Using the affinity or spread block can have a significant impact on scheduling performance.

No affinity or spread

When you omit the affinity and spread blocks, the node limit for batch jobs is two. For service jobs, the node limit is the larger of two and the log₂ of the total number of nodes in the datacenter and node pool.
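
To make the arithmetic concrete, the following is a minimal Python sketch of the node limits described above. It is an illustration, not Nomad's implementation: the function name node_limit is hypothetical, and rounding log₂ up to the next whole number is an assumption made here so that the result is an integer.

```python
import math


def node_limit(job_type: str, total_nodes: int) -> int:
    """Estimate how many feasible nodes are scored per allocation
    when the job has no affinity or spread block."""
    if job_type == "batch":
        # Batch jobs always use a limit of two.
        return 2
    if job_type != "service":
        raise ValueError(f"unsupported job type: {job_type!r}")
    if total_nodes < 1:
        # With no nodes to consider, fall back to the floor of two.
        return 2
    # Assumption: log2 is rounded up to a whole number of nodes.
    log_limit = math.ceil(math.log2(total_nodes))
    # The limit never drops below two.
    return max(2, log_limit)


for nodes in (3, 100, 5000):
    print(nodes, node_limit("service", nodes), node_limit("batch", nodes))
```

Under these assumptions, a service job in a 5,000-node pool scores 13 nodes per allocation, while a batch job in the same pool scores two. The limit grows slowly with cluster size, which is why the default path stays fast.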

You can reduce scheduling times by avoiding affinity and spread. Instead, rely on the default distribution of a job across multiple nodes. If this is not possible, you may consider reducing the size of the node pool or datacenter to reduce the number of nodes available for the scheduler to consider.

With affinity or spread

When you include the affinity or spread block, the scheduler scores a number of nodes in the datacenter and node pool equal to the task group count, with a maximum of 100 per allocation. This can result in order-of-magnitude increases in scheduling times.

To increase placement randomization and reduce scheduler contention when using affinity or spread, set the node-limit-for-feasibility-checks scheduler configuration option. You may specify an upper limit on the number of feasible nodes Nomad should consider when scheduling a job.

Lower numbers result in better scheduler performance and more randomization of jobs across nodes.
Higher numbers result in more deterministic application of spread or affinity.

Reducing the default upper node limit of 100 may reduce the increase in scheduling time, but it may also reduce the tightness of bin-packing and how strongly the Nomad scheduler scores affinity or spread.

For a mathematical and graphical explanation of how node limit affects spread and affinity scheduling performance, refer to the GitHub pull request comments.

Monitoring

To monitor scheduling times potentially impacted by affinity or spread blocks, examine the nomad.nomad.worker.invoke_scheduler.* metrics listed in the Key Metrics table.
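
As one way to inspect these metrics, the following Python sketch filters a metrics payload for the scheduler timers. It assumes the payload has already been fetched as JSON and that timers appear in a "Samples" list with "Name" and "Mean" fields. The function name scheduler_timings, the .service and .batch suffixes, and the values in the sample payload are hypothetical and only illustrate the shape.

```python
import json
from typing import Any

PREFIX = "nomad.nomad.worker.invoke_scheduler."


def scheduler_timings(metrics: dict[str, Any], prefix: str = PREFIX) -> dict[str, float]:
    """Map each timer whose name starts with the prefix to its mean value."""
    timings: dict[str, float] = {}
    # Assumption: timers are listed under "Samples"; tolerate a missing or null list.
    for sample in metrics.get("Samples") or []:
        name = sample.get("Name", "")
        if name.startswith(prefix):
            timings[name] = sample.get("Mean", 0.0)
    return timings


# Hypothetical payload with invented values, for illustration only.
payload = json.loads("""
{
  "Samples": [
    {"Name": "nomad.nomad.worker.invoke_scheduler.service", "Mean": 12.5},
    {"Name": "nomad.nomad.worker.invoke_scheduler.batch", "Mean": 3.0},
    {"Name": "nomad.runtime.gc_pause_ns", "Mean": 100.0}
  ]
}
""")

print(scheduler_timings(payload))
```

Comparing these means before and after adding an affinity or spread block gives a rough indication of how much the block affects scheduling time.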

Guides

Refer to the following guides for using affinity and spread.