Resource quotas
When many teams or users share a Nomad cluster, a single user could consume more than their fair share of resources. Resource quotas give cluster administrators a mechanism to restrict resource usage within a namespace.
Once you attach a quota specification to a namespace, the Nomad Enterprise cluster counts all resource usage by jobs in that namespace toward the quota limits. If the namespace exhausts its quota, new allocations within the namespace queue until resources become available, either because other jobs finish or because the quota is expanded.
We recommend that you enable resource quotas for shared environments where multiple teams or applications run on the same Nomad Enterprise cluster. The recommendations follow. For a step-by-step implementation guide, visit the Resource Quotas tutorial page.
Quota specification
Quota specifications are first-class objects in Nomad Enterprise. A quota specification has a unique name, an optional human-readable description, and a set of quota limits. The quota limits define the allowed resource usage within a region.
Quota objects are shareable among namespaces. This allows an operator to define higher-level quota specifications. For example, an operator can define a "Team-A" quota specification and apply it to multiple namespaces.
Define resource quotas in a quota specification file, separate from the job specification, and register it with nomad quota apply. An example is below.
name = "team-a-quota"
limit {
  region = "global"
  region_limit {
    cpu    = 2000 # MHz
    memory = 4096 # MB
  }
}
Note
It is crucial to properly design your namespace structure and workload placement within those namespaces, considering resource requirements and cluster capacity.
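To make the accounting concrete, the following is a minimal sketch of a job that would count toward the team-a-quota limits shown earlier, assuming the job runs in a namespace with that quota attached. The job name, group name, task name, datacenter, and container image are hypothetical and chosen only for illustration.

```hcl
# Hypothetical job; all names and the image are illustrative assumptions.
job "example-web" {
  # Assumes this namespace has the "team-a-quota" specification attached.
  namespace   = "team-a-namespace"
  datacenters = ["dc1"]

  group "web" {
    # Two allocations, each requesting the resources below.
    count = 2

    task "server" {
      driver = "docker"

      config {
        image = "nginx:1.25"
      }

      resources {
        cpu    = 500  # MHz per allocation
        memory = 1024 # MB per allocation
      }
    }
  }
}
```

With count = 2, this job requests 1000 MHz and 2048 MB in total, half of the 2000 MHz and 4096 MB quota. Raising count to 5 would request 2500 MHz, which exceeds the CPU limit, so you would expect the extra allocation to queue until usage drops or the quota grows.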
Applying quotas to namespaces
After you define and apply a quota specification, attach it to one or more namespaces. The following example applies a quota to a namespace.
- Add the quota parameter to your namespace specification file:
name        = "team-a-namespace"
description = "Namespace for Team A."
quota       = "team-a-quota"
- Run nomad namespace apply ./anamespace.hcl
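Because quota objects are shareable among namespaces, a second namespace can reference the same quota specification. The sketch below assumes a hypothetical second namespace named team-a-batch. Both the namespace name and the file name are illustrative and do not come from the original guidance.

```hcl
# team-a-batch.hcl (hypothetical file name)
# A second namespace that reuses the same quota specification.
name        = "team-a-batch"
description = "Batch workloads for Team A."
quota       = "team-a-quota"
```

Apply it the same way, with nomad namespace apply ./team-a-batch.hcl. Resource usage from both namespaces is then expected to count toward the shared team-a-quota limits, so size the quota for the combined workload.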
Version control
We recommend keeping your namespace and quota specifications in version control for auditability and troubleshooting, and deploying them through a build pipeline with a tool such as Terraform. The Terraform Nomad provider offers a convenient way to manage quotas and namespaces within your pipeline.
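As a rough sketch of what managing these objects with the Terraform Nomad provider might look like, the following declares a quota specification and a namespace that references it. It assumes the provider's nomad_quota_specification and nomad_namespace resources and an already configured provider block. The resource labels are hypothetical, and you should verify attribute names against the documentation for the provider version you use.

```hcl
# Sketch only: assumes the Nomad provider is already configured.
resource "nomad_quota_specification" "team_a" {
  name        = "team-a-quota"
  description = "Quota for Team A"

  limits {
    region = "global"

    region_limit {
      cpu       = 2000 # MHz
      memory_mb = 4096 # MB
    }
  }
}

resource "nomad_namespace" "team_a" {
  name        = "team-a-namespace"
  description = "Namespace for Team A."

  # Referencing the resource attribute lets Terraform create the quota first.
  quota = nomad_quota_specification.team_a.name
}
```

Keeping both resources in the same configuration means a reviewer sees quota changes and namespace assignments in a single diff.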
Monitoring
Monitor resource usage to ensure teams stay within their quotas. Nomad publishes several metrics that you can use for monitoring and alerting on quotas. To publish quota utilization metrics, set the following parameter to false in the telemetry block of your Nomad agent configuration.
telemetry {
    disable_quota_utilization_metrics = false
}
Use the nomad.nomad.blocked_evals.total_quota_limit metric to alert you when jobs block because a quota limit has been reached.
Use the nomad.quota.utilization.cpu, nomad.quota.utilization.cores, and nomad.quota.utilization.memory_mb metrics to track resource consumption against quotas.
Filter these metrics by quota name, namespace, or region so that reports reflect the correct limits.
ACLs
Strict access control prevents users from bypassing or modifying resource quotas without proper authorization.
Accomplish this with the quota block within an ACL policy.
See the ACL section for more details.
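For illustration, a minimal ACL policy sketch for a team member might grant write access to the team's namespace while limiting quota access to read-only. The namespace name reuses the earlier example, and the choice of policy levels is an assumption about a typical setup, not a prescribed configuration.

```hcl
# Sketch of an ACL policy for Team A members (assumed access levels).
namespace "team-a-namespace" {
  # Team members can submit and manage jobs in their own namespace.
  policy = "write"
}

quota {
  # Team members can view quota specifications and usage,
  # but cannot create, change, or delete them.
  policy = "read"
}
```

Operators who manage quotas would hold a separate policy with quota write access or use a management token.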
Federated clusters
Nomad Enterprise replicates quota specifications in a federated cluster from the authoritative Nomad Enterprise region. This allows operators to interact with a single cluster but create quota specifications that apply to all Nomad Enterprise clusters. For example, you can create a single quota specification that defines multiple regions, each with its own limits.
name        = "federated-example"
description = "A single quota spec affecting multiple regions"
limit {
    region = "europe"
    region_limit {
        cpu = 20000
        memory = 10000
    }
}
limit {
    region = "asia"
    region_limit {
        cpu = 10000
        memory = 5000
    }
}
When you apply this quota specification in the authoritative region, Nomad Enterprise replicates it to all federated clusters.
Communication
Ensure that all teams are aware of their resource quotas and the importance of adhering to them. Regular communication, monitoring dashboards, and alerting help avoid unexpected resource exhaustion and reduce the burden on Nomad platform operators.