First, open the Cluster Definitions page from the dropdown menu at the top right of the page.
The Cluster Definition type determines which tasks a definition can be used in and which options can be configured for it.
|===
|Cluster Definition Type |Description |Used in Tasks

|Databricks Cluster (Azure)
|Spins up Apache Spark clusters on the fly using Azure-based Virtual Machine configurations.
|Databricks, Spark SQL Statement

|Azure Batch Pool
|Creates an Azure Batch Pool.
|Azure Batch Task

|Azure Batch Container Pool
|Creates an Azure Batch Container Pool.
|Azure Batch Task
|===
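
If you script checks around your task setup, the type-to-task mapping in the table above can be expressed as a small lookup. This is only an illustrative sketch: the dictionary and the `can_run` helper are hypothetical names, not part of the product, and it assumes that "Databricks" and "Spark SQL Statement" are two separate task types.

```python
# Hypothetical lookup built from the table above; not a product API.
# Assumption: "Databricks, Spark SQL Statement" lists two separate task types.
TASKS_BY_CLUSTER_DEFINITION_TYPE = {
    "Databricks Cluster (Azure)": {"Databricks", "Spark SQL Statement"},
    "Azure Batch Pool": {"Azure Batch Task"},
    "Azure Batch Container Pool": {"Azure Batch Task"},
}


def can_run(cluster_definition_type: str, task_type: str) -> bool:
    """Return True if the task type is listed for the given cluster definition type."""
    allowed = TASKS_BY_CLUSTER_DEFINITION_TYPE.get(cluster_definition_type, set())
    return task_type in allowed
```

For example, `can_run("Azure Batch Pool", "Azure Batch Task")` is true, while `can_run("Azure Batch Pool", "Databricks")` is false.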
The next page of the form prompts you to configure the specifications and software used in the cluster. If you need a refresher on what makes up a Cluster Definition, read Cluster Definition Concepts.
No validation steps are required for Cluster Definitions, because all information is populated based on the type and is verified against the cloud provider.
There are three different cluster types.
You can create an Azure Batch Pool or an Azure Batch Container Pool.
For both cluster types, you also need to choose the Region and the Connection.
Choose a region from the dropdown list. (Your chosen region may affect the OS Configurations available on the next page.)
Your chosen region must be the same as your Azure Batch account region.
Choose a connection from the next dropdown, which lists the available connections to Azure Batch.
Then choose whether this cluster definition will be available to all projects or only to selected projects.
If you choose selected projects, you can then choose from a list of all projects in this tenant. This cluster definition will only be available in these projects and will not be displayed when creating tasks in other projects.
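
The visibility rule described above can be sketched as a simple check. This is a minimal illustration, assuming a hypothetical `Visibility` class and field names that are not taken from the product:

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Visibility:
    """Hypothetical model of the project availability choice; not a product API."""

    all_projects: bool = True
    selected_projects: Set[str] = field(default_factory=set)

    def is_visible_in(self, project: str) -> bool:
        # Available everywhere, or only in the explicitly selected projects.
        return self.all_projects or project in self.selected_projects


# A definition limited to two (made-up) projects.
limited = Visibility(all_projects=False, selected_projects={"sales", "finance"})
```

With this sketch, `limited.is_visible_in("sales")` is true and `limited.is_visible_in("marketing")` is false, which mirrors the definition not being displayed when creating tasks in other projects.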
After you have chosen your projects, click Next.
Select an OS Configuration. You can filter the OS Configuration dropdown to hide unverified OS Configurations and those that have expired or will soon expire. Your previously selected region may also affect which OS Configurations are available.
The options in the dropdown list differ depending on these filter selections. (The following image does not contain OS Configurations that are unverified or will soon expire.)
Choose an Azure Virtual Machine Type. The VMs available will change depending on your chosen hosting Region and capabilities.
Then choose the number of Minimum Workers for this cluster definition. This is the minimum number of processes used to run tasks.
You can also set the number of Maximum Workers. This field is optional and can be left blank. If you specify a value, the cluster will automatically scale based on the workload.
Providing a number of Maximum Workers may result in higher running costs.
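
The relationship between the two worker fields can be sketched as follows. This is an illustrative sketch under stated assumptions, not the product's validation logic: the function name is hypothetical, and it assumes that a blank maximum means a fixed-size cluster, that the minimum cannot be negative, and that the maximum cannot be lower than the minimum.

```python
from typing import Optional


def worker_mode(minimum: int, maximum: Optional[int] = None) -> str:
    """Classify a worker configuration as "fixed" or "autoscale" (hypothetical helper)."""
    if minimum < 0:
        # Assumption: a negative worker count is never valid.
        raise ValueError("Minimum Workers cannot be negative")
    if maximum is None:
        # Maximum Workers left blank: assumed to mean no automatic scaling.
        return "fixed"
    if maximum < minimum:
        # Assumption: the maximum must be at least the minimum.
        raise ValueError("Maximum Workers cannot be lower than Minimum Workers")
    # Maximum Workers specified: the cluster scales with the workload.
    return "autoscale"
```

For example, `worker_mode(2)` returns `"fixed"` and `worker_mode(2, 8)` returns `"autoscale"`.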
You can specify a container image. Provide the name or path of your chosen container image.
You can choose a connection that connects to a custom container registry, so that you can use container images that are not available from Docker Hub or other container libraries.
Select the connection to the Container Registry from the dropdown.
Finally, submit this cluster definition to save it. It will then be ready to use in your tasks.