Remote Task Drivers
Note: Remote Task Driver support is experimental and subject to backward incompatible changes between Nomad releases or deprecation. Please refer to the Upgrade Guide to find breaking changes.
Known Bugs: When a Nomad node running a remote task driver goes down, another node must be available and able to run the replacement allocation in order to take advantage of the remote task driver's ability to avoid restarting lost tasks. If a new node is not immediately available but is started later, it will start a new instance of the remote task instead of reconnecting to the old one. Follow #10592 for the fix.
Nomad 1.1.0 introduces support for Remote Task Drivers. Remote Task Drivers allow custom task driver plugins to execute tasks using nonlocal runtimes such as cloud container runtimes. Without this support, task driver plugins trying to manage remote tasks would run into the following problems:
When draining a node, Nomad stops all allocations on that node before rescheduling them.
When a node is down, Nomad reschedules all allocations onto other nodes.
These two behaviors are optimal for traditional task drivers where the task process is colocated with the Nomad agent. If the Nomad node is down or drained, the allocations should be considered down or be drained as well.
However, these two behaviors do not suit tasks executing on remote runtimes. If the Nomad node managing them goes down, a new Nomad node should be able to manage them without restarting the task. Likewise, if the Nomad node managing the remote task is drained, a new Nomad node should manage the remote task without requiring that it be stopped and restarted.
The Remote Task Driver feature in Nomad 1.1.0 improves these behaviors for custom plugins that advertise the RemoteTasks capability.
Caveats
Due to their experimental nature, there are a number of standard Nomad features that Remote Task Drivers do not support by default.
Resources
See Remote Task Drivers and Resources #10549 on GitHub for details. Comments, ideas, and use cases are welcome!
The resources block has not been altered for remote task drivers. Since remote tasks do not consume local resources, remote task drivers should not use the existing resources block and instead implement their own resource parameters in their task.config block.
Jobs using remote task drivers should use the minimum allowed resources in their task.resources block:
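As a rough sketch of what such a job could look like, the fragment below pairs a minimal task.resources block with driver-specific sizing in task.config. The driver name example-remote and the remote_cpu and remote_memory parameters are hypothetical and do not come from any real plugin. The values cpu = 1 and memory = 10 are assumed to be the smallest values Nomad accepts, so verify them against the Nomad version in use.

```hcl
# Illustrative only: "example-remote" and the keys inside config are
# hypothetical placeholders, not a real plugin's schema.
job "remote-example" {
  datacenters = ["dc1"]

  group "app" {
    task "app" {
      driver = "example-remote"

      # Sizing for the remote runtime is expressed through the driver's own
      # parameters (assumed names), since the task uses no local resources.
      config {
        remote_cpu    = 512
        remote_memory = 1024
      }

      # Keep the local reservation as small as possible.
      # Assumed minimums: 1 MHz of CPU and 10 MB of memory.
      resources {
        cpu    = 1
        memory = 10
      }
    }
  }
}
```

With this layout the scheduler reserves almost nothing on the Nomad node, while the plugin is free to interpret its own config parameters when it provisions the remote task.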
Nomad Client Features
Remote task drivers defer most Nomad client features to the driver plugin.
Since the allocation directory is local to the Nomad node, unless a remote task driver is able to remotely mount or copy its contents, the following features will be unavailable:
artifact - artifacts are downloaded to the local Nomad allocation directory.
dispatch_payload - dispatch payloads are written to the local Nomad allocation directory.
ephemeral_disk - ephemeral disks are local to the Nomad node and therefore not applicable to remote task drivers.
template - templates are rendered in the local Nomad allocation directory.
vault - a secret token may be retrieved but the task driver may not place the token file in the expected location.
volume - volumes and volume mounts assume tasks have access to the Nomad node's local disk and are unlikely to work with remote task drivers.
Furthermore, since networking is handled entirely by the remote runtime, the behavior of the following features is completely driver dependent: