Troubleshooting Executors

This page collects common troubleshooting steps for developing and administering executors.

Disabling the auto-deletion of Executor VMs

The Executor host VMs are configured to automatically tear themselves down once all jobs in the queue are completed. This is the desired behaviour under normal circumstances, but it makes it harder to debug problems with the executor configuration or its connections. To prevent the VMs from automatically stopping:

  1. SSH into the VM
  2. Run sudo su to become the root user
  3. Remove (or rename) the /shutdown_executor.sh file

The VM should now persist after all jobs are completed.
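
Step 3 above can be sketched as a small shell helper. It renames the script rather than deleting it, so the script can be restored later. The function name and the .disabled suffix are illustrative assumptions, not part of the executor tooling.

```shell
# Hypothetical helper: rename the shutdown script so the VM is not torn down.
# The default path comes from the steps above; the ".disabled" suffix is an
# arbitrary choice for this sketch.
disable_auto_shutdown() {
  script="${1:-/shutdown_executor.sh}"
  if [ -f "$script" ]; then
    mv "$script" "$script.disabled" && echo "renamed $script"
  else
    echo "not found: $script" >&2
    return 1
  fi
}

# Usage, as root on the executor VM:
#   disable_auto_shutdown
```

To re-enable auto-deletion, rename the file back to its original name.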

Creating a Debug Firecracker VM

To create a temporary Firecracker VM for debugging purposes:

  1. SSH into the host VM
  2. Run sudo su to become the root user
  3. Run systemctl stop executor to stop the executor service
  4. Run export $(cat /etc/systemd/system/executor.env | xargs) to load the executor environment into your shell
  5. Run executor test-vm to generate a test Firecracker VM. The command prints output like:
    Success! Connect to the VM using
    $ ignite attach executor-test-vm-0160f53f-e765-4481-a81e-aa3c704d07bd
    
  6. Run the printed ignite attach <vm> command to get a shell on the Firecracker VM
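
As a rough sketch, the attach command can be pulled out of the test-vm output with a small filter instead of being copied by hand. The function name is hypothetical, and the filter assumes the command appears on a line prefixed with "$ ", as in the sample output above.

```shell
# Hypothetical helper: read `executor test-vm` output on stdin and print the
# "ignite attach ..." command it contains. Assumes the command is on a line
# that starts with optional whitespace followed by "$ ".
extract_attach_cmd() {
  sed -n 's/^[[:space:]]*\$ \(ignite attach .*\)$/\1/p'
}

# Usage, as root on the host VM (commands taken from the steps above;
# 2>&1 is used because it is not confirmed which stream the message goes to):
#   systemctl stop executor
#   export $(cat /etc/systemd/system/executor.env | xargs)
#   executor test-vm 2>&1 | extract_attach_cmd
```

The printed command can then be run as-is to attach to the VM.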

Recreating a Firecracker VM

If a server-side batch change fails unexpectedly, you can recreate the Firecracker VM that was generated for the batch change execution.

  1. Navigate to the failed execution page of the Batch Change

  2. Select a failed Workspace on the left and click the Diagnostics link on the right pane

  3. In the modal, expand the Setup step by clicking the text or the expansion arrow on the right

  4. Copy the command from the final step of Setup, which starts with ignite run

  5. SSH into the host VM

  6. Run sudo su to become the root user

  7. Run systemctl stop executor to stop the executor service

  8. Run export $(cat /etc/systemd/system/executor.env | xargs) to load the executor environment into your shell

  9. Paste in the command copied from the batch change. You may need to remove the --copy-files and --volumes directives, as those volumes and files may no longer exist on the VM. Also surround the value passed to --kernel-args in quotes so the shell treats it as a single argument

  10. Execute the command and wait for the VM to start

  11. Run ignite ps to list all currently running VMs

  12. Run ignite attach <vm id> to get a shell to the running VM
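
The cleanup in step 9 can be sketched as a filter that drops the --copy-files and --volumes directives from the copied command. This is a minimal sketch under stated assumptions: the function name is hypothetical, each flag is assumed to take one whitespace-free value (written either as "--flag value" or "--flag=value"), and quoting the --kernel-args value is still left as a manual step.

```shell
# Hypothetical helper: read a copied "ignite run ..." command on stdin and
# print it without the --copy-files and --volumes directives.
# Assumes values contain no spaces; tokens are split on whitespace.
strip_file_flags() {
  awk '{
    out = ""
    skip = 0
    for (i = 1; i <= NF; i++) {
      if (skip) { skip = 0; continue }                      # value of a dropped flag
      if ($i == "--copy-files" || $i == "--volumes") { skip = 1; continue }
      if ($i ~ /^--(copy-files|volumes)=/) { continue }     # --flag=value form
      out = out (out == "" ? "" : " ") $i
    }
    print out
  }'
}

# Usage (copied-command.txt is a hypothetical file holding the pasted command):
#   strip_file_flags < copied-command.txt
```

Review the printed command before running it, since the filter does not validate the remaining arguments.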