RetryRunner plugin

The RetryRunner plugin implements retry logic to improve the reliability of task execution.

Warning

For grouped tasks, you must explicitly provide the connection_name attribute, such as netmiko, napalm, or scrapli. Specifying connection_name for standalone tasks is not required. If connection_name is missing, the connection retry logic is skipped and connections to all hosts are initiated simultaneously, up to the number of num_workers.

RetryRunner Architecture

[Figure: RetryRunner architecture diagram (RetryRunner_v0.png)]

RetryRunner Sample Usage

Instruct Nornir to use RetryRunner on instantiation:

from nornir import InitNornir

NornirObj = InitNornir(
    runner={
        "plugin": "RetryRunner",
        "options": {
            "num_workers": 100,
            "num_connectors": 10,
            "connect_retry": 3,
            "connect_backoff": 1000,
            "connect_splay": 100,
            "task_retry": 3,
            "task_backoff": 1000,
            "task_splay": 100
        }
    }
)
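The runner settings can also be assembled programmatically. The following is a minimal sketch, not part of the plugin: retry_runner_config and RETRY_RUNNER_DEFAULTS are hypothetical names, and the default values are copied from the RetryRunner signature in the reference section. It merges overrides into those defaults and rejects misspelled option names before they reach InitNornir.

```python
# Defaults copied from the RetryRunner signature in the reference section.
RETRY_RUNNER_DEFAULTS = {
    "num_workers": 100,
    "num_connectors": 20,
    "connect_retry": 3,
    "connect_backoff": 5000,
    "connect_splay": 100,
    "task_retry": 1,
    "task_backoff": 5000,
    "task_splay": 100,
    "reconnect_on_fail": True,
    "task_timeout": 600,
}


def retry_runner_config(**overrides):
    """Hypothetical helper: build the ``runner`` dict for InitNornir."""
    unknown = set(overrides) - set(RETRY_RUNNER_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown RetryRunner options: {sorted(unknown)}")
    return {
        "plugin": "RetryRunner",
        "options": {**RETRY_RUNNER_DEFAULTS, **overrides},
    }


# Override only the options that differ from the defaults.
runner = retry_runner_config(num_connectors=10, connect_backoff=1000)
```

The resulting dictionary has the same shape as the runner argument shown above and could be passed as InitNornir(runner=runner, ...).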

Sample code to demonstrate usage of RetryRunner, DictInventory and ResultSerializer plugins:

import yaml
import pprint
from nornir import InitNornir
from nornir.core.task import Result, Task
from nornir_netmiko import netmiko_send_command, netmiko_send_config
from nornir_salt.plugins.functions import ResultSerializer

inventory_data = '''
hosts:
  R1:
    hostname: 192.168.1.151
    platform: ios
    groups: [lab]
  R2:
    hostname: 192.168.1.153
    platform: ios
    groups: [lab]
  R3:
    hostname: 192.168.1.154
    platform: ios
    groups: [lab]

groups:
  lab:
    username: cisco
    password: cisco
'''

inventory_dict = yaml.safe_load(inventory_data)

NornirObj = InitNornir(
    runner={
        "plugin": "RetryRunner",
        "options": {
            "num_workers": 100,
            "num_connectors": 10,
            "connect_retry": 3,
            "connect_backoff": 1000,
            "connect_splay": 100,
            "task_retry": 3,
            "task_backoff": 1000,
            "task_splay": 100
        }
    },
    inventory={
        "plugin": "DictInventory",
        "options": {
            "hosts": inventory_dict["hosts"],
            "groups": inventory_dict["groups"],
            "defaults": inventory_dict.get("defaults", {})
        }
    },
)

def _task_group_netmiko_send_commands(task: Task, commands: list) -> Result:
    """Grouped task: run each command as a separate Netmiko subtask."""
    # run commands
    for command in commands:
        task.run(
            task=netmiko_send_command,
            command_string=command,
            name=command
        )
    return Result(host=task.host)

# run single task
result1 = NornirObj.run(
    task=netmiko_send_command,
    command_string="show clock"
)

# run grouped tasks
result2 = NornirObj.run(
    task=_task_group_netmiko_send_commands,
    commands=["show clock", "show run | inc hostname"],
    connection_name="netmiko"
)

# run another single task
result3 = NornirObj.run(
    task=netmiko_send_command,
    command_string="show run | inc hostname"
)

NornirObj.close_connections()

# Print results
formed_result1 = ResultSerializer(result1, add_details=True)
pprint.pprint(formed_result1, width=100)

formed_result2 = ResultSerializer(result2, add_details=True)
pprint.pprint(formed_result2, width=100)

formed_result3 = ResultSerializer(result3, add_details=True)
pprint.pprint(formed_result3, width=100)

RetryRunner - Connect to hosts behind jumphost

RetryRunner implements logic to connect to hosts behind bastion hosts (jumphosts).

To connect to devices behind a jumphost, define the jumphost parameters in the host's inventory data:

hosts:
  R1:
    hostname: 192.168.1.151
    platform: ios
    username: test
    password: test
    data:
      jumphost:
        hostname: 10.1.1.1
        port: 22
        password: jump_host_password
        username: jump_host_user

Note

Only Netmiko (connection_name="netmiko") and Ncclient (connection_name="ncclient") tasks support connecting to hosts behind jumphosts using the inventory data above.
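When the inventory is built in Python (for example, for DictInventory), the same jumphost data is a nested dictionary under the host's data key. The sketch below mirrors the YAML sample and adds a small pre-flight check. The helper missing_jumphost_keys and the JUMPHOST_KEYS tuple are hypothetical, and treating all four keys as required is an assumption based only on the sample, not on the plugin's documented behavior.

```python
# Assumption: these four keys mirror the YAML sample; the plugin itself
# may accept a different set of keys.
JUMPHOST_KEYS = ("hostname", "port", "username", "password")

hosts = {
    "R1": {
        "hostname": "192.168.1.151",
        "platform": "ios",
        "username": "test",
        "password": "test",
        "data": {
            "jumphost": {
                "hostname": "10.1.1.1",
                "port": 22,
                "username": "jump_host_user",
                "password": "jump_host_password",
            }
        },
    }
}


def missing_jumphost_keys(host):
    """Hypothetical check: list jumphost keys absent from a host entry."""
    jumphost = host.get("data", {}).get("jumphost")
    if jumphost is None:
        return []  # host is reached directly, nothing to check
    return [key for key in JUMPHOST_KEYS if key not in jumphost]


# Fail early, before any connection attempt, if an entry is incomplete.
for name, host in hosts.items():
    missing = missing_jumphost_keys(host)
    if missing:
        raise ValueError(f"{name}: jumphost is missing {missing}")
```

Running a check like this before InitNornir surfaces inventory typos immediately instead of after several connection retries.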

RetryRunner Reference

class nornir_salt.plugins.runners.RetryRunner.RetryRunner(num_workers: int = 100, num_connectors: int = 20, connect_retry: int = 3, connect_backoff: int = 5000, connect_splay: int = 100, task_retry: int = 1, task_backoff: int = 5000, task_splay: int = 100, reconnect_on_fail: bool = True, task_timeout: int = 600)

RetryRunner is a modification of QueueRunner that strives to make task execution as reliable as possible.

Parameters
  • num_workers – number of threads for task execution

  • num_connectors – number of threads for device connections

  • connect_retry – number of connection attempts

  • connect_backoff – exponential backoff timer in milliseconds

  • connect_splay – random interval between 0 and splay for each connection in milliseconds

  • task_retry – number of attempts to run task

  • task_backoff – exponential backoff timer in milliseconds

  • task_splay – random interval between 0 and splay before task start in milliseconds

  • reconnect_on_fail – boolean, default True, reconnect to the host on task failure

  • task_timeout – int, seconds to wait for a task to complete before closing all queues and stopping connector and worker threads, default 600
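To illustrate how a backoff timer and a splay interval can combine into a wait time, here is a minimal sketch. The formula is an assumption and is not taken from the RetryRunner source: it doubles the backoff on every attempt and adds a random splay between 0 and splay_ms. The function name retry_delay_ms is hypothetical.

```python
import random


def retry_delay_ms(attempt, backoff_ms, splay_ms, rng=random):
    """Delay before retry number `attempt` (1-based), in milliseconds.

    Assumed formula, not the plugin's: the backoff doubles with every
    attempt and a random splay in [0, splay_ms] is added on top.
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    backoff = backoff_ms * 2 ** (attempt - 1)
    return backoff + rng.uniform(0, splay_ms)


# Seeded generator so repeated runs print the same delays.
rng = random.Random(0)
for attempt in range(1, 4):
    delay = retry_delay_ms(attempt, backoff_ms=1000, splay_ms=100, rng=rng)
    print(f"attempt {attempt}: wait about {delay / 1000:.2f} s")
```

The splay term spreads retries from many hosts over time so they do not all reconnect at the same instant; the exact delays RetryRunner applies should be confirmed against the plugin source.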