In this lesson, you will learn what Ansible forks and Ansible serial are, and how to use them to manage playbook execution in Ansible.
By default, Ansible runs each task on up to five hosts at a time. This limit is the forks value. For example, if six tasks are to be executed on 15 hosts, the first task is completed on 5 hosts (5 forks), then on another 5 hosts, and then on the remaining 5 hosts, before the second task starts on the first 5 hosts again.
The cycle continues in batches of 5 hosts until all the tasks are completed. This may be fine for a basic Ansible user, who only cares that all the tasks will be executed in the end!
Some intermediate or advanced Ansible users, myself included, may not even care about forks when the tasks run against a small number of managed hosts, especially Linux servers, where the Python scripts are generated easily. Why should I even pay attention to it?
However, with a larger number of hosts I should care, because the forks value affects how the Ansible controller performs. This is because, by default, for every task that runs, the Ansible controller generates a Python script for that task for each managed host.
For example, if six tasks run against 15 hosts, the controller generates six Python scripts for each managed host, making a total of 90 Python scripts.
If forks is set to 5, which is the default value, the script for the first task is generated and run for 5 managed hosts, then for another 5 managed hosts, and so on in batches of 5 hosts until the task is completed on every host. The same cycle then repeats for each remaining task.
Leaving the forks value at 5 may hurt Ansible performance if the number of hosts is large, say 200, because each task will then be executed in 40 batches of 5 hosts before it is completed.
This means that the Python scripts are generated 5 hosts at a time for each task, one batch after another, so the control node takes a long time to work through the play even when it has capacity to spare. This slows down the whole playbook run.
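The batch arithmetic for the 200-host example can be checked with a quick shell calculation. This is only a sketch: it treats every group of hosts as a fixed batch, as this lesson does, and the variable names are my own.

```shell
# Illustrative arithmetic only; it does not call Ansible.
hosts=200   # managed hosts in the example
forks=5     # Ansible's default forks value
tasks=6     # tasks in the example play

# Ceiling division: host groups needed to cover every host for one task.
groups_per_task=$(( (hosts + forks - 1) / forks ))
total_groups=$(( groups_per_task * tasks ))

echo "groups per task: ${groups_per_task}"
echo "groups for all tasks: ${total_groups}"
```

With forks set to 200 instead of 5, the same calculation gives one group per task.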
In some cases, the managed hosts may not even run Python. Some may be switches, routers, load balancers, or other devices that are not managed through Python over SSH, and for such devices the module code typically runs on the control node itself. This is another reason to control the playbook execution behavior by changing the forks value.
In the scenario we have cited, for example, the forks value may be changed to 200, which means that the first task is executed on all 200 nodes at once, then the second task on all 200 nodes, then the third, until all the tasks are completed. This shortens the run time and improves Ansible performance, provided the control node has enough CPU and memory to run 200 processes in parallel.
In summary, the forks directive specifies the maximum number of hosts the Ansible controller works on in parallel, and therefore the number of Python scripts it generates at a time for each task.
NB: Set the forks value judiciously, because every fork uses CPU and memory on the control node.
The default forks value can be seen by using the command:
[lisa@drdev1 ~]$ ansible-config dump |grep -i forks
DEFAULT_FORKS(default) = 5
To change the Ansible forks value, define it in the Ansible configuration file as shown below:
[lisa@drdev1 ~]$ cat .ansible.cfg
[defaults]
inventory=static-ini-inventory
remote_user=root
ask_pass=false
forks=200
[privilege_escalation]
become=True
become_user=root
become_method=sudo
become_ask_pass=false
You can verify the new value after changing it:
[lisa@drdev1 ~]$ ansible-config dump |grep -i forks
DEFAULT_FORKS(/home/lisa/.ansible.cfg) = 200
As I mentioned above, each task runs on all managed hosts before the next task runs on all managed hosts, and the cycle continues until the tasks are completed.
What if there is a task in your play that, for example, restarts the managed hosts, and in your environment the managed hosts must not all be restarted at the same time because of some high availability or cluster configuration?
What you can do in this case is use the serial keyword. This setting runs the play against the hosts in batches.
Using the 15 hosts scenario above as an example, if the value of serial is set to 3, all the tasks in the play are completed on the first batch of three hosts, then all the tasks are run on the next batch of three hosts, and so on, until all five batches of hosts are done.
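To make the order of work under serial concrete, here is a small shell sketch that prints which batch of hosts handles which task, and in what sequence. It only illustrates the ordering and does not call Ansible; `serial_order` is a hypothetical helper of my own, and the hosts are simply numbered 1 to 15.

```shell
# Illustration only: prints the order of work for a play that uses serial.
# serial_order is a hypothetical helper; it does not call Ansible.
# Arguments: number of hosts, serial value, number of tasks.
serial_order() {
    hosts="$1"; serial="$2"; tasks="$3"
    batches=$(( (hosts + serial - 1) / serial ))   # ceiling division
    batch=1
    while [ "$batch" -le "$batches" ]; do
        first=$(( (batch - 1) * serial + 1 ))
        last=$(( batch * serial ))
        # The final batch may be smaller than the serial value.
        if [ "$last" -gt "$hosts" ]; then last="$hosts"; fi
        task=1
        # Every task finishes on this batch before the next batch starts.
        while [ "$task" -le "$tasks" ]; do
            echo "batch ${batch} (hosts ${first}-${last}): task ${task}"
            task=$(( task + 1 ))
        done
        batch=$(( batch + 1 ))
    done
}

serial_order 15 3 6
```

The output starts with tasks 1 to 6 for hosts 1-3, and only then moves on to hosts 4-6, which is why a restart task never hits more than three hosts at once.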
You should know that if there is a handler in the playbook and a task notifies it, the handler runs for the hosts in each batch as required, before the next batch starts.
Ansible serial is defined at the play level, as shown below:
[lisa@drdev1 ~]$ cat playbook16.yml
---
- name: Managing and manipulating files using various module
  hosts: all
  serial: 3
  tasks:
    - name: Add a line to a file
      ................
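For a complete picture of serial together with a handler, the sketch below writes a small playbook to disk. It is not the lesson's playbook16.yml: the file name rolling-restart.yml, the nginx.conf path, the comment line, and the handler are all assumptions I made for illustration.

```shell
# Writes a hypothetical playbook; the file name, the nginx.conf path and
# the handler are illustrative and not taken from playbook16.yml.
cat > rolling-restart.yml <<'EOF'
---
- name: Change a file and restart nginx, three hosts at a time
  hosts: all
  serial: 3
  tasks:
    - name: Add a line to a file
      ansible.builtin.lineinfile:
        path: /etc/nginx/nginx.conf
        line: "# managed by Ansible"
      notify: Restart nginx
  handlers:
    # Runs once per batch, only on hosts where the task reported a change.
    - name: Restart nginx
      ansible.builtin.service:
        name: nginx
        state: restarted
EOF
```

If Ansible is installed, `ansible-playbook --syntax-check rolling-restart.yml` checks the file without changing anything on the managed hosts.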
Change the forks value to 2 and run an Ansible task to install and start the nginx package. Then change the forks value to 4 and install and start the mysql package. Note the difference in execution time.
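One possible way to compare run times is a small shell helper like the one below. It is a sketch under assumptions: `time_run` and the `ANSIBLE_PLAYBOOK_CMD` variable are names I made up, and the playbook names in the comments are hypothetical. The `--forks` option of `ansible-playbook` sets the forks value for a single run, so the configuration file does not have to be edited between runs.

```shell
# Hypothetical helper for the exercise: time one playbook run at a given
# forks value. time_run and ANSIBLE_PLAYBOOK_CMD are illustrative names,
# not part of Ansible; --forks is ansible-playbook's per-run override.
# Arguments: forks value, playbook file.
time_run() {
    start=$(date +%s)
    "${ANSIBLE_PLAYBOOK_CMD:-ansible-playbook}" --forks "$1" "$2"
    end=$(date +%s)
    echo "forks=$1: $(( end - start ))s"
}

# Example (assumes playbooks with these names exist):
#   time_run 2 install-nginx.yml
#   time_run 4 install-mysql.yml
```

The timing is in whole seconds, so the difference only becomes visible with enough managed hosts or slow enough tasks.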