Daniel Mellado
Daniel is a Principal Software Engineer at Red Hat. He's been involved in several networking projects, such as Kuryr-Kubernetes (a CNI plugin that enables native Neutron-based networking in Kubernetes) and MetalLB, and more recently he's been tackling Edge, Telco NFV and Observability use cases. He's been a PTL (Project Team Lead) for some OpenStack projects, a member of the Kubernetes SIG Group and part of the panel for the Leveraging Containers and OpenStack. He's also the main coordinator for the Fedora eBPF-sig-group.
Sessions
With thousands of available plugins, Ansible automates and orchestrates configuration management, application deployment as well as cloud, network, security and server infrastructure.
Beyond these typical scenarios, it can be a great abstraction layer to interface or glue different tools and systems together.
Given this wide range of use cases, and the many different ways each of them can go wrong dozens or thousands of times a day, it would be interesting and useful to have detailed, granular metrics about individual playbooks, hosts and tasks.
We could spot improvements, regressions, spikes and bottlenecks in Grafana to make playbooks run better and faster.
If unexpected changes or failures happen, we could notify someone or something about it with Alertmanager.
In this talk we'll explain and show "why not", using an implementation that puts Ansible metrics in Prometheus with ARA Records Ansible.
At the time of writing, it kind of works and puts many pieces of the puzzle together, but it doesn't quite use the right approach. It turns out putting historical metrics in Prometheus is not that simple.
We might just find out how to do it together if you are interested in the use case!
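To make the idea of per-playbook metrics a little more concrete, here is a minimal sketch of how finished playbook runs might be rendered in the Prometheus text exposition format. It is not the implementation from the talk: the `PlaybookRun` record, the metric name `ansible_playbook_duration_seconds` and the `path` and `status` labels are all assumptions made up for illustration, and the records are built by hand rather than read from ARA.

```python
from dataclasses import dataclass


@dataclass
class PlaybookRun:
    """Hypothetical record of one finished playbook run (not ARA's actual schema)."""
    path: str
    status: str
    duration_seconds: float


def escape_label(value: str) -> str:
    # The exposition format requires escaping backslash, double quote and newline.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(runs):
    """Render the most recent duration per (path, status) as gauge samples."""
    latest = {}
    for run in runs:
        # Later runs overwrite earlier ones that share the same label set.
        latest[(run.path, run.status)] = run.duration_seconds

    lines = [
        "# HELP ansible_playbook_duration_seconds Duration of the most recent playbook run.",
        "# TYPE ansible_playbook_duration_seconds gauge",
    ]
    for (path, status), seconds in sorted(latest.items()):
        labels = f'path="{escape_label(path)}",status="{escape_label(status)}"'
        lines.append(f"ansible_playbook_duration_seconds{{{labels}}} {seconds}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up sample data, purely for demonstration.
    sample = [
        PlaybookRun("site.yml", "completed", 10.0),
        PlaybookRun("site.yml", "completed", 12.5),
        PlaybookRun("deploy.yml", "failed", 3.0),
    ]
    print(render_metrics(sample), end="")
```

The sketch keeps only the latest run for each label set, which hints at one possible reason historical data is awkward here: Prometheus normally timestamps samples when it scrapes them, so runs that finished between scrapes collapse into a single value.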