
2. Need for StackMon

Technical Story: A public cloud operator wants to continuously verify that regular user operations (for example, provisioning a server) work at all times, and to learn about problems before customers complain.

Context and Problem Statement

There are multiple existing solutions for monitoring individual systems, but so far none of them covers the complexity of monitoring a cloud. A cloud involves very complex relations between components and cannot be adequately monitored with simple metrics alone.
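To illustrate the kind of check the technical story asks for, here is a minimal sketch of a user-scenario probe: it runs an operation the way a user would and records whether it succeeded and how long it took. All names (`ProbeResult`, `run_probe`, `provision_server`, `failing_provision_server`, `success_rate`) are hypothetical and not part of StackMon; the provisioning functions are placeholders that do not call any real cloud API.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ProbeResult:
    name: str
    success: bool
    duration_s: float
    error: str = ""


def run_probe(name: str, action: Callable[[], None]) -> ProbeResult:
    """Run one user-like scenario; record whether it worked and how long it took."""
    start = time.monotonic()
    try:
        action()
        return ProbeResult(name, True, time.monotonic() - start)
    except Exception as exc:  # any failure counts as a failed user operation
        return ProbeResult(name, False, time.monotonic() - start, str(exc))


def provision_server() -> None:
    """Hypothetical placeholder: a real probe would call the cloud API here."""
    time.sleep(0.01)


def failing_provision_server() -> None:
    """Hypothetical placeholder that simulates an outage."""
    raise RuntimeError("server did not become ACTIVE")


def success_rate(results: List[ProbeResult]) -> float:
    """Fraction of probe runs that succeeded (0.0 for an empty list)."""
    return sum(r.success for r in results) / len(results) if results else 0.0
```

Running such a probe on a schedule and alerting on a falling success rate or a rising duration would surface a failure in the end-to-end user flow even when each individual component metric still looks healthy.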

Considered Options

Decision Outcome

It is decided to create a monitoring stack specialized in monitoring OpenStack clouds.

The stack should implement or use the following components

Pros and Cons of the Options

Prometheus

RefStack/Tempest