##########
Monitoring
##########

Uptime Kuma
###########

Uptime Kuma is a monitoring tool that periodically checks whether our services are still available.
Typically each service is checked every 60 seconds, and an alert is sent once a service has been offline for 2 minutes.
It also warns us when SSL certificates are about to expire.

It is hosted on `status.hyteck.de <https://status.hyteck.de>`_, which points to the server ``s5.hyteck.de``.
Uptime Kuma runs on a different server from the other QZT services so that it keeps working even if the main server fails completely.

Alerts are sent via Matrix to the monitoring channel ``#qzt-alerts:hyteck.de``. Other alert routes can be configured in the GUI.

Access is currently limited to moanos, as Uptime Kuma does not support multiple users or SSO yet. This can be changed if needed.

For convenience, Uptime Kuma also provides a status page that users can be pointed to.
It is located here: `status.hyteck.de/status/qzt <https://status.hyteck.de/status/qzt>`_

.. figure:: uptime-kuma-status-page.jpeg
   :alt: Screenshot of the Uptime Kuma status page. It lists public services (only the WordPress site) and internal services. All services are up. Internal services are: Authentic, Collabora Online, Nextcloud, GoToSocial and Vaultwarden.

Healthchecks
############

Healthchecks is a tool for monitoring periodic tasks such as backups.
On success, a task sends a ping to the Healthchecks server, which records it.
If a backup does not run, no ping is sent or recorded.
Once the normal period plus a grace period has passed without a ping, Healthchecks issues an alert.

We currently monitor our backups and the shitty-book-collector with Healthchecks.
Alerts are sent via the Matrix Alertbot to ``#qzt-alerts:hyteck.de``.

The Healthchecks instance is located at `health.hyteck.de <https://health.hyteck.de>`_.

.. figure:: healthchecks.jpeg
   :alt: A screenshot of a Healthchecks instance showing the project Queeres Zentrum Tübingen. It shows two monitored checks: "QZT-Server Backup" and "Bücherregal Skript". The script's last ping was 2 hours ago; the backup check has never received a ping.

   Screenshot of our Healthchecks instance
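
The ping mechanism described above can be sketched in Python. This is a minimal
sketch and not the script we actually run. The ping URL (including
``your-check-uuid``) and the helper names ``run_and_ping`` and ``is_overdue``
are hypothetical; the real ping URL is shown for each check in the Healthchecks
GUI. ``is_overdue`` is only a simplified model of the period and grace period
logic, not Healthchecks' own implementation.

.. code-block:: python

   import subprocess
   import urllib.request
   from datetime import datetime, timedelta

   # Hypothetical ping URL; copy the real one from the check's page in the GUI.
   PING_URL = "https://health.hyteck.de/ping/your-check-uuid"


   def run_and_ping(command, ping_url=PING_URL, timeout=10):
       """Run a task and ping Healthchecks only if it exits successfully."""
       result = subprocess.run(command)
       if result.returncode == 0:
           # A plain HTTP GET to the ping URL is recorded as a success ping.
           urllib.request.urlopen(ping_url, timeout=timeout)
       # On failure nothing is sent, so the check eventually goes overdue.
       return result.returncode


   def is_overdue(last_ping, period, grace, now):
       """Simplified model: alert once period + grace passed without a ping."""
       return now > last_ping + period + grace


   # Example: a daily backup with a one hour grace period.
   last_ping = datetime(2024, 1, 1, 3, 0)
   print(is_overdue(last_ping, timedelta(days=1), timedelta(hours=1),
                    now=datetime(2024, 1, 2, 3, 30)))  # still within grace
   print(is_overdue(last_ping, timedelta(days=1), timedelta(hours=1),
                    now=datetime(2024, 1, 2, 5, 0)))   # past period + grace

A cron job could call ``run_and_ping`` with the backup command as a list of
arguments. If the backup fails, no ping is sent, and Healthchecks alerts once
the period and grace period have passed.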