Improve Swarm support (#333)

* Query for labeled services as well

* Try scaling down services

* Scale services back up

* Use progress tool from Docker CLI

* In test, label both services

* Clean up error and log messages

* Document scale-up/down approach in docs

* Downgrade Docker CLI to match client

* Document services stats

* Do not rely on PreviousSpec for storing desired replica count

* Log warnings from Docker when updating services

* Check whether container and service labels collide

* Document script behavior on label collision

* Add additional check if all containers have been removed

* Scale services concurrently

* Move docker interaction code into own file

* Factor out code for service updating

* Time out after five minutes of not reaching desired container count

* Inline handling of in-swarm container level restart

* Timer is more suitable for timeout race

* Timeout when scaling down services should be configurable

* Choose better filename

* Reflect changes in naming

* Rename and deprecate BACKUP_STOP_CONTAINER_LABEL

* Improve logging

* Further simplify logging
Frederik Ring
2024-01-31 12:17:41 +01:00
committed by GitHub
parent 2065fb2815
commit c3daeacecb
18 changed files with 640 additions and 145 deletions


@@ -76,7 +76,7 @@ Configuration, data about the backup run and helper functions will be passed to
Here is a list of all data passed to the template:
* `Config`: this object holds the configuration that has been passed to the script. The field names are the names of the recognized environment variables converted to PascalCase (e.g. `BACKUP_STOP_CONTAINER_LABEL` becomes `BackupStopContainerLabel`).
* `Config`: this object holds the configuration that has been passed to the script. The field names are the names of the recognized environment variables converted to PascalCase (e.g. `BACKUP_STOP_DURING_BACKUP_LABEL` becomes `BackupStopDuringBackupLabel`).
* `Error`: the error that made the backup fail. Only available in the `title_failure` and `body_failure` templates
* `Stats`: object that holds stats regarding script execution. In case of an unsuccessful run, some information may not be available.
* `StartTime`: time when the script started execution
@@ -89,6 +89,11 @@ Here is a list of all data passed to the template:
* `ToStop`: number of containers matched by the stop rule
* `Stopped`: number of containers successfully stopped
* `StopErrors`: number of containers that could not be stopped (equal to `ToStop - Stopped`)
* `Services`: object containing stats about the Docker services (only populated when Docker is running in Swarm mode)
* `All`: total number of services
* `ToScaleDown`: number of services matched by the scale down rule
* `ScaledDown`: number of services successfully scaled down
* `ScaleDownErrors`: number of services that could not be scaled down (equal to `ToScaleDown - ScaledDown`)
* `BackupFile`: object containing information about the backup file
* `Name`: name of the backup file (e.g. `backup-2022-02-11T01-00-00.tar.gz`)
* `FullPath`: full path of the backup file (e.g. `/archive/backup-2022-02-11T01-00-00.tar.gz`)
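
The `Services` stats listed above could be read from a notification template roughly as sketched below. This is a minimal, self-contained illustration using Go's `text/template`. The `Data`, `Stats` and `ServicesStats` types and the `renderSummary` helper are hypothetical stand-ins rather than the project's own types, and the sketch assumes the field is spelled `ScaledDown`.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"text/template"
)

// ServicesStats is a hypothetical stand-in that mirrors the documented
// `Services` fields; it is not the project's actual type.
type ServicesStats struct {
	All             uint
	ToScaleDown     uint
	ScaledDown      uint
	ScaleDownErrors uint
}

// Stats and Data are hypothetical wrappers that only model the part of
// the template data used in this sketch.
type Stats struct {
	Services ServicesStats
}

type Data struct {
	Stats Stats
}

// summaryTemplate accesses the nested fields with dot notation.
const summaryTemplate = `Services: {{ .Stats.Services.ScaledDown }}/{{ .Stats.Services.ToScaleDown }} scaled down, {{ .Stats.Services.ScaleDownErrors }} errors ({{ .Stats.Services.All }} total)`

// renderSummary parses the template and executes it against data.
func renderSummary(data Data) (string, error) {
	tmpl, err := template.New("summary").Parse(summaryTemplate)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderSummary(Data{Stats: Stats{Services: ServicesStats{
		All: 5, ToScaleDown: 2, ScaledDown: 2, ScaleDownErrors: 0,
	}}})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(out) // prints "Services: 2/2 scaled down, 0 errors (5 total)"
}
```

Since `Services` is only populated in Swarm mode, a real template may want to wrap such a line in a conditional so that it is omitted when all counters are zero.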