Maltfield Log/2024 Q3
My work log from the third quarter of the year 2024. I intentionally made this verbose to make future admins' work easier when troubleshooting. The more keywords, error messages, etc. that are listed in this log, the more helpful it will be for the future OSE Sysadmin.
See Also
Sun Sep 22, 2024
- I logged back into Backblaze this morning and started investigating each of the keys. They each have names, buckets, and capabilities:
- dev: bucket ose-dev-server-backups; capabilities: deleteFiles, listBuckets, listFiles, readFiles, shareFiles, writeFiles
- master2: bucket (none); capabilities: deleteBuckets, deleteFiles, deleteKeys, listBuckets, listFiles, listKeys, readFiles, shareFiles, writeBuckets, writeFiles, writeKeys
- prod-append-only: bucket ose-server-backups; capabilities: listBuckets, writeFiles
- prod-append-only-2022-10: bucket ose-server-backups; capabilities: readFiles, writeFiles
- prod-list-and-append-only-2022-10: bucket ose-server-backups; capabilities: listFiles, readFiles, writeFiles
- it's a bit concerning that the dev one clearly isn't append-only, but that data isn't so important. And I don't think it's a good use of my time to fight with the dev server (which is probably broken atm and would be a time sink just to investigate it), so I'll just leave that one as-is. It can only access the dev bucket, anyway
- I don't recognize the "master2" key, and it definitely shouldn't exist on any server, so I deleted it
- I confirmed from my logs yesterday that the hetzner2 prod server uses the last two keys (prod-append-only-2022-10 for the b2 user and prod-list-and-append-only-2022-10 for the root user), so I deleted the old key prod-append-only
- I created a new (temporary) master key named master-2024-09 in the Backblaze B2 WUI
- I installed the backblaze-b2 package on my local debian 12 laptop
sudo apt-get install backblaze-b2
- I configured my local backblaze-b2 CLI tool with the new master key that I created in the Backblaze B2 WUI (above)
user@disp3202:~$ backblaze-b2 authorize-account
Using https://api.backblazeb2.com
Backblaze account ID: REDACTED
Backblaze application key: 
user@disp3202:~$
- I created two new keys for the hetzner3 server
user@disp3202:~$ backblaze-b2 create-key --bucket 'ose-server-backups' 'hetzner3-append-only-2024-09' 'readFiles, writeFiles'
OBFUSCATED OBFUSCATED
user@disp3202:~$ 
user@disp3202:~$ backblaze-b2 create-key --bucket 'ose-server-backups' 'hetzner3-list-and-append-only-2024-09' 'listFiles, readFiles, writeFiles'
OBFUSCATED OBFUSCATED
user@disp3202:~$
- I configured rclone on hetzner3 for the root user with the key that includes the listFiles capability
root@mail ~/backups # rclone config 2024/09/22 20:31:36 NOTICE: Config file "/root/.config/rclone/rclone.conf" not found - using defaults No remotes found, make a new one? n) New remote s) Set configuration password q) Quit config n/s/q> n Enter name for new remote. name> b2 Option Storage. Type of storage to configure. Choose a number from below, or type in your own value. 1 / 1Fichier \ (fichier) 2 / Akamai NetStorage \ (netstorage) 3 / Alias for an existing remote \ (alias) 4 / Amazon Drive \ (amazon cloud drive) 5 / Amazon S3 Compliant Storage Providers including AWS, Alibaba, Ceph, China Mobile, Cloudflare, ArvanCloud, Digital Ocean, Dreamhost, Huawei OBS, IBM COS, IDrive e2, IONOS Cloud, Lyve Cloud, Minio, Netease, RackCorp, Scaleway, SeaweedFS, StackPath, Storj, Tencent COS, Qiniu and Wasabi \ (s3) 6 / Backblaze B2 \ (b2) 7 / Better checksums for other remotes \ (hasher) 8 / Box \ (box) 9 / Cache a remote \ (cache) 10 / Citrix Sharefile \ (sharefile) 11 / Combine several remotes into one \ (combine) 12 / Compress a remote \ (compress) 13 / Dropbox \ (dropbox) 14 / Encrypt/Decrypt a remote \ (crypt) 15 / Enterprise File Fabric \ (filefabric) 16 / FTP \ (ftp) 17 / Google Cloud Storage (this is not Google Drive) \ (google cloud storage) 18 / Google Drive \ (drive) 19 / Google Photos \ (google photos) 20 / HTTP \ (http) 21 / Hadoop distributed file system \ (hdfs) 22 / HiDrive \ (hidrive) 23 / In memory object storage system. 
\ (memory) 24 / Internet Archive \ (internetarchive) 25 / Jottacloud \ (jottacloud) 26 / Koofr, Digi Storage and other Koofr-compatible storage providers \ (koofr) 27 / Local Disk \ (local) 28 / Mail.ru Cloud \ (mailru) 29 / Microsoft Azure Blob Storage \ (azureblob) 30 / Microsoft OneDrive \ (onedrive) 31 / OpenDrive \ (opendrive) 32 / OpenStack Swift (Rackspace Cloud Files, Memset Memstore, OVH) \ (swift) 33 / Pcloud \ (pcloud) 34 / Put.io \ (putio) 35 / SMB / CIFS \ (smb) 36 / SSH/SFTP \ (sftp) 37 / Sia Decentralized Cloud \ (sia) 38 / Sugarsync \ (sugarsync) 39 / Transparently chunk/split large files \ (chunker) 40 / Union merges the contents of several upstream fs \ (union) 41 / Uptobox \ (uptobox) 42 / WebDAV \ (webdav) 43 / Yandex Disk \ (yandex) 44 / Zoho \ (zoho) 45 / premiumize.me \ (premiumizeme) 46 / seafile \ (seafile) Storage> b2 Option account. Account ID or Application Key ID. Enter a value. account> OBFUSCATED Option key. Application Key. Enter a value. key> OBFUSCATED Option hard_delete. Permanently delete files on remote removal, otherwise hide files. Enter a boolean value (true or false). Press Enter for the default (false). hard_delete> true Edit advanced config? y) Yes n) No (default) y/n> n Configuration complete. Options: - type: b2 - account: OBFUSCATED - key: OBFUSCATED - hard_delete: true Keep this "b2" remote? y) Yes this is OK (default) e) Edit this remote d) Delete this remote y/e/d> y Current remotes: Name Type = b2 b2 e) Edit existing remote n) New remote d) Delete remote r) Rename remote c) Copy remote s) Set configuration password q) Quit config e/n/d/r/c/s/q> q root@mail ~/backups #
- I confirmed that it's now working
root@mail ~/backups # rclone --b2-versions ls b2:ose-server-backups
21812118477 daily_hetzner2_20240921_072001.tar.gpg
21757200532 daily_hetzner2_20240922_072001.tar.gpg
21349753244 monthly_hetzner2_20231001_072001.tar.gpg
21360808568 monthly_hetzner2_20231101_072001.tar.gpg
21360301269 monthly_hetzner2_20231201_072001.tar.gpg
21820017340 monthly_hetzner2_20240201_072001.tar.gpg
21683700909 monthly_hetzner2_20240301_072001.tar.gpg
21660296728 monthly_hetzner2_20240401_072001.tar.gpg
21790035424 monthly_hetzner2_20240501_072001.tar.gpg
21603737883 monthly_hetzner2_20240601_072001.tar.gpg
21663769333 monthly_hetzner2_20240701_072001.tar.gpg
21991147307 monthly_hetzner2_20240801_072001.tar.gpg
21896377523 monthly_hetzner2_20240901_072001.tar.gpg
21942660432 weekly_hetzner2_20240826_072001.tar.gpg
21902006508 weekly_hetzner2_20240902_072001.tar.gpg
21873908566 weekly_hetzner2_20240909_072001.tar.gpg
21830987241 weekly_hetzner2_20240916_072001.tar.gpg
17516124812 yearly_hetzner2_20190101_111520.tar.gpg
18872422001 yearly_hetzner2_20200101_072001.tar.gpg
19827971632 yearly_hetzner2_20210101_072001.tar.gpg
21079942509 yearly_hetzner2_20230101_072001.tar.gpg
21541199047 yearly_hetzner2_20240101_072001.tar.gpg
root@mail ~/backups #
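A listing like this is also what our monthly backup report checks against. A minimal sketch of such a check, assuming the `rclone ls` output format above (the `backup_exists` helper is mine, not from the actual report script):

```shell
# Return 0 if a backup of the given type (daily/weekly/monthly/yearly) exists
# for the given UTC date (YYYYMMDD) in an `rclone ls` listing read from stdin.
backup_exists() {
  grep -q "${1}_hetzner3_${2}_" -
}

# Hypothetical usage in a report script:
#   rclone --b2-versions ls b2:ose-server-backups \
#     | backup_exists daily "$(date -u +%Y%m%d)" \
#     || echo "ALERT: today's daily backup is missing"
```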
- I switched to the b2user user, and I did the same process for the other key
sudo su - b2user
rclone config
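For reference, the same remote can also be created non-interactively, which is handy when repeating this for a second user; a sketch using rclone's `config create` subcommand (the account/key values are placeholders):

```shell
# Non-interactive equivalent of the interactive `rclone config` session above;
# run as the b2user so the config lands in ~b2user/.config/rclone/rclone.conf
sudo -u b2user rclone config create b2 b2 \
  account 'OBFUSCATED' \
  key 'OBFUSCATED' \
  hard_delete true
```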
- I spent some time documenting this whole key creation and rclone config process on Backblaze
- I also spent some time updating the Hetzner3 article with a summary of the steps taken to configure the server so far
- I checked the backups log. I see two problems
- as expected, rclone failed to upload because it didn't have the b2 endpoint set up yet
- but it's also failing to copy the backups to the b2 user
+ /bin/mv /root/backups/sync/daily_hetzner3_20240922_061931.tar.gpg /home/b2user/backups/sync/daily_hetzner3_20240922_061931.tar.gpg
/bin/mv: cannot stat '/root/backups/sync/daily_hetzner3_20240922_061931.tar.gpg': No such file or directory
- oh, the issue is higher-up; it fails to create the gpg file because gpg is not installed!
+ echo -e '\tINFO: Encrypting the single-file tarball'
	INFO: Encrypting the single-file tarball
+ /bin/nice /bin/gpg2 --output /root/backups/sync/daily_hetzner3_20240922_061931.tar.gpg --batch --symmetric --cipher-algo aes256 --compress-algo none --passphrase-file /root/backups/ose-backups-cron.2.key /root/backups/sync/daily_hetzner3_20240922_061931.tar
/bin/nice: ‘/bin/gpg2’: No such file or directory

real	0m0.002s
user	0m0.002s
sys	0m0.000s
- huh, gpg *is* installed. I guess Debian just doesn't ship a `gpg2` binary anymore
root@mail ~/backups # which gpg2
root@mail ~/backups # which gpg
/usr/bin/gpg
root@mail ~/backups #
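Rather than hard-coding the binary path in backup.settings, the path could be resolved at runtime so the same config works on CentOS (gpg2) and Debian (gpg); a hedged sketch (the `first_available` helper is mine, not from the actual script):

```shell
# Return the path of the first of the given commands that exists on the PATH.
first_available() {
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      command -v "$cmd"
      return 0
    fi
  done
  return 1
}

# Hypothetical usage in backup.settings:
#   GPG="$(first_available gpg2 gpg)" || { echo "ERROR: no gpg binary found" >&2; exit 1; }
```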
- I updated the backup.settings file to use /usr/bin/gpg, kicked off the backup script, and took lunch
...
- the backup run failed after almost exactly 10 minutes, due to a timeout
root@mail ~/backups # time /root/backups/backup.sh
...
+ /bin/sudo -u b2user /bin/rclone -v --bwlimit 3M copy /home/b2user/backups/sync/daily_hetzner3_20240922_190109.tar.gpg b2:ose-server-backups
2024/09/22 21:01:10 INFO  : Starting bandwidth limiter at 3Mi Byte/s
2024/09/22 21:11:10 Failed to create file system for "b2:ose-server-backups": failed to authorize account: failed to authenticate: Get "https://api.backblazeb2.com/b2api/v1/b2_authorize_account": dial tcp 104.153.233.180:443: i/o timeout

real	10m0.067s
user	0m0.001s
sys	0m0.006s

+ echo ================================================================================
================================================================================
++ date -u +%Y%m%d_%H%M%S
+ echo 'INFO: Finished Backup Run at 20240922_191110'
INFO: Finished Backup Run at 20240922_191110
+ echo ================================================================================
================================================================================
+ exit 0

real	10m1.345s
user	0m0.242s
sys	0m0.043s
root@mail ~/backups #
- this smells like I didn't update the firewall rules for this new b2user user (our firewall rules block traffic for users by default, unless they're explicitly allowed)
- I updated the ansible playbook to poke a hole in the firewall for the b2user user, ran ansible, and kicked it off again
- ok, that worked. The backup is stupidly small, currently only 11 KB
root@mail ~/backups # time /root/backups/backup.sh ... + /bin/sudo -u b2user /bin/rclone -v --bwlimit 3M copy /home/b2user/backups/sync/weekly_hetzner3_20240922_220636.tar.gpg b2:ose-server-backups 2024/09/23 00:06:36 INFO : Starting bandwidth limiter at 3Mi Byte/s 2024/09/23 00:06:39 INFO : weekly_hetzner3_20240922_220636.tar.gpg: Copied (new) 2024/09/23 00:06:39 INFO : Transferred: 10.104 KiB / 10.104 KiB, 100%, 5.046 KiB/s, ETA 0s Transferred: 1 / 1, 100% Elapsed time: 3.2s real 0m3.231s user 0m0.003s sys 0m0.003s + echo ================================================================================ ================================================================================ ++ date -u +%Y%m%d_%H%M%S + echo 'INFO: Finished Backup Run at 20240922_220639' INFO: Finished Backup Run at 20240922_220639 + echo ================================================================================ ================================================================================ + exit 0 real 0m3.506s user 0m0.232s sys 0m0.042s root@mail ~/backups # ls -lah /home/b2user/backups total 16K drwx------ 4 b2user b2user 4.0K Sep 23 00:06 . drwxr-xr-x 4 b2user b2user 4.0K Sep 17 07:29 .. drwxr-xr-x 2 root root 4.0K Sep 23 00:06 sync drwx------ 2 b2user b2user 4.0K Sep 23 00:04 sync.old root@mail ~/backups # ls -lah /home/b2user/backups/sync total 20K drwxr-xr-x 2 root root 4.0K Sep 23 00:06 . drwx------ 4 b2user b2user 4.0K Sep 23 00:06 .. -rw-r--r-- 1 b2user root 11K Sep 23 00:06 weekly_hetzner3_20240922_220636.tar.gpg root@mail ~/backups #
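The per-user outbound rules described above are typically implemented with iptables' owner match; a sketch of the kind of rule the ansible role presumably renders (the chain name and default policy are assumptions about our playbook, not copied from it):

```shell
# Allow outbound traffic initiated by the b2user account. With a default
# DROP/REJECT policy on the OUTPUT chain, rclone running as b2user would
# otherwise hang until timeout, exactly as seen above.
iptables -A OUTPUT -m owner --uid-owner b2user -j ACCEPT
```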
- now, the final test: can we download the backup from the Backblaze B2 WUI, decrypt it, and view the files?
- I logged into the BackBlaze B2 WUI, and clicked "Browse Files" on the left-hand navigation
- I clicked on the ose-server-backups bucket
- I saw one file with "hetzner3" in its name = weekly_hetzner3_20240922_220636.tar.gpg, and I clicked on it
- In the popup modal that appeared, I clicked the Download button
- I was successfully able to decrypt and extract its contents, but everything was missing except for mysqldump, which is funny because that's the one I *expected* to be missing, since we don't even have mysql installed!
user@ose:~/tmp/hetzner3/backup-restore-test$ gpg --batch --passphrase-file ose-backups-cron.2.key --output weekly_hetzner3_20240922_220636.tar weekly_hetzner3_20240922_220636.tar.gpg gpg: WARNING: no command supplied. Trying to guess what you mean ... gpg: AES256.CFB encrypted data gpg: encrypted with 1 passphrase user@ose:~/tmp/hetzner3/backup-restore-test$ user@ose:~/tmp/hetzner3/backup-restore-test$ ls ose-backups-cron.2.key weekly_hetzner3_20240922_220636.tar.gpg weekly_hetzner3_20240922_220636.tar user@ose:~/tmp/hetzner3/backup-restore-test$ user@ose:~/tmp/hetzner3/backup-restore-test$ tar -xf weekly_hetzner3_20240922_220636.tar user@ose:~/tmp/hetzner3/backup-restore-test$ user@ose:~/tmp/hetzner3/backup-restore-test$ du -sh * 4.0K ose-backups-cron.2.key 40K root 12K weekly_hetzner3_20240922_220636.tar 12K weekly_hetzner3_20240922_220636.tar.gpg user@ose:~/tmp/hetzner3/backup-restore-test$ user@ose:~/tmp/hetzner3/backup-restore-test$ ls root/backups/sync/weekly_hetzner3_20240922_220636/ etc home log mysqldump root www user@ose:~/tmp/hetzner3/backup-restore-test$ user@ose:~/tmp/hetzner3/backup-restore-test$ cd root/backups/sync/weekly_hetzner3_20240922_220636/ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/weekly_hetzner3_20240922_220636/$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/weekly_hetzner3_20240922_220636$ find . ./log ./www ./root ./home ./etc ./mysqldump ./mysqldump/mysqldump.20240922_220636.sql.gz user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/weekly_hetzner3_20240922_220636$ cd mysqldump/ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/weekly_hetzner3_20240922_220636/mysqldump$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/weekly_hetzner3_20240922_220636/mysqldump$ du -sh mysqldump.20240922_220636.sql.gz 0 mysqldump.20240922_220636.sql.gz user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/weekly_hetzner3_20240922_220636/mysqldump$
- ok, so the tarball has literally just one file that's 0 bytes large. Fail. This is why we test restores ;)
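One way to catch this earlier would be for backup.sh to sanity-check the wrapper tarball before encrypting and uploading it; a sketch (this helper is hypothetical, not part of the actual script):

```shell
# Return 0 if any regular file inside the given tarball is 0 bytes.
# `tar -tvf` prints: permissions owner/group SIZE date time name
tarball_has_empty_member() {
  tar -tvf "$1" | awk '$1 ~ /^-/ && $3 == 0 { found=1 } END { exit !found }'
}

# Hypothetical usage before the gpg step:
#   tarball_has_empty_member "$tarball" && echo "WARN: empty file inside $tarball" >&2
```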
- hmm, I also realized that today is Sunday, but the backup is marked as a 'weekly'. Weeklies are supposed to happen on the first day of the week, which is supposed to be Monday
- ah, hmm, it looks like the locale never got set up properly
root@mail ~/backups # date +%u
1
root@mail ~/backups # date
Mon Sep 23 12:36:58 AM CEST 2024
root@mail ~/backups # date -u
Sun Sep 22 10:38:33 PM UTC 2024
root@mail ~/backups #
# I uncommented the 'maltfield.locale' role and gave ansible another run
# after that, the timezone was UTC (as desired)
root@mail ~/backups # date
Sun Sep 22 10:40:21 PM UTC 2024
root@mail ~/backups #
root@mail ~/backups # date +%u
7
root@mail ~/backups #
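The backup type is presumably derived from the UTC day-of-month and day-of-week, which is why the wrong timezone mislabeled a Sunday run as 'weekly'; a hedged sketch of that selection logic (not the actual backup.sh code):

```shell
# Pick a backup type from a UTC date: yearly on Jan 1, monthly on the 1st of
# the month, weekly on Mondays (`date -u +%u` = 1), daily otherwise.
backup_type() {
  month_day="$1"   # MMDD, e.g. 0101 (as printed by `date -u +%m%d`)
  week_day="$2"    # 1-7, Monday=1  (as printed by `date -u +%u`)
  case "$month_day" in
    0101) echo yearly; return ;;
    ??01) echo monthly; return ;;
  esac
  if [ "$week_day" = "1" ]; then echo weekly; else echo daily; fi
}

# Hypothetical usage: backup_type "$(date -u +%m%d)" "$(date -u +%u)"
```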
- I reviewed the output of the backup.sh run. Looks like all the tar commands are failing because of the change I made to use pigz
+ /bin/nice /bin/tar --use-compress-program=pigz --exclude '/home/b2user/backups/sync*' -czf /root/backups/sync/weekly_hetzner3_20240922_220636/home/home.20240922_220636.tar.gz /home/b2user /home/maltfield
/bin/tar: Conflicting compression options
Try '/bin/tar --help' or '/bin/tar --usage' for more information.

real	0m0.003s
user	0m0.000s
sys	0m0.003s
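GNU tar rejects `-z` combined with `--use-compress-program`, since both select a compressor; the fix is to drop the `z` flag. A minimal demonstration (using gzip as a stand-in for pigz, which may not be installed everywhere):

```shell
# Build a scratch tree and compress it via an external program.
# Note -cf (no z): --use-compress-program already selects the compressor.
tmpdir="$(mktemp -d)"
echo "hello" > "$tmpdir/file.txt"
tar --use-compress-program=gzip -cf "$tmpdir/out.tar.gz" -C "$tmpdir" file.txt
tar -tf "$tmpdir/out.tar.gz"   # tar auto-detects gzip when reading
rm -rf "$tmpdir"
```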
- I triggered another backup run. This time it was 40 MB, and the whole backup & upload process finished in less than 20 seconds
root@mail ~/backups # time /root/backups/backup.sh ... INFO: Beginning upload to backblaze b2 + /bin/sudo -u b2user /bin/rclone -v --bwlimit 3M copy /home/b2user/backups/sync/daily_hetzner3_20240922_224521.tar.gpg b2:ose-server-backups 2024/09/22 22:45:23 INFO : Starting bandwidth limiter at 3Mi Byte/s 2024/09/22 22:45:38 INFO : daily_hetzner3_20240922_224521.tar.gpg: Copied (new) 2024/09/22 22:45:38 INFO : Transferred: 39.678 MiB / 39.678 MiB, 100%, 2.834 MiB/s, ETA 0s Transferred: 1 / 1, 100% Elapsed time: 15.4s real 0m15.442s user 0m0.000s sys 0m0.006s + echo ================================================================================ ================================================================================ ++ date -u +%Y%m%d_%H%M%S + echo 'INFO: Finished Backup Run at 20240922_224538' INFO: Finished Backup Run at 20240922_224538 + echo ================================================================================ ================================================================================ + exit 0 real 0m16.788s user 0m6.406s sys 0m0.371s root@mail ~/backups #
- back in the web browser, I downloaded this latest backup file from the Backblaze B2 WUI
- and on my computer, I decrypted it, extracted it, and confirmed that I can restore files from log, www, root, home, and etc. Backups are working :)
user@ose:~/tmp/hetzner3/backup-restore-test$ gpg --batch --passphrase-file ose-backups-cron.2.key --output daily_hetzner3_20240922_224521.tar daily_hetzner3_20240922_224521.tar.gpg gpg: WARNING: no command supplied. Trying to guess what you mean ... gpg: AES256.CFB encrypted data gpg: encrypted with 1 passphrase user@ose:~/tmp/hetzner3/backup-restore-test$ user@ose:~/tmp/hetzner3/backup-restore-test$ tar -xf daily_hetzner3_20240922_224521.tar user@ose:~/tmp/hetzner3/backup-restore-test$ user@ose:~/tmp/hetzner3/backup-restore-test$ du -sh * 40M daily_hetzner3_20240922_224521.tar 40M daily_hetzner3_20240922_224521.tar.gpg 4.0K ose-backups-cron.2.key 40M root user@ose:~/tmp/hetzner3/backup-restore-test$ user@ose:~/tmp/hetzner3/backup-restore-test$ cd root/backups/sync/daily_hetzner3_20240922_224521/ user@ose:~/tmp/hetzner3/backup-restore-test$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521$ find . ./log ./log/log.20240922_224521.tar.gz ./www ./www/www.20240922_224521.tar.gz ./root ./root/root.20240922_224521.tar.gz ./home ./home/home.20240922_224521.tar.gz ./etc ./etc/etc.20240922_224521.tar.gz ./mysqldump ./mysqldump/mysqldump.20240922_224521.sql.gz user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521$ cd log user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/log$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/log$ tar -xf log.20240922_224521.tar.gz user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/log$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/log$ ls log.20240922_224521.tar.gz var/ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/log$ 
user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/log$ ls var/log/ alternatives.log audit btmp.1 faillog private unattended-upgrades alternatives.log.1 backups dpkg.log journal README wtmp apt btmp dpkg.log.1 lastlog runit user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/log$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/log$ tail var/log/apt/history.log Commandline: /usr/bin/apt-get -y -o Dpkg::Options::=--force-confdef -o Dpkg::Options::=--force-confold install stubby unbound resolvconf Requested-By: maltfield (1000) Install: stubby:amd64 (1.6.0-3+b1), resolvconf:amd64 (1.91+nmu1), libyaml-0-2:amd64 (0.2.5-1, automatic), libevent-2.1-7:amd64 (2.1.12-stable-8, automatic), libunbound8:amd64 (1.17.1-2+deb12u2, automatic), libgetdns10:amd64 (1.6.0-3+b1, automatic), libevent-core-2.1-7:amd64 (2.1.12-stable-8, automatic), libev4:amd64 (1:4.33-1, automatic), unbound:amd64 (1.17.1-2+deb12u2), dns-root-data:amd64 (2024041801~deb12u1, automatic) End-Date: 2024-09-15 21:50:47 Start-Date: 2024-09-16 18:46:10 Commandline: /usr/bin/apt-get -y -o Dpkg::Options::=--force-confdef -o Dpkg::Options::=--force-confold install pigz Requested-By: maltfield (1000) Install: pigz:amd64 (2.6-1) End-Date: 2024-09-16 18:46:10 user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/log$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/log$ cd ../www user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/www$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/www$ tar -xf www.20240922_224521.tar.gz user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/www$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/www$ ls 
www.20240922_224521.tar.gz user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/www$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/www$ du www.20240922_224521.tar.gz 4 www.20240922_224521.tar.gz user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/www$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/www$ tar -tf www.20240922_224521.tar.gz user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/www$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/www$ cd ../root user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/root$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/root$ tar -xf root.20240922_224521.tar.gz user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/root$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/root$ ls root backups dead.letter Mail user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/root$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/root$ ls root/backups/ backupReport.sh backup.settings backup.settings.20240916 backup.sh backup.sh.74511.2024-09-22@22:45:04~ ose-backups-cron.2.key ose-backups-cron.key README.txt user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/root$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/root$ tail root/backups/README.txt gpg --batch --passphrase-file /root/backups/ose-backups-cron.key --decrypt daily_hetzner2_20240726_160837.tar.gpg > daily_hetzner2_20240726_160837.tar Then you can untar the wrapper tarball and the compressed 
tarball inside of that. For example: tar -xf daily_hetzner2_20240726_160837.tar cd root/backups/sync/daily_hetzner2_20240726_160837/www/ tar -xf www.20240726_160837.tar.gz head var/www/html/www.opensourceecology.org/htdocs/index.php --Michael Altfield <https://michaelaltfield.net.> user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/root$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/root$ cd ../home user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/home$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/home$ tar -xf home.20240922_224521.tar.gz user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/home$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/home$ ls home b2user maltfield user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/home$ ls home/maltfield/ bin ossec.conf.31557.2024-09-15@04:36:48~ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/home$ tail home/maltfield/ossec.conf.31557.2024-09-15@04\:36\:48~ <log_format>syslog</log_format> <location>/var/ossec/logs/active-responses.log</location> </localfile> <localfile> <log_format>syslog</log_format> <location>/var/log/dpkg.log</location> </localfile> </ossec_config> user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/home$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/home$ cd ../etc user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/etc$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/etc$ tar -xf etc.20240922_224521.tar.gz 
user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/etc$ user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/etc$ ls etc etc.20240922_224521.tar.gz user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/etc$ tail etc/hostname hetzner3 user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_202user@ose:~/tmp/hetzner3/backup-restore-test/root/backups/sync/daily_hetzner3_20240922_224521/etc$
- note that the contents of www were empty, because that dir doesn't exist on the server (yet)
root@mail ~/backups # ls -lah /var/www
ls: cannot access '/var/www': No such file or directory
root@mail ~/backups #
- I spent some time updating Wazuh with the changes from hetzner2 to hetzner3
Sat Sep 21, 2024
- basically, from my research yesterday, I discovered two ways to be able to login to our backblaze gmail account again:
- enable 2FA for this (shared) user account
- temporarily turn off "2-Step Verification" (which is a distinct Google concept from "two factor authentication")
- I did some reading about how one "should" set up a shared mailbox in Google Groups. There are a few ways, but it appears the most recommended is to create a Google Group https://old.reddit.com/r/gsuite/comments/rwx0wb/set_up_an_info_general_email_address_for_all_users/
- I checked OSE's Google Groups, and I see that, apparently, I created a group TODO
- I went ahead and created a new group with the following description:
This is a Google Group. People in this Google Group will receive emails from services related to OSE's internet presence and Internet infrastructure. Please note that "reset password" functionality usually works by sending a link to a user's email address, so we should assume that anyone on this list can login to these services, even if they don't have the account password. So please only ever put trusted users on this list.
- I made it so that not just anyone can join this group, but anyone with an @opensourceecology.org email account can ask to join
- I made it so that only group members can read & reply to emails in this list
- I made it so anyone with an @opensourceecology.org email account can see who is a member of this list
- I invited Marcin to join this group, with the following message
Hey Marcin,

I'm creating this "Google Group" to forward messages about OSE's online presence and Internet infrastructure (eg OSE Server) to anyone who is a member. The reason I'm setting this up is so that an event like what just happened with backblaze (we failed to pay ~$10 for a couple months, and they deleted all our backups!) won't happen again, because we can setup service-specific accounts (eg for backblaze) to forward all of its mail to this email list, which in-turn will make these alerts appear in multiple people's inboxes.

Please note that "reset password" functionality usually works by sending a link to a user's email address, so we should assume that anyone on this list can login to these services, even if they don't have the account password. So please only ever put trusted users on this list.

Cheers,
Michael Altfield
- I set the "subscription" to "Each Email" -- as opposed to the other options, which I guess group together all the emails into a summary email
- To re-gain access to our backblaze email account, I logged into the admin.google.com panel
- Clicked Directory -> Users
- Clicked on the backblaze username
- Clicked on the "Security" tab
- Scrolled-down to "Login challenge" and clicked the "TURN OFF FOR 10 MINS" button
- finally, this time when I attempted to login to the backblaze gmail account, I was let in!
- I saw some security alerts and marked them all as "yeah, it was me"
- I setup email forwarding of all incoming mail to this new address
- Additionally, from the backblaze gmail UI, I went to Settings -> Accounts -> "Grant access to your account" and I added marcin & myself to have "access" to the backblaze account. This is not a replacement for the Google Groups list, but I'm hoping it's another way for our users to be able to login as the backblaze user (using our own creds) to be able to read and send mail on behalf of this user
- Surprisingly, I don't think we had *any* documentation on the wiki about Google Workspace, so I created Google Workspace
- I documented what I did today, and how we should use Google Groups for service-specific accounts
- I logged-into our backblaze account, and I see that the $6.12 account balance is marked as "Paid"
- I sent an email reply to Backblaze support asking to confirm that our account is now in good-standing, and that we don't need to do anything further to prevent data loss
- I checked our "buckets" page, and they're both there (thank god)
- ose-server-backups @ 446.4 GB
- ose-dev-server-backups @ 303.1 MB
- I also went ahead and set the following service-specific accounts to forward to our new Google Groups list
- Let's Encrypt
- Cloudflare
- Status Cake
- I sent Marcin an email asking him to click the "accept" link in the emails that he received
- I replied to the Backblaze support ticket again, this time asking for them to forward some feature requests to their product team which, if implemented, would greatly reduce the risk of us losing our data due to an event like this in the future
Hi Joshua,

> I'd be happy to forward over your interest in these alternate options to our teams for consideration.

Yes, we do have some feature requests that we'd like you to forward to your product team.

1. We would like the ability to decrease the amount of days in the Grace Period from 30 days to 0 days, such that it increases the No Service period from 14 days to 44 days.
2. We would like the ability to add additional users to our account, such that individual users can login (with their own personal username/credentials), and so that each user will receive "Billing Failed" email alerts separately in their own inboxes.
3. We would like to be able to transfer a sum of funds into our Backblaze account in excess of a given bill, such that we can maintain a positive balance from which Backblaze will deduct monthly fees, in the event that credit card payments fail.
4. We would like the ability to pay with cryptocurrency, as a backup payment method in the event that we have issues with our bank blocking our credit card payments.

Can you please forward these requests to your product team, and let us know if any are currently planned?
- within an hour, the backblaze support rep confirmed that our account is now in good-standing and that none of the features we requested are currently planned
I can confirm that the $6.12 charge has been successfully paid for and your account is now in good standing. No further action will be needed on that end. As for those feature requests, thank you for providing your input! I'll be sure to pass these over to our product team. At the moment, none of these requests are currently planned but I'll pass it forward for consideration. Let me know if you have any questions or any other feature requests.
..
- ok -- now -- with all that in-place, it's time to actually setup the new backblaze api keys, so we can get hetzner3 uploading its backups to backblaze
- currently we have 5x application keys (besides the master keys, which we shouldn't be using anywhere)
- this is odd; I would expect us to only have 1 or 2 keys in use
- I re-visited my notes from Maltfield_Log/2022#Fri_October_28.2C_2022, when I last setup rclone. It was when some python version on the OS broke b2 and our backups, so I switched from b2 (which was not available in yum on CentOS) to rclone
- Per my notes, however, it looks like I can't actually create an append-only (protected from ransomware) key with the Backblaze B2 WUI; I *have* to use the b2 CLI app directly
- The good news is: the b2 cli is directly available in the official Debian repos. So now that we've switched to Debian, we can simply install the backblaze b2 CLI
- ah, yeah, my notes from 2022-10-28 say that I created two keys:
- One for the b2 user with only readFiles, writeFiles (named prod-append-only-2022-10)
- And one for the root user with listFiles, readFiles, writeFiles (named prod-list-and-append-only-2022-10)
- I spent some time starting to document this at Backblaze
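Per the notes above, an append-only key has to be minted with the b2 CLI rather than the WUI. A minimal sketch of what that invocation looks like, assuming the CLI's `create-key` subcommand; the key name `prod-append-only-2024-09` is a hypothetical example, and this is printed as a dry run so nothing is created without credentials:

```shell
# Sketch (assumption: b2 CLI 'create-key' subcommand; key name is hypothetical).
# Run 'backblaze-b2 authorize-account' with a master key first.
BUCKET="ose-server-backups"
CMD="backblaze-b2 create-key --bucket $BUCKET prod-append-only-2024-09 readFiles,writeFiles"
echo "DRY RUN: $CMD"
```

An "append-only" key is simply one whose capabilities exclude deleteFiles; the root user's key would get listFiles,readFiles,writeFiles instead, matching the two 2022 keys above.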
Fri Sep 20, 2024
- Marcin paid the backblaze bill last night
- he said that after payment, all our buckets (and therefore all of our backup data) was gone
- I submitted a support request asking [a] if we can restore the data, [b] how we can build better reporting tools to notify us of failed payments and [c] how we can setup better payment methods to avoid this issue in the future
Hi, We just discovered that our bank was blocking payments to backblaze "for our protection" As soon as we discovered the issue, we settled the balance. Unfortunately, now when we login, we don't see any buckets. Is it possible to restore the data? Also important: how can we prevent this from happening again? I see two major areas for improvement: 1. Better Alerting 2. Better Payment Methods === Better Alerting === We have a script that runs once per month (on the second of every month) that checks (with rclone) to see if our monthly, weekly, and yearly backups can be found in our B2 bucket. This works great when we have a technical issue. If a backup is missing, our server generates a high-priority email alert that gets sent out to everyone in our small non-profit organization. Unfortunately, we just learned that it, apparently, *doesn't work for payment issues*. Here's the problem: On 2024-09-02, our Backblaze Report indicated no issues – even though payment wasn't made for 17 days at that point. On 2024-10-02, we would have received our first report indicating an error. We only discovered the payment issue early because I happened to login to the Backblaze B2 WUI to create a new API key. We only do this every few years, so it was incredible that I noticed there was an issue with the account, at all. How can we update our reporting bash script (which uses rclone, not the backblaze CLI tool) to throw an error when the account is not in good-standing? How long is the grace period? Is it possible for us to make all our API calls fail during the grace period, yet still retain our data during the grace period? This would make it so that our reporting tools alert us about the lack-of-payment, without the risk of data loss of our very, very important backups. === Better Payment Methods === We have had numerous issues (with a couple different banks at this point) where our CNP transactions fail due to false-positives. 
We have to call the bank and tell them that, yes, that recurring transaction for Backblaze B2 is still valid. We've also had this issue with other service providers. Usually, to prevent the risk of service shutdown due to bank-blocked auto-payments, we hedge this risk by transferring a large sum of money to our service provider's accounts yearly -- such that we maintain a balance with them. We keep the balance on the account sufficient to cover the costs of 1 year of service, so if we miss 12 monthly payments in-a-row, they can be deducted from the balance without service disruptions. You get paid. Our service doesn't shutdown. Everyone wins. Some of these service providers accept cryptocurrency, which – while not a good option for recurring payments – it works great to top-up the account balance, because there's no faulty "fraud detection" system that can block the payment; payments always goes through, unlike with less-secure credit card systems utilizing faulty "fraud detection" systems. Is it possible for us to transfer a positive balance to our Backblaze account of, say, $100 – so that you can deduct payments from this account balance in the event that there's a failed payment due to issues with the bank? And what alternative payment methods exist? Is cryptocurrency an option? Thank you, Michael Altfield Senior Technology Advisor PGP Fingerprint: 8A4B 0AF8 162F 3B6A 79B7 70D2 AA3E DF71 60E2 D97B Open Source Ecology www.opensourceecology.org -- *Full Disclosure:* OSE works openly. All conversations in this email are intended to be transparent and subject to sharing, with due respect. OSE does not sign NDAs in order to promote collaboration. All of our work is libre or open source. If you are discussing potential hardware development collaboration, your work must also be open source pursuant to the Open Source Hardware Association definition <http://www.oshwa.org/definition/>.
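One improvement the email above asks about can be sketched locally: teach the rclone-based report script to distinguish a billing/auth lockout from genuinely missing backups. This assumes rclone surfaces the suspension as an "access_denied" authentication error on stderr, as it did during this outage; `check_b2_access` is a hypothetical helper name:

```shell
# Sketch: separate an auth/billing lockout (403 access_denied) from an
# ordinary failure, so the report can raise a distinct high-priority alert.
check_b2_access() {
  # $1 = remote:bucket; returns 0 = OK, 2 = auth/billing failure, 1 = other error
  local err
  if err="$(rclone lsf "$1" 2>&1 >/dev/null)"; then
    return 0
  fi
  if printf '%s' "$err" | grep -q 'access_denied'; then
    return 2
  fi
  return 1
}
```

The report script could then email a separate "ACCOUNT NOT IN GOOD STANDING" alert whenever this returns 2, instead of only listing missing backup files.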
- I tried to login to our backblaze-specific gapps account, but I again got an error demanding a phone number to "recover" my account (a false-positive, even though I entered the correct password on the first attempt!)
- The error said to contact the gapps admin. I am the admin!
- I tried to login to my gapps admin account. Last time I tried this (a couple days ago), I got the same error, but today it let me in
- I immediately set up 2FA via TOTP. There's no guarantee, but I hope that this additional protection will prevent Google from locking me out of my own account, just because they can't see any history in their panoptic datasets from tracking my browser's fingerprint across the Internet.
- anyway, I went to the admin console for the OSE gapps https://admin.google.com/
- I don't see any alerts about failed logins for the backblaze user
- if I open the backblaze user, I still see nothing. If I click on the "Investigate" tab and click "View logs" next to "Failed sign-ins", then I do see an entry for a failed login attempt 2 days ago
- curiously that entry says "Is suspicious" = "False"
- In another ephemeral/sandboxed browser, I tried to login as the backblaze user; same issue.
- After I correctly enter the username and password on the first try, I'm told to "Enter a phone number to get a text message with a verification code."
- For security (and practicality; this is a shared account so it should definitely not be tied to anyone's phone number), I ignore the request and click "Try another way" -- but really I just want to click "skip"; I already entered the correct password on the first try -- just let me in!
- It then brings me to a page that says "Choose how you want to sign in". I have two options: "Get a verification code sent to your phone" or "Get help"
- of course, this is absurd; I've already signed-in! Why can't I just skip this?
- The only option is "Get Help", which then just displays "Contact your domain admin for help. Learn more"
- If I click "Learn more", I'm brought to this page https://support.google.com/accounts/answer/181627?hl=en
Sign in to your work, school, or other group account You can't reset or recover the password to a work, school, or other group account on this page because your organization has restricted password changes. You'll need to contact your administrator. For students who use Classroom, go to Troubleshooting for students.
- But ^ that page is ridiculous and not relevant. I'm not trying to reset or change my password; I just want to login! I already auth'd! Let me in!!
- Meanwhile, with my admin hat on in the other browser -- wtf am I supposed to do? Where's the button to say: "Google, stop being an asshole, just let that user in if they entered the correct password. Why are you being so stupid, Google?"
- If I refresh the "Investigate" tab, I don't even see the failed login from just now
- I did find the docs on the "failed login" entry, at least https://developers.google.com/admin-sdk/reports/v1/appendix/activity/login#login_failure
- If I click on the entry for my failed login attempt to my own admin account on 2024-09-18 (two days ago), I don't even see 'login_failure_type' -- like it's saying I failed to login, but there is no reason it failed. Yeah, that tracks.
- One of the options for 'login_failure_type' is 'login_failure_invalid_password' -- so, yeah, I guess this is saying that I had the right password, but it failed anyway. Why tho!?!
- Unfortunately, if I still can't login to this account, then I won't be able to see the response from backblaze support
- in the meantime, I opened this question on SE asking why the fuck Google is blocking me from logging-in https://webapps.stackexchange.com/questions/177032/why-is-google-blocking-a-user-login-with-failed-login-when-they-entered-the-co
- I found this, but there is no answer saying how to disable this faulty behavior https://webapps.stackexchange.com/questions/115156/how-to-stop-googles-google-prevented-suspicious-attempt
- I found this thread on reddit from another equally frustrated gapps admin https://old.reddit.com/r/gsuite/comments/14psu8s/how_do_i_disable_login_challenges_permanently/
- A common suggestion is to enable 2FA. That's not a great solution for a shared account. Anyway, a 100-char random passphrase should be good enough. And storing a 2FA secret key in the same keepass DB won't add any additional security.
- Ok, google may internally refer to this as a "login challenge security method" -- not very clear because a password is a "login challenge". But, anyway, here's a KB article by Google titled "How to disable login challenge security method permanently" https://knowledge.workspace.google.com/kb/how-to-disable-login-challenge-security-method-permanently-000007696
- the answer is totally unsatisfying. It's not a toggle button. It says "This is working to product specification. Login challenges cannot be disabled permanently." They say to enable 2FA.
- damn, it's a straight-up "we don't care about your business needs or that our tech is faulty and locking you out of your own accounts; fuck you and wontfix".
Tue Sep 18, 2024
- I need to create a new set of backblaze b2 api keys for hetzner3 (and then configure rclone to use it for backups)
- unfortunately, after I logged into the backblaze wui, I got an error
- I clicked on "application keys" in the left-hand navigation menu, and I got a similar error
We were unable to retrieve your keys. Error: failed to advance iterator: B2 Storage API(s) not Enabled Please contact support if this continues.
- oh man, that was replaced with this
Your access to B2 has been suspended because your account has not been in good standing and your grace period has now ended. Please review your account and update your payment method at payment history, or contact tech support for assistance.
- shit, I checked the payments page and I see a lot of red
- last successful payment was July 2024
- there are 4x failed payments since then
- I manually ran the backup script; it alerted us immediately that our key was bad
[root@opensourceecology backups]# ./backupReport.sh INFO: email body below ATTENTION: BACKUPS MISSING! 2024/09/19 01:40:43 Failed to create file system for "b2:ose-server-backups": failed to authorize account: failed to authenticate: B2 Storage API(s) not Enabled (403 access_denied) WARNING: First of this month's backup (20240901) is missing! WARNING: First of last month's backup (20240801) is missing! WARNING: Yesterday's backup (20240918) is missing! WARNING: The day before yesterday's backup (20240917) is missing! See below for the contents of the backblaze b2 bucket = ose-server-backups 2024/09/19 01:40:43 Failed to create file system for "b2:ose-server-backups": failed to authorize account: failed to authenticate: B2 Storage API(s) not Enabled (403 access_denied) --- Note: This report was generated on 20240919_014042 UTC by script '/root/backups/backupReport.sh' This script was triggered by '/etc/cron.d/backup_to_backblaze' For more information about OSE backups, please see the relevant documentation pages on the wiki: * https://wiki.opensourceecology.org/wiki/Backblaze * https://wiki.opensourceecology.org/wiki/OSE_Server#Backups [root@opensourceecology backups]#
- Our last backups report came on Sep 02, and it was fine. So I guess our "grace period" was active then, and it ended sometime in the past ~2 weeks
No missing backups detected See below for the contents of the backblaze b2 bucket = ose-server-backups 21349753244 monthly_hetzner2_20231001_072001.tar.gpg 21360808568 monthly_hetzner2_20231101_072001.tar.gpg 21360301269 monthly_hetzner2_20231201_072001.tar.gpg 21820017340 monthly_hetzner2_20240201_072001.tar.gpg 21683700909 monthly_hetzner2_20240301_072001.tar.gpg 21660296728 monthly_hetzner2_20240401_072001.tar.gpg 21790035424 monthly_hetzner2_20240501_072001.tar.gpg 21603737883 monthly_hetzner2_20240601_072001.tar.gpg 21663769333 monthly_hetzner2_20240701_072001.tar.gpg 21991147307 monthly_hetzner2_20240801_072001.tar.gpg 21896377523 monthly_hetzner2_20240901_072001.tar.gpg 22031705783 weekly_hetzner2_20240812_072001.tar.gpg 21980940370 weekly_hetzner2_20240819_072001.tar.gpg 21942660432 weekly_hetzner2_20240826_072001.tar.gpg 21902006508 weekly_hetzner2_20240902_072001.tar.gpg 17516124812 yearly_hetzner2_20190101_111520.tar.gpg 18872422001 yearly_hetzner2_20200101_072001.tar.gpg 19827971632 yearly_hetzner2_20210101_072001.tar.gpg 21079942509 yearly_hetzner2_20230101_072001.tar.gpg 21541199047 yearly_hetzner2_20240101_072001.tar.gpg --- Note: This report was generated on 20240903_042001 UTC by script '/root/backups/backupReport.sh' This script was triggered by '/etc/cron.d/backup_to_backblaze' For more information about OSE backups, please see the relevant documentation pages on the wiki: * https://wiki.opensourceecology.org/wiki/Backblaze * https://wiki.opensourceecology.org/wiki/OSE_Server#Backups
- I tried to login to our OSE-specific backblaze gmail account for this (which I would hope was setup to forward emails to Marcin & myself), but I got an error on-login demanding a phone number. When I said I didn't have a phone number (it's definitely not a good idea to link phone numbers to google accounts), it said to contact the gapps admin. That's me!!
- I tried to login as my gapps admin account, and it had the same issue: enter phone number. Say no, and it says "enter the last password that you remember" — as if I didn't enter the creds correctly & perfectly the first time. Then it said to ask another admin to recover my account
- I did *not* forget my password. This happens to me ~50% of the time I use Google, and it's why I never recommend orgs use Google. They lock you out of your own account, even if you enter the correct credentials on the very first try. Google's faulty "fraud protection" isn't FOSS, but it's most likely using some super-shitty machine-learning anomaly detection that false-positives on me because they can't track my activity on the internet. I'm "fresh" to them, so they raise a flag and ban me from my own account. There are numerous discussions on the Internet of people losing access to their google accounts because of this.
- anyway, if I can't login to google, then I can't investigate email alerts
Mon Sep 17, 2024
- Marcin emailed me today saying that he can't email to gmail (but apparently OSE-to-OSE emails work)
- I replied-all, and I got an error back from Catarina's email
Message blocked Your message to OBFUSCATED@gmail.com has been blocked. See technical details below for more information. Learn more here: https://support.google.com/mail/answer/81126#authentication The response was: 550 5.7.26 Your email has been blocked because the sender is unauthenticated. Gmail requires all senders to authenticate with either SPF or DKIM. Authentication results: DKIM = did not pass SPF [opensourceecology.org] with ip: [209.85.220.41] = did not pass For instructions on setting up authentication, go to https://support.google.com/mail/answer/81126#authentication d75a77b69052e-459aad0da83sor46218481cf.5 - gsmtp
- I checked our DNS, and I got this
user@disp3433:~$ while true; do date; dig -t TXT opensourceecology.org; sleep 300; echo; done Tue Sep 17 11:58:03 PM -05 2024 ; <<>> DiG 9.18.28-1~deb12u2-Debian <<>> -t TXT opensourceecology.org ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 21111 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 1232 ;; QUESTION SECTION: ;opensourceecology.org. IN TXT ;; ANSWER SECTION: opensourceecology.org. 120 IN TXT "spf1=v a mx include:_spf.google.com ip4:78.46.3.178 ip4:138.201.84.223 ip4:144.76.164.201 ip6:2a01:4f8:200:40d7::2 ~all" ;; Query time: 113 msec ;; SERVER: 10.139.1.1#53(10.139.1.1) (UDP) ;; WHEN: Tue Sep 17 23:58:05 -05 2024 ;; MSG SIZE rcvd: 182 ^C user@disp3433:~$
- mxtoolbox.com also complains about the SPF record https://mxtoolbox.com/emailhealth/opensourceecology.org/
- oh, yikes, looks like it says "spf1=v" and it should be "v=spf1"
- I quickly changed it in cloudflare, and now it looks good
user@disp3433:~$ while true; do date; dig -t TXT opensourceecology.org; sleep 300; echo; done Wed Sep 18 12:02:36 AM -05 2024 ; <<>> DiG 9.18.28-1~deb12u2-Debian <<>> -t TXT opensourceecology.org ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10396 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 1232 ;; QUESTION SECTION: ;opensourceecology.org. IN TXT ;; ANSWER SECTION: opensourceecology.org. 120 IN TXT "v=spf1 a mx include:_spf.google.com ip4:78.46.3.178 ip4:138.201.84.223 ip4:144.76.164.201 ip6:2a01:4f8:200:40d7::2 ~all" ;; Query time: 106 msec ;; SERVER: 10.139.1.1#53(10.139.1.1) (UDP) ;; WHEN: Wed Sep 18 00:02:36 -05 2024 ;; MSG SIZE rcvd: 182 ^C user@disp3433:~$
- I also refreshed mxtoolbox.com, and it no longer complains about the SPF record
- I sent marcin another reply, and it worked.
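The transposed record could have been caught by a trivial validity check, since SPF requires the record to begin with the exact version tag `v=spf1`. A sketch (`valid_spf` is a hypothetical helper):

```shell
# Sketch: a check that would have flagged the broken "spf1=v ..." record,
# because a valid SPF record must start with the literal tag "v=spf1".
valid_spf() {
  case "$1" in
    "v=spf1"|"v=spf1 "*) return 0 ;;
    *) return 1 ;;
  esac
}
# The live record could be fetched with e.g.:
#   dig +short -t TXT opensourceecology.org
```

Wiring something like this into a monitoring cron (alongside the backup report) would catch DNS typos before gmail starts bouncing mail.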
Mon Sep 16, 2024
- since last night, I have 4 PGP-encrypted emails in my inbox from wazuh@hetzner3.opensourceecology.org; it's working!
- it's warning me about a trojaned version of /bin/diff, but I've seen that error on other systems; seems to be a false-positive on Debian
- there's already a bunch of open/duplicate reports of this; wazuh team seems sloppy in managing their tickets https://github.com/wazuh/wazuh/issues/13278#issuecomment-2333707523
- but a PR was submitted last week https://github.com/wazuh/wazuh/issues/13278#issuecomment-2335327334
- anyway, the sad truth is that OSE has 0 full-time sysadmins, so nobody is probably going to be reading wazuh alert emails, anyway. It's still useful because:
- it creates an off-system audit log that can be reviewed post-incident
- it's a solid foundation for the future, if OSE scales and actually gets a full-time sysadmin team
- the other contents of the wazuh alerts overnight were netstat diff reports
- looks like some sshd connection disappeared, probably my laptop going to sleep
- oh, and I got another alert where the sshd connection popped-up again at 09:30-ish, so -- yeah -- my laptop reconnecting again
tcp 127.0.0.1:47362 0.0.0.0:* 1612/sshd
...
- I also went ahead and updated the hostname, so my shell clearly says which system I'm working-on
root@mail ~ # hostnamectl Static hostname: mail Icon name: computer-desktop Chassis: desktop 🖥️ Machine ID: 6f64c84dc1094f43a1acbb37ba58b697 Boot ID: 4fb17ee3a5884a4f8f3a219ffd329cbc Operating System: Debian GNU/Linux 12 (bookworm) Kernel: Linux 6.1.0-21-amd64 Architecture: x86-64 Hardware Vendor: FUJITSU Hardware Model: D3401-H1 Firmware Version: V5.0.0.11 R1.29.0 for D3401-H1x root@mail ~ # root@mail ~ # hostnamectl set-hostname hetzner3 root@mail ~ # root@mail ~ # hostnamectl Static hostname: hetzner3 Icon name: computer-desktop Chassis: desktop 🖥️ Machine ID: 6f64c84dc1094f43a1acbb37ba58b697 Boot ID: 4fb17ee3a5884a4f8f3a219ffd329cbc Operating System: Debian GNU/Linux 12 (bookworm) Kernel: Linux 6.1.0-21-amd64 Architecture: x86-64 Hardware Vendor: FUJITSU Hardware Model: D3401-H1 Firmware Version: V5.0.0.11 R1.29.0 for D3401-H1x root@mail ~ # root@mail ~ # cat /etc/hosts ### Hetzner Online GmbH installimage 127.0.0.1 localhost.localdomain localhost 144.76.164.201 mail.opensourceecology.org mail ::1 ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters ff02::3 ip6-allhosts 2a01:4f8:200:40d7::2 mail.opensourceecology.org mail root@mail ~ # root@mail ~ # vim /etc/hosts ... root@mail ~ # root@mail ~ # cat /etc/hosts ### Hetzner Online GmbH installimage 127.0.0.1 localhost.localdomain localhost 144.76.164.201 hetzner3.opensourceecology.org hetzner3 ::1 ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters ff02::3 ip6-allhosts 2a01:4f8:200:40d7::2 hetzner3.opensourceecology.org hetzner3 root@mail ~ #
...
- now that we've used ansible to harden and configure our security-sensitive base packages, I want to setup backups before provisioning anything else (like databases, web servers, etc)
- so I commented-out all the roles except 'maltfield.backups' in ansible, and gave it a run; looks successful
user@ose:~/sandbox_local/ansible/hetzner3$ ansible-playbook provision.yml PLAY [hetzner3] ************************************************************************ TASK [Gathering Facts] ***************************************************************** ok: [hetzner3] TASK [maltfield.backups : install backups prereqs] ************************************* changed: [hetzner3] TASK [maltfield.backups : Make root's hardened backups dir] **************************** changed: [hetzner3] TASK [maltfield.backups : Add b2user user] ********************************************* changed: [hetzner3] TASK [maltfield.backups : Make b2user's hardened backups] ****************************** changed: [hetzner3] TASK [maltfield.backups : Make b2user's sync dir] ************************************** changed: [hetzner3] TASK [maltfield.backups : Backup logs dir] ********************************************* changed: [hetzner3] TASK [maltfield.backups : Backup script] *********************************************** changed: [hetzner3] TASK [maltfield.backups : Backup Report script] **************************************** changed: [hetzner3] TASK [maltfield.backups : Backup README.txt] ******************************************* changed: [hetzner3] TASK [maltfield.backups : Backup cron] ************************************************* changed: [hetzner3] TASK [install basic essential packages] ************************************************ ok: [hetzner3] PLAY RECAP ***************************************************************************** hetzner3 : ok=12 changed=10 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0 user@ose:~/sandbox_local/ansible/hetzner3$
- here's what it put on the server
root@mail ~ # ls -lah /root/backups/ total 24K drwx------ 2 root root 4.0K Sep 16 18:46 . drwx------ 8 root root 4.0K Sep 16 18:46 .. -rwx------ 1 root root 3.2K Sep 16 18:45 backupReport.sh -rwx------ 1 root root 6.4K Sep 16 18:45 backup.sh -rwx------ 1 root root 1.1K Sep 16 18:45 README.txt root@mail ~ # root@mail ~ # ls -lah /etc/cron.d total 24K drwxr-xr-x 2 root root 4.0K Sep 16 18:46 . drwxr-xr-x 84 root root 4.0K Sep 16 18:46 .. -rw-r--r-- 1 root root 248 Sep 16 18:45 backup_to_backblaze -rw-r--r-- 1 root root 201 Mar 5 2023 e2scrub_all -rw-r--r-- 1 root root 563 Jul 31 23:44 mdadm -rw-r--r-- 1 root root 102 Mar 2 2023 .placeholder root@mail ~ # root@mail ~ # cat /etc/cron.d/backup_to_backblaze # Ansible managed SHELL=/bin/bash 20 07 * * * root sleep $(( RANDOM \% 3600 )) && time /bin/nice /root/backups/backup.sh &>> /var/log/backups/backup.log 20 04 03 * * root sleep $(( RANDOM \% 3600 )) && time /bin/nice /root/backups/backupReport.sh root@mail ~ # root@mail ~ # dpkg -l | grep -i rclone ii rclone 1.60.1+dfsg-2+b5 amd64 rsync for commercial cloud storage root@mail ~ #
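A side note on the cron entries above: crontab(5) treats an unescaped `%` as the start of stdin data for the command (everything after it is fed to the job's stdin), which is why the jitter is written as `\%` there. In a plain shell, the same jitter is just:

```shell
# The same randomized delay as the cron entries, minus crontab's \% escaping:
jitter=$(( RANDOM % 3600 ))   # random 0-3599 second sleep, spreads load across the hour
echo "jitter=${jitter}s"
```

The jitter keeps multiple hosts (or multiple jobs on one host) from all hammering Backblaze at exactly 07:20.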
- oh, I also got an email from wazuh informing me that pigz was installed (this is new), nice :)
- it also says that /etc/passwd (and some other important system files) were updated
- unfortunately, the message was cut-off
- there's some things (intentionally) missing:
- the rclone config (with secret keys not to be stored in ansible)
- /root/backups/backup.settings (with passwords not to be stored in ansible)
- /root/backups/*.key (the private key used to encrypt/decrypt backups, not to be stored in ansible)
- I figured that I would generate a new key (it's a good idea to rotate keys at least once a year, and the last key was generated 2018-04-02 (over 6 years ago))
- I actually did generate a new key, but when I went to add it to our OSE shared keepass, I realized that I had already pre-generated an additional 2 unused keys
- I left a note in the OSE Keepass entry saying that I wanted to go ahead and pregenerate keys for future use, so that I was sure they'd be in Marcin's offline copy that we made when I was at FeF. Smart thinking, Michael!
2018-05-21: At the time of writing, we're currently using 'ose-backups-cron.key' for encrypting our backups. The following two key files are unused. They've just been pre-generated & stored here in keepass to ensure that our backups (namely, Marcin's Veracrypt'd usb drive) includes the current backup encryption key + future ones if we need to rotate keys.
- So I deleted my newly-generated key, and instead uploaded `ose-backups-cron.2.key` to hetzner3
- I also uploaded `ose-backups-cron.key` to hetzner3, of course, so that it can still decrypt backups made by hetzner2
root@mail ~/backups # ls -lah total 32K drwx------ 2 root root 4.0K Sep 16 19:36 . drwx------ 8 root root 4.0K Sep 16 19:12 .. -rwx------ 1 root root 3.2K Sep 16 18:45 backupReport.sh -rwx------ 1 root root 6.4K Sep 16 18:45 backup.sh -r-------- 1 root root 4.0K Sep 16 19:31 ose-backups-cron.2.key -r-------- 1 root root 4.0K Sep 16 19:34 ose-backups-cron.key -rwx------ 1 root root 1.1K Sep 16 18:45 README.txt root@mail ~/backups #
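For reference, this log doesn't record how the pre-generated `ose-backups-cron.*.key` files were originally created; below is a sketch of one common way to mint a ~4K random key file with the restrictive modes shown in the ls output above. The filename and the openssl method are assumptions, not the original procedure:

```shell
# Hypothetical sketch only -- the original key-generation method isn't
# recorded in this log. Creates a ~4K random key file, never group/world
# readable, matching the -r-------- modes shown above.
umask 077
KEYFILE="$(mktemp -d)/ose-backups-cron.example.key"   # hypothetical name/path
openssl rand -base64 3000 > "$KEYFILE"
chmod 400 "$KEYFILE"
```

Whatever the method, the important part is pre-generating spares and getting them into the offline keepass copy *before* they're ever needed, as the 2018 note describes.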
- next, I rsync'd over the old backup.settings file and made a backup of it
root@mail ~/backups # ls -lah total 36K drwx------ 2 root root 4.0K Sep 16 20:05 . drwx------ 8 root root 4.0K Sep 16 20:00 .. -rwx------ 1 root root 3.2K Sep 16 18:45 backupReport.sh -rw------- 1 root root 966 Oct 29 2022 backup.settings -rwx------ 1 root root 6.4K Sep 16 18:45 backup.sh -r-------- 1 root root 4.0K Sep 16 19:31 ose-backups-cron.2.key -r-------- 1 root root 4.0K Sep 16 19:34 ose-backups-cron.key -rwx------ 1 root root 1.1K Sep 16 18:45 README.txt root@mail ~/backups # cp backup.settings backup.settings.20240916 root@mail ~/backups # root@mail ~/backups # ls -lah total 40K drwx------ 2 root root 4.0K Sep 16 20:06 . drwx------ 8 root root 4.0K Sep 16 20:00 .. -rwx------ 1 root root 3.2K Sep 16 18:45 backupReport.sh -rw------- 1 root root 966 Oct 29 2022 backup.settings -rw------- 1 root root 966 Sep 16 20:06 backup.settings.20240916 -rwx------ 1 root root 6.4K Sep 16 18:45 backup.sh -r-------- 1 root root 4.0K Sep 16 19:31 ose-backups-cron.2.key -r-------- 1 root root 4.0K Sep 16 19:34 ose-backups-cron.key -rwx------ 1 root root 1.1K Sep 16 18:45 README.txt root@mail ~/backups #
- I edited the settings file, making a few changes
- I added the 'EMAIL_LIST=' variable, which was previously in the backupReport.sh file (but now that this is provisioned by ansible and we're uploading ansible roles to GitHub, I removed it because I didn't want our email addresses exposed on GitHub)
- I added a 'PIGZ=' variable, with the path to the `pigz` binary -- this will let us compress files across multiple cores, which was previously pegged to 1 CPU using just `gzip`
- I removed the amazon glacier creds; we don't use this anymore because min retention reqs for glacier made it insanely expensive
- I removed b2 binary stuff; this broke on hetzner2 due to python changes, and we ended-up switching to rclone, which is actually in the official debian repos
- I changed the path to the 'encryptionKeyFilePath=' from '/root/backups/ose-backups-cron.key' to '/root/backups/ose-backups-cron.2.key'
- I changed the path to the 'b2StagingDir' from '/home/b2user/sync' to '/home/b2user/backups/sync'
root@mail ~/backups # vim backup.settings ...
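A hedged sketch of how the relevant `backup.settings` lines might look after these edits -- the exact file isn't reproduced in this log, and the email value below is a placeholder, but the variable names are the ones listed above:

```shell
# Sketch of the edited backup.settings lines (values illustrative; the email
# address is a placeholder, not the real list):
EMAIL_LIST="REDACTED@example.com"                              # moved out of backupReport.sh
PIGZ="/usr/bin/pigz"                                           # multi-core compression
encryptionKeyFilePath="/root/backups/ose-backups-cron.2.key"   # rotated key
b2StagingDir="/home/b2user/backups/sync"                       # new staging path
```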
Sun Sep 15, 2024
- it looks like our ssh server keys aren't ideal (e.g. the 3072-bit RSA key should be 4096-bit)
root@mail /etc/ssh # ssh-keygen -lf ssh_host_rsa_key.pub 3072 SHA256:sS0Nbe0cHagMevsRQxZ7BkRw2UlrnwLxPSOIl1JUWF0 root@mail.opensourceecology.org (RSA) root@mail /etc/ssh # root@mail /etc/ssh # ssh-keygen -lf ssh_host_ecdsa_key.pub 256 SHA256:eTYBZQgqpbmUpMD1XKPFEkUiHHMwAfY4rB60osHI5jA root@mail.opensourceecology.org (ECDSA) root@mail /etc/ssh # root@mail /etc/ssh # ssh-keygen -lf ssh_host_ed25519_key.pub 256 SHA256:xmAIk3DB/dXxxm2PunsITI8fRDwBXjH2W5ULXKsysBQ root@mail.opensourceecology.org (ED25519) root@mail /etc/ssh #
- so I rotated the ssh host keys: I moved the current ones aside and generated new ones
revoked_keys root@mail /etc/ssh # mkdir original_host_keys.20240915 root@mail /etc/ssh # root@mail /etc/ssh # mv ssh_host* original_host_keys.20240915/ root@mail /etc/ssh # root@mail /etc/ssh # du -sh original_host_keys.20240915 28K original_host_keys.20240915 root@mail /etc/ssh # root@mail /etc/ssh # ssh-keygen -f /etc/ssh/ssh_host_rsa_key -t rsa -b 4096 -o -a 100 Generating public/private rsa key pair. Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /etc/ssh/ssh_host_rsa_key Your public key has been saved in /etc/ssh/ssh_host_rsa_key.pub The key fingerprint is: SHA256:1VDOqdL3X0cNt/2qDWHyqxLMd8Tz8hb2t89+kkEejPE root@mail The key's randomart image is: +---[RSA 4096]----+ | ... | | =.. | | ..==. .| | o .= Eo+| | oS +.+= o+| | +..=oo*..| | o .o+.=+| | . +=+*| | ...ooo+X| +----[SHA256]-----+ root@mail /etc/ssh # root@mail /etc/ssh # ssh-keygen -f /etc/ssh/ssh_host_ecdsa_key -t ecdsa -b 521 -o -a 100 Generating public/private ecdsa key pair. Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /etc/ssh/ssh_host_ecdsa_key Your public key has been saved in /etc/ssh/ssh_host_ecdsa_key.pub The key fingerprint is: SHA256:hxzCaH018Wn8tEn7sZ22P/WrNMmZ29w5tK/w0KHq8eE root@mail The key's randomart image is: +---[ECDSA 521]---+ | +. | | + . + . | | o + o = o | | . + o . + + | | S . *. | | . .o++*| | . =Bo+*| | =.=*+=| | .o Eo+BX| +----[SHA256]-----+ root@mail /etc/ssh # root@mail /etc/ssh # ssh-keygen -f /etc/ssh/ssh_host_ed25519_key -t ed25519 -a 100 Generating public/private ed25519 key pair. Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /etc/ssh/ssh_host_ed25519_key Your public key has been saved in /etc/ssh/ssh_host_ed25519_key.pub The key fingerprint is: SHA256:5SGqAP2v//HV5aJ0VyhI7AcG3p5MKTzgXyfvzpuIP0Q root@mail The key's randomart image is: +--[ED25519 256]--+ | . . | | . . + + . | |. . 
. * & . | | . . o #EO . | | . . . S.B + . o| | . o .o o o.| | . . .. + + o| | . +.* + o | | ....o.+.*. | +----[SHA256]-----+ root@mail /etc/ssh # root@mail /etc/ssh # ssh-keygen -lf ssh_host_rsa_key.pub 4096 SHA256:1VDOqdL3X0cNt/2qDWHyqxLMd8Tz8hb2t89+kkEejPE root@mail (RSA) root@mail /etc/ssh # root@mail /etc/ssh # ssh-keygen -lf ssh_host_ecdsa_key.pub 521 SHA256:hxzCaH018Wn8tEn7sZ22P/WrNMmZ29w5tK/w0KHq8eE root@mail (ECDSA) root@mail /etc/ssh # root@mail /etc/ssh # ssh-keygen -lf ssh_host_ed25519_key.pub 256 SHA256:5SGqAP2v//HV5aJ0VyhI7AcG3p5MKTzgXyfvzpuIP0Q root@mail (ED25519) root@mail /etc/ssh #
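One follow-up that host-key rotation always implies: every client that has connected before must drop the old entries from its `known_hosts`, or ssh will abort with a changed-host-key warning. The real one-liner would be `ssh-keygen -R '[144.76.164.201]:32415'`; the sketch below runs the same removal against a scratch file so it's safe to execute anywhere:

```shell
# Sketch: remove a stale host-key entry after server-side key rotation.
# Uses a scratch known_hosts file; on a real client, drop the -f argument
# to operate on ~/.ssh/known_hosts.
tmp="$(mktemp -d)"
ssh-keygen -q -t ed25519 -N '' -f "$tmp/demo_key"
printf '[144.76.164.201]:32415 %s\n' "$(cat "$tmp/demo_key.pub")" > "$tmp/known_hosts"
ssh-keygen -R '[144.76.164.201]:32415' -f "$tmp/known_hosts"
```

After removal, the next connection will prompt to accept the new key; the new fingerprints printed above can be used to verify it out-of-band.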
- I was able to fix wazuh
- first I just took the default ossec.conf file that was put in place at /var/ossec/etc/ossec.conf after wazuh 4.9.0-1 was installed fresh on Debian 12
- then I manually merged-in the important options from our old ossec.conf file (which was used on CentOS7 with, I think, ossec v2.9.0)
- one obvious change: the <rules> section no longer exists. It's now <ruleset> -- where (instead of listing individual .xml rules files) we define the directories where the .xml rules files live
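- for reference, the stock wazuh 4.x ruleset block looks roughly like this (a sketch from memory of the upstream defaults; the exact dirs and excludes in our merged file may differ):

```xml
<!-- sketch of the wazuh 4.x <ruleset> block that replaces the old <rules> list;
     paths are the upstream defaults, relative to /var/ossec -->
<ruleset>
  <decoder_dir>ruleset/decoders</decoder_dir>
  <rule_dir>ruleset/rules</rule_dir>
  <!-- site-local custom decoders & rules -->
  <decoder_dir>etc/decoders</decoder_dir>
  <rule_dir>etc/rules</rule_dir>
</ruleset>
```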
- anyway, after re-provisioning wazuh with this merged file, it started
- most importantly, wazuh should be managing active responses
- I tested this by throwing failed ssh "attacks" at it in a loop
- $ while true; do date; ssh -p 32415 144.76.164.201; sleep 1; echo; done
Sun 15 Sep 2024 12:53:02 PM -05 user@144.76.164.201: Permission denied (publickey).
Sun 15 Sep 2024 12:53:06 PM -05 user@144.76.164.201: Permission denied (publickey).
Sun 15 Sep 2024 12:53:11 PM -05 user@144.76.164.201: Permission denied (publickey).
Sun 15 Sep 2024 12:53:14 PM -05 user@144.76.164.201: Permission denied (publickey).
Sun 15 Sep 2024 12:53:18 PM -05 user@144.76.164.201: Permission denied (publickey).
Sun 15 Sep 2024 12:53:23 PM -05 user@144.76.164.201: Permission denied (publickey).
Sun 15 Sep 2024 12:53:26 PM -05 user@144.76.164.201: Permission denied (publickey).
- after this test, my existing ssh session got stuck (the active response had banned my own IP)
- I changed IPs, reconnected, and saw this in the logs triggering both 'host-deny' and 'firewall-drop'
2024/09/15 19:53:47 active-response/bin/host-deny: Starting 2024/09/15 19:53:47 active-response/bin/host-deny: {"version":1,"origin":{"name":"node01","module":"wazuh-execd"},"command":"add","parameters":{"extra_args":[],"alert":{"timestamp":"2024-09-15T19:53:47.144+0200","rule":{"level":10,"description":"sshd: brute force trying to get access to the system. Non existent user.","id":"5712","mitre":{"id":["T1110"],"tactic":["Credential Access"],"technique":["Brute Force"]},"frequency":8,"firedtimes":1,"mail":true,"groups":["syslog","sshd","authentication_failures"],"gdpr":["IV_35.7.d","IV_32.2"],"hipaa":["164.312.b"],"nist_800_53":["SI.4","AU.14","AC.7"],"pci_dss":["11.4","10.2.4","10.2.5"],"tsc":["CC6.1","CC6.8","CC7.2","CC7.3"]},"agent":{"id":"000","name":"mail"},"manager":{"name":"mail"},"id":"1726422827.1936013","previous_output":"Sep 15 17:53:42 mail sshd[100734]: Invalid user user from 146.70.188.50 port 38336\nSep 15 17:53:38 mail sshd[100722]: Invalid user user from 146.70.188.50 port 47542\nSep 15 17:53:34 mail sshd[100720]: Invalid user user from 146.70.188.50 port 47532\nSep 15 17:53:30 mail sshd[100718]: Invalid user user from 146.70.188.50 port 38758\nSep 15 17:53:26 mail sshd[100716]: Invalid user user from 146.70.188.50 port 38746\nSep 15 17:53:22 mail sshd[100714]: Invalid user user from 146.70.188.50 port 38732\nSep 15 17:52:52 mail sshd[100711]: Invalid user user from 146.70.188.50 port 51720","full_log":"Sep 15 17:53:46 mail sshd[100736]: Invalid user user from 146.70.188.50 port 38352","predecoder":{"program_name":"sshd","timestamp":"Sep 15 17:53:46","hostname":"mail"},"decoder":{"parent":"sshd","name":"sshd"},"data":{"srcip":"146.70.188.50","srcport":"38352","srcuser":"user"},"location":"journald"},"program":"active-response/bin/host-deny"}} 2024/09/15 19:53:47 active-response/bin/host-deny: {"version":1,"origin":{"name":"host-deny","module":"active-response"},"command":"check_keys","parameters":{"keys":["146.70.188.50"]}} 2024/09/15 19:53:47 
active-response/bin/host-deny: {"version":1,"origin":{"name":"node01","module":"wazuh-execd"},"command":"continue","parameters":{"extra_args":[],"alert":{"timestamp":"2024-09-15T19:53:47.144+0200","rule":{"level":10,"description":"sshd: brute force trying to get access to the system. Non existent user.","id":"5712","mitre":{"id":["T1110"],"tactic":["Credential Access"],"technique":["Brute Force"]},"frequency":8,"firedtimes":1,"mail":true,"groups":["syslog","sshd","authentication_failures"],"gdpr":["IV_35.7.d","IV_32.2"],"hipaa":["164.312.b"],"nist_800_53":["SI.4","AU.14","AC.7"],"pci_dss":["11.4","10.2.4","10.2.5"],"tsc":["CC6.1","CC6.8","CC7.2","CC7.3"]},"agent":{"id":"000","name":"mail"},"manager":{"name":"mail"},"id":"1726422827.1936013","previous_output":"Sep 15 17:53:42 mail sshd[100734]: Invalid user user from 146.70.188.50 port 38336\nSep 15 17:53:38 mail sshd[100722]: Invalid user user from 146.70.188.50 port 47542\nSep 15 17:53:34 mail sshd[100720]: Invalid user user from 146.70.188.50 port 47532\nSep 15 17:53:30 mail sshd[100718]: Invalid user user from 146.70.188.50 port 38758\nSep 15 17:53:26 mail sshd[100716]: Invalid user user from 146.70.188.50 port 38746\nSep 15 17:53:22 mail sshd[100714]: Invalid user user from 146.70.188.50 port 38732\nSep 15 17:52:52 mail sshd[100711]: Invalid user user from 146.70.188.50 port 51720","full_log":"Sep 15 17:53:46 mail sshd[100736]: Invalid user user from 146.70.188.50 port 38352","predecoder":{"program_name":"sshd","timestamp":"Sep 15 17:53:46","hostname":"mail"},"decoder":{"parent":"sshd","name":"sshd"},"data":{"srcip":"146.70.188.50","srcport":"38352","srcuser":"user"},"location":"journald"},"program":"active-response/bin/host-deny"}} 2024/09/15 19:53:47 active-response/bin/host-deny: Ended 2024/09/15 19:53:47 active-response/bin/firewall-drop: Starting 2024/09/15 19:53:47 active-response/bin/firewall-drop: 
{"version":1,"origin":{"name":"node01","module":"wazuh-execd"},"command":"add","parameters":{"extra_args":[],"alert":{"timestamp":"2024-09-15T19:53:47.144+0200","rule":{"level":10,"description":"sshd: brute force trying to get access to the system. Non existent user.","id":"5712","mitre":{"id":["T1110"],"tactic":["Credential Access"],"technique":["Brute Force"]},"frequency":8,"firedtimes":1,"mail":true,"groups":["syslog","sshd","authentication_failures"],"gdpr":["IV_35.7.d","IV_32.2"],"hipaa":["164.312.b"],"nist_800_53":["SI.4","AU.14","AC.7"],"pci_dss":["11.4","10.2.4","10.2.5"],"tsc":["CC6.1","CC6.8","CC7.2","CC7.3"]},"agent":{"id":"000","name":"mail"},"manager":{"name":"mail"},"id":"1726422827.1936013","previous_output":"Sep 15 17:53:42 mail sshd[100734]: Invalid user user from 146.70.188.50 port 38336\nSep 15 17:53:38 mail sshd[100722]: Invalid user user from 146.70.188.50 port 47542\nSep 15 17:53:34 mail sshd[100720]: Invalid user user from 146.70.188.50 port 47532\nSep 15 17:53:30 mail sshd[100718]: Invalid user user from 146.70.188.50 port 38758\nSep 15 17:53:26 mail sshd[100716]: Invalid user user from 146.70.188.50 port 38746\nSep 15 17:53:22 mail sshd[100714]: Invalid user user from 146.70.188.50 port 38732\nSep 15 17:52:52 mail sshd[100711]: Invalid user user from 146.70.188.50 port 51720","full_log":"Sep 15 17:53:46 mail sshd[100736]: Invalid user user from 146.70.188.50 port 38352","predecoder":{"program_name":"sshd","timestamp":"Sep 15 17:53:46","hostname":"mail"},"decoder":{"parent":"sshd","name":"sshd"},"data":{"srcip":"146.70.188.50","srcport":"38352","srcuser":"user"},"location":"journald"},"program":"active-response/bin/firewall-drop"}} 2024/09/15 19:53:47 active-response/bin/firewall-drop: {"version":1,"origin":{"name":"firewall-drop","module":"active-response"},"command":"check_keys","parameters":{"keys":["146.70.188.50"]}} 2024/09/15 19:53:47 active-response/bin/firewall-drop: 
{"version":1,"origin":{"name":"node01","module":"wazuh-execd"},"command":"continue","parameters":{"extra_args":[],"alert":{"timestamp":"2024-09-15T19:53:47.144+0200","rule":{"level":10,"description":"sshd: brute force trying to get access to the system. Non existent user.","id":"5712","mitre":{"id":["T1110"],"tactic":["Credential Access"],"technique":["Brute Force"]},"frequency":8,"firedtimes":1,"mail":true,"groups":["syslog","sshd","authentication_failures"],"gdpr":["IV_35.7.d","IV_32.2"],"hipaa":["164.312.b"],"nist_800_53":["SI.4","AU.14","AC.7"],"pci_dss":["11.4","10.2.4","10.2.5"],"tsc":["CC6.1","CC6.8","CC7.2","CC7.3"]},"agent":{"id":"000","name":"mail"},"manager":{"name":"mail"},"id":"1726422827.1936013","previous_output":"Sep 15 17:53:42 mail sshd[100734]: Invalid user user from 146.70.188.50 port 38336\nSep 15 17:53:38 mail sshd[100722]: Invalid user user from 146.70.188.50 port 47542\nSep 15 17:53:34 mail sshd[100720]: Invalid user user from 146.70.188.50 port 47532\nSep 15 17:53:30 mail sshd[100718]: Invalid user user from 146.70.188.50 port 38758\nSep 15 17:53:26 mail sshd[100716]: Invalid user user from 146.70.188.50 port 38746\nSep 15 17:53:22 mail sshd[100714]: Invalid user user from 146.70.188.50 port 38732\nSep 15 17:52:52 mail sshd[100711]: Invalid user user from 146.70.188.50 port 51720","full_log":"Sep 15 17:53:46 mail sshd[100736]: Invalid user user from 146.70.188.50 port 38352","predecoder":{"program_name":"sshd","timestamp":"Sep 15 17:53:46","hostname":"mail"},"decoder":{"parent":"sshd","name":"sshd"},"data":{"srcip":"146.70.188.50","srcport":"38352","srcuser":"user"},"location":"journald"},"program":"active-response/bin/firewall-drop"}} 2024/09/15 19:53:47 active-response/bin/firewall-drop: Ended
- I can confirm that my IP address was added to the hosts.deny file and to my live iptables rules too
root@mail /var/ossec/logs # cat /etc/hosts.deny # /etc/hosts.deny: list of hosts that are _not_ allowed to access the system. # See the manual pages hosts_access(5) and hosts_options(5). # # Example: ALL: some.host.name, .some.domain # ALL EXCEPT in.fingerd: other.host.name, .other.domain # # If you're going to protect the portmapper use the name "rpcbind" for the # daemon name. See rpcbind(8) and rpc.mountd(8) for further information. # # The PARANOID wildcard matches any host whose name does not match its # address. # # You may wish to enable this to ensure any programs that don't # validate looked up hostnames still leave understandable logs. In past # versions of Debian this has been the default. # ALL: PARANOID ALL:146.70.188.50 root@mail /var/ossec/logs # root@mail /var/ossec/logs # iptables-save | grep -i DROP :INPUT DROP [0:0] :FORWARD DROP [0:0] :OUTPUT DROP [0:0] -A INPUT -s 146.70.188.50/32 -j DROP -A INPUT -s 127.0.0.0/8 -d 127.0.0.0/8 -j DROP -A INPUT -j DROP -A FORWARD -s 146.70.188.50/32 -j DROP -A OUTPUT -j DROP root@mail /var/ossec/logs #
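- note for the future admin: to lift one of these bans manually (the IP shown is just the one from this test), delete the hosts.deny line and the matching iptables rules; a sketch, to be run as root on the server:

```shell
# manual un-ban sketch (requires root; matches the INPUT + FORWARD DROP rules shown above)
sed -i '/146\.70\.188\.50/d' /etc/hosts.deny
iptables -D INPUT   -s 146.70.188.50/32 -j DROP
iptables -D FORWARD -s 146.70.188.50/32 -j DROP
```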
- wazuh is super-powerful, but I really only want it for [a] active-response temp-bans (shown above) and [b] email alerts
- email alerts aren't going to work because postfix isn't installed yet; let's use ansible to install & configure postfix now
- pre-test: mailutils is installed, but postfix is not
root@mail /var/ossec/logs # echo "this is just a test" | /usr/bin/mail -r noreply@opensourceecology.org -s "test from hetzner3" "michael@michaelaltfield.net" mail: Cannot open mailer: No such file or directory mail: cannot send message: No such file or directory root@mail /var/ossec/logs #
- now let's try to install postfix with ansible
user@ose:~/sandbox_local/ansible/hetzner3$ ansible-playbook provision.yml [WARNING]: While constructing a mapping from /home/user/sandbox_local/ansible/hetzner3/roles/maltfield.postfix/tasks/main.yml, line 8, column 3, found a duplicate dict key (command). Using last defined value only. PLAY [hetzner3] ************************************************************************ TASK [Gathering Facts] ***************************************************************** ok: [hetzner3] TASK [maltfield.postfix : install postfix] ********************************************* changed: [hetzner3] TASK [maltfield.postfix : generate dh parameters file] ********************************* [WARNING]: Consider using the file module with mode rather than running 'chmod'. If you need to use command because file is insufficient you can add 'warn: false' to this command task or set 'command_warnings=False' in ansible.cfg to get rid of this message. fatal: [hetzner3]: FAILED! => {"ansible_facts": {"discovered_interpreter_python": "/usr/bin/python3"}, "changed": true, "cmd": ["chmod", "0400", "/etc/ssl/certs/dhparam.pem"], "delta": "0:00:00.003531", "end": "2024-09-15 20:03:32.776570", "msg": "non-zero return code", "rc": 1, "start": "2024-09-15 20:03:32.773039", "stderr": "chmod: cannot access '/etc/ssl/certs/dhparam.pem': No such file or directory", "stderr_lines": ["chmod: cannot access '/etc/ssl/certs/dhparam.pem': No such file or directory"], "stdout": "", "stdout_lines": []} PLAY RECAP ***************************************************************************** hetzner3 : ok=2 changed=1 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0 user@ose:~/sandbox_local/ansible/hetzner3$
- hmm, it does appear to have been started
root@mail ~ # systemctl status postfix ● postfix.service - Postfix Mail Transport Agent Loaded: loaded (/lib/systemd/system/postfix.service; enabled; preset: enabled) Active: active (exited) since Sun 2024-09-15 20:03:31 CEST; 2min 31s ago Docs: man:postfix(1) Process: 103774 ExecStart=/bin/true (code=exited, status=0/SUCCESS) Main PID: 103774 (code=exited, status=0/SUCCESS) CPU: 1ms Sep 15 20:03:31 mail systemd[1]: Starting postfix.service - Postfix Mail Transport Agent... Sep 15 20:03:31 mail systemd[1]: Finished postfix.service - Postfix Mail Transport Agent. root@mail ~ #
- but, yeah, it looks like ansible didn't create the dhparam file, which would explain why chmod says "no such file or directory"
root@mail ~ # ls -lah /etc/ssl/certs/ | grep -i dh root@mail ~ #
- here are the role's tasks
user@ose:~/sandbox_local/ansible/hetzner3$ head -n15 roles/maltfield.postfix/tasks/main.yml --- - name: install postfix apt: update_cache: yes pkg: - postfix - name: generate dh parameters file command: openssl dhparam -out /etc/ssl/certs/dhparam.pem 4096 command: chown root:root /etc/ssl/certs/dhparam.pem command: chmod 0400 /etc/ssl/certs/dhparam.pem args: creates: /etc/ssl/certs/dhparam.pem - name: Whitelisted Domains user@ose:~/sandbox_local/ansible/hetzner3$
- I executed these manually; they worked fine. Not sure why ansible failed
root@mail ~ # time openssl dhparam -out /etc/ssl/certs/dhparam.pem 4096 Generating DH parameters, 4096 bit long safe prime .............................................................+........................................................................................................................................................+....+...............................................................................................................................................................................................................................................+........................................................................................................................................................................................................................................................................................+..................................................................................................................................................................................................+...........................................................................................................................................................................+.................................................................................................................................................................+...................................................................................+...................................................+....................................................................................................................................................................................................................................................++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++
*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++*++* real 0m24.935s user 0m24.919s sys 0m0.016s root@mail ~ # root@mail ~ # chown root:root /etc/ssl/certs/dhparam.pem root@mail ~ # root@mail ~ # chmod 0400 /etc/ssl/certs/dhparam.pem root@mail ~ # root@mail ~ # ls -lah /etc/ssl/certs/ | grep -i dh -r-------- 1 root root 769 Sep 15 20:12 dhparam.pem root@mail ~ #
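- in hindsight, the WARNING at the top of the ansible run ("found a duplicate dict key (command)") explains the failure: YAML mappings silently keep only the last duplicate key, so of the three `command:` lines in that task only the chmod ever ran, and the dhparam file was never generated. A fix is one task per command (or the `file` module for owner/mode); a sketch keeping the same paths:

```yaml
# sketch: split the duplicate-key task into separate tasks
- name: generate dh parameters file
  command: openssl dhparam -out /etc/ssl/certs/dhparam.pem 4096
  args:
    creates: /etc/ssl/certs/dhparam.pem

- name: restrict dh parameters file ownership and permissions
  file:
    path: /etc/ssl/certs/dhparam.pem
    owner: root
    group: root
    mode: '0400'
```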
- I did a double-tap (re-ran ansible); it failed again, this time when restarting postfix
user@ose:~/sandbox_local/ansible/hetzner3$ ansible-playbook provision.yml [WARNING]: While constructing a mapping from /home/user/sandbox_local/ansible/hetzner3/roles/maltfield.postfix/tasks/main.yml, line 8, column 3, found a duplicate dict key (command). Using last defined value only. PLAY [hetzner3] ************************************************************************ TASK [Gathering Facts] ***************************************************************** ok: [hetzner3] TASK [maltfield.postfix : install postfix] ********************************************* ok: [hetzner3] TASK [maltfield.postfix : generate dh parameters file] ********************************* [WARNING]: Consider using the file module with mode rather than running 'chmod'. If you need to use command because file is insufficient you can add 'warn: false' to this command task or set 'command_warnings=False' in ansible.cfg to get rid of this message. ok: [hetzner3] TASK [maltfield.postfix : Whitelisted Domains] ***************************************** changed: [hetzner3] TASK [maltfield.postfix : Postfox main.cf] ********************************************* changed: [hetzner3] TASK [maltfield.postfix : Postfox master.cf] ******************************************* changed: [hetzner3] TASK [install basic essential packages] ************************************************ ok: [hetzner3] RUNNING HANDLER [maltfield.postfix : restart postfix] ********************************** ERROR! 
[mux 57866] 13:13:28.475680 E mitogen.[ssh.144.76.164.201:32415.sudo.root]: raw pickle was: b'\x80\x02(X&\x00\x00\x00ose-57937-746bb1080740-215639e850q\x00X\x16\x00\x00\x00ansible_mitogen.targetq\x01NX\n\x00\x00\x00run_moduleq\x02)cmitogen.core\nKwargs\nq\x03}q\x04X\x06\x00\x00\x00kwargsq\x05}q\x06(X\x0b\x00\x00\x00runner_nameq\x07X\x0e\x00\x00\x00NewStyleRunnerq\x08X\x06\x00\x00\x00moduleq\tcansible.utils.unsafe_proxy\nAnsibleUnsafeText\nq\nX\x16\x00\x00\x00ansible.legacy.systemdq\x0b\x85q\x0c\x81q\rX\x04\x00\x00\x00pathq\x0eX9\x00\x00\x00/usr/lib/python3/dist-packages/ansible/modules/systemd.pyq\x0fX\t\x00\x00\x00json_argsq\x10XJ\x02\x00\x00{"name": "postfix", "state": "restarted", "_ansible_check_mode": false, "_ansible_no_log": false, "_ansible_debug": false, "_ansible_diff": false, "_ansible_verbosity": 0, "_ansible_version": "2.10.17", "_ansible_module_name": "ansible.legacy.systemd", "_ansible_syslog_facility": "LOG_USER", "_ansible_selinux_special_fs": ["fuse", "nfs", "vboxsf", "ramfs", "9p", "vfat"], "_ansible_string_conversion_action": "warn", "_ansible_socket": null, "_ansible_shell_executable": "/bin/sh", "_ansible_keep_remote_files": false, "_ansible_tmpdir": null, "_ansible_remote_tmp": "~/.ansible/tmp"}q\x11X\x03\x00\x00\x00envq\x12}q\x13X\x14\x00\x00\x00interpreter_fragmentq\x14NX\t\x00\x00\x00is_pythonq\x15NX\n\x00\x00\x00module_mapq\x16}q\x17(X\x07\x00\x00\x00builtinq\x18]q\x19(X\x1a\x00\x00\x00ansible.module_utils._textq\x1aX\x1a\x00\x00\x00ansible.module_utils.basicq\x1bX\x1b\x00\x00\x00ansible.module_utils.commonq\x1cX/\x00\x00\x00ansible.module_utils.common._collections_compatq\x1dX(\x00\x00\x00ansible.module_utils.common._json_compatq\x1eX"\x00\x00\x00ansible.module_utils.common._utilsq\x1fX\'\x00\x00\x00ansible.module_utils.common.collectionsq X 
\x00\x00\x00ansible.module_utils.common.fileq!X&\x00\x00\x00ansible.module_utils.common.parametersq"X#\x00\x00\x00ansible.module_utils.common.processq#X$\x00\x00\x00ansible.module_utils.common.sys_infoq$X \x00\x00\x00ansible.module_utils.common.textq%X+\x00\x00\x00ansible.module_utils.common.text.convertersq&X+\x00\x00\x00ansible.module_utils.common.text.formattersq\'X&\x00\x00\x00ansible.module_utils.common.validationq(X$\x00\x00\x00ansible.module_utils.common.warningsq)X\x1b\x00\x00\x00ansible.module_utils.compatq*X\'\x00\x00\x00ansible.module_utils.compat._selectors2q+X%\x00\x00\x00ansible.module_utils.compat.selectorsq,X\x1b\x00\x00\x00ansible.module_utils.distroq-X#\x00\x00\x00ansible.module_utils.distro._distroq.X\x1a\x00\x00\x00ansible.module_utils.factsq/X,\x00\x00\x00ansible.module_utils.facts.ansible_collectorq0X$\x00\x00\x00ansible.module_utils.facts.collectorq1X!\x00\x00\x00ansible.module_utils.facts.compatq2X-\x00\x00\x00ansible.module_utils.facts.default_collectorsq3X#\x00\x00\x00ansible.module_utils.facts.hardwareq4X\'\x00\x00\x00ansible.module_utils.facts.hardware.aixq5X(\x00\x00\x00ansible.module_utils.facts.hardware.baseq6X*\x00\x00\x00ansible.module_utils.facts.hardware.darwinq7X-\x00\x00\x00ansible.module_utils.facts.hardware.dragonflyq8X+\x00\x00\x00ansible.module_utils.facts.hardware.freebsdq9X(\x00\x00\x00ansible.module_utils.facts.hardware.hpuxq:X(\x00\x00\x00ansible.module_utils.facts.hardware.hurdq;X)\x00\x00\x00ansible.module_utils.facts.hardware.linuxq<X*\x00\x00\x00ansible.module_utils.facts.hardware.netbsdq=X+\x00\x00\x00ansible.module_utils.facts.hardware.openbsdq>X)\x00\x00\x00ansible.module_utils.facts.hardware.sunosq?X$\x00\x00\x00ansible.module_utils.facts.namespaceq@X"\x00\x00\x00ansible.module_utils.facts.networkqAX&\x00\x00\x00ansible.module_utils.facts.network.aixqBX\'\x00\x00\x00ansible.module_utils.facts.network.baseqCX)\x00\x00\x00ansible.module_utils.facts.network.darwinqDX,\x00\x00\x00ansible.module_utils.facts.network.dra
gonflyqEX)\x00\x00\x00ansible.module_utils.facts.network.fc_wwnqFX*\x00\x00\x00ansible.module_utils.facts.network.freebsdqGX.\x00\x00\x00ansible.module_utils.facts.network.generic_bsdqHX\'\x00\x00\x00ansible.module_utils.facts.network.hpuxqIX\'\x00\x00\x00ansible.module_utils.facts.network.hurdqJX(\x00\x00\x00ansible.module_utils.facts.network.iscsiqKX(\x00\x00\x00ansible.module_utils.facts.network.linuxqLX)\x00\x00\x00ansible.module_utils.facts.network.netbsdqMX\'\x00\x00\x00ansible.module_utils.facts.network.nvmeqNX*\x00\x00\x00ansible.module_utils.facts.network.openbsdqOX(\x00\x00\x00ansible.module_utils.facts.network.sunosqPX \x00\x00\x00ansible.module_utils.facts.otherqQX\'\x00\x00\x00ansible.module_utils.facts.other.facterqRX%\x00\x00\x00ansible.module_utils.facts.other.ohaiqSX!\x00\x00\x00ansible.module_utils.facts.sysctlqTX!\x00\x00\x00ansible.module_utils.facts.systemqUX*\x00\x00\x00ansible.module_utils.facts.system.apparmorqVX&\x00\x00\x00ansible.module_utils.facts.system.capsqWX(\x00\x00\x00ansible.module_utils.facts.system.chrootqXX)\x00\x00\x00ansible.module_utils.facts.system.cmdlineqYX+\x00\x00\x00ansible.module_utils.facts.system.date_timeqZX.\x00\x00\x00ansible.module_utils.facts.system.distributionq[X%\x00\x00\x00ansible.module_utils.facts.system.dnsq\\X%\x00\x00\x00ansible.module_utils.facts.system.envq]X&\x00\x00\x00ansible.module_utils.facts.system.fipsq^X\'\x00\x00\x00ansible.module_utils.facts.system.localq_X%\x00\x00\x00ansible.module_utils.facts.system.lsbq`X)\x00\x00\x00ansible.module_utils.facts.system.pkg_mgrqaX*\x00\x00\x00ansible.module_utils.facts.system.platformqbX(\x00\x00\x00ansible.module_utils.facts.system.pythonqcX)\x00\x00\x00ansible.module_utils.facts.system.selinuxqdX-\x00\x00\x00ansible.module_utils.facts.system.service_mgrqeX.\x00\x00\x00ansible.module_utils.facts.system.ssh_pub_keysqfX&\x00\x00\x00ansible.module_utils.facts.system.userqgX"\x00\x00\x00ansible.module_utils.facts.timeoutqhX 
\x00\x00\x00ansible.module_utils.facts.utilsqiX"\x00\x00\x00ansible.module_utils.facts.virtualqjX\'\x00\x00\x00ansible.module_utils.facts.virtual.baseqkX,\x00\x00\x00ansible.module_utils.facts.virtual.dragonflyqlX*\x00\x00\x00ansible.module_utils.facts.virtual.freebsdqmX\'\x00\x00\x00ansible.module_utils.facts.virtual.hpuxqnX(\x00\x00\x00ansible.module_utils.facts.virtual.linuxqoX)\x00\x00\x00ansible.module_utils.facts.virtual.netbsdqpX*\x00\x00\x00ansible.module_utils.facts.virtual.openbsdqqX(\x00\x00\x00ansible.module_utils.facts.virtual.sunosqrX)\x00\x00\x00ansible.module_utils.facts.virtual.sysctlqsX\x1c\x00\x00\x00ansible.module_utils.parsingqtX)\x00\x00\x00ansible.module_utils.parsing.convert_boolquX\x1f\x00\x00\x00ansible.module_utils.pycompat24qvX\x1c\x00\x00\x00ansible.module_utils.serviceqwX\x18\x00\x00\x00ansible.module_utils.sixqxeX\x06\x00\x00\x00customqy]qzuX\x0e\x00\x00\x00py_module_nameq{X\x17\x00\x00\x00ansible.modules.systemdq|X\r\x00\x00\x00good_temp_dirq}X\x12\x00\x00\x00/root/.ansible/tmpq~X\x03\x00\x00\x00cwdq\x7fNX\t\x00\x00\x00extra_envq\x80NX\x0b\x00\x00\x00emulate_ttyq\x81\x88X\x0f\x00\x00\x00service_contextq\x82cmitogen.core\n_unpickle_context\nq\x83K\x00N\x86q\x84Rq\x85us\x85q\x86Rq\x87tq\x88.' An exception occurred during task execution. To see the full traceback, use -vvv. The error was: File "<stdin>", line 853, in _find_global fatal: [hetzner3]: FAILED! => {"msg": "Unexpected failure during module execution.", "stdout": ""} RUNNING HANDLER [maltfield.postfix : rebuild whitelisted domains] ********************** NO MORE HOSTS LEFT ********************************************************************* PLAY RECAP ***************************************************************************** hetzner3 : ok=7 changed=3 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0 user@ose:~/sandbox_local/ansible/hetzner3$
- I tried a manual restart & status; it looks fine to me on the server side
root@mail ~ # systemctl restart postfix root@mail ~ # root@mail ~ # systemctl status postfix ● postfix.service - Postfix Mail Transport Agent Loaded: loaded (/lib/systemd/system/postfix.service; enabled; preset: enabled) Active: active (exited) since Sun 2024-09-15 20:15:08 CEST; 5s ago Docs: man:postfix(1) Process: 105274 ExecStart=/bin/true (code=exited, status=0/SUCCESS) Main PID: 105274 (code=exited, status=0/SUCCESS) CPU: 1ms Sep 15 20:15:08 mail systemd[1]: Starting postfix.service - Postfix Mail Transport Agent... Sep 15 20:15:08 mail systemd[1]: Finished postfix.service - Postfix Mail Transport Agent. root@mail ~ # root@mail ~ # journalctl -u postfix Sep 15 20:03:31 mail systemd[1]: Starting postfix.service - Postfix Mail Transport Agent... Sep 15 20:03:31 mail systemd[1]: Finished postfix.service - Postfix Mail Transport Agent. Sep 15 20:15:06 mail systemd[1]: postfix.service: Deactivated successfully. Sep 15 20:15:06 mail systemd[1]: Stopped postfix.service - Postfix Mail Transport Agent. Sep 15 20:15:06 mail systemd[1]: Stopping postfix.service - Postfix Mail Transport Agent... Sep 15 20:15:08 mail systemd[1]: Starting postfix.service - Postfix Mail Transport Agent... Sep 15 20:15:08 mail systemd[1]: Finished postfix.service - Postfix Mail Transport Agent. root@mail ~ #
- uhh, third time's a charm? on running again, it didn't throw an error
user@ose:~/sandbox_local/ansible/hetzner3$ ansible-playbook provision.yml [WARNING]: While constructing a mapping from /home/user/sandbox_local/ansible/hetzner3/roles/maltfield.postfix/tasks/main.yml, line 8, column 3, found a duplicate dict key (command). Using last defined value only. PLAY [hetzner3] ************************************************************************ TASK [Gathering Facts] ***************************************************************** ok: [hetzner3] TASK [maltfield.postfix : install postfix] ********************************************* ok: [hetzner3] TASK [maltfield.postfix : generate dh parameters file] ********************************* [WARNING]: Consider using the file module with mode rather than running 'chmod'. If you need to use command because file is insufficient you can add 'warn: false' to this command task or set 'command_warnings=False' in ansible.cfg to get rid of this message. ok: [hetzner3] TASK [maltfield.postfix : Whitelisted Domains] ***************************************** ok: [hetzner3] TASK [maltfield.postfix : Postfox main.cf] ********************************************* ok: [hetzner3] TASK [maltfield.postfix : Postfox master.cf] ******************************************* ok: [hetzner3] TASK [install basic essential packages] ************************************************ ok: [hetzner3] PLAY RECAP ***************************************************************************** hetzner3 : ok=7 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0 user@ose:~/sandbox_local/ansible/hetzner3$
- unfortunately, I'm still not able to email myself
root@mail ~ # echo "this is just a test" | /usr/bin/mail -r noreply@opensourceecology.org -s "test from hetzner3" "michael@michaelaltfield.net" mail: cannot send message: Process exited with a non-zero status root@mail ~ #
- curiously, there's no mail.log, and I don't see any errors from postfix
root@mail /var/log # ls alternatives.log alternatives.log.1 apt audit btmp btmp.1 dpkg.log dpkg.log.1 faillog journal lastlog private README runit unattended-upgrades wtmp root@mail /var/log # root@mail /var/log # journalctl -u postfix -f Sep 15 20:03:31 mail systemd[1]: Starting postfix.service - Postfix Mail Transport Agent... Sep 15 20:03:31 mail systemd[1]: Finished postfix.service - Postfix Mail Transport Agent. Sep 15 20:15:06 mail systemd[1]: postfix.service: Deactivated successfully. Sep 15 20:15:06 mail systemd[1]: Stopped postfix.service - Postfix Mail Transport Agent. Sep 15 20:15:06 mail systemd[1]: Stopping postfix.service - Postfix Mail Transport Agent... Sep 15 20:15:08 mail systemd[1]: Starting postfix.service - Postfix Mail Transport Agent... Sep 15 20:15:08 mail systemd[1]: Finished postfix.service - Postfix Mail Transport Agent.
- ah, if I just don't specify the unit in `journalctl`, I see the error
root@mail /var/log # journalctl -f Sep 15 20:23:13 mail kernel: iptablesIN= OUT=enp0s31f6 SRC=144.76.164.201 DST=185.12.64.2 LEN=61 TOS=0x00 PREC=0x00 TTL=64 ID=16909 DF PROTO=UDP SPT=47889 DPT=53 LEN=41 Sep 15 20:23:13 mail kernel: iptablesIN= OUT=enp0s31f6 SRC=2a01:04f8:0200:40d7:0000:0000:0000:0002 DST=2a01:04ff:ff00:0000:0000:0000:0add:0001 LEN=81 TC=0 HOPLIMIT=64 FLOWLBL=102047 PROTO=UDP SPT=37100 DPT=53 LEN=41 Sep 15 20:23:13 mail kernel: iptablesIN= OUT=enp0s31f6 SRC=144.76.164.201 DST=185.12.64.1 LEN=61 TOS=0x00 PREC=0x00 TTL=64 ID=49769 DF PROTO=UDP SPT=56946 DPT=53 LEN=41 Sep 15 20:23:13 mail kernel: iptablesIN= OUT=enp0s31f6 SRC=2a01:04f8:0200:40d7:0000:0000:0000:0002 DST=2a01:04ff:ff00:0000:0000:0000:0add:0001 LEN=81 TC=0 HOPLIMIT=64 FLOWLBL=838135 PROTO=UDP SPT=56214 DPT=53 LEN=41 Sep 15 20:23:13 mail kernel: iptablesIN= OUT=enp0s31f6 SRC=2a01:04f8:0200:40d7:0000:0000:0000:0002 DST=2a01:04ff:ff00:0000:0000:0000:0add:0001 LEN=81 TC=0 HOPLIMIT=64 FLOWLBL=606878 PROTO=UDP SPT=48253 DPT=53 LEN=41 Sep 15 20:23:43 mail kernel: iptablesIN= OUT=enp0s31f6 SRC=144.76.164.201 DST=185.12.64.2 LEN=62 TOS=0x00 PREC=0x00 TTL=64 ID=56850 DF PROTO=UDP SPT=40619 DPT=53 LEN=42 Sep 15 20:23:43 mail kernel: iptablesIN= OUT=enp0s31f6 SRC=2a01:04f8:0200:40d7:0000:0000:0000:0002 DST=2a01:04ff:ff00:0000:0000:0000:0add:0001 LEN=82 TC=0 HOPLIMIT=64 FLOWLBL=397680 PROTO=UDP SPT=36931 DPT=53 LEN=42 Sep 15 20:23:43 mail kernel: iptablesIN= OUT=enp0s31f6 SRC=144.76.164.201 DST=185.12.64.1 LEN=62 TOS=0x00 PREC=0x00 TTL=64 ID=17522 DF PROTO=UDP SPT=57806 DPT=53 LEN=42 Sep 15 20:23:43 mail kernel: iptablesIN= OUT=enp0s31f6 SRC=144.76.164.201 DST=185.12.64.2 LEN=62 TOS=0x00 PREC=0x00 TTL=64 ID=31763 DF PROTO=UDP SPT=57488 DPT=53 LEN=42 Sep 15 20:23:43 mail kernel: iptablesIN= OUT=enp0s31f6 SRC=2a01:04f8:0200:40d7:0000:0000:0000:0002 DST=2a01:04ff:ff00:0000:0000:0000:0add:0001 LEN=82 TC=0 HOPLIMIT=64 FLOWLBL=178850 PROTO=UDP SPT=53628 DPT=53 LEN=42 Sep 15 20:24:09 mail 
postfix/sendmail[106640]: fatal: parameter inet_interfaces: no local interface found for OBFUSCATED Sep 15 20:24:14 mail kernel: iptablesIN= OUT=enp0s31f6 SRC=144.76.164.201 DST=185.12.64.2 LEN=61 TOS=0x00 PREC=0x00 TTL=64 ID=24249 DF PROTO=UDP SPT=37918 DPT=53 LEN=41 Sep 15 20:24:14 mail kernel: iptablesIN= OUT=enp0s31f6 SRC=2a01:04f8:0200:40d7:0000:0000:0000:0002 DST=2a01:04ff:ff00:0000:0000:0000:0add:0001 LEN=81 TC=0 HOPLIMIT=64 FLOWLBL=477425 PROTO=UDP SPT=47559 DPT=53 LEN=41 Sep 15 20:24:14 mail kernel: iptablesIN= OUT=enp0s31f6 SRC=144.76.164.201 DST=185.12.64.1 LEN=61 TOS=0x00 PREC=0x00 TTL=64 ID=26314 DF PROTO=UDP SPT=42086 DPT=53 LEN=41 Sep 15 20:24:14 mail kernel: iptablesIN= OUT=enp0s31f6 SRC=2a01:04f8:0200:40d7:0000:0000:0000:0002 DST=2a01:04ff:ff00:0000:0000:0000:0add:0001 LEN=81 TC=0 HOPLIMIT=64 FLOWLBL=604691 PROTO=UDP SPT=43822 DPT=53 LEN=41 Sep 15 20:24:14 mail kernel: iptablesIN= OUT=enp0s31f6 SRC=2a01:04f8:0200:40d7:0000:0000:0000:0002 DST=2a01:04ff:ff00:0000:0000:0000:0add:0001 LEN=81 TC=0 HOPLIMIT=64 FLOWLBL=67841 PROTO=UDP SPT=43091 DPT=53 LEN=41 ^C root@mail /var/log #
- Oh, looks like the issue was I had hard-coded an IP address into the config
- I converted three of the postfix ansible role's "files" to "templates" and updated the bind IP to {{ ansible_default_ipv4.address }} and {{ ansible_default_ipv6.address }}
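The template change boils down to one line; a minimal sketch (the real role's file layout and surrounding options differ, and whether loopback addresses are also listed is an implementation detail):

```jinja
# main.cf.j2 (sketch): bind to whatever addresses the target host actually
# reports as ansible facts, instead of a hard-coded IP that breaks when the
# config is pushed to a new server
inet_interfaces = 127.0.0.1, ::1, {{ ansible_default_ipv4.address }}, {{ ansible_default_ipv6.address }}
```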
- I pushed-out the configs with ansible (note: for some reason I still have to run it twice, as the first run errors out)
- and now I don't get an error from this
root@mail ~ # echo "this is just a test" | /usr/bin/mail -r noreply@opensourceecology.org -s "test from hetzner3" "michael@michaelaltfield.net" root@mail ~ #
- unfortunately the mail doesn't come, but I'm thinking it's because I haven't updated the firewall rules to allow the postfix user to send messages
- I updated the firewall rules in 'provision.yml' and re-ran ansible
- ...but I'm having the same issue
- actually, I don't even see that postfix is running
root@mail ~ # ps -ef | grep -iE 'postfix' maltfie+ 103848 1 0 20:05 ? 00:00:00 SCREEN -S postfix maltfie+ 105249 105240 0 20:14 pts/0 00:00:00 screen -dr postfix root 113685 103864 0 20:58 pts/16 00:00:00 grep -iE postfix root@mail ~ #
- ok, this says "active" but "exited"
root@mail ~ # systemctl status postfix ● postfix.service - Postfix Mail Transport Agent Loaded: loaded (/lib/systemd/system/postfix.service; enabled; preset: enabled) Active: active (exited) since Sun 2024-09-15 20:15:08 CEST; 41min ago Docs: man:postfix(1) Process: 105274 ExecStart=/bin/true (code=exited, status=0/SUCCESS) Main PID: 105274 (code=exited, status=0/SUCCESS) CPU: 1ms Sep 15 20:15:08 mail systemd[1]: Starting postfix.service - Postfix Mail Transport Agent... Sep 15 20:15:08 mail systemd[1]: Finished postfix.service - Postfix Mail Transport Agent. root@mail ~ #
- I tried to start it, and some errors appeared in the journal
root@mail ~ # systemctl start postfix root@mail ~ #
- here's the other terminal with the errors
Sep 15 20:56:08 mail systemd[1]: Starting postfix@-.service - Postfix Mail Transport Agent (instance -)... Sep 15 20:56:08 mail postfix[113442]: Postfix is using backwards-compatible default settings Sep 15 20:56:08 mail postfix[113442]: See http://www.postfix.org/COMPATIBILITY_README.html for details Sep 15 20:56:08 mail postfix[113442]: To disable backwards compatibility use "postconf compatibility_level=3.6" and "postfix reload" Sep 15 20:56:08 mail postfix/postfix-script[113672]: starting the Postfix mail system Sep 15 20:56:08 mail postfix/master[113674]: warning: duplicate master.cf entry for service "trace" (private/trace) -- using the last entry Sep 15 20:56:08 mail postfix/master[113674]: daemon started -- version 3.7.11, configuration /etc/postfix Sep 15 20:56:08 mail systemd[1]: Started postfix@-.service - Postfix Mail Transport Agent (instance -). Sep 15 20:56:08 mail audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=unconfined msg='unit=postfix@- comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Sep 15 20:56:08 mail postfix/cleanup[113677]: error: open database /etc/postfix/virtual.db: No such file or directory Sep 15 20:56:08 mail postfix/pickup[113675]: 7CEA9B87FAF: uid=0 from=<noreply@opensourceecology.org> Sep 15 20:56:08 mail postfix/cleanup[113677]: 7CEA9B87FAF: message-id=<20240915185608.7CEA9B87FAF@mail.opensourceecology.org> Sep 15 20:56:08 mail postfix/cleanup[113677]: warning: hash:/etc/postfix/virtual is unavailable. 
open database /etc/postfix/virtual.db: No such file or directory Sep 15 20:56:08 mail postfix/cleanup[113677]: warning: hash:/etc/postfix/virtual lookup error for "michael@michaelaltfield.net" Sep 15 20:56:08 mail postfix/cleanup[113677]: warning: 7CEA9B87FAF: virtual_alias_maps map lookup problem for michael@michaelaltfield.net -- message not accepted, try again later Sep 15 20:56:08 mail postfix/pickup[113675]: 7E07BB87FAF: uid=0 from=<noreply@opensourceecology.org> ...
- so it seems pretty unhappy that we don't actually have our virtual.db defined yet.
- I decided that this is something I don't want managed by ansible, because postfix lookup tables are stateful databases; they should be migrated like the mariadb databases
- we do have a virtual file on hetzner2, but actually it's just all comments
[maltfield@opensourceecology ~]$ wc -l /etc/postfix/virtual 299 /etc/postfix/virtual [maltfield@opensourceecology ~]$ [maltfield@opensourceecology ~]$ head /etc/postfix/virtual # VIRTUAL(5) VIRTUAL(5) # # NAME # virtual - Postfix virtual alias table format # # SYNOPSIS # postmap /etc/postfix/virtual # # postmap -q "string" /etc/postfix/virtual # [maltfield@opensourceecology ~]$ [maltfield@opensourceecology ~]$ tail /etc/postfix/virtual # The Secure Mailer license must be distributed with this # software. # # AUTHOR(S) # Wietse Venema # IBM T.J. Watson Research # P.O. Box 704 # Yorktown Heights, NY 10598, USA # # VIRTUAL(5) [maltfield@opensourceecology ~]$ [maltfield@opensourceecology ~]$ grep -iE "^[^#]" /etc/postfix/virtual [maltfield@opensourceecology ~]$
- anyway, just to keep the comments, I copied this over to the new machine
- note that I'm using `-e 'ssh -p 32415'` to change the default port; otherwise it "just works"
[maltfield@opensourceecology ~]$ rsync -e 'ssh -p 32415' -av --progress /etc/postfix/virtual 144.76.164.201: The authenticity of host '[144.76.164.201]:32415 ([144.76.164.201]:32415)' can't be established. ECDSA key fingerprint is SHA256:hxzCaH018Wn8tEn7sZ22P/WrNMmZ29w5tK/w0KHq8eE. ECDSA key fingerprint is MD5:dd:27:98:46:b8:be:af:e1:b5:7d:54:d6:15:72:38:53. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '[144.76.164.201]:32415' (ECDSA) to the list of known hosts. sending incremental file list virtual 12,696 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=0/1) sent 12,783 bytes received 35 bytes 5,127.20 bytes/sec total size is 12,696 speedup is 0.99 [maltfield@opensourceecology ~]$
- back on hetzner3, I moved the file into place; note I had to change the owner
root@mail /etc/postfix # mv /home/maltfield/virtual /etc/postfix/ root@mail /etc/postfix # root@mail /etc/postfix # postmap /etc/postfix/virtual postmap: fatal: open database /etc/postfix/virtual.db: Permission denied root@mail /etc/postfix # root@mail /etc/postfix # ls -lah /etc/postfix/virtual -rw-r--r-- 1 maltfield maltfield 13K Apr 1 2020 /etc/postfix/virtual root@mail /etc/postfix # root@mail /etc/postfix # chown root:root /etc/postfix/virtual root@mail /etc/postfix # root@mail /etc/postfix # postmap /etc/postfix/virtual root@mail /etc/postfix # root@mail /etc/postfix # ls virtual* virtual virtual.db root@mail /etc/postfix #
- I tried sending mail again; this time I got info messages suggesting it worked. And on my server, I see connections from the hetzner3 server. yay!
root@mail /etc/postfix # echo "this is just a test" | /usr/bin/mail --debug-level=remote -r noreply@opensourceecology.org -s "test from hetzner3" "michael@michaelaltfield.net" root@mail /etc/postfix #
- lol but I guess I've been spamming my personal server, so the hetzner3 logs are depressed that I'm not talking back to it
Sep 15 21:12:57 mail postfix/smtp[113810]: 3E06BB87E2F: host mail.michaelaltfield.net[2a01:4f8:c0c:95be::1] refused to talk to me: 421 4.7.0 mail.michaelaltfield.net Error: too many connections from 2a01:4f8:200:40d7::2
- but after some time, I do get a 250
Sep 15 21:12:57 mail postfix/smtp[113809]: 3B40CB87FAF: to=<michael@michaelaltfield.net>, relay=mail.michaelaltfield.net[2a01:4f8:c0c:95be::1]:25, delay=1691, delays=1689/0.66/1/0.03, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as BC4D8108)
- it looks like my server sent the auto-reply (michael@michaelaltfield.net is just a dumb fake address that auto-replies telling the sender to go to email.michaelaltfield.net and solve a captcha to get my real/current email address), but my server is sending that reply to noreply@opensourceecology.org, which then goes to google's servers, per our MX records
2024-09-15T19:12:44.087871+00:00 mail postfix/smtp[1076271]: DEB7C131: to=<noreply@opensourceecology.org>, relay=aspmx.l.google.com[2a00:1450:400c:c07::1b]:25, delay=0.18, delays=0.02/0/0.13/0.04, dsn=4.3.0, status=deferred (bounce or trace service failure)
- I tried sending to my OSE gmail email address, but I got a bounce. Unfortunately, I can't see the exact error on Google's servers, but this is expected since we never auth'd this new server to send emails on behalf of opensourceecology.org
- it looks like we have Gandi for oswarehouse.org & opensourcewarehouse.org
- our other domains' nameservers are hosted on cloudflare
- I wasn't able to get past the cloudflare captcha in my normal web browser, but I was able to login on Tor Browser. Glad to see cloudflare isn't totally evil.
- Here's our current DNS SPF record
Type = TXT name = opensourceecology.org TTL = 2 min Content = v=spf1 a mx include:_spf.google.com ip4:78.46.3.178 ip4:138.201.84.223 ~all
- And a query for good measure
user@disp3433:~$ user@disp3433:~$ dig -t TXT opensourceecology.org ; <<>> DiG 9.18.28-1~deb12u2-Debian <<>> -t TXT opensourceecology.org ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 18846 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 1232 ;; QUESTION SECTION: ;opensourceecology.org. IN TXT ;; ANSWER SECTION: opensourceecology.org. 120 IN TXT "v=spf1 a mx include:_spf.google.com ip4:78.46.3.178 ip4:138.201.84.223 ~all" ;; Query time: 188 msec ;; SERVER: 10.139.1.1#53(10.139.1.1) (UDP) ;; WHEN: Sun Sep 15 14:39:06 -05 2024 ;; MSG SIZE rcvd: 138 user@disp3433:~$
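A couple of mechanical sanity checks can be run on the record string itself; a minimal sketch (a proper SPF validator is more thorough -- this only checks the version tag and that a given sending IP is explicitly listed):

```shell
#!/bin/sh
# Sketch: minimal sanity checks on the SPF TXT value from above
spf='v=spf1 a mx include:_spf.google.com ip4:78.46.3.178 ip4:138.201.84.223 ~all'

# per RFC 7208, the record must begin with the literal version tag "v=spf1"
case "$spf" in
  'v=spf1 '*) echo "version tag ok" ;;
  *)          echo "ERROR: SPF record must start with 'v=spf1'" >&2 ;;
esac

# confirm the sending server's IPv4 address is explicitly authorized
echo "$spf" | grep -q 'ip4:138\.201\.84\.223' && echo "hetzner2 IPv4 listed"
```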
- curiously, the IPv6 addresses are not listed. And I don't see any entries in cloudflare for openbuildinginstitute.org or other domains; I'll have to investigate that later..
- oh shit, I just realized -- we actually *don't* use our server for receiving emails; we only use it for sending emails (phplist).
- all OSE infrastructure uses GSuite (we're grandfathered-in for free) for "gmail" UI on @opensourceecology.org email addresses
- so, yeah, I updated the postfix config to only listen on localhost; it shouldn't bind to our ipv4 or ipv6 addresses
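A sketch of that change (postfix has a dedicated value for this, so no addresses need to be hard-coded; whether the role uses `loopback-only` or explicit loopback addresses is an implementation detail):

```
# main.cf (sketch): this box only *sends* mail (phplist), so smtpd should
# not be reachable from the outside at all
inet_interfaces = loopback-only
```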
- and I updated the firewall to not allow incoming traffic on port 25
- And I updated the SPF record in CloudFlare
spf1=v a mx include:_spf.google.com ip4:78.46.3.178 ip4:138.201.84.223 ip4:144.76.164.201 ip6:2a01:4f8:200:40d7::2 ~all
- I was expecting it to take a few minutes, but it propagated immediately!
user@disp3433:~$ while true; do date; dig -t TXT opensourceecology.org; sleep 300; echo; done Sun Sep 15 02:53:58 PM -05 2024 ; <<>> DiG 9.18.28-1~deb12u2-Debian <<>> -t TXT opensourceecology.org ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 8306 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 1232 ;; QUESTION SECTION: ;opensourceecology.org. IN TXT ;; ANSWER SECTION: opensourceecology.org. 120 IN TXT "spf1=v a mx include:_spf.google.com ip4:78.46.3.178 ip4:138.201.84.223 ip4:144.76.164.201 ip6:2a01:4f8:200:40d7::2 ~all" ;; Query time: 191 msec ;; SERVER: 10.139.1.1#53(10.139.1.1) (UDP) ;; WHEN: Sun Sep 15 14:53:58 -05 2024 ;; MSG SIZE rcvd: 182
- I tried emailing myself again, but google still bounces it; maybe it'll take some time for them to pick up the new SPF record
- I went to update the RDNS in hetzner WUI
- I was surprised to see that the RDNS entry differs from our hostname
- old server hostname is just 'opensourceecology.org' (I thought it was 'mail.opensourceecology.org')
[root@opensourceecology ~]# hostname opensourceecology.org [root@opensourceecology ~]#
- and RDNS is 'mailer.opensourceecology.org' (I thought it was 'mail.opensourceecology.org')
user@disp3433:~$ dig -x 138.201.84.243 ; <<>> DiG 9.18.28-1~deb12u2-Debian <<>> -x 138.201.84.243 ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49686 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 1232 ;; QUESTION SECTION: ;243.84.201.138.in-addr.arpa. IN PTR ;; ANSWER SECTION: 243.84.201.138.in-addr.arpa. 86389 IN PTR mailer.opensourceecology.org. ;; Query time: 186 msec ;; SERVER: 10.139.1.1#53(10.139.1.1) (UDP) ;; WHEN: Sun Sep 15 14:58:47 -05 2024 ;; MSG SIZE rcvd: 98 user@disp3433:~$
- ah, ok, apparently 'mail.opensourceecology.org' is already google's servers and so I made 'mailer.opensourceecology.org' hetzner2 https://wiki.opensourceecology.org/wiki/Maltfield_Log/2019_Q1#Tue_Mar_19.2C_2019
- I guess if it doesn't really matter, I'm going to name everything 'hetzner3.opensourceecology.org' -- we'll see if it works
- I created two new DNS entries for 'hetzner3.opensourceecology.org'
A - 144.76.164.201 AAAA - 2a01:4f8:200:40d7::2
- that also was immediate; I remember now why we use cloudflare for DNS!
user@disp3433:~$ dig hetzner3.opensourceecology.org ; <<>> DiG 9.18.28-1~deb12u2-Debian <<>> hetzner3.opensourceecology.org ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55968 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 1232 ;; QUESTION SECTION: ;hetzner3.opensourceecology.org. IN A ;; ANSWER SECTION: hetzner3.opensourceecology.org. 120 IN A 144.76.164.201 ;; Query time: 189 msec ;; SERVER: 10.139.1.1#53(10.139.1.1) (UDP) ;; WHEN: Sun Sep 15 15:05:43 -05 2024 ;; MSG SIZE rcvd: 75 user@disp3433:~$ dig -t AAAA hetzner3.opensourceecology.org ; <<>> DiG 9.18.28-1~deb12u2-Debian <<>> -t AAAA hetzner3.opensourceecology.org ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 52394 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 1232 ;; QUESTION SECTION: ;hetzner3.opensourceecology.org. IN AAAA ;; ANSWER SECTION: hetzner3.opensourceecology.org. 120 IN AAAA 2a01:4f8:200:40d7::2 ;; Query time: 189 msec ;; SERVER: 10.139.1.1#53(10.139.1.1) (UDP) ;; WHEN: Sun Sep 15 15:05:48 -05 2024 ;; MSG SIZE rcvd: 87 user@disp3433:~$
- ok, I used the hetzner wui to update RDNS too; it only took a few seconds to update
user@disp3433:~$ dig -x 144.76.164.201 ; <<>> DiG 9.18.28-1~deb12u2-Debian <<>> -x 144.76.164.201 ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17479 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 1232 ;; QUESTION SECTION: ;201.164.76.144.in-addr.arpa. IN PTR ;; ANSWER SECTION: 201.164.76.144.in-addr.arpa. 86400 IN PTR hetzner3.opensourceecology.org. ;; Query time: 723 msec ;; SERVER: 10.139.1.1#53(10.139.1.1) (UDP) ;; WHEN: Sun Sep 15 15:07:35 -05 2024 ;; MSG SIZE rcvd: 100 user@disp3433:~$
- emails still getting bounced by Google Mail servers
- I also setup RDNS on IPv6
user@disp3433:~$ dig -x 2a01:4f8:200:40d7::2 ;; communications error to 10.139.1.1#53: timed out ; <<>> DiG 9.18.28-1~deb12u2-Debian <<>> -x 2a01:4f8:200:40d7::2 ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29353 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 1232 ;; QUESTION SECTION: ;2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.7.d.0.4.0.0.2.0.8.f.4.0.1.0.a.2.ip6.arpa. IN PTR ;; ANSWER SECTION: 2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.7.d.0.4.0.0.2.0.8.f.4.0.1.0.a.2.ip6.arpa. 86397 IN PTR hetzner3.opensourceecology.org. ;; Query time: 185 msec ;; SERVER: 10.139.1.1#53(10.139.1.1) (UDP) ;; WHEN: Sun Sep 15 15:10:38 -05 2024 ;; MSG SIZE rcvd: 145 user@disp3433:~$
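The ip6.arpa owner name that `dig -x` queries can also be derived locally, which is handy for sanity-checking a PTR entry before it propagates; a small sketch (pure computation, no DNS query involved):

```shell
# Derive the ip6.arpa PTR owner name for the server's IPv6 address
python3 - <<'EOF'
import ipaddress
print(ipaddress.ip_address('2a01:4f8:200:40d7::2').reverse_pointer)
EOF
# -> 2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.7.d.0.4.0.0.2.0.8.f.4.0.1.0.a.2.ip6.arpa
```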
- I see that I did, in fact, update the hostname of the postfix config to be 'mailer.opensourceecology.org'
[root@opensourceecology ~]# grep -ir 'mailer.opensourceecology.org' /etc Binary file /etc/aliases.db matches /etc/postfix/main.cf:myhostname = mailer.opensourceecology.org [root@opensourceecology ~]#
- so I updated the hostname of main.cf on hetzner3 to be 'hetzner3.opensourceecology.org'
- I'm going to take lunch and see if it works after..
...
- after lunch I tried emailing michael@michaelaltfield.net again, and I got an error on hetzner3 complaining about my MX records?
Sep 16 00:06:19 mail postfix/smtp[117168]: 1EAA7B8809F: to=<michael@michaelaltfield.net>, relay=none, delay=10, delays=0.04/0.02/10/0, dsn=4.4.3, status=deferred (Host or domain name not found. Name service error for name=michaelaltfield.net type=MX: Host not found, try again)
- my server's DNS is fine
user@disp3433:~$ dig -t MX michaelaltfield.net ; <<>> DiG 9.18.28-1~deb12u2-Debian <<>> -t MX michaelaltfield.net ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 50590 ;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 1232 ;; QUESTION SECTION: ;michaelaltfield.net. IN MX ;; ANSWER SECTION: michaelaltfield.net. 47 IN MX 10 mail.michaelaltfield.net. ;; Query time: 185 msec ;; SERVER: 10.139.1.1#53(10.139.1.1) (UDP) ;; WHEN: Sun Sep 15 17:08:13 -05 2024 ;; MSG SIZE rcvd: 69 user@disp3433:~$ dig mail.michaelaltfield.net. ; <<>> DiG 9.18.28-1~deb12u2-Debian <<>> mail.michaelaltfield.net. ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11898 ;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 1232 ;; QUESTION SECTION: ;mail.michaelaltfield.net. IN A ;; ANSWER SECTION: mail.michaelaltfield.net. 1774 IN A 94.130.74.14 ;; Query time: 197 msec ;; SERVER: 10.139.1.1#53(10.139.1.1) (UDP) ;; WHEN: Sun Sep 15 17:08:25 -05 2024 ;; MSG SIZE rcvd: 69 user@disp3433:~$
- oh shit, actually, looks like the server can't find it tho
root@mail /etc/postfix # sudo -u postfix dig -t MX michaelaltfield.net ; <<>> DiG 9.18.28-1~deb12u2-Debian <<>> -t MX michaelaltfield.net ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 20574 ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 1232 ;; QUESTION SECTION: ;michaelaltfield.net. IN MX ;; Query time: 4 msec ;; SERVER: ::1#53(::1) (UDP) ;; WHEN: Mon Sep 16 00:09:37 CEST 2024 ;; MSG SIZE rcvd: 48 root@mail /etc/postfix #
- ah, the above output says it's querying port 53 on localhost (ipv6). yeah, looks like ansible setup our DoT solution (stubby + unbound), but I guess it's broken. It's always DNS ;P
root@mail /etc/postfix # dpkg -l | grep -i stubby ii stubby 1.6.0-3+b1 amd64 modern asynchronous DNS API (stub resolver) root@mail /etc/postfix # dpkg -l | grep -i unbound ii libunbound8:amd64 1.17.1-2+deb12u2 amd64 library implementing DNS resolution and validation ii unbound 1.17.1-2+deb12u2 amd64 validating, recursive, caching DNS resolver root@mail /etc/postfix #
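For context, the intended chain is postfix → unbound (listening on localhost:53) → stubby → upstream DoT servers. A sketch of the unbound side of that glue (the port stubby listens on is an assumption here; our ansible role may use a different one):

```
# /etc/unbound/unbound.conf.d/forward.conf (sketch)
server:
    # unbound refuses to query localhost upstreams unless told otherwise
    do-not-query-localhost: no
forward-zone:
    # forward everything to stubby, which does the actual DNS-over-TLS
    name: "."
    forward-addr: 127.0.0.1@8053
```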
- unfortunately, re-running ansible to configure the 'maltfield.dns' role gets stuck
- SE says I should figure out what process it's waiting-on and execute it myself; probably it's apt waiting for user input https://serverfault.com/questions/1020302/ansible-apt-module-hangs-process-sleeping
- I think mitogen (a performance "strategy" we use for optimizing ansible speed) is obfuscating what we're running
root@mail /etc/postfix # ps -ef | grep -i python root 92168 1 0 Sep15 ? 00:00:16 /usr/bin/python3 /usr/share/unattended-upgrades/unattended-upgrade-shutdown --wait-for-signal wazuh 97237 1 0 Sep15 ? 00:00:34 /var/ossec/framework/python/bin/python3 /var/ossec/api/scripts/wazuh_apid.py wazuh 97338 97237 0 Sep15 ? 00:00:00 /var/ossec/framework/python/bin/python3 /var/ossec/api/scripts/wazuh_apid.py wazuh 97341 97237 0 Sep15 ? 00:00:00 /var/ossec/framework/python/bin/python3 /var/ossec/api/scripts/wazuh_apid.py wazuh 97344 97237 0 Sep15 ? 00:00:00 /var/ossec/framework/python/bin/python3 /var/ossec/api/scripts/wazuh_apid.py maltfie+ 117202 117201 0 00:12 ? 00:00:00 /usr/bin/python3(mitogen:user@ose:63374) maltfie+ 117205 117202 0 00:12 ? 00:00:00 /usr/bin/python3(mitogen:user@ose:63374) root 117207 117202 0 00:12 pts/18 00:00:00 sudo -u root -H -- /usr/bin/python3 -c import codecs,os,sys;_=codecs.decode;exec(_(_("eNqFkcFPwyAUxs/jr+AGZGSDGtPY2ESzg/FgTBrjDtti6EqVSIHQbnX+9bJWXTsP3vjxPt73PV5Gl6mtZ045iQnwtB2QKmGA0vp3TBJwPBc7F2FGOWPkxBkdkg9V3vNW21ribAh+CMshtAQAAINlfQgBtGiCbwXTFKJC+FYZBIUpuqL8kNtdI3Itu/J8V/t5rszcHZo32+vWYDKByjT4p9HMSy1FsCGrJNoQeA35VRjpvN807W720tfKmlVysekSSrNXPjC6ze6eGdqk42e9JqDG4wId4xThSjX2VZqkEropldTFTSWUTjiPIxYRREDo1HrVSMwperh/emSMrQ0KCba2COkJWKQv+LiVwjppwi6QzxEJ44kC85jFl4SiT+VCp9KlJ92SojZHx0WV7ttg0Z37zz9Tt/+p/6bk45S/a43IF77vv0g=".encode(),"base64"),"zip")) root 117208 117207 0 00:12 pts/19 00:00:00 sudo -u root -H -- /usr/bin/python3 -c import codecs,os,sys;_=codecs.decode;exec(_(_("eNqFkcFPwyAUxs/jr+AGZGSDGtPY2ESzg/FgTBrjDtti6EqVSIHQbnX+9bJWXTsP3vjxPt73PV5Gl6mtZ045iQnwtB2QKmGA0vp3TBJwPBc7F2FGOWPkxBkdkg9V3vNW21ribAh+CMshtAQAAINlfQgBtGiCbwXTFKJC+FYZBIUpuqL8kNtdI3Itu/J8V/t5rszcHZo32+vWYDKByjT4p9HMSy1FsCGrJNoQeA35VRjpvN807W720tfKmlVysekSSrNXPjC6ze6eGdqk42e9JqDG4wId4xThSjX2VZqkEropldTFTSWUTjiPIxYRREDo1HrVSMwperh/emSMrQ0KCba2COkJWKQv+LiVwjppwi6QzxEJ44kC85jFl4SiT+VCp9KlJ92SojZHx0WV7ttg0Z37zz9Tt/+p/6bk45S/a43IF77vv0g=".encode(),"base64"),"zip")) root 117209 117208 
0 00:12 pts/19 00:00:00 /usr/bin/python3(mitogen:maltfield@mail:117202) root 117212 117209 0 00:12 pts/19 00:00:00 /usr/bin/python3(mitogen:maltfield@mail:117202) root 117307 103864 0 00:17 pts/16 00:00:00 grep -i python root@mail /etc/postfix #
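That `exec(_(_("eNq…")))` wrapper in the ps output is just zlib-compressed, base64-encoded python source, so the same two-step decode recovers whatever mitogen is actually running. A round-trip demo (with a stand-in payload, not the real blob from above):

```shell
python3 - <<'EOF'
import base64, codecs, zlib

# stand-in for the "eNq..." blob seen in the ps output above
blob = base64.b64encode(zlib.compress(b"print('bootstrap source here')"))

# mitogen's own decode recipe: _ = codecs.decode; _(_(blob, "base64"), "zip")
print(codecs.decode(codecs.decode(blob, "base64"), "zip").decode())
EOF
```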
- ok, I can override the mitogen strategy with the $ANSIBLE_STRATEGY env var
ANSIBLE_STRATEGY=linear ansible-playbook -vvv provision.yml
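The same override can be made persistent by commenting out the mitogen lines in ansible.cfg; a sketch (the plugin path is an assumption -- it depends on where mitogen is installed):

```ini
; ansible.cfg (sketch): comment these out to fall back to the stock
; linear strategy with full -vvv visibility
[defaults]
;strategy_plugins = /usr/lib/python3/dist-packages/ansible_mitogen/plugins/strategy
;strategy = mitogen_linear
```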
- it gets stuck here
... TASK [maltfield.dns : install packages for encrypted DNS] ****************************** task path: /home/user/sandbox_local/ansible/hetzner3/roles/maltfield.dns/tasks/main.yml:1 <144.76.164.201> ESTABLISH SSH CONNECTION FOR USER: maltfield <144.76.164.201> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o Port=32415 -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="maltfield"' -o ConnectTimeout=10 -o ControlPath=/home/user/.ansible/cp/5244de437f 144.76.164.201 '/bin/sh -c '"'"'echo ~maltfield && sleep 0'"'"'' <144.76.164.201> (0, b'/home/maltfield\n', b'') <144.76.164.201> ESTABLISH SSH CONNECTION FOR USER: maltfield <144.76.164.201> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o Port=32415 -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="maltfield"' -o ConnectTimeout=10 -o ControlPath=/home/user/.ansible/cp/5244de437f 144.76.164.201 '/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo /home/maltfield/.ansible/tmp `"&& mkdir "` echo /home/maltfield/.ansible/tmp/ansible-tmp-1726438986.9852722-63704-187090998946541 `" && echo ansible-tmp-1726438986.9852722-63704-187090998946541="` echo /home/maltfield/.ansible/tmp/ansible-tmp-1726438986.9852722-63704-187090998946541 `" ) && sleep 0'"'"'' <144.76.164.201> (0, b'ansible-tmp-1726438986.9852722-63704-187090998946541=/home/maltfield/.ansible/tmp/ansible-tmp-1726438986.9852722-63704-187090998946541\n', b'') Using module file /usr/lib/python3/dist-packages/ansible/modules/apt.py <144.76.164.201> PUT /home/user/.ansible/tmp/ansible-local-63689muzkrcup/tmph3i2h59r TO /home/maltfield/.ansible/tmp/ansible-tmp-1726438986.9852722-63704-187090998946541/AnsiballZ_apt.py <144.76.164.201> SSH: EXEC sftp -b - -C -o ControlMaster=auto -o ControlPersist=60s -o Port=32415 -o KbdInteractiveAuthentication=no 
-o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="maltfield"' -o ConnectTimeout=10 -o ControlPath=/home/user/.ansible/cp/5244de437f '[144.76.164.201]' <144.76.164.201> (0, b'sftp> put /home/user/.ansible/tmp/ansible-local-63689muzkrcup/tmph3i2h59r /home/maltfield/.ansible/tmp/ansible-tmp-1726438986.9852722-63704-187090998946541/AnsiballZ_apt.py\n', b'') <144.76.164.201> ESTABLISH SSH CONNECTION FOR USER: maltfield <144.76.164.201> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o Port=32415 -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="maltfield"' -o ConnectTimeout=10 -o ControlPath=/home/user/.ansible/cp/5244de437f 144.76.164.201 '/bin/sh -c '"'"'chmod u+x /home/maltfield/.ansible/tmp/ansible-tmp-1726438986.9852722-63704-187090998946541/ /home/maltfield/.ansible/tmp/ansible-tmp-1726438986.9852722-63704-187090998946541/AnsiballZ_apt.py && sleep 0'"'"'' <144.76.164.201> (0, b'', b'') <144.76.164.201> ESTABLISH SSH CONNECTION FOR USER: maltfield <144.76.164.201> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o Port=32415 -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="maltfield"' -o ConnectTimeout=10 -o ControlPath=/home/user/.ansible/cp/5244de437f -tt 144.76.164.201 '/bin/sh -c '"'"'sudo -H -S -n -u root /bin/sh -c '"'"'"'"'"'"'"'"'echo BECOME-SUCCESS-kvoohsjnseqwnzgqldvyzoiisloygcac ; /usr/bin/python3 /home/maltfield/.ansible/tmp/ansible-tmp-1726438986.9852722-63704-187090998946541/AnsiballZ_apt.py'"'"'"'"'"'"'"'"' && sleep 0'"'"'' Escalation succeeded
- and then I checked the server
root@mail /etc/postfix # ps -ef | grep -i python root 92168 1 0 Sep15 ? 00:00:16 /usr/bin/python3 /usr/share/unattended-upgrades/unattended-upgrade-shutdown --wait-for-signal wazuh 97237 1 0 Sep15 ? 00:00:35 /var/ossec/framework/python/bin/python3 /var/ossec/api/scripts/wazuh_apid.py wazuh 97338 97237 0 Sep15 ? 00:00:00 /var/ossec/framework/python/bin/python3 /var/ossec/api/scripts/wazuh_apid.py wazuh 97341 97237 0 Sep15 ? 00:00:00 /var/ossec/framework/python/bin/python3 /var/ossec/api/scripts/wazuh_apid.py wazuh 97344 97237 0 Sep15 ? 00:00:00 /var/ossec/framework/python/bin/python3 /var/ossec/api/scripts/wazuh_apid.py maltfie+ 117448 117343 0 00:23 pts/18 00:00:00 /bin/sh -c sudo -H -S -n -u root /bin/sh -c 'echo BECOME-SUCCESS-kvoohsjnseqwnzgqldvyzoiisloygcac ; /usr/bin/python3 /home/maltfield/.ansible/tmp/ansible-tmp-1726438986.9852722-63704-187090998946541/AnsiballZ_apt.py' && sleep 0 root 117449 117448 0 00:23 pts/18 00:00:00 sudo -H -S -n -u root /bin/sh -c echo BECOME-SUCCESS-kvoohsjnseqwnzgq
Sat Sep 14, 2024
- I continued looking through the ansible roles to see if there was anything remaining that needed to be removed before publishing them publicly on GitHub
- I found EMAIL_LIST in backupReport.sh
- I decided to simply delete this and move it to backup.settings, a file with loads of passwords which I'll provision manually
- similarly, I found a $recipients list of email addresses in 'roles/maltfield.wazuh/templates/sent_encrypted_alarm.sh.j2'
- similarly, I decided to source this from a new file that I'll call '/var/ossec/sent_encrypted_alarm.settings'
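The pattern is the same in both cases; a minimal sketch (the variable name matches backupReport.sh, but the demo file contents and addresses are hypothetical, and the demo writes a temp file where production would use the hand-provisioned one):

```shell
#!/bin/sh
# Sketch: keep recipient lists and credentials out of the public ansible
# repo by sourcing them from an unmanaged settings file at runtime
SETTINGS=$(mktemp)
printf 'EMAIL_LIST="ops1@example.com ops2@example.com"\n' > "$SETTINGS"

. "$SETTINGS"
[ -n "$EMAIL_LIST" ] || { echo "EMAIL_LIST missing from settings" >&2; exit 1; }
echo "would send report to: $EMAIL_LIST"
rm -f "$SETTINGS"
```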
...
- I decided to update the openbuildinginstitute nginx files to listen on the same address as opensourceecology
- we originally decided to have two distinct IP addresses for the best backwards compatibility of HTTPS webservers, but now (some years later):
- client support for SNI should be very widespread https://en.wikipedia.org/wiki/Server_Name_Indication
- IPv4 address exhaustion has led to increased fees for >1 IP address, especially at hetzner (our hosting provider)
...
- I provided an answer to my SE question describing how I set up purge keys for our varnish config files in ansible https://stackoverflow.com/questions/78980038/how-to-replace-jinja2-variable-only-when-first-provisioning-file/78986073#78986073
- I spent some time updating OSE_Server
...
- I did two more passes through all of the content that I've setup in ansible in the past ~month to make sure there's no passwords or anything that we can't share publicly in it
- I went through the docs listing all of the services, and made sure we're addressing everything that we can with ansible https://wiki.opensourceecology.org/wiki/OSE_Server#Provisioning
- here are some of the commands that I used
grep -Eir '.htpasswd' * | less
# any long strings; look for passwords
grep -iEr '[A-Za-z0-9]{10}+' * | less
# check that all template files end in a '.j2' filename
find | grep -Ei '.*(templates|files)/.*conf$'
# check that all template files have {{ ansible managed }} at the top
files=$(find -ipath *templates* -type f -or -ipath *file* -type f)
for f in $files; do grep -L ansible $f; done
- after my third pass, I copied all of the files into my local sandbox of our public OSE 'ansible' repo and committed it https://github.com/OpenSourceEcology/ansible
- I'm going to do some testing before I push to github
- I gave the 'provision' ansible playbook a run against hetzner3, for now only including the following roles:
- dev-sec.ssh-hardening
- mikegleasonjr.firewall
- maltfield.wazuh
- maltfield.unattended-upgrades
- unfortunately, it failed on the wazuh install
user@ose:~/sandbox_local/ansible/hetzner3$ ansible-playbook provision.yml PLAY [hetzner3] ************************************************************************ TASK [Gathering Facts] ***************************************************************** ok: [hetzner3] TASK [dev-sec.ssh-hardening : include_tasks] ******************************************* included: /home/user/sandbox_local/ansible/hetzner3/roles/dev-sec.ssh-hardening/tasks/hardening.yml for hetzner3 TASK [dev-sec.ssh-hardening : Set OS dependent variables] ****************************** ok: [hetzner3] => (item=/home/user/sandbox_local/ansible/hetzner3/roles/dev-sec.ssh-hardening/vars/Debian.yml) TASK [dev-sec.ssh-hardening : get openssh-version] ************************************* ok: [hetzner3] TASK [dev-sec.ssh-hardening : include tasks to create crypo-vars] ********************** included: /home/user/sandbox_local/ansible/hetzner3/roles/dev-sec.ssh-hardening/tasks/crypto.yml for hetzner3 TASK [dev-sec.ssh-hardening : set hostkeys according to openssh-version] *************** ok: [hetzner3] TASK [dev-sec.ssh-hardening : set hostkeys according to openssh-version] *************** skipping: [hetzner3] TASK [dev-sec.ssh-hardening : set hostkeys according to openssh-version] *************** skipping: [hetzner3] TASK [dev-sec.ssh-hardening : set macs according to openssh-version if openssh >= 7.6] *** ok: [hetzner3] TASK [dev-sec.ssh-hardening : set macs according to openssh-version if openssh >= 6.6] *** skipping: [hetzner3] TASK [dev-sec.ssh-hardening : set macs according to openssh-version] ******************* skipping: [hetzner3] TASK [dev-sec.ssh-hardening : set macs according to openssh-version] ******************* skipping: [hetzner3] TASK [dev-sec.ssh-hardening : set ciphers according to openssh-version if openssh >= 6.6] *** skipping: [hetzner3] TASK [dev-sec.ssh-hardening : set ciphers according to openssh-version] **************** skipping: [hetzner3] TASK [dev-sec.ssh-hardening : set kex 
according to openssh-version if openssh >= 6.6] *** ok: [hetzner3] TASK [dev-sec.ssh-hardening : set kex according to openssh-version] ******************** skipping: [hetzner3] TASK [dev-sec.ssh-hardening : create revoked_keys and set permissions to root/600] ***** changed: [hetzner3] TASK [dev-sec.ssh-hardening : create sshd_config and set permissions to root/600] ****** changed: [hetzner3] TASK [dev-sec.ssh-hardening : create ssh_config and set permissions to root/644] ******* changed: [hetzner3] TASK [dev-sec.ssh-hardening : Check if /etc/ssh/moduli contains weak DH parameters] **** ok: [hetzner3] TASK [dev-sec.ssh-hardening : remove all small primes] ********************************* changed: [hetzner3] TASK [dev-sec.ssh-hardening : include tasks to setup ca keys and principals] *********** skipping: [hetzner3] TASK [dev-sec.ssh-hardening : include tasks to setup 2FA] ****************************** skipping: [hetzner3] TASK [dev-sec.ssh-hardening : include selinux specific tasks] ************************** skipping: [hetzner3] TASK [mikegleasonjr.firewall : include_tasks] ****************************************** included: /home/user/sandbox_local/ansible/hetzner3/roles/mikegleasonjr.firewall/tasks/rules.yml for hetzner3 TASK [mikegleasonjr.firewall : Generate v4 rules] ************************************** changed: [hetzner3] TASK [mikegleasonjr.firewall : Load v4 rules] ****************************************** changed: [hetzner3] TASK [mikegleasonjr.firewall : Generate v6 rules] ************************************** changed: [hetzner3] TASK [mikegleasonjr.firewall : Load v6 rules] ****************************************** changed: [hetzner3] TASK [mikegleasonjr.firewall : include_tasks] ****************************************** included: /home/user/sandbox_local/ansible/hetzner3/roles/mikegleasonjr.firewall/tasks/persist-debian.yml for hetzner3 TASK [mikegleasonjr.firewall : Remove any obsolete scripts used by an old version of the role] *** ok: 
[hetzner3] => (item=/etc/network/if-post-down.d/iptables-v4) ok: [hetzner3] => (item=/etc/network/if-pre-up.d/iptables-v4) ok: [hetzner3] => (item=/etc/iptables.v4.saved) TASK [mikegleasonjr.firewall : Install iptables-persistent] **************************** ok: [hetzner3] TASK [mikegleasonjr.firewall : Install ipset-persistent] ******************************* changed: [hetzner3] TASK [mikegleasonjr.firewall : Check if netfilter-persistent is present] *************** ok: [hetzner3] TASK [mikegleasonjr.firewall : Save rules (netfilter-persistent)] ********************** changed: [hetzner3] TASK [mikegleasonjr.firewall : Save rules (iptables-persistent)] *********************** skipping: [hetzner3] TASK [mikegleasonjr.firewall : include_tasks] ****************************************** skipping: [hetzner3] TASK [maltfield.wazuh : install wazuh prereqs] ***************************************** changed: [hetzner3] TASK [maltfield.wazuh : wazuh gpg key] ************************************************* changed: [hetzner3] TASK [maltfield.wazuh : wazuh repo] **************************************************** changed: [hetzner3] TASK [maltfield.wazuh : install wazuh manager] ***************************************** changed: [hetzner3] TASK [maltfield.wazuh : ossec.conf] **************************************************** fatal: [hetzner3]: FAILED! => {"changed": false, "checksum": "9e715fb840b9f88e6af9ffbe523613e34975d123", "gid": 109, "group": "wazuh", "mode": "0660", "msg": "chgrp failed: failed to look up group ossec", "owner": "root", "path": "/var/ossec/etc/ossec.conf", "size": 9856, "state": "file", "uid": 0} RUNNING HANDLER [dev-sec.ssh-hardening : restart sshd] ********************************* PLAY RECAP ***************************************************************************** hetzner3 : ok=28 changed=14 unreachable=0 failed=1 skipped=13 rescued=0 ignored=0 user@ose:~/sandbox_local/ansible/hetzner3$
- ugh, unfortunately I failed to do a dry run and then take a backup of the files it was going to change before the run.
- For many files ansible should have already created a backup, such as the sshd_config file that it changed; but this won't be the case for all files
root@mail ~ # ls -lah /etc/ssh total 600K drwxr-xr-x 4 root root 4.0K Sep 15 04:35 . drwxr-xr-x 72 root root 4.0K Sep 15 04:35 .. -rw-r--r-- 1 root root 529K Sep 15 04:35 moduli -rw------- 1 root root 24 Sep 15 04:34 revoked_keys -rw-r--r-- 1 root root 3.2K Sep 15 04:34 ssh_config drwxr-xr-x 2 root root 4.0K Dec 19 2023 ssh_config.d -rw------- 1 root root 4.9K Sep 15 04:34 sshd_config drwxr-xr-x 2 root root 4.0K Dec 19 2023 sshd_config.d -rw-r--r-- 1 root root 3.2K Aug 1 00:00 sshd_config.orig.20240801_000012 -rw-r--r-- 1 root root 3.2K Aug 1 00:07 sshd_config.ucf-dist -rw------- 1 root root 525 Jul 31 23:44 ssh_host_ecdsa_key -rw-r--r-- 1 root root 193 Jul 31 23:44 ssh_host_ecdsa_key.pub -rw------- 1 root root 432 Jul 31 23:44 ssh_host_ed25519_key -rw-r--r-- 1 root root 113 Jul 31 23:44 ssh_host_ed25519_key.pub -rw------- 1 root root 2.6K Jul 31 23:44 ssh_host_rsa_key -rw-r--r-- 1 root root 585 Jul 31 23:44 ssh_host_rsa_key.pub -rw-r--r-- 1 root root 342 Feb 28 2020 ssh_import_id root@mail ~ #
- oh, wait, no, that backup was one I created last month :/
- better late than never; I created a backup of /etc/ now
root@mail ~ # tar -czvf /etc/etc.20240914.tar.gz /etc/* tar: Removing leading `/' from member names /etc/acpi/ /etc/acpi/events/ ... -rw-r--r-- 1 root root 681 Jan 17 2023 xattr.conf drwxr-xr-x 3 root root 4.0K Jun 23 06:18 xdg root@mail ~ # root@mail ~ # du -sh /etc/etc.20240914.tar.gz 424K /etc/etc.20240914.tar.gz root@mail ~ #
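Note that writing the archive into the directory being archived (as above, `/etc/etc.20240914.tar.gz` inside `/etc/`) makes tar try to include the archive in itself; an `--exclude` (or writing the archive elsewhere, e.g. `/root/`) avoids that. A minimal sketch on a scratch directory standing in for `/etc`:

```shell
# Scratch directory standing in for /etc on the real host
d=$(mktemp -d)
echo 'hello' > "$d/file.conf"

# GNU tar exclusion patterns are unanchored by default, so the bare
# filename matches ./backup.tar.gz inside the tree being archived
tar --exclude='backup.tar.gz' -czf "$d/backup.tar.gz" -C "$d" .

# Listing shows ./file.conf but not the archive itself
tar -tzf "$d/backup.tar.gz"
```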
- anyway, the error above was on the install of ossec/wazuh. It looks like it installed ok, but the group 'ossec' was missing (the ossec.conf task failed with "chgrp failed: failed to look up group ossec")
- note that on hetzner2 we had wazuh 3.x, and now we're installing wazuh 4.x. It's possible the usernames/group names changed
root@mail ~ # dpkg -l | grep -i wazuh ii wazuh-manager 4.9.0-1 amd64 Wazuh helps you to gain security visibility into your infrastructure by monitoring hosts at an operating system and application level. It provides the following capabilities: log analysis, file integrity monitoring, intrusions detection and policy and compliance monitoring root@mail ~ #
- it looks like the username & group should be 'wazuh'
root@mail ~ # grep -iE 'wazuh|ossec' /etc/passwd wazuh:x:102:109::/var/ossec:/sbin/nologin root@mail ~ # grep -iE 'wazuh|ossec' /etc/group wazuh:x:109: root@mail ~ #
- but they didn't change the name of the directory :shrug:
root@mail ~ # ls -lah /var/ossec total 80K drwxr-x--- 20 root wazuh 4.0K Sep 15 04:36 . drwxr-xr-x 12 root root 4.0K Sep 15 04:35 .. drwxr-x--- 3 root wazuh 4.0K Sep 15 04:35 active-response drwxr-x--- 2 root wazuh 4.0K Sep 15 04:35 agentless drwxr-x--- 4 root wazuh 4.0K Sep 15 04:35 api drwxr-x--- 5 root wazuh 4.0K Sep 15 04:35 backup drwxr-x--- 2 root wazuh 4.0K Sep 15 04:35 bin drwxrwx--- 7 wazuh wazuh 4.0K Sep 15 04:36 etc drwxr-x--- 5 root wazuh 4.0K Sep 15 04:35 framework drwxr-x--- 2 root wazuh 4.0K Sep 15 04:35 integrations drwxr-x--- 2 root wazuh 4.0K Sep 15 04:35 lib drwxrwx--- 8 wazuh wazuh 4.0K Sep 15 04:35 logs drwxr-x--- 18 root wazuh 4.0K Sep 15 04:35 queue drwxr-x--- 5 root wazuh 4.0K Sep 15 04:35 ruleset drwxrwx--- 2 root wazuh 4.0K Aug 30 12:07 .ssh drwxr-x--- 2 wazuh wazuh 4.0K Aug 30 12:07 stats dr--r----- 2 root wazuh 4.0K Sep 15 04:35 templates drwxrwx--T 2 root wazuh 4.0K Sep 15 04:35 tmp drwxr-x--- 9 root wazuh 4.0K Sep 15 04:35 var drwxr-x--- 6 root wazuh 4.0K Sep 15 04:35 wodles root@mail ~ #
- I'm also going to go ahead and make a backup of /var/ossec now
root@mail ~ # tar -cjvf /var/ossec/ossec.20240914.tar.bz2 /var/ossec/* ... /var/ossec/wodles/docker/DockerListener.py root@mail ~ # root@mail ~ # du -sh /var/ossec/ossec.20240914.tar.bz2 354M /var/ossec/ossec.20240914.tar.bz2 root@mail ~ #
- I tried to do the ansible dry run (`--check`), but it gave me some vague python pickle error from mitogen (?)
user@ose:~/sandbox_local/ansible/hetzner3$ ansible-playbook --check provision.yml PLAY [hetzner3] ************************************************************************ .. TASK [mikegleasonjr.firewall : include_tasks] ****************************************** skipping: [hetzner3] TASK [maltfield.wazuh : install wazuh prereqs] ***************************************** ok: [hetzner3] TASK [maltfield.wazuh : wazuh gpg key] ************************************************* ok: [hetzner3] TASK [maltfield.wazuh : wazuh repo] **************************************************** ok: [hetzner3] TASK [maltfield.wazuh : install wazuh manager] ***************************************** ok: [hetzner3] TASK [maltfield.wazuh : ossec.conf] **************************************************** ok: [hetzner3] TASK [maltfield.wazuh : local_rules.xml] *********************************************** changed: [hetzner3] TASK [maltfield.wazuh : email encryption .forward file] ******************************** changed: [hetzner3] TASK [maltfield.wazuh : email encryption script] *************************************** changed: [hetzner3] TASK [maltfield.unattended-upgrades : install unattended-upgrades] ********************* changed: [hetzner3] TASK [maltfield.unattended-upgrades : 20auto-upgrades] ********************************* changed: [hetzner3] TASK [maltfield.unattended-upgrades : 50unattended-upgrades] *************************** changed: [hetzner3] TASK [install basic essential packages] ************************************************ changed: [hetzner3] RUNNING HANDLER [maltfield.wazuh : restart wazuh-manager] ****************************** ERROR! 
[mux 51930] 22:05:31.908054 E mitogen.[ssh.144.76.164.201:32415.sudo.root]: raw pickle was: b'\x80\x02(X&\x00\x00\x00ose-52117-75857062c740-1df609da65q\x00X\x16\x00\x00\x00ansible_mitogen.targetq\x01NX\n\x00\x00\x00run_moduleq\x02)cmitogen.core\nKwargs\nq\x03}q\x04X\x06\x00\x00\x00kwargsq\x05}q\x06(X\x0b\x00\x00\x00runner_nameq\x07X\x0e\x00\x00\x00NewStyleRunnerq\x08X\x06\x00\x00\x00moduleq\tcansible.utils.unsafe_proxy\nAnsibleUnsafeText\nq\nX\x16\x00\x00\x00ansible.legacy.systemdq\x0b\x85q\x0c\x81q\rX\x04\x00\x00\x00pathq\x0eX9\x00\x00\x00/usr/lib/python3/dist-packages/ansible/modules/systemd.pyq\x0fX\t\x00\x00\x00json_argsq\x10XO\x02\x00\x00{"name": "wazuh-manager", "state": "restarted", "_ansible_check_mode": true, "_ansible_no_log": false, "_ansible_debug": false, "_ansible_diff": false, "_ansible_verbosity": 0, "_ansible_version": "2.10.17", "_ansible_module_name": "ansible.legacy.systemd", "_ansible_syslog_facility": "LOG_USER", "_ansible_selinux_special_fs": ["fuse", "nfs", "vboxsf", "ramfs", "9p", "vfat"], "_ansible_string_conversion_action": "warn", "_ansible_socket": null, "_ansible_shell_executable": "/bin/sh", "_ansible_keep_remote_files": false, "_ansible_tmpdir": null, "_ansible_remote_tmp": "~/.ansible/tmp"}q\x11X\x03\x00\x00\x00envq\x12}q\x13X\x14\x00\x00\x00interpreter_fragmentq\x14NX\t\x00\x00\x00is_pythonq\x15NX\n\x00\x00\x00module_mapq\x16}q\x17(X\x07\x00\x00\x00builtinq\x18]q\x19(X\x1a\x00\x00\x00ansible.module_utils._textq\x1aX\x1a\x00\x00\x00ansible.module_utils.basicq\x1bX\x1b\x00\x00\x00ansible.module_utils.commonq\x1cX/\x00\x00\x00ansible.module_utils.common._collections_compatq\x1dX(\x00\x00\x00ansible.module_utils.common._json_compatq\x1eX"\x00\x00\x00ansible.module_utils.common._utilsq\x1fX\'\x00\x00\x00ansible.module_utils.common.collectionsq X 
\x00\x00\x00ansible.module_utils.common.fileq!X&\x00\x00\x00ansible.module_utils.common.parametersq"X#\x00\x00\x00ansible.module_utils.common.processq#X$\x00\x00\x00ansible.module_utils.common.sys_infoq$X \x00\x00\x00ansible.module_utils.common.textq%X+\x00\x00\x00ansible.module_utils.common.text.convertersq&X+\x00\x00\x00ansible.module_utils.common.text.formattersq\'X&\x00\x00\x00ansible.module_utils.common.validationq(X$\x00\x00\x00ansible.module_utils.common.warningsq)X\x1b\x00\x00\x00ansible.module_utils.compatq*X\'\x00\x00\x00ansible.module_utils.compat._selectors2q+X%\x00\x00\x00ansible.module_utils.compat.selectorsq,X\x1b\x00\x00\x00ansible.module_utils.distroq-X#\x00\x00\x00ansible.module_utils.distro._distroq.X\x1a\x00\x00\x00ansible.module_utils.factsq/X,\x00\x00\x00ansible.module_utils.facts.ansible_collectorq0X$\x00\x00\x00ansible.module_utils.facts.collectorq1X!\x00\x00\x00ansible.module_utils.facts.compatq2X-\x00\x00\x00ansible.module_utils.facts.default_collectorsq3X#\x00\x00\x00ansible.module_utils.facts.hardwareq4X\'\x00\x00\x00ansible.module_utils.facts.hardware.aixq5X(\x00\x00\x00ansible.module_utils.facts.hardware.baseq6X*\x00\x00\x00ansible.module_utils.facts.hardware.darwinq7X-\x00\x00\x00ansible.module_utils.facts.hardware.dragonflyq8X+\x00\x00\x00ansible.module_utils.facts.hardware.freebsdq9X(\x00\x00\x00ansible.module_utils.facts.hardware.hpuxq:X(\x00\x00\x00ansible.module_utils.facts.hardware.hurdq;X)\x00\x00\x00ansible.module_utils.facts.hardware.linuxq<X*\x00\x00\x00ansible.module_utils.facts.hardware.netbsdq=X+\x00\x00\x00ansible.module_utils.facts.hardware.openbsdq>X)\x00\x00\x00ansible.module_utils.facts.hardware.sunosq?X$\x00\x00\x00ansible.module_utils.facts.namespaceq@X"\x00\x00\x00ansible.module_utils.facts.networkqAX&\x00\x00\x00ansible.module_utils.facts.network.aixqBX\'\x00\x00\x00ansible.module_utils.facts.network.baseqCX)\x00\x00\x00ansible.module_utils.facts.network.darwinqDX,\x00\x00\x00ansible.module_utils.facts.network.dra
gonflyqEX)\x00\x00\x00ansible.module_utils.facts.network.fc_wwnqFX*\x00\x00\x00ansible.module_utils.facts.network.freebsdqGX.\x00\x00\x00ansible.module_utils.facts.network.generic_bsdqHX\'\x00\x00\x00ansible.module_utils.facts.network.hpuxqIX\'\x00\x00\x00ansible.module_utils.facts.network.hurdqJX(\x00\x00\x00ansible.module_utils.facts.network.iscsiqKX(\x00\x00\x00ansible.module_utils.facts.network.linuxqLX)\x00\x00\x00ansible.module_utils.facts.network.netbsdqMX\'\x00\x00\x00ansible.module_utils.facts.network.nvmeqNX*\x00\x00\x00ansible.module_utils.facts.network.openbsdqOX(\x00\x00\x00ansible.module_utils.facts.network.sunosqPX \x00\x00\x00ansible.module_utils.facts.otherqQX\'\x00\x00\x00ansible.module_utils.facts.other.facterqRX%\x00\x00\x00ansible.module_utils.facts.other.ohaiqSX!\x00\x00\x00ansible.module_utils.facts.sysctlqTX!\x00\x00\x00ansible.module_utils.facts.systemqUX*\x00\x00\x00ansible.module_utils.facts.system.apparmorqVX&\x00\x00\x00ansible.module_utils.facts.system.capsqWX(\x00\x00\x00ansible.module_utils.facts.system.chrootqXX)\x00\x00\x00ansible.module_utils.facts.system.cmdlineqYX+\x00\x00\x00ansible.module_utils.facts.system.date_timeqZX.\x00\x00\x00ansible.module_utils.facts.system.distributionq[X%\x00\x00\x00ansible.module_utils.facts.system.dnsq\\X%\x00\x00\x00ansible.module_utils.facts.system.envq]X&\x00\x00\x00ansible.module_utils.facts.system.fipsq^X\'\x00\x00\x00ansible.module_utils.facts.system.localq_X%\x00\x00\x00ansible.module_utils.facts.system.lsbq`X)\x00\x00\x00ansible.module_utils.facts.system.pkg_mgrqaX*\x00\x00\x00ansible.module_utils.facts.system.platformqbX(\x00\x00\x00ansible.module_utils.facts.system.pythonqcX)\x00\x00\x00ansible.module_utils.facts.system.selinuxqdX-\x00\x00\x00ansible.module_utils.facts.system.service_mgrqeX.\x00\x00\x00ansible.module_utils.facts.system.ssh_pub_keysqfX&\x00\x00\x00ansible.module_utils.facts.system.userqgX"\x00\x00\x00ansible.module_utils.facts.timeoutqhX 
\x00\x00\x00ansible.module_utils.facts.utilsqiX"\x00\x00\x00ansible.module_utils.facts.virtualqjX\'\x00\x00\x00ansible.module_utils.facts.virtual.baseqkX,\x00\x00\x00ansible.module_utils.facts.virtual.dragonflyqlX*\x00\x00\x00ansible.module_utils.facts.virtual.freebsdqmX\'\x00\x00\x00ansible.module_utils.facts.virtual.hpuxqnX(\x00\x00\x00ansible.module_utils.facts.virtual.linuxqoX)\x00\x00\x00ansible.module_utils.facts.virtual.netbsdqpX*\x00\x00\x00ansible.module_utils.facts.virtual.openbsdqqX(\x00\x00\x00ansible.module_utils.facts.virtual.sunosqrX)\x00\x00\x00ansible.module_utils.facts.virtual.sysctlqsX\x1c\x00\x00\x00ansible.module_utils.parsingqtX)\x00\x00\x00ansible.module_utils.parsing.convert_boolquX\x1f\x00\x00\x00ansible.module_utils.pycompat24qvX\x1c\x00\x00\x00ansible.module_utils.serviceqwX\x18\x00\x00\x00ansible.module_utils.sixqxeX\x06\x00\x00\x00customqy]qzuX\x0e\x00\x00\x00py_module_nameq{X\x17\x00\x00\x00ansible.modules.systemdq|X\r\x00\x00\x00good_temp_dirq}X\x12\x00\x00\x00/root/.ansible/tmpq~X\x03\x00\x00\x00cwdq\x7fNX\t\x00\x00\x00extra_envq\x80NX\x0b\x00\x00\x00emulate_ttyq\x81\x88X\x0f\x00\x00\x00service_contextq\x82cmitogen.core\n_unpickle_context\nq\x83K\x00N\x86q\x84Rq\x85us\x85q\x86Rq\x87tq\x88.' An exception occurred during task execution. To see the full traceback, use -vvv. The error was: File "<stdin>", line 853, in _find_global fatal: [hetzner3]: FAILED! => {"msg": "Unexpected failure during module execution.", "stdout": ""} NO MORE HOSTS LEFT ********************************************************************* PLAY RECAP ***************************************************************************** hetzner3 : ok=31 changed=7 unreachable=0 failed=1 skipped=18 rescued=0 ignored=0 user@ose:~/sandbox_local/ansible/hetzner3$
- I ran it without `--check`, and it hit the same error on the wazuh restart; but the other changes applied
user@ose:~/sandbox_local/ansible/hetzner3$ ansible-playbook provision.yml PLAY [hetzner3] ************************************************************************ TASK [Gathering Facts] ***************************************************************** ok: [hetzner3] TASK [dev-sec.ssh-hardening : include_tasks] ******************************************* included: /home/user/sandbox_local/ansible/hetzner3/roles/dev-sec.ssh-hardening/tasks/hardening.yml for hetzner3 TASK [dev-sec.ssh-hardening : Set OS dependent variables] ****************************** ok: [hetzner3] => (item=/home/user/sandbox_local/ansible/hetzner3/roles/dev-sec.ssh-hardening/vars/Debian.yml) TASK [dev-sec.ssh-hardening : get openssh-version] ************************************* ok: [hetzner3] TASK [dev-sec.ssh-hardening : include tasks to create crypo-vars] ********************** included: /home/user/sandbox_local/ansible/hetzner3/roles/dev-sec.ssh-hardening/tasks/crypto.yml for hetzner3 TASK [dev-sec.ssh-hardening : set hostkeys according to openssh-version] *************** ok: [hetzner3] TASK [dev-sec.ssh-hardening : set hostkeys according to openssh-version] *************** skipping: [hetzner3] TASK [dev-sec.ssh-hardening : set hostkeys according to openssh-version] *************** skipping: [hetzner3] TASK [dev-sec.ssh-hardening : set macs according to openssh-version if openssh >= 7.6] *** ok: [hetzner3] TASK [dev-sec.ssh-hardening : set macs according to openssh-version if openssh >= 6.6] *** skipping: [hetzner3] TASK [dev-sec.ssh-hardening : set macs according to openssh-version] ******************* skipping: [hetzner3] TASK [dev-sec.ssh-hardening : set macs according to openssh-version] ******************* skipping: [hetzner3] TASK [dev-sec.ssh-hardening : set ciphers according to openssh-version if openssh >= 6.6] *** skipping: [hetzner3] TASK [dev-sec.ssh-hardening : set ciphers according to openssh-version] **************** skipping: [hetzner3] TASK [dev-sec.ssh-hardening : set kex 
according to openssh-version if openssh >= 6.6] *** ok: [hetzner3] TASK [dev-sec.ssh-hardening : set kex according to openssh-version] ******************** skipping: [hetzner3] TASK [dev-sec.ssh-hardening : create revoked_keys and set permissions to root/600] ***** ok: [hetzner3] TASK [dev-sec.ssh-hardening : create sshd_config and set permissions to root/600] ****** ok: [hetzner3] TASK [dev-sec.ssh-hardening : create ssh_config and set permissions to root/644] ******* ok: [hetzner3] TASK [dev-sec.ssh-hardening : Check if /etc/ssh/moduli contains weak DH parameters] **** ok: [hetzner3] TASK [dev-sec.ssh-hardening : remove all small primes] ********************************* skipping: [hetzner3] TASK [dev-sec.ssh-hardening : include tasks to setup ca keys and principals] *********** skipping: [hetzner3] TASK [dev-sec.ssh-hardening : include tasks to setup 2FA] ****************************** skipping: [hetzner3] TASK [dev-sec.ssh-hardening : include selinux specific tasks] ************************** skipping: [hetzner3] TASK [mikegleasonjr.firewall : include_tasks] ****************************************** included: /home/user/sandbox_local/ansible/hetzner3/roles/mikegleasonjr.firewall/tasks/rules.yml for hetzner3 TASK [mikegleasonjr.firewall : Generate v4 rules] ************************************** ok: [hetzner3] TASK [mikegleasonjr.firewall : Load v4 rules] ****************************************** skipping: [hetzner3] TASK [mikegleasonjr.firewall : Generate v6 rules] ************************************** ok: [hetzner3] TASK [mikegleasonjr.firewall : Load v6 rules] ****************************************** skipping: [hetzner3] TASK [mikegleasonjr.firewall : include_tasks] ****************************************** included: /home/user/sandbox_local/ansible/hetzner3/roles/mikegleasonjr.firewall/tasks/persist-debian.yml for hetzner3 TASK [mikegleasonjr.firewall : Remove any obsolete scripts used by an old version of the role] *** ok: [hetzner3] => 
(item=/etc/network/if-post-down.d/iptables-v4) ok: [hetzner3] => (item=/etc/network/if-pre-up.d/iptables-v4) ok: [hetzner3] => (item=/etc/iptables.v4.saved) TASK [mikegleasonjr.firewall : Install iptables-persistent] **************************** ok: [hetzner3] TASK [mikegleasonjr.firewall : Install ipset-persistent] ******************************* ok: [hetzner3] TASK [mikegleasonjr.firewall : Check if netfilter-persistent is present] *************** skipping: [hetzner3] TASK [mikegleasonjr.firewall : Save rules (netfilter-persistent)] ********************** skipping: [hetzner3] TASK [mikegleasonjr.firewall : Save rules (iptables-persistent)] *********************** skipping: [hetzner3] TASK [mikegleasonjr.firewall : include_tasks] ****************************************** skipping: [hetzner3] TASK [maltfield.wazuh : install wazuh prereqs] ***************************************** ok: [hetzner3] TASK [maltfield.wazuh : wazuh gpg key] ************************************************* ok: [hetzner3] TASK [maltfield.wazuh : wazuh repo] **************************************************** ok: [hetzner3] TASK [maltfield.wazuh : install wazuh manager] ***************************************** ok: [hetzner3] TASK [maltfield.wazuh : ossec.conf] **************************************************** ok: [hetzner3] TASK [maltfield.wazuh : local_rules.xml] *********************************************** changed: [hetzner3] TASK [maltfield.wazuh : email encryption .forward file] ******************************** changed: [hetzner3] TASK [maltfield.wazuh : email encryption script] *************************************** changed: [hetzner3] TASK [maltfield.unattended-upgrades : install unattended-upgrades] ********************* changed: [hetzner3] TASK [maltfield.unattended-upgrades : 20auto-upgrades] ********************************* changed: [hetzner3] TASK [maltfield.unattended-upgrades : 50unattended-upgrades] *************************** changed: [hetzner3] TASK [install basic 
essential packages] ************************************************ changed: [hetzner3] RUNNING HANDLER [maltfield.wazuh : restart wazuh-manager] ****************************** ERROR! [mux 52332] 22:08:36.062548 E mitogen.[ssh.144.76.164.201:32415.sudo.root]: raw pickle was: b'\x80\x02(X&\x00\x00\x00ose-52519-797716ec4740-1e0102d106q\x00X\x16\x00\x00\x00ansible_mitogen.targetq\x01NX\n\x00\x00\x00run_moduleq\x02)cmitogen.core\nKwargs\nq\x03}q\x04X\x06\x00\x00\x00kwargsq\x05}q\x06(X\x0b\x00\x00\x00runner_nameq\x07X\x0e\x00\x00\x00NewStyleRunnerq\x08X\x06\x00\x00\x00moduleq\tcansible.utils.unsafe_proxy\nAnsibleUnsafeText\nq\nX\x16\x00\x00\x00ansible.legacy.systemdq\x0b\x85q\x0c\x81q\rX\x04\x00\x00\x00pathq\x0eX9\x00\x00\x00/usr/lib/python3/dist-packages/ansible/modules/systemd.pyq\x0fX\t\x00\x00\x00json_argsq\x10XP\x02\x00\x00{"name": "wazuh-manager", "state": "restarted", "_ansible_check_mode": false, "_ansible_no_log": false, "_ansible_debug": false, "_ansible_diff": false, "_ansible_verbosity": 0, "_ansible_version": "2.10.17", "_ansible_module_name": "ansible.legacy.systemd", "_ansible_syslog_facility": "LOG_USER", "_ansible_selinux_special_fs": ["fuse", "nfs", "vboxsf", "ramfs", "9p", "vfat"], "_ansible_string_conversion_action": "warn", "_ansible_socket": null, "_ansible_shell_executable": "/bin/sh", "_ansible_keep_remote_files": false, "_ansible_tmpdir": null, "_ansible_remote_tmp": "~/.ansible/tmp"}q\x11X\x03\x00\x00\x00envq\x12}q\x13X\x14\x00\x00\x00interpreter_fragmentq\x14NX\t\x00\x00\x00is_pythonq\x15NX\n\x00\x00\x00module_mapq\x16}q\x17(X\x07\x00\x00\x00builtinq\x18]q\x19(X\x1a\x00\x00\x00ansible.module_utils._textq\x1aX\x1a\x00\x00\x00ansible.module_utils.basicq\x1bX\x1b\x00\x00\x00ansible.module_utils.commonq\x1cX/\x00\x00\x00ansible.module_utils.common._collections_compatq\x1dX(\x00\x00\x00ansible.module_utils.common._json_compatq\x1eX"\x00\x00\x00ansible.module_utils.common._utilsq\x1fX\'\x00\x00\x00ansible.module_utils.common.collectionsq X 
\x00\x00\x00ansible.module_utils.common.fileq!X&\x00\x00\x00ansible.module_utils.common.parametersq"X#\x00\x00\x00ansible.module_utils.common.processq#X$\x00\x00\x00ansible.module_utils.common.sys_infoq$X \x00\x00\x00ansible.module_utils.common.textq%X+\x00\x00\x00ansible.module_utils.common.text.convertersq&X+\x00\x00\x00ansible.module_utils.common.text.formattersq\'X&\x00\x00\x00ansible.module_utils.common.validationq(X$\x00\x00\x00ansible.module_utils.common.warningsq)X\x1b\x00\x00\x00ansible.module_utils.compatq*X\'\x00\x00\x00ansible.module_utils.compat._selectors2q+X%\x00\x00\x00ansible.module_utils.compat.selectorsq,X\x1b\x00\x00\x00ansible.module_utils.distroq-X#\x00\x00\x00ansible.module_utils.distro._distroq.X\x1a\x00\x00\x00ansible.module_utils.factsq/X,\x00\x00\x00ansible.module_utils.facts.ansible_collectorq0X$\x00\x00\x00ansible.module_utils.facts.collectorq1X!\x00\x00\x00ansible.module_utils.facts.compatq2X-\x00\x00\x00ansible.module_utils.facts.default_collectorsq3X#\x00\x00\x00ansible.module_utils.facts.hardwareq4X\'\x00\x00\x00ansible.module_utils.facts.hardware.aixq5X(\x00\x00\x00ansible.module_utils.facts.hardware.baseq6X*\x00\x00\x00ansible.module_utils.facts.hardware.darwinq7X-\x00\x00\x00ansible.module_utils.facts.hardware.dragonflyq8X+\x00\x00\x00ansible.module_utils.facts.hardware.freebsdq9X(\x00\x00\x00ansible.module_utils.facts.hardware.hpuxq:X(\x00\x00\x00ansible.module_utils.facts.hardware.hurdq;X)\x00\x00\x00ansible.module_utils.facts.hardware.linuxq<X*\x00\x00\x00ansible.module_utils.facts.hardware.netbsdq=X+\x00\x00\x00ansible.module_utils.facts.hardware.openbsdq>X)\x00\x00\x00ansible.module_utils.facts.hardware.sunosq?X$\x00\x00\x00ansible.module_utils.facts.namespaceq@X"\x00\x00\x00ansible.module_utils.facts.networkqAX&\x00\x00\x00ansible.module_utils.facts.network.aixqBX\'\x00\x00\x00ansible.module_utils.facts.network.baseqCX)\x00\x00\x00ansible.module_utils.facts.network.darwinqDX,\x00\x00\x00ansible.module_utils.facts.network.dra
gonflyqEX)\x00\x00\x00ansible.module_utils.facts.network.fc_wwnqFX*\x00\x00\x00ansible.module_utils.facts.network.freebsdqGX.\x00\x00\x00ansible.module_utils.facts.network.generic_bsdqHX\'\x00\x00\x00ansible.module_utils.facts.network.hpuxqIX\'\x00\x00\x00ansible.module_utils.facts.network.hurdqJX(\x00\x00\x00ansible.module_utils.facts.network.iscsiqKX(\x00\x00\x00ansible.module_utils.facts.network.linuxqLX)\x00\x00\x00ansible.module_utils.facts.network.netbsdqMX\'\x00\x00\x00ansible.module_utils.facts.network.nvmeqNX*\x00\x00\x00ansible.module_utils.facts.network.openbsdqOX(\x00\x00\x00ansible.module_utils.facts.network.sunosqPX \x00\x00\x00ansible.module_utils.facts.otherqQX\'\x00\x00\x00ansible.module_utils.facts.other.facterqRX%\x00\x00\x00ansible.module_utils.facts.other.ohaiqSX!\x00\x00\x00ansible.module_utils.facts.sysctlqTX!\x00\x00\x00ansible.module_utils.facts.systemqUX*\x00\x00\x00ansible.module_utils.facts.system.apparmorqVX&\x00\x00\x00ansible.module_utils.facts.system.capsqWX(\x00\x00\x00ansible.module_utils.facts.system.chrootqXX)\x00\x00\x00ansible.module_utils.facts.system.cmdlineqYX+\x00\x00\x00ansible.module_utils.facts.system.date_timeqZX.\x00\x00\x00ansible.module_utils.facts.system.distributionq[X%\x00\x00\x00ansible.module_utils.facts.system.dnsq\\X%\x00\x00\x00ansible.module_utils.facts.system.envq]X&\x00\x00\x00ansible.module_utils.facts.system.fipsq^X\'\x00\x00\x00ansible.module_utils.facts.system.localq_X%\x00\x00\x00ansible.module_utils.facts.system.lsbq`X)\x00\x00\x00ansible.module_utils.facts.system.pkg_mgrqaX*\x00\x00\x00ansible.module_utils.facts.system.platformqbX(\x00\x00\x00ansible.module_utils.facts.system.pythonqcX)\x00\x00\x00ansible.module_utils.facts.system.selinuxqdX-\x00\x00\x00ansible.module_utils.facts.system.service_mgrqeX.\x00\x00\x00ansible.module_utils.facts.system.ssh_pub_keysqfX&\x00\x00\x00ansible.module_utils.facts.system.userqgX"\x00\x00\x00ansible.module_utils.facts.timeoutqhX 
\x00\x00\x00ansible.module_utils.facts.utilsqiX"\x00\x00\x00ansible.module_utils.facts.virtualqjX\'\x00\x00\x00ansible.module_utils.facts.virtual.baseqkX,\x00\x00\x00ansible.module_utils.facts.virtual.dragonflyqlX*\x00\x00\x00ansible.module_utils.facts.virtual.freebsdqmX\'\x00\x00\x00ansible.module_utils.facts.virtual.hpuxqnX(\x00\x00\x00ansible.module_utils.facts.virtual.linuxqoX)\x00\x00\x00ansible.module_utils.facts.virtual.netbsdqpX*\x00\x00\x00ansible.module_utils.facts.virtual.openbsdqqX(\x00\x00\x00ansible.module_utils.facts.virtual.sunosqrX)\x00\x00\x00ansible.module_utils.facts.virtual.sysctlqsX\x1c\x00\x00\x00ansible.module_utils.parsingqtX)\x00\x00\x00ansible.module_utils.parsing.convert_boolquX\x1f\x00\x00\x00ansible.module_utils.pycompat24qvX\x1c\x00\x00\x00ansible.module_utils.serviceqwX\x18\x00\x00\x00ansible.module_utils.sixqxeX\x06\x00\x00\x00customqy]qzuX\x0e\x00\x00\x00py_module_nameq{X\x17\x00\x00\x00ansible.modules.systemdq|X\r\x00\x00\x00good_temp_dirq}X\x12\x00\x00\x00/root/.ansible/tmpq~X\x03\x00\x00\x00cwdq\x7fNX\t\x00\x00\x00extra_envq\x80NX\x0b\x00\x00\x00emulate_ttyq\x81\x88X\x0f\x00\x00\x00service_contextq\x82cmitogen.core\n_unpickle_context\nq\x83K\x00N\x86q\x84Rq\x85us\x85q\x86Rq\x87tq\x88.' An exception occurred during task execution. To see the full traceback, use -vvv. The error was: File "<stdin>", line 853, in _find_global fatal: [hetzner3]: FAILED! => {"msg": "Unexpected failure during module execution.", "stdout": ""} NO MORE HOSTS LEFT ********************************************************************* PLAY RECAP ***************************************************************************** hetzner3 : ok=31 changed=7 unreachable=0 failed=1 skipped=18 rescued=0 ignored=0 user@ose:~/sandbox_local/ansible/hetzner3$
- well it looks like there is no 'wazuh' or 'ossec' service of any kind, so that would be an issue
root@mail ~ # systemctl list-units | grep -iE 'ossec|wazuh' root@mail ~ #
- ok, apparently I have to enable and start it before it'll show-up (`systemctl list-units` only shows units that are loaded; `systemctl list-unit-files` would have listed it even while stopped) https://documentation.wazuh.com/current/installation-guide/wazuh-server/step-by-step.html#starting-the-wazuh-manager
root@mail ~ # systemctl enable wazuh-manager Created symlink /etc/systemd/system/multi-user.target.wants/wazuh-manager.service → /lib/systemd/system/wazuh-manager.service. root@mail ~ # systemctl list-units | grep -iE 'ossec|wazuh' root@mail ~ # systemctl start wazuh-manager Job for wazuh-manager.service failed because the control process exited with error code. See "systemctl status wazuh-manager.service" and "journalctl -xeu wazuh-manager.service" for details. root@mail ~ # systemctl list-units | grep -iE 'ossec|wazuh' ● wazuh-manager.service loaded failed failed Wazuh manager root@mail ~ #
- but, in any case, it had an error trying to start
root@mail ~ # journalctl -u wazuh-manager --no-pager Sep 15 05:16:18 mail systemd[1]: Starting wazuh-manager.service - Wazuh manager... Sep 15 05:16:20 mail env[93637]: 2024/09/15 05:16:20 wazuh-csyslogd: ERROR: (1230): Invalid element in the configuration: 'rules'. Sep 15 05:16:20 mail env[93637]: 2024/09/15 05:16:20 wazuh-csyslogd: ERROR: (1202): Configuration error at 'etc/ossec.conf'. Sep 15 05:16:20 mail env[93637]: 2024/09/15 05:16:20 wazuh-csyslogd: CRITICAL: (1202): Configuration error at 'etc/ossec.conf'. Sep 15 05:16:20 mail env[93617]: wazuh-csyslogd: Configuration error. Exiting Sep 15 05:16:20 mail systemd[1]: wazuh-manager.service: Control process exited, code=exited, status=1/FAILURE Sep 15 05:16:20 mail systemd[1]: wazuh-manager.service: Failed with result 'exit-code'. Sep 15 05:16:20 mail systemd[1]: Failed to start wazuh-manager.service - Wazuh manager. Sep 15 05:16:20 mail systemd[1]: wazuh-manager.service: Consumed 1.706s CPU time. root@mail ~ #
- so it doesn't like our old config files; guess I have to read the wazuh documentation about upgrading https://documentation.wazuh.com/current/upgrade-guide/upgrading-central-components.html
- well this clearly says that the 'ossec' user was replaced by a 'wazuh' user in Wazuh 4.2.x https://documentation.wazuh.com/current/upgrade-guide/upgrading-central-components.html#upgrading-the-wazuh-server
- otherwise the docs weren't especially helpful
- it would be nice if the error message said which line of the file the error is on
- the internet says to check ossec.log directly, but it's not any more helpful
root@mail ~ # cat /var/ossec/logs/ossec.log 2024/09/15 05:16:20 wazuh-csyslogd: ERROR: (1230): Invalid element in the configuration: 'rules'. 2024/09/15 05:16:20 wazuh-csyslogd: ERROR: (1202): Configuration error at 'etc/ossec.conf'. 2024/09/15 05:16:20 wazuh-csyslogd: CRITICAL: (1202): Configuration error at 'etc/ossec.conf'. root@mail ~ #
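Since wazuh's own error gives no line number, a generic XML well-formedness check can at least catch syntax problems with one. ossec.conf may contain several top-level `<ossec_config>` blocks, so it has to be wrapped in a dummy root element before a standard XML parser will accept it. A sketch on a stand-in file (on the server, point it at `/var/ossec/etc/ossec.conf`); note this only checks XML syntax, not whether wazuh accepts the elements:

```shell
# Stand-in ossec.conf with two top-level blocks, as wazuh allows
conf=$(mktemp)
cat > "$conf" <<'EOF'
<ossec_config>
  <global><email_notification>no</email_notification></global>
</ossec_config>
<ossec_config>
  <syscheck><frequency>43200</frequency></syscheck>
</ossec_config>
EOF

# Wrap in a dummy root so a standard XML parser will accept it;
# the parser raises (with a line number) on malformed XML
{ echo '<root>'; cat "$conf"; echo '</root>'; } | python3 -c '
import sys, xml.etree.ElementTree as ET
ET.fromstring(sys.stdin.read())
print("well-formed")
'
```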
- A couple other people reported this error on GitHub, but they stopped responding so the devs just closed it :(
- fortunately ansible *did* make a backup of the ossec.conf file before it replaced it
root@mail ~ # ls -lah /var/ossec/etc | grep -i ossec.conf
-rw-rw---- 1 root wazuh 9.7K Sep 15 04:36 ossec.conf
-rw-rw---- 1 root wazuh 9.2K Sep 15 04:35 ossec.conf.31557.2024-09-15@04:36:48~
root@mail ~ #
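- if I'm reading the error right, the offending element is probably OSSEC's legacy `<rules>` block, which (as far as I can tell) Wazuh replaced with a `<ruleset>` block; worth verifying against the stock Wazuh 4.x ossec.conf. Roughly, a legacy block like:

```xml
<!-- legacy OSSEC-style rules block (rejected by wazuh 4.x) -->
<rules>
  <include>rules_config.xml</include>
  <include>sshd_rules.xml</include>
</rules>
```

would need to become something like:

```xml
<!-- wazuh-style ruleset block (dirs per the stock ossec.conf) -->
<ruleset>
  <decoder_dir>ruleset/decoders</decoder_dir>
  <rule_dir>ruleset/rules</rule_dir>
  <decoder_dir>etc/decoders</decoder_dir>
  <rule_dir>etc/rules</rule_dir>
</ruleset>
```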
Fri Sep 13, 2024
- I found some bugs in wordpress that weren't an issue in php 5.6, but will be in php 8+ (debian 12 runs php 8.2 currently)
- I opened this bug report https://core.trac.wordpress.org/ticket/62047
- ...but it was closed as a duplicate of this existing issue, opened in 2019 https://core.trac.wordpress.org/ticket/48693
- but ^ that ticket was created to address just an issue where calling `ini_set()` would add noise to the logs
- the wordpress dev said that, apparently since php 8, calling ini_set() when it's disabled (as OSE has always done, per hardening best-practices) triggers a PHP Fatal Error; this is a catastrophic error that prevents us from even being able to log in
- I'm going to have to manually edit all our wordpress installs where this bug exists, but I'm hoping this gets fixed upstream in the not too distant future
- I submitted a PR to wordpress https://github.com/WordPress/wordpress-develop/pull/7352
- one of the wordpress maintainers recommended that we create a dummy function named "ini_set" in wp-config.php
...
- I decided that the best way to proceed with ansible-ifying our varnish config is to move the purge keys out of the site vcl file template into a new file with just the one line that instantiates the variable containing the purge key. Then we just
include
that file in the main vcl file
- this has the advantage of being able to use ansible to generate a random purge key when it first creates the file, without it changing if we change the big site-specific vcl file
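- a minimal sketch of that layout (hypothetical filenames; note that in VCL the statement for pulling another file into the config is `include`):

```vcl
# purge-key.vcl -- generated once by ansible, never re-templated;
# holds only the secret (value below is a placeholder)
sub set_purge_key {
    set req.http.X-Expected-Purge-Key = "RANDOM-VALUE-FILLED-IN-BY-ANSIBLE";
}
```

```vcl
# site.vcl -- the big site-specific template, free to change at any time
include "purge-key.vcl";
```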
...
- that finishes the first-pass through all the templates; tomorrow I need to grep again through all the files and see if there's any more secrets to remove
Thu Sep 12, 2024
- spent more time working on ansible roles
- I mostly finished varnish
- one varnish issue is that all of these varnish vhost config files have a plaintext purge key in them; we shouldn't add that to github (we want to publish our ansible roles)
- Ideally, I'd like to have ansible assign this purge key to be random when it first creates the files. I already did this with /etc/ansible/secret (which is a global non-vhost-specific purge key), but that method won't work for a file with more than just the password in it — else ansible will change the key every time it makes an update to the file
- I asked about this here https://stackoverflow.com/questions/78980038/how-to-replace-jinja2-variable-only-when-first-provisioning-file
- one potential solution is to use `register` and `.changed` to execute a subsequent task (that's limited by `creates`) conditionally, and have that task do the substitution. But my fear is that this is less robust, and--if it breaks--we'll end up with the unsubstituted string (e.g. "CHANGEME") actually being defined as the purge key. If that happens, it makes us a bit more vulnerable to a DoS.
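- another option that might sidestep the problem entirely (untested sketch; file/variable names are made up): the `template` module's `force: no` only writes the file when it doesn't already exist, so a key randomized at render time would survive every later run

```yaml
# untested sketch -- provision the purge-key file exactly once
- name: generate per-site varnish purge key (first provision only)
  template:
    src: purge-key.vcl.j2
    dest: "/etc/varnish/sites/{{ vhost }}/purge-key.vcl"
    mode: "0640"
    force: no   # no-op if the file already exists, so the key never rotates
  vars:
    purge_key: "{{ lookup('password', '/dev/null length=32 chars=hexdigits') }}"
```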
- I also created vhosts for xhprof (and a cron job to clean up old reports from '
- I finished the ansible role for php
- we're doing a major jump from php 5.6 on hetzner2 (cent 7.9) to php 8.2 on hetzner3 (debian 12)
Wed Sep 11, 2024
- wow, mikegleasonjr merged my PR from yesterday into his ansible-role-firewall role https://github.com/mikegleasonjr/
- that was fast! and it was the first commit to that repo in 6 years; cool 8)
- I spent more time working on the ansible roles. I finished an (untested) review of the apache role and began working on the varnish role
Tue Sep 10, 2024
- Today I billed Marcin for 11 hours in August
- I sent an email to Marcin letting him know that my September commitment was postponed until October, and that I'm hoping to get a lot of work done this month. I don't expect to be able to finish everything, but I do hope to at least migrate some of the websites before October 15th
Hey Marcin, I just sent you an invoice for my 11 hours working on hetzner3 in August. I have good news: my September commitment was pushed-back to October, so I have availability in September to make a lot of progress on provisioning hetzner3. I don't think I'll be able to fully finish this project before my next commitment in mid-October, but I do think I'll be able to migrate some subset of your sites in this stretch. Cheers, Michael Altfield Senior Technology Advisor PGP Fingerprint: 8A4B 0AF8 162F 3B6A 79B7 70D2 AA3E DF71 60E2 D97B Open Source Ecology www.opensourceecology.org
...
- here's TOFU 3/3 (ISP, exit in Ecuador)
Ecuador 2024-09-10 --2024-09-10 REDACTED-- https://www.mediawiki.org/keys/keys.txt Resolving www.mediawiki.org (www.mediawiki.org)... 208.80.154.224, 2620:0:861:ed1a::1 Connecting to www.mediawiki.org (www.mediawiki.org)|208.80.154.224|:443... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/plain] Saving to: ‘keys.txt’ keys.txt [ <=> ] 54.79K 300KB/s in 0.2s 2024-09-10 REDACTED (300 KB/s) - ‘keys.txt’ saved [56107] --2024-09-10 REDACTED https://releases.wikimedia.org/mediawiki/1.39/mediawiki-1.39.8.tar.gz.sig Resolving releases.wikimedia.org (releases.wikimedia.org)... 208.80.154.224, 2620:0:861:ed1a::1 Connecting to releases.wikimedia.org (releases.wikimedia.org)|208.80.154.224|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 95 [application/pgp-signature] Saving to: ‘mediawiki-1.39.8.tar.gz.sig’ mediawiki-1.39.8.ta 100%[===================>] 95 --.-KB/s in 0s 2024-09-10 REDACTED (113 MB/s) - ‘mediawiki-1.39.8.tar.gz.sig’ saved [95/95] --2024-09-10 REDACTED-- https://wordpress.org/wordpress-6.6.1.zip Resolving wordpress.org (wordpress.org)... 198.143.164.252 Connecting to wordpress.org (wordpress.org)|198.143.164.252|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 26138467 (25M) [application/zip] Saving to: ‘wordpress-6.6.1.zip’ wordpress-6.6.1.zip 100%[===================>] 24.93M 7.04MB/s in 3.5s 2024-09-10 REDACTED (7.04 MB/s) - ‘wordpress-6.6.1.zip’ saved [26138467/26138467] 2024-09-10 2e943991a469cb28f4906148b2c3517ab6d5a9285e5342e2312c9f70e643955c keys.txt 25376a68595b872b5efdda1fc21a905df1afa57717a01e9e71d344067b216b4e mediawiki-1.39.8.tar.gz.sig 3757aa0f30e5e6f9952bcd08ca02c82f15b5fd25fb0cc6f9a8bc437af5a8f09f wordpress-6.6.1.zip gpg: WARNING: no command supplied. Trying to guess what you mean ... 
pub rsa4096/0x73F146FECF9D333C 2014-11-20 [SC] [expired: 2021-06-05] Key fingerprint = F64E BF5F 2099 6AB5 14F1 98A8 73F1 46FE CF9D 333C uid Tim Starling <tstarling@wikimedia.org> sub rsa4096/0x1075249FCCC9CAAF 2014-11-20 [E] [expired: 2021-06-05] pub dsa1024/0xC119E1A64D70938E 2003-11-15 [SCA] Key fingerprint = 4412 76E9 CCD1 5F44 F6D9 7D18 C119 E1A6 4D70 938E uid Brion Vibber <brion@pobox.com> sub elg1024/0x6596FAD2965B3548 2003-11-15 [E] pub dsa1024/0x9B69B3109D3BB7B0 2011-10-24 [SC] Key fingerprint = 1D98 867E 8298 2C8F E0AB C25F 9B69 B310 9D3B B7B0 uid Sam Reed <reedy@wikimedia.org> sub elg2048/0x3BBB95CE2B08BFD2 2011-10-24 [E] pub rsa2048/0x72BC1C5D23107F8A 2014-04-29 [SC] [expires: 2026-04-29] Key fingerprint = 41B2 ABE8 17AD D3E5 2BDA 946F 72BC 1C5D 2310 7F8A uid Chad Horohoe <chad@wikimedia.org> uid keybase.io/demon <demon@keybase.io> sub rsa2048/0x08CF4E7951361C13 2014-04-29 [E] [expires: 2026-04-29] pub rsa4096/0xF6DAD285018FAC02 2014-02-19 [SC] [expired: 2018-10-04] Key fingerprint = 6237 D8D3 ECC1 AE91 8729 296F F6DA D285 018F AC02 uid Tyler Cipriani <tcipriani@wikimedia.org> uid Tyler Cipriani <tyler@tylercipriani.com> uid [jpeg image of size 5098] sub rsa4096/0xB002E1FDEE737D83 2014-02-19 [E] [expired: 2018-10-04] pub rsa3072/0x26752EBB0D9E6218 2021-11-11 [SC] Key fingerprint = 72D2 86F6 F8F0 3C78 F2C5 9C73 2675 2EBB 0D9E 6218 uid Amir Sarabadani <asarabadani@wikimedia.org> sub rsa3072/0x4F889038CE86B378 2021-11-11 [E] pub rsa4096/0x361F943B15C08DD4 2015-05-22 [SC] [expired: 2020-05-20] Key fingerprint = 80D1 13B7 67E3 D519 3672 5679 361F 943B 15C0 8DD4 uid Brian Wolff <bwolff@wikimedia.org> uid Brian Wolff (Bawolff) <bawolff@gmail.com> sub rsa4096/0xBF1629CD074D3DD8 2015-05-22 [E] [expired: 2020-05-20] pub rsa4096/0x131910E01605D9AA 2016-01-08 [SC] [expired: 2020-07-31] Key fingerprint = C83A 8E4D 3C8F EB7C 8A3A 1998 1319 10E0 1605 D9AA uid Mukunda Modell <twentyafterfour@gmail.com> uid Mukunda Modell (WMF) <mmodell@wikimedia.org> uid [jpeg image of 
size 2928] sub rsa4096/0x5411F23A0C4E5EC1 2018-12-25 [A] [expired: 2020-12-24] sub rsa4096/0x02C99BB8AB1C6DD5 2018-12-25 [E] [expired: 2020-12-24] sub rsa4096/0x60AE06D4875BE862 2018-12-26 [S] [expired: 2019-12-26] user@disp4545:/tmp/tmp.5jCgsgJXsY$
- excellent; the sha256sums and full gpg output are identical on all three TOFUs; I now have very high confidence in their authenticity
...
- I returned to work on reviewing and preparing the ansible roles
- I've chosen to use the ansible-role-firewall ansible role by mikegleasonjr because it's simple and powerful https://github.com/mikegleasonjr/
- unfortunately, I've discovered that, as soon as you add a firewall rule that uses the ipset module, iptables fails on boot, and you're left with an unfirewalled server if you reboot!
- the fix for this appears to be installing the `ipset-persistent` package in apt, though the module doesn't do this for you
- I filed a bug report with the maintainer (even though it hasn't been updated in 6 years) https://github.com/mikegleasonjr/ansible-role-firewall/issues/43
- I also submitted a PR https://github.com/mikegleasonjr/ansible-role-firewall/pull/44
- in the meantime, I'm just going to edit the ansible role on the OSE config to install the `ipset-persistent` package
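- the workaround itself is a one-task sketch (assuming the role uses the stock apt module):

```yaml
# ipset-persistent saves/restores ipset sets via netfilter-persistent,
# so iptables rules that reference ipset sets can be loaded at boot
- name: install ipset-persistent
  apt:
    name: ipset-persistent
    state: present
```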
Wed Aug 07, 2024
- I realized that the 3TOFU on wordpress plugins and themes is going to be very difficult, especially for any themes or plugins whose releases are locked up behind paywalls
- anyway, here's TOFU 2/3 (VPN, exit in Latvia) for the main software
Latvia 2024-08-07 --2024-08-07 15:53:31-- https://www.mediawiki.org/keys/keys.txt Resolving www.mediawiki.org (www.mediawiki.org)... 185.15.59.224, 2a02:ec80:300:ed1a::1 Connecting to www.mediawiki.org (www.mediawiki.org)|185.15.59.224|:443... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/plain] Saving to: ‘keys.txt’ keys.txt [ <=> ] 54.79K 64.6KB/s in 0.8s 2024-08-07 15:53:35 (64.6 KB/s) - ‘keys.txt’ saved [56107] --2024-08-07 15:53:35-- https://releases.wikimedia.org/mediawiki/1.39/mediawiki-1.39.8.tar.gz.sig Resolving releases.wikimedia.org (releases.wikimedia.org)... 185.15.59.224, 2a02:ec80:300:ed1a::1 Connecting to releases.wikimedia.org (releases.wikimedia.org)|185.15.59.224|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 95 [application/pgp-signature] Saving to: ‘mediawiki-1.39.8.tar.gz.sig’ mediawiki-1.39.8.ta 100%[===================>] 95 --.-KB/s in 0s 2024-08-07 15:53:37 (104 MB/s) - ‘mediawiki-1.39.8.tar.gz.sig’ saved [95/95] --2024-08-07 15:53:37-- https://wordpress.org/wordpress-6.6.1.zip Resolving wordpress.org (wordpress.org)... 198.143.164.252 Connecting to wordpress.org (wordpress.org)|198.143.164.252|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 26138467 (25M) [application/zip] Saving to: ‘wordpress-6.6.1.zip’ wordpress-6.6.1.zip 100%[===================>] 24.93M 494KB/s in 77s 2024-08-07 15:54:58 (330 KB/s) - ‘wordpress-6.6.1.zip’ saved [26138467/26138467] 2024-08-07 2e943991a469cb28f4906148b2c3517ab6d5a9285e5342e2312c9f70e643955c keys.txt 25376a68595b872b5efdda1fc21a905df1afa57717a01e9e71d344067b216b4e mediawiki-1.39.8.tar.gz.sig 3757aa0f30e5e6f9952bcd08ca02c82f15b5fd25fb0cc6f9a8bc437af5a8f09f wordpress-6.6.1.zip gpg: WARNING: no command supplied. Trying to guess what you mean ... 
pub rsa4096/0x73F146FECF9D333C 2014-11-20 [SC] [expired: 2021-06-05] Key fingerprint = F64E BF5F 2099 6AB5 14F1 98A8 73F1 46FE CF9D 333C uid Tim Starling <tstarling@wikimedia.org> sub rsa4096/0x1075249FCCC9CAAF 2014-11-20 [E] [expired: 2021-06-05] pub dsa1024/0xC119E1A64D70938E 2003-11-15 [SCA] Key fingerprint = 4412 76E9 CCD1 5F44 F6D9 7D18 C119 E1A6 4D70 938E uid Brion Vibber <brion@pobox.com> sub elg1024/0x6596FAD2965B3548 2003-11-15 [E] pub dsa1024/0x9B69B3109D3BB7B0 2011-10-24 [SC] Key fingerprint = 1D98 867E 8298 2C8F E0AB C25F 9B69 B310 9D3B B7B0 uid Sam Reed <reedy@wikimedia.org> sub elg2048/0x3BBB95CE2B08BFD2 2011-10-24 [E] pub rsa2048/0x72BC1C5D23107F8A 2014-04-29 [SC] [expires: 2026-04-29] Key fingerprint = 41B2 ABE8 17AD D3E5 2BDA 946F 72BC 1C5D 2310 7F8A uid Chad Horohoe <chad@wikimedia.org> uid keybase.io/demon <demon@keybase.io> sub rsa2048/0x08CF4E7951361C13 2014-04-29 [E] [expires: 2026-04-29] pub rsa4096/0xF6DAD285018FAC02 2014-02-19 [SC] [expired: 2018-10-04] Key fingerprint = 6237 D8D3 ECC1 AE91 8729 296F F6DA D285 018F AC02 uid Tyler Cipriani <tcipriani@wikimedia.org> uid Tyler Cipriani <tyler@tylercipriani.com> uid [jpeg image of size 5098] sub rsa4096/0xB002E1FDEE737D83 2014-02-19 [E] [expired: 2018-10-04] pub rsa3072/0x26752EBB0D9E6218 2021-11-11 [SC] Key fingerprint = 72D2 86F6 F8F0 3C78 F2C5 9C73 2675 2EBB 0D9E 6218 uid Amir Sarabadani <asarabadani@wikimedia.org> sub rsa3072/0x4F889038CE86B378 2021-11-11 [E] pub rsa4096/0x361F943B15C08DD4 2015-05-22 [SC] [expired: 2020-05-20] Key fingerprint = 80D1 13B7 67E3 D519 3672 5679 361F 943B 15C0 8DD4 uid Brian Wolff <bwolff@wikimedia.org> uid Brian Wolff (Bawolff) <bawolff@gmail.com> sub rsa4096/0xBF1629CD074D3DD8 2015-05-22 [E] [expired: 2020-05-20] pub rsa4096/0x131910E01605D9AA 2016-01-08 [SC] [expired: 2020-07-31] Key fingerprint = C83A 8E4D 3C8F EB7C 8A3A 1998 1319 10E0 1605 D9AA uid Mukunda Modell <twentyafterfour@gmail.com> uid Mukunda Modell (WMF) <mmodell@wikimedia.org> uid [jpeg image of 
size 2928] sub rsa4096/0x5411F23A0C4E5EC1 2018-12-25 [A] [expired: 2020-12-24] sub rsa4096/0x02C99BB8AB1C6DD5 2018-12-25 [E] [expired: 2020-12-24] sub rsa4096/0x60AE06D4875BE862 2018-12-26 [S] [expired: 2019-12-26] user@disp6295:/tmp/tmp.njx09QpvJQ$
- I sent Marcin & Catarina an email asking if they have a list of all the paid wordpress themes/plugins that they use
Hey Marcin, Hey Catarina, Do you have a list of all the wordpress plugins & themes that you use that are paid? I might need to get credentials from you to be able to download the latest version of these plugins & themes. Having a complete list will help with the migration. Please let me know if you have a list of the wordpress themes & plugins that you've bought and use. Thank you, Michael Altfield Senior Technology Advisor PGP Fingerprint: 8A4B 0AF8 162F 3B6A 79B7 70D2 AA3E DF71 60E2 D97B Open Source Ecology www.opensourceecology.org
- Marcin also responded to my last email, agreeing on this migration order
- he also said that the only candidate to retire was the old forums, but since I converted that to a dumb static HTML site, I said not to. It's trivial to migrate, unlike a DB-backed web app.
1. forum.opensourceecology.org 2. store.opensourceecology.org 3. microfactory.opensourceecology.org 4. fef.opensourceecology.org 5. oswh.opensourceecology.org 6. seedhome.openbuildinginstitute.org 7. www.openbuildinginstitute.org 8. www.opensourceecology.org 9. phplist.opensourceecology.org 10. wiki.opensourceecology.org
- I spent some time on the ansible roles, most of today was merging the old nginx config files to the new debian format
- it looks like I used to have two awstats sites for OSE and OBI. I've decided to combine those into just one
- Catarina responded with info about paid wordpress themes. She says:
- We have 3 licenses for Oshine (www.openbuildinginstitute.org, microfactory.opensourceecology.org, and store.opensourceecology.org) +
- 1 license for Enigmatic (www.opensourceecology.org)
- the licenses are split over two different accounts
- she shared the credentials with me confidentially, but I'm unable to log in because, in addition to the username/email and password, the website sends an OTP to the email account registered for 2FA. This is giving me déjà vu
- fortunately I left some notes on this wiki before; apparently you don't even need to login to download the oshine release; it's publicly available here http://brandexponents.com/oshin-plugins/oshine.zip
- looks like the latest version of oshine is v7.2.2, released a few months ago on 2024-05-30 https://brandexponents.com/oshin-changelog/
- looks like the latest version of enigmatic is Version 3.6, released many years ago on 2017-08-30
- we currently have v3.5 installed, so we do have an update to do
- I can't login to any of these accounts on my own, but Catarina forwarded me an OTP and I was able to get into her account
- Her Account -> Downloads section shows 3 rows
- Oshine - Regular License
- Oshine - Regular License
- Enigmatic - Regular License
- I clicked "Download" -> "Installable Wordpress file only" for each of the two separate rows for Oshine. I also downloaded a third copy from http://brandexponents.com/oshin-plugins/oshine.zip
- All three downloads had different file names
- The first file was named
themeforest-3JjZqZRr-oshine-creative-multipurpose-wordpress-theme-wordpress-theme.zip
and came from https://marketplace-downloads.customer.envatousercontent.com/files/496530523/oshin.zip?response-content-disposition=attachment%3B+filename%3Dthemeforest-3JjZqZRr-oshine-creative-multipurpose-wordpress-theme-wordpress-theme.zip&Expires=1723076076&Signature=eEnIYvbyQgjbbk~CCZMJq-gLERi-I2pSSZhkfL7hN~L3~UcMdV8Fuuehh5ArJV~xAANzUMobrS539ByHlyahnmp8rBhoBw4yItlhxKzzayOyf7Y9k0JMKeIj0RauYNkUSzyzLBgqB52eilQXINrmwpxxKuE5xs5n4FDDgJIlwxjsK0993lfWEcBW0CIrKjaPWehHh6MGDqlRNMaGJoGp~CFL6zTmmq~rnwEahlg~AWa5cULrupmm5ZQvLRAqh9~7BADq67nHIUW37Ya9ys6v-afqk5WNzLuciDaREcTV93Zm3bU1fu1Dpczt01wPFrRxyQqim85W40VvAHMD~AyU03iNNP51kfD~v12OxTWNPnyA7W7i~8zvxQ8m3jwNj-kMa~yKTllaV4nSZkShxYOc69~dgsPzKAwDJ0ukaIgW1Hs-07YZoQV0lvWs9DNbKPBTNbUBK9KQ1Rc4UEuu1ediKyuF9GZ-NVuhj7eI9AzraxrL4p~6-RmCu5Fk0GiYM20JZdOZMLQJiMhMsiP4PhJaZCmz0xYSGn9lfgurKJOOdISsTAXWgZp~o4nrtIj0B4u7SdaTIKHCMQ0kKT0yh0ZZxYQKonBk-FK0D7uiO4eddDUwTR~tiK8hLDrqu2wk1oHGGBosbrF-5IyWeup8J0EvX2ZtD0FW37bsYBOUR9JUjV0_&Key-Pair-Id=APKAJRP2AVKNFZOM4BLQ
- The second file was named
themeforest-4EaAhtH1-oshine-creative-multipurpose-wordpress-theme-wordpress-theme.zip
and came from https://marketplace-downloads.customer.envatousercontent.com/files/496530523/oshin.zip?response-content-disposition=attachment%3B+filename%3Dthemeforest-4EaAhtH1-oshine-creative-multipurpose-wordpress-theme-wordpress-theme.zip&Expires=1723076144&Signature=h6cYIRf3dYjKM35sOS6Bfk5HPrcYj9Gi1Qfcz8Qj38SSDLW8IpQA-3q1VuyxKG1ODLcMnGinrgzhroVyZKanfQBD0nqlQc1TYTibeoBvGpTqGdETE9beUErw6FxJDJYFwxltGiBmUbMyV-pK0wigxMUKRls6I9Pzz2w7Vf-xR5Wp1m9SeFiNehAZ1E3UevERqLh0OIYU-Lcb1cirvs5WVrUaHILA9vnOV8AxQSgJeC6ldNrMkeGwEJCGX5UaO1VmvtN7APNAH43eDVPb8gwIfuXe3ZrRnvoz~Ezx2sgrOAPB15phu9pYgpNYQPCeeoQUyJ6kKTJh~GtjhC2MIdKEtiXOzYDUdybPt2~nr84ruFbuxMeXTQEhr-0fI~FGgCv~d3swfzlON3BwZHZziipZAlBGHPG49N5BrcCqbLRpDeM6ldSNiArAkrd~FAkV~uCAHrsWJ1OQCEz6-DxbcXgoumhxIoafECv8sSMuIF5MjzTQlLN6qCh2Rph7NMLEJbGjYn1e6dFO8K~b-zCShYhZg8qs1ret6n5R~US0r6Jq2Q9lykYMv5hWhcMLpM74dOT1~UFq4YHgFuwuIMzs7X3663GTMzzcCbHbldEgJW8oAFsD4p515UQOI06ltDxjAXfgyoXrlXNy0QW~ODuLLbAsgzTeyQMHo-7e0xX7h9wsPSQ_&Key-Pair-Id=APKAJRP2AVKNFZOM4BLQ
- The third file was just named
oshine.zip
and came from http://brandexponents.com/oshin-plugins/oshine.zip
- The first two files had an identical hash, but the third, which I got without logging in, was different
user@disp6631:~/Downloads$ sha256sum i-download-after-login-1/themeforest-3JjZqZRr-oshine-creative-multipurpose-wordpress-theme-wordpress-theme.zip
7506d6759ff1ee3f66d6135176537f12067ce86f2d5ba045c125f20df6240789 i-download-after-login-1/themeforest-3JjZqZRr-oshine-creative-multipurpose-wordpress-theme-wordpress-theme.zip
user@disp6631:~/Downloads$
user@disp6631:~/Downloads$ sha256sum i-download-after-login-2/themeforest-4EaAhtH1-oshine-creative-multipurpose-wordpress-theme-wordpress-theme.zip
7506d6759ff1ee3f66d6135176537f12067ce86f2d5ba045c125f20df6240789 i-download-after-login-2/themeforest-4EaAhtH1-oshine-creative-multipurpose-wordpress-theme-wordpress-theme.zip
user@disp6631:~/Downloads$
user@disp6631:~/Downloads$ sha256sum oshine.zip
8a3ec6b288bbb3c0d08693d14f315b25957f582023705e8f224c323f02e297ed oshine.zip
user@disp6631:~/Downloads$
- I did a diff of the oshine.zip file that I downloaded from the unauth'd public URL and the two identical files that I downloaded from themeforest, and I found that the unauth'd one is v7.0.5 (from Mar 2022) and the other ones are v7.2.1 (from Apr 2024)
- I also went ahead and downloaded Enigmatic, which saved to a file named
themeforest-2XwUOcbo-enigmatic-responsive-multipurpose-wp-theme-wordpress-theme.zip
from
user@disp6631:~/Downloads$ sha256sum themeforest-2XwUOcbo-enigmatic-responsive-multipurpose-wp-theme-wordpress-theme.zip
ed0628d0e57bb4e44b1af24eb235c6c384433c9ca94806c11b881e16f7f2b74a themeforest-2XwUOcbo-enigmatic-responsive-multipurpose-wp-theme-wordpress-theme.zip
user@disp6631:~/Downloads$
- the GET variable "Expires" is set to "
1723076850
", which is 8 minutes from now. So I think these URLs are valid for 10 minutes only.
- anyway, I've saved these versions on my laptop's OSE VM for now. I hope I can do at least a 2TOFU before I copy them to our new server.
- I also downloaded the license info for the three paid products on catarina's themeforest
user@disp6631:~/Downloads$ ls *.txt REDACTED-enigmatic-responsive-multipurpose-wp-theme-license.txt REDACTED-oshine-creative-multipurpose-wordpress-theme-license.txt REDACTED-oshine-creative-multipurpose-wordpress-theme-license.txt user@disp6631:~/Downloads$ user@disp6631:~/Downloads$ for f in $(ls *.txt); do echo $f; cat $f; echo; done REDACTED-enigmatic-responsive-multipurpose-wp-theme-license.txt LICENSE CERTIFICATE : Envato Market Item ============================================== This document certifies the purchase of: ONE REGULAR LICENSE as defined in the standard terms and conditions on Envato Market. Licensor's Author Username: LiveMesh Licensee: Catarina Mota Item Title: Enigmatic - Responsive Multi-Purpose WP Theme Item URL: https://themeforest.net/item/enigmatic-responsive-multipurpose-wp-theme/REDACTED Item ID: REDACTED Item Purchase Code: REDACTED Purchase Date: 2014-02-06 REDACTED UTC For any queries related to this document or license please contact Help Team via https://help.market.envato.com Envato Pty Ltd (11 119 159 741) PO Box 16122, Melbourne, VIC 8007, Australia ### THIS IS NOT A TAX RECEIPT OR INVOICE REDACTED-oshine-creative-multipurpose-wordpress-theme-license.txt LICENSE CERTIFICATE : Envato Market Item ============================================== This document certifies the purchase of: ONE REGULAR LICENSE as defined in the standard terms and conditions on Envato Market. 
Licensor's Author Username: brandexponents Licensee: Catarina Mota Item Title: Oshine - Multipurpose Creative WordPress Theme Item URL: https://themeforest.net/item/oshine-creative-multipurpose-wordpress-theme/REDACTED Item ID: REDACTED Item Purchase Code: REDACTED Purchase Date: 2016-02-08 REDACTED UTC For any queries related to this document or license please contact Help Team via https://help.market.envato.com Envato Pty Ltd (11 119 159 741) PO Box 16122, Melbourne, VIC 8007, Australia ### THIS IS NOT A TAX RECEIPT OR INVOICE REDACTED-oshine-creative-multipurpose-wordpress-theme-license.txt LICENSE CERTIFICATE : Envato Market Item ============================================== This document certifies the purchase of: ONE REGULAR LICENSE as defined in the standard terms and conditions on Envato Market. Licensor's Author Username: brandexponents Licensee: Catarina Mota Item Title: Oshine - Multipurpose Creative WordPress Theme Item URL: https://themeforest.net/item/oshine-creative-multipurpose-wordpress-theme/REDACTED Item ID: REDACTED Item Purchase Code: REDACTED Purchase Date: 2019-03-27 REDACTED UTC For any queries related to this document or license please contact Help Team via https://help.market.envato.com Envato Pty Ltd (11 119 159 741) PO Box 16122, Melbourne, VIC 8007, Australia ### THIS IS NOT A TAX RECEIPT OR INVOICE user@disp6631:~/Downloads$
- after spending a few hours manually merging the nginx configs, I checked the '/var/www/html/' dir on the prod server, and I noticed a few dirs I didn't recognize
[root@opensourceecology ~]# ls -lah /var/www/html total 100K drwxr-xr-x 25 root root 4.0K May 30 2023 . drwxr-xr-x 5 root root 4.0K May 30 2023 .. d---r-x--- 3 not-apache apache 4.0K Aug 8 2018 3dp.opensourceecology.org drwxr-xr-x 4 root root 4.0K Dec 31 2019 awstats.openbuildinginstitute.org drwxr-xr-x 4 root root 4.0K Dec 31 2019 awstats.opensourceecology.org drwxr-xr-x 2 root root 4.0K Mar 2 2018 cacti.opensourceecology.org.old drwxr-xr-x 3 apache apache 4.0K Feb 9 2018 certbot d---r-x--- 3 not-apache apache 4.0K Aug 7 2018 d3d.opensourceecology.org d---r-x--- 3 not-apache apache 4.0K Apr 9 2019 fef.opensourceecology.org dr-xr-x--- 5 apache apache 4.0K Jul 11 2018 forum.opensourceecology.org d---r-x--- 3 not-apache apache 4.0K Oct 4 2018 microfactory.opensourceecology.org drwxr-xr-x 5 munin munin 4.0K Nov 6 2023 munin drwxr-xr-x 2 root root 4.0K Mar 3 2018 munin.opensourceecology.org drwxr-xr-x 3 root root 4.0K Nov 24 2017 openbuildinginstitute.org drwxr-x--- 3 apache apache 4.0K Jan 10 2018 oswh.opensourceecology.org d---r-x--- 7 not-apache apache 4.0K Mar 16 2019 phplist.opensourceecology.org d---r-x--- 4 not-apache apache 4.0K Dec 18 2017 seedhome.openbuildinginstitute.org drwxr-xr-x 3 root root 4.0K Nov 23 2017 SITE_DOWN drwxr-x--- 3 apache apache 4.0K Nov 13 2017 staging.openbuildinginstitute.org d---r-x--- 3 not-apache apache 4.0K Mar 5 2018 staging.opensourceecology.org d---r-x--- 4 not-apache apache 4.0K Apr 9 2019 store.opensourceecology.org drwxr-xr-x 3 apache apache 4.0K Sep 18 2017 varnishTest d---r-x--- 4 not-apache apache 4.0K May 18 2020 wiki.opensourceecology.org drwxr-x--- 3 apache apache 4.0K Dec 9 2017 www.openbuildinginstitute.org d---r-x--- 3 not-apache apache 4.0K Sep 5 2019 www.opensourceecology.org [root@opensourceecology ~]#
- d3d.opensourceecology.org
- openbuildinginstitute.org
- this one is basically empty (not to be confused with the actual vhost dir '/var/www/html/www.openbuildinginstitute.org/')
[root@opensourceecology ~]# find /var/www/html/openbuildinginstitute.org/
/var/www/html/openbuildinginstitute.org/
/var/www/html/openbuildinginstitute.org/htdocs
/var/www/html/openbuildinginstitute.org/htdocs/.well-known
[root@opensourceecology ~]#
- my notes say that d3d was an alternate name for what became microfactory, so I'm not going to migrate it
Mon Aug 05, 2024
- I spent some more time working on the ansible roles
- I realized that wazuh 4.x is out, but our old install had 3.0 https://documentation.wazuh.com/current/installation-guide/wazuh-agent/wazuh-agent-package-linux.html
- I'm going to try to install wazuh 4.x on this new server
Sun Aug 04, 2024
- at the time of writing, here's the latest versions of mediawiki https://www.mediawiki.org/wiki/Version_lifecycle
- MediaWiki 1.42.1, current stable. Released 2024-06-27. EOL 2025-06 https://www.mediawiki.org/wiki/Special:MyLanguage/MediaWiki_1.42
- MediaWiki 1.39.8, current LTS stable. Released 2022-11-30. EOL 2025-11 https://www.mediawiki.org/wiki/Special:MyLanguage/MediaWiki_1.39
- There's also going to be a new LTS release coming out in 2024-12, but I expect that we'll finish the migration by then
- good news: looks like Mediawiki signs their releases with PGP ☺
- the keys file doesn't appear to be on more than one domain; it has a lot of different people's keys https://www.mediawiki.org/keys/keys.txt
- the MediaWiki source code is also on mediawiki.org, so that doesn't offer any additional out-of-band verification of their public keys https://gerrit.wikimedia.org/
- the current LTS release is signed with Sam Reed's key (1D98 867E 8298 2C8F E0AB C25F 9B69 B310 9D3B B7B0), but I think we should just 3TOFU the keys.txt file above https://releases.wikimedia.org/mediawiki/1.39/mediawiki-1.39.8.tar.gz.sig
- at the time of writing, the latest version of wordpress is 6.6.1, released 2024-07-23
- wordpress doesn't have an LTS version :(
- and wordpress doesn't sign their releases :'(
- there's been a ticket open for adding release signing to the in-app upgrades since 2016 (8 years ago!), but that's not really what we want. We want release signing for downloading the installer https://core.trac.wordpress.org/ticket/39309
- I opened a bug report to just sign the damn releases as part of the release process with gpg like all other major foss projects, but I was given the runaround and they said they'll wait until core finishes the 8-year-old stalled ticket. perfect example of perfect being the enemy of the good https://meta.trac.wordpress.org/ticket/7574
- so I guess we'll just have to 3TOFU the latest wordpress release https://wordpress.org/wordpress-6.6.1.zip
- here's our 3TOFU script
REMOTE_FILES="https://www.mediawiki.org/keys/keys.txt https://releases.wikimedia.org/mediawiki/1.39/mediawiki-1.39.8.tar.gz.sig https://wordpress.org/wordpress-6.6.1.zip"

CURL="/usr/bin/curl"
WGET="/usr/bin/wget --retry-on-host-error --retry-connrefused"
PYTHON="/usr/bin/python3"

# in tails, we must torify
if [[ "`whoami`" == "amnesia" ]] ; then
	CURL="/usr/bin/torify ${CURL}"
	WGET="/usr/bin/torify ${WGET}"
	PYTHON="/usr/bin/torify ${PYTHON}"
fi

tmpDir=`mktemp -d`
pushd "${tmpDir}"

# first get some info about our internet connection
${CURL} -s https://ifconfig.co/country | head -n1
${CURL} -s https://check.torproject.org | grep Congratulations | head -n1

# and today's date
date -u +"%Y-%m-%d"

# get the files
for file in ${REMOTE_FILES}; do
	${WGET} ${file}
done

# checksum
date -u +"%Y-%m-%d"
sha256sum *

# gpg fingerprint
gpg --with-fingerprint --with-subkey-fingerprint --keyid-format 0xlong keys.txt
- And here's TOFU 1/3 (Tor, exit in Germany)
Congratulations. This browser is configured to use Tor. 2024-08-04 --2024-08-04 22:09:12-- https://www.mediawiki.org/keys/keys.txt Resolving www.mediawiki.org (www.mediawiki.org)... 185.15.59.224 Connecting to www.mediawiki.org (www.mediawiki.org)|185.15.59.224|:443... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/plain] Saving to: ‘keys.txt’ keys.txt [ <=> ] 54.79K 97.2KB/s in 0.6s 2024-08-04 22:09:18 (97.2 KB/s) - ‘keys.txt’ saved [56107] --2024-08-04 22:09:18-- https://releases.wikimedia.org/mediawiki/1.39/mediawiki-1.39.8.tar.gz.sig Resolving releases.wikimedia.org (releases.wikimedia.org)... 185.15.59.224 Connecting to releases.wikimedia.org (releases.wikimedia.org)|185.15.59.224|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 95 [application/pgp-signature] Saving to: ‘mediawiki-1.39.8.tar.gz.sig’ mediawiki-1.39.8.ta 100%[===================>] 95 --.-KB/s in 0s 2024-08-04 22:09:21 (214 MB/s) - ‘mediawiki-1.39.8.tar.gz.sig’ saved [95/95] --2024-08-04 22:09:21-- https://wordpress.org/wordpress-6.6.1.zip Resolving wordpress.org (wordpress.org)... 198.143.164.252 Connecting to wordpress.org (wordpress.org)|198.143.164.252|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 26138467 (25M) [application/zip] Saving to: ‘wordpress-6.6.1.zip’ wordpress-6.6.1.zip 100%[===================>] 24.93M 704KB/s in 67s 2024-08-04 22:10:30 (383 KB/s) - ‘wordpress-6.6.1.zip’ saved [26138467/26138467] 2024-08-04 2e943991a469cb28f4906148b2c3517ab6d5a9285e5342e2312c9f70e643955c keys.txt 25376a68595b872b5efdda1fc21a905df1afa57717a01e9e71d344067b216b4e mediawiki-1.39.8.tar.gz.sig 3757aa0f30e5e6f9952bcd08ca02c82f15b5fd25fb0cc6f9a8bc437af5a8f09f wordpress-6.6.1.zip gpg: WARNING: no command supplied. Trying to guess what you mean ... 
pub rsa4096/0x73F146FECF9D333C 2014-11-20 [SC] [expired: 2021-06-05] Key fingerprint = F64E BF5F 2099 6AB5 14F1 98A8 73F1 46FE CF9D 333C uid Tim Starling <tstarling@wikimedia.org> sub rsa4096/0x1075249FCCC9CAAF 2014-11-20 [E] [expired: 2021-06-05] pub dsa1024/0xC119E1A64D70938E 2003-11-15 [SCA] Key fingerprint = 4412 76E9 CCD1 5F44 F6D9 7D18 C119 E1A6 4D70 938E uid Brion Vibber <brion@pobox.com> sub elg1024/0x6596FAD2965B3548 2003-11-15 [E] pub dsa1024/0x9B69B3109D3BB7B0 2011-10-24 [SC] Key fingerprint = 1D98 867E 8298 2C8F E0AB C25F 9B69 B310 9D3B B7B0 uid Sam Reed <reedy@wikimedia.org> sub elg2048/0x3BBB95CE2B08BFD2 2011-10-24 [E] pub rsa2048/0x72BC1C5D23107F8A 2014-04-29 [SC] [expires: 2026-04-29] Key fingerprint = 41B2 ABE8 17AD D3E5 2BDA 946F 72BC 1C5D 2310 7F8A uid Chad Horohoe <chad@wikimedia.org> uid keybase.io/demon <demon@keybase.io> sub rsa2048/0x08CF4E7951361C13 2014-04-29 [E] [expires: 2026-04-29] pub rsa4096/0xF6DAD285018FAC02 2014-02-19 [SC] [expired: 2018-10-04] Key fingerprint = 6237 D8D3 ECC1 AE91 8729 296F F6DA D285 018F AC02 uid Tyler Cipriani <tcipriani@wikimedia.org> uid Tyler Cipriani <tyler@tylercipriani.com> uid [jpeg image of size 5098] sub rsa4096/0xB002E1FDEE737D83 2014-02-19 [E] [expired: 2018-10-04] pub rsa3072/0x26752EBB0D9E6218 2021-11-11 [SC] Key fingerprint = 72D2 86F6 F8F0 3C78 F2C5 9C73 2675 2EBB 0D9E 6218 uid Amir Sarabadani <asarabadani@wikimedia.org> sub rsa3072/0x4F889038CE86B378 2021-11-11 [E] pub rsa4096/0x361F943B15C08DD4 2015-05-22 [SC] [expired: 2020-05-20] Key fingerprint = 80D1 13B7 67E3 D519 3672 5679 361F 943B 15C0 8DD4 uid Brian Wolff <bwolff@wikimedia.org> uid Brian Wolff (Bawolff) <bawolff@gmail.com> sub rsa4096/0xBF1629CD074D3DD8 2015-05-22 [E] [expired: 2020-05-20] pub rsa4096/0x131910E01605D9AA 2016-01-08 [SC] [expired: 2020-07-31] Key fingerprint = C83A 8E4D 3C8F EB7C 8A3A 1998 1319 10E0 1605 D9AA uid Mukunda Modell <twentyafterfour@gmail.com> uid Mukunda Modell (WMF) <mmodell@wikimedia.org> uid [jpeg image of 
size 2928] sub rsa4096/0x5411F23A0C4E5EC1 2018-12-25 [A] [expired: 2020-12-24] sub rsa4096/0x02C99BB8AB1C6DD5 2018-12-25 [E] [expired: 2020-12-24] sub rsa4096/0x60AE06D4875BE862 2018-12-26 [S] [expired: 2019-12-26] user@host:/tmp/user/1000/tmp.WBsp37SdVb$
- I need to decide the order of the websites to roll-out
- we should do the less important ones first and save the hardest for last
- here's the list of sites that apache is currently serving
[maltfield@opensourceecology ~]$ sudo httpd -S [sudo] password for maltfield: VirtualHost configuration: 127.0.0.1:8010 localhost.localdomain (/etc/httpd/conf.d/certbot.conf:13) 127.0.0.1:8000 is a NameVirtualHost default server fef.opensourceecology.org (/etc/httpd/conf.d/00-fef.opensourceecology.org.conf:10) port 8000 namevhost fef.opensourceecology.org (/etc/httpd/conf.d/00-fef.opensourceecology.org.conf:10) port 8000 namevhost forum.opensourceecology.org (/etc/httpd/conf.d/00-forum.opensourceecology.org.conf:10) port 8000 namevhost microfactory.opensourceecology.org (/etc/httpd/conf.d/00-microfactory.opensourceecology.org.conf:10) port 8000 namevhost oswh.opensourceecology.org (/etc/httpd/conf.d/00-oswh.opensourceecology.org.conf:10) port 8000 namevhost phplist.opensourceecology.org (/etc/httpd/conf.d/00-phplist.opensourceecology.org.conf:10) port 8000 namevhost seedhome.openbuildinginstitute.org (/etc/httpd/conf.d/00-seedhome.openbuildinginstitute.org.conf:9) port 8000 namevhost store.opensourceecology.org (/etc/httpd/conf.d/00-store.opensourceecology.org.conf:10) port 8000 namevhost wiki.opensourceecology.org (/etc/httpd/conf.d/00-wiki.opensourceecology.org.conf:10) port 8000 namevhost www.openbuildinginstitute.org (/etc/httpd/conf.d/00-www.openbuildinginstitute.org.conf:1) alias openbuildinginstitute.org port 8000 namevhost www.opensourceecology.org (/etc/httpd/conf.d/000-www.opensourceecology.org.conf:10) alias www.opensourceecology.org alias blog.opensourceecology.org alias opensourceecology.org port 8000 namevhost awstats.openbuildinginstitute.org (/etc/httpd/conf.d/awstats.openbuildinginstitute.org.conf:1) port 8000 namevhost awstats.opensourceecology.org (/etc/httpd/conf.d/awstats.opensourceecology.org.conf:1) port 8000 namevhost munin.opensourceecology.org (/etc/httpd/conf.d/munin.opensourceecology.org.conf:1) port 8000 namevhost staging.opensourceecology.org (/etc/httpd/conf.d/staging.opensourceecology.org.conf:10) alias staging.opensourceecology.org 
alias opensourceecology.org ServerRoot: "/etc/httpd" Main DocumentRoot: "/etc/httpd/htdocs" Main ErrorLog: "/etc/httpd/logs/error_log" Mutex proxy: using_defaults Mutex authn-socache: using_defaults Mutex ssl-cache: using_defaults Mutex default: dir="/run/httpd/" mechanism=default Mutex mpm-accept: using_defaults Mutex authdigest-opaque: using_defaults Mutex proxy-balancer-shm: using_defaults Mutex rewrite-map: using_defaults Mutex authdigest-client: using_defaults Mutex ssl-stapling: using_defaults PidFile: "/run/httpd/httpd.pid" Define: _RH_HAS_HTTPPROTOCOLOPTIONS Define: DUMP_VHOSTS Define: DUMP_RUN_CFG Define: MODSEC_2.5 Define: MODSEC_2.9 User: name="apache" id=48 Group: name="apache" id=48 [maltfield@opensourceecology ~]$
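- to turn that `httpd -S` dump into a simple checklist of sites, the vhost names can be pulled out with a short pipeline. This is a sketch against an inlined two-line sample (on the server you'd pipe in `sudo httpd -S 2>&1` instead):

```shell
# Extract unique "namevhost" hostnames from httpd -S style output.
# Sample input is inlined here for illustration only; on hetzner2 you
# would replace the printf with: sudo httpd -S 2>&1
printf '%s\n' \
  'port 8000 namevhost wiki.opensourceecology.org (...)' \
  'port 8000 namevhost forum.opensourceecology.org (...)' \
  'port 8000 namevhost wiki.opensourceecology.org (...)' \
  | grep -oE 'namevhost [^ ]+' | awk '{print $2}' | sort -u
```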
- for the hetzner1 -> hetzner2 migration, this was the order we migrated
- www.openbuildinginstitute.org https://wiki.opensourceecology.org/wiki/CHG-2017-09-25_migrate_obi_to_hetzner2
- fef.opensourceecology.org https://wiki.opensourceecology.org/wiki/CHG-2018-01-03_migrate_fef_to_hetzner2
- forum.opensourceecology.org https://wiki.opensourceecology.org/wiki/CHG-2018-02-04_deprecate_vanilla_forums
- www.opensourceecology.org https://wiki.opensourceecology.org/wiki/CHG-2018-02-05_migrate_osemain_to_hetzner2
- wiki.opensourceecology.org https://wiki.opensourceecology.org/wiki/CHG-2018-05-22_migrate_wiki_to_hetzner2
- that doesn't include sites that have been added since the migration, including microfactory, phplist, seedhome, and store
- I couldn't find a CHG for oswh, but I guess I also migrated that at some point in 2018 (?)
- oswh was the site that got totally hacked and I cleaned it up in 2017 https://wiki.opensourceecology.org/wiki/Maltfield_log_2017#Sat_Dec_30.2C_2017
- I think we should do it in this order
- forum.opensourceecology.org
- store.opensourceecology.org
- microfactory.opensourceecology.org
- fef.opensourceecology.org
- oswh.opensourceecology.org
- seedhome.openbuildinginstitute.org
- www.openbuildinginstitute.org
- www.opensourceecology.org
- phplist.opensourceecology.org
- wiki.opensourceecology.org
- I sent Marcin an email asking him if this looks ideal to him
Hey Marcin, We need to decide what order to migrate the sites from Hetzner2 to Hetzner3. Here's what I propose: 1. forum.opensourceecology.org 2. store.opensourceecology.org 3. microfactory.opensourceecology.org 4. fef.opensourceecology.org 5. oswh.opensourceecology.org 6. seedhome.openbuildinginstitute.org 7. www.openbuildinginstitute.org 8. www.opensourceecology.org 9. phplist.opensourceecology.org 10. wiki.opensourceecology.org Does that cover all of the websites? Did I miss any? The general idea is to do simpler & less-priority websites first, saving the trickiest and highest-priority websites for last. Do you agree with this order? Would you prefer some changes? Now is also a good time to consider if you want to retire any of these unused websites, if you'd like. Please let me know if you agree with this migration order. Thank you, Michael Altfield Senior Technology Advisor PGP Fingerprint: 8A4B 0AF8 162F 3B6A 79B7 70D2 AA3E DF71 60E2 D97B Open Source Ecology www.opensourceecology.org
- I updated Hetzner3 with some of the info that hetzner reported back to us
- we learned that our hetzner3 server was an `EX42-NVMe` model. For comparison, Hetzner2 was an `EX41S-SSD`.
- we learned the 64G RAM is the max that this server can take
- I booted the server to grab the /proc/cpuinfo and then shut it down again
root@mail ~ # cat /proc/cpuinfo ... processor : 7 vendor_id : GenuineIntel cpu family : 6 model : 94 model name : Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz stepping : 3 microcode : 0xf0 cpu MHz : 905.921 cache size : 8192 KB physical id : 0 siblings : 8 core id : 3 cpu cores : 4 apicid : 7 initial apicid : 7 fpu : yes fpu_exception : yes cpuid level : 22 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d arch_capabilities vmx flags : vnmi preemption_timer invvpid ept_x_only ept_ad ept_1gb flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest ple shadow_vmcs pml bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa itlb_multihit srbds mmio_stale_data retbleed gds bogomips : 6799.81 clflush size : 64 cache_alignment : 64 address sizes : 39 bits physical, 48 bits virtual power management: root@mail ~ #
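- a quick way to tally logical CPUs from a dump like that (sample input inlined below; on the server itself it's just `grep -c ^processor /proc/cpuinfo`):

```shell
# Count logical CPUs ("processor" entries). The i7-6700 above shows
# processors 0-7: 4 physical cores x 2 threads (Hyper-Threading) = 8.
printf 'processor\t: %s\n' 0 1 2 3 4 5 6 7 | grep -c '^processor'
```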
- I spent some time preparing our ansible roles & provisioning playbook
Wed July 31, 2024
- This morning I woke-up to some emails from Hetzner indicating that our Hetzner3 order is finished, and responding to my questions
>1. By default, is the two disks configured in a RAID1 array? Servers from server auction are without pre-installed OS, so that no Software-Raid is pre-configured. >2. Do we have any other RAID options? The server have two disks, so that you could install an Linux OS via installimage script. There you would have the option to install it without raid or raid1 or raid0 is possible with two disks. Please find information about installimagescript here: https://docs.hetzner.com/robot/dedicated-server/operating-systems/installimage/ >3. How many additional (empty/unused) disk slots does this dedicated >server have? What options would we have for adding additional disks to >this machine in the future, if needed You can add one additional NVMe or up to two additional sata SSD or sata HDD. >4. We ordered this because it has "M.2 NVME disks" as opposed to the >"SSD" disks. Can you confirm that the NVME disks are faster than "SSD" >disks? If not, please cancel our order and we'll purchase another >machine with 3x 512G SSD disks Usually NVMe SSD are faster than sata ssd. >5. Currently we have another dedicated server with "2 x 250 GB SATA 6 >Gb/s SSD". Can you please tell us the "Gb/s" throughput for this >server's disks? Unfortunately I don't have information about it. You should test it yourself on your server.
- so it looks like we did well snagging the last-available, lowest-price auction server with "2x SSD M.2 NVMe 512 GB"
- I checked their server auction page again, and I do still see one server available at their lowest 37.72 EUR/mo price with 2x 512G NVMe disks, so I guess one listing doesn't necessarily mean that there's only one server available.
- after migration, we should end-up with a 25% full disk. If we ~triple our current disk usage before we retire this server, we have the ability to add two more non-NVMe SATA SSD disks in another RAID1, which we can partition-up as-needed for our backups, tmp files, etc (hopefully we can keep www and DB on the faster NVMe disks)
- their hardware page has info on the addon SSD disks that they lease, and their prices https://docs.hetzner.com/robot/dedicated-server/general-information/root-server-hardware/#drives
- it looks like their cheapest non-NVMe SSD is 8.50 EUR/mo for a 1T disk. Of course, we would need two for the RAID, so that means we can 4x our disk space in the future for an additional 17 EUR/mo. They also have a 3.84 TB SATA SSD for 37 EUR/mo, and they have both 16T and 22T SATA HDDs for 20.50 and 27.00 EUR/mo, respectively.
- we also get a free 100GB "storage box" along with our purchase, which I guess is some external NFS mount. It's probably slow as hell, but we *could* use it for something like our two local copies of encrypted backup data. We also have this as part of our hetzner2 plan, but we don't use it.
- the free 100G storage box is named "BX10". We can even increase this to a BX11 (1T @ 3.20 EUR/mo), BX21 (5T @ 10.90 EUR/mo), BX31 (20.80 EUR/mo), or BX41 (40.60 EUR/mo)
- this is my first time setting-up a hetzner dedicated server (I inherited both hetzner1 & hetzner2)
- I was hoping for something like an in-browser KVM/VNC like SolusVM offers, or to feed it 'cloudinit' like hetzner cloud offers, but I don't see either as an option
- looks like they have scripts for installing a few distros. docs here https://docs.hetzner.com/robot/dedicated-server/operating-systems/installimage/
- and here's the general docs on their dedicated servers https://docs.hetzner.com/robot/dedicated-server/getting-started/root-server-guide/
- ok, they do offer a KVM-over-IP for installing custom distros. Apparently a technician has to physically plug-in to the machine, and you're given 3 free hours. After that it's 8.40 EUR/hr https://docs.hetzner.com/robot/dedicated-server/maintainance/kvm-console/
- the new server is now listed on our "Hetzner Robot" Server Page https://robot.hetzner.com/server
- the old server is listed as "EX41S-SSD #XXXXXX"
- the new server is listed more simply as "Server Auction #XXXXXXX"
- I sent another support request to hetzner asking which type of hardware we have
Hi, I have another question about possible disk upgrades to our newly-purcahsed server "Server Auction #2443019". Can you please tell us what type of server we ordered? Is it an AX? DX? EX? GEX? PX? RX? SX? Or what? I ask because your documentation page on what disk upgrade options are available has a lot of caveats (eg "only for the following servers" or "not available for XYZ"), so to know what our options are I need to know which server type we have. * https://docs.hetzner.com/robot/dedicated-server/general-information/root-server-hardware/#drives Please let us know which server type we have, so that that I can figure out what disk upgrade options are available (and their prices). Thank you,
- I probably should have done this before, but I checked the cloud hetzner offerings
- I didn't check it before because I expected it to be memory bound. I do want to stick with 64G of RAM
- indeed, the cheapest server with 64G of RAM in the cloud is 95.99 EUR/mo — but that also gives us 16 dedicated cores (AMD) and 360 GB disk. And, of course, it's easier to upgrade. But too expensive
- as said above, we *could* probably get-by with 16G of RAM. That's 23.99 EUR/mo with dedicated vCPU. 32G with dedicated vCPU is 47.99 EUR/mo. With shared vCPU, we get 16G RAM for 15.90 EUR/mo or 32G for 31.90 EUR/mo. But we can't increase the RAM beyond 32G on the shared vCPU systems
- therefore, I do think going with the dedicated server is our best bet due to the value on the RAM that we get.
- I also asked hetzner sales about possible memory upgrades. I don't think we'll need more than 64G of RAM, but it would be good to know if upgrading is possible
Hi, I have another question about possible memory upgrades to our newly-purcahsed server "Server Auction #2443019". Can you please tell us if our current configuration with "4x RAM 16384 MB DDR4" is the maximum RAM that this system can accept? Does the server only have 4x RAM slots? Or are there some empty ones? Is it possible to increase one or more of the RAM slots with >16G RAM chips? Please let us know what options we have for future memory upgrades on this new dedicated server. Thank you,
- I tried to shut down the hetzner3 server (to eliminate it as a vector until I've hardened it), but there's only an option to reboot :(
- I gave hetzner my ssh public key at order-time, and — yep — it's setup with the root user by default :(
user@ose:~/tmp/ansible$ ssh root@144.76.164.201 Linux rescue 6.9.7 #1 SMP Thu Jun 27 15:07:37 UTC 2024 x86_64 -------------------- Welcome to the Hetzner Rescue System. This Rescue System is based on Debian GNU/Linux 12 (bookworm) with a custom kernel. You can install software like you would in a normal system. To install a new operating system from one of our prebuilt images, run 'installimage' and follow the instructions. Important note: Any data that was not written to the disks will be lost during a reboot. For additional information, check the following resources: Rescue System: https://docs.hetzner.com/robot/dedicated-server/troubleshooting/hetzner-rescue-system Installimage: https://docs.hetzner.com/robot/dedicated-server/operating-systems/installimage Install custom software: https://docs.hetzner.com/robot/dedicated-server/operating-systems/installing-custom-images other articles: https://docs.hetzner.com/robot -------------------- Rescue System (via Legacy/CSM) up since 2024-07-31 09:16 +02:00 Hardware data: CPU1: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz (Cores 8) Memory: 64099 MB Disk /dev/nvme0n1: 512 GB (=> 476 GiB) doesn't contain a valid partition table Disk /dev/nvme1n1: 512 GB (=> 476 GiB) doesn't contain a valid partition table Total capacity 953 GiB with 2 Disks Network data: eth0 LINK: yes MAC: 90:1b:0e:c4:28:b4 IP: 144.76.164.201 IPv6: 2a01:4f8:200:40d7::2/64 Intel(R) PRO/1000 Network Driver root@rescue ~ #
- here's our disks info. So already only 476.9G. It'll be a bit less after we put a filesystem on it, I'm sure
root@rescue ~ # lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS loop0 7:0 0 3.1G 1 loop nvme0n1 259:0 0 476.9G 0 disk nvme1n1 259:1 0 476.9G 0 disk root@rescue ~ #
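- to be clear, the 512 GB → 476.9 GiB "shrink" is just decimal-vs-binary units (GB as marketed vs GiB as lsblk reports), not filesystem overhead; that comes later. A sanity-check of the conversion:

```shell
# 512 GB (decimal, as marketed) expressed in GiB (binary, as lsblk reports).
# lsblk shows 476.9G because the drive's raw capacity is slightly over
# 512e9 bytes.
awk 'BEGIN { printf "%.1f\n", 512e9 / 1024^3 }'   # -> 476.8
```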
- I'm reading through the guide on hetzner's `installimage` tool https://docs.hetzner.com/robot/dedicated-server/operating-systems/installimage/
- the guide suggests a couple commands to see if we have a hardware raid. I didn't think we did, and this appears to confirm it
root@rescue ~ # megacli -LDInfo -Lall -Aall Exit Code: 0x00 root@rescue ~ # root@rescue ~ # arcconf GETCONFIG 1 LD Controllers found: 0 Invalid controller number. root@rescue ~ #
- it appears we don't have any software RAIDs already setup either
root@rescue ~ # ls /dev/md* ls: cannot access '/dev/md*': No such file or directory root@rescue ~ #
- quick disk tests show we're getting ~2330 MB/s (≈18.6 Gb/s) buffered disk reads. That's actually well above the advertised "6 Gb/s" SATA link rate (~600 MB/s usable) on our prod server (though admittedly I never tested that one)
root@rescue ~ # hdparm -Ttv /dev/nvme0n1 /dev/nvme0n1: readonly = 0 (off) readahead = 256 (on) geometry = 488386/64/32, sectors = 1000215216, start = 0 Timing cached reads: 35298 MB in 1.97 seconds = 17911.22 MB/sec Timing buffered disk reads: 6992 MB in 3.00 seconds = 2330.11 MB/sec root@rescue ~ #
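- worth double-checking the units here: hdparm reports MB/s, while SATA's "6 Gb/s" is a serial link bit-rate. Converting the measured figure to a bit-rate for an apples-to-apples comparison:

```shell
# hdparm measured 2330 MB/s buffered reads above. Convert MB/s to Gb/s
# (x8 bits per byte, /1000 Mb per Gb) to compare against SATA's "6 Gb/s":
awk 'BEGIN { printf "%.1f\n", 2330 * 8 / 1000 }'   # -> 18.6
```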
- ok, I ran installimage
root@rescue ~ # installimage
- I selected "Debian"
- I selected "Debian-1205-bookworm-amd64-base"
- it dumped me into a midnight commander editor and said I could save with F10
- it said that "by default all disks are used for software raid" — that sounds good
- the "standard config" file that it gave me was too long to try to copy & paste, but the default RAID looked like what we wanted
- the default partition layout had a 32G swap, 1G /boot, and the rest allocated to '/'
- hetzner2 has a 488M /boot that's currently 84% full (using 386M). 386/1024 = 38% full, which is much better. That sounds good.
- hetzner2 has a 32G swap. It does get used, but it's currently using <1G. 32G should be fine.
- I thought about setting up an LVM. It would be a better idea than having everything on one big partition, but it would inevitably require more maintenance. For the sake of keeping things simple for a non-profit that has no sysadmins on staff, I'm going to stick to "just allocate the rest to '/'"
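- the /boot headroom math above, spelled out (386M currently used on hetzner2, raw 488M partition vs the new 1024M one):

```shell
# 386M used of hetzner2's 488M /boot vs the same 386M in a 1024M /boot.
# (df reports 84% on hetzner2 because the usable filesystem is a bit
# smaller than the raw partition.)
awk 'BEGIN { printf "%.0f%% %.0f%%\n", 100*386/488, 100*386/1024 }'   # -> 79% 38%
```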
- oh, cool, the config file said what our disks are (if it can be trusted). It says: SAMSUNG MZVLB512HAJQ
- looks like they're from 2017 https://ssd.userbenchmark.com/SpeedTest/401452/SAMSUNG-MZVLB512HAJQ-000L2
- samsung advertises them as having 3.5 GB/s sequential read + 2.9 GB/s sequential write + 460K random read IOPS + 500K random write IOPS
- I decided to accept all these defaults and proceed with the install
- oh, except I did change the hostname line
- ah shit, how do I send an F10 command over ssh? stupid midnight commander editor...
- I pressed Ctrl+[[ and that seemed to work
- the install finished in a few minutes
Hetzner Online GmbH - installimage Your server will be installed now, this will take some minutes You can abort at any time with CTRL+C ... : Reading configuration done : Loading image file variables done : Loading debian specific functions done 1/16 : Deleting partitions done 2/16 : Test partition size done 3/16 : Creating partitions and /etc/fstab done 4/16 : Creating software RAID level 1 done 5/16 : Formatting partitions : formatting /dev/md/0 with swap done : formatting /dev/md/1 with ext3 done : formatting /dev/md/2 with ext4 done 6/16 : Mounting partitions done 7/16 : Sync time via ntp done : Importing public key for image validation done 8/16 : Validating image before starting extraction done 9/16 : Extracting image (local) done 10/16 : Setting up network config done 11/16 : Executing additional commands : Setting hostname done : Generating new SSH keys done : Generating mdadm config done : Generating ramdisk done : Generating ntp config done 12/16 : Setting up miscellaneous files done 13/16 : Configuring authentication : Fetching SSH keys done : Disabling root password done : Disabling SSH root login with password done : Copying SSH keys done 14/16 : Installing bootloader grub done 15/16 : Running some debian specific functions done 16/16 : Clearing log files done INSTALLATION COMPLETE You can now reboot and log in to your new system with the same credentials that you used to log into the rescue system. root@rescue ~ #
- I have to say that I was happy that it generated new ssh keys and that it said it was verifying some public key and doing some image verification.
- oh, and they disabled root password and ssh login with password. that's better than I expected from them. good.
- I ran `shutdown -h now`, and the server didn't come back. That's actually good. I wanted to see if I could shut the thing down (the safest machine is a machine that's off, especially before hardening), but the hetzner robot WUI didn't give an option to shutdown (only to reboot).
- after waiting 5 minutes with no pongs to my pings, I logged into the hetzner robot wui -> server -> reset tab -> Execute an automatic hardware reset, and I clicked "Send" https://robot.hetzner.com/server
- after a few more minutes with still no pongs to my pings, I logged into the hetzner robot wui -> server -> WOL tab -> clicked "Send WOL signal to server"
- after about 2 minutes, I started getting pings and was able to ssh-in as the 'root' user
- as soon as I got a shell from ssh, I quickly pasted-in my "jumpstart" provisioning and hardening commands to create a user for me, do basic ssh hardening, and setup a basic firewall to block everything except ssh
adduser maltfield --disabled-password --gecos '' groupadd sshaccess gpasswd -a maltfield sshaccess mkdir /home/maltfield/.ssh/ echo "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDGNYjR7UKiJSAG/AbP+vlCBqNfQZ2yuSXfsEDuM7cEU8PQNJyuJnS7m0VcA48JRnpUpPYYCCB0fqtIEhpP+szpMg2LByfTtbU0vDBjzQD9mEfwZ0mzJsfzh1Nxe86l/d6h6FhxAqK+eG7ljYBElDhF4l2lgcMAl9TiSba0pcqqYBRsvJgQoAjlZOIeVEvM1lyfWfrmDaFK37jdUCBWq8QeJ98qpNDX4A76f9T5Y3q5EuSFkY0fcU+zwFxM71bGGlgmo5YsMMdSsW+89fSG0652/U4sjf4NTHCpuD0UaSPB876NJ7QzeDWtOgyBC4nhPpS8pgjsnl48QZuVm6FNDqbXr9bVk5BdntpBgps+gXdSL2j0/yRRayLXzps1LCdasMCBxCzK+lJYWGalw5dNaIDHBsEZiK55iwPp0W3lU9vXFO4oKNJGFgbhNmn+KAaW82NBwlTHo/tOlj2/VQD9uaK5YLhQqAJzIq0JuWZWFLUC2FJIIG0pJBIonNabANcN+vq+YJqjd+JXNZyTZ0mzuj3OAB/Z5zS6lT9azPfnEjpcOngFs46P7S/1hRIrSWCvZ8kfECpa8W+cTMus4rpCd40d1tVKzJA/n0MGJjEs2q4cK6lC08pXxq9zAyt7PMl94PHse2uzDFhrhh7d0ManxNZE+I5/IPWOnG1PJsDlOe4Yqw== maltfield@ose" > /home/maltfield/.ssh/authorized_keys chown -R maltfield:maltfield /home/maltfield/.ssh chmod -R 0600 /home/maltfield/.ssh chmod 0700 /home/maltfield/.ssh # without this, apt-get may get stuck export DEBIAN_FRONTEND=noninteractive apt-get update apt-get -y install iptables iptables-persistent apt-get -y purge nftables update-alternatives --set iptables /usr/sbin/iptables-legacy update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy update-alternatives --set arptables /usr/sbin/arptables-legacy update-alternatives --set ebtables /usr/sbin/ebtables-legacy iptables -A INPUT -i lo -j ACCEPT iptables -A INPUT -s 127.0.0.1/32 -d 127.0.0.1/32 -j DROP iptables -A INPUT -p icmp -j ACCEPT iptables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT iptables -A INPUT -p tcp -m state --state NEW -m tcp --dport 32415 -j ACCEPT iptables -A INPUT -j DROP iptables -A OUTPUT -m state --state RELATED,ESTABLISHED -j ACCEPT iptables -A OUTPUT -s 127.0.0.1/32 -d 127.0.0.1/32 -j ACCEPT iptables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT iptables -A OUTPUT -m owner --uid-owner 42 -j ACCEPT iptables -A OUTPUT -m owner 
--uid-owner 1000 -j ACCEPT iptables -A OUTPUT -m limit --limit 5/min -j LOG --log-prefix "iptables denied: " --log-level 7 iptables -A OUTPUT -j DROP ip6tables -A INPUT -i lo -j ACCEPT ip6tables -A INPUT -s ::1/128 -d ::1/128 -j DROP ip6tables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT ip6tables -A INPUT -j DROP ip6tables -A OUTPUT -s ::1/128 -d ::1/128 -j ACCEPT ip6tables -A OUTPUT -m state --state RELATED,ESTABLISHED -j ACCEPT ip6tables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT ip6tables -A OUTPUT -m owner --uid-owner 42 -j ACCEPT ip6tables -A OUTPUT -m owner --uid-owner 1000 -j ACCEPT ip6tables -A OUTPUT -j DROP iptables-save > /etc/iptables/rules.v4 ip6tables-save > /etc/iptables/rules.v6 cp /etc/ssh/sshd_config /etc/ssh/sshd_config.orig.`date "+%Y%m%d_%H%M%S"` grep 'Port 32415' /etc/ssh/sshd_config || echo 'Port 32415' >> /etc/ssh/sshd_config grep 'AllowGroups sshaccess' /etc/ssh/sshd_config || echo 'AllowGroups sshaccess' >> /etc/ssh/sshd_config grep 'PermitRootLogin no' /etc/ssh/sshd_config || echo 'PermitRootLogin no' >> /etc/ssh/sshd_config grep 'PasswordAuthentication no' /etc/ssh/sshd_config || echo 'PasswordAuthentication no' >> /etc/ssh/sshd_config systemctl restart sshd.service apt-get -y upgrade
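- one detail worth noting in the jumpstart above: the `grep ... || echo ... >>` lines make the sshd_config edits idempotent, so re-pasting the whole block won't duplicate config lines. A minimal demo of the pattern (hypothetical `add_line` helper and a temp file standing in for sshd_config):

```shell
# Append a line only if it's not already present -- same idea as the
# `grep 'Port 32415' ... || echo 'Port 32415' >> ...` lines above.
f=$(mktemp)
add_line() { grep -qxF "$1" "$f" || echo "$1" >> "$f"; }
add_line 'Port 32415'
add_line 'Port 32415'   # no-op the second time
wc -l < "$f"            # -> 1
rm -f "$f"
```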
- I added this new entry to my ose VM's /home/user/.ssh/config
Host hetzner3 Hostname 144.76.164.201 Port 32415 ForwardAgent yes User maltfield
- I then gave my user sudo permission
root@mail ~ # cp /etc/sudoers /etc/sudoers.20240731.orig root@mail ~ # root@mail ~ # visudo root@mail ~ # root@mail ~ # diff /etc/sudoers.20240731.orig /etc/sudoers 47a48 > maltfield ALL=(ALL:ALL) NOPASSWD:ALL root@mail ~ #
- alright, basic hardening is done.
- That's all I really wanted to achieve for now. Next I'd like to prepare some ansible playbooks to setup the rest of the basic hardening
- for now, I just want to leave this machine off in the meantime
- I attempted to shut it down again
- I left a ping open for 147 minutes, and I never got a pong back. So I'd say it's off. Great!
- it appears that I just have to trigger a WOL on hetzner robot WUI to turn it back on, which I'll do after I spend some time working on the ansible roles playbooks
Tue July 30, 2024
- Marcin gave me the go-ahead to order a "hetzner3" server and begin provisioning it with Debian in preparation to migrate all our sites from the CentOS7 hetzner2 server to this new server
- This is going to be an enormous project. When I did the hetzner1 -> hetzner2 migration, I inherited both systems (in 2017). For some reason the websites were split across both servers (plus dreamhost too iirc?). but I consolidated everything onto "hetzner2" and canceled "hetzner1" in 2018 https://wiki.opensourceecology.org/index.php?title=OSE_Server&oldid=298909#Assessment_of_Server_Options
- I'll be using ansible to assist in provisioning this server (and hopefully make it easier to provision future servers). Marcin expressed interest in lowering this barrier for others, as well.
- I noticed that 5 years ago I created a repo for OSE's ansible playbooks, but it's empty.
- I just added the LICENSE to this repo, and I plan to use it to publish our ansible roles/playbooks
- First thing I need to do is decide which server to buy from hetzner's dedicated server offerings
- holy crap, not only are their server auctions *much* cheaper per month, they also don't have a one-time setup fee (usually ~$50-$200?)
- I've written pretty extensively in the past about what specs I'd be looking to get in a future OSE Server migration https://wiki.opensourceecology.org/index.php?title=OSE_Server&oldid=298909#OSE_Server_and_Server_Requirements
- In 2018, I said we'd want min 2-4 cores
- In 2018, I said we'd want min 8-16 G RAM
- In 2018, I said we'd want min ~200G disk
- Honestly, I expect that the lowest offerings of a dedicated server in 2024 are probably going to suffice for us, but what I'm mostly concerned-about is the disk.
- even last week when I did the yum updates, I nearly filled the disk just by extracting a copy of our backups. Currently we have two 250G disks in a software RAID-1 (mirror) array. That gives us a useable 197G
- it's also provisioned with all the data on '/'. It would be smart if we setup an LVM
- It's important to me that we double this at-least, but I'll see if there's any deals on 1TB disks or larger
- also what we currently have is a 6 Gb/s SSD, so I don't want to downgrade that by going to a spinning-disk HDD. NVMe might be a welcome upgrade. I/O wait is probably a bottleneck, but not currently one that's causing us agony
- I spent some time reviewing the munin graphs
- load rarely ever touches 3. Most of the time it hovers between 0.2 - 1. So I agree that 4 cores is fine for us now.
- most of these auctions have an Intel Core i7-4770, which is a 4-core + 8-thread proc. That should be fine.
- somehow our varnish hits are way down. They used to average >80%, but currently they're down to 28-44%
- I documented these charts and my findings on a new Hetzner3 page
- I looked through the listings in the server auctions
- I don't want one that's only 32G RAM (few of these are)
- It looks like some have "2 x SSD SATA 250 GB" and some have "2 x SSD M.2 NVMe 512 GB". If we can, let's get the NVMe disks with better io
- there is one with "2 x HDD SATA 2,0 TB Enterprise". More space would be nice, but not at the sacrifice of io
- questions I have for hetzner:
- how many disk slots are there? Can we add more disks in the future?
- by default, do all these systems have RAID-1? Do we have other RAID options?
- oh, actually, there was only one server available for less than 38 EUR/mo that had the 2x 512GB NVME
- I went ahead and ordered it
- I also sent a separate message to hetzner sales asking them for detailed info about the different read & write speeds of their HDD, SSD, and NVME offerings in dedicated servers
- I sent an email to Marcin
Hey Marcin, I just ordered a dedicated server from Hetzner with the following specs: * Intel Core i7-6700 * 2x SSD M.2 NVMe 512 GB * 4x RAM 16384 MB DDR4 * NIC 1 Gbit Intel I219-LM * Location: Germany, FSN1 * Rescue system (English) * 1 x Primary IPv4 While they had plenty of servers available with the i7-6700 and 16G of RAM, they only had one with 2x 512 GB NVMe disks (the others were just "SSD" disks). Those NVMe disks should give us a performance boost, so I snagged it while it was available. I did some reviews of our munin charts to determine our hetzner3 server's needs. For more info, see * https://wiki.opensourceecology.org/wiki/Hetzner3 Please let me know if you have any questions about this server. Thank you, Michael Altfield Senior Technology Advisor PGP Fingerprint: 8A4B 0AF8 162F 3B6A 79B7 70D2 AA3E DF71 60E2 D97B Open Source Ecology www.opensourceecology.org
Fri July 26, 2024
- I started the CHG-2024-07-26_yum_update today at 11:00
- pre-state proof shows we have lots of outdated system packages, as expected
[root@opensourceecology ~]# yum list updates ... xz-libs.x86_64 5.2.2-2.el7_9 updates yum.noarch 3.4.3-168.el7.centos base yum-cron.noarch 3.4.3-168.el7.centos base yum-plugin-fastestmirror.noarch 1.1.31-54.el7_8 base yum-utils.noarch 1.1.31-54.el7_8 base zlib.x86_64 1.2.7-21.el7_9 updates [root@opensourceecology ~]#
- I tried to check the backups log, but it was empty :/
[root@opensourceecology ~]# cat /var/log/backups/backup.log [root@opensourceecology ~]#
- ok, looks like it rotated already; this file shows a 20.424G backup file successfully uploaded to backblaze with rclone
[root@opensourceecology ~]# ls /var/log/backups/ backup.lo backup.log-20240628.gz backup.log-20240714.gz backup.log backup.log-20240629.gz backup.log-20240715.gz backup.log-20240615.gz backup.log-20240701.gz backup.log-20240716.gz backup.log-20240616.gz backup.log-20240702.gz backup.log-20240718.gz backup.log-20240617.gz backup.log-20240704.gz backup.log-20240719.gz backup.log-20240619.gz backup.log-20240706.gz backup.log-20240721.gz backup.log-20240621.gz backup.log-20240707.gz backup.log-20240722.gz backup.log-20240622.gz backup.log-20240708.gz backup.log-20240724.gz backup.log-20240623.gz backup.log-20240709.gz backup.log-20240725.gz backup.log-20240625.gz backup.log-20240711.gz backup.log-20240726 backup.log-20240626.gz backup.log-20240712.gz backup.log-20240627.gz backup.log-20240713.gz [root@opensourceecology ~]# [root@opensourceecology ~]# tail -n20 /var/log/backups/backup.log-20240726 * daily_hetzner2_20240726_072001.tar.gpg:100% /20.424G, 2.935M/s, - 2024/07/26 09:50:31 INFO : daily_hetzner2_20240726_072001.tar.gpg: Copied (new) 2024/07/26 09:50:31 INFO : Transferred: 20.424G / 20.424 GBytes, 100%, 2.979 MBytes/s, ETA 0s Transferred: 1 / 1, 100% Elapsed time: 1h57m0.8s real 117m1.219s user 4m20.240s sys 2m9.432s + echo ================================================================================ ================================================================================ ++ date -u +%Y%m%d_%H%M%S + echo 'INFO: Finished Backup Run at 20240726_095031' INFO: Finished Backup Run at 20240726_095031 + echo ================================================================================ ================================================================================ + exit 0 [root@opensourceecology ~]#
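- the rclone summary is internally consistent; recomputing the average throughput from the size and elapsed time reported in the log above:

```shell
# 20.424 GiB transferred in 1h57m0.8s (= 7020.8s), per the rclone log:
awk 'BEGIN { printf "%.3f\n", 20.424 * 1024 / (117*60 + 0.8) }'   # -> 2.979 (MiB/s)
```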
- the query of b2 backup files also looks good
[root@opensourceecology ~]# sudo -u b2user /home/b2user/virtualenv/bin/b2 ls ose-server-backups | grep `date "+%Y%m%d"` daily_hetzner2_20240726_072001.tar.gpg [root@opensourceecology ~]# date -u Fri Jul 26 16:03:55 UTC 2024 [root@opensourceecology ~]# sudo -u b2user /home/b2user/virtualenv/bin/b2 ls ose-server-backups daily_hetzner2_20240724_072001.tar.gpg daily_hetzner2_20240725_072001.tar.gpg daily_hetzner2_20240726_072001.tar.gpg monthly_hetzner2_20230801_072001.tar.gpg monthly_hetzner2_20230901_072001.tar.gpg monthly_hetzner2_20231001_072001.tar.gpg monthly_hetzner2_20231101_072001.tar.gpg monthly_hetzner2_20231201_072001.tar.gpg monthly_hetzner2_20240201_072001.tar.gpg monthly_hetzner2_20240301_072001.tar.gpg monthly_hetzner2_20240401_072001.tar.gpg monthly_hetzner2_20240501_072001.tar.gpg monthly_hetzner2_20240601_072001.tar.gpg monthly_hetzner2_20240701_072001.tar.gpg weekly_hetzner2_20240708_072001.tar.gpg weekly_hetzner2_20240715_072001.tar.gpg weekly_hetzner2_20240722_072001.tar.gpg yearly_hetzner2_20190101_111520.tar.gpg yearly_hetzner2_20200101_072001.tar.gpg yearly_hetzner2_20210101_072001.tar.gpg yearly_hetzner2_20230101_072001.tar.gpg yearly_hetzner2_20240101_072001.tar.gpg [root@opensourceecology ~]#
- that backup is already 8 hours old; so let's bring down the webserver + stop the databases and take a real fresh backup before we do anything
- stopped nginx
[root@opensourceecology ~]# # create dir for logging the change [root@opensourceecology ~]# tmpDir="/var/tmp/CHG-2024-07-26_yum_update" [root@opensourceecology ~]# mkdir -p $tmpDir [root@opensourceecology ~]# [root@opensourceecology ~]# # begin to gracefully shutdown nginx in the background [root@opensourceecology ~]# time nice /sbin/nginx -s quit nginx: [warn] the "ssl" directive is deprecated, use the "listen ... ssl" directive instead in /etc/nginx/conf.d/ssl.openbuildinginstitute.org.include:11 nginx: [warn] the "ssl" directive is deprecated, use the "listen ... ssl" directive instead in /etc/nginx/conf.d/ssl.opensourceecology.org.include:11 nginx: [warn] the "ssl" directive is deprecated, use the "listen ... ssl" directive instead in /etc/nginx/conf.d/ssl.opensourceecology.org.include:11 nginx: [warn] the "ssl" directive is deprecated, use the "listen ... ssl" directive instead in /etc/nginx/conf.d/ssl.opensourceecology.org.include:11 nginx: [warn] the "ssl" directive is deprecated, use the "listen ... ssl" directive instead in /etc/nginx/conf.d/ssl.opensourceecology.org.include:11 nginx: [warn] the "ssl" directive is deprecated, use the "listen ... ssl" directive instead in /etc/nginx/conf.d/ssl.opensourceecology.org.include:11 nginx: [warn] the "ssl" directive is deprecated, use the "listen ... ssl" directive instead in /etc/nginx/conf.d/ssl.opensourceecology.org.include:11 nginx: [warn] the "ssl" directive is deprecated, use the "listen ... ssl" directive instead in /etc/nginx/conf.d/ssl.opensourceecology.org.include:11 nginx: [warn] the "ssl" directive is deprecated, use the "listen ... ssl" directive instead in /etc/nginx/conf.d/ssl.openbuildinginstitute.org.include:11 nginx: [warn] the "ssl" directive is deprecated, use the "listen ... ssl" directive instead in /etc/nginx/conf.d/ssl.opensourceecology.org.include:11 nginx: [warn] the "ssl" directive is deprecated, use the "listen ... 
ssl" directive instead in /etc/nginx/conf.d/ssl.opensourceecology.org.include:11 nginx: [warn] the "ssl" directive is deprecated, use the "listen ... ssl" directive instead in /etc/nginx/conf.d/ssl.opensourceecology.org.include:11 nginx: [warn] the "ssl" directive is deprecated, use the "listen ... ssl" directive instead in /etc/nginx/conf.d/ssl.opensourceecology.org.include:11 nginx: [warn] the "ssl" directive is deprecated, use the "listen ... ssl" directive instead in /etc/nginx/conf.d/ssl.openbuildinginstitute.org.include:11 nginx: [warn] the "ssl" directive is deprecated, use the "listen ... ssl" directive instead in /etc/nginx/conf.d/ssl.openbuildinginstitute.org.include:11 nginx: [warn] the "ssl" directive is deprecated, use the "listen ... ssl" directive instead in /etc/nginx/conf.d/ssl.opensourceecology.org.include:11 nginx: [warn] the "ssl" directive is deprecated, use the "listen ... ssl" directive instead in /etc/nginx/conf.d/ssl.opensourceecology.org.include:11 real 0m0.078s user 0m0.038s sys 0m0.021s [root@opensourceecology ~]# [root@opensourceecology ~]# date -u Fri Jul 26 16:06:37 UTC 2024 [root@opensourceecology ~]#
- stopped DBs
[root@opensourceecology ~]# systemctl status mariadb ● mariadb.service - MariaDB database server Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled) Active: active (running) since Mon 2024-07-22 18:55:28 UTC; 3 days ago Process: 1230 ExecStartPost=/usr/libexec/mariadb-wait-ready $MAINPID (code=exited, status=0/SUCCESS) Process: 1099 ExecStartPre=/usr/libexec/mariadb-prepare-db-dir %n (code=exited, status=0/SUCCESS) Main PID: 1229 (mysqld_safe) CGroup: /system.slice/mariadb.service ├─1229 /bin/sh /usr/bin/mysqld_safe --basedir=/usr └─1704 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql ... Jul 22 18:55:25 opensourceecology.org systemd[1]: Starting MariaDB database s.... Jul 22 18:55:26 opensourceecology.org mariadb-prepare-db-dir[1099]: Database M... Jul 22 18:55:26 opensourceecology.org mariadb-prepare-db-dir[1099]: If this is... Jul 22 18:55:26 opensourceecology.org mysqld_safe[1229]: 240722 18:55:26 mysql... Jul 22 18:55:26 opensourceecology.org mysqld_safe[1229]: 240722 18:55:26 mysql... Jul 22 18:55:28 opensourceecology.org systemd[1]: Started MariaDB database se.... Hint: Some lines were ellipsized, use -l to show in full. [root@opensourceecology ~]# systemctl stop mariadb [root@opensourceecology ~]# systemctl status mariadb ● mariadb.service - MariaDB database server Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled) Active: inactive (dead) since Fri 2024-07-26 16:07:43 UTC; 3s ago Process: 1230 ExecStartPost=/usr/libexec/mariadb-wait-ready $MAINPID (code=exited, status=0/SUCCESS) Process: 1229 ExecStart=/usr/bin/mysqld_safe --basedir=/usr (code=exited, status=0/SUCCESS) Process: 1099 ExecStartPre=/usr/libexec/mariadb-prepare-db-dir %n (code=exited, status=0/SUCCESS) Main PID: 1229 (code=exited, status=0/SUCCESS) Jul 22 18:55:25 opensourceecology.org systemd[1]: Starting MariaDB database s.... 
Jul 22 18:55:26 opensourceecology.org mariadb-prepare-db-dir[1099]: Database M... Jul 22 18:55:26 opensourceecology.org mariadb-prepare-db-dir[1099]: If this is... Jul 22 18:55:26 opensourceecology.org mysqld_safe[1229]: 240722 18:55:26 mysql... Jul 22 18:55:26 opensourceecology.org mysqld_safe[1229]: 240722 18:55:26 mysql... Jul 22 18:55:28 opensourceecology.org systemd[1]: Started MariaDB database se.... Jul 26 16:07:40 opensourceecology.org systemd[1]: Stopping MariaDB database s.... Jul 26 16:07:43 opensourceecology.org systemd[1]: Stopped MariaDB database se.... Hint: Some lines were ellipsized, use -l to show in full. [root@opensourceecology ~]#
- the backup is taking a long time. While I wait, I checked `top`, and I see `gzip` is using 80%-100% of a single CPU core
- so it seems that gzip is bound to a single core. compression could go much faster if the work could be split across multiple cores (parallel processing)
- quick googling while I wait suggests that we could use `pigz` as a replacement to `gzip` to get this (admittedly low priority) performance boost https://stackoverflow.com/questions/12313242/utilizing-multi-core-for-targzip-bzip-compression-decompression
- there are other options too. apparently xz has native multi-threaded support since v5.2.0 https://askubuntu.com/a/858828
- there's also pbzip2 for bzip2, and many others https://askubuntu.com/a/258228
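- these parallel compressors are drop-in replacements for gzip; tar can invoke them via `--use-compress-program`. A minimal sketch (demo paths only, falling back to plain gzip when pigz isn't installed):

```shell
# Prefer pigz (parallel gzip) when available; otherwise fall back to gzip.
COMPRESS="$(command -v pigz || command -v gzip)"

# Create some demo data and compress it with whichever tool was found.
mkdir -p /tmp/pigz_demo && echo 'hello' > /tmp/pigz_demo/file.txt
tar --use-compress-program="$COMPRESS" -cf /tmp/pigz_demo.tar.gz -C /tmp pigz_demo

# Round-trip: extract into a separate dir to confirm the archive is valid.
mkdir -p /tmp/pigz_out
tar --use-compress-program="$COMPRESS" -xf /tmp/pigz_demo.tar.gz -C /tmp/pigz_out
cat /tmp/pigz_out/pigz_demo/file.txt
```

the resulting .tar.gz is format-compatible with gzip either way, so a restore host without pigz can still extract it.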
- the other two commands that get stuck on one core are `tar` and `gpg2`
- it looks like gpg also attempts to compress the data itself. That gives us no benefit in our case because we're encrypting a tarball that just contains a bunch of already-compressed tarballs. So we could probably get some performance improvement by telling gpg to skip its compression with `--compress-algo none` https://stackoverflow.com/questions/46261024/how-to-do-large-file-parallel-encryption-using-gnupg-and-gnu-parallel
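- a hedged round-trip sketch of symmetric encryption with gpg's internal compression disabled (`--compress-algo none`, equivalently `-z 0`). The passphrase file and paths here are stand-ins, not our real /root/backups key:

```shell
# Stand-in passphrase + plaintext (the real key lives in /root/backups/).
echo 'demo-passphrase' > /tmp/demo.key
echo 'already-compressed payload' > /tmp/demo.tar

# Encrypt symmetrically, skipping gpg's internal compression pass.
gpg --batch --yes --pinentry-mode loopback \
    --passphrase-file /tmp/demo.key \
    --symmetric --compress-algo none \
    -o /tmp/demo.tar.gpg /tmp/demo.tar

# Round-trip to confirm it still decrypts cleanly.
gpg --batch --yes --pinentry-mode loopback \
    --passphrase-file /tmp/demo.key \
    --decrypt /tmp/demo.tar.gpg > /tmp/demo.out 2>/dev/null
cat /tmp/demo.out
```

note the `--pinentry-mode loopback` flag: gpg 2.x generally needs it for `--passphrase-file` to work non-interactively in `--batch` mode.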
- finally (after ~30 min to generate the encrypted backup file), rclone is using >100% of CPU to upload it, so that's good. Our script does limit the upload to 3 MB/s, though; one improvement would be an argument to bypass that throttle when running interactively
- it said the upload was going to take just under 2 hours, so I canceled it and manually ran the upload command (minus the throttle)
- upload speeds are now ~27-32 MB/s (so ~10x faster). It says it'll finish in just over 10 minutes.
- upload is done
[root@opensourceecology ~]# time sudo /bin/nice /root/backups/backup.sh &>> /var/log/backups/backup.log ^C real 33m47.250s user 23m56.551s sys 2m2.866s [root@opensourceecology ~]# [root@opensourceecology ~]# /bin/sudo -u b2user /bin/rclone -v copy /home/b2user/sync/daily_hetzner2_20240726_160837.tar.gpg b2:ose-server-backups ... 2024/07/26 16:56:38 INFO : Transferred: 18.440G / 19.206 GBytes, 96%, 22.492 MBytes/s, ETA 34s Transferred: 0 / 1, 0% Elapsed time: 14m0.5s Transferring: * daily_hetzner2_20240726_160837.tar.gpg: 96% /19.206G, 21.268M/s, 36s 2024/07/26 16:57:36 INFO : daily_hetzner2_20240726_160837.tar.gpg: Copied (new) 2024/07/26 16:57:36 INFO : Transferred: 19.206G / 19.206 GBytes, 100%, 21.910 MBytes/s, ETA 0s Transferred: 1 / 1, 100% Elapsed time: 14m58.6s [root@opensourceecology ~]#
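- the numbers check out: ~19.2 GB at the scripted 3 MB/s cap is just under two hours, while at the ~22 MB/s average rclone reported it's about 15 minutes. Quick back-of-the-envelope check:

```shell
size_mb=19206                                   # ~19.206 GBytes expressed in MBytes
echo "$(( size_mb / 3 / 60 )) min at 3 MB/s"    # ~106 min, just under 2 hours
echo "$(( size_mb / 22 / 60 )) min at 22 MB/s"  # ~14 min, matching the 14m58s elapsed
```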
- ok, this very durable backup is uploaded; let's proceed
[root@opensourceecology ~]# sudo -u b2user /home/b2user/virtualenv/bin/b2 ls ose-server-backups | grep `date "+%Y%m%d"` daily_hetzner2_20240726_072001.tar.gpg daily_hetzner2_20240726_160837.tar.gpg [root@opensourceecology ~]# date -u Fri Jul 26 16:58:11 UTC 2024 [root@opensourceecology ~]# sudo -u b2user /home/b2user/virtualenv/bin/b2 ls ose-server-backups daily_hetzner2_20240724_072001.tar.gpg daily_hetzner2_20240725_072001.tar.gpg daily_hetzner2_20240726_072001.tar.gpg daily_hetzner2_20240726_160837.tar.gpg monthly_hetzner2_20230801_072001.tar.gpg monthly_hetzner2_20230901_072001.tar.gpg monthly_hetzner2_20231001_072001.tar.gpg monthly_hetzner2_20231101_072001.tar.gpg monthly_hetzner2_20231201_072001.tar.gpg monthly_hetzner2_20240201_072001.tar.gpg monthly_hetzner2_20240301_072001.tar.gpg monthly_hetzner2_20240401_072001.tar.gpg monthly_hetzner2_20240501_072001.tar.gpg monthly_hetzner2_20240601_072001.tar.gpg monthly_hetzner2_20240701_072001.tar.gpg weekly_hetzner2_20240708_072001.tar.gpg weekly_hetzner2_20240715_072001.tar.gpg weekly_hetzner2_20240722_072001.tar.gpg yearly_hetzner2_20190101_111520.tar.gpg yearly_hetzner2_20200101_072001.tar.gpg yearly_hetzner2_20210101_072001.tar.gpg yearly_hetzner2_20230101_072001.tar.gpg yearly_hetzner2_20240101_072001.tar.gpg [root@opensourceecology ~]#
- we have a snapshot of the current state of packages
[root@opensourceecology ~]# time nice rpm -qa &> "${tmpDir}/before.log" real 0m0.716s user 0m0.678s sys 0m0.037s [root@opensourceecology ~]# [root@opensourceecology ~]# echo $tmpDir /var/tmp/CHG-2024-07-26_yum_update [root@opensourceecology ~]# [root@opensourceecology ~]# tail /var/tmp/CHG-2024-07-26_yum_update/before.log libdb-utils-5.3.21-25.el7.x86_64 libuser-0.60-9.el7.x86_64 python-lxml-3.2.1-4.el7.x86_64 net-snmp-agent-libs-5.7.2-48.el7_8.x86_64 epel-release-7-14.noarch perl-parent-0.225-244.el7.noarch libstdc++-devel-4.8.5-39.el7.x86_64 libsodium13-1.0.5-1.el7.x86_64 ncurses-5.9-14.20130511.el7_4.x86_64 e2fsprogs-libs-1.42.9-17.el7.x86_64 [root@opensourceecology ~]#
- I kicked-off the updates. I got a bit of a fright at first when we got "404 Not Found" errors from 484 mirrors, but eventually `yum` found a working server. I'm glad we did the updates now, before all the mirrors shut down (CentOS 7 went EOL some years ago, and stopped getting maintenance updates entirely a few weeks ago)
[root@opensourceecology ~]# grep "Error 404" /var/tmp/CHG-2024-07-26_yum_update/update.log | wc -l 484 [root@opensourceecology ~]# [root@opensourceecology ~]# cat /etc/centos-release CentOS Linux release 7.9.2009 (Core) [root@opensourceecology ~]#
- actually, it says it's updating 434 packages total. So I guess some dependencies got added to the 200-odd count from before
- ok, the update command finished in just under 4 minutes of wall time
... real 3m56.410s user 2m1.833s sys 0m44.510s [root@opensourceecology ~]#
- post update info
[root@opensourceecology ~]# # log the post-state packages and versions [root@opensourceecology ~]# time nice rpm -qa &> "${tmpDir}/after.log" real 0m0.805s user 0m0.769s sys 0m0.036s [root@opensourceecology ~]# [root@opensourceecology ~]# time nice needs-restarting &> "${tmpDir}/needs-restarting.log" real 0m8.156s user 0m6.956s sys 0m0.652s [root@opensourceecology ~]# time nice needs-restarting -r &> "${tmpDir}/needs-reboot.log" real 0m0.155s user 0m0.104s sys 0m0.051s [root@opensourceecology ~]# [root@opensourceecology ~]# cat /var/tmp/CHG-2024-07-26_yum_update/needs-reboot.log Core libraries or services have been updated: systemd -> 219-78.el7_9.9 dbus -> 1:1.10.24-15.el7 openssl-libs -> 1:1.0.2k-26.el7_9 linux-firmware -> 20200421-83.git78c0348.el7_9 kernel -> 3.10.0-1160.119.1.el7 glibc -> 2.17-326.el7_9.3 Reboot is required to ensure that your system benefits from these updates. More information: https://access.redhat.com/solutions/27943 [root@opensourceecology ~]# [root@opensourceecology ~]# cat /var/tmp/CHG-2024-07-26_yum_update/needs-restarting.log 30842 : /usr/lib/systemd/systemd-udevd 13696 : sshd: maltfield@pts/0 27401 : /bin/bash 744 : /sbin/auditd 19086 : /bin/bash 13692 : sshd: maltfield [priv] 30672 : smtpd -n smtp -t inet -u 13699 : -bash 18035 : su - 27436 : less /root/backups/backup.sh 18036 : -bash 18030 : sudo su - 1484 : /var/ossec/bin/ossec-analysisd 24493 : /bin/bash 21581 : su - 21580 : sudo su - 21582 : -bash 797 : /usr/lib/systemd/systemd-logind 24476 : /bin/bash 1830 : qmgr -l -t unix -u 30673 : proxymap -t unix -u 19119 : sudo su - 24511 : /bin/bash 29833 : local -t unix 27417 : sudo su - 19130 : -bash 1 : /usr/lib/systemd/systemd --system --deserialize 23 29830 : cleanup -z -t unix -u 1500 : /var/ossec/bin/ossec-logcollector 24475 : SCREEN -S upgrade 2150 : /usr/sbin/varnishd -P /var/run/varnish.pid -f /etc/varnish/default.vcl -a 127.0.0.1:6081 -T 127.0.0.1:6082 -S /etc/varnish/secret -u varnish -g varnish -s malloc,40G 2152 : 
/usr/sbin/varnishd -P /var/run/varnish.pid -f /etc/varnish/default.vcl -a 127.0.0.1:6081 -T 127.0.0.1:6082 -S /etc/varnish/secret -u varnish -g varnish -s malloc,40G 29835 : bounce -z -t unix -u 775 : /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation 27419 : -bash 585 : /usr/lib/systemd/systemd-journald 771 : /usr/sbin/irqbalance --foreground 770 : /usr/sbin/acpid 1170 : /sbin/agetty --noclear tty1 linux 30690 : smtp -t unix -u 778 : /usr/sbin/chronyd 8695 : gpg-agent --daemon --use-standard-socket 24529 : /bin/bash 2121 : /var/ossec/bin/ossec-syscheckd 1806 : /usr/libexec/postfix/master -w 19129 : su - 19065 : /bin/bash 2124 : /var/ossec/bin/ossec-monitord 29832 : trivial-rewrite -n rewrite -t unix -u 19044 : /bin/bash 30693 : smtp -t unix -u 30692 : smtp -t unix -u 30691 : cleanup -z -t unix -u 27418 : su - 1475 : /var/ossec/bin/ossec-execd 19025 : /bin/bash 19024 : SCREEN -S CHG-2024-07-26_yum_update 1458 : /var/ossec/bin/ossec-maild 19023 : screen -S CHG-2024-07-26_yum_update [root@opensourceecology ~]#
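- with before.log and after.log both captured, the set of updated packages can be listed by diffing the sorted snapshots. A sketch with stand-in data (on the server the files come from `rpm -qa`):

```shell
mkdir -p /tmp/chg_demo && cd /tmp/chg_demo

# Stand-in snapshots; on the server these are the `rpm -qa` dumps.
printf 'pkg-a-1.0\npkg-b-2.0\n' > before.log
printf 'pkg-a-1.1\npkg-b-2.0\n' > after.log

# Lines unique to after.log are the new/updated package versions.
comm -13 <(sort before.log) <(sort after.log)
```

`comm -13` suppresses lines unique to the first file and lines common to both, leaving only what changed.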
- alright, time to reboot
[root@opensourceecology ~]# reboot Connection to opensourceecology.org closed by remote host. Connection to opensourceecology.org closed. user@ose:~$
- system came back in about 1 minute
- first attempt to load the wiki resulted in a 503 "Error 503 Backend fetch failed" from varnish
- it's not just varnish warming up; apache didn't come up on boot
[root@opensourceecology ~]# systemctl status httpd ● httpd.service - The Apache HTTP Server Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled) Active: failed (Result: exit-code) since Fri 2024-07-26 17:09:47 UTC; 2min 7s ago Docs: man:httpd(8) man:apachectl(8) Process: 1094 ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND (code=exited, status=1/FAILURE) Main PID: 1094 (code=exited, status=1/FAILURE) Jul 26 17:09:47 opensourceecology.org systemd[1]: Starting The Apache HTTP Se.... Jul 26 17:09:47 opensourceecology.org httpd[1094]: (98)Address already in use:... Jul 26 17:09:47 opensourceecology.org httpd[1094]: (98)Address already in use:... Jul 26 17:09:47 opensourceecology.org httpd[1094]: no listening sockets availa... Jul 26 17:09:47 opensourceecology.org httpd[1094]: AH00015: Unable to open logs Jul 26 17:09:47 opensourceecology.org systemd[1]: httpd.service: main process...E Jul 26 17:09:47 opensourceecology.org systemd[1]: Failed to start The Apache .... Jul 26 17:09:47 opensourceecology.org systemd[1]: Unit httpd.service entered .... Jul 26 17:09:47 opensourceecology.org systemd[1]: httpd.service failed. Hint: Some lines were ellipsized, use -l to show in full. [root@opensourceecology ~]#
- it says that port 443 is already in use
[root@opensourceecology ~]# journalctl -u httpd --no-pager -- Logs begin at Fri 2024-07-26 17:09:34 UTC, end at Fri 2024-07-26 17:15:26 UTC. -- Jul 26 17:09:47 opensourceecology.org systemd[1]: Starting The Apache HTTP Server... Jul 26 17:09:47 opensourceecology.org httpd[1094]: (98)Address already in use: AH00072: make_sock: could not bind to address [::]:443 Jul 26 17:09:47 opensourceecology.org httpd[1094]: (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:443 Jul 26 17:09:47 opensourceecology.org httpd[1094]: no listening sockets available, shutting down Jul 26 17:09:47 opensourceecology.org httpd[1094]: AH00015: Unable to open logs Jul 26 17:09:47 opensourceecology.org systemd[1]: httpd.service: main process exited, code=exited, status=1/FAILURE Jul 26 17:09:47 opensourceecology.org systemd[1]: Failed to start The Apache HTTP Server. Jul 26 17:09:47 opensourceecology.org systemd[1]: Unit httpd.service entered failed state. Jul 26 17:09:47 opensourceecology.org systemd[1]: httpd.service failed. [root@opensourceecology ~]#
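- for future reference, `ss` can identify what's holding a port when httpd reports `make_sock: could not bind`. A sketch using a throwaway listener on an arbitrary high port (8099) instead of the real 443:

```shell
# Start a throwaway listener to stand in for whatever is holding the port.
python3 -m http.server 8099 --bind 127.0.0.1 >/dev/null 2>&1 &
srv=$!
sleep 1

# Record which socket is bound to the port; with -p (and sufficient
# privileges) ss also names the owning process.
ss -tln  | grep ':8099' > /tmp/port_demo.txt || true
ss -tlnp | grep ':8099' || true

kill "$srv"
cat /tmp/port_demo.txt
```

on the real server, `ss -tlnp | grep ':443'` would have shown immediately which daemon beat httpd to the socket.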
- before I start making changes, I'm going to initiate another backup (and wait at least 30 minutes for the tar to finish)
- I'm going to want to diff the apache configs, so I copied the backup that I made just before the updates into the temp CHG dir
[root@opensourceecology CHG-2024-07-26_yum_update]# mkdir backup_before [root@opensourceecology CHG-2024-07-26_yum_update]# rsync -av --progress /home/b2user/sync.old/daily_hetzner2_20240726_160837.tar.gpg backup_before/ sending incremental file list daily_hetzner2_20240726_160837.tar.gpg 20,622,312,871 100% 127.14MB/s 0:02:34 (xfr#1, to-chk=0/1) sent 20,627,347,744 bytes received 35 bytes 133,510,341.61 bytes/sec total size is 20,622,312,871 speedup is 1.00 [root@opensourceecology CHG-2024-07-26_yum_update]#
- well, unfortunately the wiki being down means I can't reference our docs on how to restore backups, but I managed to figure it out
[root@opensourceecology backup_before]# gpg --batch --passphrase-file /root/backups/ose-backups-cron.key --decrypt daily_hetzner2_20240726_160837.tar.gpg > daily_hetzner2_20240726_160837.tar gpg: AES256 encrypted data gpg: encrypted with 1 passphrase [root@opensourceecology backup_before]# [root@opensourceecology backup_before]# du -sh * 20G daily_hetzner2_20240726_160837.tar 20G daily_hetzner2_20240726_160837.tar.gpg [root@opensourceecology backup_before]# [root@opensourceecology backup_before]# du -sh * 20G daily_hetzner2_20240726_160837.tar 20G daily_hetzner2_20240726_160837.tar.gpg [root@opensourceecology backup_before]# [root@opensourceecology backup_before]# tar -xf daily_hetzner2_20240726_160837.tar [root@opensourceecology backup_before]# [root@opensourceecology backup_before]# du -sh * 20G daily_hetzner2_20240726_160837.tar 20G daily_hetzner2_20240726_160837.tar.gpg 20G root [root@opensourceecology backup_before]# rm -f daily_hetzner2_20240726_160837.tar.gpg [root@opensourceecology backup_before]# [root@opensourceecology backup_before]# du -sh * 20G daily_hetzner2_20240726_160837.tar 20G root [root@opensourceecology backup_before]#
- to make this easier for the next person, I created a README directly in the backups dir
[root@opensourceecology backups]# cat /root/backups/README.txt 2024-07-26 ========== The process to restore from backups is documented on the wiki * https://wiki.opensourceecology.org/wiki/Backblaze#Restore_from_backups Oh, the wiki is down and you need to restore from backups to restore the wiki? Don't worry, I got you. All backups are stored on Backblaze B2. You can download them with rclone or just by logging into the Backblaze B2 WUI. First decrypt the main wrapper tar with `gpg` gpg --batch --passphrase-file <path-to-symmetric-encrypton-private-key> --decrypt <path-to-encrypted-tarball> > <path-to-decrypted-tarball> For example: gpg --batch --passphrase-file /root/backups/ose-backups-cron.key --decrypt daily_hetzner2_20240726_160837.tar.gpg > daily_hetzner2_20240726_160837.tar Then you can untar the wrapper tarball and the compressed tarball inside of that. For example: tar -xf daily_hetzner2_20240726_160837.tar cd root/backups/sync/daily_hetzner2_20240726_160837/www/ tar -xf www.20240726_160837.tar.gz head var/www/html/www.opensourceecology.org/htdocs/index.php --Michael Altfield <https://michaelaltfield.net.> [root@opensourceecology backups]#
- and I was able to extract the www files from the backups prior to the update
[root@opensourceecology backup_before]# cd root/backups/sync/daily_hetzner2_20240726_160837/www/ [root@opensourceecology www]# [root@opensourceecology www]# ls www.20240726_160837.tar.gz [root@opensourceecology www]# [root@opensourceecology www]# tar -xf www.20240726_160837.tar.gz [root@opensourceecology www]# [root@opensourceecology www]# du -sh * 32G var 19G www.20240726_160837.tar.gz [root@opensourceecology www]#
- oh, actually I want the /etc/ config files
[root@opensourceecology www]# cd ../etc [root@opensourceecology etc]# [root@opensourceecology etc]# tar -xf etc.20240726_160837.tar.gz [root@opensourceecology etc]# [root@opensourceecology etc]# du -sh * 46M etc 13M etc.20240726_160837.tar.gz [root@opensourceecology etc]#
- a diff of the pre-update configs and the current configs shows 4x new files
[root@opensourceecology etc]# diff -ril etc/httpd /etc/httpd diff: etc/httpd/logs: No such file or directory diff: etc/httpd/modules: No such file or directory Only in /etc/httpd/conf.d: autoindex.conf Only in /etc/httpd/conf.d: ssl.conf Only in /etc/httpd/conf.d: userdir.conf Only in /etc/httpd/conf.d: welcome.conf [root@opensourceecology etc]#
- I just moved these 4x files out (into our tmp change dir), and tried a restart; it came up. The likely culprit is the freshly-installed mod_ssl ssl.conf, which ships a `Listen 443` directive that conflicts with nginx already bound to port 443
[root@opensourceecology CHG-2024-07-26_yum_update]# mkdir moved_from_etc_httpd [root@opensourceecology CHG-2024-07-26_yum_update]# mv /etc/httpd/conf.d/autoindex.conf moved_from_etc_httpd/ [root@opensourceecology CHG-2024-07-26_yum_update]# mv /etc/httpd/conf.d/ssl.conf moved_from_etc_httpd/ [root@opensourceecology CHG-2024-07-26_yum_update]# mv /etc/httpd/conf.d/userdir.conf moved_from_etc_httpd/ [root@opensourceecology CHG-2024-07-26_yum_update]# mv /etc/httpd/conf.d/welcome.conf moved_from_etc_httpd/ [root@opensourceecology CHG-2024-07-26_yum_update]# [root@opensourceecology CHG-2024-07-26_yum_update]# systemctl restart httpd [root@opensourceecology CHG-2024-07-26_yum_update]# [root@opensourceecology CHG-2024-07-26_yum_update]# systemctl status httpd ● httpd.service - The Apache HTTP Server Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled) Active: active (running) since Fri 2024-07-26 17:59:36 UTC; 4s ago Docs: man:httpd(8) man:apachectl(8) Main PID: 15908 (httpd) Status: "Processing requests..." CGroup: /system.slice/httpd.service ├─15908 /usr/sbin/httpd -DFOREGROUND ├─15910 /usr/sbin/httpd -DFOREGROUND ├─15911 /usr/sbin/httpd -DFOREGROUND ├─15912 /usr/sbin/httpd -DFOREGROUND ├─15913 /usr/sbin/httpd -DFOREGROUND ├─15914 /usr/sbin/httpd -DFOREGROUND ├─15921 /usr/sbin/httpd -DFOREGROUND ├─15927 /usr/sbin/httpd -DFOREGROUND ├─15928 /usr/sbin/httpd -DFOREGROUND ├─15936 /usr/sbin/httpd -DFOREGROUND ├─15937 /usr/sbin/httpd -DFOREGROUND ├─15938 /usr/sbin/httpd -DFOREGROUND └─15939 /usr/sbin/httpd -DFOREGROUND Jul 26 17:59:36 opensourceecology.org systemd[1]: Starting The Apache HTTP Se.... Jul 26 17:59:36 opensourceecology.org systemd[1]: Started The Apache HTTP Server. Hint: Some lines were ellipsized, use -l to show in full. [root@opensourceecology CHG-2024-07-26_yum_update]#
- I was able to load and edit the wiki; I spent some time adding some updates to the CHG article https://wiki.opensourceecology.org/wiki/CHG-2020-05-04_yum_update
- for some reason my browser keeps locking-up when all I'm trying to do is edit the text in the textarea for ^ this wiki article. I don't use the wysiwyg editor; I'm literally just editing text in a textarea, which shouldn't require much processing
- It took me ~20 minutes just to make a few changes to one wiki article because the page on firefox kept locking-up, sometimes displaying a spinning circle over the page
- I launched a new DispVM with *only* firefox running and *only* one tab open in firefox. The issue persisted, and the VM with the (idle) firefox on the edit page was taxed with 20-60% CPU usage; something is definitely wrong, but it's unclear if the bug is on our mediawiki server, my firefox client, or both
- anyway, I'm continuing with the validation steps
- I was successfully able to load the frontpage of all the 9x websites
- the logo at the top (and bottom) of https://oswh.opensourceecology.org/ was missing, but I'm not sure if that was the case before the updates or not
- I simply get a 404 on the image http://www.opensourcewarehouse.org/wp-content/uploads/2013/02/headfooter-logonew.png
- I guess the domain is wrong; we don't appear to use opensourcewarehouse.org anymore, so this was presumably an issue that predates our updates
- everything else looked good
- I logged into munin. It loads fine
- I do see some gaps in the mysql charts where everything drops to 0 for a few hours, which I guess is why Marcin did reboots again. Investigating this isn't my job right now, but I'm making a note of it here
- otherwise munin is working; validated.
- I logged into awstats. It loads fine
- I just quickly scanned the main pages for www.opensourceecology.org and wiki.opensourceecology.org; they look fine
- I already tested edits on wiki.opensourceecology.org; they're working (setting aside the client-side lag)
- I was successfully able to make a trivial change to the main wordpress site, and then revert that change https://www.opensourceecology.org/offline-wiki-zim-kiwix/
- the only thing left is the backups, which have been running in the background since shortly after the reboot
- the backups finished being created successfully
- the backups are currently being uploaded at the rate-limited 3 MB/s. they're at 39% now, and estimated to finish uploading 1h10m from now
- the upload is the last step; that's good enough for me to consider the backups functional
- that completes our validation; I think it's safe to mark this change as successful
- I sent an update email to Marcin & Catarina
Hey Marcin & Catarina, I've finished updating the system packages on the hetzner2 server. It's a very good thing that we did this, because your server tried and failed to download its updates from 484 mirrors before it finally found a server that it could download its updates from. As I mentioned in Nov 2022, your server runs CentOS 7, which stopped receiving "Full Updates" by Red Hat in Aug 2020. As of Jun 2024, it is no longer going to be updated in any way (security, maintenance, etc). At some point in the future, I guess all of their update servers will go down too. We're lucky at least one was still online. * https://wiki.opensourceecology.org/wiki/Maltfield_Log/2022#Wed_November_02.2C_2022 Today I was successfully able to update 434 system packages onto hetzner2. I did some quick validation of a subset of your websites, and I only found a couple minor errors 1. The header & footer images of oswh don't load https://oswh.opensourceecology.org/ 2. Editing the wiki sometimes causes my browser to lock-up; it's not clear if this is a pre-existing issue, or if the issue is caused by your server or my client I did not update your server's applications that cannot be updated by the package manager (eg wordpress, mediawiki, etc). If you don't detect any issues with your server, then I would recommend that we do the application upgrade simultaneously with a migration to a new server running Debian. I'd like to stress again the urgency of the need to migrate off of CentOS 7. Besides the obvious security risks of running a server that is no longer receiving security patches, at some point in the likely-not-too-distant future, your server is going to break and it will be extremely non-trivial to fix it. The deadline for migrating was in 2020. I highly recommend prioritizing a project to migrate your server to a new Debian server ASAP. Please spend some time testing your various websites, and let me know if you experience any issues. 
Thank you, Michael Altfield Senior Technology Advisor PGP Fingerprint: 8A4B 0AF8 162F 3B6A 79B7 70D2 AA3E DF71 60E2 D97B Open Source Ecology www.opensourceecology.org
- I confirmed the list of updates on the server is now empty
[root@opensourceecology CHG-2024-07-26_yum_update]# yum list updates Loaded plugins: fastestmirror, replace Loading mirror speeds from cached hostfile * base: ftp.plusline.net * epel: mirrors.n-ix.net * extras: ftp.plusline.net * updates: mirror.checkdomain.de [root@opensourceecology CHG-2024-07-26_yum_update]#
- I'm considering the change successful
- looks like my tmp change dir pushed the disk to 86% capacity; let's clean that up
[root@opensourceecology CHG-2024-07-26_yum_update]# df -h Filesystem Size Used Avail Use% Mounted on devtmpfs 32G 0 32G 0% /dev tmpfs 32G 0 32G 0% /dev/shm tmpfs 32G 17M 32G 1% /run tmpfs 32G 0 32G 0% /sys/fs/cgroup /dev/md2 197G 161G 27G 86% / /dev/md1 488M 386M 77M 84% /boot tmpfs 6.3G 0 6.3G 0% /run/user/1005 [root@opensourceecology CHG-2024-07-26_yum_update]# ls after.log before.log needs-reboot.log update.log backup_before moved_from_etc_httpd needs-restarting.log [root@opensourceecology CHG-2024-07-26_yum_update]# du -sh * 28K after.log 70G backup_before 28K before.log 28K moved_from_etc_httpd 4.0K needs-reboot.log 4.0K needs-restarting.log 216K update.log [root@opensourceecology CHG-2024-07-26_yum_update]# ls before.log before.log [root@opensourceecology CHG-2024-07-26_yum_update]# [root@opensourceecology CHG-2024-07-26_yum_update]# ls backup_before/ daily_hetzner2_20240726_160837.tar root [root@opensourceecology CHG-2024-07-26_yum_update]# [root@opensourceecology CHG-2024-07-26_yum_update]# du -sh backup_before/* 20G backup_before/daily_hetzner2_20240726_160837.tar 51G backup_before/root [root@opensourceecology CHG-2024-07-26_yum_update]# [root@opensourceecology CHG-2024-07-26_yum_update]# ls backup_before/root backups [root@opensourceecology CHG-2024-07-26_yum_update]# [root@opensourceecology CHG-2024-07-26_yum_update]# rm -rf backup_before/root [root@opensourceecology CHG-2024-07-26_yum_update]#
- great, now we're down to 59%
[root@opensourceecology CHG-2024-07-26_yum_update]# df -h Filesystem Size Used Avail Use% Mounted on devtmpfs 32G 0 32G 0% /dev tmpfs 32G 0 32G 0% /dev/shm tmpfs 32G 17M 32G 1% /run tmpfs 32G 0 32G 0% /sys/fs/cgroup /dev/md2 197G 110G 78G 59% / /dev/md1 488M 386M 77M 84% /boot tmpfs 6.3G 0 6.3G 0% /run/user/1005 [root@opensourceecology CHG-2024-07-26_yum_update]#
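- for future cleanups, ranking entries by size before deleting anything (e.g. `du -sh /var/tmp/CHG-*/* | sort -rh | head`) shows where the space went. A sketch with stand-in dirs:

```shell
# Build two demo dirs of very different sizes.
mkdir -p /tmp/du_demo/big /tmp/du_demo/small
head -c 1048576 /dev/zero > /tmp/du_demo/big/blob     # 1 MiB
head -c 1024    /dev/zero > /tmp/du_demo/small/blob   # 1 KiB

# Rank entries by size, largest first; the top line is the cleanup target.
du -s /tmp/du_demo/* | sort -rn
```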
Wed July 24, 2024
- Marcin contacted me a few days ago saying that the server needs reboots again
- I found that the last time we did a system packages update was in 2020, over 4 years ago. I strongly recommended that we update the system packages, and probably the web applications as well
- here's the link to the last time I updated the system packages in May 2020 https://wiki.opensourceecology.org/wiki/CHG-2020-05-04_yum_update
- I also noted that CentOS 7 is now not only EOL, but also no longer receives any updates at all, including security updates.
- I warned Marcin about this approaching deadline in Nov 2022, and urged him to migrate to a new OS before 2024.
In my prior work at OSE, I've done my best to design your systems to be robust and "well oiled" so that they would run for as long as possible with as little maintenance as possible. However, code rots over time, and there's only so long you can hold-off before things fall apart.

Python 2.7.5 was End-of-Life'd on 2020-01-01, and it no longer receives any updates.
 * https://en.wikipedia.org/wiki/History_of_Python

CentOS 7.7 was released 2019-09-17. "Full Updates" stopped 2020-08-06, and it will no longer receive any maintenance updates after 2024-06-30.
 * https://wiki.centos.org/About/Product

At some point, you're going to want to migrate to a new server with a new OS. I strongly recommend initiating this project before 2024.
- Here's the log entry https://wiki.opensourceecology.org/wiki/Maltfield_Log/2022#Wed_November_02.2C_2022
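For a future admin repeating this audit, a quick way to check what's actually running on a box against the published EOL dates (CentOS 7: 2024-06-30; Python 2.7: 2020-01-01) is sketched below. This is generic, not a transcript from hetzner2; `/etc/centos-release` only exists on RHEL-family systems, hence the `os-release` fallback:

```shell
# Sketch: print the OS release and Python version for EOL comparison.
if [ -f /etc/centos-release ]; then
    cat /etc/centos-release
else
    # Fallback for non-RHEL-family systems
    . /etc/os-release && echo "$PRETTY_NAME"
fi

# `python` may be 2.x or absent entirely on modern systems.
python --version 2>&1 || python3 --version
```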
- I told Marcin to budget for ~$10,000 to migrate to a new server, as it's going to be a massive project that will likely require more than a month of full-time work to complete the migration
- Marcin said I should go ahead and prepare a CHG "ticket" article for the upgrade and schedule a time to do it
- I prepared a change ticket for updating the system packages on Friday https://wiki.opensourceecology.org/wiki/CHG-2024-07-26_yum_update
- I also noticed that I kept getting de-auth'd every few minutes on the wiki. That's annoying. Hopefully updates will help this (and other) issues go away.
- If we did a migration to Debian, then we'd need to migrate to a new server anyway
- previously when we migrated from hetzner1 to hetzner2, we got a 16x increase in RAM (from 4GB to 64GB). And the price of both servers was the same!
- I was expecting the next jump would have similar results: we'd migrate to a new server that costs the same for much better specs, but that's not looking like it's going to be the case :(
- Here's the currently-offered dedicated servers at hetzner https://www.hetzner.com/dedicated-rootserver/
- Currently we have 8-cores, 64G RAM, and two 250G disks in a RAID-1 software array. We pay 39 EUR/mo
- The cheapest dedicated server (EX44) currently is 46.41 EUR/month and comes with 14-cores, 64G RAM, and 2x 512G disks. That should meet our requirements https://www.hetzner.com/dedicated-rootserver/ex44/configurator/#/
- oh crap, we'd be downgrading the proc from the i7 (Intel® Core™ i7-6700) to an i5 (Intel® Core™ i5-13500)
- I'd have to check the munin charts, but I would be surprised if we ever break a load of 2, so that's still probably fine.
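Munin is the right place for the historical picture, but the instantaneous load-vs-cores comparison the bullet above describes can be read directly from the kernel. A minimal sketch (generic Linux, nothing server-specific):

```shell
# Sketch: read the 1/5/15-minute load averages from /proc/loadavg and
# compare against the core count. A sustained 15m load near or above
# the core count would be the red flag; we expect well under 2 here.
read load1 load5 load15 _ < /proc/loadavg
echo "load averages: 1m=$load1 5m=$load5 15m=$load15"
echo "cores: $(nproc)"
```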
- I met with Marcin tonight to discuss [a] the system-level package upgrades, [b] the application (eg wordpress, mediawiki, etc) upgrades, and [c] the server migration
- I recommended that Marcin do the updates on staging, and described the risk of not doing it
- the problem is that the current staging environment is down, and it may take a few days to restore it
- the risk is potentially a few days of downtime, instead of the smaller change window that a staging-first update would need
- we agreed that I'll do the system-level package upgrades direct-to-production; Marcin accepted the risk of a few days of downtime
- Marcin also mentioned that Hetzner has a "server auction" page, which has some more servers that meet our needs at a slightly discounted price https://www.hetzner.com/sb/
- actually many of these are 37.72 EUR/mo, so they're actually *cheaper* than our current 39 EUR/mo. Great!
- there are >3 pages of servers at this 37.72 EUR/mo price. One of them has 2x 4TB drives (though those look like spinning disks). This seems to be a server graveyard of machines built-to-spec for previous customers. We should be able to find one that meets our needs, which means we'd easily double our disk space and save ~15 EUR per year. Cool :)