Maltfield Log/2019 Q3
My work log from the year 2019 Quarter 3. I intentionally made this verbose to make future admins' work easier when troubleshooting. The more keywords, error messages, etc. that are listed in this log, the more helpful it will be for the future OSE Sysadmin.
See Also
Thr Aug 22, 2019
- Marcin responded to my inquiry about licensing of software on github; per this article, software should be GPLv3 and everything else dual-licensed CC-BY-SA & GFDL
- I did some cleanup of wiki articles on licensing, inter-linking them with categories and "See Also" sections
- Marcin said he'd put together a video showcasing kiwix browsing our zim-ified offline wiki for a blog post if Christian made a write-up for the blog; this is going to be good
- Christian asked about automating the zim creation; I said we should do this on the dev server after it's set up some time in the coming weeks/months
Wed Aug 21, 2019
- Marcin made me an owner of the "OpenSourceEcology" github organization, and I successfully gained access
- I created the new 'ansible' repository
- I changed the "display name" of our "OpenSourceEcology" org from "Marcin Jakubowski" to "Open Source Ecology"
- I did some cleanup on our wiki to make the "Github" page easier to find and to more clearly indicate which org on github is ours
- I noticed that none of our repos on github have licenses, and that CC does not recommend using their licenses for software anyway; I asked Marcin what license we should use for our repos
- I also verified our domain 'www.opensourceecology.org' to make our github org look more legitimate. This involved adding a NONCE to a new TXT record '_github-challenge-OpenSourceEcology.www.opensourceecology.org.' in our DNS at cloudflare
- Cloudflare is fast! I was able to verify it immediately
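- for reference, a quick way to double-check that the TXT record is being served (the NONCE value itself is intentionally not reproduced here):
# confirm cloudflare is serving the github domain-verification TXT record
dig +short TXT _github-challenge-OpenSourceEcology.www.opensourceecology.org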
- I also added a description to the org indicating that it's the official github org for OSE and linking back to the wiki article that states this https://wiki.opensourceecology.org/wiki/Github
Tue Aug 20, 2019
- emails with Christian about the offline wiki zim archive
- dev server research. I think I should write an ansible playbook that launches a fresh install of a dev server, gets a list of the packages & versions installed on the prod server, installs them on the dev server, then syncs data from prod to dev, and finally makes the necessary ip address & hostname changes to the nginx/NetworkManager/etc configs. Ideally both would be controlled by puppet, but--without segregation of stateful & stateless servers from each other and the other issues of stale provisioning configs as documented earlier--I don't see that as being realistic https://wiki.opensourceecology.org/wiki/OSE_Server#Provisioning
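- as a first step toward that, the package-list sync could be as simple as something like this (an untested sketch; 'prod' and 'dev' are placeholder ssh host aliases, not real hostnames):
# hypothetical sketch: mirror the prod package set onto the dev server
ssh prod 'rpm -qa --qf "%{NAME}\n" | sort -u' > /tmp/prod-packages.txt
scp /tmp/prod-packages.txt dev:/tmp/prod-packages.txt
ssh -t dev 'sudo yum -y install $(cat /tmp/prod-packages.txt)'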
- added '/etc/keepass/.passwords.kdbx.lock' to the "ignore" list of file changes in '/var/ossec/etc/ossec.conf'
...
- time to create a dev server in hetzner cloud; first I logged into the hetzner robot wui and went to add a cloud server, but it told me to visit the "hetzner cloud console" https://console.hetzner.cloud
- I created a new "project" called "ose-dev"
- we want this new cloud server to be as close to the prod server as possible. I checked robot and found that our prod server is located in 'FSN1-DC8' which appears to be in Falkenstein https://wiki.hetzner.de/index.php/Benennung_Rechenzentren/en
- I added a new server in 'Falkenstein' running "CentOS 7" of type "CX11" (the cheapest) including a 10G volume (the smallest possible; we'll have to increase this to 50G in the near future) named "ose-dev-volume-1" of type "ext4". I did not add a network. I did add my ssh public key. I named the server "osedev1"
- Interesting, hetzner cloud supports cloud-init https://cloudinit.readthedocs.io/en/latest/ ; imho, this would be great as a basic hardening & bootstrap step (see the sketch after this list) where we:
- hardened sshd and set it to a non-standard port
- setup basic iptables rules to block all but the new ssh port
- created my user and installed my ssh key
- ... after that, the rest could be done with ansible scripts stored in our git repo
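- a rough sketch of what that cloud-init user-data might look like (everything below is an untested placeholder: the port, username, key, and iptables rules are illustrative only):
# hypothetical cloud-init user-data, written out as a heredoc for reference
cat > user-data.yml <<'EOF'
#cloud-config
users:
  - name: maltfield
    groups: wheel
    ssh_authorized_keys:
      - ssh-rsa AAAA...placeholder-key... michael@opensourceecology.org
runcmd:
  # move sshd to a non-standard port and forbid root logins
  - sed -i 's/^#\?Port .*/Port 32415/' /etc/ssh/sshd_config
  - sed -i 's/^#\?PermitRootLogin .*/PermitRootLogin no/' /etc/ssh/sshd_config
  - systemctl restart sshd
  # minimal firewall: allow the new ssh port, drop the default one
  - iptables -A INPUT -p tcp --dport 32415 -j ACCEPT
  - iptables -A INPUT -p tcp --dport 22 -j DROP
EOF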
- it only took a few seconds before the hetzner console told me the server was created. Then I clicked the hamburger menu next to the instance and selected "Console". It opened a cli window showing the machine booting. After another 30 sec, I was prompted for the user login. But, of course, I didn't know the credentials. It wasn't even clear which user my ssh key was added to.
- it looks like the new server was put in 'fsn1-dc14', so a distinct DC from our prod server, but at least it's in the same city.
- in the console, we also apparently have the option to "RESET ROOT PASSWORD" under the "RESCUE" tab
- the ip address of osedev1 is 195.201.233.113 = static.113.233.201.195.clients.your-server.de
- it looks like we can create snapshots for €0.01/GB/month. We can create a max of 7 snapshots. After 7, it deletes the oldest snapshot and replaces it with the new one.
- the hetzner console includes some pretty good (though basic) graphs. they appear to only go back 30 days.
- I tried to ssh into the server using root and my personal ssh key. I got in. Yikes, it allows root to login by default?
- according to this, root won't be given a password unless an ssh key was *not* provided https://github.com/hetznercloud/terraform-provider-hcloud/issues/19
- I thought about creating a distinct set of passwords for this server from prod, but my intention is to keep it as similar to prod as possible (like a staging server) so that we can verify if POCs will break existing services or not. For that, I'm going to be copying the configs from prod and keeping it here, so the passwords will all be the same, anyway. Until we grow to the point where we can segregate services on distinct servers (or at least stateless from stateful servers) and make provisioning a thing (where passwords would be added to config files from a vault at provisioning time), our staging server will just have to be a near-copy of our prod server. And, therefore, it should be treated with the same level of caution regarding privacy of data and security of secrets/passwords. The only reduced concern will be availability; we can take down staging when doing test upgrades/POCs/etc without concern of damaging the production services. Again, more info on why we don't have a provisioning solution here https://wiki.opensourceecology.org/wiki/OSE_Server#Provisioning
- I want to store ansible playbooks in git, but I can't create repos in our OSE git org. I emailed Marcin asking him which is our canonical github org page and asked him to make me an owner of it so I can create a repo for ansible
- I did a basic bootstrap so I could get ansible working
[root@osedev1 ~]# useradd maltfield
[root@osedev1 ~]# echo "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDGNYjR7UKiJSAG/AbP+vlCBqNfQZ2yuSXfsEDuM7cEU8PQNJyuJnS7m0VcA48JRnpUpPYYCCB0fqtIEhpP+szpMg2LByfTtbU0vDBjzQD9mEfwZ0mzJsfzh1Nxe86l/d6h6FhxAqK+eG7ljYBElDhF4l2lgcMAl9TiSba0pcqqYBRsvJgQoAjlZOIeVEvM1lyfWfrmDaFK37jdUCBWq8QeJ98qpNDX4A76f9T5Y3q5EuSFkY0fcU+zwFxM71bGGlgmo5YsMMdSsW+89fSG0652/U4sjf4NTHCpuD0UaSPB876NJ7QzeDWtOgyBC4nhPpS8pgjsnl48QZuVm6FNDqbXr9bVk5BdntpBgps+gXdSL2j0/yRRayLXzps1LCdasMCBxCzK+lJYWGalw5dNaIDHBsEZiK55iwPp0W3lU9vXFO4oKNJGFgbhNmn+KAaW82NBwlTHo/tOlj2/VQD9uaK5YLhQqAJzIq0JuWZWFLUC2FJIIG0pJBIonNabANcN+vq+YJqjd+JXNZyTZ0mzuj3OAB/Z5zS6lT9azPfnEjpcOngFs46P7S/1hRIrSWCvZ8kfECpa8W+cTMus4rpCd40d1tVKzJA/n0MGJjEs2q4cK6lC08pXxq9zAyt7PMl94PHse2uzDFhrhh7d0ManxNZE+I5/IPWOnG1PJsDlOe4Yqw== michael@opensourceecology.org" > /home/maltfield/.ssh/authorized_keys
-bash: /home/maltfield/.ssh/authorized_keys: No such file or directory
[root@osedev1 ~]# su - maltfield
[maltfield@osedev1 ~]$ pwd
/home/maltfield
[maltfield@osedev1 ~]$ mkdir .ssh
[maltfield@osedev1 ~]$ echo "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDGNYjR7UKiJSAG/AbP+vlCBqNfQZ2yuSXfsEDuM7cEU8PQNJyuJnS7m0VcA48JRnpUpPYYCCB0fqtIEhpP+szpMg2LByfTtbU0vDBjzQD9mEfwZ0mzJsfzh1Nxe86l/d6h6FhxAqK+eG7ljYBElDhF4l2lgcMAl9TiSba0pcqqYBRsvJgQoAjlZOIeVEvM1lyfWfrmDaFK37jdUCBWq8QeJ98qpNDX4A76f9T5Y3q5EuSFkY0fcU+zwFxM71bGGlgmo5YsMMdSsW+89fSG0652/U4sjf4NTHCpuD0UaSPB876NJ7QzeDWtOgyBC4nhPpS8pgjsnl48QZuVm6FNDqbXr9bVk5BdntpBgps+gXdSL2j0/yRRayLXzps1LCdasMCBxCzK+lJYWGalw5dNaIDHBsEZiK55iwPp0W3lU9vXFO4oKNJGFgbhNmn+KAaW82NBwlTHo/tOlj2/VQD9uaK5YLhQqAJzIq0JuWZWFLUC2FJIIG0pJBIonNabANcN+vq+YJqjd+JXNZyTZ0mzuj3OAB/Z5zS6lT9azPfnEjpcOngFs46P7S/1hRIrSWCvZ8kfECpa8W+cTMus4rpCd40d1tVKzJA/n0MGJjEs2q4cK6lC08pXxq9zAyt7PMl94PHse2uzDFhrhh7d0ManxNZE+I5/IPWOnG1PJsDlOe4Yqw== michael@opensourceecology.org" > .ssh/authorized_keys
[maltfield@osedev1 ~]$ chmod 0700 .ssh
[maltfield@osedev1 ~]$ chmod 0600 .ssh/authorized_keys
[maltfield@osedev1 ~]$
- I confirmed that I could ssh-in using my key as maltfield
user@ose:~/ansible$ ssh maltfield@195.201.233.113 whoami
maltfield
user@ose:~/ansible$ ssh maltfield@195.201.233.113 hostname
osedev1
user@ose:~/ansible$
- then I set a password for 'maltfield', and added 'maltfield' to the 'wheel' group to permit sudo
[root@osedev1 ~]# passwd maltfield
Changing password for user maltfield.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
[root@osedev1 ~]# gpasswd -a maltfield wheel
Adding user maltfield to group wheel
[root@osedev1 ~]#
- and I confirmed that I could sudo
user@ose:~/ansible$ ssh maltfield@195.201.233.113
Last login: Tue Aug 20 14:13:50 2019
[maltfield@osedev1 ~]$ groups
maltfield wheel
[maltfield@osedev1 ~]$ sudo su -
[sudo] password for maltfield:
Last login: Tue Aug 20 14:10:51 CEST 2019 on pts/1
Last failed login: Tue Aug 20 14:11:57 CEST 2019 from 45.119.209.91 on ssh:notty
There was 1 failed login attempt since the last successful login.
[root@osedev1 ~]#
- next, I made a backup of the existing sshd config on the new dev server, copied in our hardened sshd config from prod to replace it, added a new group called 'sshaccess' and added 'maltfield' to it, and restarted ssh
- first on prod
user@ose:~$ ssh opensourceecology.org
Last login: Tue Aug 20 11:58:33 2019 from 142.234.200.164
[maltfield@opensourceecology ~]$ sudo su -
[sudo] password for maltfield:
Last login: Tue Aug 20 12:19:07 UTC 2019 on pts/12
[root@opensourceecology ~]# cp /etc/ssh/sshd_config /home/user/maltfield/
cp: cannot create regular file ‘/home/user/maltfield/’: No such file or directory
[root@opensourceecology ~]# cp /etc/ssh/sshd_config /home/maltfield/
[root@opensourceecology ~]# chown maltfield /home/maltfield/sshd_config
[root@opensourceecology ~]#
- then on dev
user@ose:~/ansible$ ssh -A maltfield@195.201.233.113 Last login: Tue Aug 20 14:15:33 2019 from 142.234.200.164 [maltfield@osedev1 ~]$ ssh-add -l error fetching identities for protocol 1: agent refused operation 4096 SHA256:nbMcwqUz/ouvQwlNXbJtwijJ/0omJKeq5Nqzkus/sNQ guttersnipe@guttersnipe (RSA) [maltfield@osedev1 ~]$ scp -P32415 opensourceecology.org:sshd_config . The authenticity of host '[opensourceecology.org]:32415 ([2a01:4f8:172:209e::2]:32415)' can't be established. ECDSA key fingerprint is SHA256:HclF8ZQOjGqx+9TmwL111kZ7QxgKkoEw8g3l2YxV0gk. ECDSA key fingerprint is MD5:cd:87:b1:bb:c1:3e:d1:d1:d4:5d:16:c9:e8:30:6a:71. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '[opensourceecology.org]:32415,[2a01:4f8:172:209e::2]:32415' (ECDSA) to the list of known hosts. sshd_config 100% 4455 1.5MB/s 00:00 [maltfield@osedev1 ~]$ sudo su - [sudo] password for maltfield: Last login: Tue Aug 20 14:15:40 CEST 2019 on pts/1 Last failed login: Tue Aug 20 14:20:25 CEST 2019 from 45.119.209.91 on ssh:notty There was 1 failed login attempt since the last successful login. [root@osedev1 ~]# cd /etc/ssh [root@osedev1 ssh]# mv sshd_config sshd_config.`date "+%Y%m%d_%H%M%S"`.orig [root@osedev1 ssh]# mv /home/maltfield/sshd_config . [root@osedev1 ssh]# ls -lah total 620K drwxr-xr-x. 2 root root 4.0K Aug 20 14:27 . drwxr-xr-x. 72 root root 4.0K Aug 20 14:14 .. -rw-r--r--. 1 root root 569K Apr 11 2018 moduli -rw-r--r--. 1 root root 2.3K Apr 11 2018 ssh_config -rw-------. 1 maltfield maltfield 4.4K Aug 20 14:27 sshd_config -rw-------. 1 root root 3.9K Aug 20 12:16 sshd_config.20190820_142740.orig -rw-r-----. 1 root ssh_keys 227 Aug 20 12:16 ssh_host_ecdsa_key -rw-r--r--. 1 root root 162 Aug 20 12:16 ssh_host_ecdsa_key.pub -rw-r-----. 1 root ssh_keys 387 Aug 20 12:16 ssh_host_ed25519_key -rw-r--r--. 1 root root 82 Aug 20 12:16 ssh_host_ed25519_key.pub -rw-r-----. 1 root ssh_keys 1.7K Aug 20 12:16 ssh_host_rsa_key -rw-r--r--. 1 root root 382 Aug 20 12:16 ssh_host_rsa_key.pub [root@osedev1 ssh]# chown root:root sshd_config [root@osedev1 ssh]# ls -lah total 620K drwxr-xr-x. 2 root root 4.0K Aug 20 14:27 . drwxr-xr-x. 72 root root 4.0K Aug 20 14:14 .. -rw-r--r--. 1 root root 569K Apr 11 2018 moduli -rw-r--r--. 1 root root 2.3K Apr 11 2018 ssh_config -rw-------. 1 root root 4.4K Aug 20 14:27 sshd_config -rw-------. 1 root root 3.9K Aug 20 12:16 sshd_config.20190820_142740.orig -rw-r-----. 1 root ssh_keys 227 Aug 20 12:16 ssh_host_ecdsa_key -rw-r--r--. 1 root root 162 Aug 20 12:16 ssh_host_ecdsa_key.pub -rw-r-----. 1 root ssh_keys 387 Aug 20 12:16 ssh_host_ed25519_key -rw-r--r--. 1 root root 82 Aug 20 12:16 ssh_host_ed25519_key.pub -rw-r-----. 1 root ssh_keys 1.7K Aug 20 12:16 ssh_host_rsa_key -rw-r--r--. 1 root root 382 Aug 20 12:16 ssh_host_rsa_key.pub [root@osedev1 ssh]# grep AllowGroups sshd_config AllowGroups sshaccess [root@osedev1 ssh]# grep sshaccess /etc/group [root@osedev1 ssh]# groupadd sshaccess [root@osedev1 ssh]# gpasswd -a maltfield sshaccess Adding user maltfield to group sshaccess [root@osedev1 ssh]# grep sshaccess /etc/group sshaccess:x:1001:maltfield [root@osedev1 ssh]# systemctl restart sshd [root@osedev1 ssh]# logout [maltfield@osedev1 ~]$ logout Connection to 195.201.233.113 closed. 
user@ose:~/ansible$ ssh -A maltfield@195.201.233.113 ssh: connect to host 195.201.233.113 port 22: Connection refused user@ose:~/ansible$ ssh -p 32415 maltfield@195.201.233.113 Last login: Tue Aug 20 14:17:07 2019 from 142.234.200.164 [maltfield@osedev1 ~]$ echo win win [maltfield@osedev1 ~]$ sudo su - [sudo] password for maltfield: Last login: Tue Aug 20 14:27:12 CEST 2019 on pts/1 Last failed login: Tue Aug 20 14:28:43 CEST 2019 from 45.119.209.91 on ssh:notty There was 1 failed login attempt since the last successful login. [root@osedev1 ~]# echo woot woot [root@osedev1 ~]# logout [maltfield@osedev1 ~]$ logout Connection to 195.201.233.113 closed. user@ose:~/ansible$
- finally cleanup on prod
user@ose:~$ ssh opensourceecology.org
Last login: Tue Aug 20 12:19:23 2019 from 142.234.200.164
[maltfield@opensourceecology ~]$ ls sshd_config
sshd_config
[maltfield@opensourceecology ~]$ rm sshd_config
[maltfield@opensourceecology ~]$ ls sshd_config
ls: cannot access sshd_config: No such file or directory
[maltfield@opensourceecology ~]$ logout
Connection to opensourceecology.org closed.
user@ose:~$
- I confirmed that I could no longer login as root
user@ose:~$ ssh -i .ssh/id_rsa.ose root@195.201.233.113
ssh: connect to host 195.201.233.113 port 22: Connection refused
user@ose:~$ ssh -p 32415 -i .ssh/id_rsa.ose root@195.201.233.113
Permission denied (publickey).
user@ose:~$
- and the final step of the pre-ansible bootstrap: iptables
- first I confirmed that there are no existing rules
[root@osedev1 ~]# iptables-save
# Generated by iptables-save v1.4.21 on Tue Aug 20 14:35:43 2019
*filter
:INPUT ACCEPT [69:4056]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [38:3704]
COMMIT
# Completed on Tue Aug 20 14:35:43 2019
[root@osedev1 ~]# ip6tables-save
# Generated by ip6tables-save v1.4.21 on Tue Aug 20 14:35:55 2019
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
COMMIT
# Completed on Tue Aug 20 14:35:55 2019
[root@osedev1 ~]#
- then I added the basic iptables rules for ipv4 to block everything in except ssh
[root@osedev1 ~]# iptables -A INPUT -i lo -j ACCEPT
[root@osedev1 ~]# iptables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
[root@osedev1 ~]# iptables -A INPUT -p icmp -j ACCEPT
[root@osedev1 ~]# iptables -A INPUT -p tcp -m state --state NEW -m tcp --dport 32415 -j ACCEPT
[root@osedev1 ~]# iptables -A INPUT -j DROP
[root@osedev1 ~]# iptables-save
# Generated by iptables-save v1.4.21 on Tue Aug 20 14:39:12 2019
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [15:1180]
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 32415 -j ACCEPT
-A INPUT -j DROP
COMMIT
# Completed on Tue Aug 20 14:39:12 2019
[root@osedev1 ~]#
- and I added the same for ipv6
[root@osedev1 ~]# ip6tables -A INPUT -i lo -j ACCEPT
[root@osedev1 ~]# ip6tables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
[root@osedev1 ~]# ip6tables -A INPUT -p icmp -j ACCEPT
[root@osedev1 ~]# ip6tables -A INPUT -p tcp -m state --state NEW -m tcp --dport 32415 -j ACCEPT
[root@osedev1 ~]# ip6tables -A INPUT -j DROP
[root@osedev1 ~]# ip6tables-save
# Generated by ip6tables-save v1.4.21 on Tue Aug 20 14:41:12 2019
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 32415 -j ACCEPT
-A INPUT -j DROP
COMMIT
# Completed on Tue Aug 20 14:41:12 2019
[root@osedev1 ~]#
- note that I had to install 'iptables-services' in order to save these rules
[root@osedev1 ~]# service iptables save The service command supports only basic LSB actions (start, stop, restart, try-restart, reload, force-reload, status). For other actions, please try to use systemctl. [root@osedev1 ~]# systemctl save iptables Unknown operation 'save'. [root@osedev1 ~]# yum install iptables-services Loaded plugins: fastestmirror Determining fastest mirrors * base: mirror.wiuwiu.de * extras: mirror.netcologne.de * updates: mirror.netcologne.de base | 3.6 kB 00:00:00 extras | 3.4 kB 00:00:00 updates | 3.4 kB 00:00:00 (1/2): extras/7/x86_64/primary_db | 215 kB 00:00:01 (2/2): updates/7/x86_64/primary_db | 7.4 MB 00:00:01 Resolving Dependencies --> Running transaction check ---> Package iptables-services.x86_64 0:1.4.21-28.el7 will be installed --> Finished Dependency Resolution Dependencies Resolved =============================================================================================================== Package Arch Version Repository Size =============================================================================================================== Installing: iptables-services x86_64 1.4.21-28.el7 base 52 k Transaction Summary =============================================================================================================== Install 1 Package Total download size: 52 k Installed size: 26 k Is this ok [y/d/N]: y Downloading packages: iptables-services-1.4.21-28.el7.x86_64.rpm | 52 kB 00:00:05 Running transaction check Running transaction test Transaction test succeeded Running transaction Installing : iptables-services-1.4.21-28.el7.x86_64 1/1 Verifying : iptables-services-1.4.21-28.el7.x86_64 1/1 Installed: iptables-services.x86_64 0:1.4.21-28.el7 Complete! [root@osedev1 ~]# service iptables save iptables: Saving firewall rules to /etc/sysconfig/iptables:[ OK ] [root@osedev1 ~]#
- that's enough; the rest I can do from ansible!
- I created a new ansible dir with a super simple inventory file and successfully got it to ping
user@ose:~/ansible$ cat hosts
195.201.233.113 ansible_port=32415 ansible_user=maltfield
user@ose:~/ansible$ ansible -i hosts all -m ping
195.201.233.113 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
user@ose:~/ansible$
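- from here, a first playbook could be as small as this (an untested sketch; the file name and package list are just placeholders):
# hypothetical first playbook against the inventory above
cat > bootstrap.yml <<'EOF'
---
- hosts: all
  become: true
  tasks:
    - name: install a few baseline packages
      yum:
        name:
          - vim-enhanced
          - screen
          - rsync
        state: present
EOF
ansible-playbook -i hosts bootstrap.yml --ask-become-pass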
- ok, I think I'm actually just going to rsync a ton of stuff over, cross my fingers, and reboot. I'll exclude the following dirs from the rsync (see the sketch after this list) https://linuxadmin.io/hot-clone-linux-server/
- /dev
- /sys
- /proc
- /boot/
- /etc/sysconfig/network*
- /tmp
- /var/tmp
- /etc/fstab
- /etc/mtab
- /etc/mdadm.conf
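- i.e. something roughly like this (untested sketch; it assumes the sudo/ssh-agent issues worked through below are solved first):
# hypothetical prod->dev clone using the exclude list above
# (run from prod with a forwarded agent; assumes 'sudo rsync' works without a tty on dev)
rsync -e 'ssh -p 32415' --rsync-path='sudo rsync' -aAXv --numeric-ids \
  --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot \
  --exclude='/etc/sysconfig/network*' --exclude=/tmp --exclude=/var/tmp \
  --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf \
  / maltfield@195.201.233.113:/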
- I think it makes more sense for the prod server to push to the dev server (using my ssh forwarded key) than for dev to pull from prod
- many of these files are actually root-only, so we must be root on both systems; but since we don't permit root to ssh, we need a way to sudo from 'maltfield' to 'root' on the dev system. I tested that this works
[maltfield@opensourceecology syncToDev]$ ssh -p 32415 maltfield@195.201.233.113 sudo whoami The authenticity of host '[195.201.233.113]:32415 ([195.201.233.113]:32415)' can't be established. ECDSA key fingerprint is SHA256:U99nmyy5WJZMQ6qQL7vofldQJcpztHzCEzO6OuHuLd4. ECDSA key fingerprint is MD5:3c:37:06:50:4d:48:0c:f4:c1:fe:98:d8:99:fa:7a:14. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '[195.201.233.113]:32415' (ECDSA) to the list of known hosts. sudo: no tty present and no askpass program specified [maltfield@opensourceecology syncToDev]$ ssh -t -p 32415 maltfield@195.201.233.113 sudo whoami [sudo] password for maltfield: root Connection to 195.201.233.113 closed. [maltfield@opensourceecology syncToDev]$
- and I did a simple test rsync of a new dir at /root/rsyncTest/
[maltfield@opensourceecology syncToDev]$ sudo su -
[sudo] password for maltfield:
Last login: Tue Aug 20 12:36:43 UTC 2019 on pts/0
[root@opensourceecology ~]# cd /root
[root@opensourceecology ~]# mkdir rsyncTest
[root@opensourceecology ~]# echo "test1" > /root/rsyncTest/testFile
[root@opensourceecology ~]# logout
- but I couldn't quite get the rsync syntax correct
[maltfield@opensourceecology syncToDev]$ rsync -avvvv --progress --rsync-path="sudo rsync" /home/maltfield/syncToDev/dirOwnedByMaltfield/ maltfield@195.201.233.133:32415/root/ cmd=<NULL> machine=195.201.233.133 user=maltfield path=32415/root/ cmd[0]=ssh cmd[1]=-l cmd[2]=maltfield cmd[3]=195.201.233.133 cmd[4]=sudo rsync cmd[5]=--server cmd[6]=-vvvvlogDtpre.iLsf cmd[7]=. cmd[8]=32415/root/ opening connection using: ssh -l maltfield 195.201.233.133 "sudo rsync" --server -vvvvlogDtpre.iLsf . 32415/root/ note: iconv_open("UTF-8", "UTF-8") succeeded. packet_write_wait: Connection to 138.201.84.243 port 32415: Broken pipe user@ose:~$
- I think this is compounded by ssh agent forwarding issues with stale env vars after reconnecting to a screen session; I fixed that with grabssh and fixssh
https://samrowe.com/wordpress/ssh-agent-and-gnu-screen/
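- for reference, the gist of that trick is to snapshot the SSH_* env vars to a file before detaching and re-source them inside old screen windows; roughly this (reproduced from memory, per the article above):
# run outside screen (or from ~/.bash_profile) to capture the current agent socket
grabssh () {
  local SSHVARS="SSH_CLIENT SSH_TTY SSH_AUTH_SOCK SSH_CONNECTION DISPLAY"
  for x in ${SSHVARS} ; do
    (eval echo $x=\$$x) | sed 's/=/="/; s/$/"/; s/^/export /'
  done > ~/fixssh
}
# run inside a stale screen window to pick up the new agent socket
alias fixssh='source ~/fixssh'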
[maltfield@opensourceecology syncToDev]$ rsync -e 'ssh -p 32415' -av --progress dirOwnedByMaltfield/ maltfield@195.201.233.113:
sending incremental file list
./
testFileOwnedByMaltfield
           6 100%    0.00kB/s    0:00:00 (xfer#1, to-check=0/2)

sent 134 bytes  received 34 bytes  112.00 bytes/sec
total size is 6  speedup is 0.04
[maltfield@opensourceecology syncToDev]$
- adding sudo made it fail; so how can I make sudo reach back to $SUDO_USER's env vars to connect to the ssh-agent forwarded by my machine?
[maltfield@opensourceecology syncToDev]$ sudo rsync -e 'ssh -p 32415' -av --progress /home/maltfield/dirOwnedByMaltfield/ maltfield@195.201.233.113: [sudo] password for maltfield: The authenticity of host '[195.201.233.113]:32415 ([195.201.233.113]:32415)' can't be established. ECDSA key fingerprint is SHA256:U99nmyy5WJZMQ6qQL7vofldQJcpztHzCEzO6OuHuLd4. ECDSA key fingerprint is MD5:3c:37:06:50:4d:48:0c:f4:c1:fe:98:d8:99:fa:7a:14. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '[195.201.233.113]:32415' (ECDSA) to the list of known hosts. Permission denied (publickey). rsync: connection unexpectedly closed (0 bytes received so far) [sender] rsync error: unexplained error (code 255) at io.c(605) [sender=3.0.9] [maltfield@opensourceecology syncToDev]$
- ok, apparently I somehow fucked up the permissions on '/home/maltfield/'. When I changed it back from 0775 to 0700 it worked. Note that here I also added the '-E' arg to `sudo` to make it keep the env vars needed to reach my forwarded ssh key
[maltfield@opensourceecology syncToDev]$ sudo rsync -e 'ssh -p 32415' -av --progress /home/maltfield/dirOwnedByMaltfield/ maltfield@195.201.233.113: [sudo] password for maltfield: The authenticity of host '[195.201.233.113]:32415 ([195.201.233.113]:32415)' can't be established. ECDSA key fingerprint is SHA256:U99nmyy5WJZMQ6qQL7vofldQJcpztHzCEzO6OuHuLd4. ECDSA key fingerprint is MD5:3c:37:06:50:4d:48:0c:f4:c1:fe:98:d8:99:fa:7a:14. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '[195.201.233.113]:32415' (ECDSA) to the list of known hosts. Permission denied (publickey). rsync: connection unexpectedly closed (0 bytes received so far) [sender] rsync error: unexplained error (code 255) at io.c(605) [sender=3.0.9] [maltfield@opensourceecology syncToDev]$
- trying with sudo on the destination too, but this fails because I can't enter the password
[maltfield@opensourceecology syncToDev]$ sudo -E rsync -e 'ssh -p 32415' --rsync-path="sudo rsync" -av --progress dirOwnedByMaltfield/ maltfield@195.201.233.113:syncToDev/ sudo: no tty present and no askpass program specified rsync: connection unexpectedly closed (0 bytes received so far) [sender] rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9] [maltfield@opensourceecology syncToDev]$
- I confirmed that I can get sudo to work over plain ssh using the '-t' option
[maltfield@opensourceecology syncToDev]$ ssh -t -p 32415 maltfield@195.201.233.113 sudo whoami
[sudo] password for maltfield:
root
Connection to 195.201.233.113 closed.
[maltfield@opensourceecology syncToDev]$
- but it won't let me use this option with rsync
[maltfield@opensourceecology syncToDev]$ sudo -E rsync -e 'ssh -t -p 32415' --rsync-path="sudo rsync" -av --progress dirOwnedByMaltfield/ maltfield@195.201.233.113:syncToDev/ Pseudo-terminal will not be allocated because stdin is not a terminal. sudo: no tty present and no askpass program specified rsync: connection unexpectedly closed (0 bytes received so far) [sender] rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9] [maltfield@opensourceecology syncToDev]$
- I think the solution is just to give a whitelist of users NOPASSWD sudo access. This list would be special users whose private keys are on lockdown: they are encrypted with a damn good passphrase, never stored on servers, etc.
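- e.g. something like this dropped into /etc/sudoers.d/ on the dev box (a sketch only, not yet applied; scoped to rsync rather than blanket sudo):
# hypothetical /etc/sudoers.d/10_rsync_nopasswd on osedev1 (edit with: visudo -f /etc/sudoers.d/10_rsync_nopasswd)
# lets the sync user run rsync (and only rsync) as root without a password prompt
maltfield ALL=(root) NOPASSWD: /usr/bin/rsync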
Sun Aug 18, 2019
- I found this shitty Help Desk article on BackBlaze B2's non-payment procedure to determine at what point they delete all our precious backup data if we accidentally don't pay again. Answer: After 1.5 months. In this case, we discovered the issue after 1.25 months; that was close! https://help.backblaze.com/hc/en-us/articles/219361957-B2-Non-payment-procedures
- but the above article says that all the dates are subject to change, so who the fuck knows *shrug*
- I recommended to Marcin that he set up email forwards and filters from our backblaze b2 google account so that he can be notified sooner within the 1-month grace period. Personally, I can't fucking login to that account anymore due to google "security features" even though, yeah, I'm the G Suite Admin *facepalm*
...
- I had some emails with Chris about the wiki archival process, which is also an important component of backups and OSE's mission in general
- same as I did back in 2018-05, I created a new snapshot for him since he lost the old version https://wiki.opensourceecology.org/wiki/Maltfield_Log/2018_Q2#Sat_May_26.2C_2018
# DECLARE VARS
snapshotDestDir='/var/tmp/snapshotOfWikiForChris.20190818'
wikiDbName='osewiki_db'
wikiDbUser='osewiki_user'
wikiDbPass='CHANGEME'
stamp=`date +%Y%m%d_%T`

pushd "${snapshotDestDir}"

time nice mysqldump --single-transaction -u"${wikiDbUser}" -p"${wikiDbPass}" --databases "${wikiDbName}" | gzip -c > "${wikiDbName}.${stamp}.sql.gz"
time nice tar -czvf "${snapshotDestDir}/wiki.opensourceecology.org.vhost.${stamp}.tar.gz" /var/www/html/wiki.opensourceecology.org/*
Sat Aug 17, 2019
- I mentioned to Marcin that we may lose all our precious backup data if our B2 account becomes unpaid (& unnoticed) for some time again, so it may be a good idea to be super-cautious and keep a once-yearly backup at FeF. The most recent 2 days of backups are live on the server, but they're owned by root (so getting Marcin access would be nontrivial)
[b2user@opensourceecology ~]$ ls -lah /home/b2user/sync
total 17G
drwxr-xr-x 2 root   root   4.0K Aug 16 07:45 .
drwx------ 7 b2user b2user 4.0K Aug 16 11:06 ..
-rw-r--r-- 1 b2user root    17G Aug 16 07:45 daily_hetzner2_20190816_072001.tar.gpg
[b2user@opensourceecology ~]$ ls -lah /home/b2user/sync.old
total 17G
drwxr-xr-x 2 root   root   4.0K Aug 15 07:46 .
drwx------ 7 b2user b2user 4.0K Aug 16 11:06 ..
-rw-r--r-- 1 b2user root    17G Aug 15 07:46 daily_hetzner2_20190815_072001.tar.gpg
[b2user@opensourceecology ~]$
- I asked if Marcin has ~20G somewhere he can store a yearly backup at FeF. Downloads from B2 aren't free (they cost $0.01/GB), so at 17G (<$0.20) I asked Marcin to try to download one of our server's backups from the B2 WUI for yearly archival on-site at FeF https://wiki.opensourceecology.org/wiki/Backblaze#Download_from_WUI
...
- I decided to update the local rules that silence alerts, changing their level from 0 to 2 to ensure that they at least get logged
- added another rule = "High amount of POST requests in a small period of time (likely bot)" to the list of overwrites to level 2 (to stop sending email alerts for it)
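- for reference, those overwrites live in /var/ossec/etc/local_rules.xml and look roughly like this (the rule id below is quoted from memory; copy the full original rule body from /var/ossec/rules/web_rules.xml and only change the level):
<!-- hypothetical overwrite inside the <group> block of local_rules.xml; keeps the alert   -->
<!-- logged (level 2) but below the email threshold                                        -->
<rule id="31533" level="2" overwrite="yes">
  <description>High amount of POST requests in a small period of time (likely bot).</description>
</rule>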
- I noticed that I haven't been receiving (real-time) file integrity monitoring alerts, which is pretty critical. A quick check shows that the syscheck db *is* storing info on these files, for example this OBI apache config file
[root@opensourceecology ossec]# bin/syscheck_control -i 000 -f /etc/httpd/conf.d/00-www.openbuildinginstitute.org.conf Integrity checking changes for local system 'opensourceecology.org - 127.0.0.1': Detailed information for entries matching: '/etc/httpd/conf.d/00-www.openbuildinginstitute.org.conf' 2017 Nov 24 17:25:33,0 - /etc/httpd/conf.d/00-www.openbuildinginstitute.org.conf File added to the database. Integrity checking values: Size: 1817 Perm: rw-r--r-- Uid: 0 Gid: 0 Md5: a6cddb9c598ddcd7bf08108e7ca53381 Sha1: 5110018b79f0cd7ae12a2ceb14e357b9c0e2804a 2017 Dec 01 19:39:23,0 - /etc/httpd/conf.d/00-www.openbuildinginstitute.org.conf File changed. - 1st time modified. Integrity checking values: Size: >1824 Perm: rw-r--r-- Uid: 0 Gid: 0 Md5: >647f8e256bd3cd4930b7b7bf54967527 Sha1: >c29ad2e25d9481f7b782f1a9ea1d04a15029ab37 2017 Dec 07 16:30:48,2 - /etc/httpd/conf.d/00-www.openbuildinginstitute.org.conf File changed. - 2nd time modified. Integrity checking values: Size: >1831 Perm: rw-r--r-- Uid: 0 Gid: 0 Md5: >61126df96f2249b917becce25566eb85 Sha1: >9426ac50df19dfccd478a7c65b52525472de1349 2017 Dec 07 16:35:59,3 - /etc/httpd/conf.d/00-www.openbuildinginstitute.org.conf File changed. - 3rd time modified. Integrity checking values: Size: >1838 Perm: rw-r--r-- Uid: 0 Gid: 0 Md5: >cdd2f08b506885a44e4d181d503cca19 Sha1: >f6069ce29ac13259f450d79eaff265971bbf6829 [root@opensourceecology ossec]#
- I think this may be because it auto-ignores files after 3 changes. A fix is to change auto_ignore to "no". I also set "alert_new_files" to "yes" https://github.com/ossec/ossec-hids/issues/779
...
  <syscheck>
    <!-- Frequency that syscheck is executed - default to every 22 hours -->
    <frequency>79200</frequency>

    <!-- Directories to check (perform all possible verifications) -->
    <directories report_changes="yes" realtime="yes" check_all="yes">/etc,/usr/bin,/usr/sbin</directories>
    <directories report_changes="yes" realtime="yes" check_all="yes">/bin,/sbin,/boot</directories>
    <directories report_changes="yes" realtime="yes" check_all="yes">/var/ossec/etc</directories>

    <alert_new_files>yes</alert_new_files>
    <auto_ignore>no</auto_ignore>
...
- as soon as I restarted ossec after adding these options, I got a ton of alerts on integrity changes to files like, for example, /etc/shadow! We definitely always need an email alert sent when /etc/shadow changes.
OSSEC HIDS Notification.
2019 Aug 17 06:03:39

Received From: opensourceecology->syscheck
Rule: 550 fired (level 7) -> "Integrity checksum changed."
Portion of the log(s):

Integrity checksum changed for: '/etc/passwd'
...
OSSEC HIDS Notification.
2019 Aug 17 06:01:31

Received From: opensourceecology->syscheck
Rule: 550 fired (level 7) -> "Integrity checksum changed."
Portion of the log(s):

Integrity checksum changed for: '/etc/shadow'
Fri Aug 16, 2019
- added an ossec local rule to prevent email alerts from being triggered on modsec rejecting queries, as they're too numerous and drown out more important alerts
- added an ossec local rule to prevent email alerts from being triggered on 500 errors, as they're too numerous and drown out more important alerts
Thr Aug 15, 2019
- confirmed that ose backups are working again. we're missing the first-of-the-month, but the past few days look good
[root@opensourceecology ~]# sudo su - b2user
Last login: Sat Aug  3 05:57:40 UTC 2019 on pts/0
[b2user@opensourceecology ~]$ ~/virtualenv/bin/b2 ls ose-server-backups
daily_hetzner2_20190813_072001.tar.gpg
daily_hetzner2_20190814_072001.tar.gpg
monthly_hetzner2_20181001_091809.tar.gpg
monthly_hetzner2_20181101_091810.tar.gpg
monthly_hetzner2_20181201_091759.tar.gpg
monthly_hetzner2_20190201_072001.tar.gpg
monthly_hetzner2_20190301_072001.tar.gpg
monthly_hetzner2_20190401_072001.tar.gpg
monthly_hetzner2_20190501_072001.tar.gpg
monthly_hetzner2_20190601_072001.tar.gpg
monthly_hetzner2_20190701_072001.tar.gpg
weekly_hetzner2_20190812_072001.tar.gpg
yearly_hetzner2_20190101_111520.tar.gpg
[b2user@opensourceecology ~]$
- I also documented these commands on the wiki for future, easy reference https://wiki.opensourceecology.org/wiki/Backblaze
- re-ran backup report
- fixed error in backup report
- re-ran backup report, looks good
[root@opensourceecology backups]# ./backupReport.sh INFO: email body below ATTENTION: BACKUPS MISSING! WARNING: First of this month's backup (20190801) is missing! See below for the contents of the backblaze b2 bucket = ose-server-backups daily_hetzner2_20190813_072001.tar.gpg daily_hetzner2_20190814_072001.tar.gpg monthly_hetzner2_20181001_091809.tar.gpg monthly_hetzner2_20181101_091810.tar.gpg monthly_hetzner2_20181201_091759.tar.gpg monthly_hetzner2_20190201_072001.tar.gpg monthly_hetzner2_20190301_072001.tar.gpg monthly_hetzner2_20190401_072001.tar.gpg monthly_hetzner2_20190501_072001.tar.gpg monthly_hetzner2_20190601_072001.tar.gpg monthly_hetzner2_20190701_072001.tar.gpg weekly_hetzner2_20190812_072001.tar.gpg yearly_hetzner2_20190101_111520.tar.gpg --- Note: This report was generated on 20190815_084159 UTC by script '/root/backups/backupReport.sh' This script was triggered by '/etc/cron.d/backup_to_backblaze' For more information about OSE backups, please see the relevant documentation pages on the wiki: * https://wiki.opensourceecology.org/wiki/Backblaze * https://wiki.opensourceecology.org/wiki/OSE_Server#Backups [root@opensourceecology backups]#
- confirmed that our accrued bill of $2.57 was paid with Marcin's updates. backups are stable again!
- I emailed Chris asking about the status of the wiki archival process -> archive.org
- I did some fixing to the ossec email alerts
Sat Aug 03, 2019
- we just got an email from the server stating that there were errors with the backups
ATTENTION: BACKUPS MISSING!

WARNING: First of this month's backup (20190801) is missing!
WARNING: First of last month's backup (20190701) is missing!
WARNING: Yesterday's backup (20190802) is missing!
WARNING: The day before yesterday's backup (20190801) is missing!

See below for the contents of the backblaze b2 bucket = ose-server-backups
- note that there was nothing listed under "see below for the contents of the backblaze b2 bucket = ose-server-backups"
- this error was generated by the cron job /etc/cron.d/backup_to_backblaze and the script /root/backups/backupReport.sh. This is the first time I've seen it return a critical failure like this.
- the fact that the output is totally empty and it states that we're missing all the backups, even though this is the first time we've received this alert, suggests it's a false-positive
- I logged into the server, changed to the 'b2user' user, and ran the command to get a listing of the contents of the bucket, and--sure enough--I got an error
[b2user@opensourceecology ~]$ ~/virtualenv/bin/b2 ls ose-server-backups
ERROR: Unknown error: 403 account_trouble Account trouble. Please log into your b2 account at www.backblaze.com.
[b2user@opensourceecology ~]$
- Per the error message, I logged into the b2 website. As soon as I authenticated, I saw this pop-up
B2 Access Denied Your access to B2 has been suspended because your account has not been in good standing and your grace period has now ended. Please review your account and update your payment method at payment history, or contact tech support for assistance. B2 API Error Error Detail: Account trouble. Please log into your b2 account at www.backblaze.com. B2 API errors happen for a variety of reasons including failures to connect to the B2 servers, unexpectedly high B2 server load and general networking problems. Please see our documentation for more information about specific errors returned for each API call. You should also investigate our easy-to-use command line tool here: https://www.backblaze.com/b2/docs/quick_command_line.html
- I'm sure they sent us alerts to our account email (backblaze at opensourceecology dot org), but I can't fucking check because gmail demands 2fa via sms that isn't tied to the account. ugh.
- I made some improvements to the backupReport.sh script (a rough sketch of the changes follows this list).
- it now redirects STDERR to STDOUT, so any errors are captured & sent with the email where the backup file listing usually appears
- it now has a footer that includes the timestamp of when the script was executed
- it now lists the path of the script itself, to help future admins debug issues
- it now lists the path of the cron that executes the script, to help future admins debug issues
- it now prints links to two relevant documentation pages on the wiki, to help future admins debug issues
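- roughly, the new bits of the script look like this (paraphrased from memory, not a verbatim copy; the variable names '${b2Cmd}', '${bucketName}', and '${emailBody}' are illustrative):
# 1. capture stderr along with stdout so b2 errors end up in the email body
fileList="$(${b2Cmd} ls ${bucketName} 2>&1)"

# 2. append a footer to the email body to help future admins debug
emailBody="${emailBody}
---
Note: This report was generated on $(date -u +%Y%m%d_%H%M%S) UTC by script '/root/backups/backupReport.sh'
This script was triggered by '/etc/cron.d/backup_to_backblaze'

For more information about OSE backups, please see the relevant documentation pages on the wiki:
 * https://wiki.opensourceecology.org/wiki/Backblaze
 * https://wiki.opensourceecology.org/wiki/OSE_Server#Backups"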
- The new email looks like this
ATTENTION: BACKUPS MISSING! ERROR: Unknown error: 403 account_trouble Account trouble. Please log into your b2 account at www.backblaze.com. WARNING: First of this month's backup (20190801) is missing! WARNING: First of last month's backup (20190701) is missing! WARNING: Yesterday's backup (20190802) is missing! WARNING: The day before yesterday's backup (20190801) is missing! See below for the contents of the backblaze b2 bucket = ose-server-backups ERROR: Unknown error: 403 account_trouble Account trouble. Please log into your b2 account at www.backblaze.com. --- Note: This report was generated on 20190803_071847 UTC by script '/root/backups/backupReport.sh' This script was triggered by '/etc/cron.d/backup_to_backblaze' For more information about OSE backups, please see the relevant documentation pages on the wiki: * https://wiki.opensourceecology.org/wiki/Backblaze * https://wiki.opensourceecology.org/wiki/OSE_Server#Backups
Thr Aug 01, 2019
- discussion with Tom on DR & bus factor contingency planning
- Marcin asked if our server could handle thousands of concurrent editors on the wiki for the upcoming cordless drill microfactory contest
- hetzner2 is basically idle. I'm not sure where its limits are, but we're nowhere near them. With varnish in place, writes are much more costly than concurrent readers. I explained to Marcin that scaling hetzner2 would mean dividing it up into parts (add 1 or more DB servers, 1+ memcache (db cache) servers, 1+ apache backend servers, 1+ nginx ssl terminator servers, 1+ haproxy load balancing servers, 1+ mail servers, 1+ varnish frontend caching servers, etc)
- I went to check munin, but the graphs were bare! Looks like our server rebooted, and munin wasn't enabled to start at system boot. I fixed that.
[root@opensourceecology init.d]# systemctl enable munin-node Created symlink from /etc/systemd/system/multi-user.target.wants/munin-node.service to /usr/lib/systemd/system/munin-node.service. [root@opensourceecology init.d]# systemctl status munin-node ● munin-node.service - Munin Node Server. Loaded: loaded (/usr/lib/systemd/system/munin-node.service; enabled; vendor preset: disabled) Active: inactive (dead) Docs: man:munin-node [root@opensourceecology init.d]# systemctl start munin-node [root@opensourceecology init.d]# systemctl status munin-node ● munin-node.service - Munin Node Server. Loaded: loaded (/usr/lib/systemd/system/munin-node.service; enabled; vendor preset: disabled) Active: active (running) since Thu 2019-08-01 10:17:09 UTC; 2s ago Docs: man:munin-node Process: 20015 ExecStart=/usr/sbin/munin-node (code=exited, status=0/SUCCESS) Main PID: 20016 (munin-node) CGroup: /system.slice/munin-node.service └─20016 /usr/bin/perl -wT /usr/sbin/munin-node Aug 01 10:17:09 opensourceecology.org systemd[1]: Starting Munin Node Server.... Aug 01 10:17:09 opensourceecology.org systemd[1]: Started Munin Node Server.. [root@opensourceecology init.d]#
- yearly graphs are available showing the data cutting off sometime in June
- Marcin said Discourse is no replacement for Askbot, so we should go with both.
- Marcin approved my request for $100/yr for a dev server in the hetzner cloud. I'll provision a CX11 w/ 50G block storage when I get back from my upcoming vacation
Wed Jul 31, 2019
- Discussion with Tom on DR & bus factor contingency planning
- Wiki changes
Tue Jul 30, 2019
1. Stack Exchange & Askbot research
2. I told Marcin that I think Discourse is the best option, but the dependencies may break our prod server, and I asked for a budget for a dev server
3. the dev server wouldn't need to be very powerful, but it does need to have the same setup & disk as prod.
4. I checked the current disk on our prod server, and it has 145G used
 a. 34G are in /home/b2user = redundant backup data.
 b. Wow, there's also 72G in /tmp/systemd-private-2311ab4052754ae68f4a114aefa85295-httpd.service-LqLH0q/tmp/
  a. so this appears to be caused by the "PrivateTmp" feature of systemd, because many apps like httpd will create files in the 777'd /tmp dir. At OSE, I hardened php so that it writes temp files *not* in this dir, anyway. I found several guides on how to disable PrivateTmp, but preventing apache from writing to a 777 dir doesn't sound so bad. https://gryzli.info/2015/06/21/centos-7-missing-phpapache-temporary-files-in-tmp-systemd-private-temp/
  b. better question: how do I just cleanup this shit? I tried `systemd-tmpfiles --clean` & `systemd-tmpfiles --remove` to no avail
[root@opensourceecology tmp]# systemd-tmpfiles --clean
[root@opensourceecology tmp]# du -sh /tmp
72G	/tmp
[root@opensourceecology tmp]# systemd-tmpfiles --remove
[root@opensourceecology tmp]# du -sh /tmp
72G	/tmp
[root@opensourceecology tmp]#
6. I also confirmed that the above cleanup *should* be being run every day anyway, via the systemd-tmpfiles-clean.timer https://unix.stackexchange.com/questions/489940/linux-files-folders-cleanup-under-tmp
[root@opensourceecology tmp]# systemctl list-timers
NEXT                         LEFT      LAST                         PASSED               UNIT                          ACTIVATES
n/a                          n/a       Sat 2019-06-22 03:11:54 UTC  1 months 7 days ago  systemd-readahead-done.timer  systemd-readahead-done
Wed 2019-07-31 03:29:18 UTC  21h left  Tue 2019-07-30 03:29:18 UTC  2h 25min ago         systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean

2 timers listed.
Pass --all to see loaded but inactive timers, too.
[root@opensourceecology tmp]#
8. to make matters worse, it does appear that we have everything on one partition
[root@opensourceecology tmp]# cat /etc/fstab proc /proc proc defaults 0 0 devpts /dev/pts devpts gid=5,mode=620 0 0 tmpfs /dev/shm tmpfs defaults 0 0 sysfs /sys sysfs defaults 0 0 /dev/md/0 none swap sw 0 0 /dev/md/1 /boot ext3 defaults 0 0 /dev/md/2 / ext4 defaults 0 0 [root@opensourceecology tmp]# mount [root@opensourceecology tmp]# mount sysfs on /sys type sysfs (rw,relatime) proc on /proc type proc (rw,relatime) devtmpfs on /dev type devtmpfs (rw,nosuid,size=32792068k,nr_inodes=8198017,mode=755) securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) tmpfs on /dev/shm type tmpfs (rw,relatime) devpts on /dev/pts type devpts (rw,relatime,gid=5,mode=620,ptmxmode=000) tmpfs on /run type tmpfs (rw,nosuid,nodev,mode=755) tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd) pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct,cpu) cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_prio,net_cls) cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb) cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) configfs on /sys/kernel/config type configfs (rw,relatime) /dev/md2 on / type ext4 (rw,relatime,data=ordered) systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=30,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10157) debugfs on /sys/kernel/debug type debugfs (rw,relatime) mqueue on /dev/mqueue type mqueue (rw,relatime) hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime) /dev/md1 on /boot type ext3 (rw,relatime,stripe=4,data=ordered) tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,size=6563484k,mode=700) tmpfs on /run/user/1005 type tmpfs (rw,nosuid,nodev,relatime,size=6563484k,mode=700,uid=1005,gid=1005) binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,relatime) [root@opensourceecology tmp]#
10. It appears there's just a ton of cachegrind files here: 444,670 files to be exact (all <1M)
[root@opensourceecology tmp]# ls -lah | grep -vi cachegrind total 72G drwxrwxrwt 2 root root 22M Jul 30 06:02 . drwx------ 3 root root 4.0K Jun 22 03:11 .. -rw-r--r-- 1 apache apache 5 Jun 22 03:12 dos-127.0.0.1 -rw-r--r-- 1 apache apache 112M Jul 30 06:02 xdebug.log [root@opensourceecology tmp]# ls -lah | grep "M" drwxrwxrwt 2 root root 22M Jul 30 06:02 . -rw-r--r-- 1 apache apache 112M Jul 30 06:02 xdebug.log [root@opensourceecology tmp]# ls -lah | grep "G" total 72G [root@opensourceecology tmp]# ls -lah | grep 'cachegrind.out' | wc -l 444670 [root@opensourceecology tmp]# pwd /tmp/systemd-private-2311ab4052754ae68f4a114aefa85295-httpd.service-LqLH0q/tmp [root@opensourceecology tmp]# date Tue Jul 30 06:04:26 UTC 2019 [root@opensourceecology tmp]#
13. These files should be deleted after 30 days, and that appears to be the case https://bugzilla.redhat.com/show_bug.cgi?id=1183684#c4
14. A quick search for xdebug shows that I enabled it for phplist; that's probably what's generating these cachegrind files. I commented out the lines enabling xdebug in the phplist apache vhost config file and gave httpd a restart. That cleared the tmp files. Now the disk usage is down to 73G used and 11M in /tmp
[root@opensourceecology tmp]# date
Tue Jul 30 06:10:18 UTC 2019
[root@opensourceecology tmp]# du -sh /tmp
11M	/tmp
[root@opensourceecology tmp]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/md2        197G   73G  115G  40% /
devtmpfs         32G     0   32G   0% /dev
tmpfs            32G     0   32G   0% /dev/shm
tmpfs            32G  865M   31G   3% /run
tmpfs            32G     0   32G   0% /sys/fs/cgroup
/dev/md1        488M  289M  174M  63% /boot
tmpfs           6.3G     0  6.3G   0% /run/user/0
tmpfs           6.3G     0  6.3G   0% /run/user/1005
[root@opensourceecology tmp]#
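if the cachegrind files ever pile up again before the vhost change takes effect, a manual purge could look something like this (hypothetical; the systemd-private-* suffix changes on every httpd restart, so the glob would need to match the current dir):
# purge cachegrind profiles older than 30 days from apache's PrivateTmp dir
find /tmp/systemd-private-*-httpd.service-*/tmp/ -name 'cachegrind.out.*' -mtime +30 -delete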
16. Ok, so that's 73 - 34 = 39G of disk usage. 39*1.3 = 51G for good measure.
17. I found this guide for using rsync and a few touch-up commands to migrate a hetzner vServer to their cloud service https://wiki.hetzner.de/index.php/How_to_migrate_vServers_to_Cloud/en
18. the cheapest hetzner cloud node with >51G is their CX31 w/ 80G disk @ 8.90 EUR/mo = 106.8 EUR/yr = $119/yr
19. ...but they also have block volume storage (where we could, for example, mount /var = 37G). Then we'd only need a 51-37 = 14G root, and we could get hetzner's cheapest cloud node = CX11 w/ 20G disk @ 2.49 EUR/mo = 29.88 EUR/yr = $33.29/yr, plus a 50G block volume for 2 EUR/mo = 24 EUR/yr = $26.74/yr. That's a total of 33.29+26.74 = $60.03/yr for a dev node
20. I asked Marcin if he could approve spending $100/yr for a dev node in the hetzner cloud.
Tue Jul 28, 2019
1. Stack Exchange research
2. WebGL research for 3d models on OSE Store Product Pages (ie: 3d printer)
Tue Jul 18, 2019
1. Marcin asked if there's any way to log activity for the Confirm Accounts extension https://www.mediawiki.org/wiki/Extension:ConfirmAccount
2. I didn't find anything in the documentation about logs, but scanning through the code showed some calls to wfDebug()
3. we already have a debug file defined, but apparently mediawiki stopped writing to it after it reached 2.1G. I created a new file where we can monitor any future issues
4. in the long-term, we should probably setup this file to logrotate and compress after, say, 1G (see the sketch below)
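a minimal sketch of such a logrotate config (the debug log path below is a guess; use whatever $wgDebugLogFile points at in LocalSettings.php):
# hypothetical /etc/logrotate.d/mediawiki_debug
cat > /etc/logrotate.d/mediawiki_debug <<'EOF'
/var/log/mediawiki/debug.log {
        size 1G
        rotate 4
        compress
        missingok
        notifempty
}
EOF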
Tue Jul 02, 2019
1. Marcin mentioned that some users are unable to request wiki accounts.
2. I was only given the info for one specific user. I manually queried the DB by their email address (see the sketch at the end of this entry). I found 0 entries in the 'wiki_user' table and 0 entries in the 'wiki_account_requests' table
3. I was able to request an account using their email address, and I confirmed that it appeared in the Special:ConfirmAccounts WUI. I deleted the row, and confirmed that it disappeared from the WUI. I re-registered (to confirm that they could as well), and deleted the row again.
4. So I can't reproduce this.
5. I emailed Marcin telling him to tell users as a short fix to try again using a different Username and Password. As a long fix, tell us:
 - The "Username" they requested
 - The "Email address" used for the request
 - The day, time, and timezone when they submitted the request, and
 - Any relevant error messages that they were given (bonus points for screenshots)
6. ...so that I can research this further
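for reference, the manual DB check in item 2 was along these lines (a sketch; the ConfirmAccount column names are from memory and the email address is a placeholder):
# hypothetical lookup of a pending account request by email (table prefix 'wiki_' per our install)
wikiDbName='osewiki_db'
wikiDbUser='osewiki_user'
wikiDbPass='CHANGEME'
email='user@example.com'   # the address the user said they registered with
mysql -u"${wikiDbUser}" -p"${wikiDbPass}" "${wikiDbName}" <<EOF
SELECT user_id, user_name, user_email FROM wiki_user WHERE user_email = '${email}';
SELECT acr_id, acr_name, acr_email FROM wiki_account_requests WHERE acr_email = '${email}';
EOF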