Maltfield Log/2019 Q4
My work log from the year 2019 Quarter 4. I intentionally made this verbose to make future admins' work easier when troubleshooting. The more keywords, error messages, etc. that are listed in this log, the more helpful it will be for the future OSE Sysadmin.
Tue Oct 08, 2019
- continuing from yesterday, I checked-up on the rsync running from prod to staging, and it appears to have stalled
75497472 100% 2.90MB/s 0:00:24 (xfer#4297, to-check=1538/7463) run/log/journal/34a04596e14a410d9f2f816d507c55ab/system@fb40211581a0421d8abbe026c6a270ac-00000000000d4a57-00058e887df34962.journal 75497472 100% 2.80MB/s 0:00:25 (xfer#4298, to-check=1537/7463) run/log/journal/34a04596e14a410d9f2f816d507c55ab/system@fb40211581a0421d8abbe026c6a270ac-00000000000e7f5a-00058ec8f2c8422b.journal 23429120 31% 2.91MB/s 0:00:17
- it's probably not a good idea to sync the /run dir..
- attempting to ssh into the server fails
user@ose:~/openvpn$ ssh osestaging1 The authenticity of host '[10.241.189.11]:32415 ([10.241.189.11]:32415)' can't be established. ECDSA key fingerprint is SHA256:HclF8ZQOjGqx+9TmwL111kZ7QxgKkoEw8g3l2YxV0gk. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '[10.241.189.11]:32415' (ECDSA) to the list of known hosts. Permission denied (publickey). user@ose:~/openvpn$
- I _can_ get into the staging server from the lxc-console on the dev server, but it doesn't look like anything is wrong with the setup of my user
[root@osestaging1 ~]# grep maltfield /etc/passwd maltfield:x:1005:1005::/home/maltfield:/bin/bash [root@osestaging1 ~]# grep maltfield /etc/shadow maltfield:TRUNCATED [root@osestaging1 ~]# grep maltfield /etc/group wheel:x:10:maltfield,crupp,tgriffing,root apache:x:48:cmota,crupp,maltfield,wp,apache,marcin maltfield:x:1005:apache sshaccess:x:1006:cmota,marcin,tgriffing,maltfield,lberezhny,crupp keepass:x:993:maltfield,marcin,cmota,crupp apache-admins:x:1012:cmota,maltfield,marcin,crupp,tgriffing,wp,apache [root@osestaging1 ~]# ls -lah /home/maltfield/.ssh total 16K drwxr-xr-x. 2 tgriffing maltfield 4.0K Jan 19 2018 . drwx------. 10 tgriffing maltfield 4.0K Oct 3 07:06 .. -rw-r--r--. 1 root root 750 Jun 20 2017 authorized_keys -rw-r--r--. 1 tgriffing tgriffing 1.1K Oct 3 13:44 known_hosts [root@osestaging1 ~]# cat /home/maltfield/.ssh/authorized_keys ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDGNYjR7UKiJSAG/AbP+vlCBqNfQZ2yuSXfsEDuM7cEU8PQNJyuJnS7m0VcA48JRnpUpPYYCCB0fqtIEhpP+szpMg2LByfTtbU0vDBjzQD9mEfwZ0mzJsfzh1Nxe86l/d6h6FhxAqK+eG7ljYBElDhF4l2lgcMAl9TiSba0pcqqYBRsvJgQoAjlZOIeVEvM1lyfWfrmDaFK37jdUCBWq8QeJ98qpNDX4A76f9T5Y3q5EuSFkY0fcU+zwFxM71bGGlgmo5YsMMdSsW+89fSG0652/U4sjf4NTHCpuD0UaSPB876NJ7QzeDWtOgyBC4nhPpS8pgjsnl48QZuVm6FNDqbXr9bVk5BdntpBgps+gXdSL2j0/yRRayLXzps1LCdasMCBxCzK+lJYWGalw5dNaIDHBsEZiK55iwPp0W3lU9vXFO4oKNJGFgbhNmn+KAaW82NBwlTHo/tOlj2/VQD9uaK5YLhQqAJzIq0JuWZWFLUC2FJIIG0pJBIonNabANcN+vq+YJqjd+JXNZyTZ0mzuj3OAB/Z5zS6lT9azPfnEjpcOngFs46P7S/1hRIrSWCvZ8kfECpa8W+cTMus4rpCd40d1tVKzJA/n0MGJjEs2q4cK6lC08pXxq9zAyt7PMl94PHse2uzDFhrhh7d0ManxNZE+I5/IPWOnG1PJsDlOe4Yqw== guttersnipe@guttersnipe [root@osestaging1 ~]#
- ssh appears to be running too
[root@osestaging1 ~]# systemctl list-units | grep -i ssh sshd.service loaded active running OpenSSH server daemon [root@osestaging1 ~]# ss -plan | grep -i ssh u_str ESTAB 0 0 * 32621 * 32622 users:(("sshd",pid=350,fd=5)) u_dgr UNCONN 0 0 * 32618 * 29344 users:(("sshd",pid=350,fd=4),("sshd",pid=348,fd=4)) u_str ESTAB 0 0 * 31143 * 0 users:(("sshd",pid=274,fd=2),("sshd",pid=274,fd=1)) u_str ESTAB 0 0 * 32622 * 32621 users:(("sshd",pid=348,fd=7)) tcp LISTEN 0 128 *:32415 *:* users:(("sshd",pid=274,fd=3)) tcp ESTAB 0 0 10.241.189.11:32415 10.241.189.10:41270 users:(("sshd",pid=350,fd=3),("sshd",pid=348,fd=3)) tcp LISTEN 0 128 [::]:32415 [::]:* users:(("sshd",pid=274,fd=4)) [root@osestaging1 ~]#
- the ssh server logs say that the client just disconnects
Oct 8 05:57:01 localhost sshd[3586]: Connection closed by 10.241.189.10 port 41334 [preauth]
- the ssh client says that the server rejected our public key
user@ose:~/openvpn$ ssh -vvv osestaging1 ... debug1: Next authentication method: publickey debug1: Offering RSA public key: /home/user/.ssh/id_rsa.ose debug3: send_pubkey_test debug3: send packet: type 50 debug2: we sent a publickey packet, wait for reply debug3: receive packet: type 51 debug1: Authentications that can continue: publickey debug2: we did not send a packet, disable method debug1: No more authentication methods to try. Permission denied (publickey). user@ose:~/openvpn$
- I did notice that the ownership of the relevant /home/maltfield/.ssh dir differs on the prod & staging servers
[maltfield@opensourceecology ~]$ ls -lahn /home/maltfield/.ssh total 16K drwxr-xr-x 2 1005 1005 4.0K Jan 19 2018 . drwx------ 10 1005 1005 4.0K Oct 3 07:06 .. -rw-r--r-- 1 0 0 750 Jun 20 2017 authorized_keys -rw-r--r-- 1 1005 1005 1.1K Oct 3 13:44 known_hosts [maltfield@opensourceecology ~]$
[root@osestaging1 ~]# ls -lahn /home/maltfield/.ssh total 16K drwxr-xr-x. 2 1000 1005 4.0K Jan 19 2018 . drwx------. 10 1000 1005 4.0K Oct 3 07:06 .. -rw-r--r--. 1 0 0 750 Jun 20 2017 authorized_keys -rw-r--r--. 1 1000 1000 1.1K Oct 3 13:44 known_hosts [root@osestaging1 ~]#
- while the passwd, group, and shadow files all match
[root@opensourceecology ~]# md5sum /etc/passwd cabf495ca12f7f32605eb764dd12c861 /etc/passwd [root@opensourceecology ~]# md5sum /etc/group 04a70553d59a646406ecb89f2f7b17b5 /etc/group [root@opensourceecology ~]# md5sum /etc/shadow 6f27deaf639ae2db1a1d94739a8bb834 /etc/shadow [root@opensourceecology ~]#
[root@osestaging1 ~]# md5sum /etc/passwd cabf495ca12f7f32605eb764dd12c861 /etc/passwd [root@osestaging1 ~]# md5sum /etc/group 04a70553d59a646406ecb89f2f7b17b5 /etc/group [root@osestaging1 ~]# md5sum /etc/shadow 6f27deaf639ae2db1a1d94739a8bb834 /etc/shadow [root@osestaging1 ~]#
- for some reason my '/home/maltfield' dir was also owned by 'tgriffing'. I was able to ssh-in again after fixing this
[root@osestaging1 ~]# chown -R maltfield:maltfield /home/maltfield/ [root@osestaging1 ~]# ls -lah /home total 52K drwxr-xr-x. 13 root root 4.0K Jul 28 2018 . dr-xr-xr-x. 20 root root 4.0K Oct 7 10:05 .. drwx------. 7 b2user b2user 4.0K Oct 7 07:46 b2user drwx------. 5 cmota cmota 4.0K Jul 14 2017 cmota drwx------. 5 crupp crupp 4.0K Aug 12 2017 crupp drwx------. 2 Flipo Flipo 4.0K Sep 20 2016 Flipo drwx------. 2 hart hart 4.0K Mar 30 2017 hart drwx------. 3 lberezhny lberezhny 4.0K Jul 20 2017 lberezhny drwx------. 10 maltfield maltfield 4.0K Oct 3 07:06 maltfield drwx------. 4 marcin marcin 4.0K Jul 6 2017 marcin drwx------. 2 not-apache not-apache 4.0K Feb 12 2018 not-apache drwx------. 5 tgriffing tgriffing 4.0K Aug 1 09:19 tgriffing drwx------. 5 wp wp 4.0K Oct 7 2017 wp [root@osestaging1 ~]#
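- side note for future admins: sshd (with its default 'StrictModes yes') silently ignores authorized_keys whenever the home dir, ~/.ssh, or the key file itself is owned by the wrong user or is group/world-writable, which matches the symptoms above (the client offered the key; the server rejected it with nothing useful in its log). A minimal ownership sanity-check, using the paths from this log:
for f in /home/maltfield /home/maltfield/.ssh /home/maltfield/.ssh/authorized_keys; do
  # print owner:group, octal mode, and path for each element sshd checks
  stat -c '%U:%G %a %n' "$f"
done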
- I re-opened the screen session for the rsync, and found that it had exited with an error
75497472 100% 2.90MB/s 0:00:24 (xfer#4297, to-check=1538/7463) run/log/journal/34a04596e14a410d9f2f816d507c55ab/system@fb40211581a0421d8abbe026c6a270ac-00000000000d4a57-00058e887df34962.journal 75497472 100% 2.80MB/s 0:00:25 (xfer#4298, to-check=1537/7463) run/log/journal/34a04596e14a410d9f2f816d507c55ab/system@fb40211581a0421d8abbe026c6a270ac-00000000000e7f5a-00058ec8f2c8422b.journal 23429120 31% 2.91MB/s 0:00:17 packet_write_wait: Connection to 10.241.189.11 port 32415: Broken pipe rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32) rsync: connection unexpectedly closed (119371 bytes received so far) [sender] rsync error: unexplained error (code 255) at io.c(605) [sender=3.0.9] real 1059m42.282s user 12m34.775s sys 3m5.253s [maltfield@opensourceecology ~]$ [maltfield@opensourceecology ~]$ [maltfield@opensourceecology ~]$ [maltfield@opensourceecology ~]$ [maltfield@opensourceecology ~]$ [maltfield@opensourceecology ~]$ [maltfield@opensourceecology ~]$ time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --rsync-path="sudo rsync" --exclude=/root --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/
- I updated the rsync command to exclude /run, and I kicked-off the rsync again
time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --rsync-path="sudo rsync" --exclude=/root --exclude=/run --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/
- ah, ffs! my internet connection here failed me, and I was silently disconnected from my ssh session with the prod node and dumped into a local shell. So I ended up kicking off this rsync not from the prod node (on which I thought I was still ssh'd), but from my personal laptop. By the time I realized it, the fucking staging server was broken!
- fucking hell, I had successfully copied 35G overnight; now I have to restore from snapshot and start over.
- I prepended a fucking hostname check to make sure this stupid shit doesn't happen again
[ "`hostname`" = "opensourceecology.org" ] && time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --rsync-path="sudo rsync" --exclude=/root --exclude=/run --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/
- I had a bunch of issues restoring from snapshot; eventually I just did an rsync of the '/var/lib/lxcsnaps/osestaging1/snap1' dir to '/var/lib/lxc/osestaging1', and I was finally successfully able to `lxc-start -n osestaging1`
- I re-did the `visudo` edit and the install of rsync, and re-initiated the rsync from prod to staging using the above command. I noticed that I forgot to exclude the backups; here's what I should use next time
[ "`hostname`" = "opensourceecology.org" ] && time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --rsync-path="sudo rsync" --exclude=/root --exclude=/run --exclude=/home/b2user/sync --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/
- while that ran, I checked our munin graphs. I nice'd & bwlimit'd the above rsync, but it's still good to check.
- there's a spike in varnish requests, which is a bit odd
- there was a shift in memory usage, but no issues there
- load spiked to ~2, but our box has 8 cores; no problems
- there was a spike in 'nice' to ~100% cpu usage; cool
- firewall throughput, eth0 traffic spiked to about the same level as our backups. excellent
- there's a huge spike in disk reads, and disk IO that's much higher than during backups; hmm
- I also noted that the apache graphs that I added some time ago are blank; I probably have to setup an apache stats vhost for munin to scrape
- munin processing graphs are also blank; hmm
- all mysql graphs are also blank
- even nginx graphs are all blank
- I also added plugins for monitoring the 'mysqld' process and the memory of a bunch of processes
[root@opensourceecology plugins]# ls apache_access if_err_eth0 mysql_slowqueries uptime varnish_memory_usage.bak apache_processes if_eth0 mysql_threads users varnish_objects apache_volume interrupts nginx_request varnish4_ varnish_objects.bak cpu irqstats nginx_status varnish_backend_traffic varnish_request_rate df load open_files varnish_backend_traffic.bak varnish_request_rate.bak df_inode memory open_inodes varnish_bad varnish_threads diskstats munin_stats postfix_mailqueue varnish_bad.bak varnish_threads.bak entropy mysql_ postfix_mailvolume varnish_expunge varnish_transfer_rates forks mysql_bytes processes varnish_expunge.bak varnish_transfer_rates.bak fw_conntrack mysql_innodb proc_pri varnish_hit_rate varnish_uptime fw_forwarded_local mysql_isam_space_ swap varnish_hit_rate.bak varnish_uptime.bak fw_packets mysql_queries threads varnish_memory_usage vmstat [root@opensourceecology plugins]# ls -lah | head -n 5 total 36K drwxr-xr-x 2 root root 4.0K Sep 7 07:37 . drwxr-xr-x 8 root root 4.0K Jun 24 16:05 .. lrwxrwxrwx 1 root root 38 Sep 7 07:36 apache_access -> /usr/share/munin/plugins/apache_access lrwxrwxrwx 1 root root 41 Sep 7 07:36 apache_processes -> /usr/share/munin/plugins/apache_processes [root@opensourceecology plugins]# ln -s /usr/share/munin/plugins/multip multiping multips multips_memory [root@opensourceecology plugins]# ln -s /usr/share/munin/plugins/multips_memory [root@opensourceecology plugins]# ln -s /usr/share/munin/plugins/ps_ ps_mysqld [root@opensourceecology plugins]#
- for the munin mysql graphs, it looks like I need to grant access for the 'munin' user
[root@opensourceecology plugin-conf.d]# munin-run --debug mysql_queries # Processing plugin configuration from /etc/munin/plugin-conf.d/amavis # Processing plugin configuration from /etc/munin/plugin-conf.d/df # Processing plugin configuration from /etc/munin/plugin-conf.d/fw_ # Processing plugin configuration from /etc/munin/plugin-conf.d/hddtemp_smartctl # Processing plugin configuration from /etc/munin/plugin-conf.d/munin-node # Processing plugin configuration from /etc/munin/plugin-conf.d/postfix # Processing plugin configuration from /etc/munin/plugin-conf.d/postgres # Processing plugin configuration from /etc/munin/plugin-conf.d/sendmail # Processing plugin configuration from /etc/munin/plugin-conf.d/zzz-ose # Setting /rgid/ruid/ to /99/99/ # Setting /egid/euid/ to /99 99/99/ # Setting up environment # Environment mysqlopts = -u munin # About to run '/etc/munin/plugins/mysql_queries' mysqladmin: connect to server at 'localhost' failed error: 'Access denied for user 'munin'@'localhost' (using password: NO)' [root@opensourceecology plugin-conf.d]#
- woah, this guide suggests that there are a ton more graphs available than just what is symlink-able https://blog.penumbra.be/2010/04/monitoring-mysql-munin-directadmin/
[root@opensourceecology plugins]# ls -lah mysql_* lrwxrwxrwx 1 root root 31 Sep 7 07:36 mysql_ -> /usr/share/munin/plugins/mysql_ lrwxrwxrwx 1 root root 36 Sep 7 07:36 mysql_bytes -> /usr/share/munin/plugins/mysql_bytes lrwxrwxrwx 1 root root 37 Sep 7 07:36 mysql_innodb -> /usr/share/munin/plugins/mysql_innodb lrwxrwxrwx 1 root root 42 Sep 7 07:36 mysql_isam_space_ -> /usr/share/munin/plugins/mysql_isam_space_ lrwxrwxrwx 1 root root 38 Sep 7 07:36 mysql_queries -> /usr/share/munin/plugins/mysql_queries lrwxrwxrwx 1 root root 42 Sep 7 07:36 mysql_slowqueries -> /usr/share/munin/plugins/mysql_slowqueries lrwxrwxrwx 1 root root 38 Sep 7 07:36 mysql_threads -> /usr/share/munin/plugins/mysql_threads [root@opensourceecology plugins]# ls -lah /usr/share/munin/plugins/mysql_* -rwxr-xr-x 1 root root 33K Mar 3 2017 /usr/share/munin/plugins/mysql_ -rwxr-xr-x 1 root root 1.8K Mar 3 2017 /usr/share/munin/plugins/mysql_bytes -rwxr-xr-x 1 root root 5.4K Mar 3 2017 /usr/share/munin/plugins/mysql_innodb -rwxr-xr-x 1 root root 5.7K Mar 3 2017 /usr/share/munin/plugins/mysql_isam_space_ -rwxr-xr-x 1 root root 2.5K Mar 3 2017 /usr/share/munin/plugins/mysql_queries -rwxr-xr-x 1 root root 1.5K Mar 3 2017 /usr/share/munin/plugins/mysql_slowqueries -rwxr-xr-x 1 root root 1.7K Mar 3 2017 /usr/share/munin/plugins/mysql_threads [root@opensourceecology plugins]# /usr/share/munin/plugins/mysql_ suggest bin_relay_log commands connections files_tables innodb_bpool innodb_bpool_act innodb_insert_buf innodb_io innodb_io_pend innodb_log innodb_rows innodb_semaphores innodb_tnx myisam_indexes network_traffic qcache qcache_mem replication select_types slow sorts table_locks tmp_tables [root@opensourceecology plugins]#
- I added all the mysql things
root@opensourceecology plugins]# ls -lah mysql_* lrwxrwxrwx 1 root root 31 Sep 7 07:36 mysql_ -> /usr/share/munin/plugins/mysql_ lrwxrwxrwx 1 root root 36 Sep 7 07:36 mysql_bytes -> /usr/share/munin/plugins/mysql_bytes lrwxrwxrwx 1 root root 37 Sep 7 07:36 mysql_innodb -> /usr/share/munin/plugins/mysql_innodb lrwxrwxrwx 1 root root 42 Sep 7 07:36 mysql_isam_space_ -> /usr/share/munin/plugins/mysql_isam_space_ lrwxrwxrwx 1 root root 38 Sep 7 07:36 mysql_queries -> /usr/share/munin/plugins/mysql_queries lrwxrwxrwx 1 root root 42 Sep 7 07:36 mysql_slowqueries -> /usr/share/munin/plugins/mysql_slowqueries lrwxrwxrwx 1 root root 38 Sep 7 07:36 mysql_threads -> /usr/share/munin/plugins/mysql_threads [root@opensourceecology plugins]# rm -rf mysql_* [root@opensourceecology plugins]# ln -sf /usr/share/munin/plugins/mysql_ mysql_ [root@opensourceecology plugins]# for i in `./mysql_ suggest`; \ > do ln -sf /usr/share/munin/plugins/mysql_ $i; done [root@opensourceecology plugins]# ls -lah mysql_* lrwxrwxrwx 1 root root 31 Oct 8 08:06 mysql_ -> /usr/share/munin/plugins/mysql_ [root@opensourceecology plugins]# ls -lah commands lrwxrwxrwx 1 root root 31 Oct 8 08:06 commands -> /usr/share/munin/plugins/mysql_ [root@opensourceecology plugins]#
- according to this guide, munin just needs a mysql user; it doesn't need any GRANTs to any databases, and that alone is sufficient http://www.mbrando.com/2007/08/06/how-to-get-your-mysql-munin-graphs-working/
create user munin@localhost identified by 'CHANGEME'; flush privileges;
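- a quick sanity-check that the grant-less user is enough (hedged; the counters munin wants come from SHOW GLOBAL STATUS, which any user with bare USAGE can run):
# should return the Queries counter without an access-denied error
mysql -u munin -p'CHANGEME' -e 'SHOW GLOBAL STATUS LIKE "Queries";'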
- and I added this stanza to /etc/munin/plugin-conf.d/zzz-ose
[mysql*]
user root
group wheel
env.mysqlopts -u munin_user -pOBFUSCATED
- test worked
[root@opensourceecology plugins]# munin-run --debug mysql_queries # Processing plugin configuration from /etc/munin/plugin-conf.d/amavis # Processing plugin configuration from /etc/munin/plugin-conf.d/df # Processing plugin configuration from /etc/munin/plugin-conf.d/fw_ # Processing plugin configuration from /etc/munin/plugin-conf.d/hddtemp_smartctl # Processing plugin configuration from /etc/munin/plugin-conf.d/munin-node # Processing plugin configuration from /etc/munin/plugin-conf.d/postfix # Processing plugin configuration from /etc/munin/plugin-conf.d/postgres # Processing plugin configuration from /etc/munin/plugin-conf.d/sendmail # Processing plugin configuration from /etc/munin/plugin-conf.d/zzz-ose # Setting /rgid/ruid/ to /99/0/ # Setting /egid/euid/ to /99 99 10/0/ # Setting up environment # Environment mysqlopts = -u munin_user -pqd2qQiFdeNGepvhv5dsQx4rVt7pRyFJ # About to run '/etc/munin/plugins/mysql_queries' delete.value 837242 insert.value 896145 replace.value 1197242 select.value 148647861 update.value 1721521 cache_hits.value 0 [root@opensourceecology plugins]#
- now for nginx, I confirmed that we do have the ability to spit out the status page
[root@opensourceecology plugins]# nginx -V 2>&1 | grep -o with-http_stub_status_module with-http_stub_status_module [root@opensourceecology plugins]#
- I tried adding a block for '/nginx_status' only accessible to '127.0.0.1', but I still got 403'd when attempting to access it via curl on the local machine
- the access logs showed it being accessed from an ipv6 address
2a01:4f8:172:209e::2 - - [08/Oct/2019:08:37:49 +0000] "GET /nginx_status HTTP/1.1" 403 162 "-" "curl/7.29.0" "-"
- I guess the request has to go out over eth0 because nginx is necessarily bound to that public ip (it's not listening on 127.0.0.1)
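- a quick way to confirm which source address nginx will see (and therefore what the acl must allow) is to force curl over v4 vs v6 and compare; a sketch:
# whichever request succeeds/fails tells us which source address hit the acl;
# the exact address appears in the access log either way
curl -4 -s -o /dev/null -w 'v4: %{http_code}\n' https://www.opensourceecology.org/nginx_status
curl -6 -s -o /dev/null -w 'v6: %{http_code}\n' https://www.opensourceecology.org/nginx_status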
- I used the following block
# stats for munin
location /nginx_status {
  stub_status on;
  access_log off;
  allow 127.0.0.1/32;
  allow 138.201.84.223/32;
  allow 138.201.84.243/32;
  allow ::1/128;
  allow 2a01:4f8:172:209e::2/128;
  allow fe80::921b:eff:fe94:7c4/128;
  deny all;
}
- and it worked!
[root@opensourceecology conf.d]# nginx -t nginx: the configuration file /etc/nginx/nginx.conf syntax is ok nginx: configuration file /etc/nginx/nginx.conf test is successful [root@opensourceecology conf.d]# service nginx reload Redirecting to /bin/systemctl reload nginx.service [root@opensourceecology conf.d]# curl https://www.opensourceecology.org/nginx_status Active connections: 1 server accepts handled requests 16063989 16063989 27383851 Reading: 0 Writing: 1 Waiting: 0 [root@opensourceecology conf.d]#
- I found that the munin nginx plugins wouldn't work unless I installed the 'perl-LWP-Protocol-https' package
[root@opensourceecology plugins]# yum install perl-LWP-Protocol-https ... Installed: perl-LWP-Protocol-https.noarch 0:6.04-4.el7 Dependency Installed: perl-Mozilla-CA.noarch 0:20130114-5.el7 Complete! [root@opensourceecology plugins]#
- I added nginx configs for both the wiki & osemain. If all is well, I'll add the configs for our other vhosts
- I didn't bother with apache for now (also, the acl will be confusing since apache sees all traffic coming from 127.0.0.1 via varnish)
- meanwhile, some of the mysql graphs are populating. good!
- and meanwhile, the rsync is still going; it's currently at "var/lib/mysql", copying our mysql databases' data. cool.
- ...
- after a few hours, I checked-up on rsync; it was stuck again
var/www/html/wiki.opensourceecology.org/htdocs/images/archive/5/5f/20170722193549!CEBPressJuneGroup.fcstd 4840012 100% 2.56MB/s 0:00:01 (xfer#344966, to-check=1043/396314) var/www/html/wiki.opensourceecology.org/htdocs/images/archive/5/5f/20170722195024!CEBPressJuneGroup.fcstd 950272 19% 879.62kB/s 0:00:04
- the vpn client appears to have disconnected, and I can't ping the staging host at all from prod
[maltfield@opensourceecology ~]$ ping 10.241.189.11 PING 10.241.189.11 (10.241.189.11) 56(84) bytes of data. ^C --- 10.241.189.11 ping statistics --- 59 packets transmitted, 0 received, 100% packet loss, time 57999ms [maltfield@opensourceecology ~]$
- I manually exited-out of the openvpn connection & reinitiated it; pings now work. After about 60 seconds, the rsync started outputting again..
- when I went to check the size of the lxc container, I was told <1G, which can't be right
[root@osedev1 lxc]# du -sh /var/lib/lxc/osestaging1 604M /var/lib/lxc/osestaging1 [root@osedev1 lxc]#
- ncdu pointed me to the snap1 dir, which is currently 48G
[root@osedev1 lxc]# du -sh /var/lib/lxcsnaps/osestaging1/snap1 48G /var/lib/lxcsnaps/osestaging1/snap1 [root@osedev1 lxc]#
- apparently this is the consequence of restoring a snapshot just by doing an rsync; the snapshot's config file has a new line that identifies the rootfs path explicitly as the snapshot's rootfs
[root@osedev1 lxc]# tail /var/lib/lxc/osestaging1/config lxc.cap.drop = mac_admin lxc.cap.drop = mac_override lxc.cap.drop = setfcap lxc.cap.drop = sys_module lxc.cap.drop = sys_nice lxc.cap.drop = sys_pacct lxc.cap.drop = sys_rawio lxc.cap.drop = sys_time lxc.hook.clone = /usr/share/lxc/hooks/clonehostname lxc.rootfs = /var/lib/lxcsnaps/osestaging1/snap1/rootfs [root@osedev1 lxc]#
- perhaps that means the snap1 dir now holds the container's *real* data, and what's left under /var/lib/lxc/osestaging1 is effectively stale
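- if so, the fix would presumably be to copy the rootfs back under the container's own dir and repoint lxc.rootfs; a rough sketch with the paths from above (untested as written):
lxc-stop -n osestaging1
rsync -a /var/lib/lxcsnaps/osestaging1/snap1/rootfs/ /var/lib/lxc/osestaging1/rootfs/
# point the container back at its own rootfs instead of the snapshot's
sed -i 's|^lxc.rootfs = .*|lxc.rootfs = /var/lib/lxc/osestaging1/rootfs|' /var/lib/lxc/osestaging1/config
lxc-start -n osestaging1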
- while rsync continued, I noted that my nginx graphs are appearing, but there's no label that differentiates the wiki from osemain's graphs
- I can see the list of variables my plugin defines by default with the `munin-run <plugin> config` command https://munin.opensourceecology.org:4443/nginx-day.html
[root@opensourceecology plugins]# munin-run nginx_www.opensourceecology.org_status config graph_title NGINX status graph_args --base 1000 graph_category nginx graph_vlabel Connections total.label Active connections total.info Active connections total.draw LINE2 reading.label Reading reading.info Reading reading.draw LINE2 writing.label Writing writing.info Writing writing.draw LINE2 waiting.label Waiting waiting.info Waiting waiting.draw LINE2 [root@opensourceecology plugins]#
- so it looks like I can set this as 'graph_title' or 'graph_info'
- I restarted munin-node and triggered the munin-cron to update the html pages
[root@opensourceecology plugins]# service munin-node restart Redirecting to /bin/systemctl restart munin-node.service [root@opensourceecology plugins]# [root@opensourceecology plugins]# sudo -u munin /usr/bin/munin-cron
- the new variables didn't affect anything, so I started grepping the logs
- unrelated, the logs complained about mysql auth failure for:
- network_traffic
- select_types
- innodb_tnx
- innodb_log
- sorts
- myisam_indexes
- qcache_mem
- innodb_io
- connections
- qcache
- innodb_insert_buf
- replication
- bin_relay_log
- mysql_queries
- innodb_rows
- innodb_bpool_act
- files_tables
- commands
- innodb_bpool
- tmp_tables
- innodb_semaphores
- innodb_io_pend
- table_locks
- slow
- but there was nothing related to nginx
- I tried overriding the graph_title in the plugins, but it didn't work
- I found the datafile for munin in /var/lib/munin/datafile. This is clearly where the graph title is defined before being generated into html files
[root@opensourceecology plugins]# grep nginx /var/lib/munin/datafile | grep -i graph_title localhost;localhost:nginx_wiki_opensourceecology_org_request.graph_title Nginx requests localhost;localhost:nginx_wiki_opensourceecology_org_status.graph_title NGINX status localhost;localhost:nginx_www_opensourceecology_org_status.graph_title NGINX status localhost;localhost:nginx_www_opensourceecology_org_request.graph_title Nginx requests [root@opensourceecology plugins]#
- I found that I *could* override the title in /etc/munin/munin.conf https://www.aroundmyroom.com/2015/01/10/munin-help-needed/
[localhost]
address 127.0.0.1
use_node_name yes
nginx_www_opensourceecology_org_status.graph_title Nginx Status (www.opensourceecology.org)
nginx_wiki_opensourceecology_org_status.graph_title Nginx Status (wiki.opensourceecology.org)
- ...
- meanwhile, the rsync finished!
[maltfield@opensourceecology ~]$ [ "`hostname`" = "opensourceecology.org" ] && time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --rsync-path="sudo rsync" --exclude=/root --exclude=/run --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/ ... var/www/html/www.opensourceecology.org/htdocs/wp-includes/widgets/class-wp-widget-text.php 20735 100% 21.05kB/s 0:00:00 (xfer#450852, to-check=0/517755) var/yp/ sent 59229738371 bytes received 11198208 bytes 2959309.47 bytes/sec total size is 77965794338 speedup is 1.32 rsync warning: some files vanished before they could be transferred (code 24) at main.c(1052) [sender=3.0.9] real 333m37.655s user 19m50.292s sys 6m0.997s [maltfield@opensourceecology ~]$
- but I still can't ssh into it; again, my home dir is owned by the wrong user
[root@osestaging1 ~]# ls -lah /home/maltfield/.ssh total 16K drwxr-xr-x. 2 tgriffing tgriffing 4.0K Jan 19 2018 . drwx------. 10 tgriffing tgriffing 4.0K Oct 3 07:06 .. -rw-r--r--. 1 root root 750 Jun 20 2017 authorized_keys -rw-r--r--. 1 tgriffing tgriffing 1.1K Oct 3 13:44 known_hosts [root@osestaging1 ~]#
- maybe I should add the '--numeric-ids' option, since rsync appears to be re-mapping the uids?
[ "`hostname`" = "opensourceecology.org" ] && time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --numeric-ids --rsync-path="sudo rsync" --exclude=/root --exclude=/run --exclude=/home/b2user/sync --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/
- I found that the 'sync.old' dir was still trying to sync, so I updated the command to add a wildcard after the exclude; it worked
[ "`hostname`" = "opensourceecology.org" ] && time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --numeric-ids --rsync-path="sudo rsync" --exclude=/root --exclude=/run --exclude=/home/b2user/sync* --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/
- this time the double-tap (second full rsync pass) took only 3 minutes of wall time
[maltfield@opensourceecology ~]$ [ "`hostname`" = "opensourceecology.org" ] && time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --numeric-ids --rsync-path="sudo rsync" --exclude=/root --exclude=/run --exclude=/home/b2user/sync* --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/ ... var/www/html/munin/static/zoom.js 4760 100% 1.13MB/s 0:00:00 (xfer#2239, to-check=1002/321739) sent 224884435 bytes received 1668273 bytes 1352553.48 bytes/sec total size is 41283867704 speedup is 182.23 real 2m46.967s user 0m32.382s sys 0m8.095s [maltfield@opensourceecology ~]$
- this time the permissions of my home dir didn't break, and I was able to ssh-in.
- I'd like to take a snapshot of the staging server, but at this point we don't have space for it
[root@osedev1 lxc]# df -h Filesystem Size Used Avail Use% Mounted on /dev/sda1 19G 3.4G 15G 19% / devtmpfs 873M 0 873M 0% /dev tmpfs 896M 0 896M 0% /dev/shm tmpfs 896M 17M 879M 2% /run tmpfs 896M 0 896M 0% /sys/fs/cgroup /dev/mapper/ose_dev_volume_1 125G 94G 25G 80% /mnt/ose_dev_volume_1 tmpfs 180M 0 180M 0% /run/user/1000 [root@osedev1 lxc]#
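- for reference, the space math (lxc-snapshot on a plain directory-backed container copies the whole rootfs, so we'd need roughly another rootfs-worth of free space); a sketch:
du -sh /var/lib/lxc/osestaging1/rootfs   # approximate size a new snapshot would consume
df -h /var/lib/lxcsnaps                  # free space on whatever filesystem holds the snapshots
# if there were room: lxc-stop -n osestaging1 && lxc-snapshot -n osestaging1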
- ok, now, drum roll: did we break the staging server? let's try to shut it down & start it again.
- aaaaand: IT CAME BACK UP! Now it said its hostname isn't 'osestaging1' but 'opensourceecology'. Coolz.
- I was successfully able to ssh into it, but then it froze. And my attempts to login to the lxc-console all end in timeouts
opensourceecology login: maltfield Password: login: timed out after 60 seconds CentOS Linux 7 (Core) Kernel 3.10.0-957.21.3.el7.x86_64 on an x86_64 opensourceecology login:
- if I attempt to login as root, then it just times-out before it even asks me for a password
opensourceecology login: root login: timed out after 60 seconds CentOS Linux 7 (Core) Kernel 3.10.0-957.21.3.el7.x86_64 on an x86_64 opensourceecology login:
- ssh auth succeeds, but it also fails before I get a shell
... debug1: Authentication succeeded (publickey). Authenticated to 10.241.189.11 ([10.241.189.11]:32415). debug1: channel 0: new [client-session] debug3: ssh_session2_open: channel_new: 0 debug2: channel 0: send open debug3: send packet: type 90 debug1: Requesting no-more-sessions@openssh.com debug3: send packet: type 80 debug1: Entering interactive session. debug1: pledge: network
- I stopped the container again. This time when I tried to start it, I got an error
[root@osedev1 ~]# lxc-start -n -osestaging1 lxc-start: lxc_start.c: main: 290 Executing '/sbin/init' with no configuration file may crash the host [root@osedev1 ~]#
- I moved some dirs around so that I'm no longer using the 'rootfs' dir from the snaps dir, but now I get this damn message. my duckduckgo searches are dead-ends
[root@osedev1 lxc]# lxc-start -n osestaging1 lxc-start: sync.c: __sync_wake: 74 sync wake failure : Broken pipe lxc-start: start.c: __lxc_start: 1149 failed to spawn 'osestaging1' [root@osedev1 lxc]# lxc-start -P /var/lib/lxc/ -n osestaging1 lxc-start: sync.c: __sync_wake: 74 sync wake failure : Broken pipe lxc-start: start.c: __lxc_start: 1149 failed to spawn 'osestaging1' [root@osedev1 lxc]#
- I tried rebooting the dev server. after it came up, I still got the same error when attempting to `lxc-start`
- I found I could get debug logs by adding `-l log -o <file>` https://github.com/lxc/lxc/issues/1555
[root@osedev1 ~]# lxc-start -n osestaging1 -l debug -o lxc-start.log lxc-start: sync.c: __sync_wake: 74 sync wake failure : Broken pipe lxc-start: start.c: __lxc_start: 1149 failed to spawn 'osestaging1' [root@osedev1 ~]# cat lxc-start.log ...
- all the god damn google results on this "sync wake failure" shit (which are already few) are regarding configs of multiple containers sharing a network. I'll destroy the whole network namespace if needed. but how? why does nobody else encounter this damn issue?
- well, I found the source code. could be an issue with an open file descriptor or something? https://fossies.org/linux/lxc/src/lxc/sync.c
- my best guess is that it's an issue with the 'rootfs.dev' symlink
[root@osedev1 lxc]# ls -lah osestaging1 total 28K drwxrwx---. 5 root root 4.0K Oct 8 16:17 . drwxr-xr-x. 6 root root 4.0K Oct 8 16:05 .. -rw-r--r--. 1 root root 1.1K Oct 8 15:46 config drwxr-xr-x. 3 root root 4.0K Oct 8 15:46 dev drwxr-xr-x. 2 root root 4.0K Oct 8 15:52 osestaging1 dr-xr-xr-x. 20 root root 4.0K Oct 8 15:21 rootfs lrwxrwxrwx. 1 root root 38 Oct 8 16:17 rootfs.dev -> /dev/.lxc/osestaging1.72930b02843095eb -rw-r--r--. 1 root root 19 Oct 3 15:40 ts [root@osedev1 lxc]#
- I commented-out every fucking line in the config file that had the word 'dev' in it...and the system started! Except that, umm, I couldn't connect to its console?
[root@osedev1 lxc]# lxc-start -n osestaging1 -f osestaging1/config -l trace -o lxc-start.log Failed to create unit file /run/systemd/generator.late/netconsole.service: File exists Failed to create unit file /run/systemd/generator.late/network.service: File exists Running in a container, ignoring fstab device entry for /dev/disk/by-uuid/1e457b76-5100-4b53-bcdc-667ca122b941. Running in a container, ignoring fstab device entry for /dev/mapper/ose_dev_volume_1. Failed to create unit file /run/systemd/generator/systemd-cryptsetup@ose_dev_volume_1.service: File exists lxc-start: console.c: lxc_console_peer_proxy_alloc: 315 console not set up
- I found that if I commented-out the first line and added-back a rootfs line, I could get it to boot again, but I couldn't login from the console (same 60 second timeout) or ssh in (or ping it)
#lxc.mount.entry = /dev/net dev/net none bind,create=dir
...
lxc.rootfs = /var/lib/lxc/osestaging1/rootfs
- I uncommented the first line, and it still started! looks like the issue was that I didn't explicitly define a rootfs..
- this time I could ping the server from my laptop over the vpn
- I was able to login as 'maltfield' from the console, but it locked-up when I tried to `sudo su -`
- on the next reboot, I tailed all the files in /var/log from the osedev1 server (inside the staging container's rootfs dir); I saw some interesting results
==> osestaging1/rootfs/var/log/messages <== Oct 8 14:50:00 opensourceecology NET[248]: /usr/sbin/dhclient-script : updated /etc/resolv.conf Oct 8 14:50:00 opensourceecology dhclient[201]: bound to 192.168.122.201 -- renewal in 1588 seconds. Oct 8 14:50:00 opensourceecology network: Determining IP information for eth0... done. Oct 8 14:50:00 opensourceecology network: [ OK ] Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/kernel/yama/ptrace_scope': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '16' to '/proc/sys/kernel/sysrq': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '1' to '/proc/sys/kernel/core_uses_pid': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '1' to '/proc/sys/net/ipv4/conf/default/rp_filter': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '1' to '/proc/sys/net/ipv4/conf/all/rp_filter': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv4/conf/default/accept_source_route': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv4/conf/all/accept_source_route': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '1' to '/proc/sys/net/ipv4/conf/default/promote_secondaries': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '1' to '/proc/sys/net/ipv4/conf/all/promote_secondaries': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '1' to '/proc/sys/fs/protected_hardlinks': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '1' to '/proc/sys/fs/protected_symlinks': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '1' to '/proc/sys/net/ipv4/icmp_echo_ignore_broadcasts': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/default/autoconf': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/default/accept_dad': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/default/accept_ra': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/default/accept_ra_defrtr': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/default/accept_ra_rtr_pref': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/default/accept_ra_pinfo': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/default/accept_source_route': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/default/accept_redirects': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/default/forwarding': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/all/autoconf': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/all/accept_dad': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/all/accept_ra': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/all/accept_ra_defrtr': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/all/accept_ra_rtr_pref': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/all/accept_ra_pinfo': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/all/accept_source_route': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/all/accept_redirects': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/all/forwarding': Read-only file system Oct 8 14:50:01 opensourceecology systemd: Started LSB: Bring up/down networking.
- and issues with /run
Oct 8 14:50:05 opensourceecology systemd-logind: Failed to remove runtime directory /run/user/0: Device or resource busy
Mon Oct 07, 2019
- I added a comment to our long-standing feature request with the Libre Office Online CODE project for the ability to draw lines & arrows in their online version of "present" https://bugs.documentfoundation.org/show_bug.cgi?id=113386#c4
- wiki updates & logging
- I tried to login to my hetzner cloud account, but I got "Account is disabled". fucking hell, so much for user-specific auditing. I logged in with our shared account..
- I confirmed that our osedev1 node has a 20G disk + 10G volume.
- we're currently using 3.4G of 19G on osedev1; I never setup the 10G volume that appears to be at /mnt/HC_Volume_3110278. It has 10G avail
[maltfield@osedev1 ~]$ df -h Filesystem Size Used Avail Use% Mounted on /dev/sda1 19G 3.4G 15G 19% / devtmpfs 873M 0 873M 0% /dev tmpfs 896M 0 896M 0% /dev/shm tmpfs 896M 25M 871M 3% /run tmpfs 896M 0 896M 0% /sys/fs/cgroup /dev/sdb 9.8G 37M 9.3G 1% /mnt/HC_Volume_3110278 tmpfs 180M 0 180M 0% /run/user/1000 [maltfield@osedev1 ~]$ ls -lah /mnt/HC_Volume_3110278/ total 24K drwxr-xr-x. 3 root root 4.0K Aug 20 11:50 . drwxr-xr-x. 3 root root 4.0K Aug 20 12:16 .. drwx------. 2 root root 16K Aug 20 11:50 lost+found [maltfield@osedev1 ~]$
- the RAID1'd disk on prod is 197G with 75G used
[maltfield@opensourceecology ~]$ df -h Filesystem Size Used Avail Use% Mounted on /dev/md2 197G 75G 113G 40% / devtmpfs 32G 0 32G 0% /dev tmpfs 32G 8.0K 32G 1% /dev/shm tmpfs 32G 2.6G 29G 9% /run tmpfs 32G 0 32G 0% /sys/fs/cgroup /dev/md1 488M 289M 174M 63% /boot tmpfs 6.3G 0 6.3G 0% /run/user/0 tmpfs 6.3G 0 6.3G 0% /run/user/1005 [maltfield@opensourceecology ~]$
- a quick duckduck pulled up this guide for using luks to create an encrypted volume out of hetzner block volumes; this is a good idea https://angristan.xyz/how-to-use-encrypted-block-storage-volumes-hetzner-cloud/
- the guide shows a method for resizing the encrypted volume. I didn't think that would be trivial, but it appears that resize2fs can increase the size of a luks-encrypted volume without issue; this is good to know. if we run out of space (or maybe we create a second staging node or ad-hoc dev nodes), we should be able to shutdown all our lxc containers, unmount the block drive, resize it, and remount it. That said, I don't think we'll be making backups of these (dev/staging) containers, so if we fuck up, it would be bad.
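- the grow procedure, as I understand it from that guide, would look roughly like this (device/mapper names are assumptions based on our setup; newer cryptsetup may prompt for the key on resize):
# stop the lxc containers first, then:
umount /mnt/ose_dev_volume_1
cryptsetup resize ose_dev_volume_1        # extend the open dm-crypt mapping to the resized block device
e2fsck -f /dev/mapper/ose_dev_volume_1    # required before an offline grow
resize2fs /dev/mapper/ose_dev_volume_1    # grow ext4 to fill the new space
mount /mnt/ose_dev_volume_1               # assumes an fstab entry for the mountpoint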
- our 10G hetzner cloud block volume has been costing 0.48 EUR/mo = 5.76 EUR/yr
- the min needed for our current prod server is 75G. The slider on the product page has weird increments, but the actual "resize volume" option in the cloud console wui permits resizing in 1G increments. A 75G volume would cost 3.00 EUR/mo = 36 EUR/yr
- A much more sane choice would be equal to the disk on prod = 197G = 7.88 EUR/mo = 94.56 EUR/yr
- fuck, I asked Marcin for $100/yr. Currently we're spending 2.49 EUR/mo on the osedev1 instance alone. That's 29.88 EUR/yr = 32.81 USD/yr. For a 100 USD/yr budget, that leaves 67.19 USD for disk space = 61.19 EUR/yr. That's 5.09 EUR/mo, which will buy us a 127G volume at 5.08 EUR/mo.
- 127/197 = 0.64. Therefore, a 127G block volume will allow an lxc staging node to replicate our prod node until our prod node grows beyond 64% capacity. 70% is a good general high-water-mark at which we'd need to look at migrating prod anyway. This (127G) seems like a reasonable low-budget solution that meets the 100 USD/yr line.
- I resized our 10G 'ose-dev-volume-1' volume to 127G in the hetzner WUI.
- I clicked the 'enable protection' option, which prevents it from being deleted until the protection is manually removed
- the 'show configuration' window in the wui tells us that the volume is '/dev/disk/by-id/scsi-0HC_Volume_3110278' on osedev1
- on the box itself, the volume appears as /dev/sdb
[maltfield@osedev1 ~]$ mount sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel) proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) devtmpfs on /dev type devtmpfs (rw,nosuid,seclabel,size=893568k,nr_inodes=223392,mode=755) securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,seclabel) devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000) tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,mode=755) tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,seclabel,mode=755) cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd) pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,net_prio,net_cls) cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,perf_event) cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,memory) cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,pids) cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,cpuset,clone_children) cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,cpuacct,cpu) cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,blkio) cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,freezer) cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,hugetlb) cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,devices) configfs on /sys/kernel/config type configfs (rw,relatime) /dev/sda1 on / type ext4 (rw,relatime,seclabel,data=ordered) selinuxfs on /sys/fs/selinux type selinuxfs (rw,relatime) debugfs on /sys/kernel/debug type debugfs (rw,relatime) systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=27,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=11033) hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel) mqueue on /dev/mqueue type mqueue (rw,relatime,seclabel) /dev/sdb on /mnt/HC_Volume_3110278 type ext4 (rw,relatime,seclabel,discard,data=ordered) tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,seclabel,size=183308k,mode=700,uid=1000,gid=1000) [maltfield@osedev1 ~]$
- but the other name appears in fstab
[root@osedev1 ~]# cat /etc/fstab # # /etc/fstab # Created by anaconda on Sun Jul 14 04:14:25 2019 # # Accessible filesystems, by reference, are maintained under '/dev/disk' # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info # UUID=1e457b76-5100-4b53-bcdc-667ca122b941 / ext4 defaults 1 1 /dev/disk/by-id/scsi-0HC_Volume_3110278 /mnt/HC_Volume_3110278 ext4 discard,nofail,defaults 0 0 [root@osedev1 ~]#
- ah, indeed, the above disk is just a link back to /dev/sdb
[root@osedev1 ~]# ls -lah /dev/disk/by-id/scsi-0HC_Volume_3110278 lrwxrwxrwx. 1 root root 9 Oct 7 10:31 /dev/disk/by-id/scsi-0HC_Volume_3110278 -> ../../sdb [root@osedev1 ~]#
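- for future reference, `readlink -f` resolves these by-id links in one shot:
readlink -f /dev/disk/by-id/scsi-0HC_Volume_3110278   # prints /dev/sdb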
- before I rebuild this volume, the cryptsetup command raises the question: where do I store the key?
- assuming I want the server to be able to restart by itself without user interaction, the key should probably be stored in a file somewhere under '/root' on 'osedev1'. but while my OS would lock-down the permissions on that file, the key file itself would likely be stored unencrypted on some hetzner drive somewhere. Is it worth encrypting the contents of the block volume when the encryption key itself might be stored unencrypted somewhere in hetzner's datacenter?
- as a test, I ran `testdisk` to see if I could find any deleted files from previous customers in the 10G volume that hetzner gave us; I couldn't.
- someone asked about this, but there wasn't much great discussion on how hetzner provisions their disks https://serverfault.com/questions/950790/cloud-server-vulnerability-analysis?noredirect=1
- so risk assessment: when working in a cloud, we have to accept the integrity of the cloud provider. If a rogue hetzner employee wants to steal all our data, they can. There's absolutely nothing we can do about that other than building the servers ourselves and physically locking them down. The decision to use hetzner predates me, but I agree with it. It does not make sense for OSE to buy a server rack and host our equipment at FeF. So, I accept the risk and trust that hetzner will not do something malicious that puts our data at risk
- the real concern here is that we resize our volume (or hetzner in the background shuffles some abstracted blocks around physical devices in a way that's black-boxed to us), and a different customer suddenly gets, for example, our users' PII in their new volume. Or a malicious hetzner cloud user triggers some shuffling and successfully exfiltrates our data from their cloud without breaking into our server. This is the risk that we're trying to prevent. In this case, I think it *is* worthwhile to encrypt our block volume. The chances that someone is able to get chunks of our data from an old 127G block volume that lacked encryption are significantly higher than them getting those chunks *and* the key from our server *and* being able to use the key to extract meaningful data from the likely non-contiguous bits that may be extracted from our recycled block volume data.
- hetzner does not have a clean record, but hardly anybody does. That incident was only customer account data, though, not their customers' server contents https://mybroadband.co.za/news/cloud-hosting/279181-hetzner-client-data-exposed-after-attack.html
- so, while recognizing that it has limitations, I also recognize that there are sufficient benefits to justify encrypting this block volume with a key stored unencrypted on our cloud instance
- meanwhile, I found a guide for how to migrate the contents of /var to a block volume. It suggested doing so from a rescue disk, then editing fstab for the next reboot https://serverfault.com/questions/947732/how-to-add-hetzner-cloud-disk-volume-to-extend-var-partition
- I created a new key file on my laptop, stored it in our shared keepass, and uploaded it to the server at /root/keys/ose-dev-volume-1.201910.key
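- the rough build plan for the encrypted volume, following the angristan guide above (a sketch; device and mapper names are taken from this log, and luksFormat will destroy the existing ext4 fs):
key=/root/keys/ose-dev-volume-1.201910.key
dev=/dev/disk/by-id/scsi-0HC_Volume_3110278
cryptsetup luksFormat "${dev}" "${key}"                            # WARNING: wipes the volume
cryptsetup luksOpen --key-file "${key}" "${dev}" ose_dev_volume_1
mkfs.ext4 /dev/mapper/ose_dev_volume_1
mkdir -p /mnt/ose_dev_volume_1
mount /dev/mapper/ose_dev_volume_1 /mnt/ose_dev_volume_1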
- let's shutdown osedev1 and migrate its /var/ to a block volume. First I'll shutdown the osestaging1 staging lxc container, then the host osedev1
[root@osedev1 ~]# lxc-stop -n osestaging1 [root@osedev1 ~]# shutdown -h now Connection to 195.201.233.113 closed by remote host. Connection to 195.201.233.113 closed. user@ose:~$
- I confirmed that the server was off in the hetzner cloud console wui
- I clicked on the server. I'm not clear if I should mount a rescue disk or click the "rescue" option. No idea what the latter is, so I navigated to "ISO IMAGES", found SystemRescueCD, and clicked the "MOUNT" button next to it. I went back to the "servers"
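- for the record, the rough /var migration plan from the rescue environment, per the serverfault guide above (a sketch; device names assumed from earlier in this log):
# from the SystemRescueCD shell:
mount /dev/sda1 /mnt/root                                        # osedev1's root fs
cryptsetup luksOpen --key-file ose-dev-volume-1.201910.key /dev/sdb ose_dev_volume_1
mkdir -p /mnt/newvar
mount /dev/mapper/ose_dev_volume_1 /mnt/newvar
rsync -aHAX /mnt/root/var/ /mnt/newvar/
mv /mnt/root/var /mnt/root/var.old && mkdir /mnt/root/var        # keep a fallback until verified
# then add the /var mount to /mnt/root/etc/fstab (plus the luksOpen at boot) and reboot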
- wiki updates & logging
- I tried to login to my hetzner cloud account, but I got "Account is disabled" fucking hell. so much for user-specific auditing. I logged-in with our shared account..
- I confirmed that our osedev1 node has a 20G disk + 10G volume.
- we currently are using 3.4/19G on osedev1; I never setup the 10G volume that appears to be at /mnt/HC_Volume_3110278. It has 10G avail
[maltfield@osedev1 ~]$ df -h Filesystem Size Used Avail Use% Mounted on /dev/sda1 19G 3.4G 15G 19% / devtmpfs 873M 0 873M 0% /dev tmpfs 896M 0 896M 0% /dev/shm tmpfs 896M 25M 871M 3% /run tmpfs 896M 0 896M 0% /sys/fs/cgroup /dev/sdb 9.8G 37M 9.3G 1% /mnt/HC_Volume_3110278 tmpfs 180M 0 180M 0% /run/user/1000 [maltfield@osedev1 ~]$ ls -lah /mnt/HC_Volume_3110278/ total 24K drwxr-xr-x. 3 root root 4.0K Aug 20 11:50 . drwxr-xr-x. 3 root root 4.0K Aug 20 12:16 .. drwx------. 2 root root 16K Aug 20 11:50 lost+found [maltfield@osedev1 ~]$
- the disk RAID1'd disk on prod is 197G with 75G used
[maltfield@opensourceecology ~]$ df -h Filesystem Size Used Avail Use% Mounted on /dev/md2 197G 75G 113G 40% / devtmpfs 32G 0 32G 0% /dev tmpfs 32G 8.0K 32G 1% /dev/shm tmpfs 32G 2.6G 29G 9% /run tmpfs 32G 0 32G 0% /sys/fs/cgroup /dev/md1 488M 289M 174M 63% /boot tmpfs 6.3G 0 6.3G 0% /run/user/0 tmpfs 6.3G 0 6.3G 0% /run/user/1005 [maltfield@opensourceecology ~]$
- a quick duckduck pulled up this guide for using luks to create an encrypted volume out of hetzner block volumes; this is a good idea https://angristan.xyz/how-to-use-encrypted-block-storage-volumes-hetzner-cloud/
- the guide shows a method for resizing the encrypted volume. I didn't think that would be trivial, but it appears that resize2fs can increase the size of a luks-encrypted volume without issue. this is good to know. if we run out of space (or maybe we create a second staging node or ad-hoc dev nodes), we should be able to shutdown all our lxc containers, unmount the block drive, resize it, and remount it. That said, I don't think we'll be making backups of these (dev/staging) containers, so if we fuck up it would be bad.
- our 10G hetzner cloud block volume has been costing 0.48 EUR/mo = 5.76 EUR/yr
- the min needed for our current prod server is 75G. The slider on the product page has weird increments, but the actual "resize volume" option in the cloud console wui permits resizing in 1G increments. A 75G volume would cost 3.00 EUR/mo = 35 EUR/yr
- A much more sane choice would be equal to the disk on prod = 197G = 7.88 EUR/mo = 94.56 EUR/yr
- fuck, I asked Marcin for $100/yr. Currently we're spending 2.49/mo on the osedev1 instance alone. That's 29.88 EUR/yr = 32.81 USD/yr. For a 100 USD/yr budget, that leaves 67.19 USD for disk space = 61.19 EUR/yr. That's 5.09 EUR/mo, which will buy us a 127G volume at 5.08 EUR/mo.
- 127/197 = 0.64. Therefore, a 127G block volume will allow for an lxc staging node to replicate our prod node until our prod node grows beyond 64% capacity. 70% is a good general high-water-mark at which we'd need to look at migrating prod anyway. This (127G) seems like a resonable low-budget solution that meets the 100 USD/yr line.
- I resized our 10G 'ose-dev-volume-1' volume to 127G in the hetzner WUI.
- I clicked the 'enable protection' option, which prevents it from being deleted until the protection is manually removed
- the 'show configuration' window in the wui tells us that the volume is '/dev/disk/by-id/scsi-0HC_Volume_3110278' on osedev1
- the box itself looks like it's really /dev/sdb
[maltfield@osedev1 ~]$ mount sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel) proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) devtmpfs on /dev type devtmpfs (rw,nosuid,seclabel,size=893568k,nr_inodes=223392,mode=755) securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,seclabel) devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000) tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,mode=755) tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,seclabel,mode=755) cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd) pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,net_prio,net_cls) cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,perf_event) cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,memory) cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,pids) cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,cpuset,clone_children) cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,cpuacct,cpu) cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,blkio) cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,freezer) cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,hugetlb) cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,devices) configfs on /sys/kernel/config type configfs (rw,relatime) /dev/sda1 on / type ext4 (rw,relatime,seclabel,data=ordered) selinuxfs on /sys/fs/selinux type selinuxfs (rw,relatime) debugfs on /sys/kernel/debug type debugfs (rw,relatime) systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=27,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=11033) hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel) mqueue on /dev/mqueue type mqueue (rw,relatime,seclabel) /dev/sdb on /mnt/HC_Volume_3110278 type ext4 (rw,relatime,seclabel,discard,data=ordered) tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,seclabel,size=183308k,mode=700,uid=1000,gid=1000) [maltfield@osedev1 ~]$
- but the other name appears in fstab
[root@osedev1 ~]# cat /etc/fstab # # /etc/fstab # Created by anaconda on Sun Jul 14 04:14:25 2019 # # Accessible filesystems, by reference, are maintained under '/dev/disk' # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info # UUID=1e457b76-5100-4b53-bcdc-667ca122b941 / ext4 defaults 1 1 /dev/disk/by-id/scsi-0HC_Volume_3110278 /mnt/HC_Volume_3110278 ext4 discard,nofail,defaults 0 0 [root@osedev1 ~]#
- ah, indeed, the above disk is just a link back to /dev/sdb
[root@osedev1 ~]# ls -lah /dev/disk/by-id/scsi-0HC_Volume_3110278 lrwxrwxrwx. 1 root root 9 Oct 7 10:31 /dev/disk/by-id/scsi-0HC_Volume_3110278 -> ../../sdb [root@osedev1 ~]#
- before I rebuild this volume, the cryptsetup command raises the question: where do I store the key?
- assuming I want the server to be able to restart by itself without user interaction, the key should probably be stored in a file somewhere on '/root' on 'osedev1' but while my OS would lock-down the permissions to that file, the key file itself would likely be stored unencrypted on some hetzner drive somewhere. Is it worth encrypting the contents of the block volume when the encryption key itself might be stored unencrypted somewhere at hetzner's datacenter?
- as a test, I ran `testdisk` to see if I could find any deleted files from previous customers in the 10G volume that hetzner gave us; I couldn't.
- someone asked about this, but there wasn't much great discussion on how hetzner provisions their disks https://serverfault.com/questions/950790/cloud-server-vulnerability-analysis?noredirect=1
- so, risk assessment: when working in a cloud, we have to accept the integrity of the cloud provider. If a rogue hetzner employee wants to steal all our data, they can; there's absolutely nothing we can do about that other than building the servers ourselves and physically locking them down. The decision to use hetzner predates me, but I agree with it. It does not make sense for OSE to buy a server rack and host our equipment at FeF. So I accept the risk and trust that hetzner will not do something malicious that would put our data at risk
- the real concern here is that we resize our volume (or hetzner in the background shuffles some abstracted blocks around physical devices in a way that's black-boxed to us), and a different customer suddenly gets, for example, our users' PII in their new volume. Or a malicious hetzner cloud user triggers some shuffling and successfully exfiltrates our data from their cloud without breaking into our server. This is the risk that we're trying to prevent. In this case, I think it *is* worthwhile to encrypt our block volume. The chances that someone is able to get chunks of our data from an old 127G block volume that lacked encryption are significantly higher than them being able to get those chunks *and* the key from our server *and* be able to use the key to extract meaningful data from the likely non-contiguous bits that may be extracted from our recycled block volume data.
- hetzner does not have a clean record, but hardly anybody does. That incident exposed only customer account data, though, not their customers' server contents https://mybroadband.co.za/news/cloud-hosting/279181-hetzner-client-data-exposed-after-attack.html
- so, while recognizing that it has limitations, I also recognize that there are sufficient benefits to justify encrypting this block volume with a key stored unencrypted on our cloud instance
- meanwhile, I found a guide for how to migrate the contents of /var to a block volume. It suggested doing so from a rescue disk, then editing fstab for the next reboot https://serverfault.com/questions/947732/how-to-add-hetzner-cloud-disk-volume-to-extend-var-partition
- I created a new key file on my laptop, stored it in our shared keepass, and uploaded it to the server at /root/keys/ose-dev-volume-1.201910.key
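- the session above only records the key's upload destination; a minimal sketch of one way such a keyfile can be generated and locked down (the dd/chmod/chown commands are my reconstruction, not from the log; the 4K size just matches the other keys in this log):
mkdir -p /root/keys
dd if=/dev/urandom of=/root/keys/ose-dev-volume-1.201910.key bs=4096 count=1
chown root:root /root/keys/ose-dev-volume-1.201910.key
chmod 0400 /root/keys/ose-dev-volume-1.201910.key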
- let's shut down osedev1 and migrate its /var/ to a block volume. First I'll shut down the osestaging1 staging lxc container, then the host osedev1
[root@osedev1 ~]# lxc-stop -n osestaging1 [root@osedev1 ~]# shutdown -h now Connection to 195.201.233.113 closed by remote host. Connection to 195.201.233.113 closed. user@ose:~$
- I confirmed that the server was off in the hetzner cloud console wui
- I clicked on the server. I'm not clear if I should mount a rescue disk or click the "rescue" option. No idea what the latter is, so I navigated to "ISO IMAGES", found SystemRescueCD, and clicked the "MOUNT" button next to it. I went back to the "servers" page, opened a console for 'osedev1', and clicked "Power on"
- the console showed the boot options for the rescue cd. I chose the first menu item = "SystemRescueCd: default boot options"
- I can't copy & paste from the console, but I basically found 5x items in /dev/disk/by-id/
- the DVD for systemrescue
- my 127G block volume with the same name shown above (scsi-0HC_Volume_3110278)
- scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-0
- scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-0-part1
- another DVD?
- so 3 & 4 must be our osedev1 disk. Both are 19.1G
- attempting to mount the one without '-part1' failed, but the one with '-part1' succeeded, and all my data was there. It was mounted to '/mnt/osedev1-part/'
- I formatted the new 127G ebs volume using cryptsetup
cryptsetup luksFormat /dev/disk/by-id/scsi-0HC_Volume_3110278 /mnt/osedev1/root/keys/ose-dev-volume-1.201910.key
- I opened the new encrypted luks volume and created an ext4 filesystem on it
cryptsetup luksOpen --key-file /mnt/osedev1/root/keys/ose-dev-volume-1.201910.key /dev/disk/by-id/scsi-0HC_Volume_3110278 ebs
mkfs.ext4 -j /dev/mapper/ebs
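- a few optional sanity checks at this point (I didn't run these in the session above, but they're standard cryptsetup/blkid invocations):
cryptsetup luksDump /dev/disk/by-id/scsi-0HC_Volume_3110278    # show the LUKS header & keyslots
cryptsetup status ebs                                          # confirm the 'ebs' mapping is active
blkid /dev/mapper/ebs                                          # should report TYPE="ext4" after the mkfs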
- I mounted the new FS & began syncing osedev1's 'var' dir (now only 2.3G) to it
mkdir /mnt/ebs
mount /dev/mapper/ebs /mnt/ebs
rsync -av --progress /mnt/osedev1/var /mnt/ebs/
- I added entries to /etc/fstab & /etc/crypttab to auto-mount the volume to /mnt/ose_dev_volume_1/ (sketched below)
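- I didn't paste the exact entries; reconstructed from the device path, key location, and the 'ose_dev_volume_1' mapper name visible in the df output below, they would look roughly like this (a reconstruction, not verbatim):
# /etc/crypttab
ose_dev_volume_1 /dev/disk/by-id/scsi-0HC_Volume_3110278 /root/keys/ose-dev-volume-1.201910.key luks
# /etc/fstab
/dev/mapper/ose_dev_volume_1 /mnt/ose_dev_volume_1 ext4 defaults,nofail 0 0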
- I moved the existing /var/ dir to /var.old and made a symlink from /var/ to /mnt/ose_dev_volume_1/var
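- roughly, from the rescue environment (a sketch; the old root was mounted at /mnt/osedev1, and the absolute symlink target only resolves once the real system boots):
mv /mnt/osedev1/var /mnt/osedev1/var.old
ln -s /mnt/ose_dev_volume_1/var /mnt/osedev1/var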
- I safely unmounted & closed all the disks and shut down
- I removed the systemrescue iso from the server and started it up again
- I was able to ssh in, and the new '/var/' dir *appeared* to be set up properly
[maltfield@osedev1 /]$ ls -lah /var lrwxrwxrwx. 1 root root 25 Oct 7 13:47 /var -> /mnt/ose_dev_volume_1/var [maltfield@osedev1 /]$ ls -lah /var/ total 80K drwxr-xr-x. 19 root root 4.0K Jul 14 06:18 . drwxr-xr-x. 4 root root 4.0K Oct 7 13:22 .. drwxr-xr-x. 2 root root 4.0K Apr 11 2018 adm drwxr-xr-x. 7 root root 4.0K Oct 2 14:24 cache drwxr-xr-x. 2 root root 4.0K Apr 24 16:03 crash drwxr-xr-x. 3 root root 4.0K Jul 14 06:15 db drwxr-xr-x. 3 root root 4.0K Jul 14 06:15 empty drwxr-xr-x. 2 root root 4.0K Apr 11 2018 games drwxr-xr-x. 2 root root 4.0K Apr 11 2018 gopher drwxr-xr-x. 3 root root 4.0K Jul 14 06:14 kerberos drwxr-xr-x. 34 root root 4.0K Oct 2 15:34 lib drwxr-xr-x. 2 root root 4.0K Apr 11 2018 local lrwxrwxrwx. 1 root root 11 Jul 14 06:14 lock -> ../run/lock drwxr-xr-x. 11 root root 4.0K Oct 7 13:49 log lrwxrwxrwx. 1 root root 10 Jul 14 06:14 mail -> spool/mail drwxr-xr-x. 2 root root 4.0K Apr 11 2018 nis drwxr-xr-x. 2 root root 4.0K Apr 11 2018 opt drwxr-xr-x. 2 root root 4.0K Apr 11 2018 preserve lrwxrwxrwx. 1 root root 6 Jul 14 06:14 run -> ../run drwxr-xr-x. 8 root root 4.0K Oct 3 08:06 spool drwxrwxrwt. 4 root root 4.0K Oct 7 13:49 tmp -rw-r--r--. 1 root root 163 Jul 14 06:14 .updated drwxr-xr-x. 2 root root 4.0K Apr 11 2018 yp [maltfield@osedev1 /]$
- but I immediately noticed that, for example, screen wasn't working
[maltfield@osedev1 /]$ screen -S ebs Cannot make directory '/var/run/screen': No such file or directory [maltfield@osedev1 /]$
- oh, damn, '/var/run' is a relative symlink to '../run', which won't work now that /var lives on the block volume (it resolves to /mnt/ose_dev_volume_1/run, which doesn't exist)
[maltfield@osedev1 /]$ ls -lah /var/run lrwxrwxrwx. 1 root root 6 Jul 14 06:14 /var/run -> ../run [maltfield@osedev1 /]$
- I made it an absolute symlink instead
[root@osedev1 var]# rm -rf lock [root@osedev1 var]# rm -rf run [root@osedev1 var]# ln -s /run [root@osedev1 var]# ln -s /run/lock [root@osedev1 var]# ls -lah run lrwxrwxrwx. 1 root root 4 Oct 7 13:54 run -> /run [root@osedev1 var]# ls -lah lock lrwxrwxrwx. 1 root root 9 Oct 7 13:54 lock -> /run/lock [root@osedev1 var]#
- it still failed, but everything looked ok; I gave the system a reboot
- when the system came back up, `screen` had no issues, and everything looked good.
[maltfield@osedev1 ~]$ screen -ls There is a screen on: 4362.ebs (Attached) 1 Socket in /var/run/screen/S-maltfield. [maltfield@osedev1 ~]$ sudo su - Last login: Mon Oct 7 13:54:28 CEST 2019 on pts/0 [root@osedev1 ~]# df -h Filesystem Size Used Avail Use% Mounted on /dev/sda1 19G 3.4G 15G 19% / devtmpfs 873M 0 873M 0% /dev tmpfs 896M 0 896M 0% /dev/shm tmpfs 896M 17M 879M 2% /run tmpfs 896M 0 896M 0% /sys/fs/cgroup /dev/mapper/ose_dev_volume_1 125G 2.5G 116G 3% /mnt/ose_dev_volume_1 tmpfs 180M 0 180M 0% /run/user/1000 [root@osedev1 ~]# ls -lah /var lrwxrwxrwx. 1 root root 25 Oct 7 13:47 /var -> /mnt/ose_dev_volume_1/var [root@osedev1 ~]# ls -lah /mnt/ose_dev_volume_1/ total 28K drwxr-xr-x. 4 root root 4.0K Oct 7 13:22 . drwxr-xr-x. 4 root root 4.0K Oct 7 13:46 .. drwx------. 2 root root 16K Oct 7 13:18 lost+found drwxr-xr-x. 19 root root 4.0K Oct 7 13:54 var [root@osedev1 ~]#
- I started the staging server, connected to the vpn from my laptop, and was successfully able to ssh into it (though with a long delay)
- I ssh'd into prod and kicked-off the rsync!
time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --rsync-path="sudo rsync" --exclude=/root --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/
- that also copied the old backups, which is probably unnecessary. I should also exclude the following (example flag below):
- home/b2user/sync
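- i.e. on the next run, add another exclude flag to the rsync invocation above, e.g.:
--exclude=/home/b2user/sync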
- this sync is going at a rate of about 1G every 5 minutes. I expect it'll be done in 5-10 hours. I'll check on it tomorrow.
Sat Oct 05, 2019
Fri Oct 04, 2019
Thu Oct 03, 2019
- continuing from yesterday, I copied the dev-specific encryption key from our shared keepass for the backups to the dev node
[root@osedev1 backups]# mv /home/maltfield/ose-dev-backups-cron.201910.key /root/backups/ [root@osedev1 backups]# chown root:root ose-dev-backups-cron.201910.key [root@osedev1 backups]# chmod 0400 ose-dev-backups-cron.201910.key [root@osedev1 backups]# ls -lah total 32K drwxr-xr-x. 4 root root 4.0K Oct 3 07:09 . dr-xr-x---. 7 root root 4.0K Oct 3 07:03 .. -rw-r--r--. 1 root root 747 Oct 2 15:57 backup.settings -rwxr-xr-x. 1 root root 5.7K Oct 3 07:03 backup.sh drwxr-xr-x. 3 root root 4.0K Sep 9 09:02 iptables -r--------. 1 root root 4.0K Oct 3 07:05 ose-dev-backups-cron.201910.key drwxr-xr-x. 2 root root 4.0K Oct 3 07:04 sync [root@osedev1 backups]#
- note that I also had to install `trickle` on the dev node
[root@osedev1 backups]# ./backup.sh ================================================================================ INFO: Beginning Backup Run on 20191003_051037 INFO: Cleaning up old backup files ... INFO: moving encrypted backup file to b2user's sync dir INFO: Beginning upload to backblaze b2 sudo: /bin/trickle: command not found real 0m0.030s user 0m0.009s sys 0m0.021s [root@osedev1 backups]# yum install trickle ... Installed: trickle.x86_64 0:1.07-19.el7 Complete! [root@osedev1 backups]#
- note that something changed in the install process of the b2 CLI that required me to use the '--user' flag, which changed the path to the b2 binary. To keep the mods to the backup.sh script minimal, I just created a symlink
[root@osedev1 backups]# ./backup.sh ... + echo 'INFO: Beginning upload to backblaze b2' INFO: Beginning upload to backblaze b2 + /bin/sudo -u b2user /bin/trickle -s -u 3000 /home/b2user/virtualenv/bin/b2 upload-file --noProgress --threads 1 ose-dev-server-backups /home/b2user/sync/daily_osedev120191003_051511.tar.gpg daily_osedev120191003_051511.tar.gpg trickle: exec(): No such file or directory real 0m0.040s user 0m0.012s sys 0m0.020s + exit 0 [root@osedev1 backups]# /bin/sudo -u b2user /bin/trickle -s -u 3000 /home/b2user/virtualenv/bin/b2 upload-file --noProgress --threads 1 ose-dev-server-backups /home/b2user/sync/daily_osedev120191003_051511.tar.gpg daily_osedev120191003_051511.tar.gpg trickle: exec(): No such file or directory [root@osedev1 b2user]# ln -s /home/b2user/.local/bin/b2 /home/b2user/virtualenv/bin/b2 [root@osedev1 b2user]#
- the backup script still failed at the upload to b2
[root@osedev1 backups]# ./backup.sh ... INFO: Beginning upload to backblaze b2 + /bin/sudo -u b2user /bin/trickle -s -u 3000 /home/b2user/virtualenv/bin/b2 upload-file --noProgress --threads 1 ose-dev-server-backups /home/b2user/sync/daily_osedev1_20191003_052059.tar.gpg daily_osedev1_20191003_052059.tar.gpg ERROR: Missing account data: 'NoneType' object has no attribute '__getitem__' Use: b2 authorize-account real 0m0.363s user 0m0.281s sys 0m0.076s + exit 0 [root@osedev1 b2user]# [root@osedev1 b2user]# /bin/sudo -u b2user /bin/trickle -s -u 3000 /home/b2user/virtualenv/bin/b2 upload-file --noProgress --threads 1 ose-dev-server-backups /home/b2user/sync/daily_osedev1_20191003_052059.tar.gpg daily_osedev1_20191003_052059.tar.gpg ERROR: Missing account data: 'NoneType' object has no attribute '__getitem__' Use: b2 authorize-account [root@osedev1 b2user]#
- per the error, I used `b2 authorize-account` and added my creds for the user 'b2user'
[root@osedev1 b2user]# su - b2user Last login: Wed Oct 2 16:15:28 CEST 2019 on pts/8 [b2user@osedev1 ~]$ .local/bin/b2 authorize-account Using https://api.backblazeb2.com Backblaze application key ID: XXXXXXXXXXXXXXXXXXXXXXXXX Backblaze application key: [b2user@osedev1 ~]$
- this time the backup succeeded!
[root@osedev1 b2user]# /root/backups/backup.sh ... INFO: moving encrypted backup file to b2user's sync dir + /bin/mv /root/backups/sync/daily_osedev1_20191003_052448.tar.gpg /home/b2user/sync/daily_osedev1_20191003_052448.tar.gpg + /bin/chown b2user /home/b2user/sync/daily_osedev1_20191003_052448.tar.gpg + echo 'INFO: Beginning upload to backblaze b2' INFO: Beginning upload to backblaze b2 + /bin/sudo -u b2user /bin/trickle -s -u 3000 /home/b2user/virtualenv/bin/b2 upload-file --noProgress --threads 1 ose-dev-server-backups /home/b2user/sync/daily_osedev1_20191003_052448.tar.gpg daily_osedev1_20191003_052448.tar.gpg URL by file name: https://f001.backblazeb2.com/file/ose-dev-server-backups/daily_osedev1_20191003_052448.tar.gpg URL by fileId: https://f001.backblazeb2.com/b2api/v2/b2_download_file_by_id?fileId=4_z2675c17c55dd1d696edd0118_f1082387e9ca2c0d4_d20191003_m052459_c001_v0001109_t0038 { "action": "upload", "fileId": "4_z2675c17c55dd1d696edd0118_f1082387e9ca2c0d4_d20191003_m052459_c001_v0001109_t0038", "fileName": "daily_osedev1_20191003_052448.tar.gpg", "size": 17233113, "uploadTimestamp": 1570080299000 } real 0m26.435s user 0m0.706s sys 0m0.251s + exit 0 [root@osedev1 b2user]#
- as an out-of-band restore validation, I downloaded the 17.2M backup file from the backblaze b2 wui onto my laptop
- again, I downloaded the encryption key from our shared keepass
user@disp5653:~/Downloads$ gpg --batch --passphrase-file ose-dev-backups-cron.201910.key --output daily_osedev1_20191003_052448.tar ose-dev-backups-cron.201910.key gpg: WARNING: no command supplied. Trying to guess what you mean ... gpg: no valid OpenPGP data found. gpg: processing message failed: Unknown system error user@disp5653:~/Downloads$ gpg --batch --passphrase-file ose-dev-backups-cron.201910.key --output daily_osedev1_20191003_052448.tar daily_osedev1_20191003_052448.tar.gpg gpg: WARNING: no command supplied. Trying to guess what you mean ... gpg: AES256 encrypted data gpg: encrypted with 1 passphrase user@disp5653:~/Downloads$ tar -xf daily_osedev1_20191003_052448.tar user@disp5653:~/Downloads$ ls daily_osedev1_20191003_052448.tar ose-dev-backups-cron.201910.key daily_osedev1_20191003_052448.tar.gpg root user@disp5653:~/Downloads$ find root/backups/sync/daily_osedev1_20191003_052448/ -type f root/backups/sync/daily_osedev1_20191003_052448/www/www.20191003_052448.tar.gz root/backups/sync/daily_osedev1_20191003_052448/root/root.20191003_052448.tar.gz root/backups/sync/daily_osedev1_20191003_052448/log/log.20191003_052448.tar.gz root/backups/sync/daily_osedev1_20191003_052448/etc/etc.20191003_052448.tar.gz root/backups/sync/daily_osedev1_20191003_052448/home/home.20191003_052448.tar.gz user@disp5653:~/Downloads$
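- note the 'no command supplied' warnings above: gpg guessed that I wanted a decrypt. The operation can be made explicit with the same flags plus --decrypt, e.g.:
gpg --batch --decrypt --passphrase-file ose-dev-backups-cron.201910.key --output daily_osedev1_20191003_052448.tar daily_osedev1_20191003_052448.tar.gpg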
- it looks like it's working; here's the contents of the backup file (note there are some varnish config files in here from when I did my test rsync back on Sep 9th Maltfield_Log/2019_Q3#Mon_Sep_09.2C_2019)
user@disp5653:~/Downloads$ find root/backups/sync/daily_osedev1_20191003_052448/ -type f -exec tar -tvf '{}' \; | awk '{print $6}' | cut -d/ -f 1-2 | sort -u etc/adjtime etc/aliases etc/alternatives etc/anacrontab etc/audisp etc/audit etc/bash_completion.d etc/bashrc etc/binfmt.d etc/centos-release etc/centos-release-upstream etc/chkconfig.d etc/chrony.conf etc/chrony.keys etc/cloud etc/cron.d etc/cron.daily etc/cron.deny etc/cron.hourly etc/cron.monthly etc/crontab etc/cron.weekly etc/crypttab etc/csh.cshrc etc/csh.login etc/dbus-1 etc/default etc/depmod.d etc/dhcp etc/DIR_COLORS etc/DIR_COLORS.256color etc/DIR_COLORS.lightbgcolor etc/dnsmasq.conf etc/dnsmasq.d etc/dracut.conf etc/dracut.conf.d etc/e2fsck.conf etc/environment etc/ethertypes etc/exports etc/exports.d etc/filesystems etc/firewalld etc/fstab etc/gcrypt etc/GeoIP.conf etc/GeoIP.conf.default etc/gnupg etc/GREP_COLORS etc/groff etc/group etc/group- etc/grub2.cfg etc/grub.d etc/gshadow etc/gshadow- etc/gss etc/gssproxy etc/host.conf etc/hostname etc/hosts etc/hosts.allow etc/hosts.deny etc/idmapd.conf etc/init.d etc/inittab etc/inputrc etc/iproute2 etc/iscsi etc/issue etc/issue.net etc/kdump.conf etc/kernel etc/krb5.conf etc/krb5.conf.d etc/ld.so.cache etc/ld.so.conf etc/ld.so.conf.d etc/libaudit.conf etc/libnl etc/libuser.conf etc/libvirt etc/locale.conf etc/localtime etc/login.defs etc/logrotate.conf etc/logrotate.d etc/lvm etc/lxc etc/machine-id etc/magic etc/makedumpfile.conf.sample etc/man_db.conf etc/mke2fs.conf etc/modprobe.d etc/modules-load.d etc/motd etc/mtab etc/netconfig etc/NetworkManager etc/networks etc/nfs.conf etc/nfsmount.conf etc/nsswitch.conf etc/nsswitch.conf.bak etc/numad.conf etc/openldap etc/openvpn etc/opt etc/os-release etc/pam.d etc/passwd etc/passwd- etc/pkcs11 etc/pki etc/pm etc/polkit-1 etc/popt.d etc/ppp etc/prelink.conf.d etc/printcap etc/profile etc/profile.d etc/protocols etc/python etc/qemu-ga etc/radvd.conf etc/rc0.d etc/rc1.d etc/rc2.d etc/rc3.d etc/rc4.d etc/rc5.d etc/rc6.d etc/rc.d etc/rc.local etc/redhat-release etc/request-key.conf etc/request-key.d etc/resolv.conf etc/rpc etc/rpm etc/rsyncd.conf etc/rsyslog.conf etc/rsyslog.d etc/rwtab etc/rwtab.d etc/sasl2 etc/screenrc etc/securetty etc/security etc/selinux etc/services etc/sestatus.conf etc/shadow etc/shadow- etc/shells etc/skel etc/ssh etc/ssl etc/statetab etc/statetab.d etc/subgid etc/subuid etc/sudo.conf etc/sudoers etc/sudoers.d etc/sudo-ldap.conf etc/sysconfig etc/sysctl.conf etc/sysctl.d etc/systemd etc/system-release etc/system-release-cpe etc/tcsd.conf etc/terminfo etc/timezone etc/tmpfiles.d etc/trickled.conf etc/tuned etc/udev etc/unbound etc/varnish etc/vconsole.conf etc/vimrc etc/virc etc/wpa_supplicant etc/X11 etc/xdg etc/xinetd.d etc/yum etc/yum.conf etc/yum.repos.d home/b2user home/maltfield root/anaconda-ks.cfg root/backups root/Finished root/original-ks.cfg root/Package root/pki root/Running var/log user@disp5653:~/Downloads$
- and as a true end-to-end test, I restored the sshd_config file
user@disp5653:~/Downloads/root/backups/sync/daily_osedev1_20191003_052448/etc$ pwd /home/user/Downloads/root/backups/sync/daily_osedev1_20191003_052448/etc user@disp5653:~/Downloads/root/backups/sync/daily_osedev1_20191003_052448/etc$ date Thu Oct 3 11:37:49 +0545 2019 user@disp5653:~/Downloads/root/backups/sync/daily_osedev1_20191003_052448/etc$ ls etc.20191003_052448.tar.gz user@disp5653:~/Downloads/root/backups/sync/daily_osedev1_20191003_052448/etc$ tar -xzf etc.20191003_052448.tar.gz user@disp5653:~/Downloads/root/backups/sync/daily_osedev1_20191003_052448/etc$ tail etc/ssh/sshd_config # override default of no subsystems Subsystem sftp /usr/libexec/openssh/sftp-server # Example of overriding settings on a per-user basis #Match User anoncvs # X11Forwarding no # AllowTcpForwarding no # PermitTTY no # ForceCommand cvs server user@disp5653:~/Downloads/root/backups/sync/daily_osedev1_20191003_052448/etc$
- I also copied the cron job and the backup report script to the dev node
[root@opensourceecology ~]# cat /etc/cron.d/backup_to_backblaze 20 07 * * * root time /bin/nice /root/backups/backup.sh &>> /var/log/backups/backup.log 20 04 03 * * root time /bin/nice /root/backups/backupReport.sh [root@opensourceecology ~]#
- I tried testing the backup report script, but it complained that the `mail` command was absent; otherwise it appears to be working without modifications
[root@osedev1 backups]# ./backupReport.sh ./backupReport.sh: line 90: /usr/bin/mail: No such file or directory INFO: email body below ATTENTION: BACKUPS MISSING! WARNING: First of this month's backup (20191001) is missing! WARNING: First of last month's backup (20190901) is missing! WARNING: Yesterday's backup (20191002) is missing! WARNING: The day before yesterday's backup (20191001) is missing! See below for the contents of the backblaze b2 bucket = ose-dev-server-backups daily_osedev1_20191003_052448.tar.gpg --- Note: This report was generated on 20191003_060036 UTC by script '/root/backups/backupReport.sh' This script was triggered by '/etc/cron.d/backup_to_backblaze' For more information about OSE backups, please see the relevant documentation pages on the wiki: * https://wiki.opensourceecology.org/wiki/Backblaze * https://wiki.opensourceecology.org/wiki/OSE_Server#Backups [root@osedev1 backups]#
- I installed mailx and re-ran the script
[root@osedev1 backups]# yum install mailx ... Installed: mailx.x86_64 0:12.5-19.el7 Complete! [root@osedev1 backups]#
- this time it failed because sendmail is not installed; I *could* install postfix, but I decided just to install sendmail
[root@osedev1 backups]# ./backupReport.sh ... /usr/sbin/sendmail: No such file or directory "/root/dead.letter" 30/1215 . . . message not sent. [root@osedev1 backups]# rpm -qa | grep postfix [root@osedev1 backups]# rpm -qa | grep exim [root@osedev1 backups]# yum install sendmail ... Installed: sendmail.x86_64 0:8.14.7-5.el7 Dependency Installed: hesiod.x86_64 0:3.2.1-3.el7 procmail.x86_64 0:3.22-36.el7_4.1 Complete! [root@osedev1 backups]#
- this time it ran without error, but I never got an email. This is probably because gmail is rejecting it; we don't have DNS set up properly for this server to send mail. Anyway, this is good enough for our dev node's backups for now.
- I also added the same lifecycle rules that we have for the 'ose-server-backups' bucket to the 'ose-dev-server-backups' bucket in the backblaze b2 wui
- let's proceed with getting openvpn clients configured for the prod node (and its clone, the staging node, which will use the same client cert)
- as I did on Sep 9 to create my client cert for 'maltfield', I created a new cert for 'hetzner2' Maltfield_Log/2019_Q3#Mon_Sep_09.2C_2019
- again, the ca and cert files are located in /usr/share/easy-rsa/3/pki/
- I documented this dir on the wiki OpenVPN
- interestingly, I could only execute these commands from the dir above the pki dir
[root@osedev1 pki]# /usr/share/easy-rsa/3.0.6/easyrsa build-client-full hetzner2 Easy-RSA error: EASYRSA_PKI does not exist (perhaps you need to run init-pki)? Expected to find the EASYRSA_PKI at: /usr/share/easy-rsa/3/pki/pki Run easyrsa without commands for usage and command help. [root@osedev1 pki]# [root@osedev1 pki]# cd .. [root@osedev1 3]# /usr/share/easy-rsa/3.0.6/easyrsa build-client-full hetzner2 Using SSL: openssl OpenSSL 1.0.2k-fips 26 Jan 2017 Generating a 2048 bit RSA private key .......................................................................+++ ............................................+++ writing new private key to '/usr/share/easy-rsa/3/pki/private/hetzner2.key.7F3A32KzES' Enter PEM pass phrase:
- note I appended the option 'nopass' so that the hetzner2 prod server can connect to the vpn automatically, using only a private certificate file and no password (it may be a good idea to look into whether we can whitelist a specific IP for this user, since this hetzner2 client will only connect from the prod or staging server's static ip addresses; see the sketch below)
[root@osedev1 3]# /usr/share/easy-rsa/3.0.6/easyrsa help build-client-full build-client-full <filename_base> [ cmd-opts ] build-server-full <filename_base> [ cmd-opts ] build-serverClient-full <filename_base> [ cmd-opts ] Generate a keypair and sign locally for a client and/or server This mode uses the <filename_base> as the X509 CN. cmd-opts is an optional set of command options from this list: nopass - do not encrypt the private key (default is encrypted) [root@osedev1 3]# /usr/share/easy-rsa/3.0.6/easyrsa build-client-full hetzner2 nopass Using SSL: openssl OpenSSL 1.0.2k-fips 26 Jan 2017 Generating a 2048 bit RSA private key ..................................................................................................+++ .....+++ writing new private key to '/usr/share/easy-rsa/3/pki/private/hetzner2.key.qQ1HGf7ovg' ----- Using configuration from /usr/share/easy-rsa/3/pki/safessl-easyrsa.cnf Enter pass phrase for /usr/share/easy-rsa/3/pki/private/ca.key: Check that the request matches the signature Signature ok The Subject's Distinguished Name is as follows commonName :ASN.1 12:'hetzner2' Certificate is to be certified until Sep 17 06:42:28 2022 GMT (1080 days) Write out database with 1 new entries Data Base Updated [root@osedev1 3]#
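- on that whitelist idea: openvpn doesn't restrict source IPs per-certificate out of the box, but a port-level iptables rule on osedev1 could at least limit who can reach the vpn at all. An untested sketch (138.201.84.223 is prod's primary IP from the `ip a` output below; note this would also lock out road-warrior clients like my laptop, so it's a thought experiment only):
# drop vpn handshakes except from known sources: prod's static IP
# and the libvirt bridge subnet used by the staging container
iptables -I INPUT -p udp --dport 1194 ! -s 138.201.84.223 -j DROP
iptables -I INPUT -p udp --dport 1194 -s 192.168.122.0/24 -j ACCEPT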
- I copied the necessary files to the prod server
[root@osedev1 3]# cp pki/private/hetzner2.key /home/maltfield/ [root@osedev1 3]# cp pki/issued/hetzner2.crt /home/maltfield/ [root@osedev1 3]# cp pki/private/ta.key /home/maltfield/ [root@osedev1 3]# cp pki/ca.crt /home/maltfield/ [root@osedev1 3]# chown maltfield /home/maltfield/*.cert [root@osedev1 3]# chown maltfield /home/maltfield/*.key [root@osedev1 3]# logout [maltfield@osedev1 ~]$ scp -P32415 /home/maltfield/hetzner2* opensourceecology.org: hetzner2.crt 100% 5675 2.8MB/s 00:00 hetzner2.key 100% 1708 1.0MB/s 00:00 [maltfield@osedev1 ~]$ scp -P32415 /home/maltfield/*.key opensourceecology.org: hetzner2.key 100% 1708 1.0MB/s 00:00 ta.key 100% 636 368.9KB/s 00:00 [maltfield@osedev1 ~]$ shred -u /home/maltfield/*.key [maltfield@osedev1 ~]$ shred -u /home/maltfield/hetzner2.* [maltfield@osedev1 ~]$
- and I moved them to '/root/openvpn' and locked down the files on the prod hetzner2 server
[root@opensourceecology maltfield]# cd /root [root@opensourceecology ~]# ls backups bin iptables output.json rsyncTest sandbox staging.opensourceecology.org tmp [root@opensourceecology ~]# mkdir openvpn [root@opensourceecology ~]# cd openvpn [root@opensourceecology openvpn]# mv /home/maltfield/hetzner2* . [root@opensourceecology openvpn]# mv /home/maltfield/*.key . [root@opensourceecology openvpn]# mv /home/maltfield/ca.crt . [root@opensourceecology openvpn]# ls -lah total 28K drwxr-xr-x 2 root root 4.0K Oct 3 06:53 . dr-xr-x---. 20 root root 4.0K Oct 3 06:53 .. -rw------- 1 maltfield maltfield 3.3K Oct 3 06:51 ca.crt -rw------- 1 maltfield maltfield 5.6K Oct 3 06:51 hetzner2.crt -rw------- 1 maltfield maltfield 1.7K Oct 3 06:51 hetzner2.key -rw------- 1 maltfield maltfield 636 Oct 3 06:51 ta.key [root@opensourceecology openvpn]# chown root:root * [root@opensourceecology openvpn]# ls -lah total 28K drwxr-xr-x 2 root root 4.0K Oct 3 06:53 . dr-xr-x---. 20 root root 4.0K Oct 3 06:53 .. -rw------- 1 root root 3.3K Oct 3 06:51 ca.crt -rw------- 1 root root 5.6K Oct 3 06:51 hetzner2.crt -rw------- 1 root root 1.7K Oct 3 06:51 hetzner2.key -rw------- 1 root root 636 Oct 3 06:51 ta.key [root@opensourceecology openvpn]# chmod 0700 . [root@opensourceecology openvpn]# ls -lah total 28K drwx------ 2 root root 4.0K Oct 3 06:53 . dr-xr-x---. 20 root root 4.0K Oct 3 06:53 .. -rw------- 1 root root 3.3K Oct 3 06:51 ca.crt -rw------- 1 root root 5.6K Oct 3 06:51 hetzner2.crt -rw------- 1 root root 1.7K Oct 3 06:51 hetzner2.key -rw------- 1 root root 636 Oct 3 06:51 ta.key [root@opensourceecology openvpn]#
- then I created a client.conf file from my personal client.conf file & modified it to use the new cert & key files
[root@opensourceecology openvpn]# vim client.conf [root@opensourceecology openvpn]# ls -lah client.conf -rw-r--r-- 1 root root 3.6K Oct 3 06:56 client.conf [root@opensourceecology openvpn]# chmod 0600 client.conf [root@opensourceecology openvpn]# cat client.conf ############################################## # Sample client-side OpenVPN 2.0 config file # # for connecting to multi-client server. # # # # This configuration can be used by multiple # # clients, however each client should have # # its own cert and key files. # # # # On Windows, you might want to rename this # # file so it has a .ovpn extension # ############################################## # Specify that we are a client and that we # will be pulling certain config file directives # from the server. client # Use the same setting as you are using on # the server. # On most systems, the VPN will not function # unless you partially or fully disable # the firewall for the TUN/TAP interface. ;dev tap dev tun # Windows needs the TAP-Win32 adapter name # from the Network Connections panel # if you have more than one. On XP SP2, # you may need to disable the firewall # for the TAP adapter. ;dev-node MyTap # Are we connecting to a TCP or # UDP server? Use the same setting as # on the server. ;proto tcp proto udp # The hostname/IP and port of the server. # You can have multiple remote entries # to load balance between the servers. remote 195.201.233.113 1194 ;remote my-server-2 1194 # Choose a random host from the remote # list for load-balancing. Otherwise # try hosts in the order specified. ;remote-random # Keep trying indefinitely to resolve the # host name of the OpenVPN server. Very useful # on machines which are not permanently connected # to the internet such as laptops. resolv-retry infinite # Most clients don't need to bind to # a specific local port number. nobind # Downgrade privileges after initialization (non-Windows only) ;user nobody ;group nobody # Try to preserve some state across restarts. persist-key persist-tun # If you are connecting through an # HTTP proxy to reach the actual OpenVPN # server, put the proxy server/IP and # port number here. See the man page # if your proxy server requires # authentication. ;http-proxy-retry # retry on connection failures ;http-proxy [proxy server] [proxy port #] # Wireless networks often produce a lot # of duplicate packets. Set this flag # to silence duplicate packet warnings. ;mute-replay-warnings # SSL/TLS parms. # See the server config file for more # description. It's best to use # a separate .crt/.key file pair # for each client. A single ca # file can be used for all clients. ca ca.crt cert hetzner2.crt key hetzner2.key # Verify server certificate by checking that the # certicate has the correct key usage set. # This is an important precaution to protect against # a potential attack discussed here: # http://openvpn.net/howto.html#mitm # # To use this feature, you will need to generate # your server certificates with the keyUsage set to # digitalSignature, keyEncipherment # and the extendedKeyUsage to # serverAuth # EasyRSA can do this for you. remote-cert-tls server # If a tls-auth key is used on the server # then every client must also have the key. tls-auth ta.key 1 # Select a cryptographic cipher. # If the cipher option is used on the server # then you must also specify it here. # Note that v2.4 client/server will automatically # negotiate AES-256-GCM in TLS mode. # See also the ncp-cipher option in the manpage cipher AES-256-GCM # Enable compression on the VPN link. 
# Don't enable this unless it is also # enabled in the server config file. #comp-lzo # Set log file verbosity. verb 3 # Silence repeating messages ;mute 20 # hardening tls-cipher TLS-DHE-RSA-WITH-AES-256-GCM-SHA384 [root@opensourceecology openvpn]#
- I installed the 'openvpn' package on the production hetzner2 server
[root@opensourceecology openvpn]# yum install openvpn ... Installed: openvpn.x86_64 0:2.4.7-1.el7 Dependency Installed: lz4.x86_64 0:1.7.5-3.el7 pkcs11-helper.x86_64 0:1.11-3.el7 Complete! [root@opensourceecology openvpn]#
- I was successfully able to connect to the vpn on the dev node from the prod node
[root@opensourceecology openvpn]# openvpn client.conf Thu Oct 3 07:06:45 2019 OpenVPN 2.4.7 x86_64-redhat-linux-gnu [Fedora EPEL patched] [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Feb 20 2019 Thu Oct 3 07:06:45 2019 library versions: OpenSSL 1.0.2k-fips 26 Jan 2017, LZO 2.06 Thu Oct 3 07:06:45 2019 Outgoing Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Thu Oct 3 07:06:45 2019 Incoming Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Thu Oct 3 07:06:45 2019 TCP/UDP: Preserving recently used remote address: [AF_INET]195.201.233.113:1194 Thu Oct 3 07:06:45 2019 Socket Buffers: R=[212992->212992] S=[212992->212992] Thu Oct 3 07:06:45 2019 UDP link local: (not bound) Thu Oct 3 07:06:45 2019 UDP link remote: [AF_INET]195.201.233.113:1194 Thu Oct 3 07:06:45 2019 TLS: Initial packet from [AF_INET]195.201.233.113:1194, sid=865b6fa1 7dcf4731 Thu Oct 3 07:06:45 2019 VERIFY OK: depth=1, CN=osedev1 Thu Oct 3 07:06:45 2019 VERIFY KU OK Thu Oct 3 07:06:45 2019 Validating certificate extended key usage Thu Oct 3 07:06:45 2019 ++ Certificate has EKU (str) TLS Web Server Authentication, expects TLS Web Server Authentication Thu Oct 3 07:06:45 2019 VERIFY EKU OK Thu Oct 3 07:06:45 2019 VERIFY OK: depth=0, CN=server Thu Oct 3 07:06:45 2019 Control Channel: TLSv1.2, cipher TLSv1/SSLv3 DHE-RSA-AES256-GCM-SHA384, 4096 bit RSA Thu Oct 3 07:06:45 2019 [server] Peer Connection Initiated with [AF_INET]195.201.233.113:1194 Thu Oct 3 07:06:46 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 07:06:46 2019 PUSH: Received control message: 'PUSH_REPLY,route 10.241.189.1,topology net30,ping 10,ping-restart 120,ifconfig 10.241.189.10 10.241.189.9,peer-id 0,cipher AES-256-GCM' Thu Oct 3 07:06:46 2019 OPTIONS IMPORT: timers and/or timeouts modified Thu Oct 3 07:06:46 2019 OPTIONS IMPORT: --ifconfig/up options modified Thu Oct 3 07:06:46 2019 OPTIONS IMPORT: route options modified Thu Oct 3 07:06:46 2019 OPTIONS IMPORT: peer-id set Thu Oct 3 07:06:46 2019 OPTIONS IMPORT: adjusting link_mtu to 1624 Thu Oct 3 07:06:46 2019 OPTIONS IMPORT: data channel crypto options modified Thu Oct 3 07:06:46 2019 Outgoing Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key Thu Oct 3 07:06:46 2019 Incoming Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key Thu Oct 3 07:06:46 2019 ROUTE_GATEWAY 138.201.84.193 Thu Oct 3 07:06:46 2019 TUN/TAP device tun0 opened Thu Oct 3 07:06:46 2019 TUN/TAP TX queue length set to 100 Thu Oct 3 07:06:46 2019 /sbin/ip link set dev tun0 up mtu 1500 Thu Oct 3 07:06:46 2019 /sbin/ip addr add dev tun0 local 10.241.189.10 peer 10.241.189.9 Thu Oct 3 07:06:46 2019 /sbin/ip route add 10.241.189.1/32 via 10.241.189.9 Thu Oct 3 07:06:46 2019 WARNING: this configuration may cache passwords in memory -- use the auth-nocache option to prevent this Thu Oct 3 07:06:46 2019 Initialization Sequence Completed
- the prod server now has a tun0 interface with an ip address of 10.241.189.10 on the VPN private network subnet
[root@opensourceecology ~]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000 link/ether 90:1b:0e:94:07:c4 brd ff:ff:ff:ff:ff:ff inet 138.201.84.223 peer 138.201.84.193/32 brd 138.201.84.223 scope global eth0 valid_lft forever preferred_lft forever inet 138.201.84.223/32 scope global eth0 valid_lft forever preferred_lft forever inet 138.201.84.243/16 scope global eth0 valid_lft forever preferred_lft forever inet 138.201.84.243 peer 138.201.84.193/32 brd 138.201.255.255 scope global secondary eth0 valid_lft forever preferred_lft forever inet6 2a01:4f8:172:209e::2/64 scope global valid_lft forever preferred_lft forever inet6 fe80::921b:eff:fe94:7c4/64 scope link valid_lft forever preferred_lft forever 4: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 100 link/none inet 10.241.189.10 peer 10.241.189.9/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::a834:c77a:f65f:76fc/64 scope link flags 800 valid_lft forever preferred_lft forever [root@opensourceecology ~]#
- I confirmed that the website didn't break ☺
- now I created the same dir on the staging node (note this weird systemd journal corruption error that slowed things down quite a bit)
[root@osedev1 ~]# lxc-start -n osestaging1 systemd 219 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN) Detected virtualization lxc. Detected architecture x86-64. Welcome to CentOS Linux 7 (Core)! ... Kernel 3.10.0-957.21.3.el7.x86_64 on an x86_64 osestaging1 login: maltfield Password: Last login: Wed Oct 2 13:01:56 on lxc/console [maltfield@osestaging1 ~]$ sudo su - [sudo] password for maltfield: <44>systemd-journald[297]: File /run/log/journal/dd9978e8797e4112832634fa4d174c7b/system.journal corrupted or uncleanly shut down, renaming and replacing. Last login: Wed Oct 2 13:15:46 UTC 2019 on lxc/console Last failed login: Thu Oct 3 07:11:57 UTC 2019 on lxc/console There was 1 failed login attempt since the last successful login. [root@osestaging1 ~]#
- on the dev node again
[root@osedev1 pki]# cp private/hetzner2.key /home/maltfield/ [root@osedev1 pki]# cp issued/hetzner2.crt /home/maltfield/ [root@osedev1 pki]# cp private/ta.key /home/maltfield/ [root@osedev1 pki]# chown maltfield /home/maltfield/*.key [root@osedev1 pki]# chown maltfield /home/maltfield/*.crt [root@osedev1 pki]# logout [maltfield@osedev1 ~]$ scp -P 32415 /home/maltfield/*.key 192.168.122.201: hetzner2.key 100% 1708 2.4MB/s 00:00 ta.key 100% 636 1.2MB/s 00:00 [maltfield@osedev1 ~]$ scp -P 32415 /home/maltfield/*.crt 192.168.122.201: ca.crt 100% 1850 2.6MB/s 00:00 hetzner2.crt 100% 5675 9.0MB/s 00:00 [maltfield@osedev1 ~]$ shred -u /home/maltfield/*.key [maltfield@osedev1 ~]$ shred -u /home/maltfield/*.crt [maltfield@osedev1 ~]$
- and back on the staging container node
[root@osestaging1 ~]# cd /root/openvpn [root@osestaging1 openvpn]# ls [root@osestaging1 openvpn]# mv /home/maltfield/*.crt . [root@osestaging1 openvpn]# mv /home/maltfield/*.key . [root@osestaging1 openvpn]# ls -lah total 28K drwxr-xr-x. 2 root root 4.0K Oct 3 07:23 . dr-xr-x---. 3 root root 4.0K Oct 3 07:18 .. -rw-------. 1 maltfield maltfield 1.9K Oct 3 07:21 ca.crt -rw-------. 1 maltfield maltfield 5.6K Oct 3 07:21 hetzner2.crt -rw-------. 1 maltfield maltfield 1.7K Oct 3 07:21 hetzner2.key -rw-------. 1 maltfield maltfield 636 Oct 3 07:21 ta.key [root@osestaging1 openvpn]# chown root:root * [root@osestaging1 openvpn]# chmod 0700 . [root@osestaging1 openvpn]# ls -lah total 28K drwx------. 2 root root 4.0K Oct 3 07:23 . dr-xr-x---. 3 root root 4.0K Oct 3 07:18 .. -rw-------. 1 root root 1.9K Oct 3 07:21 ca.crt -rw-------. 1 root root 5.6K Oct 3 07:21 hetzner2.crt -rw-------. 1 root root 1.7K Oct 3 07:21 hetzner2.key -rw-------. 1 root root 636 Oct 3 07:21 ta.key [root@osestaging1 openvpn]#
- I also installed vim, epel-release, and openvpn on the staging node
- I had an issue connecting to the vpn from within the staging node; this appears to be a known issue when trying to connect to a vpn from within a docker or lxc container https://serverfault.com/questions/429461/no-tun-device-in-lxc-guest-for-openvpn
[root@osestaging1 openvpn]# openvpn client.conf Thu Oct 3 07:29:17 2019 OpenVPN 2.4.7 x86_64-redhat-linux-gnu [Fedora EPEL patched] [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Feb 20 2019 Thu Oct 3 07:29:17 2019 library versions: OpenSSL 1.0.2k-fips 26 Jan 2017, LZO 2.06 Thu Oct 3 07:29:17 2019 Outgoing Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Thu Oct 3 07:29:17 2019 Incoming Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Thu Oct 3 07:29:17 2019 TCP/UDP: Preserving recently used remote address: [AF_INET]195.201.233.113:1194 Thu Oct 3 07:29:17 2019 Socket Buffers: R=[212992->212992] S=[212992->212992] Thu Oct 3 07:29:17 2019 UDP link local: (not bound) Thu Oct 3 07:29:17 2019 UDP link remote: [AF_INET]195.201.233.113:1194 Thu Oct 3 07:29:17 2019 TLS: Initial packet from [AF_INET]195.201.233.113:1194, sid=f2e8fcad efdb9311 Thu Oct 3 07:29:17 2019 VERIFY OK: depth=1, CN=osedev1 Thu Oct 3 07:29:17 2019 VERIFY KU OK Thu Oct 3 07:29:17 2019 Validating certificate extended key usage Thu Oct 3 07:29:17 2019 ++ Certificate has EKU (str) TLS Web Server Authentication, expects TLS Web Server Authentication Thu Oct 3 07:29:17 2019 VERIFY EKU OK Thu Oct 3 07:29:17 2019 VERIFY OK: depth=0, CN=server Thu Oct 3 07:29:17 2019 Control Channel: TLSv1.2, cipher TLSv1/SSLv3 DHE-RSA-AES256-GCM-SHA384, 4096 bit RSA Thu Oct 3 07:29:17 2019 [server] Peer Connection Initiated with [AF_INET]195.201.233.113:1194 Thu Oct 3 07:29:18 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 07:29:18 2019 PUSH: Received control message: 'PUSH_REPLY,route 10.241.189.1,topology net30,ping 10,ping-restart 120,ifconfig 10.241.189.10 10.241.189.9,peer-id 0,cipher AES-256-GCM' Thu Oct 3 07:29:18 2019 OPTIONS IMPORT: timers and/or timeouts modified Thu Oct 3 07:29:18 2019 OPTIONS IMPORT: --ifconfig/up options modified Thu Oct 3 07:29:18 2019 OPTIONS IMPORT: route options modified Thu Oct 3 07:29:18 2019 OPTIONS IMPORT: peer-id set Thu Oct 3 07:29:18 2019 OPTIONS IMPORT: adjusting link_mtu to 1624 Thu Oct 3 07:29:18 2019 OPTIONS IMPORT: data channel crypto options modified Thu Oct 3 07:29:18 2019 Outgoing Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key Thu Oct 3 07:29:18 2019 Incoming Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key Thu Oct 3 07:29:18 2019 ROUTE_GATEWAY 192.168.122.1/255.255.255.0 IFACE=eth0 HWADDR=fe:07:06:a6:5f:1d Thu Oct 3 07:29:18 2019 ERROR: Cannot open TUN/TAP dev /dev/net/tun: No such file or directory (errno=2) Thu Oct 3 07:29:18 2019 Exiting due to fatal error [root@osestaging1 openvpn]#
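- per that serverfault link, the usual container-side fix is to expose the tun device in the container's lxc config on the host and restart the container. A sketch using the lxc 1.x/2.x key names (the config path is the conventional one for this lxc version, not confirmed in this log):
# /var/lib/lxc/osestaging1/config
lxc.cgroup.devices.allow = c 10:200 rwm
lxc.mount.entry = /dev/net/tun dev/net/tun none bind,create=file 0 0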
- the above link suggests following the arch linux guide to create an openvpn client systemd unit within the container
[root@osestaging1 openvpn]# ls /usr/lib/systemd/system/openvpn-client\@.service /usr/lib/systemd/system/openvpn-client@.service [root@osestaging1 openvpn]# ls /etc/systemd/system/ basic.target.wants default.target.wants local-fs.target.wants sysinit.target.wants default.target getty.target.wants multi-user.target.wants system-update.target.wants [root@osestaging1 openvpn]# cp /usr/lib/systemd/system/openvpn-client\@.service /etc/systemd/system/ [root@osestaging1 openvpn]# grep /etc/systemd/system/openvpn-client\@.service LimitNPROC grep: LimitNPROC: No such file or directory [root@osestaging1 openvpn]# grep LimitNPROC /etc/systemd/system/openvpn-client\@.service LimitNPROC=10 [root@osestaging1 openvpn]# vim /etc/systemd/system/openvpn-client\@.service [root@osestaging1 openvpn]# grep LimitNPROC /etc/systemd/system/openvpn-client\@.service #LimitNPROC=10 [root@osestaging1 openvpn]#
- that didn't work; it wants something after the '@'. I did that, and then realized that I'll need to further modify it with the correct config file
[root@osestaging1 openvpn]# cd /etc/systemd/system [root@osestaging1 system]# ls basic.target.wants getty.target.wants openvpn-client@.service default.target local-fs.target.wants sysinit.target.wants default.target.wants multi-user.target.wants system-update.target.wants [root@osestaging1 system]# mv openvpn-client\@.service openvpn-client\@dev.service [root@osestaging1 system]# systemctl status openvpn-client\@dev.service ● openvpn-client@dev.service - OpenVPN tunnel for dev Loaded: loaded (/etc/systemd/system/openvpn-client@dev.service; disabled; vendor preset: disabled) Active: inactive (dead) Docs: man:openvpn(8) https://community.openvpn.net/openvpn/wiki/Openvpn24ManPage https://community.openvpn.net/openvpn/wiki/HOWTO [root@osestaging1 system]# systemctl start openvpn-client\@dev.service Job for openvpn-client@dev.service failed because the control process exited with error code. See "systemctl status openvpn-client@dev.service" and "journalctl -xe" for details. [root@osestaging1 system]# systemctl status openvpn-client\@dev.service ● openvpn-client@dev.service - OpenVPN tunnel for dev Loaded: loaded (/etc/systemd/system/openvpn-client@dev.service; disabled; vendor preset: disabled) Active: failed (Result: exit-code) since Thu 2019-10-03 07:44:09 UTC; 16s ago Docs: man:openvpn(8) https://community.openvpn.net/openvpn/wiki/Openvpn24ManPage https://community.openvpn.net/openvpn/wiki/HOWTO Process: 557 ExecStart=/usr/sbin/openvpn --suppress-timestamps --nobind --config %i.conf (code=exited, status=1/FAILURE) Main PID: 557 (code=exited, status=1/FAILURE) Oct 03 07:44:08 osestaging1 systemd[1]: Starting OpenVPN tunnel for dev... Oct 03 07:44:09 osestaging1 openvpn[557]: Options error: In [CMD-LINE]:1: Error opening configuration file: dev.conf Oct 03 07:44:09 osestaging1 openvpn[557]: Use --help for more information. Oct 03 07:44:09 osestaging1 systemd[1]: openvpn-client@dev.service: main process exited, code=exited, status=...ILURE Oct 03 07:44:09 osestaging1 systemd[1]: Failed to start OpenVPN tunnel for dev. Oct 03 07:44:09 osestaging1 systemd[1]: Unit openvpn-client@dev.service entered failed state. Oct 03 07:44:09 osestaging1 systemd[1]: openvpn-client@dev.service failed. Hint: Some lines were ellipsized, use -l to show in full. [root@osestaging1 system]# vim openvpn-client\@dev.service
- I updated the working dir and changed the service name to match the name of the config file in there
[root@osestaging1 system]# cat openvpn-client\@dev.service [Unit] Description=OpenVPN tunnel for %I After=syslog.target network-online.target Wants=network-online.target Documentation=man:openvpn(8) Documentation=https://community.openvpn.net/openvpn/wiki/Openvpn24ManPage Documentation=https://community.openvpn.net/openvpn/wiki/HOWTO [Service] Type=notify PrivateTmp=true WorkingDirectory=/etc/openvpn/client ExecStart=/usr/sbin/openvpn --suppress-timestamps --nobind --config %i.conf CapabilityBoundingSet=CAP_IPC_LOCK CAP_NET_ADMIN CAP_NET_RAW CAP_SETGID CAP_SETUID CAP_SYS_CHROOT CAP_DAC_OVERRIDE #LimitNPROC=10 DeviceAllow=/dev/null rw DeviceAllow=/dev/net/tun rw ProtectSystem=true ProtectHome=true KillMode=process [Install] WantedBy=multi-user.target [root@osestaging1 system]# vim openvpn-client\@dev.service [root@osestaging1 system]# mv openvpn-client\@dev.service openvpn-client\@client.service [root@osestaging1 system]# cat openvpn-client\@client.service [Unit] Description=OpenVPN tunnel for %I After=syslog.target network-online.target Wants=network-online.target Documentation=man:openvpn(8) Documentation=https://community.openvpn.net/openvpn/wiki/Openvpn24ManPage Documentation=https://community.openvpn.net/openvpn/wiki/HOWTO [Service] Type=notify PrivateTmp=true #WorkingDirectory=/etc/openvpn/client WorkingDirectory=/root/openvpn ExecStart=/usr/sbin/openvpn --suppress-timestamps --nobind --config %i.conf CapabilityBoundingSet=CAP_IPC_LOCK CAP_NET_ADMIN CAP_NET_RAW CAP_SETGID CAP_SETUID CAP_SYS_CHROOT CAP_DAC_OVERRIDE #LimitNPROC=10 DeviceAllow=/dev/null rw DeviceAllow=/dev/net/tun rw ProtectSystem=true ProtectHome=true KillMode=process [Install] WantedBy=multi-user.target [root@osestaging1 system]#
- this failed; I gave up on the systemd unit and went with manually creating the tun interface per the guide, even though someone else commented that this would no longer work; it worked!
[root@osestaging1 openvpn]# openvpn client.conf Thu Oct 3 08:02:50 2019 OpenVPN 2.4.7 x86_64-redhat-linux-gnu [Fedora EPEL patched] [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Feb 20 2019 Thu Oct 3 08:02:50 2019 library versions: OpenSSL 1.0.2k-fips 26 Jan 2017, LZO 2.06 Thu Oct 3 08:02:50 2019 Outgoing Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Thu Oct 3 08:02:50 2019 Incoming Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Thu Oct 3 08:02:50 2019 TCP/UDP: Preserving recently used remote address: [AF_INET]195.201.233.113:1194 Thu Oct 3 08:02:50 2019 Socket Buffers: R=[212992->212992] S=[212992->212992] Thu Oct 3 08:02:50 2019 UDP link local: (not bound) Thu Oct 3 08:02:50 2019 UDP link remote: [AF_INET]195.201.233.113:1194 Thu Oct 3 08:02:50 2019 TLS: Initial packet from [AF_INET]195.201.233.113:1194, sid=10846fe0 74bf0345 Thu Oct 3 08:02:50 2019 VERIFY OK: depth=1, CN=osedev1 Thu Oct 3 08:02:50 2019 VERIFY KU OK Thu Oct 3 08:02:50 2019 Validating certificate extended key usage Thu Oct 3 08:02:50 2019 ++ Certificate has EKU (str) TLS Web Server Authentication, expects TLS Web Server Authentication Thu Oct 3 08:02:50 2019 VERIFY EKU OK Thu Oct 3 08:02:50 2019 VERIFY OK: depth=0, CN=server Thu Oct 3 08:02:50 2019 Control Channel: TLSv1.2, cipher TLSv1/SSLv3 DHE-RSA-AES256-GCM-SHA384, 4096 bit RSA Thu Oct 3 08:02:50 2019 [server] Peer Connection Initiated with [AF_INET]195.201.233.113:1194 Thu Oct 3 08:02:51 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:02:51 2019 PUSH: Received control message: 'PUSH_REPLY,route 10.241.189.1,topology net30,ping 10,ping-restart 120,ifconfig 10.241.189.10 10.241.189.9,peer-id 0,cipher AES-256-GCM' Thu Oct 3 08:02:51 2019 OPTIONS IMPORT: timers and/or timeouts modified Thu Oct 3 08:02:51 2019 OPTIONS IMPORT: --ifconfig/up options modified Thu Oct 3 08:02:51 2019 OPTIONS IMPORT: route options modified Thu Oct 3 08:02:51 2019 OPTIONS IMPORT: peer-id set Thu Oct 3 08:02:51 2019 OPTIONS IMPORT: adjusting link_mtu to 1624 Thu Oct 3 08:02:51 2019 OPTIONS IMPORT: data channel crypto options modified Thu Oct 3 08:02:51 2019 Outgoing Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key Thu Oct 3 08:02:51 2019 Incoming Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key Thu Oct 3 08:02:51 2019 ROUTE_GATEWAY 192.168.122.1/255.255.255.0 IFACE=eth0 HWADDR=fe:07:06:a6:5f:1d Thu Oct 3 08:02:51 2019 ERROR: Cannot open TUN/TAP dev /dev/net/tun: No such file or directory (errno=2) Thu Oct 3 08:02:51 2019 Exiting due to fatal error [root@osestaging1 openvpn]# mkdir /dev/net [root@osestaging1 openvpn]# mknod /dev/net/tun c 10 200 [root@osestaging1 openvpn]# chmod 666 /dev/net/tun [root@osestaging1 openvpn]# openvpn client.conf Thu Oct 3 08:03:42 2019 OpenVPN 2.4.7 x86_64-redhat-linux-gnu [Fedora EPEL patched] [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Feb 20 2019 Thu Oct 3 08:03:42 2019 library versions: OpenSSL 1.0.2k-fips 26 Jan 2017, LZO 2.06 Thu Oct 3 08:03:42 2019 Outgoing Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Thu Oct 3 08:03:42 2019 Incoming Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Thu Oct 3 08:03:42 2019 TCP/UDP: Preserving recently used remote address: [AF_INET]195.201.233.113:1194 Thu Oct 3 08:03:42 2019 Socket Buffers: R=[212992->212992] S=[212992->212992] Thu Oct 3 08:03:42 2019 
UDP link local: (not bound) Thu Oct 3 08:03:42 2019 UDP link remote: [AF_INET]195.201.233.113:1194 Thu Oct 3 08:03:42 2019 TLS: Initial packet from [AF_INET]195.201.233.113:1194, sid=dcadaef9 7ebea8f1 Thu Oct 3 08:03:42 2019 VERIFY OK: depth=1, CN=osedev1 Thu Oct 3 08:03:42 2019 VERIFY KU OK Thu Oct 3 08:03:42 2019 Validating certificate extended key usage Thu Oct 3 08:03:42 2019 ++ Certificate has EKU (str) TLS Web Server Authentication, expects TLS Web Server Authentication Thu Oct 3 08:03:42 2019 VERIFY EKU OK Thu Oct 3 08:03:42 2019 VERIFY OK: depth=0, CN=server Thu Oct 3 08:03:42 2019 Control Channel: TLSv1.2, cipher TLSv1/SSLv3 DHE-RSA-AES256-GCM-SHA384, 4096 bit RSA Thu Oct 3 08:03:42 2019 [server] Peer Connection Initiated with [AF_INET]195.201.233.113:1194 Thu Oct 3 08:03:43 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:03:48 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:03:53 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:03:59 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:04 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:09 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:15 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:20 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:25 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:30 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:35 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:41 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:46 2019 No reply from server after sending 12 push requests Thu Oct 3 08:04:46 2019 SIGUSR1[soft,no-push-reply] received, process restarting Thu Oct 3 08:04:46 2019 Restart pause, 5 second(s) Thu Oct 3 08:04:51 2019 TCP/UDP: Preserving recently used remote address: [AF_INET]195.201.233.113:1194 Thu Oct 3 08:04:51 2019 Socket Buffers: R=[212992->212992] S=[212992->212992] Thu Oct 3 08:04:51 2019 UDP link local: (not bound) Thu Oct 3 08:04:51 2019 UDP link remote: [AF_INET]195.201.233.113:1194 Thu Oct 3 08:04:51 2019 TLS: Initial packet from [AF_INET]195.201.233.113:1194, sid=c3f6bcfa 04f701bb Thu Oct 3 08:04:51 2019 VERIFY OK: depth=1, CN=osedev1 Thu Oct 3 08:04:51 2019 VERIFY KU OK Thu Oct 3 08:04:51 2019 Validating certificate extended key usage Thu Oct 3 08:04:51 2019 ++ Certificate has EKU (str) TLS Web Server Authentication, expects TLS Web Server Authentication Thu Oct 3 08:04:51 2019 VERIFY EKU OK Thu Oct 3 08:04:51 2019 VERIFY OK: depth=0, CN=server Thu Oct 3 08:04:51 2019 Control Channel: TLSv1.2, cipher TLSv1/SSLv3 DHE-RSA-AES256-GCM-SHA384, 4096 bit RSA Thu Oct 3 08:04:51 2019 [server] Peer Connection Initiated with [AF_INET]195.201.233.113:1194 Thu Oct 3 08:04:53 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:53 2019 PUSH: Received control message: 'PUSH_REPLY,route 10.241.189.1,topology net30,ping 10,ping-restart 120,ifconfig 10.241.189.10 10.241.189.9,peer-id 1,cipher AES-256-GCM' Thu Oct 3 08:04:53 2019 OPTIONS IMPORT: timers and/or timeouts modified Thu Oct 3 08:04:53 2019 OPTIONS IMPORT: --ifconfig/up options modified Thu Oct 3 08:04:53 2019 OPTIONS IMPORT: route options modified Thu Oct 3 08:04:53 2019 OPTIONS IMPORT: peer-id set Thu Oct 3 08:04:53 2019 OPTIONS IMPORT: adjusting link_mtu to 1624 Thu Oct 3 08:04:53 2019 OPTIONS IMPORT: data channel crypto options modified Thu Oct 3 08:04:53 2019 Outgoing Data 
Channel: Cipher 'AES-256-GCM' initialized with 256 bit key Thu Oct 3 08:04:53 2019 Incoming Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key Thu Oct 3 08:04:53 2019 ROUTE_GATEWAY 192.168.122.1/255.255.255.0 IFACE=eth0 HWADDR=fe:07:06:a6:5f:1d Thu Oct 3 08:04:53 2019 TUN/TAP device tun0 opened Thu Oct 3 08:04:53 2019 TUN/TAP TX queue length set to 100 Thu Oct 3 08:04:53 2019 /sbin/ip link set dev tun0 up mtu 1500 Thu Oct 3 08:04:53 2019 /sbin/ip addr add dev tun0 local 10.241.189.10 peer 10.241.189.9 Thu Oct 3 08:04:53 2019 /sbin/ip route add 10.241.189.1/32 via 10.241.189.9 Thu Oct 3 08:04:53 2019 WARNING: this configuration may cache passwords in memory -- use the auth-nocache option to prevent this Thu Oct 3 08:04:53 2019 Initialization Sequence Completed
- I found that I'd become stuck in an lxc console, since its escape sequence uses the same keystroke as screen (ctrl-a). The solution is to define an alternate escape sequence (e.g. ctrl-e) using `-e '^e'` https://serverfault.com/questions/567696/byobu-how-to-disconnect-from-lxc-console
[root@osedev1 ~]# lxc-console -e '^e' -n osestaging1 Connected to tty 1 Type <Ctrl+e q> to exit the console, <Ctrl+e Ctrl+e> to enter Ctrl+e itself [root@osedev1 ~]#
- I also had to change the tty to 0 to actually get access
[root@osedev1 ~]# lxc-console -e '^e' -n osestaging1 -t 0 lxc_container: commands.c: lxc_cmd_console: 724 Console 0 invalid, busy or all consoles busy. [root@osedev1 ~]# [root@osedev1 ~]#
- I went ahead and connected to the vpn from 3x clients: my laptop, the staging container, and the prod server
- oddly, I noticed that the ip addresses given to the staging server and the prod server were the same (they do use the same client cert, but I expected them to have distinct ip addresses)
user@ose:~/openvpn$ ip address show dev tun0 4: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100 link/none inet 10.241.189.6 peer 10.241.189.5/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::2ab6:3617:63cc:c654/64 scope link flags 800 valid_lft forever preferred_lft forever user@ose:~/openvpn$
[root@opensourceecology openvpn]# ip address show dev tun0 4: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 100 link/none inet 10.241.189.10 peer 10.241.189.9/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::a834:c77a:f65f:76fc/64 scope link flags 800 valid_lft forever preferred_lft forever [root@opensourceecology openvpn]#
[root@osestaging1 ~]# ip address show dev tun0 2: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100 link/none inet 10.241.189.10 peer 10.241.189.9/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::5e8c:3af2:2e6:4aea/64 scope link flags 800 valid_lft forever preferred_lft forever [root@osestaging1 ~]#
- I noticed a few relevant options in our openvpn server config
- by default, I have 'ifconfig-pool-persist ipp.txt' defined, which makes clients have the same ip address persistently across the server's reboots; we appear to be using '/etc/openvpn/ipp.txt' here. The one in the 'server' dir appears to be from earlier, probably when I started the server manually rather than through systemd. Interestingly, this isn't even right! From above, we see that my 'maltfield' user has '.6' while the 'hetzner2' users have '.10'. Hmm.
[root@osedev1 server]# grep -iB5 ipp server.conf # Maintain a record of client <-> virtual IP address # associations in this file. If OpenVPN goes down or # is restarted, reconnecting clients can be assigned # the same virtual IP address from the pool that was # previously assigned. ifconfig-pool-persist ipp.txt [root@osedev1 server]# find /etc/openvpn | grep -i ipp.txt /etc/openvpn/server/ipp.txt /etc/openvpn/ipp.txt [root@osedev1 server]# cat /etc/openvpn/server/ipp.txt maltfield,10.241.189.4 [root@osedev1 server]# cat /etc/openvpn/ipp.txt maltfield,10.241.189.4 hetzner2,10.241.189.8
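- if we ever need to reset these persisted leases, a minimal sketch (my addition; I did not actually run this here) would be to stop the server, remove the file, and start it again:
# hypothetical cleanup of stale ifconfig-pool-persist leases
systemctl stop openvpn@server.service
rm /etc/openvpn/ipp.txt
systemctl start openvpn@server.service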
- there's also an option that I have commented-out, whose comments say it should be uncommented if multiple clients will share the same cert
[root@osedev1 server]# grep -iB5 duplicate server.conf # # IF YOU HAVE NOT GENERATED INDIVIDUAL # CERTIFICATE/KEY PAIRS FOR EACH CLIENT, # EACH HAVING ITS OWN UNIQUE "COMMON NAME", # UNCOMMENT THIS LINE OUT. ;duplicate-cn [root@osedev1 server]#
- I uncommented the above 'duplicate-cn' line and restarted openvpn on the dev node
[root@osedev1 server]# vim server.conf [root@osedev1 server]# systemctl restart openvpn@server.service
- I reconnected to the vpn from the staging & prod servers; they got new IP addresses
[root@opensourceecology openvpn]# ip address show dev tun0 5: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 100 link/none inet 10.241.189.14 peer 10.241.189.13/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::e5fb:f261:801b:1c3d/64 scope link flags 800 valid_lft forever preferred_lft forever [root@opensourceecology openvpn]#
[root@osestaging1 openvpn]# ip address show dev tun0 4: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100 link/none inet 10.241.189.18 peer 10.241.189.17/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::27f3:9643:5530:bd0e/64 scope link flags 800 valid_lft forever preferred_lft forever [root@osestaging1 openvpn]#
- I confirmed that each client could ping itself, but not the others, so I uncommented the 'client-to-client' line and restarted the openvpn server again
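- that change, sketched as commands (I actually edited server.conf by hand in vim; path assumed to be /etc/openvpn/server.conf):
# uncomment the ';client-to-client' line and restart the server
sed -i 's/^;client-to-client/client-to-client/' /etc/openvpn/server.conf
systemctl restart openvpn@server.service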
- after that, I confirmed that staging could ping prod, prod could ping staging, and my laptop could ping both staging & prod. Cool!
- for some reason the servers could still not ping my laptop; maybe that's some complication in my quad-NAT'd QubesOS networking stack flowing through two nested VPN connections. Anyway, that shouldn't be required *shrug*
- and, holy shit, I was successfully able to ssh into the staging node from the production node through the private VPN IP
[maltfield@opensourceecology ~]$ ssh -p 32415 10.241.189.18 The authenticity of host '[10.241.189.18]:32415 ([10.241.189.18]:32415)' can't be established. ECDSA key fingerprint is SHA256:a6NpVsq/qdOCV8o7u3TXeVfZIxp7hpgMqXFOifTuNrI. ECDSA key fingerprint is MD5:ab:eb:7f:f2:bb:83:a1:e5:21:49:1e:22:93:17:70:d6. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '[10.241.189.18]:32415' (ECDSA) to the list of known hosts. Last login: Thu Oct 3 08:56:23 2019 from gateway [maltfield@osestaging1 ~]$
- but I was unable to ssh into our staging node from my laptop. oddly, it *is* able to establish a connection, but it gets stuck at some handshake step
user@ose:~/openvpn$ ssh -vvvvvvp 32415 maltfield@10.241.189.18 OpenSSH_7.4p1 Debian-10+deb9u7, OpenSSL 1.0.2t 10 Sep 2019 debug1: Reading configuration data /home/user/.ssh/config debug1: Reading configuration data /etc/ssh/ssh_config debug1: /etc/ssh/ssh_config line 19: Applying options for * debug2: resolving "10.241.189.18" port 32415 debug2: ssh_connect_direct: needpriv 0 debug1: Connecting to 10.241.189.18 [10.241.189.18] port 32415. debug1: Connection established. ... debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none debug3: send packet: type 30 debug1: expecting SSH2_MSG_KEX_ECDH_REPLY Connection closed by 10.241.189.18 port 32415 user@ose:~/openvpn$
- ok, I fixed this by removing the second VPN (qubes was configured to use a vpn qube as its NetVM; changing this to 'sys-firewall' resolved it)
user@ose:~/openvpn$ ssh -p 32415 maltfield@10.241.189.18 Last login: Thu Oct 3 09:20:50 2019 from 10.241.189.6 [maltfield@osestaging1 ~]$
- on second thought, I really should have unique static IP addresses for both the prod & staging nodes. To achieve this, I can't share the same cert; I'll just make '/root/openvpn' one of those dirs (like the networking config dirs) that is not changed by the rsync
- I commented-out the 'duplicate-cn' line again in the openvpn server config & restarted the openvpn server
[root@osedev1 openvpn]# systemctl restart openvpn@server.service (reverse-i-search)`grep': ss -plan | ^Cep -i 8080 [root@osedev1 openvpn]# grep -B5 duplicate-cn server.conf # # IF YOU HAVE NOT GENERATED INDIVIDUAL # CERTIFICATE/KEY PAIRS FOR EACH CLIENT, # EACH HAVING ITS OWN UNIQUE "COMMON NAME", # UNCOMMENT THIS LINE OUT. ;duplicate-cn [root@osedev1 openvpn]# systemctl restart openvpn@server.service
- and I created a distinct cert for 'osestaging1'
[root@osedev1 3]# /usr/share/easy-rsa/3.0.6/easyrsa build-client-full osestaging1 nopass Using SSL: openssl OpenSSL 1.0.2k-fips 26 Jan 2017 Generating a 2048 bit RSA private key ....+++ ...........................+++ writing new private key to '/usr/share/easy-rsa/3/pki/private/osestaging1.key.WsJhUsDCny' ----- Using configuration from /usr/share/easy-rsa/3/pki/safessl-easyrsa.cnf Enter pass phrase for /usr/share/easy-rsa/3/pki/private/ca.key: Check that the request matches the signature Signature ok The Subject's Distinguished Name is as follows commonName :ASN.1 12:'osestaging1' Certificate is to be certified until Sep 17 10:34:03 2022 GMT (1080 days) Write out database with 1 new entries Data Base Updated [root@osedev1 3]# cp pki/private/osestaging1.key /home/maltfield/ [root@osedev1 3]# cp pki/private/ta.key /home/maltfield/ [root@osedev1 3]# cp pki/issued/osestaging1.crt /home/maltfield/ [root@osedev1 3]# cp pki/ca.crt /home/maltfield/ [root@osedev1 3]# chown maltfield /home/maltfield/*.key [root@osedev1 3]# chown maltfield /home/maltfield/*.crt [root@osedev1 3]# logout
- and on the staging server
[root@osestaging1 ~]# cd /root/openvpn/ [root@osestaging1 openvpn]# mv /home/maltfield/*.key . mv: overwrite './ta.key'? y [root@osestaging1 openvpn]# mv /home/maltfield/*.crt . mv: overwrite './ca.crt'? y [root@osestaging1 openvpn]# ls ca.crt hetzner2.crt osestaging1.crt ta.key client.conf hetzner2.key osestaging1.key [root@osestaging1 openvpn]# shred -u hetzner2.* [root@osestaging1 openvpn]# ls -lah total 32K drwx------. 2 root root 4.0K Oct 3 10:40 . dr-xr-x---. 4 root root 4.0K Oct 3 07:59 .. -rw-------. 1 maltfield maltfield 1.9K Oct 3 10:36 ca.crt -rw-r--r--. 1 root root 3.6K Oct 3 07:27 client.conf -rw-------. 1 maltfield maltfield 5.6K Oct 3 10:36 osestaging1.crt -rw-------. 1 maltfield maltfield 1.7K Oct 3 10:36 osestaging1.key -rw-------. 1 maltfield maltfield 636 Oct 3 10:36 ta.key [root@osestaging1 openvpn]# chown root:root *.crt [root@osestaging1 openvpn]# chown root:root *.key [root@osestaging1 openvpn]# chmod 0600 client.conf [root@osestaging1 openvpn]# ls -lah total 32K drwx------. 2 root root 4.0K Oct 3 10:40 . dr-xr-x---. 4 root root 4.0K Oct 3 07:59 .. -rw-------. 1 root root 1.9K Oct 3 10:36 ca.crt -rw-------. 1 root root 3.6K Oct 3 07:27 client.conf -rw-------. 1 root root 5.6K Oct 3 10:36 osestaging1.crt -rw-------. 1 root root 1.7K Oct 3 10:36 osestaging1.key -rw-------. 1 root root 636 Oct 3 10:36 ta.key [root@osestaging1 openvpn]# vim client.conf
- I decided to make the following static IPs
- 10.241.189.10 hetzner2 (prod)
- 10.241.189.11 osestaging1
- I did this by uncommenting the line 'client-config-dir ccd', creating a client-specific config file in the '/etc/openvpn/ccd/' dir whose name matches the CN (Common Name) on the client cert, and restarting the openvpn server service
[root@osedev1 openvpn]# vim server.conf [root@osedev1 openvpn]# grep -Ei '^client-config-dir ccd' server.conf client-config-dir ccd [root@osedev1 openvpn]# echo "ifconfig-push 10.241.189.11 255.255.255.255" > ccd/osestaging1 [root@osedev1 openvpn]# systemctl restart openvpn@server.service [root@osedev1 openvpn]#
- I did the same for prod
[root@osedev1 openvpn]# echo "ifconfig-push 10.241.189.10 255.255.255.255" > ccd/hetzner2 [root@osedev1 openvpn]# systemctl restart openvpn@server.service [root@osedev1 openvpn]#
- now that it's static, I can update my ssh config to make connecting to the staging node easy after connecting to the vpn from my laptop
user@ose:~/openvpn$ vim ~/.ssh/config user@ose:~/openvpn$ head -n21 ~/.ssh/config # OSE Host openbuildinginstitute.org *.openbuildinginstitute.org opensourceecology.org *.opensourceecology.org Port 32415 ForwardAgent yes IdentityFile /home/user/.ssh/id_rsa.ose User maltfield Host osedev1 HostName 195.201.233.113 Port 32415 ForwardAgent yes IdentityFile /home/user/.ssh/id_rsa.ose User maltfield Host osestaging1 HostName 10.241.189.11 Port 32415 ForwardAgent yes IdentityFile /home/user/.ssh/id_rsa.ose User maltfield user@ose:~/openvpn$ ssh osestaging1 The authenticity of host '[10.241.189.11]:32415 ([10.241.189.11]:32415)' can't be established. ECDSA key fingerprint is SHA256:a6NpVsq/qdOCV8o7u3TXeVfZIxp7hpgMqXFOifTuNrI. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '[10.241.189.11]:32415' (ECDSA) to the list of known hosts. Last login: Thu Oct 3 10:42:40 2019 from 10.241.189.10 [maltfield@osestaging1 ~]$
- another issue remains: we need the staging node to connect to the vpn on startup, but I can't get the fucking systemd unit to work
[root@osestaging1 system]# systemctl start openvpn-client\@client.service Job for openvpn-client@client.service failed because the control process exited with error code. See "systemctl status openvpn-client@client.service" and "journalctl -xe" for details. [root@osestaging1 system]# systemctl status openvpn-client\@client.service ● openvpn-client@client.service - OpenVPN tunnel for client Loaded: loaded (/etc/systemd/system/openvpn-client@client.service; disabled; vendor preset: disabled) Active: failed (Result: exit-code) since Thu 2019-10-03 12:34:56 UTC; 8s ago Docs: man:openvpn(8) https://community.openvpn.net/openvpn/wiki/Openvpn24ManPage https://community.openvpn.net/openvpn/wiki/HOWTO Process: 1295 ExecStart=/usr/sbin/openvpn --suppress-timestamps --nobind --config client.conf (code=exited, status=200/CHDIR) Main PID: 1295 (code=exited, status=200/CHDIR) Oct 03 12:34:56 osestaging1 systemd[1]: Starting OpenVPN tunnel for client... Oct 03 12:34:56 osestaging1 systemd[1]: openvpn-client@client.service: main process exited, code=exited, status=200/CHDIR Oct 03 12:34:56 osestaging1 systemd[1]: Failed to start OpenVPN tunnel for client. Oct 03 12:34:56 osestaging1 systemd[1]: Unit openvpn-client@client.service entered failed state. Oct 03 12:34:56 osestaging1 systemd[1]: openvpn-client@client.service failed. [root@osestaging1 system]# tail -n 7 /var/log/messages Oct 3 12:29:29 localhost systemd: openvpn-client@client.service failed. Oct 3 12:34:56 localhost systemd: Starting OpenVPN tunnel for client... Oct 3 12:34:56 localhost systemd: Failed at step CHDIR spawning /usr/sbin/openvpn: No such file or directory Oct 3 12:34:56 localhost systemd: openvpn-client@client.service: main process exited, code=exited, status=200/CHDIR Oct 3 12:34:56 localhost systemd: Failed to start OpenVPN tunnel for client. Oct 3 12:34:56 localhost systemd: Unit openvpn-client@client.service entered failed state. Oct 3 12:34:56 localhost systemd: openvpn-client@client.service failed. [root@osestaging1 system]#
- the /usr/sbin/openvpn file definitely exists; despite the misleading message, 'status=200/CHDIR' means systemd failed at its chdir step, so the 'No such file or directory' likely refers to the unit's WorkingDirectory rather than to the openvpn binary
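- a quick way to test that theory (hypothetical; not part of the original session) is to compare the unit's WorkingDirectory against what actually exists on disk:
# show the unit file and check its WorkingDirectory
systemctl cat openvpn-client@client.service | grep -i workingdirectory
ls -d /etc/openvpn/client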
- I gave the osestaging1 container a reboot
- after a reboot, osestaging1 now says that the openvpn-client@client.service doesn't exist!
[maltfield@osestaging1 ~]$ systemctl start openvpn-client\@client.service Failed to start openvpn-client@client.service: The name org.freedesktop.PolicyKit1 was not provided by any .service files See system logs and 'systemctl status openvpn-client@client.service' for details. [maltfield@osestaging1 ~]$ systemctl list-unit-files | grep -i vpn openvpn-client@.service disabled openvpn-client@client.service disabled openvpn-server@.service disabled openvpn@.service disabled [maltfield@osestaging1 ~]$
- attempting to enable it fails
[maltfield@osestaging1 ~]$ systemctl enable /etc/systemd/system/openvpn-client\@client.service Failed to execute operation: The name org.freedesktop.PolicyKit1 was not provided by any .service files [maltfield@osestaging1 ~]$
- oh, duh, I wasn't root
[root@osestaging1 ~]# systemctl status openvpn-client\@client.service ● openvpn-client@client.service - OpenVPN tunnel for client Loaded: loaded (/etc/systemd/system/openvpn-client@client.service; disabled; vendor preset: disabled) Active: inactive (dead) Docs: man:openvpn(8) https://community.openvpn.net/openvpn/wiki/Openvpn24ManPage https://community.openvpn.net/openvpn/wiki/HOWTO [root@osestaging1 ~]# systemctl start openvpn-client\@client.service Job for openvpn-client@client.service failed because the control process exited with error code. See "systemctl status openvpn-client@client.service" and "journalctl -xe" for details. [root@osestaging1 ~]# systemctl status openvpn-client\@client.service ● openvpn-client@client.service - OpenVPN tunnel for client Loaded: loaded (/etc/systemd/system/openvpn-client@client.service; disabled; vendor preset: disabled) Active: failed (Result: exit-code) since Thu 2019-10-03 12:52:39 UTC; 7s ago Docs: man:openvpn(8) https://community.openvpn.net/openvpn/wiki/Openvpn24ManPage https://community.openvpn.net/openvpn/wiki/HOWTO Process: 379 ExecStart=/usr/sbin/openvpn --suppress-timestamps --nobind --config client.conf (code=exited, status=200/CHDIR) Main PID: 379 (code=exited, status=200/CHDIR) Oct 03 12:52:38 osestaging1 systemd[1]: Starting OpenVPN tunnel for client... Oct 03 12:52:39 osestaging1 systemd[1]: openvpn-client@client.service: main process exited, code=exited, status=200/CHDIR Oct 03 12:52:39 osestaging1 systemd[1]: Failed to start OpenVPN tunnel for client. Oct 03 12:52:39 osestaging1 systemd[1]: Unit openvpn-client@client.service entered failed state. Oct 03 12:52:39 osestaging1 systemd[1]: openvpn-client@client.service failed. [root@osestaging1 ~]# tail -n 7 /var/log/messages Oct 3 12:52:38 localhost systemd: Created slice system-openvpn\x2dclient.slice. Oct 3 12:52:38 localhost systemd: Starting OpenVPN tunnel for client... Oct 3 12:52:39 localhost systemd: Failed at step CHDIR spawning /usr/sbin/openvpn: No such file or directory Oct 3 12:52:39 localhost systemd: openvpn-client@client.service: main process exited, code=exited, status=200/CHDIR Oct 3 12:52:39 localhost systemd: Failed to start OpenVPN tunnel for client. Oct 3 12:52:39 localhost systemd: Unit openvpn-client@client.service entered failed state. Oct 3 12:52:39 localhost systemd: openvpn-client@client.service failed. [root@osestaging1 ~]#
- after fighting with this shit for hours, I finally just copied all my files from /root/openvpn into /etc/openvpn/client/ and it worked!
[root@osestaging1 system]# cp /root/openvpn/* /etc/openvpn/client [root@osestaging1 system]# vim openvpn-client\@client.service ... [root@osestaging1 system]# systemctl daemon-reload <30>systemd-fstab-generator[425]: Running in a container, ignoring fstab device entry for /dev/root. [root@osestaging1 system]# systemctl restart openvpn-client\@client.service [root@osestaging1 system]# systemctl status openvpn-client\@client.service ● openvpn-client@client.service - OpenVPN tunnel for client Loaded: loaded (/etc/systemd/system/openvpn-client@client.service; enabled; vendor preset: disabled) Active: active (running) since Thu 2019-10-03 13:33:32 UTC; 1s ago Docs: man:openvpn(8) https://community.openvpn.net/openvpn/wiki/Openvpn24ManPage https://community.openvpn.net/openvpn/wiki/HOWTO Main PID: 432 (openvpn) Status: "Initialization Sequence Completed" CGroup: /user.slice/user-1000.slice/session-582.scope/system.slice/system-openvpn\x2dclient.slice/openvpn-client@client.service └─432 /usr/sbin/openvpn --suppress-timestamps --nobind --config client.conf Oct 03 13:33:33 osestaging1 openvpn[432]: Incoming Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key Oct 03 13:33:33 osestaging1 openvpn[432]: WARNING: Since you are using --dev tun with a point-to-point topology, the second arg...nowarn) Oct 03 13:33:33 osestaging1 openvpn[432]: ROUTE_GATEWAY 192.168.122.1/255.255.255.0 IFACE=eth0 HWADDR=fe:07:06:a6:5f:1d Oct 03 13:33:33 osestaging1 openvpn[432]: TUN/TAP device tun0 opened Oct 03 13:33:33 osestaging1 openvpn[432]: TUN/TAP TX queue length set to 100 Oct 03 13:33:33 osestaging1 openvpn[432]: /sbin/ip link set dev tun0 up mtu 1500 Oct 03 13:33:33 osestaging1 openvpn[432]: /sbin/ip addr add dev tun0 local 10.241.189.11 peer 255.255.255.255 Oct 03 13:33:33 osestaging1 openvpn[432]: /sbin/ip route add 10.241.189.0/24 via 255.255.255.255 Oct 03 13:33:33 osestaging1 openvpn[432]: WARNING: this configuration may cache passwords in memory -- use the auth-nocache opt...nt this Oct 03 13:33:33 osestaging1 openvpn[432]: Initialization Sequence Completed Hint: Some lines were ellipsized, use -l to show in full. [root@osestaging1 system]# ip address show dev tun0 2: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100 link/none inet 10.241.189.11 peer 255.255.255.255/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::927:fae4:1356:9b90/64 scope link flags 800 valid_lft forever preferred_lft forever [root@osestaging1 system]#
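- for the record, making the unit start on boot (which the reboot test below depends on) is just:
systemctl enable openvpn-client@client.service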
- I confirmed that I could ssh into the staging node from my laptop
- I rebooted the staging node
- I confirmed that I could ssh into the staging node again after the reboot!
- I'm not going to bother with trying to setup this with the prod node for now; I'm not in a place where I want to make & test that prod change by rebooting the server..
- this is a good stopping point; I created another snapshot of the staging node
[root@osedev1 ~]# lxc-stop -n osestaging1 [root@osedev1 ~]# lxc-snapshot --name osestaging1 --list snap0 (/var/lib/lxcsnaps/osestaging1) 2019:10:02 15:37:58 [root@osedev1 ~]# lxc-snapshot --name osestaging1 afterVPN lxc_container: lxccontainer.c: lxcapi_snapshot: 2891 Snapshot of directory-backed container requested. lxc_container: lxccontainer.c: lxcapi_snapshot: 2892 Making a copy-clone. If you do want snapshots, then lxc_container: lxccontainer.c: lxcapi_snapshot: 2893 please create an aufs or overlayfs clone first, snapshot that lxc_container: lxccontainer.c: lxcapi_snapshot: 2894 and keep the original container pristine. [root@osedev1 ~]# lxc-snapshot --name osestaging1 --list snap1 (/var/lib/lxcsnaps/osestaging1) 2019:10:03 15:40:16 snap0 (/var/lib/lxcsnaps/osestaging1) 2019:10:02 15:37:58 [root@osedev1 ~]#
- I started the staging container again, and I tested an rsync from prod to staging; first let's see the contents of /etc/varnish on staging
[root@osestaging1 ~]# ls -lah /etc | grep -i varnish [root@osestaging1 ~]#
- and the rsync; it failed. Right, I need to set up passwordless sudo on the staging node
[maltfield@opensourceecology ~]$ sudo -E rsync -e 'ssh -p 32415' --rsync-path="sudo rsync" -av --progress /etc/varnish maltfield@10.241.189.10:/etc/ [sudo] password for maltfield: The authenticity of host '[10.241.189.10]:32415 ([10.241.189.10]:32415)' can't be established. ECDSA key fingerprint is SHA256:HclF8ZQOjGqx+9TmwL111kZ7QxgKkoEw8g3l2YxV0gk. ECDSA key fingerprint is MD5:cd:87:b1:bb:c1:3e:d1:d1:d4:5d:16:c9:e8:30:6a:71. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '[10.241.189.10]:32415' (ECDSA) to the list of known hosts. sudo: no tty present and no askpass program specified rsync: connection unexpectedly closed (0 bytes received so far) [sender] rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9] [maltfield@opensourceecology ~]$
- I added this line to the end of the sudoers file on the staging node with 'visudo'
maltfield ALL=(ALL) NOPASSWD: ALL
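- a quick sanity check of the NOPASSWD rule (my addition; rsync's --rsync-path="sudo rsync" exercises this same non-interactive path):
# 'sudo -n' fails instead of prompting if a password would be required
sudo -n true && echo "passwordless sudo works"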
- doh, I gotta install rsync on the staging node. so many prereqs...
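- on the staging container that's just (assuming the base yum repos are reachable):
yum -y install rsync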
[maltfield@opensourceecology ~]$ sudo -E rsync -e 'ssh -p 32415' --rsync-path="sudo rsync" -av --progress /etc/varnish maltfield@10.241.189.11:/etc/ The authenticity of host '[10.241.189.11]:32415 ([10.241.189.11]:32415)' can't be established. ECDSA key fingerprint is SHA256:a6NpVsq/qdOCV8o7u3TXeVfZIxp7hpgMqXFOifTuNrI. ECDSA key fingerprint is MD5:ab:eb:7f:f2:bb:83:a1:e5:21:49:1e:22:93:17:70:d6. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '[10.241.189.11]:32415' (ECDSA) to the list of known hosts. sudo: rsync: command not found rsync: connection unexpectedly closed (0 bytes received so far) [sender] rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9] [maltfield@opensourceecology ~]$
- this time the rsync worked!
[maltfield@opensourceecology ~]$ sudo -E rsync -e 'ssh -p 32415' --rsync-path="sudo rsync" -av --progress /etc/varnish maltfield@10.241.189.11:/etc/ ... sent 192211 bytes received 503 bytes 128476.00 bytes/sec total size is 190106 speedup is 0.99 [maltfield@opensourceecology ~]$
- here's the dir on the staging node's side
[root@osestaging1 ~]# ls -lah /etc/varnish total 44K drwxr-xr-x. 5 root root 4.0K Aug 27 06:19 . drwxr-xr-x. 63 root root 4.0K Oct 3 13:52 .. -rw-r--r--. 1 root root 1.4K Apr 9 19:10 all-vhosts.vcl -rw-r--r--. 1 root root 697 Nov 19 2017 catch-all.vcl drwxr-xr-x. 2 root root 4.0K Aug 27 06:17 conf -rw-rw-r--. 1 1011 1011 737 Nov 23 2017 default.vcl drwxr-xr-x. 2 root root 4.0K Apr 12 2018 lib -rw-------. 1 root root 129 Apr 12 2018 secret -rw-------. 1 root root 129 Apr 12 2018 secret.20180412.bak drwxr-xr-x. 2 root root 4.0K Aug 27 06:18 sites-enabled -rw-r--r--. 1 root root 1.1K Oct 21 2017 varnish.params [root@osestaging1 ~]#
- again, here are the dirs we want to exclude; the openvpn configs are already preserved
/root /etc/sudo* /etc/openvpn /usr/share/easy-rsa /dev /sys /proc /boot/ /etc/sysconfig/network* /tmp /var/tmp /etc/fstab /etc/mtab /etc/mdadm.conf
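- a tidier alternative (a sketch, not what I ran) would be to keep that list in a file and pass it to rsync via --exclude-from:
# hypothetical exclude file; one pattern per line
cat > /root/rsync-excludes.txt <<'EOF'
/root
/etc/sudo*
/etc/openvpn
/usr/share/easy-rsa
/dev
/sys
/proc
/boot/
/etc/sysconfig/network*
/tmp
/var/tmp
/etc/fstab
/etc/mtab
/etc/mdadm.conf
EOF
# then: rsync --exclude-from=/root/rsync-excludes.txt -av --progress / maltfield@10.241.189.11:/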
- aaaand *fingers crossed* I kicked-off the rsync
[maltfield@opensourceecology ~]$ time sudo -E rsync -e 'ssh -p 32415' --rsync-path="sudo rsync" --exclude=/root --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/ ...
- whoops, I got ahead of myself! I killed it & left the staging server in a broken state, so I restored from snapshot & re-did the visudo & rsync-install steps. But before we actually kick off this whole-system rsync, I need to attach a hetzner cloud volume and mount it to /var. Else, the dev node's little disk will fill up!
[root@osedev1 ~]# lxc-snapshot --name osestaging1 -r snap1 [root@osedev1 ~]# lxc-start -n osestaging1
Wed Oct 02, 2019
- continuing on the dev node, I want to create an lxc container. First I installed 'lxc'
[root@osedev1 ~]# yum install lxc Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile epel/x86_64/metalink | 27 kB 00:00:00 * base: mirror.alpix.eu * epel: mirror.wiuwiu.de * extras: centosmirror.netcup.net * updates: mirror.alpix.eu base | 3.6 kB 00:00:00 epel | 5.3 kB 00:00:00 extras | 2.9 kB 00:00:00 updates | 2.9 kB 00:00:00 (1/6): base/7/x86_64/group_gz | 165 kB 00:00:00 (2/6): base/7/x86_64/primary_db | 6.0 MB 00:00:00 (3/6): epel/x86_64/updateinfo | 1.0 MB 00:00:00 (4/6): updates/7/x86_64/primary_db | 1.1 MB 00:00:00 (5/6): epel/x86_64/primary_db | 6.8 MB 00:00:00 (6/6): extras/7/x86_64/primary_db | 152 kB 00:00:00 Resolving Dependencies --> Running transaction check ---> Package lxc.x86_64 0:1.0.11-2.el7 will be installed --> Processing Dependency: lua-lxc(x86-64) = 1.0.11-2.el7 for package: lxc-1.0.11-2.el7.x86_64 --> Processing Dependency: lua-alt-getopt for package: lxc-1.0.11-2.el7.x86_64 --> Processing Dependency: liblxc.so.1()(64bit) for package: lxc-1.0.11-2.el7.x86_64 --> Running transaction check ---> Package lua-alt-getopt.noarch 0:0.7.0-4.el7 will be installed ---> Package lua-lxc.x86_64 0:1.0.11-2.el7 will be installed --> Processing Dependency: lua-filesystem for package: lua-lxc-1.0.11-2.el7.x86_64 ---> Package lxc-libs.x86_64 0:1.0.11-2.el7 will be installed --> Running transaction check ---> Package lua-filesystem.x86_64 0:1.6.2-2.el7 will be installed --> Finished Dependency Resolution Dependencies Resolved ========================================================================================================================================= Package Arch Version Repository Size ========================================================================================================================================= Installing: lxc x86_64 1.0.11-2.el7 epel 140 k Installing for dependencies: lua-alt-getopt noarch 0.7.0-4.el7 epel 7.4 k lua-filesystem x86_64 1.6.2-2.el7 epel 28 k lua-lxc x86_64 1.0.11-2.el7 epel 17 k lxc-libs x86_64 1.0.11-2.el7 epel 276 k Transaction Summary ========================================================================================================================================= Install 1 Package (+4 Dependent packages) Total download size: 468 k Installed size: 1.0 M Is this ok [y/d/N]: y Downloading packages: (1/5): lua-alt-getopt-0.7.0-4.el7.noarch.rpm | 7.4 kB 00:00:00 (2/5): lua-filesystem-1.6.2-2.el7.x86_64.rpm | 28 kB 00:00:00 (3/5): lua-lxc-1.0.11-2.el7.x86_64.rpm | 17 kB 00:00:00 (4/5): lxc-1.0.11-2.el7.x86_64.rpm | 140 kB 00:00:00 (5/5): lxc-libs-1.0.11-2.el7.x86_64.rpm | 276 kB 00:00:00 ----------------------------------------------------------------------------------------------------------------------------------------- Total 717 kB/s | 468 kB 00:00:00 Running transaction check Running transaction test Transaction test succeeded Running transaction Installing : lxc-libs-1.0.11-2.el7.x86_64 1/5 Installing : lua-filesystem-1.6.2-2.el7.x86_64 2/5 Installing : lua-lxc-1.0.11-2.el7.x86_64 3/5 Installing : lua-alt-getopt-0.7.0-4.el7.noarch 4/5 Installing : lxc-1.0.11-2.el7.x86_64 5/5 Verifying : lua-lxc-1.0.11-2.el7.x86_64 1/5 Verifying : lua-alt-getopt-0.7.0-4.el7.noarch 2/5 Verifying : lxc-1.0.11-2.el7.x86_64 3/5 Verifying : lua-filesystem-1.6.2-2.el7.x86_64 4/5 Verifying : lxc-libs-1.0.11-2.el7.x86_64 5/5 Installed: lxc.x86_64 0:1.0.11-2.el7 Dependency Installed: lua-alt-getopt.noarch 0:0.7.0-4.el7 lua-filesystem.x86_64 0:1.6.2-2.el7 lua-lxc.x86_64 0:1.0.11-2.el7 lxc-libs.x86_64 0:1.0.11-2.el7 Complete! 
[root@osedev1 ~]#
- by default, it appears that we have no lxc templates
[root@osedev1 ~]# ls -lah /usr/share/lxc/templates/ total 8.0K drwxr-xr-x. 2 root root 4.0K Mar 7 2019 . drwxr-xr-x. 6 root root 4.0K Oct 2 12:16 .. [root@osedev1 ~]#
- I installed the 'lxc-templates' package (also from epel), and it gave me templates for many distros, including centos
[root@osedev1 ~]# yum -y install lxc-templates Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile * base: mirror.alpix.eu * epel: mirror.wiuwiu.de * extras: centosmirror.netcup.net * updates: mirror.alpix.eu Resolving Dependencies --> Running transaction check ---> Package lxc-templates.x86_64 0:1.0.11-2.el7 will be installed --> Finished Dependency Resolution Dependencies Resolved ========================================================================================================================================= Package Arch Version Repository Size ========================================================================================================================================= Installing: lxc-templates x86_64 1.0.11-2.el7 epel 81 k Transaction Summary ========================================================================================================================================= Install 1 Package Total download size: 81 k Installed size: 333 k Downloading packages: lxc-templates-1.0.11-2.el7.x86_64.rpm | 81 kB 00:00:00 Running transaction check Running transaction test Transaction test succeeded Running transaction Installing : lxc-templates-1.0.11-2.el7.x86_64 1/1 Verifying : lxc-templates-1.0.11-2.el7.x86_64 1/1 Installed: lxc-templates.x86_64 0:1.0.11-2.el7 Complete! [root@osedev1 ~]# ls -lah /usr/share/lxc/templates/ total 348K drwxr-xr-x. 2 root root 4.0K Oct 2 12:29 . drwxr-xr-x. 6 root root 4.0K Oct 2 12:16 .. -rwxr-xr-x. 1 root root 11K Mar 7 2019 lxc-alpine -rwxr-xr-x. 1 root root 14K Mar 7 2019 lxc-altlinux -rwxr-xr-x. 1 root root 11K Mar 7 2019 lxc-archlinux -rwxr-xr-x. 1 root root 9.5K Mar 7 2019 lxc-busybox -rwxr-xr-x. 1 root root 30K Mar 7 2019 lxc-centos -rwxr-xr-x. 1 root root 11K Mar 7 2019 lxc-cirros -rwxr-xr-x. 1 root root 18K Mar 7 2019 lxc-debian -rwxr-xr-x. 1 root root 18K Mar 7 2019 lxc-download -rwxr-xr-x. 1 root root 49K Mar 7 2019 lxc-fedora -rwxr-xr-x. 1 root root 28K Mar 7 2019 lxc-gentoo -rwxr-xr-x. 1 root root 14K Mar 7 2019 lxc-openmandriva -rwxr-xr-x. 1 root root 14K Mar 7 2019 lxc-opensuse -rwxr-xr-x. 1 root root 35K Mar 7 2019 lxc-oracle -rwxr-xr-x. 1 root root 12K Mar 7 2019 lxc-plamo -rwxr-xr-x. 1 root root 6.7K Mar 7 2019 lxc-sshd -rwxr-xr-x. 1 root root 24K Mar 7 2019 lxc-ubuntu -rwxr-xr-x. 1 root root 12K Mar 7 2019 lxc-ubuntu-cloud [root@osedev1 ~]#
- now I was successfully able to create an lxc container for our staging node named 'osestaging1' from the template 'centos'. I didn't specify the version, but it does appear to be centos7
[root@osedev1 ~]# lxc-create -n osestaging1 -t centos Host CPE ID from /etc/os-release: cpe:/o:centos:centos:7 Checking cache download in /var/cache/lxc/centos/x86_64/7/rootfs ... Downloading CentOS minimal ... ... Download complete. Copy /var/cache/lxc/centos/x86_64/7/rootfs to /var/lib/lxc/osestaging1/rootfs ... Copying rootfs to /var/lib/lxc/osestaging1/rootfs ... sed: can't read /var/lib/lxc/osestaging1/rootfs/etc/init/tty.conf: No such file or directory Storing root password in '/var/lib/lxc/osestaging1/tmp_root_pass' Expiring password for user root. passwd: Success sed: can't read /var/lib/lxc/osestaging1/rootfs/etc/rc.sysinit: No such file or directory sed: can't read /var/lib/lxc/osestaging1/rootfs/etc/rc.d/rc.sysinit: No such file or directory Container rootfs and config have been created. Edit the config file to check/enable networking setup. The temporary root password is stored in: '/var/lib/lxc/osestaging1/tmp_root_pass' The root password is set up as expired and will require it to be changed at first login, which you should do as soon as possible. If you lose the root password or wish to change it without starting the container, you can change it from the host by running the following command (which will also reset the expired flag): chroot /var/lib/lxc/osestaging1/rootfs passwd [root@osedev1 ~]#
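- note: template-specific args go after a '--'; from memory (unverified here), pinning the release explicitly would look something like:
lxc-create -n osestaging1 -t centos -- --release=7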
- the sync from prod to staging is going to overwrite the staging root password, so I won't bother creating & setting a distinct root password for this staging container
- `lxc-top` shows that we have 0 containers running
[root@osedev1 ~]# lxc-top Container CPU CPU CPU BlkIO Mem Name Used Sys User Total Used TOTAL (0 ) 0.00 0.00 0.00 0.00 0.00
- I tried to start the staging container, but I got a networking error
[root@osedev1 ~]# lxc-start -n osestaging1 lxc-start: conf.c: instantiate_veth: 3115 failed to attach 'vethWX1L1G' to the bridge 'virbr0': No such device lxc-start: conf.c: lxc_create_network: 3407 failed to create netdev lxc-start: start.c: lxc_spawn: 875 failed to create the network lxc-start: start.c: __lxc_start: 1149 failed to spawn 'osestaging1' lxc-start: lxc_start.c: main: 336 The container failed to start. lxc-start: lxc_start.c: main: 340 Additional information can be obtained by setting the --logfile and --logpriority options. [root@osedev1 ~]#
- it looks like there is no 'virbr0' device; we only have the loopback, ethernet, and tun device for openvpn
[root@osedev1 ~]# ip -all address show 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 link/ether 96:00:00:2e:48:9d brd ff:ff:ff:ff:ff:ff inet 195.201.233.113/32 brd 195.201.233.113 scope global dynamic eth0 valid_lft 56775sec preferred_lft 56775sec inet6 2a01:4f8:c010:3ca0::1/64 scope global valid_lft forever preferred_lft forever inet6 fe80::9400:ff:fe2e:489d/64 scope link valid_lft forever preferred_lft forever 3: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100 link/none inet 10.241.189.1 peer 10.241.189.2/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::4ca6:2d27:e97f:1a66/64 scope link flags 800 valid_lft forever preferred_lft forever [root@osedev1 ~]#
- Ideally, the container would not be given an internet-facing ip address, anyway. It would be better to give it a bridge on the tun0 openvpn network
- it looks like the relevant files for containers are in /var/lib/lxc/<containerName>/
[root@osedev1 osestaging1]# date Wed Oct 2 12:47:07 CEST 2019 [root@osedev1 osestaging1]# pwd /var/lib/lxc/osestaging1 [root@osedev1 osestaging1]# ls config rootfs tmp_root_pass [root@osedev1 osestaging1]#
- here is the default config
[root@osedev1 osestaging1]# cat config # Template used to create this container: /usr/share/lxc/templates/lxc-centos # Parameters passed to the template: # For additional config options, please look at lxc.container.conf(5) lxc.network.type = veth lxc.network.flags = up lxc.network.link = virbr0 lxc.network.hwaddr = fe:07:06:a6:5f:1d lxc.rootfs = /var/lib/lxc/osestaging1/rootfs # Include common configuration lxc.include = /usr/share/lxc/config/centos.common.conf lxc.arch = x86_64 lxc.utsname = osestaging1 lxc.autodev = 1 # When using LXC with apparmor, uncomment the next line to run unconfined: #lxc.aa_profile = unconfined # example simple networking setup, uncomment to enable #lxc.network.type = veth #lxc.network.flags = up #lxc.network.link = lxcbr0 #lxc.network.name = eth0 # Additional example for veth network type # static MAC address, #lxc.network.hwaddr = 00:16:3e:77:52:20 # persistent veth device name on host side # Note: This may potentially collide with other containers of same name! #lxc.network.veth.pair = v-osestaging1-e0 [root@osedev1 osestaging1]#
- to my horror, I discovered that iptables was disabled on the dev server! why!?!
[root@osedev1 osestaging1]# iptables-save [root@osedev1 osestaging1]# ip6tables-save [root@osedev1 osestaging1]# service iptables status Redirecting to /bin/systemctl status iptables.service ● iptables.service - IPv4 firewall with iptables Loaded: loaded (/usr/lib/systemd/system/iptables.service; disabled; vendor preset: disabled) Active: inactive (dead) [root@osedev1 osestaging1]# service iptables start Redirecting to /bin/systemctl start iptables.service [root@osedev1 osestaging1]# iptables-save # Generated by iptables-save v1.4.21 on Wed Oct 2 12:58:21 2019 *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [17:1396] -A INPUT -i lo -j ACCEPT -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A INPUT -p icmp -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 32415 -j ACCEPT -A INPUT -p udp -m state --state NEW -m udp --dport 1194 -j ACCEPT -A INPUT -j DROP COMMIT # Completed on Wed Oct 2 12:58:21 2019 [root@osedev1 osestaging1]# ip6tables-save root@osedev1 osestaging1]# service ip6tables start Redirecting to /bin/systemctl start ip6tables.service [root@osedev1 osestaging1]# ip6tables-save # Generated by ip6tables-save v1.4.21 on Wed Oct 2 12:59:51 2019 *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [0:0] -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A INPUT -p ipv6-icmp -j ACCEPT -A INPUT -i lo -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT -A INPUT -d fe80::/64 -p udp -m udp --dport 546 -m state --state NEW -j ACCEPT -A INPUT -j REJECT --reject-with icmp6-adm-prohibited -A FORWARD -j REJECT --reject-with icmp6-adm-prohibited COMMIT # Completed on Wed Oct 2 12:59:51 2019 [root@osedev1 osestaging1]#
- systemd says that both iptables.service & ip6tables.service are 'loaded active exited'
[root@osedev1 osestaging1]# systemctl list-units | grep -Ei 'iptables|ip6tables' ip6tables.service loaded active exited IPv6 firewall with ip6tables iptables.service loaded active exited IPv4 firewall with iptables [root@osedev1 osestaging1]#
- systemd status shows both services are 'disabled'
[root@osedev1 osestaging1]# systemctl status iptables.service ● iptables.service - IPv4 firewall with iptables Loaded: loaded (/usr/lib/systemd/system/iptables.service; disabled; vendor preset: disabled) Active: active (exited) since Wed 2019-10-02 12:58:17 CEST; 7min ago Process: 29121 ExecStart=/usr/libexec/iptables/iptables.init start (code=exited, status=0/SUCCESS) Main PID: 29121 (code=exited, status=0/SUCCESS) CGroup: /system.slice/iptables.service Oct 02 12:58:17 osedev1 systemd[1]: Starting IPv4 firewall with iptables... Oct 02 12:58:17 osedev1 iptables.init[29121]: iptables: Applying firewall rules: [ OK ] Oct 02 12:58:17 osedev1 systemd[1]: Started IPv4 firewall with iptables. [root@osedev1 osestaging1]# systemctl status ip6tables.service ● ip6tables.service - IPv6 firewall with ip6tables Loaded: loaded (/usr/lib/systemd/system/ip6tables.service; disabled; vendor preset: disabled) Active: active (exited) since Wed 2019-10-02 12:59:46 CEST; 6min ago Process: 29233 ExecStart=/usr/libexec/iptables/ip6tables.init start (code=exited, status=0/SUCCESS) Main PID: 29233 (code=exited, status=0/SUCCESS) Oct 02 12:59:46 osedev1 systemd[1]: Starting IPv6 firewall with ip6tables... Oct 02 12:59:46 osedev1 ip6tables.init[29233]: ip6tables: Applying firewall rules: [ OK ] Oct 02 12:59:46 osedev1 systemd[1]: Started IPv6 firewall with ip6tables. [root@osedev1 osestaging1]#
- I enabled both, and I confirmed that they're now set to 'enabled' (see second line)
[root@osedev1 osestaging1]# systemctl enable iptables.service Created symlink from /etc/systemd/system/basic.target.wants/iptables.service to /usr/lib/systemd/system/iptables.service. [root@osedev1 osestaging1]# systemctl enable ip6tables.service Created symlink from /etc/systemd/system/basic.target.wants/ip6tables.service to /usr/lib/systemd/system/ip6tables.service. [root@osedev1 osestaging1]# systemctl status iptables.service ● iptables.service - IPv4 firewall with iptables Loaded: loaded (/usr/lib/systemd/system/iptables.service; enabled; vendor preset: disabled) Active: active (exited) since Wed 2019-10-02 12:58:17 CEST; 8min ago Main PID: 29121 (code=exited, status=0/SUCCESS) CGroup: /system.slice/iptables.service Oct 02 12:58:17 osedev1 systemd[1]: Starting IPv4 firewall with iptables... Oct 02 12:58:17 osedev1 iptables.init[29121]: iptables: Applying firewall rules: [ OK ] Oct 02 12:58:17 osedev1 systemd[1]: Started IPv4 firewall with iptables. [root@osedev1 osestaging1]# systemctl status ip6tables.service ● ip6tables.service - IPv6 firewall with ip6tables Loaded: loaded (/usr/lib/systemd/system/ip6tables.service; enabled; vendor preset: disabled) Active: active (exited) since Wed 2019-10-02 12:59:46 CEST; 7min ago Main PID: 29233 (code=exited, status=0/SUCCESS) Oct 02 12:59:46 osedev1 systemd[1]: Starting IPv6 firewall with ip6tables... Oct 02 12:59:46 osedev1 ip6tables.init[29233]: ip6tables: Applying firewall rules: [ OK ] Oct 02 12:59:46 osedev1 systemd[1]: Started IPv6 firewall with ip6tables. [root@osedev1 osestaging1]#
- actually, it doesn't make sense for the staging server to only have an ip address on the openvpn subnet; if that were the case, then it couldn't access the internet... which would make developing a POC nearly impossible. We want to prevent forwarding ports from the internet to the machine, but we do want to let it reach OUT to the internet. Perhaps we should set up the bridge per normal and then just have the openvpn client running on the staging server. Indeed, we'll need the prod server to be running an openvpn client, so we should be able to just duplicate this config (they'll be the same anyway!)
- I looked into what options are available for 'lxc.network.type', which is listed in section 5 of the man page for 'lxc.container.conf' = `man 5 lxc.container.conf`
lxc.network.type
specify what kind of network virtualization to be used for the container. Each time a lxc.network.type field is found a new round of network configuration begins. In this way, several network virtualization types can be specified for the same container, as well as assigning several network interfaces for one container. The different virtualization types can be:
none: will cause the container to share the host's network namespace. This means the host network devices are usable in the container. It also means that if both the container and host have upstart as init, 'halt' in a container (for instance) will shut down the host.
empty: will create only the loopback interface.
veth: a virtual ethernet pair device is created with one side assigned to the container and the other side attached to a bridge specified by the lxc.network.link option. If the bridge is not specified, then the veth pair device will be created but not attached to any bridge. Otherwise, the bridge has to be created on the system before starting the container. lxc won't handle any configuration outside of the container. By default, lxc chooses a name for the network device belonging to the outside of the container, but if you wish to handle this name yourselves, you can tell lxc to set a specific name with the lxc.network.veth.pair option (except for unprivileged containers where this option is ignored for security reasons).
vlan: a vlan interface is linked with the interface specified by the lxc.network.link and assigned to the container. The vlan identifier is specified with the option lxc.network.vlan.id.
macvlan: a macvlan interface is linked with the interface specified by the lxc.network.link and assigned to the container. lxc.network.macvlan.mode specifies the mode the macvlan will use to communicate between different macvlan on the same upper device. The accepted modes are private, the device never communicates with any other device on the same upper_dev (default), vepa, the new Virtual Ethernet Port Aggregator (VEPA) mode, it assumes that the adjacent bridge returns all frames where both source and destination are local to the macvlan port, i.e. the bridge is set up as a reflective relay. Broadcast frames coming in from the upper_dev get flooded to all macvlan interfaces in VEPA mode, local frames are not delivered locally, or bridge, it provides the behavior of a simple bridge between different macvlan interfaces on the same port. Frames from one interface to another one get delivered directly and are not sent out externally. Broadcast frames get flooded to all other bridge ports and to the external interface, but when they come back from a reflective relay, we don't deliver them again. Since we know all the MAC addresses, the macvlan bridge mode does not require learning or STP like the bridge module does.
phys: an already existing interface specified by the lxc.network.link is assigned to the container.
- we want the container to be able to touch the internet, so that rules out 'empty'
- we don't have a spare physical interface on the server for each container, so that rules out 'phys'
- I'm unclear on the distinction between macvlan, vlan, veth, and none. Probably we want veth and we need to get the 'virbr0' interface actually working
- google says our error may be caused by libvirt not being installed
- I didn't have libvirt installed, so I did so
[root@osedev1 osestaging1]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 link/ether 96:00:00:2e:48:9d brd ff:ff:ff:ff:ff:ff inet 195.201.233.113/32 brd 195.201.233.113 scope global dynamic eth0 valid_lft 50735sec preferred_lft 50735sec inet6 2a01:4f8:c010:3ca0::1/64 scope global valid_lft forever preferred_lft forever inet6 fe80::9400:ff:fe2e:489d/64 scope link valid_lft forever preferred_lft forever 3: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100 link/none inet 10.241.189.1 peer 10.241.189.2/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::4ca6:2d27:e97f:1a66/64 scope link flags 800 valid_lft forever preferred_lft forever [root@osedev1 osestaging1]# rpm -qa | grep -i libvirt [root@osedev1 osestaging1]# yum -y install libvirt Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile * base: mirror.alpix.eu * epel: mirror.wiuwiu.de * extras: centosmirror.netcup.net * updates: mirror.alpix.eu Resolving Dependencies ... Complete! [root@osedev1 osestaging1]#
- but there didn't appear to be any changes; I had to manually start the libvirtd service to see them; now it shows two new interfaces: 'virbr0' & 'virbr0-nic'
[root@osedev1 osestaging1]# systemctl status libvirtd ● libvirtd.service - Virtualization daemon Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled) Active: inactive (dead) Docs: man:libvirtd(8) https://libvirt.org [root@osedev1 osestaging1]# systemctl start libvirtd [root@osedev1 osestaging1]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 link/ether 96:00:00:2e:48:9d brd ff:ff:ff:ff:ff:ff inet 195.201.233.113/32 brd 195.201.233.113 scope global dynamic eth0 valid_lft 50619sec preferred_lft 50619sec inet6 2a01:4f8:c010:3ca0::1/64 scope global valid_lft forever preferred_lft forever inet6 fe80::9400:ff:fe2e:489d/64 scope link valid_lft forever preferred_lft forever 3: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100 link/none inet 10.241.189.1 peer 10.241.189.2/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::4ca6:2d27:e97f:1a66/64 scope link flags 800 valid_lft forever preferred_lft forever 6: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000 link/ether 52:54:00:7d:01:71 brd ff:ff:ff:ff:ff:ff inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0 valid_lft forever preferred_lft forever 7: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN group default qlen 1000 link/ether 52:54:00:7d:01:71 brd ff:ff:ff:ff:ff:ff [root@osedev1 osestaging1]#
- and there are some changes to the routing table too
[root@osedev1 osestaging1]# ip link 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000 link/ether 96:00:00:2e:48:9d brd ff:ff:ff:ff:ff:ff 3: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 100 link/none 6: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000 link/ether 52:54:00:7d:01:71 brd ff:ff:ff:ff:ff:ff 7: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN mode DEFAULT group default qlen 1000 link/ether 52:54:00:7d:01:71 brd ff:ff:ff:ff:ff:ff [root@osedev1 osestaging1]# ip r default via 172.31.1.1 dev eth0 10.241.189.0/24 via 10.241.189.2 dev tun0 10.241.189.2 dev tun0 proto kernel scope link src 10.241.189.1 169.254.0.0/16 dev eth0 scope link metric 1002 172.31.1.1 dev eth0 scope link 192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 [root@osedev1 osestaging1]#
- now I was successfully able to start the 'osestaging1' container
[root@osedev1 osestaging1]# lxc-start -n osestaging1 systemd 219 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN) Detected virtualization lxc. Detected architecture x86-64. Welcome to CentOS Linux 7 (Core)! Running in a container, ignoring fstab device entry for /dev/root. Cannot add dependency job for unit display-manager.service, ignoring: Unit not found. [ OK ] Reached target Remote File Systems. [ OK ] Reached target Swap. [ OK ] Started Forward Password Requests to Wall Directory Watch. [ OK ] Created slice Root Slice. [ OK ] Created slice User and Session Slice. [ OK ] Listening on /dev/initctl Compatibility Named Pipe. [ OK ] Listening on Journal Socket. [ OK ] Started Dispatch Password Requests to Console Directory Watch. [ OK ] Reached target Local Encrypted Volumes. [ OK ] Reached target Paths. [ OK ] Listening on Delayed Shutdown Socket. [ OK ] Created slice System Slice. [ OK ] Created slice system-getty.slice. Starting Journal Service... Mounting POSIX Message Queue File System... [ OK ] Reached target Slices. Starting Read and set NIS domainname from /etc/sysconfig/network... Mounting Huge Pages File System... Starting Remount Root and Kernel File Systems... [ OK ] Mounted Huge Pages File System. [ OK ] Mounted POSIX Message Queue File System. [ OK ] Started Journal Service. [ OK ] Started Read and set NIS domainname from /etc/sysconfig/network. [ OK ] Started Remount Root and Kernel File Systems. [ OK ] Reached target Local File Systems (Pre). Starting Configure read-only root support... Starting Rebuild Hardware Database... Starting Flush Journal to Persistent Storage... <46>systemd-journald[14]: Received request to flush runtime journal from PID 1 [ OK ] Started Flush Journal to Persistent Storage. [ OK ] Started Configure read-only root support. Starting Load/Save Random Seed... [ OK ] Reached target Local File Systems. Starting Rebuild Journal Catalog... Starting Mark the need to relabel after reboot... Starting Create Volatile Files and Directories... [ OK ] Started Load/Save Random Seed. [ OK ] Reached target Local File Systems. Starting Rebuild Journal Catalog... Starting Mark the need to relabel after reboot... Starting Create Volatile Files and Directories... [ OK ] Started Load/Save Random Seed. [ OK ] Started Rebuild Journal Catalog. [ OK ] Started Mark the need to relabel after reboot. [ OK ] Started Create Volatile Files and Directories. Starting Update UTMP about System Boot/Shutdown... [ OK ] Started Update UTMP about System Boot/Shutdown. [ OK ] Started Rebuild Hardware Database. Starting Update is Completed... [ OK ] Started Update is Completed. [ OK ] Reached target System Initialization. [ OK ] Listening on D-Bus System Message Bus Socket. [ OK ] Reached target Sockets. [ OK ] Reached target Basic System. Starting LSB: Bring up/down networking... Starting Permit User Sessions... Starting Login Service... Starting OpenSSH Server Key Generation... [ OK ] Started D-Bus System Message Bus. [ OK ] Started Daily Cleanup of Temporary Directories. [ OK ] Reached target Timers. [ OK ] Started Permit User Sessions. Starting Cleanup of Temporary Directories... [ OK ] Started Command Scheduler. [ OK ] Started Console Getty. [ OK ] Reached target Login Prompts. [ OK ] Started Cleanup of Temporary Directories. [ OK ] Started Login Service. [ OK ] Started OpenSSH Server Key Generation. 
CentOS Linux 7 (Core) Kernel 3.10.0-957.21.3.el7.x86_64 on an x86_64 osestaging1 login:
- I was able to login as root, but it made me change the password immediately. I just set it to the same root password as our prod server
osestaging1 login: root Password: You are required to change your password immediately (root enforced) Changing password for root. (current) UNIX password: New password: Retype new password: [root@osestaging1 ~]#
- this new container has an ip address of '192.168.122.201', and it does have access to the internet
[root@osestaging1 ~]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 8: eth0@if9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether fe:07:06:a6:5f:1d brd ff:ff:ff:ff:ff:ff link-netnsid 0 inet 192.168.122.201/24 brd 192.168.122.255 scope global dynamic eth0 valid_lft 3310sec preferred_lft 3310sec inet6 fe80::fc07:6ff:fea6:5f1d/64 scope link valid_lft forever preferred_lft forever [root@osestaging1 ~]# ping 1.1.1.1 PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data. 64 bytes from 1.1.1.1: icmp_seq=1 ttl=55 time=5.46 ms 64 bytes from 1.1.1.1: icmp_seq=2 ttl=55 time=5.48 ms --- 1.1.1.1 ping statistics --- 2 packets transmitted, 2 received, 0% packet loss, time 1001ms rtt min/avg/max/mdev = 5.468/5.474/5.480/0.006 ms [root@osestaging1 ~]#
- on the dev node host, we can also see the bridge with `brctl`
[root@osedev1 osestaging1]# brctl show bridge name bridge id STP enabled interfaces virbr0 8000.5254007d0171 yes vethYMJVGD virbr0-nic [root@osedev1 osestaging1]#
- now I think we're about ready to initiate this sync. Interesting decision: we could rsync (via ssh) either to the dev node or to the staging container. I think it would be safer to go to the container, since a mistake there can't clobber the dev node host
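- a minimal sketch of what that invocation could look like (the destination host & path are hypothetical placeholders; the excludes are standard practice, since pseudo-filesystems like /proc, /sys, /dev, and /run hold runtime state, not data we want to copy):
# run from prod once it can reach staging; destination is a placeholder
rsync -av --progress --exclude=/proc --exclude=/sys --exclude=/dev --exclude=/run --exclude=/tmp -e ssh / maltfield@osestaging1:/var/tmp/prod-sync/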
- I confirmed that ssh is listening on the default install of the staging container
[root@osestaging1 ~]# ss -plan | grep -i ssh u_str ESTAB 0 0 * 162265 * 0 users:(("sshd",pid=298,fd=2),("sshd",pid=298,fd=1)) tcp LISTEN 0 128 *:22 *:* users:(("sshd",pid=298,fd=3)) tcp LISTEN 0 128 [::]:22 [::]:* users:(("sshd",pid=298,fd=4)) [root@osestaging1 ~]#
- I did some basic bootstrap config of the staging container, following my documentation for doing the same to its host dev server Maltfield_Log/2019_Q3#Tue_Aug_20.2C_2019
[root@osestaging1 ~]# useradd maltfield [root@osestaging1 ~]# su - maltfield [maltfield@osestaging1 ~]$ mkdir .ssh [maltfield@osestaging1 ~]$ echo "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDGNYjR7UKiJSAG/AbP+vlCBqNfQZ2yuSXfsEDuM7cEU8PQNJyuJnS7m0VcA48JRnpUpPYYCCB0fqtIEhpP+szpMg2LByfTtbU0vDBjzQD9mEfwZ0mzJsfzh1Nxe86l/d6h6FhxAqK+eG7ljYBElDhF4l2lgcMAl9TiSba0pcqqYBRsvJgQoAjlZOIeVEvM1lyfWfrmDaFK37jdUCBWq8QeJ98qpNDX4A76f9T5Y3q5EuSFkY0fcU+zwFxM71bGGlgmo5YsMMdSsW+89fSG0652/U4sjf4NTHCpuD0UaSPB876NJ7QzeDWtOgyBC4nhPpS8pgjsnl48QZuVm6FNDqbXr9bVk5BdntpBgps+gXdSL2j0/yRRayLXzps1LCdasMCBxCzK+lJYWGalw5dNaIDHBsEZiK55iwPp0W3lU9vXFO4oKNJGFgbhNmn+KAaW82NBwlTHo/tOlj2/VQD9uaK5YLhQqAJzIq0JuWZWFLUC2FJIIG0pJBIonNabANcN+vq+YJqjd+JXNZyTZ0mzuj3OAB/Z5zS6lT9azPfnEjpcOngFs46P7S/1hRIrSWCvZ8kfECpa8W+cTMus4rpCd40d1tVKzJA/n0MGJjEs2q4cK6lC08pXxq9zAyt7PMl94PHse2uzDFhrhh7d0ManxNZE+I5/IPWOnG1PJsDlOe4Yqw== michael@opensourceecology.org" > .ssh/authorized_keys [maltfield@osestaging1 ~]$ chmod 0700 .ssh [maltfield@osestaging1 ~]$ chmod 0600 .ssh/authorized_keys [maltfield@osestaging1 ~]$
- I confirmed that I could now successfully ssh in as 'maltfield' using my key into staging from within dev
user@ose:~$ ssh -A osedev1 Last login: Wed Oct 2 12:09:35 2019 from 5.254.96.238 [maltfield@osedev1 ~]$ ssh maltfield@192.168.122.201 hostname The authenticity of host '192.168.122.201 (192.168.122.201)' can't be established. ECDSA key fingerprint is SHA256:a6NpVsq/qdOCV8o7u3TXeVfZIxp7hpgMqXFOifTuNrI. ECDSA key fingerprint is MD5:ab:eb:7f:f2:bb:83:a1:e5:21:49:1e:22:93:17:70:d6. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '192.168.122.201' (ECDSA) to the list of known hosts. osestaging1 [maltfield@osedev1 ~]$
- and continued with the bootstrap of my user, giving myself sudo rights
[root@osestaging1 ~]# yum -y install sudo ... Installed: sudo.x86_64 0:1.8.23-4.el7 Complete! [root@osestaging1 ~]# passwd maltfield Changing password for user maltfield. New password: Retype new password: passwd: all authentication tokens updated successfully. [root@osestaging1 ~]# gpasswd -a maltfield wheel Adding user maltfield to group wheel [root@osestaging1 ~]# su - maltfield Last login: Wed Oct 2 13:00:29 UTC 2019 on lxc/console [maltfield@osestaging1 ~]$ sudo su - We trust you have received the usual lecture from the local System Administrator. It usually boils down to these three things: #1) Respect the privacy of others. #2) Think before you type. #3) With great power comes great responsibility. [sudo] password for maltfield: Last login: Wed Oct 2 12:33:00 UTC 2019 on lxc/console [root@osestaging1 ~]#
- this time I took the hardened sshd config from dev and gave it to staging; first, on dev I ran:
user@ose:~$ ssh osedev1 Last login: Wed Oct 2 14:57:15 2019 from 5.254.96.238 [maltfield@osedev1 ~]$ sudo cp /etc/ssh/sshd_config . [maltfield@osedev1 ~]$ sudo chown maltfield sshd_config [maltfield@osedev1 ~]$ scp sshd_config 192.168.122.201: sshd_config 100% 4455 5.7MB/s 00:00 [maltfield@osedev1 ~]$
- and then in staging
[maltfield@osestaging1 ~]$ ls sshd_config [maltfield@osestaging1 ~]$ sudo su - [sudo] password for maltfield: Last login: Wed Oct 2 13:02:02 UTC 2019 on lxc/console [root@osestaging1 ~]# cd /etc/ssh [root@osestaging1 ssh]# mv sshd_config sshd_config.20191002.orig [root@osestaging1 ssh]# mv /home/maltfield/sshd_config . [root@osestaging1 ssh]# ls -lah total 620K drwxr-xr-x. 2 root root 4.0K Oct 2 13:16 . drwxr-xr-x. 60 root root 4.0K Oct 2 13:01 .. -rw-r--r--. 1 root root 569K Aug 9 01:40 moduli -rw-r--r--. 1 root root 2.3K Aug 9 01:40 ssh_config -rw-r-----. 1 root ssh_keys 227 Oct 2 12:28 ssh_host_ecdsa_key -rw-r--r--. 1 root root 162 Oct 2 12:28 ssh_host_ecdsa_key.pub -rw-r-----. 1 root ssh_keys 387 Oct 2 12:28 ssh_host_ed25519_key -rw-r--r--. 1 root root 82 Oct 2 12:28 ssh_host_ed25519_key.pub -rw-r-----. 1 root ssh_keys 1.7K Oct 2 12:28 ssh_host_rsa_key -rw-r--r--. 1 root root 382 Oct 2 12:28 ssh_host_rsa_key.pub -rw-------. 1 maltfield maltfield 4.4K Oct 2 13:07 sshd_config -rw-------. 1 root root 3.9K Aug 9 01:40 sshd_config.20191002.orig [root@osestaging1 ssh]# chown root:root sshd_config [root@osestaging1 ssh]# ls -lah total 620K drwxr-xr-x. 2 root root 4.0K Oct 2 13:16 . drwxr-xr-x. 60 root root 4.0K Oct 2 13:01 .. -rw-r--r--. 1 root root 569K Aug 9 01:40 moduli -rw-r--r--. 1 root root 2.3K Aug 9 01:40 ssh_config -rw-r-----. 1 root ssh_keys 227 Oct 2 12:28 ssh_host_ecdsa_key -rw-r--r--. 1 root root 162 Oct 2 12:28 ssh_host_ecdsa_key.pub -rw-r-----. 1 root ssh_keys 387 Oct 2 12:28 ssh_host_ed25519_key -rw-r--r--. 1 root root 82 Oct 2 12:28 ssh_host_ed25519_key.pub -rw-r-----. 1 root ssh_keys 1.7K Oct 2 12:28 ssh_host_rsa_key -rw-r--r--. 1 root root 382 Oct 2 12:28 ssh_host_rsa_key.pub -rw-------. 1 root root 4.4K Oct 2 13:07 sshd_config -rw-------. 1 root root 3.9K Aug 9 01:40 sshd_config.20191002.orig [root@osestaging1 ssh]# grep AllowGroups sshd_config AllowGroups sshaccess [root@osestaging1 ssh]# grep sshaccess /etc/group [root@osestaging1 ssh]# groupadd sshaccess [root@osestaging1 ssh]# gpasswd -a maltfield sshaccess Adding user maltfield to group sshaccess [root@osestaging1 ssh]# grep sshaccess /etc/group sshaccess:x:1001:maltfield [root@osestaging1 ssh]# systemctl restart sshd [root@osestaging1 ssh]#
- confirmed that I could still ssh in on the new non-standard port from dev to staging
user@ose:~$ ssh osedev1 Last login: Wed Oct 2 15:13:21 2019 from 5.254.96.225 [maltfield@osedev1 ~]$ ssh maltfield@192.168.122.201 hostname ssh: connect to host 192.168.122.201 port 22: Connection refused [maltfield@osedev1 ~]$ ssh -p 32415 maltfield@192.168.122.201 hostname osestaging1 [maltfield@osedev1 ~]$
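- for reference, the relevant directives in that hardened sshd_config boil down to something like this (Port and AllowGroups are confirmed by the greps and ssh test above; the remaining lines are typical hardening and an assumption about the file's exact contents):
Port 32415
AllowGroups sshaccess
# the following are assumed, not verified against the actual file
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes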
- I could go further and set up iptables to block incoming traffic, but the beauty of this being a container with a NAT'd private ip address, on a host whose internet-facing ip address is already locked down with iptables, is that we really don't need to. It's already unreachable from the internet, and it will only be accessible from the dev node, into which our developers must first vpn as a prerequisite to reaching this staging node
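- for the record, if we ever did want belt-and-suspenders filtering inside the container anyway, it would only take a few rules like these (hypothetical; not applied, since NAT + the host firewall already cover this):
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -p tcp --dport 32415 -s 192.168.122.0/24 -j ACCEPT
iptables -P INPUT DROP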
- let's make it so that prod can reach staging: we'll create an openvpn cert for our prod node and install it on both our prod & staging nodes, then update our openvpn config to include the 'client-to-client' option https://openvpn.net/community-resources/how-to/#scope
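- concretely, that last part is a one-line change to the openvpn server config on the dev node ('client-to-client' is the documented directive; the config path is an assumption):
# /etc/openvpn/server.conf (path is an assumption)
# allow vpn clients (e.g. prod) to reach other vpn clients (e.g. the route to staging)
client-to-client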
- before continuing, it would be wise to create a snapshot of the staging container
[root@osedev1 ssh]# lxc-snapshot --name osestaging1 --list No snapshots [root@osedev1 ssh]# lxc-snapshot --name osestaging1 afterBootstrap lxc_container: lxccontainer.c: lxcapi_snapshot: 2891 Snapshot of directory-backed container requested. lxc_container: lxccontainer.c: lxcapi_snapshot: 2892 Making a copy-clone. If you do want snapshots, then lxc_container: lxccontainer.c: lxcapi_snapshot: 2893 please create an aufs or overlayfs clone first, snapshot that lxc_container: lxccontainer.c: lxcapi_snapshot: 2894 and keep the original container pristine. lxc_container: lxccontainer.c: lxcapi_clone: 2643 error: Original container (osestaging1) is running lxc_container: lxccontainer.c: lxcapi_snapshot: 2899 clone of /var/lib/lxc:osestaging1 failed lxc_container: lxc_snapshot.c: do_snapshot: 55 Error creating a snapshot [root@osedev1 ssh]#
- I tried to create a snapshot; it warned that it can't do deltas (it makes a full copy-clone) unless the container is backed by overlayfs or aufs (or probably also zfs, btrfs, etc). It then failed because the container was still running, so I stopped it and tried again.
[root@osedev1 ssh]# lxc-snapshot --name osestaging1 afterBootstrap lxc_container: lxccontainer.c: lxcapi_snapshot: 2891 Snapshot of directory-backed container requested. lxc_container: lxccontainer.c: lxcapi_snapshot: 2892 Making a copy-clone. If you do want snapshots, then lxc_container: lxccontainer.c: lxcapi_snapshot: 2893 please create an aufs or overlayfs clone first, snapshot that lxc_container: lxccontainer.c: lxcapi_snapshot: 2894 and keep the original container pristine. [root@osedev1 ssh]# lxc-snapshot --name osestaging1 --list snap0 (/var/lib/lxcsnaps/osestaging1) 2019:10:02 15:37:58 [root@osedev1 ssh]#
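- for future reference, rolling back to (or deleting) that snapshot should be as simple as the following, per the lxc-snapshot man page (a sketch; untested here):
lxc-stop --name osestaging1
lxc-snapshot --name osestaging1 --restore snap0   # untested; per the man page
lxc-snapshot --name osestaging1 --destroy snap0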
- so our container is ~0.5G, and so is our single snapshot (a full copy, since the container is directory-backed)
[root@osedev1 ssh]# du -sh /var/lib/lxcsnaps/* 459M /var/lib/lxcsnaps/osestaging1 [root@osedev1 ssh]# du -sh /var/lib/lxc/* 459M /var/lib/lxc/osestaging1 [root@osedev1 ssh]#
- eventually we'll need to mount the external block volume at /var/, especially before the sync from prod
[root@osedev1 ssh]# df -h Filesystem Size Used Avail Use% Mounted on /dev/sda1 19G 2.4G 16G 14% / devtmpfs 873M 0 873M 0% /dev tmpfs 896M 0 896M 0% /dev/shm tmpfs 896M 17M 879M 2% /run tmpfs 896M 0 896M 0% /sys/fs/cgroup /dev/sdb 9.8G 37M 9.3G 1% /mnt/HC_Volume_3110278 tmpfs 180M 0 180M 0% /run/user/1000 [root@osedev1 ssh]#
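- a rough sketch of that migration (device name taken from the df output above; the exact steps and filesystem type are assumptions, and it would be done while the container is stopped):
lxc-stop --name osestaging1
rsync -aAX /var/ /mnt/HC_Volume_3110278/   # copy the existing /var onto the volume
umount /mnt/HC_Volume_3110278
mount /dev/sdb /var
# plus a matching /etc/fstab entry, e.g. '/dev/sdb /var ext4 defaults 0 0'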
- as for backups, I created new API keys that have access to only the 'ose-dev-server-backups' bucket.
- because ransomware is a topic of concern (specifically ransomware that deletes your backups), I also noticed that when we create the api key, we can remove the 'deleteFiles' and 'deleteBuckets' capabilities (the cleanup of old backups is actually done by the lifecycle rules on backblaze's side, not our script's logic). Apparently there's no way to edit the capabilities of existing keys, so this would be a non-trivial change.
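- ours were created in the web UI, but for reference the CLI equivalent is roughly the following (the point is the capability list, which omits deleteFiles & deleteBuckets; check `b2 create-key --help` for the exact syntax of your version):
# syntax per the b2 docs; not tested here
b2 create-key --bucket ose-dev-server-backups ose-dev-server-backups-key listBuckets,listFiles,readFiles,writeFiles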
- I wrote the api key creds to osedev1:/root/scripts/backup.settings
- And I created a new 4K encryption key. To make it clearer, I named it 'ose-dev-backups-cron.201910.key'. I added it to the shared ose keepass db under "backups" (the attached files are under the "Advanced" tab)
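- generating such a keyfile is a one-liner with openssl (a sketch; I'm assuming here that the key is just a 4096-byte random symmetric keyfile):
openssl rand 4096 > ose-dev-backups-cron.201910.key   # assumes a 4096-byte random keyfile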
- I also installed the b2 CLI dependencies on the dev node; unfortunately I hit some issues https://wiki.opensourceecology.org/wiki/Backblaze#Install_CLI
[root@osedev1 backups]# yum install python-virtualenv ... Installed: python-virtualenv.noarch 0:15.1.0-2.el7 Dependency Installed: python-devel.x86_64 0:2.7.5-86.el7 python-rpm-macros.noarch 0:3-32.el7 python-srpm-macros.noarch 0:3-32.el7 python2-rpm-macros.noarch 0:3-32.el7 Dependency Updated: python.x86_64 0:2.7.5-86.el7 python-libs.x86_64 0:2.7.5-86.el7 Complete! [root@osedev1 backups]# yum install python-setuptools Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile * base: mirror.alpix.eu * epel: mirror.wiuwiu.de * extras: centosmirror.netcup.net * updates: mirror.alpix.eu Package python-setuptools-0.9.8-7.el7.noarch already installed and latest version Nothing to do [root@osedev1 backups]# yum install git ... Installed: git.x86_64 0:1.8.3.1-20.el7 Dependency Installed: perl-Error.noarch 1:0.17020-2.el7 perl-Git.noarch 0:1.8.3.1-20.el7 perl-TermReadKey.x86_64 0:2.30-20.el7 Complete! [root@osedev1 backups]# adduser b2user [root@osedev1 backups]# sudo su - b2user [b2user@osedev1 ~]$ mkdir virtualenv [b2user@osedev1 ~]$ cd virtualenv/ [b2user@osedev1 virtualenv]$ virtualenv . New python executable in /home/b2user/virtualenv/bin/python Installing setuptools, pip, wheel...done. [b2user@osedev1 virtualenv]$ cd .. [b2user@osedev1 ~]$ mkdir sandbox [b2user@osedev1 ~]$ cd sandbox/ [b2user@osedev1 sandbox]$ git clone https://github.com/Backblaze/B2_Command_Line_Tool.git Cloning into 'B2_Command_Line_Tool'... remote: Enumerating objects: 151, done. remote: Counting objects: 100% (151/151), done. remote: Compressing objects: 100% (93/93), done. remote: Total 7130 (delta 90), reused 102 (delta 55), pack-reused 6979 Receiving objects: 100% (7130/7130), 1.80 MiB | 3.35 MiB/s, done. Resolving deltas: 100% (5127/5127), done. [b2user@osedev1 sandbox]$ cd B2_Command_Line_Tool/ [b2user@osedev1 B2_Command_Line_Tool]$ python setup.py install setuptools 20.2 or later is required. To fix, try running: pip install "setuptools>=20.2" [b2user@osedev1 B2_Command_Line_Tool]$
- I hate using pip; it often breaks the OS and its installed apps, but I bit my tongue & proceeded (I wouldn't do this on prod)
[root@osedev1 backups]# yum install python3-setuptools Installed: python3-setuptools.noarch 0:39.2.0-10.el7 Dependency Installed: python3.x86_64 0:3.6.8-10.el7 python3-libs.x86_64 0:3.6.8-10.el7 python3-pip.noarch 0:9.0.3-5.el7 Complete! [root@osedev1 backups]# [root@osedev1 backups]# pip install "setuptools>=20.2" -bash: pip: command not found [root@osedev1 backups]# yum install python-pip ... Installed: python2-pip.noarch 0:8.1.2-10.el7 Complete! [root@osedev1 backups]# pip install "setuptools>=20.2" Collecting setuptools>=20.2 Downloading https://files.pythonhosted.org/packages/b2/86/095d2f7829badc207c893dd4ac767e871f6cd547145df797ea26baea4e2e/setuptools-41.2.0-py2.py3-none-any.whl (576kB) 100% || 583kB 832kB/s Installing collected packages: setuptools Found existing installation: setuptools 0.9.8 Uninstalling setuptools-0.9.8: Successfully uninstalled setuptools-0.9.8 Successfully installed setuptools-41.2.0 You are using pip version 8.1.2, however version 19.2.3 is available. You should consider upgrading via the 'pip install --upgrade pip' command. [root@osedev1 backups]# pip install --upgrade pip Collecting pip Downloading https://files.pythonhosted.org/packages/30/db/9e38760b32e3e7f40cce46dd5fb107b8c73840df38f0046d8e6514e675a1/pip-19.2.3-py2.py3-none-any.whl (1.4MB) 100% || 1.4MB 511kB/s Installing collected packages: pip Found existing installation: pip 8.1.2 Uninstalling pip-8.1.2: Successfully uninstalled pip-8.1.2 Successfully installed pip-19.2.3 [root@osedev1 backups]#
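- in hindsight, the less invasive fix would have been to upgrade setuptools only inside the b2user virtualenv instead of system-wide, something like:
sudo su - b2user
source ~/virtualenv/bin/activate          # use the venv's pip, not the system pip
pip install --upgrade 'setuptools>=20.2'
cd ~/sandbox/B2_Command_Line_Tool && python setup.py install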
- when it came time to install the b2 CLI itself, I had to add the '--user' flag so that setup.py would install into b2user's home directory instead of the system site-packages
[b2user@osedev1 B2_Command_Line_Tool]$ python setup.py install --user ... Installed /home/b2user/.local/lib/python2.7/site-packages/python_dateutil-2.8.0-py2.7.egg Searching for setuptools==41.2.0 Best match: setuptools 41.2.0 Adding setuptools 41.2.0 to easy-install.pth file Installing easy_install script to /home/b2user/.local/bin Installing easy_install-3.6 script to /home/b2user/.local/bin Using /usr/lib/python2.7/site-packages Finished processing dependencies for b2==1.4.1 [b2user@osedev1 B2_Command_Line_Tool]$ [b2user@osedev1 B2_Command_Line_Tool]$ ^C [b2user@osedev1 B2_Command_Line_Tool]$ ~/.local/bin/b2 version b2 command line tool, version 1.4.1 [b2user@osedev1 B2_Command_Line_Tool]$