Maltfield Log/2019 Q4
My work log from the year 2019 Quarter 4. I intentionally made this verbose to make future admins' work easier when troubleshooting. The more keywords, error messages, etc. that are listed in this log, the more helpful it will be for the future OSE Sysadmin.
Contents
- 1 See Also
- 2 Tue Dec 31, 2019
- 3 Mon Dec 30, 2019
- 4 Sun Dec 29, 2019
- 5 Sat Dec 28, 2019
- 6 Fri Dec 27, 2019
- 7 Tue Dec 24, 2019
- 8 Wed Dec 18, 2019
- 9 Tue Dec 17, 2019
- 10 Mon Dec 16, 2019
- 11 Fri Dec 13, 2019
- 12 Thr Dec 12, 2019
- 13 Thr Dec 05, 2019
- 14 Wed Dec 04, 2019
- 15 Tue Dec 03, 2019
- 16 Mon Dec 02, 2019
- 17 Tue Nov 26, 2019
- 18 Mon Nov 25, 2019
- 19 Mon Nov 18, 2019
- 20 Sun Nov 17, 2019
- 21 Tue Nov 12, 2019
- 22 Mon Nov 11, 2019
- 23 Sun Nov 10, 2019
- 24 Sat Nov 09, 2019
- 25 Fri Nov 08, 2019
- 26 Thr Nov 07, 2019
- 27 Mon Oct 28, 2019
- 28 Fri Oct 25, 2019
- 29 Thr Oct 24, 2019
- 30 Wed Oct 23, 2019
- 31 Mon Oct 21, 2019
- 32 Tue Oct 08, 2019
- 33 Mon Oct 07, 2019
- 34 Sat Oct 05, 2019
- 35 Fri Oct 04, 2019
- 36 Thr Oct 03, 2019
- 37 Wed Oct 02, 2019
See Also
Tue Dec 31, 2019
- I created a backup of our current year's awstats docroots before awstats overwrites them
[root@opensourceecology awstats.opensourceecology.org]# cp -r htdocs htdocs.20191231
[root@opensourceecology awstats.opensourceecology.org]# cd ../awstats.openbuildinginstitute.org/
[root@opensourceecology awstats.openbuildinginstitute.org]# cp -r htdocs htdocs.20191231
[root@opensourceecology awstats.openbuildinginstitute.org]# ls
htdocs  htdocs.20191231
[root@opensourceecology awstats.openbuildinginstitute.org]# du -sh *
3.0M	htdocs
3.0M	htdocs.20191231
[root@opensourceecology awstats.openbuildinginstitute.org]# date
Tue Dec 31 09:46:59 UTC 2019
[root@opensourceecology awstats.openbuildinginstitute.org]# pwd
/var/www/html/awstats.openbuildinginstitute.org
[root@opensourceecology awstats.openbuildinginstitute.org]#
- this is a consequence of the stupid way I set up awstats' cron job
[root@opensourceecology awstats.openbuildinginstitute.org]# cat /etc/cron.d/awstats_generate_static_files
06 * * * * root /bin/nice /usr/share/awstats/tools/awstats_updateall.pl -configdir=/etc/awstats/ now
16 * * * * root /bin/nice /usr/share/awstats/tools/awstats_buildstaticpages.pl -config=www.openbuildinginstitute.org -dir=/var/www/html/awstats.openbuildinginstitute.org/htdocs/
17 * * * * root /bin/nice /usr/share/awstats/tools/awstats_buildstaticpages.pl -config=seedhome.openbuildinginstitute.org -dir=/var/www/html/awstats.openbuildinginstitute.org/htdocs/
18 * * * * root /bin/nice /usr/share/awstats/tools/awstats_buildstaticpages.pl -config=fef.opensourceecology.org -dir=/var/www/html/awstats.opensourceecology.org/htdocs/
19 * * * * root /bin/nice /usr/share/awstats/tools/awstats_buildstaticpages.pl -config=www.opensourceecology.org -dir=/var/www/html/awstats.opensourceecology.org/htdocs/
20 * * * * root /bin/nice /usr/share/awstats/tools/awstats_buildstaticpages.pl -config=wiki.opensourceecology.org -dir=/var/www/html/awstats.opensourceecology.org/htdocs/
21 * * * * root /bin/nice /usr/share/awstats/tools/awstats_buildstaticpages.pl -config=microfactory.opensourceecology.org -dir=/var/www/html/awstats.opensourceecology.org/htdocs/
21 * * * * root /bin/nice /usr/share/awstats/tools/awstats_buildstaticpages.pl -config=store.opensourceecology.org -dir=/var/www/html/awstats.opensourceecology.org/htdocs/
[root@opensourceecology awstats.openbuildinginstitute.org]#
- I added to my TODO list to update awstats' cron job to write into a <year> dir under the output dir; that'll fix this issue (sketch below)
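- Something like this is what I have in mind; a hypothetical, untested sketch of the future /etc/cron.d/awstats_generate_static_files (one buildstaticpages line per config; note that '%' must be escaped in crontabs, and the <year> dir has to exist before the jobs run):
# hypothetical sketch; not yet deployed
# create this year's output dir before the buildstaticpages jobs run
15 * * * * root /bin/mkdir -p /var/www/html/awstats.opensourceecology.org/htdocs/$(date +\%Y)
# write the static pages into the per-year dir so Jan 01 doesn't overwrite last year's pages
19 * * * * root /bin/nice /usr/share/awstats/tools/awstats_buildstaticpages.pl -config=www.opensourceecology.org -dir=/var/www/html/awstats.opensourceecology.org/htdocs/$(date +\%Y)/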
- ...
- I also sent-off an email to the Auroville Earth Institute asking when would be a good time to meet them in mid-January
Hello,

I will be visiting Auroville for some time in early January, and I would love to meet with Auroville Earth Institute to learn your techniques for CEBs and CSEBs.

My name is Michael. I am working with Open Source Ecology. We are currently designing open source blueprints for civilization. We also work with Open Building Institute.

 * https://www.opensourceecology.org
 * https://www.openbuildinginstitute.org

One of the machines that we develop is an Earth Brick Press for making CEBs.

 * https://wiki.opensourceecology.org/wiki/CEB_Press

We are also currently working on a prototype modification to our CEB press that will include a fully automated hammermill soil crusher & cement/water mixer before loading the mix into our compression chamber.

 * https://wiki.opensourceecology.org/wiki/Soil_Mixer_2019
 * https://wiki.opensourceecology.org/wiki/Soil_Preparation_for_CEB

I would love to meet with you while I'm in Auroville. When would be a good time to visit in mid-January?

Thank you,

Michael Altfield
Senior System Administrator
PGP Fingerprint: 8A4B 0AF8 162F 3B6A 79B7 70D2 AA3E DF71 60E2 D97B

Open Source Ecology
www.opensourceecology.org
Mon Dec 30, 2019
- The Discourse team responded to me about their silly huge cookies. Looks like they store client data there. One of their staff suggested that it may be better to move this to redis instead https://meta.discourse.org/t/discourse-session-cookies-400-request-header-or-cookie-too-large/137245
- ...
- Marcin just sent out a phplist email, but hit a ton of issues with mod_security. Let's whitelist those
- first, let's get a list of the mod_security errors specific to our phplist site. Here are the most recent ones
[root@opensourceecology log]# grep 'ModSecurity' httpd/phplist.opensourceecology.org/error_log | tail -n3
[Mon Dec 30 06:41:13.586546 2019] [:error] [pid 2678] [client 127.0.0.1] ModSecurity: Access denied with code 403 (phase 2). String match "bytes=0-" at REQUEST_HEADERS:Range. [file "/etc/httpd/modsecurity.d/activated_rules/modsecurity_crs_20_protocol_violations.conf"] [line "428"] [id "958291"] [rev "2"] [msg "Range: field exists and begins with 0."] [data "bytes=0-524287"] [severity "WARNING"] [ver "OWASP_CRS/2.2.9"] [maturity "6"] [accuracy "8"] [tag "OWASP_CRS/PROTOCOL_VIOLATION/INVALID_HREQ"] [hostname "phplist.opensourceecology.org"] [uri "/lists/"] [unique_id "XgmcCfCVv--9NfCMmWfB@QAAAAQ"]
[Mon Dec 30 06:41:41.349145 2019] [:error] [pid 9847] [client 127.0.0.1] ModSecurity: Access denied with code 403 (phase 2). Operator EQ matched 0 at REQUEST_HEADERS. [file "/etc/httpd/modsecurity.d/activated_rules/modsecurity_crs_21_protocol_anomalies.conf"] [line "47"] [id "960015"] [rev "1"] [msg "Request Missing an Accept Header"] [severity "NOTICE"] [ver "OWASP_CRS/2.2.9"] [maturity "9"] [accuracy "9"] [tag "OWASP_CRS/PROTOCOL_VIOLATION/MISSING_HEADER_ACCEPT"] [tag "WASCTC/WASC-21"] [tag "OWASP_TOP_10/A7"] [tag "PCI/6.5.10"] [hostname "phplist.opensourceecology.org"] [uri "/lists/ut.php"] [unique_id "XgmcJdQv-PvExNQwXttQ2AAAAAo"]
[Mon Dec 30 06:49:30.756792 2019] [:error] [pid 9844] [client 127.0.0.1] ModSecurity: Access denied with code 403 (phase 2). Operator EQ matched 0 at REQUEST_HEADERS. [file "/etc/httpd/modsecurity.d/activated_rules/modsecurity_crs_21_protocol_anomalies.conf"] [line "47"] [id "960015"] [rev "1"] [msg "Request Missing an Accept Header"] [severity "NOTICE"] [ver "OWASP_CRS/2.2.9"] [maturity "9"] [accuracy "9"] [tag "OWASP_CRS/PROTOCOL_VIOLATION/MISSING_HEADER_ACCEPT"] [tag "WASCTC/WASC-21"] [tag "OWASP_TOP_10/A7"] [tag "PCI/6.5.10"] [hostname "phplist.opensourceecology.org"] [uri "/lists/ut.php"] [unique_id "Xgmd@jjXTNvlhqNnJNUG7gAAAAY"]
[root@opensourceecology log]#
- Here's a quick command to see all the mod_security rules that were triggered, by rule id, counted by the number of their occurrences and sorted from most common to least common
[root@opensourceecology log]# grep 'ModSecurity' httpd/phplist.opensourceecology.org/error_log | sed 's/^.*\[id "\([^"]*\).*$/\1/' | sort -n | uniq -c | sort -rn
    484 960015
      9 981242
      9 958008
      7 959071
      7 950109
      5 960020
      4 959072
      1 958291
[root@opensourceecology log]#
- so by far the most common is 484 occurrences of rule #960015 = "Request Missing an Accept Header"
- the next most common is a tie at 9 occurrences each for rule #981242 = "Detects classic SQL injection probings 1/2"
- And rule #958008 = "Cross-site Scripting (XSS) Attack"
- While I was tailing the log, I saw one of those 960015 errors (by far the most commonly occurring one) pop into the logs. It was for this /lists/ut.php page. I loaded it and, yeah, it's a tracking pixel. That makes sense as to why the clients are missing Accept headers; it's probably some email client thing https://phplist.opensourceecology.org/lists/ut.php
[Mon Dec 30 10:52:56.122828 2019] [:error] [pid 28739] [client 127.0.0.1] ModSecurity: Access denied with code 403 (phase 2). Operator EQ matched 0 at REQUEST_HEADERS. [file "/etc/httpd/modsecurity.d/activated_rules/modsecurity_crs_21_protocol_anomalies.conf"] [line "47"] [id "960015"] [rev "1"] [msg "Request Missing an Accept Header"] [severity "NOTICE"] [ver "OWASP_CRS/2.2.9"] [maturity "9"] [accuracy "9"] [tag "OWASP_CRS/PROTOCOL_VIOLATION/MISSING_HEADER_ACCEPT"] [tag "WASCTC/WASC-21"] [tag "OWASP_TOP_10/A7"] [tag "PCI/6.5.10"] [hostname "phplist.opensourceecology.org"] [uri "/lists/ut.php"] [unique_id "XgnXCDikFSEDO3Zy3gVuUAAAAAY"]
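- If that pixel noise ever gets too loud, we could scope an exception to just that URI instead of disabling rule 960015 site-wide. An untested sketch for the phplist vhost config (not something I've deployed):
# untested sketch: stop flagging the tracking pixel for missing Accept headers
<LocationMatch "^/lists/ut\.php">
	SecRuleRemoveById 960015
</LocationMatch>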
- I tested one of the images that Marcin said caused a 403, but it was added & sent without issues
- I tested a string he said failed, and it indeed failed for me too
use 1.75 and 3 mm filament interchangeably,
- Adding that to the body of the email and pressing "Next" produced a 403 and wrote this to the logs
[Mon Dec 30 10:10:21.953649 2019] [:error] [pid 19509] [client 127.0.0.1] ModSecurity: Access denied with code 403 (phase 2). Pattern match "(?i)\\\\b(?i:and)\\\\b\\\\s+(\\\\d{1,10}|'[^=]{1,10}')\\\\s*?[=]|\\\\b(?i:and)\\\\b\\\\s+(\\\\d{1,10}|'[^=]{1,10}')\\\\s*?[<>]|\\\\band\\\\b ?(?:\\\\d{1,10}|[\\\\'\\"][^=]{1,10}[\\\\'\\"]) ?[=<>]+|\\\\b(?i:and)\\\\b\\\\s+(\\\\d{1,10}|'[^=]{1,10}')" at ARGS:message. [file "/etc/httpd/modsecurity.d/activated_rules/modsecurity_crs_41_sql_injection_attacks.conf"] [line "136"] [id "959072"] [rev "2"] [msg "SQL Injection Attack"] [data "Matched Data: and 3 found within ARGS:message: <p><img alt=\\x22\\x22 src=\\x22https://wiki.opensourceecology.org/images/3/31/Vid.png\\x22 /></p>\\x0d\\x0a\\x0d\\x0a<p>use 1.75 and 3 mm filament interchangeably,</p>\\x0d\\x0a\\x0d\\x0a<p> </p>\\x0d\\x0a"] [severity "CRITICAL"] [ver "OWASP_CRS/2.2.9"] [maturity "9"] [accuracy "8"] [tag "OWASP_CRS/WEB_ATTACK/SQL_INJECTION"] [tag "WASCTC/WASC-19"] [tag "OWASP_TOP_10/A1"] [tag "OWASP_AppSensor/CIE1"] [tag "PCI/6.5.2"] [hostname "phplist.opensourceecology.org"] [uri "/lists/admin/"] [unique_id "XgnNDTENrNCvqKpO5DuR9AAAAAQ"]
- I whitelisted that rule, here's the diff that OSSEC emailed me immediately
Integrity checksum changed for: '/etc/httpd/conf.d/00-phplist.opensourceecology.org.conf'
Size changed from '2984' to '2991'
What changed:
76c76
< SecRuleRemoveById 970901 950001 950120 950901 981173 981317 973300 960024 950911 981231 981248 981245 973338 973304 973306 973333 973344 981257 981240 981246 981243 973336 958057 958006 958049 958051 958056 958011 958030 958039 959073 959151 973301 973302 973308 973314 973331 973315 973330 973327 973322 973348 973321 973335 973334 973332 973347 973316 200004 981172 960915 200003
---
> SecRuleRemoveById 970901 950001 950120 950901 981173 981317 973300 960024 950911 981231 981248 981245 973338 973304 973306 973333 973344 981257 981240 981246 981243 973336 958057 958006 958049 958051 958056 958011 958030 958039 959072 959073 959151 973301 973302 973308 973314 973331 973315 973330 973327 973322 973348 973321 973335 973334 973332 973347 973316 200004 981172 960915 200003
Old md5sum was: 'e7de88aeca71933be49f1e640cc45a78'
New md5sum is : 'cb2dab2dbf47f98196bfea264f6b5a32'
Old sha1sum was: 'fc2a186b0751e0c51404856e345b4473da05504b'
New sha1sum is : '59e6bb98c3c40920b2052af4770815216d1a3d9d'
- All of the rest of the images he said he had issues with worked for me. I asked him to send me his workflow
- In any case, there's only one other ModSecurity rule that fired in the most recent logs, excluding the one from the pixel (MISSING_HEADER_ACCEPT)
[root@opensourceecology log]# grep 'ModSecurity' httpd/phplist.opensourceecology.org/error_log | sed 's/^.*\[id "\([^"]*\).*$/\1/' | grep -vi 960015
960020
959072
959072
[root@opensourceecology log]#
- that id = 960020 appears to just be a NOTICE that the request included the "Pragma" header but not the "Cache-Control" header. I don't think that would be an issue.
--4dcba435-A--
[30/Dec/2019:10:09:30 +0000] XgnM2h1TkKX56GufLCph1QAAAAA 127.0.0.1 56908 127.0.0.1 8000
--4dcba435-B--
GET /category/steam-engine-construction-set/ HTTP/1.1
X-Real-IP: 157.55.39.193
X-Forwarded-Proto: https
X-Forwarded-Port: 443
Host: www.opensourceecology.org
Pragma: no-cache
Accept: */*
User-Agent: Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
X-Forwarded-For: 157.55.39.193, 127.0.0.1, 127.0.0.1
Accept-Encoding: gzip
hash: #www.opensourceecology.org
X-Varnish: 77278507

--4dcba435-F--
HTTP/1.1 403 Forbidden
Content-Length: 241
Content-Type: text/html; charset=iso-8859-1

--4dcba435-E--

--4dcba435-H--
Message: Access denied with code 403 (phase 2). String match "HTTP/1.1" at REQUEST_PROTOCOL. [file "/etc/httpd/modsecurity.d/activated_rules/modsecurity_crs_20_protocol_violations.conf"] [line "399"] [id "960020"] [rev "2"] [msg "Pragma Header requires Cache-Control Header for HTTP/1.1 requests."] [severity "NOTICE"] [ver "OWASP_CRS/2.2.9"] [maturity "6"] [accuracy "8"] [tag "OWASP_CRS/PROTOCOL_VIOLATION/INVALID_HREQ"]
Action: Intercepted (phase 2)
Stopwatch: 1577700570285846 8089 (- - -)
Stopwatch2: 1577700570285846 8089; combined=320, p1=282, p2=24, p3=0, p4=0, p5=14, sr=62, sw=0, l=0, gc=0
Response-Body-Transformed: Dechunked
Producer: ModSecurity for Apache/2.7.3 (http://www.modsecurity.org/); OWASP_CRS/2.2.9.
Server: Apache
Engine-Mode: "ENABLED"

--4dcba435-Z--
- oh, well, that's from www.opensourceecology.org. Here's the most recent one from phplist.opensourceecology.org
--7e84b452-A--
[30/Dec/2019:09:17:01 +0000] XgnAjVKRjngP0Kz-535QIwAAAAs 127.0.0.1 44750 127.0.0.1 8000
--7e84b452-B--
GET /lists/ut.php?m=46&u=9e7c1f1611704d32a5ae958fb67c21fb HTTP/1.1
X-Real-IP: 46.135.15.135
X-Forwarded-Proto: https
X-Forwarded-Port: 443
Host: phplist.opensourceecology.org
Pragma: no-cache
User-Agent: Mozilla/5.0 (Linux; Android 8.0.0; SM-N950F Build/R16NW; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/66.0.3359.126 Mobile Safari/537.36
Accept: image/webp,image/apng,image/*,*/*;q=0.8
Accept-Language: en-GB,en-US;q=0.9
X-Requested-With: me.bluemail.mail
X-Forwarded-For: 46.135.15.135, 127.0.0.1, 127.0.0.1
Accept-Encoding: gzip
hash: #phplist.opensourceecology.org
X-Varnish: 76200967

--7e84b452-F--
HTTP/1.1 403 Forbidden
Content-Length: 214
Content-Type: text/html; charset=iso-8859-1

--7e84b452-E--

--7e84b452-H--
Message: Access denied with code 403 (phase 2). String match "HTTP/1.1" at REQUEST_PROTOCOL. [file "/etc/httpd/modsecurity.d/activated_rules/modsecurity_crs_20_protocol_violations.conf"] [line "399"] [id "960020"] [rev "2"] [msg "Pragma Header requires Cache-Control Header for HTTP/1.1 requests."] [severity "NOTICE"] [ver "OWASP_CRS/2.2.9"] [maturity "6"] [accuracy "8"] [tag "OWASP_CRS/PROTOCOL_VIOLATION/INVALID_HREQ"]
Action: Intercepted (phase 2)
Apache-Handler: php5-script
Stopwatch: 1577697421335612 8009 (- - -)
Stopwatch2: 1577697421335612 8009; combined=316, p1=275, p2=22, p3=0, p4=0, p5=19, sr=62, sw=0, l=0, gc=0
Response-Body-Transformed: Dechunked
Producer: ModSecurity for Apache/2.7.3 (http://www.modsecurity.org/); OWASP_CRS/2.2.9.
Server: Apache
Engine-Mode: "ENABLED"

--7e84b452-Z--
- whatever, that's someone on Safari, I guess, violating some standard while viewing our email. Anyway, it shouldn't be an issue.
- Oh, but there's already a rotated log for today; that one has more
[root@opensourceecology log]# cat httpd/phplist.opensourceecology.org/error_log-20191230 | grep 'ModSecurity' | sed 's/^.*\[id "\([^"]*\).*$/\1/' | grep -vi 960015 | grep -vi 960020 | uniq -c | sort -rn
      9 981242
      9 958008
      7 959071
      5 950109
      2 959072
      2 959072
      2 958008
      2 950109
      1 958291
[root@opensourceecology log]#
- ...
- I just got an email from Marcin saying he's resizing the images to 500 on the X axis; I tried doing that too, and it worked. But I checked the logs again in case he recently put something there
[root@opensourceecology log]# cat httpd/phplist.opensourceecology.org/error_log | grep 'ModSecurity' | sed 's/^.*\[id "\([^"]*\).*$/\1/' | uniq
960015
960020
960015
959072
960015
[root@opensourceecology log]#
- None of those work. I did some more poking around, and I got it to trigger a 403 when I clicked the "Image Button" button (in the second row) instead of the "Image" button (in the second-to-last row). Please don't ask me what the difference is *shrug*. The logs say this
[Mon Dec 30 11:23:06.013512 2019] [:error] [pid 27948] [client 127.0.0.1] ModSecurity: Access denied with code 403 (phase 2). Pattern match "<input\\\\b.*?\\\\btype\\\\b\\\\W*?\\\\bimage\\\\b" at ARGS:message. [file "/etc/httpd/modsecurity.d/activated_rules/modsecurity_crs_41_xss_attacks.conf"] [line "229"] [id "958008"] [rev "2"] [msg "Cross-site Scripting (XSS) Attack"] [data "Matched Data: <input alt=\\x22\\x22 src=\\x22https://wiki.opensourceecology.org/images/b/bf/soimixerhyd.jpg\\x22 style=\\x22width: 1244px; height: 820px;\\x22 type=\\x22image found within ARGS:message: <p><input alt=\\x22\\x22 src=\\x22https://wiki.opensourceecology.org/images/b/bf/soimixerhyd.jpg\\x22 style=\\x22width: 1244px; height: 820px;\\x22 type=\\x22image\\x22 /><iframe frameborder=\\x220\\x22 scrolling=\\x22no\\x22 src=\\x22https://wiki.opensourceecology.org/images/b/bf/soimixerhyd.jpg\\x22 width=\\x22500\\x22></ifram..."] [severity "CRITICAL"] [ver "OWASP_CRS/2.2.9"] [maturity "8"] [accuracy "8"] [tag "OWASP_CRS/WEB_ATTACK/XSS"] [tag "WASCTC/WASC-8"] [tag "WASCTC/WASC-22"] [tag "OWASP_TOP_10/A2"] [tag "OWASP_AppSensor/IE1"] [tag "PCI/6.5.1"] [hostname "phplist.opensourceecology.org"] [uri "/lists/admin/"] [unique_id "XgneGWMB55Go@mg1L473bwAAAAA"]
- So I whitelisted "958008" too
- Marcin said it now works for him over email.
- ...
- I spent some time getting screenshots for the 2019 year from awstats and putting them on the wiki like we have for Munin.
- I also drafted a summary of the site's stats in 2019 and sent it to Marcin & Catarina
Hey All,

Before the year closes, I gathered some stats about our sites and stored them to the wiki.

 * https://wiki.opensourceecology.org/wiki/Category:Awstats_Graphs

A snapshot of our most popular sites:

wiki.opensourceecology.org received 10,695,729 hits YTD in 2019 from 452,260 unique visitors. The most popular month was in August, where we got 1,164,161 hits from 48,918 unique visitors. The least popular month was February, where we got 828,342 hits from 30,326 unique visitors. Here's our top pages on the wiki:

 /wiki/Main_Page
 /wiki/Cost_of_Living
 /wiki/OSE_Machine_Design_Guide
 /wiki/Special:RequestAccount
 /wiki/Global_Village_Construction_Set
 /wiki/Civilization_Starter_Kit_DVD_v0.01
 /wiki/Aquaponics

 * https://awstats.opensourceecology.org:4443/awstats.wiki.opensourceecology.org.html

www.opensourceecology.org received 9,413,080 hits YTD in 2019 from 360,643 unique visitors. The most popular month was also August, and the least popular December (so far). Here's the top pages on osemain:

 /gvcs/
 /gvcs/gvcs-machine-index/
 /about-videos-3/
 /ceb-microhouse-build-in-belize/
 /marcin-jakubowski/
 /web-developers-for-better-true-fans-campaign/
 /portfolio/microhouse/

 * https://awstats.opensourceecology.org:4443/awstats.www.opensourceecology.org.html

www.openbuildinginstitute.org received 3,949,352 hits YTD in 2019 from 84,220 unique visitors. The most popular month was in May, where we got 357,333 hits from 8,668 unique visitors. The least popular month is currently December, where we've so far received 238,738 hits from 3,795 unique visitors. Here's our top pages for OBI:

 /use/
 /buildings/
 /how-it-works/
 /about-what-we-do/
 /structures/
 /library-modules/
 /portfolio/studio-12x16/

 * https://awstats.openbuildinginstitute.org:4443/awstats.www.openbuildinginstitute.org.html

Think we can double these numbers in 2020? Happy new years :)

Cheers,

Michael Altfield
Senior System Administrator
PGP Fingerprint: 8A4B 0AF8 162F 3B6A 79B7 70D2 AA3E DF71 60E2 D97B

Open Source Ecology
www.opensourceecology.org
- I wanted to get uptime stats from statuscake, but it looks like all our tests were paused. Strange, because we're still getting 7-day stats on our status page though *shrug* http://status.opensourceecology.org/
Sun Dec 29, 2019
- On my last call with Marcin, he again asked what bottlenecks we would hit on our server if we suddenly had thousands of OSE developers contributing in September 2020. I told him again that disk was my biggest concern
- Considering adding Docker to our server, I began to wonder how much that would increase our disk usage. Indeed, a common issue with implementing Docker in prod is the disk filling up with old Docker images that pile up unless some sort of cleanup cron job is implemented. Adding such a cron job is on my TODO list (sketch below), but it's still going to add a reasonably large tax on our disks just by utilizing Discourse & Docker.
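- As a rough idea, the cleanup could be a weekly prune of unused images; this is a hypothetical, untested sketch (the 168h = 7 day cutoff is arbitrary), not something deployed yet:
# hypothetical /etc/cron.d/docker_image_cleanup (untested sketch)
# every Sunday at 04:00, delete unused docker images older than 7 days
0 4 * * 0 root /bin/nice /usr/bin/docker image prune --all --force --filter "until=168h"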
- I checked our disks. Currently prod is using 77G of 197G usable space = 41%. I think 70% usage is when we'd want to begin migrating to a server with bigger disks. That migration would be a huge effort and monetarily costly.
[maltfield@opensourceecology ~]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/md2        197G   77G  111G  41% /
devtmpfs         32G     0   32G   0% /dev
tmpfs            32G  8.0K   32G   1% /dev/shm
tmpfs            32G  3.2G   29G  11% /run
tmpfs            32G     0   32G   0% /sys/fs/cgroup
/dev/md1        488M  289M  174M  63% /boot
tmpfs           6.3G     0  6.3G   0% /run/user/0
tmpfs           6.3G     0  6.3G   0% /run/user/1005
[maltfield@opensourceecology ~]$
- In any case, I decided to message Hetzner to see what our options were
- first of all, when logging into our Hetzner admin page, I saw they have an offer for a "Dedicated Root Server AX41-NVMe" server at the same price as what we're currently paying (39 EUR/mo) with twice the size of RAID1'd disks (2x 512G NVMe SSD vs 2x 250G SSD). RAM would be the same. Processor would probably be faster. So merely "upgrading" our monolithic server at no additional monthly cost (plus 39 EUR one-time setup fee) is one option.
- I noticed that we have a tab on our Hetzner WUI admin page called "storage box". Motherfucker, it says "inclusive". Looks like it comes with our Dedicated Server plan & is SSD with 100G storage.
- wow, they got back to me pretty fast. Looks like our server can have two additional drives from 0.5T - 12T added. They have a ton of options from NVMe SSDs to fat & cheap spinning disks. The easy fix would be to just add two 12T drives RAID 1'd for 26.5 EUR. The better solution would be to migrate to another server of better specs and the same price, perhaps setting that one up with a software RAID5 with all 4x disks included. https://wiki.hetzner.de/index.php/Root_Server_Hardware/en#Drives
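- For reference, assembling 4 such disks into a single software RAID 5 array on a fresh server might look something like this (hypothetical device/partition names; an mdadm sketch, not something we've run):
# hypothetical sketch: combine 4 equal-sized partitions into one RAID 5 array
mdadm --create /dev/md3 --level=5 --raid-devices=4 /dev/nvme0n1p3 /dev/nvme1n1p3 /dev/sda3 /dev/sdb3
mkfs.ext4 /dev/md3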
- It looks like the 100G storage box isn't going to be suitable as an NFS mount; I think it's meant for backups https://wiki.hetzner.de/index.php/Storage_Boxes/en
- But even then, we can move our encrypted backups off to the Storage box, and that would save us 18G of space
[root@opensourceecology ~]# du -sh /home/b2user/sync*
18G	/home/b2user/sync
18G	/home/b2user/sync.old
[root@opensourceecology ~]#
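- Hetzner's storage boxes speak SSH/rsync on port 23, so offloading those dirs might look like this (uXXXXX is a placeholder for our storage box account; an untested sketch, not a deployed job):
# untested sketch: push the local backup staging dirs to the included storage box
rsync -av -e 'ssh -p 23' /home/b2user/sync/ uXXXXX@uXXXXX.your-storagebox.de:sync/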
- I also updated our wiki's OSE_Server page with info on this disk bottleneck and dedicated server solutions https://wiki.opensourceecology.org/wiki/OSE_Server#Dedicated_Servers
Sat Dec 28, 2019
- let's attempt to update Discourse
- first let's get the current versions
- It says our installed Discourse version is "2.4.0.beta8 (8106d94c05)" https://discourse.opensourceecology.org/admin
- The upgrade page itself listed the version of docker_manager (I guess that's a plugin) as "2c89085" https://discourse.opensourceecology.org/admin/upgrade
- Now let's attempt an upgrade
- first we make a backup
[root@osestaging1 ~]# export vhostDir="/var/discourse/"
[root@osestaging1 ~]#
[root@osestaging1 ~]# # verify
[root@osestaging1 ~]# echo "${vhostDir}"
/var/discourse/
[root@osestaging1 ~]# ls -lah "${vhostDir}"
total 1.1M
drwxr-xr-x. 12 root root 4.0K Dec 17 11:36 .
drwxr-xr-x. 23 root root 4.0K Oct 28 12:07 ..
drwxr-xr-x.  2 root root 4.0K Oct 28 12:07 bin
drwxr-xr-x.  2 root root 4.0K Dec 17 11:29 cids
drwxr-xr-x.  2 root root 4.0K Dec 17 12:18 containers
-rwxr-xr-x.  1 root root  12K Oct 28 12:07 discourse-doctor
-rwxr-xr-x.  1 root root  21K Dec 16 12:20 discourse-setup
-rw-r--r--.  1 root root 2.4K Nov  7 08:59 docker-ce.repo
-rw-r--r--.  1 root root 1.6K Nov  7 09:05 docker.gpg
-rw-r--r--.  1 root root  13K Oct  7 23:35 get-docker.sh
drwxr-xr-x.  8 root root 4.0K Dec 17 07:38 .git
-rw-r--r--.  1 root root  309 Oct 28 12:07 .gitignore
drwxr-xr-x.  8 root root 4.0K Nov 18 08:17 image
-rw-r--r--.  1 root root  13K Oct  7 23:35 index.html
-rw-r--r--.  1 root root  76K Oct 28 13:55 install.sh
-rwxr-xr-x.  1 root root  23K Dec 17 11:36 launcher
-rwxr-xr-x.  1 root root  23K Nov 18 12:13 launcher.20191118_122249
-rwxr-xr-x.  1 root root  23K Nov 18 12:03 launcher.20191118.orig
-rwxr-xr-x.  1 root root  30K Nov 26 16:02 launcher.20191217
-rwxr-xr-x.  1 root root  23K Dec 17 07:38 launcher.20191217_074503
-rwxr-xr-x.  1 root root  23K Dec 17 07:45 launcher.20191217_104906
-rwxr-xr-x.  1 root root  23K Nov 18 12:02 launcher.new
-rwxr-xr-x.  1 root root  24K Nov 26 10:25 launcher.old
drwxr-xr-x.  5 root root 4.0K Nov 12 11:11 libbrotli
-rw-r--r--.  1 root root 1.1K Oct 28 12:07 LICENSE
-rw-r--r--.  1 root root 664K Nov 18 11:43 output.log
-rw-r--r--.  1 root root 8.7K Oct 28 12:07 README.md
drwxr-xr-x.  2 root root 4.0K Dec 17 11:18 samples
drwxr-xr-x.  2 root root 4.0K Oct 28 12:07 scripts
drwxr-xr-x.  3 root root 4.0K Nov  7 11:27 shared
drwxr-xr-x.  3 root root 4.0K Dec 17 10:48 templates
-rw-r--r--.  1 root root 1.3K Oct 28 12:07 Vagrantfile
[root@osestaging1 ~]#
stamp=`date +%Y%m%d_%T`
tmpDir="/var/tmp/discourseUpgrade.${stamp}"
mkdir "${tmpDir}"
chown root:root "${tmpDir}"
chmod 0700 "${tmpDir}"
pushd "${tmpDir}"

# discourse backup (db & uploaded files only)
nice rm -rf /var/discourse/shared/standalone/backups/default/*.tar.gz
time nice docker exec discourse_ose discourse backup
nice mv /var/discourse/shared/standalone/backups/default/*.tar.gz ${tmpDir}/

# files backup (all discourse files)
time nice tar --exclude "${vhostDir}/shared/standalone/postgres_data" --exclude "${vhostDir}/shared/standalone/postgres_data/uploads" --exclude "${vhostDir}/shared/standalone/backups" -czf ${tmpDir}/discourse_files.${stamp}.tar.gz /var/discourse/*
[root@osestaging1 discourseUpgrade.20191228_14:00:30]# date
Sat Dec 28 14:09:31 UTC 2019
[root@osestaging1 discourseUpgrade.20191228_14:00:30]# pwd
/var/tmp/discourseUpgrade.20191228_14:00:30
[root@osestaging1 discourseUpgrade.20191228_14:00:30]# ls
discourse-2019-12-28-140114-v20191211170000.tar.gz  discourse_files.20191228_14:00:30.tar.gz
[root@osestaging1 discourseUpgrade.20191228_14:00:30]# du -sh *
52M	discourse-2019-12-28-140114-v20191211170000.tar.gz
78M	discourse_files.20191228_14:00:30.tar.gz
[root@osestaging1 discourseUpgrade.20191228_14:00:30]#
- now we `git pull` the changes to /var/discourse on the docker host (osestaging1). Ok, that failed because of our changes to "launcher"
[root@osestaging1 ~]# pushd "${vhostDir}"
/var/discourse ~
[root@osestaging1 discourse]# git pull
remote: Enumerating objects: 3, done.
remote: Counting objects: 100% (3/3), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 1), reused 2 (delta 1), pack-reused 0
Unpacking objects: 100% (3/3), done.
From https://github.com/discourse/discourse_docker
   026a664..1b3dd3a  master     -> origin/master
Updating 026a664..1b3dd3a
error: Your local changes to the following files would be overwritten by merge:
	launcher
Please, commit your changes or stash them before you can merge.
Aborting
[root@osestaging1 discourse]#
- here's a list of all the changes I've made; namely, the three modified files at the top should be reverted before the git pull and then re-applied after it
[root@osestaging1 discourse]# git status
# On branch master
# Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
#   (use "git pull" to update your local branch)
#
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#	modified:   image/base/Dockerfile
#	modified:   image/base/install-nginx
#	modified:   launcher
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#	docker-ce.repo
#	docker.gpg
#	get-docker.sh
#	image/base/install-nginx.20191112
#	image/base/install-nginx.20191112.orig
#	image/base/install-nginx.20191125_122229.orig
#	image/base/install-nginx.20191125_123516.orig
#	image/base/runit-1.d-01-iptables
#	index.html
#	install.sh
#	launcher.20191118.orig
#	launcher.20191118_122249
#	launcher.20191217
#	launcher.20191217_074503
#	launcher.20191217_104906
#	launcher.new
#	launcher.old
#	libbrotli/
#	output.log
#	templates/iptables.template.yml
#	templates/web.modsecurity.template.yml
no changes added to commit (use "git add" and/or "git commit -a")
[root@osestaging1 discourse]#
- the changes to Dockerfile were not needed; I'm updating the documentation to move the 'launcher' and 'install-nginx' scripts out of the way
[root@osestaging1 discourse]# mv "${vhostDir}/launcher" "${vhostDir}/launcher.`date "+%Y%m%d_%H%M%S"`"
[root@osestaging1 discourse]# mv "${vhostDir}/image/base/install-nginx" "${vhostDir}/image/base/install-nginx.`date "+%Y%m%d_%H%M%S"`"
[root@osestaging1 discourse]#
[root@osestaging1 discourse]# pwd
/var/discourse
[root@osestaging1 discourse]# git pull
Updating 026a664..1b3dd3a
Fast-forward
 launcher | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[root@osestaging1 discourse]#
- then I re-updated the install-nginx script to add mod_security support
[root@osestaging1 discourse]# pushd "${vhostDir}/image/base"
/var/discourse/image/base /var/discourse ~
[root@osestaging1 base]# cp install-nginx install-nginx.`date "+%Y%m%d_%H%M%S"`.orig
[root@osestaging1 base]#
[root@osestaging1 base]# # add a block to checkout the the modsecurity nginx module just before downloading the nginx source
[root@osestaging1 base]# grep 'ModSecurity' install-nginx || sed -i 's%\(curl.*nginx\.org/download.*\)%# mod_security --maltfield\napt-get install -y libmodsecurity-dev modsecurity-crs\ncd /tmp\ngit clone --depth 1 https://github.com/SpiderLabs/ModSecurity-nginx.git\n\n\1%' install-nginx
[root@osestaging1 base]#
[root@osestaging1 base]# # update the configure line to include the ModSecurity module checked-out above
[root@osestaging1 base]# sed -i '/ModSecurity/! s%^[^#]*./configure \(.*nginx.*\)%#./configure \1\n./configure \1 --add-module=/tmp/ModSecurity-nginx%' install-nginx
[root@osestaging1 base]#
[root@osestaging1 base]# # add a line to cleanup section
[root@osestaging1 base]# grep 'rm -fr /tmp/ModSecurity-nginx' install-nginx || sed -i 's%\(rm -fr.*/tmp/nginx.*\)%rm -fr /tmp/ModSecurity-nginx\n\1%' install-nginx
[root@osestaging1 base]#
[root@osestaging1 base]# popd
/var/discourse ~
[root@osestaging1 discourse]#
- And the necessary change to 'launcher'
[root@osestaging1 discourse]# pushd "${vhostDir}"
/var/discourse /var/discourse ~
[root@osestaging1 discourse]#
[root@osestaging1 discourse]# # replace the line "image="discourse/base:<version>" with 'image="discourse_ose"'
[root@osestaging1 discourse]# grep 'discourse_ose' launcher || sed --in-place=.`date "+%Y%m%d_%H%M%S"` '/base_image/! s%^\(\s*\)image=\(.*\)$%#\1image=\2\n\1image="discourse_ose"%' launcher
[root@osestaging1 discourse]#
[root@osestaging1 discourse]# popd
/var/discourse ~
[root@osestaging1 discourse]#
- git diff confirms the changes
[root@osestaging1 discourse]# git diff
diff --git a/image/base/install-nginx b/image/base/install-nginx
index 7b91333..172d795 100755
--- a/image/base/install-nginx
+++ b/image/base/install-nginx
@@ -18,6 +18,11 @@ cd /tmp
 # this is the reason we are compiling by hand...
 git clone https://github.com/google/ngx_brotli.git
 
+# mod_security --maltfield
+apt-get install -y libmodsecurity-dev modsecurity-crs
+cd /tmp
+git clone --depth 1 https://github.com/SpiderLabs/ModSecurity-nginx.git
+
 curl -O https://nginx.org/download/nginx-$VERSION.tar.gz
 tar zxf nginx-$VERSION.tar.gz
 cd nginx-$VERSION
@@ -31,13 +36,15 @@ apt-mark hold nginx
 cd /tmp/ngx_brotli && git submodule update --init && cd /tmp/nginx-$VERSION
 
 # ignoring depracations with -Wno-deprecated-declarations while we wait for this https://github.com/google/ngx_brotli/issues/39#issuecomment-254093378
-./configure --with-cc-opt='-g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wno-deprecated-declarations' --with-ld-opt='-Wl,-Bsymbolic-
+#./configure --with-cc-opt='-g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wno-deprecated-declarations' --with-ld-opt='-Wl,-Bsymbolic
+./configure --with-cc-opt='-g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wno-deprecated-declarations' --with-ld-opt='-Wl,-Bsymbolic-
 
 make install
 
 mv /usr/share/nginx/sbin/nginx /usr/sbin
 
 cd /
+rm -fr /tmp/ModSecurity-nginx
 rm -fr /tmp/nginx
 rm -fr /tmp/libbrotli
 rm -fr /tmp/ngx_brotli
diff --git a/launcher b/launcher
index 41e7c72..0b8b5c8 100755
--- a/launcher
+++ b/launcher
@@ -88,7 +88,8 @@ git_rec_version='1.8.0'
 config_file=containers/"$config".yml
 cidbootstrap=cids/"$config"_bootstrap.cid
 local_discourse=local_discourse
-image="discourse/base:2.0.20191219-2109"
+#image="discourse/base:2.0.20191219-2109"
+image="discourse_ose"
 docker_path=`which docker.io 2> /dev/null || which docker`
 git_path=`which git`
- now we can rebuild the Discourse Docker image with nginx mod_security
[root@osestaging1 discourse]# pushd "${vhostDir}/image/base"
/var/discourse/image/base /var/discourse ~
[root@osestaging1 base]#
[root@osestaging1 base]# # force a fresh build (no-cache) so the `git pull` lines will trigger
[root@osestaging1 base]# # note this will take a *ridiculously* long time; the Discourse team compiles many packages from source :(
[root@osestaging1 base]# time nice docker build --no-cache --network=host --tag 'discourse_ose' /var/discourse/image/base/
...
Removing intermediate container 9a74706741d2
 ---> f360219e7107
Successfully built f360219e7107
Successfully tagged discourse_ose:latest

real	40m46.372s
user	0m0.998s
sys	0m0.659s
[root@osestaging1 base]#
[root@osestaging1 base]# popd
/var/discourse ~
[root@osestaging1 discourse]#
- And finally we rebuild the Discourse app. Ugh, that failed with that old stupid message
[root@osestaging1 discourse]# ${vhostDir}/launcher rebuild discourse_ose
...
+ /bin/docker rm discourse_ose
Error response from daemon: container 038ea7a12fa5882a16a22da89ccd5d8b04cda241ea80cf0b017c76b1d34a76ee: driver "overlay2" failed to remove root filesystem: unlinkat /var/lib/docker/overlay2/799e97d530bd7cb1d8d93aeb685f13daf0e4bfbe272cab76aa0f017c1a04e7b3/merged: device or resource busy
debug2: channel 0: window 999139 sent adjust 49437
starting up existing container
+ /bin/docker start discourse_ose
Error response from daemon: container is marked for removal and cannot be started
Error: failed to start containers: discourse_ose
[root@osestaging1 discourse]#
- I guess it's unhappy about this one that says "Removal In Progress"
[root@osestaging1 discourse]# docker ps -a
CONTAINER ID   IMAGE          COMMAND                  CREATED       STATUS                     PORTS   NAMES
038ea7a12fa5   902ab1153546   "/sbin/boot"             11 days ago   Removal In Progress                discourse_ose
5cc0db30940b   940c0024cbd7   "/bin/bash -c 'cd /p…"   11 days ago   Exited (1) 11 days ago             thirsty_borg
24a1f9f4c038   6a959e2d597c   "/bin/bash"              4 weeks ago   Exited (1) 4 weeks ago             peaceful_leavitt
6932865cc6a1   6a959e2d597c   "/bin/bash"              4 weeks ago   Exited (1) 4 weeks ago             friendly_grothendieck
fce75ef5ce06   940c0024cbd7   "/bin/bash"              4 weeks ago   Exited (0) 4 weeks ago             gifted_booth
03ea184c205e   940c0024cbd7   "/bin/bash"              4 weeks ago   Exited (127) 4 weeks ago           clever_solomon
6bd5bb0ab7b5   940c0024cbd7   "whoami"                 4 weeks ago   Exited (0) 4 weeks ago             upbeat_booth
4fbcfcc1e05f   940c0024cbd7   "echo hello"             4 weeks ago   Created                            sweet_lalande
88d916eb12b0   940c0024cbd7   "echo hello"             4 weeks ago   Created                            goofy_allen
4a3b6e123460   940c0024cbd7   "/bin/bash"              4 weeks ago   Exited (1) 4 weeks ago             adoring_mirzakhani
ef4f90be07e6   940c0024cbd7   "/bin/bash"              4 weeks ago   Exited (0) 4 weeks ago             awesome_mcclintock
580c0e430c47   940c0024cbd7   "/bin/bash"              4 weeks ago   Exited (130) 4 weeks ago           naughty_greider
4bce62d2e873   940c0024cbd7   "/usr/bin/apt-get in…"   4 weeks ago   Created                            boring_lehmann
6d4ef0ebb57d   940c0024cbd7   "/usr/bin/apt-get in…"   4 weeks ago   Created                            loving_davinci
4d5c8b2a90e0   940c0024cbd7   "/usr/bin/apt-get in…"   4 weeks ago   Exited (0) 4 weeks ago             quizzical_mestorf
34a3f6146a1d   940c0024cbd7   "/usr/bin/apt-get in…"   4 weeks ago   Exited (0) 4 weeks ago             epic_williamson
f0a73d8db0db   940c0024cbd7   "iptables -L"            4 weeks ago   Created                            dazzling_beaver
4f34a5f5ee65   940c0024cbd7   "/usr/bin/apt-get in…"   4 weeks ago   Exited (0) 4 weeks ago             quizzical_haslett
0980ad174804   940c0024cbd7   "/usr/bin/apt-get in…"   4 weeks ago   Exited (0) 4 weeks ago             wonderful_tereshkova
79413047322f   940c0024cbd7   "/usr/bin/apt-get in…"   4 weeks ago   Created                            naughty_proskuriakova
ba00edad459a   940c0024cbd7   "sudo apt-get instal…"   4 weeks ago   Created                            quizzical_burnell
7364dbb52542   940c0024cbd7   "sudo apt-get instal…"   4 weeks ago   Created                            cocky_bhaskara
9d0e485beba0   940c0024cbd7   "sudo apt-get instal…"   4 weeks ago   Created                            nervous_greider
75394a9e553f   940c0024cbd7   "/usr/sbin/iptables …"   4 weeks ago   Created                            admiring_cori
8c59607a7b23   940c0024cbd7   "iptables -L"            4 weeks ago   Created                            silly_buck
92a929061a43   940c0024cbd7   "bash"                   4 weeks ago   Exited (0) 4 weeks ago             sleepy_cohen
0d4c01df1acb   940c0024cbd7   "bash"                   4 weeks ago   Exited (0) 4 weeks ago             busy_satoshi
3557078bec62   940c0024cbd7   "/bin/bash -c 'echo …"   4 weeks ago   Exited (0) 4 weeks ago             busy_sammet
56360e585353   bd5b8ac7ac36   "/bin/sh -c 'apt upd…"   4 weeks ago   Exited (100) 4 weeks ago           youthful_hermann
53bbee438a5e   9b33df0cef8e   "/bin/sh -c 'apt upd…"   5 weeks ago   Exited (127) 5 weeks ago           awesome_newton
[root@osestaging1 discourse]#
- well, for some reason I was just able to remove it myself without issue..
[root@osestaging1 discourse]# docker rm 038ea7a12fa5
038ea7a12fa5
[root@osestaging1 discourse]# docker ps -a
CONTAINER ID   IMAGE          COMMAND                  CREATED       STATUS                     PORTS   NAMES
5cc0db30940b   940c0024cbd7   "/bin/bash -c 'cd /p…"   11 days ago   Exited (1) 11 days ago             thirsty_borg
24a1f9f4c038   6a959e2d597c   "/bin/bash"              4 weeks ago   Exited (1) 4 weeks ago             peaceful_leavitt
6932865cc6a1   6a959e2d597c   "/bin/bash"              4 weeks ago   Exited (1) 4 weeks ago             friendly_grothendieck
fce75ef5ce06   940c0024cbd7   "/bin/bash"              4 weeks ago   Exited (0) 4 weeks ago             gifted_booth
03ea184c205e   940c0024cbd7   "/bin/bash"              4 weeks ago   Exited (127) 4 weeks ago           clever_solomon
6bd5bb0ab7b5   940c0024cbd7   "whoami"                 4 weeks ago   Exited (0) 4 weeks ago             upbeat_booth
4fbcfcc1e05f   940c0024cbd7   "echo hello"             4 weeks ago   Created                            sweet_lalande
88d916eb12b0   940c0024cbd7   "echo hello"             4 weeks ago   Created                            goofy_allen
4a3b6e123460   940c0024cbd7   "/bin/bash"              4 weeks ago   Exited (1) 4 weeks ago             adoring_mirzakhani
ef4f90be07e6   940c0024cbd7   "/bin/bash"              4 weeks ago   Exited (0) 4 weeks ago             awesome_mcclintock
580c0e430c47   940c0024cbd7   "/bin/bash"              4 weeks ago   Exited (130) 4 weeks ago           naughty_greider
4bce62d2e873   940c0024cbd7   "/usr/bin/apt-get in…"   4 weeks ago   Created                            boring_lehmann
6d4ef0ebb57d   940c0024cbd7   "/usr/bin/apt-get in…"   4 weeks ago   Created                            loving_davinci
4d5c8b2a90e0   940c0024cbd7   "/usr/bin/apt-get in…"   4 weeks ago   Exited (0) 4 weeks ago             quizzical_mestorf
34a3f6146a1d   940c0024cbd7   "/usr/bin/apt-get in…"   4 weeks ago   Exited (0) 4 weeks ago             epic_williamson
f0a73d8db0db   940c0024cbd7   "iptables -L"            4 weeks ago   Created                            dazzling_beaver
4f34a5f5ee65   940c0024cbd7   "/usr/bin/apt-get in…"   4 weeks ago   Exited (0) 4 weeks ago             quizzical_haslett
0980ad174804   940c0024cbd7   "/usr/bin/apt-get in…"   4 weeks ago   Exited (0) 4 weeks ago             wonderful_tereshkova
79413047322f   940c0024cbd7   "/usr/bin/apt-get in…"   4 weeks ago   Created                            naughty_proskuriakova
ba00edad459a   940c0024cbd7   "sudo apt-get instal…"   4 weeks ago   Created                            quizzical_burnell
7364dbb52542   940c0024cbd7   "sudo apt-get instal…"   4 weeks ago   Created                            cocky_bhaskara
9d0e485beba0   940c0024cbd7   "sudo apt-get instal…"   4 weeks ago   Created                            nervous_greider
75394a9e553f   940c0024cbd7   "/usr/sbin/iptables …"   4 weeks ago   Created                            admiring_cori
8c59607a7b23   940c0024cbd7   "iptables -L"            4 weeks ago   Created                            silly_buck
92a929061a43   940c0024cbd7   "bash"                   4 weeks ago   Exited (0) 4 weeks ago             sleepy_cohen
0d4c01df1acb   940c0024cbd7   "bash"                   4 weeks ago   Exited (0) 4 weeks ago             busy_satoshi
3557078bec62   940c0024cbd7   "/bin/bash -c 'echo …"   4 weeks ago   Exited (0) 4 weeks ago             busy_sammet
56360e585353   bd5b8ac7ac36   "/bin/sh -c 'apt upd…"   4 weeks ago   Exited (100) 4 weeks ago           youthful_hermann
53bbee438a5e   9b33df0cef8e   "/bin/sh -c 'apt upd…"   5 weeks ago   Exited (127) 5 weeks ago           awesome_newton
[root@osestaging1 discourse]#
- And I started it..
[root@osestaging1 discourse]# ./launcher start discourse_ose

+ /bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=2 -e UNICORN_SIDEKIQS=1 -e RUBY_GLOBAL_METHOD_CACHE_SIZE=131072 -e RUBY_GC_HEAP_GROWTH_MAX_SLOTS=40000 -e RUBY_GC_HEAP_INIT_SLOTS=400000 -e RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.5 -e DISCOURSE_DB_SOCKET=/var/run/postgresql -e DISCOURSE_DB_HOST= -e DISCOURSE_DB_PORT= -e DISCOURSE_HOSTNAME=discourse.opensourceecology.org -e DISCOURSE_DEVELOPER_EMAILS=osediscourse_2019@michaelaltfield.net -e DISCOURSE_SMTP_ADDRESS=172.17.0.1 -e DISCOURSE_SMTP_PORT=25 -e DISCOURSE_SMTP_AUTHENTICATION=none -e DISCOURSE_SMTP_OPENSSL_VERIFY_MODE=none -e DISCOURSE_SMTP_ENABLE_START_TLS=false -h osestaging1-discourse-ose -e DOCKER_HOST_IP=172.17.0.1 --name discourse_ose -t -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:fc:97:b8:b4:0d --cap-add NET_ADMIN local_discourse/discourse_ose /sbin/boot
7987b80223d8b02ed1904d95e20d0f5258af80e8740fda8e844d5f8010d0bc9c
[root@osestaging1 discourse]#
- And it worked! I was able to "enter" the container and access the site in my browser
- The wui now indicates that it's "2.4.0.beta9 (7200653e16)"
- And has docker_manager = 2c89085
- So we were able to successfully upgrade, great! I also updated the documentation for the upgrade on the wiki https://wiki.opensourceecology.org/wiki/Discourse#Updating_Discoruse
- ...
- I noticed a page that lists sidekiq errors https://discourse.opensourceecology.org/sidekiq/retries
2019-12-28 16:15:09 UTC	0	default	Jobs::VersionCheck	{}	Jobs::HandledExceptionWrapper: Wrapped Excon::Error::Socket: getaddrinfo: Temporary failure in name resolution (SocketError)
2019-12-28 16:23:24 UTC	5	default	Jobs::VersionCheck	{}	Jobs::HandledExceptionWrapper: Wrapped Excon::Error::Socket: getaddrinfo: Temporary failure in name resolution (SocketError)
2019-12-28 21:31:57 UTC	20	low	Jobs::UserEmail	{"type"=>"activation_reminder", "user_id"=>4, "email_token"=>"f74bd0aedda592aff4f05e907c0c16b6", "current_site_id"=>"default"}	Jobs::HandledExceptionWrapper: Wrapped Net::OpenTimeout: execution expired
2019-12-29 01:56:34 UTC	21	critical	Jobs::CriticalUserEmail	{"type"=>"signup", "user_id"=>4, "email_token"=>"1a550bcca81e773f24c5c6d94a4fc7c0", "current_site_id"=>"default"}	Jobs::HandledExceptionWrapper: Wrapped Net::OpenTimeout: execution expired
- The version checks are expected to fail, but I would think that the UserEmail would be sent to 127.0.0.1, so it *should* work.
- Oh, right, docker. The SMTP server is *actually* the docker host. I suppose there's two solutions here:
- I poke a hole in the firewall to permit sidekiq to communicate with the docker host (osestaging1) or
- I install & setup postfix inside the docker container with a simple config that just relays traffic from 127.0.0.1 up to the discourse docker container
- The first (poking a firewall hole) is the easiest, but carries more security risk. But I think if the hole is only opened for TCP traffic on port 25, it should be OK; see the sketch below
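- A hedged sketch of what that hole might look like on the docker host (not an exact rule we've deployed; 172.17.0.0/16 on docker0 is docker's default bridge network):
# hypothetical sketch: allow containers on the default docker bridge to reach
# the host's SMTP server on port 25, and nothing else
iptables -A INPUT -i docker0 -s 172.17.0.0/16 -p tcp --dport 25 -j ACCEPT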
- I tried to "retry" the email task, but I got back a "400 Bad Request" "Request Header Or Cookie Too Large" from nginx. Indeed, the cookie that my browser sent was fucking massive! Why?!?
Cookie: _t=403b8203003bdaf522679e0b6c17605f; rack.session=BAh7C0kiD3Nlc3Npb25faWQGOgZFVG86HVJhY2s6OlNlc3Npb246OlNlc3Np%0Ab25JZAY6D0BwdWJsaWNfaWRJIkU4NjY1Y2Y3MjU4OGYzZjk3MDUyMDdhNzhh%0ANzc4OGZlZTRhNzFkY2JjMzA4NDAyNjY3MWMwNmFhYzc2Zjg0NWIyBjsARkki%0AEF9jc3JmX3Rva2VuBjsARkkiMW9KNk83S1B5ZUVaR0I2WnBxckxISzNxbEla%0AdGRyVXNCUnc3d2JiaVorVmM9BjsARkkiFnNlY3VyZV9zZXNzaW9uX2lkBjsA%0AVEkiJWM3YWJjYTk3OTRlNTExNTllZWUyMTBkYmVkNDgzNDc4BjsARkkiCmZs%0AYXNoBjsAVHsHSSIMZGlzY2FyZAY7AFRbAEkiDGZsYXNoZXMGOwBUewZJIgxy%0AZWZlcmVyBjsAVCJPaHR0cHM6Ly9kaXNjb3Vyc2Uub3BlbnNvdXJjZWVjb2xv%0AZ3kub3JnL3UvbWFsdGZpZWxkMC9tZXNzYWdlcy9ncm91cC9hZG1pbnNJIglj%0Ac3JmBjsARkkiMVRuUHJ6TTMzUHZqcG9oditwdTRyem9HeDUxQnAwL0psci9z%0AYkFZWkpxdkU9BjsARkkiDXRyYWNraW5nBjsARnsGSSIUSFRUUF9VU0VSX0FH%0ARU5UBjsAVEkiLTk3MDJkMjYxMmJlZmM5N2U4YTIzMDVkNjU0Y2IwOThmMmQ4%0AYTI1NTUGOwBG%0A--e602081dddcd88bdb269034e7acb8c582665be0e; _forum_session=V2dWY0FGVGhsMDVtcmdmNVdicXpJVkxnVC9vUWpkeVdpSHRIYWZaVVhVZDUxcFlRdTh4bHFTQVRRcUpGR3pMRGZ1M3NGeENzUloreGdEWEtQS2Z2WDJKNFZUeXRjNXlTTTRHVzJsQzBORzVuSW9NNHg3UHhwNzdUNFlCNGVvcytkSjA1b0d3NlM3czNlTlFxMEloQmNOYzMxTm5mYW4zaWlMSkpxWXZiZDlBRFJnR3dxTkphM0ZtZmk4bGswcUdzYm94b3pkUk0zTG5sMjhqNkxYMnZqMjJPYkhzMGFLM2JWZzBCRXpFa2wyZm1HbUl3REVzd3c5MmhRMG5YMkFJV0t6Z2ZPRlI2bVpOQWJlZWJQd2pyclEvdmVmWUlsYkxyU0EzcDFaZkRpOU14SVptMk01TjZtZlNXa2VUQnFaaGZpNDlaQVBnY0RCS2ZlbVBiQWt3S2lSeHcvY0g2WUlXOTRLejh2dVhhcDFoRXVobUdRMnJvcjhtRTkxZjZCYXM1eDd2NU1rZ3duRy83VVhVSG5Ua3BIOTJoQ1orY2dlMjh2M0Fuc0lwb3p3ckVtaXhxaDkxT2E1YnEvbnBWTVlCaXd4Q2h6eTd5Ty84WUYzeUVFbE9KSXhadXZ4aUw1TmVoSEE0cW9YSHA3VTJoK3NtdktrL09qcDVxMnR5bFhhUmU1dDMzT0ZBUGxBRXBZVHB5WlNtODM2YzBsOVRkc3RpMmFFSW5COEhyRjFTY2ZCZk5VbUpYN2JzYlh6SGNGWEs2dWhQUkJnMmd4K3ZJQUFkQThwa2tOMnI3Vi9qMFo5RE5XWWxxRXFTTTNmRnJKU294aStKZFJ4NHRDTGh4WXR1Z3F5QWU3ZkMxTXBpMzcvYTd5QkRwajNjcDF6SWdFSkdqNDJlMk0vYW1mODNEdDhZSk9jbzRPRHNhZUYzNjVOWkErbVJSNG82VnhRL0FFRUtWbE1uQWFPa0JqQUFmZ21iL0YvSFIrM1dlSDNvPS0tK1pMRzNmRjA4L3c4VTkrVEllYmNQZz09--325ca2e886afaefffeb6174c99776f91fefd4292
- So it looks like Discourse has two session ids: rack.session is 813 characters long and _forum_session is 1,075 characters long. For comparison, MediaWiki only has one session id, and it's 32 characters long. This really doesn't make sense. I asked why this cookie's uids are so long in the Discourse meta forums https://meta.discourse.org/t/discourse-session-cookies-400-request-header-or-cookie-too-large/137245
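- In the meantime, a possible workaround would be raising nginx's request header buffer (its large_client_header_buffers directive defaults to 4 buffers of 8k) inside the container's nginx config; an untested sketch:
# untested sketch: let nginx accept request headers up to 16k so Discourse's
# huge session cookies don't trigger "400 Request Header Or Cookie Too Large"
large_client_header_buffers 4 16k;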
Fri Dec 27, 2019
- Discourse unattended-upgrades
Tue Dec 24, 2019
- I updated our Staging Server wiki page with an alternative that uses firejail to force firefox to use osedev1 as its DNS server (note that DNS cannot be set in firefox itself; firejail creates a sandbox that overrides the OS's system DNS, which firefox then inherits) https://wiki.opensourceecology.org/wiki/OSE_Staging_Server#Accessing_Staging_Websites
sudo apt-get install firejail
firejail --dns=10.241.189.1 firefox
- ...
- I got no response to my query about whether the Discourse docker container uses unattended-upgrades https://meta.discourse.org/t/does-discourse-container-use-unattended-upgrades/136296
- A quick check shows that it *is* installed, but I still don't know where it comes from
root@osestaging1-discourse-ose:/var/www/discourse# dpkg -l | grep -i unatt
ii  unattended-upgrades  1.11.2  all  automatic installation of security upgrades
root@osestaging1-discourse-ose:/var/www/discourse#
- And the config looks like it *should* be working too
root@osestaging1-discourse-ose:/var/www/discourse# grep -ir 'origin=' /etc/apt/apt.conf.d/50unattended-upgrades
//      "origin=Debian,codename=${distro_codename}-updates";
//      "origin=Debian,codename=${distro_codename}-proposed-updates";
        "origin=Debian,codename=${distro_codename},label=Debian";
        "origin=Debian,codename=${distro_codename},label=Debian-Security";
root@osestaging1-discourse-ose:/var/www/discourse# cat /etc/apt/apt.conf.d/20auto-upgrades
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
root@osestaging1-discourse-ose:/var/www/discourse#
- But it looks like the last log entry was 1 month ago
root@osestaging1-discourse-ose:/var/www/discourse# tail -f /var/log/unattended-upgrades/unattended-upgrades*.log
==> /var/log/unattended-upgrades/unattended-upgrades-dpkg.log <==
Log started: 2019-11-17 12:34:54
(Reading database ... 44559 files and directories currently installed.)
Removing freetype2-doc (2.9.1-3+deb10u1) ...
Log ended: 2019-11-17 12:34:54

Log started: 2019-11-17 12:34:56
(Reading database ... 44389 files and directories currently installed.)
Removing libjs-jquery (3.3.1~dfsg-3) ...
Log ended: 2019-11-17 12:34:57

==> /var/log/unattended-upgrades/unattended-upgrades.log <==
2019-11-26 16:37:47,549 INFO Initial blacklist :
2019-11-26 16:37:47,550 INFO Initial whitelist:
2019-11-26 16:37:47,551 INFO Starting unattended upgrades script
2019-11-26 16:37:47,552 INFO Allowed origins are: origin=Debian,codename=buster,label=Debian, origin=Debian,codename=buster,label=Debian-Security
2019-11-26 16:37:50,811 INFO Checking if system is running on battery is skipped. Please install powermgmt-base package to check power status and skip installing updates when the system is running on battery.
2019-11-26 16:37:50,814 INFO Initial blacklist :
2019-11-26 16:37:50,815 INFO Initial whitelist:
2019-11-26 16:37:50,815 INFO Starting unattended upgrades script
2019-11-26 16:37:50,815 INFO Allowed origins are: origin=Debian,codename=buster,label=Debian, origin=Debian,codename=buster,label=Debian-Security
2019-11-26 16:37:53,119 INFO No packages found that can be upgraded unattended and no pending auto-removals
^C
root@osestaging1-discourse-ose:/var/www/discourse#
- I manually executed an unattended-upgrades run
root@osestaging1-discourse-ose:/var/www/discourse# sudo unattended-upgrade -d
Checking if system is running on battery is skipped. Please install powermgmt-base package to check power status and skip installing updates when the system is running on battery.
Initial blacklist :
Initial whitelist:
Starting unattended upgrades script
Allowed origins are: origin=Debian,codename=buster,label=Debian, origin=Debian,codename=buster,label=Debian-Security
Using (^linux-image-[0-9]+\.[0-9\.]+-.*|^linux-headers-[0-9]+\.[0-9\.]+-.*|^linux-image-extra-[0-9]+\.[0-9\.]+-.*|^linux-modules-[0-9]+\.[0-9\.]+-.*|^linux-modules-extra-[0-9]+\.[0-9\.]+-.*|^linux-signed-image-[0-9]+\.[0-9\.]+-.*|^linux-image-unsigned-[0-9]+\.[0-9\.]+-.*|^kfreebsd-image-[0-9]+\.[0-9\.]+-.*|^kfreebsd-headers-[0-9]+\.[0-9\.]+-.*|^gnumach-image-[0-9]+\.[0-9\.]+-.*|^.*-modules-[0-9]+\.[0-9\.]+-.*|^.*-kernel-[0-9]+\.[0-9\.]+-.*|^linux-backports-modules-.*-[0-9]+\.[0-9\.]+-.*|^linux-modules-.*-[0-9]+\.[0-9\.]+-.*|^linux-tools-[0-9]+\.[0-9\.]+-.*|^linux-cloud-tools-[0-9]+\.[0-9\.]+-.*|^linux-buildinfo-[0-9]+\.[0-9\.]+-.*|^linux-source-[0-9]+\.[0-9\.]+-.*) regexp to find kernel packages
Using (^linux-image-3\.10\.0\-957\.21\.3\.el7\.x86_64$|^linux-headers-3\.10\.0\-957\.21\.3\.el7\.x86_64$|^linux-image-extra-3\.10\.0\-957\.21\.3\.el7\.x86_64$|^linux-modules-3\.10\.0\-957\.21\.3\.el7\.x86_64$|^linux-modules-extra-3\.10\.0\-957\.21\.3\.el7\.x86_64$|^linux-signed-image-3\.10\.0\-957\.21\.3\.el7\.x86_64$|^linux-image-unsigned-3\.10\.0\-957\.21\.3\.el7\.x86_64$|^kfreebsd-image-3\.10\.0\-957\.21\.3\.el7\.x86_64$|^kfreebsd-headers-3\.10\.0\-957\.21\.3\.el7\.x86_64$|^gnumach-image-3\.10\.0\-957\.21\.3\.el7\.x86_64$|^.*-modules-3\.10\.0\-957\.21\.3\.el7\.x86_64$|^.*-kernel-3\.10\.0\-957\.21\.3\.el7\.x86_64$|^linux-backports-modules-.*-3\.10\.0\-957\.21\.3\.el7\.x86_64$|^linux-modules-.*-3\.10\.0\-957\.21\.3\.el7\.x86_64$|^linux-tools-3\.10\.0\-957\.21\.3\.el7\.x86_64$|^linux-cloud-tools-3\.10\.0\-957\.21\.3\.el7\.x86_64$|^linux-buildinfo-3\.10\.0\-957\.21\.3\.el7\.x86_64$|^linux-source-3\.10\.0\-957\.21\.3\.el7\.x86_64$) regexp to find running kernel packages
Checking: git ([<Origin component:'main' archive:'stable' origin:'Debian' label:'Debian-Security' site:'security.debian.org' isTrusted:True>])
Checking: git-man ([<Origin component:'main' archive:'stable' origin:'Debian' label:'Debian-Security' site:'security.debian.org' isTrusted:True>])
pkgs that look like they should be upgraded: git
git-man
Get:1 http://security.debian.org/debian-security buster/updates/main amd64 git-man all 1:2.20.1-2+deb10u1 [1620 kB]
Get:2 http://security.debian.org/debian-security buster/updates/main amd64 git amd64 1:2.20.1-2+deb10u1 [5620 kB]
Fetched 7239 kB in 0s (0 B/s)
fetch.run() result: 0
<apt_pkg.AcquireItem object:Status: 2 Complete: 1 Local: 0 IsTrusted: 1 FileSize: 1619784 DestFile:'/var/cache/apt/archives/git-man_1%3a2.20.1-2+deb10u1_all.deb' DescURI: 'http://security.debian.org/debian-security/pool/updates/main/g/git/git-man_2.20.1-2+deb10u1_all.deb' ID:1 ErrorText: ''>
check_conffile_prompt(/var/cache/apt/archives/git-man_1%3a2.20.1-2+deb10u1_all.deb)
found pkg: git-man
No conffiles in deb /var/cache/apt/archives/git-man_1%3a2.20.1-2+deb10u1_all.deb (There is no member named 'conffiles')
<apt_pkg.AcquireItem object:Status: 2 Complete: 1 Local: 0 IsTrusted: 1 FileSize: 5619704 DestFile:'/var/cache/apt/archives/git_1%3a2.20.1-2+deb10u1_amd64.deb' DescURI: 'http://security.debian.org/debian-security/pool/updates/main/g/git/git_2.20.1-2+deb10u1_amd64.deb' ID:2 ErrorText: ''>
check_conffile_prompt(/var/cache/apt/archives/git_1%3a2.20.1-2+deb10u1_amd64.deb)
found pkg: git
conffile line: /etc/bash_completion.d/git-prompt 7baac5c3ced94ebf2c0e1dde65c3b1a6
current md5: 7baac5c3ced94ebf2c0e1dde65c3b1a6
blacklist: []
whitelist: []
Packages that will be upgraded: git git-man
Writing dpkg log to /var/log/unattended-upgrades/unattended-upgrades-dpkg.log
applying set ['git']
debconf: delaying package configuration, since apt-utils is not installed
(Reading database ... 45278 files and directories currently installed.)
Preparing to unpack .../git_1%3a2.20.1-2+deb10u1_amd64.deb ...
Unpacking git (1:2.20.1-2+deb10u1) over (1:2.20.1-2) ...
Setting up git (1:2.20.1-2+deb10u1) ...
left to upgrade {'git-man'}
applying set ['git-man']
Log ended: 2019-12-24 17:33:04
Log started: 2019-12-24 17:33:06
debconf: delaying package configuration, since apt-utils is not installed
(Reading database ... 45285 files and directories currently installed.)
Preparing to unpack .../git-man_1%3a2.20.1-2+deb10u1_all.deb ...
Unpacking git-man (1:2.20.1-2+deb10u1) over (1:2.20.1-2) ...
Setting up git-man (1:2.20.1-2+deb10u1) ...
left to upgrade set()
All upgrades installed
InstCount=0 DelCount=0 BrokenCount=0
Extracting content from /var/log/unattended-upgrades/unattended-upgrades-dpkg.log since 2019-12-24 17:32:55
root@osestaging1-discourse-ose:/var/www/discourse#
- Ok, that proved that unattended-upgrades isn't running automatically: there were two git packages waiting to be installed, and my manual run picked them up. The logs now show entries from the manual run
root@osestaging1-discourse-ose:/var/www/discourse# tail -f /var/log/unattended-upgrades/unattended-upgrades*.log
==> /var/log/unattended-upgrades/unattended-upgrades-dpkg.log <==
Log ended: 2019-12-24 17:33:04

Log started: 2019-12-24 17:33:06
debconf: delaying package configuration, since apt-utils is not installed
(Reading database ... 45285 files and directories currently installed.)
Preparing to unpack .../git-man_1%3a2.20.1-2+deb10u1_all.deb ...
Unpacking git-man (1:2.20.1-2+deb10u1) over (1:2.20.1-2) ...
Setting up git-man (1:2.20.1-2+deb10u1) ...
Log ended: 2019-12-24 17:33:07

==> /var/log/unattended-upgrades/unattended-upgrades.log <==
2019-12-24 17:32:59,326 DEBUG whitelist: []
2019-12-24 17:32:59,327 INFO Packages that will be upgraded: git git-man
2019-12-24 17:32:59,327 INFO Writing dpkg log to /var/log/unattended-upgrades/unattended-upgrades-dpkg.log
2019-12-24 17:32:59,509 DEBUG applying set ['git']
2019-12-24 17:33:06,281 DEBUG left to upgrade {'git-man'}
2019-12-24 17:33:06,404 DEBUG applying set ['git-man']
2019-12-24 17:33:08,729 DEBUG left to upgrade set()
2019-12-24 17:33:08,730 INFO All upgrades installed
2019-12-24 17:33:09,166 DEBUG InstCount=0 DelCount=0 BrokenCount=0
2019-12-24 17:33:09,179 DEBUG Extracting content from /var/log/unattended-upgrades/unattended-upgrades-dpkg.log since 2019-12-24 17:32:55
^C
root@osestaging1-discourse-ose:/var/www/discourse#
- The timers for downloading & upgrading these packages are defined in these systemd files
root@osestaging1-discourse-ose:/var/www/discourse# cat /lib/systemd/system/apt-daily.timer
[Unit]
Description=Daily apt download activities

[Timer]
OnCalendar=*-*-* 6,18:00
RandomizedDelaySec=12h
Persistent=true

[Install]
WantedBy=timers.target
root@osestaging1-discourse-ose:/var/www/discourse# cat /etc/systemd/system/apt-daily.timer.d/override.conf
cat: /etc/systemd/system/apt-daily.timer.d/override.conf: No such file or directory
root@osestaging1-discourse-ose:/var/www/discourse# cat /lib/systemd/system/apt-daily-upgrade.timer
[Unit]
Description=Daily apt upgrade and clean activities
After=apt-daily.timer

[Timer]
OnCalendar=*-*-* 6:00
RandomizedDelaySec=60m
Persistent=true

[Install]
WantedBy=timers.target
root@osestaging1-discourse-ose:/var/www/discourse# cat /etc/systemd/system/apt-daily-upgrade.timer.d/override.conf
cat: /etc/systemd/system/apt-daily-upgrade.timer.d/override.conf: No such file or directory
root@osestaging1-discourse-ose:/var/www/discourse#
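- Worth noting: the Discourse container boots via /sbin/boot (runit), not systemd, so my guess is that these apt-daily timers are simply never scheduled; a quick hedged check from inside the container:
# if PID 1 isn't systemd, systemd timers like apt-daily.timer never fire
ps -p 1 -o comm=
# and this will likely just error out complaining the system wasn't booted with systemd
systemctl list-timers 'apt-daily*'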
- I updated my topic on meta.discourse.org, asking the devs if this was intentional or not https://meta.discourse.org/t/does-discourse-container-use-unattended-upgrades/136296/3
- ...
- Now that iptables is set up, the admin page of the Discourse site does *not* show what the latest version of Discourse is, because it cannot make its malicious-looking "calls home" (this is good) https://discourse.opensourceecology.org/admin
- I guess sidekiq is what's responsible for initiating that denied query
A check for updates has not been performed lately. Ensure sidekiq is running.
Wed Dec 18, 2019
- Meeting with Marcin to set him up with VPN and train him on how to access our staging server's websites in his web browser, including our Discourse POC
- Marcin had an error in OpenVPN that the cipher "AES-256-GCM" was not available in openssl. The solution was to upgrade OpenVPN, but a sufficiently new version (2.4?) was not available in the Ubuntu 16.04 (Xenial) repos
Cipher algorithm 'AES-256-GCM' not found Exiting due to fatal error
- The solution was to add the official openvpn.net repo and install it with apt
sudo apt-get remove openvpn wget -O - https://swupdate.openvpn.net/repos/repo-public.gpg | sudo apt-key add - echo "deb http://build.openvpn.net/debian/openvpn/stable xenial main" | sudo tee /etc/apt/sources.list.d/openvpn-aptrepo.list sudo apt-get update sudo apt-get install openvpn
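- A sanity check worth running after such an upgrade, to confirm the new build actually ships the cipher (--show-ciphers lists what openvpn's crypto library supports)
openvpn --version | head -n1
openvpn --show-ciphers | grep -i 'AES-256-GCM'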
- We also updated Marcin's backup usb's encrypted veracrypt file volume with
- The openvpn dir including his encrypted 'marcin.key' file, CA-signed 'marcin.crt' certificate file, openvpn client.conf file, and other necessary files
- His personal keepass file with an entry containing his 2FA TOTP secret key.
- His personal keepass file with an entry containing his openvpn private key password.
- He also had some issues with the DNS push from the openvpn server. His solution was documented here https://wiki.opensourceecology.org/wiki/DNS_Correction_for_the_Staging_Server
- I think a more reasonable way for developers to set this DNS is to launch a browser from the command line with an explicitly defined DNS server (osedev1) at the browser level. This should be much safer and would permit them to access the prod site in one browser and the dev site in another browser.
- I'm not 100% sure if browsers natively support setting the DNS server, but I do know they support setting DNS for DoH (DNS-over-HTTPS). So, at the very least, if we got our osedev1 DNS server to serve DoH clients, we'd be able to update the browser in that way.
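- A sketch of what that could look like, all hypothetical: it assumes we stand up a DoH endpoint on osedev1 at the URI below (which doesn't exist yet); network.trr.mode/network.trr.uri/network.trr.bootstrapAddress are Firefox's real DoH prefs
# create a throwaway Firefox profile that resolves DNS only via our DoH server
profile_dir="$(mktemp -d)"
cat > "${profile_dir}/user.js" <<'EOF'
// TRR mode 3 = DNS-over-HTTPS only; never fall back to the system resolver
user_pref("network.trr.mode", 3);
// hypothetical endpoint; osedev1 would need a DoH frontend serving this
user_pref("network.trr.uri", "https://osedev1.opensourceecology.org/dns-query");
// bootstrap via osedev1's VPN ip so the DoH hostname itself can resolve
user_pref("network.trr.bootstrapAddress", "10.241.189.1");
EOF
firefox --no-remote --profile "${profile_dir}"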
- ...
- I discovered that Discourse will provide free hosting (~$110+ per year value) for open source projects. While this would certainly be faster and easier than self-hosting our own Discourse install on OSE Server, I have the following concerns https://free.discourse.group/
- We won't have much control over the install, so we may not be able to do advanced tasks as needed
- We may find that the default standard plugin set may not meet our needs
- Max 5 "staff" users may not be sufficient
- We may outgrow the 5G storage limit
- Our site may exceed the 50k monthly page view limit
- If we eventually do need to migrate off the free site to our own site, it may be a more involved process than if we just launched on our own site to begin with
- after such a move, the domain name would change, potentially breaking a lot of redirects to our forums on the Internet
Tue Dec 17, 2019
- Marcin didn't show for our VPN meeting. After waiting for 1 hour, I'm thinking that by Tuesday at 00:30 he actually meant Wednesday at 00:30 (or "Tuesday night" but after midnight, really Wednesday).
- ...
- continuing from yesterday, I can't start discourse anymore due to some docker error.
[root@osestaging1 discourse]# docker start discourse_ose Error response from daemon: container is marked for removal and cannot be started Error: failed to start containers: discourse_ose [root@osestaging1 discourse]#
- Here's the output of `docker ps`. The one anomaly that stands out is the one that says Status = "Dead"
[root@osestaging1 discourse]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES [root@osestaging1 discourse]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 5cc0db30940b discourse_ose "/bin/bash -c 'cd /p…" 13 hours ago Exited (1) 13 hours ago thirsty_borg f8733eb8d9e4 684c8db14460 "/sbin/boot" 18 hours ago Dead discourse_ose 24a1f9f4c038 6a959e2d597c "/bin/bash" 2 weeks ago Exited (1) 2 weeks ago peaceful_leavitt 6932865cc6a1 6a959e2d597c "/bin/bash" 2 weeks ago Exited (1) 2 weeks ago friendly_grothendieck fce75ef5ce06 discourse_ose "/bin/bash" 2 weeks ago Exited (0) 2 weeks ago gifted_booth 03ea184c205e discourse_ose "/bin/bash" 2 weeks ago Exited (127) 2 weeks ago clever_solomon 6bd5bb0ab7b5 discourse_ose "whoami" 2 weeks ago Exited (0) 2 weeks ago upbeat_booth 4fbcfcc1e05f discourse_ose "echo hello" 2 weeks ago Created sweet_lalande 88d916eb12b0 discourse_ose "echo hello" 2 weeks ago Created goofy_allen 4a3b6e123460 discourse_ose "/bin/bash" 3 weeks ago Exited (1) 3 weeks ago adoring_mirzakhani ef4f90be07e6 discourse_ose "/bin/bash" 3 weeks ago Exited (0) 3 weeks ago awesome_mcclintock 580c0e430c47 discourse_ose "/bin/bash" 3 weeks ago Exited (130) 3 weeks ago naughty_greider 4bce62d2e873 discourse_ose "/usr/bin/apt-get in…" 3 weeks ago Created boring_lehmann 6d4ef0ebb57d discourse_ose "/usr/bin/apt-get in…" 3 weeks ago Created loving_davinci 4d5c8b2a90e0 discourse_ose "/usr/bin/apt-get in…" 3 weeks ago Exited (0) 3 weeks ago quizzical_mestorf 34a3f6146a1d discourse_ose "/usr/bin/apt-get in…" 3 weeks ago Exited (0) 3 weeks ago epic_williamson f0a73d8db0db discourse_ose "iptables -L" 3 weeks ago Created dazzling_beaver 4f34a5f5ee65 discourse_ose "/usr/bin/apt-get in…" 3 weeks ago Exited (0) 3 weeks ago quizzical_haslett 0980ad174804 discourse_ose "/usr/bin/apt-get in…" 3 weeks ago Exited (0) 3 weeks ago wonderful_tereshkova 79413047322f discourse_ose "/usr/bin/apt-get in…" 3 weeks ago Created naughty_proskuriakova ba00edad459a discourse_ose "sudo apt-get instal…" 3 weeks ago Created quizzical_burnell 7364dbb52542 discourse_ose "sudo apt-get instal…" 3 weeks ago Created cocky_bhaskara 9d0e485beba0 discourse_ose "sudo apt-get instal…" 3 weeks ago Created nervous_greider 75394a9e553f discourse_ose "/usr/sbin/iptables …" 3 weeks ago Created admiring_cori 8c59607a7b23 discourse_ose "iptables -L" 3 weeks ago Created silly_buck 92a929061a43 discourse_ose "bash" 3 weeks ago Exited (0) 3 weeks ago sleepy_cohen 0d4c01df1acb discourse_ose "bash" 3 weeks ago Exited (0) 3 weeks ago busy_satoshi 3557078bec62 discourse_ose "/bin/bash -c 'echo …" 3 weeks ago Exited (0) 3 weeks ago busy_sammet 56360e585353 bd5b8ac7ac36 "/bin/sh -c 'apt upd…" 3 weeks ago Exited (100) 3 weeks ago youthful_hermann 53bbee438a5e 9b33df0cef8e "/bin/sh -c 'apt upd…" 4 weeks ago Exited (127) 4 weeks ago awesome_newton [root@osestaging1 discourse]#
- I tried removing it
[root@osestaging1 discourse]# docker rm f8733eb8d9e4 f8733eb8d9e4 [root@osestaging1 discourse]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 5cc0db30940b discourse_ose "/bin/bash -c 'cd /p…" 13 hours ago Exited (1) 13 hours ago thirsty_borg 24a1f9f4c038 6a959e2d597c "/bin/bash" 2 weeks ago Exited (1) 2 weeks ago peaceful_leavitt 6932865cc6a1 6a959e2d597c "/bin/bash" 2 weeks ago Exited (1) 2 weeks ago friendly_grothendieck fce75ef5ce06 discourse_ose "/bin/bash" 2 weeks ago Exited (0) 2 weeks ago gifted_booth 03ea184c205e discourse_ose "/bin/bash" 2 weeks ago Exited (127) 2 weeks ago clever_solomon 6bd5bb0ab7b5 discourse_ose "whoami" 2 weeks ago Exited (0) 2 weeks ago upbeat_booth 4fbcfcc1e05f discourse_ose "echo hello" 2 weeks ago Created sweet_lalande 88d916eb12b0 discourse_ose "echo hello" 2 weeks ago Created goofy_allen 4a3b6e123460 discourse_ose "/bin/bash" 3 weeks ago Exited (1) 3 weeks ago adoring_mirzakhani ef4f90be07e6 discourse_ose "/bin/bash" 3 weeks ago Exited (0) 3 weeks ago awesome_mcclintock 580c0e430c47 discourse_ose "/bin/bash" 3 weeks ago Exited (130) 3 weeks ago naughty_greider 4bce62d2e873 discourse_ose "/usr/bin/apt-get in…" 3 weeks ago Created boring_lehmann 6d4ef0ebb57d discourse_ose "/usr/bin/apt-get in…" 3 weeks ago Created loving_davinci 4d5c8b2a90e0 discourse_ose "/usr/bin/apt-get in…" 3 weeks ago Exited (0) 3 weeks ago quizzical_mestorf 34a3f6146a1d discourse_ose "/usr/bin/apt-get in…" 3 weeks ago Exited (0) 3 weeks ago epic_williamson f0a73d8db0db discourse_ose "iptables -L" 3 weeks ago Created dazzling_beaver 4f34a5f5ee65 discourse_ose "/usr/bin/apt-get in…" 3 weeks ago Exited (0) 3 weeks ago quizzical_haslett 0980ad174804 discourse_ose "/usr/bin/apt-get in…" 3 weeks ago Exited (0) 3 weeks ago wonderful_tereshkova 79413047322f discourse_ose "/usr/bin/apt-get in…" 3 weeks ago Created naughty_proskuriakova ba00edad459a discourse_ose "sudo apt-get instal…" 3 weeks ago Created quizzical_burnell 7364dbb52542 discourse_ose "sudo apt-get instal…" 3 weeks ago Created cocky_bhaskara 9d0e485beba0 discourse_ose "sudo apt-get instal…" 3 weeks ago Created nervous_greider 75394a9e553f discourse_ose "/usr/sbin/iptables …" 3 weeks ago Created admiring_cori 8c59607a7b23 discourse_ose "iptables -L" 3 weeks ago Created silly_buck 92a929061a43 discourse_ose "bash" 3 weeks ago Exited (0) 3 weeks ago sleepy_cohen 0d4c01df1acb discourse_ose "bash" 3 weeks ago Exited (0) 3 weeks ago busy_satoshi 3557078bec62 discourse_ose "/bin/bash -c 'echo …" 3 weeks ago Exited (0) 3 weeks ago busy_sammet 56360e585353 bd5b8ac7ac36 "/bin/sh -c 'apt upd…" 3 weeks ago Exited (100) 3 weeks ago youthful_hermann 53bbee438a5e 9b33df0cef8e "/bin/sh -c 'apt upd…" 4 weeks ago Exited (127) 4 weeks ago awesome_newton [root@osestaging1 discourse]#
- well now it just says "no such container"
[root@osestaging1 discourse]# docker start discourse_ose Error response from daemon: No such container: discourse_ose Error: failed to start containers: discourse_ose [root@osestaging1 discourse]#
- but a `./launcher start discourse_ose` did start it up!
[root@osestaging1 discourse]# ./launcher start discourse_ose ... + /bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=2 -e UNICORN_SIDEKIQS=1 -e RUBY_GLOBAL_METHOD_CACHE_SIZE=131072 -e RUBY_GC_HEAP_GROWTH_MAX_SLOTS=40000 -e RUBY_GC_HEAP_INIT_SLOTS=400000 -e RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.5 -e DISCOURSE_DB_SOCKET=/var/run/postgresql -e DISCOURSE_DB_HOST= -e DISCOURSE_DB_PORT= -e DISCOURSE_HOSTNAME=discourse.opensourceecology.org -e DISCOURSE_DEVELOPER_EMAILS=osediscourse_2019@michaelaltfield.net -e DISCOURSE_SMTP_ADDRESS=172.17.0.1 -e DISCOURSE_SMTP_PORT=25 -e DISCOURSE_SMTP_AUTHENTICATION=none -e DISCOURSE_SMTP_OPENSSL_VERIFY_MODE=none -e DISCOURSE_SMTP_ENABLE_START_TLS=false -h osestaging1-discourse-ose --cap-add NET_ADMIN -e DOCKER_HOST_IP=172.17.0.1 --name discourse_ose -t -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:fc:97:b8:b4:0d local_discourse/discourse_ose /sbin/boot b55581b930865eb4cf744410cdb7dc2f5ce37517042781a6227fbb640b456d86 ... [root@osestaging1 discourse]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES b55581b93086 local_discourse/discourse_ose "/sbin/boot" 7 seconds ago Up 5 seconds discourse_ose [root@osestaging1 discourse]#
- I documented this issue & its solution here https://meta.discourse.org/t/how-to-fix-container-is-marked-for-removal-and-cannot-be-started-without-reboot/136301
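- In short, the workaround, condensed from the session above (<container_id> stands for whatever `docker ps -a` reports in the "Dead" state)
docker ps -a | grep -i 'dead'    # find the container stuck in the "Dead" state
docker rm <container_id>         # remove it, freeing the name for reuse
cd /var/discourse && ./launcher start discourse_ose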
- I poked a few more holes in the firewall to allow OUTPUT to the loopback addresses in ipv4 & ipv6, and this time a rebuild worked!
[root@osestaging1 discourse]# cat templates/iptables.template.yml run: - file: path: /etc/runit/1.d/01-iptables chmod: "+x" contents: | #!/bin/bash ################################################################################ # File: /etc/runit/1.d/01-iptables # Version: 0.1 # Purpose: installs & locks-down iptables # Author: Michael Altfield <michael@opensourceecology.org> # Created: 2019-11-26 # Updated: 2019-11-26 ################################################################################ sudo apt-get install -y iptables sudo iptables -A OUTPUT -s 127.0.0.1/32 -d 127.0.0.1/32 -j ACCEPT sudo iptables -A OUTPUT -m state --state RELATED,ESTABLISHED -j ACCEPT sudo iptables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT sudo iptables -A OUTPUT -m owner --uid-owner 100 -j ACCEPT sudo iptables -A OUTPUT -j DROP sudo ip6tables -A OUTPUT -s ::1/128 -d ::1/128 -j ACCEPT sudo ip6tables -A OUTPUT -m state --state RELATED,ESTABLISHED -j ACCEPT sudo ip6tables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT sudo ip6tables -A OUTPUT -m owner --uid-owner 100 -j ACCEPT sudo ip6tables -A OUTPUT -j DROP [root@osestaging1 discourse]#
- I also confirmed that the OS can fetch updates (this is the only reason we grant the server any internet access at all, since it actually could be 100% DROP, with nginx on the docker host just reverse proxying to the unix socket file on the docker container)
root@osestaging1-discourse-ose:/var/www/discourse# apt-get upgrade Reading package lists... Done Building dependency tree Reading state information... Done Calculating upgrade... Done The following packages will be upgraded: git git-man 2 upgraded, 0 newly installed, 0 to remove and 0 not upgraded. Need to get 7,239 kB of archives. After this operation, 18.4 kB of additional disk space will be used. Do you want to continue? [Y/n] y Get:1 http://security.debian.org/debian-security buster/updates/main amd64 git-man all 1:2.20.1-2+deb10u1 [1,620 kB] Get:2 http://security.debian.org/debian-security buster/updates/main amd64 git amd64 1:2.20.1-2+deb10u1 [5,620 kB] Fetched 7,239 kB in 1s (11.5 MB/s) debconf: delaying package configuration, since apt-utils is not installed (Reading database ... 45278 files and directories currently installed.) Preparing to unpack .../git-man_1%3a2.20.1-2+deb10u1_all.deb ... Unpacking git-man (1:2.20.1-2+deb10u1) over (1:2.20.1-2) ... Preparing to unpack .../git_1%3a2.20.1-2+deb10u1_amd64.deb ... Unpacking git (1:2.20.1-2+deb10u1) over (1:2.20.1-2) ... Setting up git-man (1:2.20.1-2+deb10u1) ... Setting up git (1:2.20.1-2+deb10u1) ... root@osestaging1-discourse-ose:/var/www/discourse#
- I added some more restrictions to iptables, but then the next build failed. Interesting, but it's a 404, so unlikely related to iptables
I, [2019-12-17T09:34:03.927443 #1] INFO -- : > sudo apt-get install -y modsecurity-crs E: Failed to fetch http://security.debian.org/debian-security/pool/updates/main/a/apache2/apache2-bin_2.4.38-3+deb10u1_amd64.deb 404 Not Found [IP: 151.101.112.204 80] E: Failed to fetch http://deb.debian.org/debian/pool/main/m/modsecurity-crs/modsecurity-crs_3.1.0-1_all.deb 404 Not Found [IP: 151.101.112.204 80] E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing? I, [2019-12-17T09:34:06.259975 #1] INFO -- : Reading package lists...
- Looks like my apt cache was stale; adding an `apt-get update` before the installs for modsecurity & iptables to fix this.
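- A sketch of what that change amounts to inside the template's runit script (the modsecurity template gets the same treatment; the update refreshes the package index so we stop requesting .deb versions the mirrors have since rotated out)
sudo apt-get update
sudo apt-get install -y iptables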
- With my new iptables rules (including INPUT rules), Discourse isn't working now. I'm back to the 502 Bad Gateway. Looks like the ruby app can't connect to redis
Failed to report error: Error connecting to Redis on localhost:6379 (Errno::ECONNREFUSED) 2 Error connecting to Redis on localhost:6379 (Errno::ECONNREFUSED) subscribe failed, reconnecting in 1 second. ... 2019-12-17T09:25:50.512Z pid=219 tid=or2tdiqvj ERROR: heartbeat: Error connecting to Redis on localhost:6379 (Errno::ECONNREFUSED)
- tcpdump shows that there are tcp packets flowing over the ipv6 loopback interface
root@osestaging1-discourse-ose:/var/www/discourse# tcpdump -i lo tcp -nn ... 10:26:31.730131 IP6 ::1.50578 > ::1.6379: Flags [P.], seq 400:533, ack 20, win 3084, options [nop,nop,TS val 523878086 ecr 523878086], length 133: RESP "brpop" "sidekiq:queue:critical" "sidekiq:queue:default" "sidekiq:queue:ultra_low" "sidekiq:queue:low" "2" 10:26:31.730152 IP6 ::1.6379 > ::1.50578: Flags [.], ack 533, win 3097, options [nop,nop,TS val 523878086 ecr 523878086], length 0 10:26:31.892156 IP6 ::1.50572 > ::1.6379: Flags [P.], seq 3591:3639, ack 598, win 3634, options [nop,nop,TS val 523878248 ecr 523877247], length 48: RESP "get" "default:sidekiq_is_paused_v2" 10:26:31.892673 IP6 ::1.6379 > ::1.50572: Flags [P.], seq 598:603, ack 3639, win 2729, options [nop,nop,TS val 523878249 ecr 523878248], length 5: RESP null 10:26:31.892701 IP6 ::1.50572 > ::1.6379: Flags [.], ack 603, win 3634, options [nop,nop,TS val 523878249 ecr 523878249], length 0 10:26:31.893103 IP6 ::1.50572 > ::1.6379: Flags [P.], seq 3639:3710, ack 603, win 3634, options [nop,nop,TS val 523878249 ecr 523878249], length 71: RESP "setnx" "default:_scheduler_lock_default_" "1576578451" 10:26:31.893348 IP6 ::1.6379 > ::1.50572: Flags [P.], seq 603:607, ack 3710, win 2729, options [nop,nop,TS val 523878249 ecr 523878249], length 4: RESP "1" 10:26:31.893969 IP6 ::1.50572 > ::1.6379: Flags [P.], seq 3710:3773, ack 607, win 3634, options [nop,nop,TS val 523878250 ecr 523878249], length 63: RESP "expire" "default:_scheduler_lock_default_" "60" 10:26:31.894207 IP6 ::1.6379 > ::1.50572: Flags [P.], seq 607:611, ack 3773, win 2729, options [nop,nop,TS val 523878250 ecr 523878250], length 4: RESP "1" 10:26:31.894595 IP6 ::1.50572 > ::1.6379: Flags [P.], seq 3773:3860, ack 611, win 3634, options [nop,nop,TS val 523878251 ecr 523878250], length 87: RESP "zrange" "default:_scheduler_queue_default_" "0" "0" "WITHSCORES" 10:26:31.894863 IP6 ::1.6379 > ::1.50572: Flags [P.], seq 611:664, ack 3860, win 2729, options [nop,nop,TS val 523878251 ecr 523878251], length 53: RESP "Jobs::ProcessBadgeBacklog" "1576578432" 10:26:31.895248 IP6 ::1.50572 > ::1.6379: Flags [P.], seq 3860:3973, ack 664, win 3634, options [nop,nop,TS val 523878251 ecr 523878251], length 113: RESP "zrange" "default:_scheduler_queue_default_osestaging1-discourse-ose_" "0" "0" "WITHSCORES" 10:26:31.895548 IP6 ::1.6379 > ::1.50572: Flags [P.], seq 664:668, ack 3973, win 2729, options [nop,nop,TS val 523878252 ecr 523878251], length 4: RESP empty 10:26:31.895839 IP6 ::1.50572 > ::1.6379: Flags [P.], seq 3973:4025, ack 668, win 3634, options [nop,nop,TS val 523878252 ecr 523878252], length 52: RESP "del" "default:_scheduler_lock_default_" 10:26:31.896093 IP6 ::1.6379 > ::1.50572: Flags [P.], seq 668:672, ack 4025, win 2729, options [nop,nop,TS val 523878252 ecr 523878252], length 4: RESP "1" 10:26:31.935586 IP6 ::1.50572 > ::1.6379: Flags [.], ack 672, win 3634, options [nop,nop,TS val 523878292 ecr 523878252], length 0 ^C 237 packets captured 544 packets received by filter 65 packets dropped by kernel root@osestaging1-discourse-ose:/var/www/discourse#
- oh, it looks like this was actually nginx failing with an error about modsecurity. I checked the nginx config with `nginx -V` and saw that it wasn't compiled with mod_security. Looks like I clobbered the launcher script when documenting & testing the sed commands for updating it with the NET_ADMIN capability. I added back the image line, and the next rebuild worked, even with all my iptables lines!
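- For future reference, that check in one line (nginx prints its configure arguments on stderr, so redirect before grepping; no output means no modsecurity module was compiled in)
nginx -V 2>&1 | grep -io 'modsecurity[^ ]*'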
- I updated the documentation on our wiki https://wiki.opensourceecology.org/wiki/Discourse
- And I documented this on the discourse meta forums https://meta.discourse.org/t/how-to-use-iptables-inside-discourse-docker-container/136305/2
Mon Dec 16, 2019
- continuing from last week, I need a way to generate a 2FA token directly in linux for our vpn clients like the production & staging server that need to be able to automatically connect to the vpn without human input. Note it would be preferred to create user-specific exclusions to the 2fa requirement, since storing 2fa secret keys next to our rsa secret keys adds no security (an attacker who can read one can read both), but that's just not possible with the current state of 2FA implemented in OpenVPN using PAM.
- It looks like one tool I can use is 'oathtool'
[root@osestaging1 ~]# yum search oath Loaded plugins: fastestmirror, replace Loading mirror speeds from cached hostfile * base: mirror.checkdomain.de * epel: mirror.23media.com * extras: mirror.softaculous.com * updates: mirror.checkdomain.de * webtatic: uk.repo.webtatic.com N/S matched: oath ================= liboath.x86_64 : Library for OATH handling liboath-devel.x86_64 : Development files for liboath liboath-doc.noarch : Documentation files for liboath pam_oath.x86_64 : A PAM module for pluggable login authentication for OATH gen-oath-safe.noarch : Script for generating HOTP/TOTP keys (and QR code) oathtool.x86_64 : A command line tool for generating and validating OTPs Name and summary matches only, use "search all" for everything. [root@osestaging1 ~]#
- It looks like oathtool can take the secret key in base32, thank god
- And oathtool supports many distinct hash functions for totp, but which should we use?
[root@osestaging1 ~]# oathtool --help oathtool 2.6.2 Generate and validate OATH one-time passwords. Usage: oathtool [OPTIONS]... [KEY [OTP]]... -h, --help Print help and exit -V, --version Print version and exit --hotp use event-based HOTP mode (default=on) --totp[=STRING] use time-variant TOTP mode (possible values="sha1", "sha256", "sha512" default=`sha1') -b, --base32 use base32 encoding of KEY instead of hex (default=off) -c, --counter=COUNTER HOTP counter value -s, --time-step-size=DURATION TOTP time-step duration (default=`30s') -S, --start-time=TIME when to start counting time steps for TOTP (default=`1970-01-01 00:00:00 UTC') -N, --now=TIME use this time as current time for TOTP (default=`now') -d, --digits=DIGITS number of digits in one-time password -w, --window=WIDTH window of counter values to test when validating OTPs -v, --verbose explain what is being done (default=off) Report bugs to: oath-toolkit-help@nongnu.org oathtool home page: <http://www.nongnu.org/oath-toolkit/> General help using GNU software: <http://www.gnu.org/gethelp/> [root@osestaging1 ~]#
- the 'google-authenticator' tool that we use for generating TOTP secret keys doesn't have an option to specify the hash function, nor does it explicitly state in its documentation which hash function it uses https://github.com/google/google-authenticator
- But a quick search in their repo shows references to the SHA1 hash. Yikes, but ok. https://github.com/google/google-authenticator/search?utf8=%E2%9C%93&q=hash&type=
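- Since google-authenticator's TOTP parameters (SHA1, 30-second step, 6 digits) happen to match oathtool's defaults, spelling them out explicitly should yield the same token as the bare invocation below
oathtool --base32 --totp=sha1 --time-step-size=30s --digits=6 BASE32SECRETKEYOBFUSCATED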
- Anyway, this works
[root@osestaging1 ~]# oathtool -b --totp BASE32SECRETKEYOBFUSCATED 123456 [root@osestaging1 ~]#
- I was successfully able to have the osestaging1 server connect to the openvpn server! I did this by creating a user 'osestaging1' on the server and generating their 2fa secret key using `google-authenticator` (I did this last week), then I had to make the following changes to the client
- I added the following lines to the openvpn client.conf file
[root@osestaging1 openvpn]# ls ca.crt client.conf connect.sh osestaging1.crt osestaging1.key ta.key [root@osestaging1 openvpn]# tail -n5 client.conf # 2fa auth-user-pass /root/openvpn/auth.txt auth-nocache reneg-sec 0 [root@osestaging1 openvpn]#
- I created the auth.txt file, which holds the username of the user on the dev server whose $HOME dir holds the '.google_authenticator' file with the 2FA secret key, and the password, which is the current 6-digit 2FA token. Note that this file should be owned by root:root with 0600 permissions
[root@osestaging1 openvpn]# touch auth.txt [root@osestaging1 openvpn]# chown root:root auth.txt [root@osestaging1 openvpn]# chmod 0600 auth.txt [root@osestaging1 openvpn]# ls -lah auth.txt -rw-------. 1 root root 19 Dec 16 09:45 auth.txt [root@osestaging1 openvpn]#
- I created a 'connect.sh' script that would populate the auth.txt file above with the ever-changing TOTP token before calling openvpn
[root@osestaging1 openvpn]# cat connect.sh TOTP_SECRET=BASE32SECRETKEYOBFUSCATED token=`oathtool --base32 --totp ${TOTP_SECRET}` echo -e "osestaging1\n${token}" > /root/openvpn/auth.txt sudo openvpn /root/openvpn/client.conf [root@osestaging1 openvpn]#
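- A slightly hardened sketch of that script, with two assumptions of mine: the base32 secret lives in a hypothetical root-owned 0600 file /etc/openvpn/client/totp.secret instead of inline in the script, and the paths match the /etc/openvpn/client dir adopted below
#!/bin/bash
set -euo pipefail
umask 077
# hypothetical file: the base32 TOTP secret, root:root 0600, outside the script
secret="$(cat /etc/openvpn/client/totp.secret)"
token="$(oathtool --base32 --totp "${secret}")"
# openvpn's auth-user-pass file format: username on line 1, password on line 2
printf '%s\n%s\n' 'osestaging1' "${token}" > /etc/openvpn/client/auth.txt
exec /usr/sbin/openvpn --config /etc/openvpn/client/client.conf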
- That was good for a test, but the better solution is to update systemd so openvpn connects on reboots, etc
- In merging it into systemd, I found that I can't use /root/openvpn (the unit sets ProtectHome=true, which hides /root from the service). Instead I used the existing dir /etc/openvpn/client for this, and I updated the ExecStart line in the relevant systemd unit file
[root@osestaging1 system]# cat /etc/systemd/system/openvpn-client.service [Unit] Description=OpenVPN tunnel for %I After=syslog.target network-online.target Wants=network-online.target Documentation=man:openvpn(8) Documentation=https://community.openvpn.net/openvpn/wiki/Openvpn24ManPage Documentation=https://community.openvpn.net/openvpn/wiki/HOWTO [Service] User=root Type=notify PrivateTmp=true WorkingDirectory=/etc/openvpn/client #WorkingDirectory=/root/openvpn #ExecStart=/usr/sbin/openvpn --suppress-timestamps --nobind --config client.conf ExecStart=/etc/openvpn/client/connect.sh CapabilityBoundingSet=CAP_IPC_LOCK CAP_NET_ADMIN CAP_NET_RAW CAP_SETGID CAP_SETUID CAP_SYS_CHROOT CAP_DAC_OVERRIDE #LimitNPROC=10 LimitNPROC=infinity DeviceAllow=/dev/null rw DeviceAllow=/dev/net/tun rw ProtectSystem=true ProtectHome=true KillMode=process [Install] WantedBy=multi-user.target [root@osestaging1 system]#
- For some reason my dns never gets properly updated after connecting, even though the settings appear correct in the server & client config files. Anyway, I documented how to manually set these DNS settings to be pointed at the staging server after connecting to the VPN. https://wiki.opensourceecology.org/wiki/OSE_Staging_Server#Accessing_Staging_Websites
- I found that I had to restart nginx on the staging server; it was broken for some reason. But it came back fine, and I could access the staging OSE site without issues.
- I noticed that our https cert refreshed last week, so I just confirmed that we have until mid-January before our staging cert expires. I'll need to kick off a prod-to-staging sync at some point, but I'd like to reach a better stopping point with Discourse first.
- I also installed oathtool on production and set it up with 2fa auth over the openvpn-client service as I did with staging & demonstrated above.
- ok, vpn is all set now
- ...
- back to Discourse. I left off trying to iron-out issues with the iptables template https://wiki.opensourceecology.org/wiki/Maltfield_Log/2019_Q4#Tue_Nov_26.2C_2019
- the issue is this templates/iptables.template.yml file
[root@osestaging1 discourse]# head -n20 containers/discourse_ose.yml ## this is the all-in-one, standalone Discourse Docker container template ## ## After making changes to this file, you MUST rebuild ## /var/discourse/launcher rebuild app ## ## BE *VERY* CAREFUL WHEN EDITING! ## YAML FILES ARE SUPER SUPER SENSITIVE TO MISTAKES IN WHITESPACE OR ALIGNMENT! ## visit http://www.yamllint.com/ to validate this file as needed templates: - "templates/iptables.template.yml" #- "templates/postgres.template.yml" #- "templates/redis.template.yml" #- "templates/web.template.yml" #- "templates/web.ratelimited.template.yml" #- "templates/web.socketed.template.yml" #- "templates/web.modsecurity.template.yml" ## Uncomment these two lines if you wish to add Lets Encrypt (https) #- "templates/web.ssl.template.yml" #- "templates/web.letsencrypt.ssl.template.yml" [root@osestaging1 discourse]#
- attempting the rebuild with just that one template yaml produces a docker container that gets stuck in a "restarting" boot loop
[root@osestaging1 discourse]# ./launcher rebuild discourse_ose + /bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e UNICORN_WORKERS=2 -e DISCOURSE_HOSTNAME=discourse.opensourceecology.org -e DISCOURSE_DEVELOPER_EMAILS=osediscourse_2019@michaelaltfield.net -e DISCOURSE_SMTP_ADDRESS=172.17.0.1 -e DISCOURSE_SMTP_PORT=25 -e DISCOURSE_SMTP_AUTHENTICATION=none -e DISCOURSE_SMTP_OPENSSL_VERIFY_MODE=none -e DISCOURSE_SMTP_ENABLE_START_TLS=false -h osestaging1-discourse-ose --cap-add NET_ADMIN -e DOCKER_HOST_IP=172.17.0.1 --name discourse_ose -t -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:fc:97:b8:b4:0d local_discourse/discourse_ose /sbin/boot 5d7eeae555cdb88a95d638de7af7f358d131ac4c99640b18fe2e8078975a948c ... [root@osestaging1 discourse]# ./launcher enter discourse_ose cannot exec in a stopped state: unknown [root@osestaging1 discourse]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 5d7eeae555cd local_discourse/discourse_ose "/sbin/boot" 26 seconds ago Restarting (100) 7 seconds ago discourse_ose [root@osestaging1 discourse]#
- here's the current iptables template file; let's simplify it to an `exit 0` and see if we can fix the bootloop, then build from there to isolate the specific command causing the issue
[root@osestaging1 discourse]# cat templates/iptables.template.yml run: # - exec: # cmd: # - sudo apt-get install -y iptables - file: path: /etc/runit/1.d/000-iptables chmod: "+x" contents: | ################################################################################ # File: /etc/runit/1.d/000-iptables # Version: 0.1 # Purpose: installs & locks-down iptables # Author: Michael Altfield <michael@opensourceecology.org> # Created: 2019-11-26 # Updated: 2019-11-26 ################################################################################ #!/bin/bash sudo apt-get install -y iptables sudo iptables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT sudo iptables -A OUTPUT -m owner --uid-owner 100 -j ACCEPT sudo iptables -A OUTPUT -j DROP sudo ip6tables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT sudo ip6tables -A OUTPUT -m owner --uid-owner 100 -j ACCEPT sudo ip6tables -A OUTPUT -j DROP [root@osestaging1 discourse]#
- holy shit, just making it an "exit 0" still causes it to get stuck in a boot loop!
- If I remove the one iptables template line, it comes up fine. When it did, I checked the existing runit scripts in that dir for reference. None of them exit at all. Hmm
root@osestaging1-discourse-ose:/# ls -lah /etc/runit/1.d total 24K drwxr-xr-x. 1 root root 4.0K Nov 25 13:22 . drwxr-xr-x. 1 root root 4.0K Nov 25 13:22 .. -rwxr-xr-x. 1 root root 321 Oct 28 12:07 00-fix-var-logs -rwxr-xr-x. 1 root root 33 Oct 28 12:07 anacron -rwxr-xr-x. 1 root root 75 Oct 28 12:07 cleanup-pids root@osestaging1-discourse-ose:/# cat /etc/runit/1.d/00-fix-var-logs #!/bin/bash mkdir -p /var/log/nginx chown -R www-data:www-data /var/log/nginx chmod -R 644 /var/log/nginx chmod 755 /var/log/nginx touch /var/log/syslog && chown -f root:adm /var/log/syslog* touch /var/log/auth.log && chown -f root:adm /var/log/auth.log* touch /var/log/kern.log && chown -f root:adm /var/log/kern.log* root@osestaging1-discourse-ose:/# cat /etc/runit/1.d/anacron #!/bin/bash /usr/sbin/anacron -s root@osestaging1-discourse-ose:/# cat /etc/runit/1.d/cleanup-pids #!/bin/bash /bin/echo "Cleaning stale PID files" /bin/rm -f /var/run/*.pid root@osestaging1-discourse-ose:/#
- oh, duh, the issue was that my shebang wasn't the first line. Moving it above the comments fixed it. Also, I changed the name of the script to 01-iptables so it's executed after the Discourse script that fixes logging
- I got it working! Note that when I first entered the docker container, it said iptables didn't exist; I guess that's because it was still being installed on boot. Hopefully this happens early enough that it's not an issue, but sooner is better
[root@osestaging1 discourse]# ./launcher rebuild discourse_ose + /bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e UNICORN_WORKERS=2 -e DISCOURSE_HOSTNAME=discourse.opensourceecology.org -e DISCOURSE_DEVELOPER_EMAILS=osediscourse_2019@michaelaltfield.net -e DISCOURSE_SMTP_ADDRESS=172.17.0.1 -e DISCOURSE_SMTP_PORT=25 -e DISCOURSE_SMTP_AUTHENTICATION=none -e DISCOURSE_SMTP_OPENSSL_VERIFY_MODE=none -e DISCOURSE_SMTP_ENABLE_START_TLS=false -h osestaging1-discourse-ose --cap-add NET_ADMIN -e DOCKER_HOST_IP=172.17.0.1 --name discourse_ose -t -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:fc:97:b8:b4:0d local_discourse/discourse_ose /sbin/boot 26a9c6442aa6343fb0c7ec296a2d3824428d98c70d603bc59147bc96a7d2ea92 ... [root@osestaging1 discourse]# ./launcher enter discourse_ose root@osestaging1-discourse-ose:/# iptables -L bash: iptables: command not found root@osestaging1-discourse-ose:/# iptables -L Chain INPUT (policy ACCEPT) target prot opt source destination Chain FORWARD (policy ACCEPT) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination ACCEPT all -- anywhere anywhere owner UID match root ACCEPT all -- anywhere anywhere owner UID match _apt DROP all -- anywhere anywhere # Warning: iptables-legacy tables present, use iptables-legacy to see them root@osestaging1-discourse-ose:/# logout [root@osestaging1 discourse]#
- here's the template file now that worked for the above rebuild. It's the bare necessities and needs improvement. Currently all it does is block all users besides root and apt. I'll also probably want to add INPUT blocks.
[root@osestaging1 discourse]# cat templates/iptables.template.yml run: # - exec: # cmd: # - sudo apt-get install -y iptables - file: path: /etc/runit/1.d/01-iptables chmod: "+x" contents: | #!/bin/bash ################################################################################ # File: /etc/runit/1.d/01-iptables # Version: 0.1 # Purpose: installs & locks-down iptables # Author: Michael Altfield <michael@opensourceecology.org> # Created: 2019-11-26 # Updated: 2019-11-26 ################################################################################ sudo apt-get install -y iptables sudo iptables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT sudo iptables -A OUTPUT -m owner --uid-owner 100 -j ACCEPT sudo iptables -A OUTPUT -j DROP sudo ip6tables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT sudo ip6tables -A OUTPUT -m owner --uid-owner 100 -j ACCEPT sudo ip6tables -A OUTPUT -j DROP [root@osestaging1 discourse]#
- I re-added all the other templates and did a full rebuild. God damn, it takes so long! When it finished, I managed to catch it just during the iptables install, before the iptables rules were added, and after the iptables rules were added
[root@osestaging1 discourse]# ./launcher enter discourse_ose root@osestaging1-discourse-ose:/var/www/discourse# ps -ef UID PID PPID C STIME TTY TIME CMD root 1 0 0 12:59 pts/0 00:00:00 /bin/bash /sbin/boot root 6 1 0 12:59 pts/0 00:00:00 /bin/bash /etc/runit/1 root 7 6 0 12:59 pts/0 00:00:00 /bin/run-parts --verbose --exit-on-error /etc/runit/1.d root 29 7 0 12:59 pts/0 00:00:00 /bin/bash /etc/runit/1.d/01-iptables root 30 29 0 12:59 pts/0 00:00:00 sudo apt-get install -y iptables root 31 30 13 12:59 pts/0 00:00:01 apt-get install -y iptables root 40 31 5 12:59 pts/1 00:00:00 /usr/bin/dpkg --status-fd 17 --no-triggers --unpack --auto-deconfigure --recursive /tmp/apt-dpkg-install-I1JWzF root 49 0 1 12:59 pts/2 00:00:00 /bin/bash --login root 74 49 0 12:59 pts/2 00:00:00 ps -ef root@osestaging1-discourse-ose:/var/www/discourse# iptables -L Chain INPUT (policy ACCEPT) target prot opt source destination Chain FORWARD (policy ACCEPT) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination # Warning: iptables-legacy tables present, use iptables-legacy to see them root@osestaging1-discourse-ose:/var/www/discourse# iptables -L Chain INPUT (policy ACCEPT) target prot opt source destination Chain FORWARD (policy ACCEPT) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination ACCEPT all -- anywhere anywhere owner UID match root ACCEPT all -- anywhere anywhere owner UID match _apt DROP all -- anywhere anywhere # Warning: iptables-legacy tables present, use iptables-legacy to see them root@osestaging1-discourse-ose:/var/www/discourse#
- I found that the discourse site was broken (I got a 504 gateway time-out). The logs showed it failed to connect to redis. Ah, right, we need to whitelist loopback traffic
==> shared/standalone/log/rails/production.log <== Error connecting to Redis on localhost:6379 (Redis::TimeoutError) subscribe failed, reconnecting in 1 second. ...
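- The whitelist amounts to accepting loopback traffic in both stacks ahead of the DROP rules; these two rules (which are what eventually landed in the template shown under Tue Dec 17 above) cover it
sudo iptables -A OUTPUT -s 127.0.0.1/32 -d 127.0.0.1/32 -j ACCEPT
sudo ip6tables -A OUTPUT -s ::1/128 -d ::1/128 -j ACCEPT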
- I made some changes, but I found that even after I was accepting 100% of the packets, I still got a 502 bad gateway. I flushed iptables rules, and I *still* got the same error. This isn't an iptables issue
root@osestaging1-discourse-ose:~/backups/iptables# iptables -nvL Chain INPUT (policy ACCEPT 12 packets, 600 bytes) pkts bytes target prot opt in out source destination Chain FORWARD (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination 12 600 ACCEPT all -- * * 127.0.0.1 127.0.0.1 0 0 ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED 0 0 ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 owner UID match 0 0 0 ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 owner UID match 100 0 0 DROP all -- * * 0.0.0.0/0 0.0.0.0/0 # Warning: iptables-legacy tables present, use iptables-legacy to see them root@osestaging1-discourse-ose:~/backups/iptables# iptables -F root@osestaging1-discourse-ose:~/backups/iptables# iptables -nvL Chain INPUT (policy ACCEPT 18 packets, 900 bytes) pkts bytes target prot opt in out source destination Chain FORWARD (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination # Warning: iptables-legacy tables present, use iptables-legacy to see them root@osestaging1-discourse-ose:~/backups/iptables#
- we have double nginx in this stupid setup; one's on the docker host and one's in the docker container. Here's the config for the docker host that passes requests back to the container. It does so via a unix socket
[root@osestaging1 ~]# tail -n15 /etc/nginx/conf.d/discourse.opensourceecology.org.conf ################## # SEND TO DOCKER # ################## location / { proxy_pass http://unix:/var/discourse/shared/standalone/nginx.http.sock:; proxy_set_header Host $http_host; proxy_http_version 1.1; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-Proto https; proxy_set_header X-Real-IP $remote_addr; } } [root@osestaging1 ~]#
- it looks like the socket does exist on the docker container
[root@osestaging1 ~]# ls -lah /var/discourse/shared/standalone/ total 44K drwxr-xr-x. 11 root root 4.0K Dec 16 12:59 . drwxr-xr-x. 3 root root 4.0K Nov 7 11:27 .. drwxr-xr-x. 3 tgriffing 33 4.0K Nov 8 00:00 backups drwxr-xr-x. 4 root root 4.0K Nov 7 11:28 log srw-rw-rw-. 1 root root 0 Dec 16 12:59 nginx.http.sock drwxr-xr-x. 2 106 110 4.0K Nov 7 11:28 postgres_backup drwx------. 19 106 110 4.0K Dec 16 12:59 postgres_data drwxrwxr-x. 3 106 110 4.0K Dec 16 12:59 postgres_run drwxr-xr-x. 2 108 111 4.0K Dec 16 12:57 redis_data drwxr-xr-x. 4 root root 4.0K Nov 7 11:54 state drwxr-xr-x. 4 tgriffing 33 4.0K Dec 16 12:59 tmp drwxr-xr-x. 4 tgriffing 33 4.0K Nov 10 14:17 uploads [root@osestaging1 ~]#
- well, I was able to get *some* response from the nginx running on the docker container; it probably doesn't like my lack of User-Agent
[root@osestaging1 ~]# nc -U /var/discourse/shared/standalone/nginx.http.sock GET / <html> <head><title>403 Forbidden</title></head> <body> <center><h1>403 Forbidden</h1></center> <hr><center>nginx</center> </body> </html> Ncat: Broken pipe. [root@osestaging1 ~]#
- I checked the nginx logs for my above netcat request; yep, it's mod_sec's rule id = 920280 complaining that I didn't give a Host header.
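- Retrying with a minimal but well-formed request (Host header included) is a quick way to rule out rule 920280; a sketch against the same socket
printf 'GET / HTTP/1.1\r\nHost: discourse.opensourceecology.org\r\nConnection: close\r\n\r\n' | nc -U /var/discourse/shared/standalone/nginx.http.sock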
- If I access it in the browser, I see the errors report differently with the 502
==> /var/log/nginx/error.log <== 2019/12/16 13:30:03 [error] 85#85: *16 connect() failed (111: Connection refused) while connecting to upstream, client: 10.241.189.10, server: _, request: "GET / HTTP/1.1", upstr eam: "http://127.0.0.1:3000/", host: "discourse.opensourceecology.org"
- woah, one thing that stands out to me is that 10.241.189.10 ip address. Not sure that the docker container should ever see that ip address. I checked, and this ip address is *supposed* to be reserved for our production server
[root@osedev1 openvpn]# pwd /etc/openvpn [root@osedev1 openvpn]# cat ccd/hetzner2 ifconfig-push 10.241.189.10 255.255.255.255 [root@osedev1 openvpn]#
- but, curiously, I discovered that my laptop has this IP on the VPN
user@ose:~/openvpn$ ip address show tun0 7: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100 link/none inet 10.241.189.10 peer 10.241.189.9/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::4d25:623f:12d7:b047/64 scope link flags 800 valid_lft forever preferred_lft forever user@ose:~/openvpn$
- I'm not sure if this is the issue or not, but it should be addressed. I found this openvpn doc on ip addressing which states that ccd files for reserving ip addresses don't actually make reservations; if the vpn has already allocated the ip to another client, it can create issues. The solution they suggest is to not use the --server directive at all. For reference, here's our server directive in the openvpn server.conf file https://community.openvpn.net/openvpn/wiki/Concepts-Addressing
[root@osedev1 openvpn]# grep -E '^server' server.conf server 10.241.189.0 255.255.255.0 [root@osedev1 openvpn]#
- so it looks like, to make this a non-issue, I'd have to strike 'server' and instead replace it with all of its constituent parts
- I really wish there were some way to get openvpn to print me the expanded form of my current config, but there doesn't appear to be an equivalent form of `postconf` in openvpn land.
- First question: what topology am I using? subnet, net30, or p2p? It seems like net30 is deprecated and p2p doesn't allow multiple clients, so I'm guessing we're using a "subnet" topology (though, notably, the PUSH_REPLY in the Dec 02 client log below shows 'topology net30', which is openvpn's default for tun when no topology directive is set) https://community.openvpn.net/openvpn/wiki/Concepts-Addressing
- the second "examples" section in the above doc has what we probably want https://community.openvpn.net/openvpn/wiki/Concepts-Addressing#Examples
mode server tls-server push "topology subnet" dev tun ifconfig 10.241.189.1 255.255.255.0 # we reserve the first 50 addresses for static IPs for servers; see 'ccd' dir ifconfig-pool 10.241.189.50 10.241.189.253 255.255.255.0 push "route-gateway 10.241.189.1"
- the above options (and many variations) didn't work. I kept getting a "-1" broadcast address. Note that I was forced to remove the broadcast address from the 'ifconfig-pool' line because openvpn said it was only valid for tap interfaces
... Mon Dec 16 22:00:46 2019 /sbin/ip addr add dev tun0 10.241.189.50/-1 broadcast 255.255.255.254 Error: inet prefix is expected rather than "10.241.189.50/-1". Mon Dec 16 22:00:46 2019 Linux ip addr add failed: external program exited with error status: 1 Mon Dec 16 22:00:46 2019 Exiting due to fatal error user@ose:~/openvpn$
- I changed it to p2p, and then I couldn't ping the gateway.
- fuck it, I reset it back to just 'server', and made a point to connect to the vpn from the staging & prod servers *before* my laptop.
- anyway, back to Discourse. Now I see this in my logs. Note that the IP switched to .50, which is my laptop--not the ose prod server (which was the conflict on .10 before)
==> /var/log/nginx/error.log <== 2019/12/16 17:11:47 [error] 85#85: *29 connect() failed (111: Connection refused) while connecting to upstream, client: 10.241.189.50, server: _, request: "GET / HTTP/1.1", upstream: "http://127.0.0.1:3000/", host: "discourse.opensourceecology.org"
- it's late and I have a meeting with Marcin tomorrow on VPN training. For now I'm just going to comment-out the iptables template line and rebuild the discourse app so it actually works, so Marcin can play with it after he's set up on the VPN tomorrow.
- after waiting 10 fucking minutes, I was informed the rebuild failed
[root@osestaging1 discourse]# time ./launcher rebuild discourse_ose ... starting up existing container ... + /bin/docker start discourse_ose Error response from daemon: container is marked for removal and cannot be started Error: failed to start containers: discourse_ose ... real 10m40.584s user 0m2.099s sys 0m1.925s [root@osestaging1 discourse]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES [root@osestaging1 discourse]#
- I ran it again. Waited 10 fucking minutes. It failed again with the same error.
- while it ran, I did some research which suggested that my iptables rules could have been what was causing nginx's 502 gateway error when it tried to connect to the rails app. So I should check the rails logs.
- I've read that one fix for the above error is to just restart the server or restart the docker service https://meta.discourse.org/t/upgrade-running-in-docker-fails/67370/4
- I tried to rebuild it using the iptables template uncommented, but it failed again with the same issue. That's 30 minutes to run 3 simple tests ffs
- I tried restarting docker and doing a `./launcher start discourse_ose`, but I got the same error
- fucking hell. I restarted the fucking server (what the hell are we going to do about this in prod?)
- that didn't fix it. Still getting "container is marked for removal and cannot be started"
- at the same time, I saw this pop into the logs
Dec 16 17:56:38 osestaging1 containerd[344]: time="2019-12-16T17:56:38.147034288Z" level=info msg="shim containerd-shim started" address="/containerd-shim/moby/87f90aca3545e7c1d5990adb2d9cced4ec1c0c5a846b91746d0ffea2669dcb88/shim.sock" debug=false pid=2844 Dec 16 17:56:38 osestaging1 dockerd[346]: time="2019-12-16T17:56:38Z" level=error msg="failed to disable IPv6 forwarding for container's interface all: open /proc/sys/net/ipv6/conf/all/disable_ipv6: read-only file system" Dec 16 17:56:38 osestaging1 dockerd[346]: time="2019-12-16T17:56:38.583054692Z" level=warning msg="Failed to disable IPv6 on all interfaces on network namespace \"/var/run/docker/netns/ac0b9c2c9581\": reexec to set IPv6 failed: exit status 4" Dec 16 17:56:38 osestaging1 containerd[344]: time="2019-12-16T17:56:38.859982529Z" level=info msg="shim reaped" id=87f90aca3545e7c1d5990adb2d9cced4ec1c0c5a846b91746d0ffea2669dcb88 Dec 16 17:56:38 osestaging1 dockerd[346]: time="2019-12-16T17:56:38.869984487Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete" Dec 16 17:56:38 osestaging1 dockerd[346]: time="2019-12-16T17:56:38.969781713Z" level=warning msg="87f90aca3545e7c1d5990adb2d9cced4ec1c0c5a846b91746d0ffea2669dcb88 cleanup: failed to unmount IPC: umount /var/lib/docker/containers/87f90aca3545e7c1d5990adb2d9cced4ec1c0c5a846b91746d0ffea2669dcb88/mounts/shm, flags: 0x2: no such file or directory"
- fucking hell. there will be no discourse tomorrow because docker is a piece of shit.
Fri Dec 13, 2019
- made 2x changes to OSSEC
- Added local rule to not send email alerts on "body.xml:1: parser error : Document labelled UTF-16 but has UTF-8 content", which were triggering email alerts at least every few days
<rule id="100056" level="2"> <if_sid>1002</if_sid> <match>Document labelled UTF-16 but has UTF-8 content</match> <description>ignore document unicode encoding issues as they're high in number</description> </rule>
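- for reference, local rules like this typically live in /var/ossec/rules/local_rules.xml (assuming the stock OSSEC layout), and OSSEC needs a restart to pick them up:
/var/ossec/bin/ossec-control restart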
- Added nodiff tags to the syscheck section in the main ossec.conf config file to prevent diffing of our https private keys in emails (note these emails are already encrypted, but still)
<nodiff>/etc/letsencrypt/live/</nodiff> <nodiff>/root/backups/*.key</nodiff>
Thr Dec 12, 2019
- Marcin set a meeting time for our VPN training on Tuesday of next week
- ...
- I tried to ssh into osestaging1, but I couldn't connect
- I checked the staging container on osedev1, and I discovered that it's not connected to the VPN. Oh, right, it wouldn't be able to connect without a 2FA token; I guess I have to create exceptions for some clients
- Unfortunately, it's non-trivial to make exceptions for some vpn clients that don't need 2FA. For one, the 2FA auth occurs before the cert auth. For another, the cert auth isn't really tied to a username.
- There is a nullok option to permit users who haven't set up 2FA to login, but that would put the decision in the user's domain. We don't, for example, want some devs with bad passwords disabling their 2FA on their own
- I found a guide for this ask, but it's specific to the non-free OpenVPN Access Server https://forums.openvpn.net/viewtopic.php?t=15366
- one option is that we don't disable 2FA, but rather we just have to manually enter it. I found a couple of important client options to prevent disconnects; since the totp password changes every 30 seconds, the defaults don't make sense https://github.com/evgeny-gridasov/openvpn-otp
auth-nocache reneg-sec 0
- I discovered some other solutions for customizing the 2FA token prompt using "authtok_prompt" and/or "static-challenge". They didn't work, perhaps because they'd require compiling OpenVPN from source https://github.com/google/google-authenticator-libpam/issues/112
- I don't think I'm going to find a solution for requiring 2fa conditionally per user without RADIUS or paid OpenVPN, unfortunately. We should be able to store the private key on the server and generate the tokens as needed.
- I know, I know. It's not a distinct factor. But the only reason we're doing this is because we can't disable 2FA for this user. Anyway.
- Let's find a tool that will output tokens and not just generate the secret keys & qr code for seeding a 2FA app
[root@osestaging1 ~]# yum search totp | grep oath gen-oath-safe.noarch : Script for generating HOTP/TOTP keys (and QR code) [root@osestaging1 ~]# yum search google | grep -i authenticat python2-certbot-dns-google.noarch : Google Cloud DNS Authenticator plugin for google-authenticator.x86_64 : One-time pass-code support using open standards [root@osestaging1 ~]#
- I installed gen-oath-safe
[root@osestaging1 ~]# yum install gen-oath-safe ... Installed: gen-oath-safe.noarch 0:0.10.1-3.el7 Dependency Installed: caca-utils.x86_64 0:0.99-0.17.beta17.el7 freeglut.x86_64 0:3.0.0-8.el7 imlib2.x86_64 0:1.4.5-9.el7 libcaca.x86_64 0:0.99-0.17.beta17.el7 mesa-libGLU.x86_64 0:9.0.0-4.el7 qrencode.x86_64 0:3.4.1-3.el7 slang.x86_64 0:2.2.4-11.el7 Complete! [root@osestaging1 ~]#
- ffs, gen-oath-safe expects the secret in hex format, but google-authenticator outputs it in base32 format.
[root@osestaging1 ~]# gen-oath-safe osestaging1 totp "OBFUSCATED0" ERROR: Invalid secret, must be hex encoded. [root@osestaging1 ~]#
- there's no base32 binary in the yum repos like there is for base64. But there's a relevant perl module at least
[root@osestaging1 ~]# yum search base32 Loaded plugins: fastestmirror, replace Loading mirror speeds from cached hostfile * base: mirror.alpix.eu * epel: mirror.23media.com * extras: mirror.softaculous.com * updates: mirror.fra10.de.leaseweb.net * webtatic: uk.repo.webtatic.com N/S matched: base32 =================== perl-Convert-Base32.noarch : Encoding and decoding of base32 strings perl-MIME-Base32.noarch : Base32 encoder / decoder rubygem-base32.noarch : Ruby extension for base32 encoding and decoding rubygem-base32-doc.noarch : Documentation for rubygem-base32 Name and summary matches only, use "search all" for everything. [root@osestaging1 ~]# yum install perl-MIME-Base32
- I made a dumb one-liner to decode base32
[root@osestaging1 ~]# perl -M"MIME::Base32 qw( RFC )" -e'print MIME::Base32::decode( "1" )'; 1[root@osestaging1 ~]#
- wait, actually, that's not right. "1" isn't actually defined in base32. And "A" in base32 should be "0" https://en.wikipedia.org/wiki/Base32
[root@osestaging1 ~]# perl -M"MIME::Base32 qw( RFC )" -e'print MIME::Base32::decode( "A" )'; [root@osestaging1 ~]#
- I think it's outputting binary data. Attempting to printf hex doesn't help.
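- in hindsight, piping the decoded raw bytes through xxd should yield the hex string that gen-oath-safe wants; a sketch building on the perl one-liner above (assuming xxd is installed; "OBFUSCATED0" is the placeholder secret):
perl -M"MIME::Base32 qw( RFC )" -e'print MIME::Base32::decode("OBFUSCATED0")' | xxd -p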
- python gave me padding errors, no matter how many "=" I iteratively added to the end
- after wasting hours trying to convert the damn base32 to hex, I eventually used an online tool (not safe) at least as a sanity check. It gave me back a hex string that gen-oath-safe accepted, but gen-oath-safe didn't give me a token! It just spat out the secrets at me again in b32 & hex + a qr code!
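- the python padding errors, for what it's worth, are because base32 strings must be padded with "=" to a multiple of 8 characters; a sketch of a python3 one-liner that pads and converts to hex (assuming python3 is available; secret is a placeholder):
python3 -c 'import base64,sys; s=sys.argv[1].upper(); print(base64.b32decode(s + "=" * (-len(s) % 8)).hex())' OBFUSCATED0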
- fucking hell, I'm back at square one. It's really not hard to implement a damn 2fa token generator on the desktop...
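- note for next time: if I recall correctly, oathtool (from the oath-toolkit project; there's an oathtool package in epel) can generate TOTP tokens directly from a base32 secret, which would skip the hex conversion entirely; a sketch (secret is a placeholder):
oathtool --totp -b "OBFUSCATED0"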
Thr Dec 05, 2019
- Looks like Marcin is delegating adding software to OSE Linux, but I'm afraid we hit a space limit last time Chris tried to do this. I always felt that it made more sense for OSE Linux to be a proper Debian/Ubuntu variant rather than a live ISO. That would solve the live iso byte limits and the persistence issues.
I wasn't asked and I'm not an authority, but I'm going to add my $0.02: Chris hit a size issue with the last build where we reached a byte limit of a live iso when trying to add additional software to OSE Linux. And there's also issues with persistence on a live distro, in general. I've always wondered if it makes more sense to run an Ubuntu (or debian?) downstream distro/flavour/variant instead of a live distro * https://ubuntu.com/download/flavours * https://en.wikipedia.org/wiki/Category:Debian-based_distributions And/or our own OSE apt repository such that we pin the versions and update them on some standardized release cycle. * https://wiki.debian.org/DebianRepository#Set_up_and_maintain_a_repository Honestly, the live OSE distro always felt like a hack, and creating our own repo always seemed more--eh--apt. This would solve any iso disk space limitations, persistence issues, and probably be easier to maintain & iterate-on in the future. But, again, I'm no authority here. I've never maintained a publicly-accessible debian repository or built a live distro. My experience with OSE Linux is extremely limited. Also, January 25th is a tight deadline, so please don't let this email derail your effort. Only consider it if you hit the space limitation as Chris did. This is a longer-term discussion. Cheers, Michael Altfield Senior System Administrator PGP Fingerprint: 8A4B 0AF8 162F 3B6A 79B7 70D2 AA3E DF71 60E2 D97B Open Source Ecology www.opensourceecology.org
Wed Dec 04, 2019
- Backblaze got back to me stating that they currently don't support append-only, but they'd forward my request for this feature to their development team
Hello Michael, Unfortunately, we do not have an append-only feature implemented into our application keys but that is definitely something we can explore and potentially implement. I will forward that suggestion to our developers however you are correct. If a malicious actor were to gain access to both the keyID/application key they could overwrite file versions by listing the contents of a bucket and then writing junk files with the same file names found within the bucket. Unfortunately, at this time we do not a timeline on when a possible append-only permission will be available for application keys. Regards, --Brad The Backblaze Team
- Marcin confirmed that he hasn't yet made a copy of one of our backups stored off-site at FeF :(
- ...
- Marcin responded to my VPN Training email, including that his laptop won't boot so he's using Catarina's, but that he doesn't have sudo access on it. I sent him links on how to boot to recovery and/or single-user mode to reset the root password on the machine
- I also made it clear that, in addition to having sudo access to his workstation and generating a CSR before our meeting, he'd also need access to his personal keepass, his 2FA app on his phone, and his ssh private key.
Tue Dec 03, 2019
- Well, my "Append-only" article on wikipedia was rejected for lack of cited sources (even though I cited man pages, business blog posts on their usage of append-only in their data structures, and a couple academic papers describing append-only solutions to ransomware). It was suggested that I add this to existing articles, such as the "File Attribute" article https://en.wikipedia.org/wiki/Draft:Append-only
- ...
- Backblaze got back to me and said that B2's file versioning system is essentially immature, as is the entire industry's
Hello, Thanks for contacting Backblaze support. With the lifecycle rules set as you have them, "daysFromUploadingToHiding 364" and "daysFromHidingToDeleting 1" the expected behavior of when a file version is uploaded, is to hide the file after 364 days and to delete file versions after 24 hours. Unfortunately, there is no way to avoid creating new file versions other than renaming the file upon upload to b2 with an edited file name. No application key permission changes will address this if the key has write permission. If a file is uploaded to b2 with the filename example.file any further example.file('s) will be treated as a file version, the previous version will be hidden and lifecycle rules will be applied. If the file has a server name appended to it such as "exampleservername.file" then it will be treated differently than example.file and retained as a separate file. Additionally adding a date stamp to the file can also increase the amount of retained versions, i.e. "exampleservername12022019.file" or if multiple versions per day will be added then a time and date stamp can be used, "exampleservername14162212022019.file. A simpler method may be to increase the dayFromHidingToDeleting to an amount greater than 1 day to allow for a file review to ensure no version deletions are made without review. The issue at hand is that B2 does not have a more robust file version control method built into the storage. Even Amazon S3 does not have this and either a manual/scripted solution with appended file names must be used or a 3rd party Integration with more featured file versioning must be implemented. Regards, --Brad The Backblaze Team
- I responded, reiterating the need for "append-only" for many org's defenses to ransomware, and asking if anything was being done about it. I CC'd Marcin.
Hi Brad, thanks for your response! > adding a date stamp to the file can also increase the amount of > retained versions, i.e. "exampleservername12022019.file" or if multiple > versions per day will be added then a time and date stamp can be used, > "exampleservername14162212022019.file. To be clear, the use-case here is an "append-only" ACL such that an attacker that's compromised my machine where the keys' live (ie: ransomware) does not have permission to overwrite existing data in our bucket. Or, in the case specific to Backblaze B2, the key would also need to not have permission to make existing files hidden. Under this threat model, I cannot prevent the malicious actor who stole our application key from just being nice and appending some string to or existing files to prevent the old ones from being hidden! > The issue at hand is that B2 does not have a more robust file version > control method built into the storage. Are there any improvements to B2 in the works that would fix this issue or otherwise provide "append-only" ACL permissions to application keys in conjunction with lifecycle rules? If so, is there an ETA on when I can expect to be able to set "append-only" permissions to a given application key in conjunction with lifecycle rules? With the number of ransomware attacks that occurred in 2019 such that the attacker literally encrypted all the victims' servers *and* deleted the victims' backups, append-only permissions have become a critical component to many organization's backup solutions. If Backblaze fully supported append-only permissions to application keys with lifecycle rules, it would certainly attract many of these customers who were victim to ransomware. ...Or loose customers to a solution that *does* offer append-only permissions, such as BorgBase or Wasabi. * https://www.borgbase.com/ * https://wasabi.com/blog/use-immutable-storage/ Please let me know when I can expect to be able to use set "append-only" permissions to an application key in conjunction with lifecycle rules. Michael Altfield Senior System Administrator PGP Fingerprint: 8A4B 0AF8 162F 3B6A 79B7 70D2 AA3E DF71 60E2 D97B Open Source Ecology www.opensourceecology.org
- meanwhile, our monthly backup report email came into my inbox; the first of this month's backups was OK, but last night's was not. Did I break backups? The log shows the command was about to be executed, but it failed with an access control issue
[root@opensourceecology ~]# head /var/log/backups/backup.log-20191202 ================================================================================ INFO: Beginning Backup Run on 20191202_072001 INFO: Cleaning up old backup files INFO: Beginning to backup mysql databases real 1m27.779s user 1m42.149s sys 0m1.578s INFO: Beginning to backup server's files INFO: /etc [root@opensourceecology ~]# tail /var/log/backups/backup.log-20191202 user 8m25.247s sys 0m13.917s INFO: Deleting unencrypted backup archive INFO: moving encrypted backup file to b2user's sync dir INFO: Beginning upload to backblaze b2 ERROR: Application key is restricted to bucket: ose-server-backups real 0m0.191s user 0m0.164s sys 0m0.025s [root@opensourceecology ~]#
- But it looks like our most recent backup succeeded, so there's no issue here
[root@opensourceecology ~]# head /var/log/backups/backup.log ================================================================================ INFO: Beginning Backup Run on 20191203_072001 INFO: Cleaning up old backup files INFO: Beginning to backup mysql databases real 1m25.295s user 1m40.124s sys 0m1.501s INFO: Beginning to backup server's files INFO: /etc [root@opensourceecology ~]# tail /var/log/backups/backup.log "action": "upload", "fileId": "4_z5605817c251dadb96e4d0118_f205e5e6c6b206f16_d20191203_m074425_c001_v0001113_t0059", "fileName": "daily_hetzner2_20191203_072001.tar.gpg", "size": 18670808915, "uploadTimestamp": 1575359065000 } real 175m47.708s user 5m14.135s sys 0m56.405s [root@opensourceecology ~]#
- I sent an email to Marcin telling him not to worry about the missing backup in the emailed backup report, and I also asked if Marcin was ever successfully able to download one copy of our backups to store safely on a disk offline at FeF.
- ...
- OK, back to OpenVPN 2FA. Unfortunately, it looks like I can't change the name of the prompt "Enter Auth Password: " to something like "Enter OTP Token: " https://openvpn.net/community-resources/reference-manual-for-openvpn-2-0/
- I began documenting this process. I figured I'd use `easy-rsa` for the OSE dev when they generate a certificate and certificate signing request, but I quickly found that easy-rsa isn't easy. At least not in a way that's easy & robust for me to document for users. I started with this:
sudo apt-get install openvpn openresolv easy-rsa cd $HOME mkdir openvpn cd /usr/share/easy-rsa source vars KEY_DIR=$HOME/openvpn KEY_CONFIG=/etc/ssl/openssl.cnf # inputs echo -n "Enter your two-digit country code: "; read KEY_COUNTRY echo -n "Enter your state/province: "; read KEY_PROVINCE echo -n "Enter your city: "; read KEY_CITY echo -n "Enter the name of your organization: "; read KEY_ORG echo -n "Enter your email address: "; read KEY_EMAIL # generate certificate request ./build-req `whoami`
- but then I got an error that the KEY_CONFIG isn't easy-rsa specific. Well, this is a problem, as the openssl.cnf files provided by easy-rsa don't include one for OpenSSL 1.1.0l, the current version of openssl installed by debian. And why can't easy-rsa make this easy by automatically figuring out which one to use, anyway?
user@ose:/usr/share/easy-rsa$ sudo find / | grep -i openssl | grep -i cnf /usr/lib/ssl/openssl.cnf /usr/share/easy-rsa/openssl-0.9.6.cnf /usr/share/easy-rsa/whichopensslcnf /usr/share/easy-rsa/openssl-0.9.8.cnf /usr/share/easy-rsa/openssl-1.0.0.cnf /usr/share/doc/openvpn/examples/sample-keys/openssl.cnf /etc/ssl/openssl.cnf user@ose:/usr/share/easy-rsa$ user@ose:/usr/share/easy-rsa$ openssl version OpenSSL 1.1.0l 10 Sep 2019 user@ose:/usr/share/easy-rsa$
- It might actually be easier to write this documentation for the user to use `openssl` instead of `easy-rsa`
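- for comparison, a minimal sketch of the openssl equivalent (filenames & subject are my assumptions, not the final documented procedure):
# generate an encrypted 4096-bit private key
openssl genrsa -aes256 -out "$HOME/openvpn/$(whoami).key" 4096
# generate a certificate signing request from that key
openssl req -new -key "$HOME/openvpn/$(whoami).key" -out "$HOME/openvpn/$(whoami).csr" -subj "/CN=$(whoami)"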
- I finished the documentation for both
- The OSE Developer requesting VPN access and https://wiki.opensourceecology.org/wiki/VPN#Developers:_How_to_request_access_to_the_dev_VPN
- The OSE SysAdmin granting VPN access https://wiki.opensourceecology.org/wiki/VPN#Sysadmin:_How_to_grant_access_to_the_dev_VPN
- I sent an email to Marcin asking if he'd be free sometime this week for a meeting to set him up & train him on connecting to the VPN so he can access our staging sites (including our Discourse POC site)
- I sent an email to Marcin as a status update on the Discourse POC
- I updated my TODO list on the OSE Server article https://wiki.opensourceecology.org/wiki/OSE_Server#TODO
Mon Dec 02, 2019
- I created a new key in Backblaze B2 with the name 'prod-append-only'. This will be an append-only key such that our production server can put new data in our backblaze b2 bucket, but it cannot overwrite or delete existing backups. This is to prevent our box from having the capacity to delete our golden backups in the event that it gets hacked by, for example, ransomware.
- unfortunately, while the wui lists a ton of key permissions, you can't actually granularly control them when creating keys in the wui. I could only set "read-only", "write-only", and "read-write". If I set "write-only", I get this
deleteFiles, listBuckets, writeFiles
- of course, that's not what we want. "append-only" is distinct from "write-only" in that we want to be damn sure that it can add *new* files (or, in the filesystem sense, appending to existing files is OK), but not be able to delete existing files or overwrite existing files' data.
- this backblaze b2 documentation gives more info on application keys and their permissions. Let's see if we can remove "deleteFiles" via the cli's `b2_create_key` and then test to make sure it can't overwrite existing files https://www.backblaze.com/b2/docs/application_keys.html
- ok, so the commands are a bit different for the `b2` python cli tool than the api documentation; here's a list of the existing keys
[b2user@opensourceecology ~]$ ~/virtualenv/bin/b2 --help This program provides command-line access to the B2 service. Usages: b2 authorize-account [<accountId>] [<applicationKey>] b2 cancel-all-unfinished-large-files <bucketName> b2 cancel-large-file <fileId> b2 clear-account b2 create-bucket [--bucketInfo <json>] [--corsRules <json>] [--lifecycleRules <json>] <bucketName> [allPublic | allPrivate] b2 create-key [--duration <validDurationSeconds>] [--bucket <bucketName>] [--namePrefix <namePrefix>] <keyName> <capabilities> b2 delete-bucket <bucketName> b2 delete-file-version [<fileName>] <fileId> b2 delete-key <applicationKeyId> b2 download-file-by-id [--noProgress] <fileId> <localFileName> b2 download-file-by-name [--noProgress] <bucketName> <fileName> <localFileName> b2 get-account-info b2 get-bucket [--showSize] <bucketName> b2 get-download-auth [--prefix <fileNamePrefix>] [--duration <durationInSeconds>] <bucketName> b2 get-download-url-with-auth [--duration <durationInSeconds>] <bucketName> <fileName> b2 get-file-info <fileId> b2 help [commandName] b2 hide-file <bucketName> <fileName> b2 list-buckets b2 list-file-names <bucketName> [<startFileName>] [<maxToShow>] b2 list-file-versions <bucketName> [<startFileName>] [<startFileId>] [<maxToShow>] b2 list-keys b2 list-parts <largeFileId> b2 list-unfinished-large-files <bucketName> b2 ls [--long] [--versions] [--recursive] <bucketName> [<folderName>] b2 make-url <fileId> b2 sync [--delete] [--keepDays N] [--skipNewer] [--replaceNewer] \ [--compareVersions <option>] [--compareThreshold N] \ [--threads N] [--noProgress] [--dryRun ] [--allowEmptySource ] \ [--excludeRegex <regex> [--includeRegex <regex>]] \ [--excludeDirRegex <regex>] \ <source> <destination> b2 update-bucket [--bucketInfo <json>] [--corsRules <json>] [--lifecycleRules <json>] <bucketName> [allPublic | allPrivate] b2 upload-file [--sha1 <sha1sum>] [--contentType <contentType>] \ [--info <key>=<value>]* [--minPartSize N] \ [--noProgress] [--threads N] <bucketName> <localFilePath> <b2FileName> b2 version The environment variable B2_ACCOUNT_INFO specifies the sqlite file to use for caching authentication information. The default file to use is: ~/.b2_account_info For more details on one command: b2 help <command> When authorizing with application keys, this tool requires that the key have the 'listBuckets' capability so that it can take the bucket names you provide on the command line and translate them into bucket IDs for the B2 Storage service. Each different command may required additional capabilities. You can find the details for each command in the help for that command. [b2user@opensourceecology ~]$ [b2user@opensourceecology ~]$ ~/virtualenv/bin/b2 list-keys OBFUCATED1 dev OBFUCATED2 prod-append-only [b2user@opensourceecology ~]$
- I was successfully able to delete the key I just made in the wui and create a new one with "writeFiles" only
[b2user@opensourceecology ~]$ ~/virtualenv/bin/b2 delete-key OBFUCATED2 OBFUCATED2 [b2user@opensourceecology ~]$ ~/virtualenv/bin/b2 create-key --bucket 'ose-server-backups' 'prod-append-only' 'writeFiles' OBFUSCATED3 OBFUSCATEDSECRETKEY3 [b2user@opensourceecology ~]$
- There doesn't appear to be a way to query a key and get its permissions on the CLI, but a quick refresh of the 'secure.backblaze.com/app_keys.htm' page reflects that the old key is gone and the new one only has the 'writeFiles' permission. Nice!
- now for the test: first I made a backup of the existing creds, then cleared them and re-added the creds for the new application key
[b2user@opensourceecology ~]$ cp .b2_account_info .b2_account_info.master [b2user@opensourceecology ~]$ [b2user@opensourceecology ~]$ ~/virtualenv/bin/b2 clear-account --help b2 clear-account Erases everything in ~/.b2_account_info. Location of file can be overridden by setting B2_ACCOUNT_INFO. [b2user@opensourceecology ~]$ ~/virtualenv/bin/b2 clear-account [b2user@opensourceecology ~]$
- ugh, I got yelled at that listBuckets is required. I guess that's not *too* bad
[b2user@opensourceecology ~]$ ~/virtualenv/bin/b2 authorize-account 'OBFUSCATED3' 'OBFUSCATEDSECRETKEY3' Using https://api.backblazeb2.com ERROR: application key has no listBuckets capability, which is required for the b2 command-line tool [b2user@opensourceecology ~]$
- ok, I deleted the old one (not shown) and added a new one with listBuckets and writeFiles
[b2user@opensourceecology ~]$ ~/virtualenv/bin/b2 authorize-account 'OBFUSCATED4' 'OBFUSCATEDSECRETKEY4' Using https://api.backblazeb2.com [b2user@opensourceecology ~]$ ~/virtualenv/bin/b2 get-account-info { "accountAuthToken": "OBFFUSCATED", "accountId": "OBFFUSCATED", "allowed": { "bucketId": "OBFFUSCATED", "bucketName": "ose-server-backups", "capabilities": [ "listBuckets", "writeFiles" ], "namePrefix": null }, "apiUrl": "https://api001.backblazeb2.com", "applicationKey": "OBFFUSCATED", "downloadUrl": "https://f001.backblazeb2.com" } [b2user@opensourceecology ~]$
- I tried an ls, but it stupidly gave me an error suggesting that I was trying to list a bucket I didn't have access to. The bucket is right, but I don't have list permissions on it. Anyway..
[b2user@opensourceecology ~]$ ~/virtualenv/bin/b2 ls 'ose-server-backups' ERROR: Application key is restricted to bucket: ose-server-backups [b2user@opensourceecology ~]$
- son of a bitch, I can't upload either. Note that the bucket *is* correct. what gives?
[b2user@opensourceecology tmp]$ ~/virtualenv/bin/b2 upload-file 'ose-server-backups' test.txt test.txt ERROR: Application key is restricted to bucket: ose-server-backups [b2user@opensourceecology tmp]$
- this could maybe be related to this bug, which says it was fixed in b2 version 1.3.4. I'm using 1.3.3. https://github.com/Backblaze/B2_Command_Line_Tool/issues/485
[b2user@opensourceecology tmp]$ ~/virtualenv/bin/b2 version b2 command line tool, version 1.3.3 [b2user@opensourceecology tmp]$
- when I first installed b2 I broke our site because it broke our `certbot` tool; hopefully it's better now that it's in a virtualenv for the b2 user and not OS-level. Anyway, I was able to update to 1.4.3 within the virtualenv https://wiki.opensourceecology.org/wiki/Backblaze#Install_CLI
[b2user@opensourceecology ~]$ source ~/virtualenv/bin/activate (virtualenv) [b2user@opensourceecology ~]$ cd ~/sandbox/B2_Command_Line_Tool/ (virtualenv) [b2user@opensourceecology B2_Command_Line_Tool]$ git pull ... (virtualenv) [b2user@opensourceecology B2_Command_Line_Tool]$ python setup.py install ... (virtualenv) [b2user@opensourceecology B2_Command_Line_Tool]$ b2 version b2 command line tool, version 1.4.3 (virtualenv) [b2user@opensourceecology B2_Command_Line_Tool]$
- I'm still getting the same stupid error, though. I'm literally typing the bucket name that it says I'm restricted to. wtf?
(virtualenv) [b2user@opensourceecology B2_Command_Line_Tool]$ b2 ls ose-server-backups ERROR: unauthorized for application key with capabilities 'listBuckets,writeFiles', restricted to bucket 'ose-server-backups' (unauthorized) (virtualenv) [b2user@opensourceecology B2_Command_Line_Tool]$
- oh, duh, this error is different from before in that it says the capabilities don't match. I can't do an `ls` because I don't have 'listFiles'. what about an upload?
- sweet, that worked! I confirmed the file's existence in the wui too.
[b2user@opensourceecology ~]$ mkdir tmp [b2user@opensourceecology ~]$ cd tmp [b2user@opensourceecology tmp]$ echo 'test0' > test.txt [b2user@opensourceecology tmp]$ ~/virtualenv/bin/b2 upload-file 'ose-server-backups' test.txt test.txt test.txt: 100%|| 6.00/6.00 [00:01<00:00, 4.43B/s] URL by file name: https://f001.backblazeb2.com/file/ose-server-backups/test.txt URL by fileId: https://f001.backblazeb2.com/b2api/v2/b2_download_file_by_id?fileId=4_z5605817c251dadb96e4d0118_f118d984fa5fe76bd_d20191202_m080016_c001_v0001131_t0058 { "action": "upload", "fileId": "4_z5605817c251dadb96e4d0118_f118d984fa5fe76bd_d20191202_m080016_c001_v0001131_t0058", "fileName": "test.txt", "size": 6, "uploadTimestamp": 1575273616000 } [b2user@opensourceecology tmp]$
- now let's download the file. cool, it fails because I can't read. that's fine.
[b2user@opensourceecology tmp]$ mkdir restore [b2user@opensourceecology tmp]$ cd restore/ [b2user@opensourceecology restore]$ ls [b2user@opensourceecology restore]$ ~/virtualenv/bin/b2 download-file-by-name 'ose-server-backups' test.txt test.txt ERROR: unauthorized for application key with capabilities 'listBuckets,writeFiles', restricted to bucket 'ose-server-backups' (unauthorized) [b2user@opensourceecology restore]$
- and just to be sure: this key can't delete, right? Nope, good. Note that there's no 'delete-file-by-name'; we have to use the 'delete-file-version' https://www.backblaze.com/b2/docs/b2_delete_file_version.html
[b2user@opensourceecology restore]$ ~/virtualenv/bin/b2 delete-file-version 'test.txt' '4_z5605817c251dadb96e4d0118_f118d984fa5fe76bd_d20191202_m080016_c001_v0001131_t0058' ERROR: unauthorized for application key with capabilities 'listBuckets,writeFiles', restricted to bucket 'ose-server-backups' (unauthorized) [b2user@opensourceecology restore]$
- the file's contents should be 'test0'. Let's see if our 'append-only' key has the ability to overwrite data by re-uploading the file with different contents ('test1')
[b2user@opensourceecology restore]$ ls [b2user@opensourceecology restore]$ cd .. [b2user@opensourceecology tmp]$ ls restore test.txt [b2user@opensourceecology tmp]$ cat test.txt test0 [b2user@opensourceecology tmp]$ echo "test1" > test.txt [b2user@opensourceecology tmp]$ cat test.txt test1 [b2user@opensourceecology tmp]$ ~/virtualenv/bin/b2 upload-file 'ose-server-backups' test.txt test.txt test.txt: 100%|| 6.00/6.00 [00:01<00:00, 4.87B/s] URL by file name: https://f001.backblazeb2.com/file/ose-server-backups/test.txt URL by fileId: https://f001.backblazeb2.com/b2api/v2/b2_download_file_by_id?fileId=4_z5605817c251dadb96e4d0118_f1157fdf57c59dad1_d20191202_m080938_c001_v0001039_t0057 { "action": "upload", "fileId": "4_z5605817c251dadb96e4d0118_f1157fdf57c59dad1_d20191202_m080938_c001_v0001039_t0057", "fileName": "test.txt", "size": 6, "uploadTimestamp": 1575274178000 } [b2user@opensourceecology tmp]$
- now to validate, I hop back to another application key with more permissions. And, damn, it looks like the file got overwritten.
[b2user@opensourceecology tmp]$ cd restore/ [b2user@opensourceecology restore]$ ls [b2user@opensourceecology restore]$ ~/virtualenv/bin/b2 download-file-by-name 'ose-server-backups' test.txt test.txt test.txt: 100%|| 6.00/6.00 [00:00<00:00, 4.72kB/s] File name: test.txt File id: 4_z5605817c251dadb96e4d0118_f1157fdf57c59dad1_d20191202_m080938_c001_v0001039_t0057 File size: 6 Content type: text/plain Content sha1: dba7673010f19a94af4345453005933fd511bea9 INFO src_last_modified_millis: 1575274166571 checksum matches [b2user@opensourceecology restore]$ cat test.txt test1 [b2user@opensourceecology restore]$
- But what about this "version" stuff? Is the old file there too? There's no great way to get all the versions of a given file. Note that 'startFileId' will just output all files in the bucket starting from the given start point. And it looks like the api defines a 'prefix', but the b2 cli tool doesn't implement it (yet?) https://www.backblaze.com/b2/docs/b2_list_file_versions.html
[b2user@opensourceecology restore]$ ~/virtualenv/bin/b2 list-file-versions --help b2 list-file-versions <bucketName> [<startFileName>] [<startFileId>] [<maxToShow>] Lists the names of the files in a bucket, starting at the given point. This is a low-level operation that reports the raw JSON returned from the service. 'b2 ls' provides a higher- level view. Requires capability: listFiles [b2user@opensourceecology restore]$
- Anyway, we can force the `ls` command to list multiple versions of each file with '--versions', and we can get the file-id of each version with '--long', and we can just grep for our filename. Here we see both versions of the files. Nice!
[b2user@opensourceecology restore]$ ~/virtualenv/bin/b2 ls --versions --long 'ose-server-backups' | grep -i test.txt 4_z5605817c251dadb96e4d0118_f1157fdf57c59dad1_d20191202_m080938_c001_v0001039_t0057 upload 2019-12-02 08:09:38 6 test.txt 4_z5605817c251dadb96e4d0118_f118d984fa5fe76bd_d20191202_m080016_c001_v0001131_t0058 upload 2019-12-02 08:00:16 6 test.txt [b2user@opensourceecology restore]$
- It does appear that old versions are not automatically deleted by default https://www.backblaze.com/b2/docs/lifecycle_rules.html
Keep all versions of the file (default) removes all lifecycle rules from the bucket, and keeps all versions of all files until you explicitly delete them
- But this can be achieved by creating a lifecycle rule "daysFromHidingToDeleting". Oh, damn it, we *do* set this so that our deleted files get deleted as soon as possible. Apparently when a file is overwritten (as we did above), the old version becomes "hidden". So, effectively, our append-only key can currently overwrite all our backups and after 1 day all our data would be lost!
The most commonly used setting is daysFromHidingToDeleting, which says how long to keep file versions that are not the current version. A file version counts as hidden when explicitly hidden with b2_hide_file, or when a newer file with the same name is uploaded. When a rule with this setting applies, the file will be deleted the given number of days after it is hidden.
- So I originally set up these rules on Jul 28th without much of an understanding of the distinction between daysFromUploadingToHiding and daysFromHidingToDeleting, or the versioning of files in B2. https://wiki.opensourceecology.org/wiki/Maltfield_Log/2018_Q3#Sat_Jul_28.2C_2018
- I think I can create new lifecycle rules the same as before, but set "daysFromHidingToDeleting" to null, and that should achieve what I want.
- Ugh, no, I can't do that. There is no "daysFromUploadingToDeleting". It appears you have to go from Uploading -> Hiding -> Deleting. That sucks!
- So one hackish solution exists: I could just hide all our backups 1 day after uploading, and then delete them after some long interval. For example, our retention policy for monthly backups is 1 year. Currently we set 'daysFromUploadingToHiding' to '364' and 'daysFromHidingToDeleting' to '1'. We could reverse those, so that the monthly backups are hidden after 1 day but then retained for 364 days. This would achieve what we want, but it would give the illusion to anyone at OSE other than me that we only maintain 1 day's worth of backups, because they'd all be hidden except the most recent version. That's pretty hackish, but I guess it works if needed.
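- concretely, the swapped monthly rule would look something like this per the `b2 update-bucket` usage above (a sketch only; I have not applied it, and as noted below my attempts to update rules via the cli failed):
~/virtualenv/bin/b2 update-bucket --lifecycleRules '[{"fileNamePrefix": "monthly_", "daysFromUploadingToHiding": 1, "daysFromHidingToDeleting": 364}]' ose-server-backups allPrivate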
- Because I couldn't find any article on "append-only" keys in Backblaze's knowledgebase/faq, I opened a support ticket with Backblaze asking them how it's possible to create an append-only application key in conjunction with lifecycle rules that implement our backup retention policy, without effectively giving the would-be "append-only" key the ability to delete all our existing backups (after a 24 hour delay). The support request is #517135. http://help.backblaze.com/hc/requests/517135
How can I create a Backblaze B2 Application Key to have append-only permissions along with lifecycle rules that delete old backups? I've been pretty happy since we switched our offsite backups to Backblaze B2 over a year ago, but we have a new requirement that our servers are only granted append-only permissions to the endpoint where our backups reside. Append-only is a common access control that permits appending new data to the destination, but it does not permit deleting or overwriting existing content. Note that this is very distinct from 'read-write', and it's especially important in protecting backup data in the event that the server that's writing backups to B2 is hacked by, for example, ransomware. I was able to create an append-only application key by granting it only the 'writeFiles' and 'listBuckets' capabilities (the latter being required by the `b2` python cli tool), but I found that--while this new application key could upload new files without being able to delete existing files (good!)--this application key *can* overwrite existing files (bad!). I'm aware that old versions of files are, by default, not actually deleted on B2. But (!) it appears that using lifecycle rules on the B2-side to establish a retention policy (as is necessary if I want to make it so my client keys don't have permission to delete or hide existing files in our b2 buckets storing our backup data) necessarily requires setting the 'daysFromHidingToDeleting' rule to something non-null. (!!) ************ (!!) Setting the 'daysFromHidingToDeleting' rule to something non-null, in effect, gives the would-be "append-only" key the ability to delete all our backups (after a 24-hour delay). (!!) ************ (!!) Specifically, for example, we use the following LifeCycle rules for our monthly backups: fileNamePrefix: monthly_ daysFromUploadingToHiding: 364 daysFromHidingToDeleting: 1 The above ^ rules means that our monthly backups are deleted after 356 days (1 year retention for monthly backups), but it also necessarily means that if our server was ever infected with ransomware, then the attacker could overwrite all of our existing monthly backups (with, say, 1 byte of data), and 24 hours later it would all be deleted from our bucket! I imagine there's a number of solutions to this, but one that comes to mind is: how do I create lifecycle rules that delete files X days after they're uploaded *while keeping 'daysFromUploadingToHiding' = null? If we could set a lifecycle rule to go straight from upload to delete such that we could leave daysFromUploadingToHiding null, then we wouldn't effectively allow an would-be "append-only" key to be able to delete all our data. Or, alternatively, if there were a 'writeVersions' capability for keys, and if a key that lacked this permission attempted to (over)write a file in a bucket that already existed, it would trigger an error rather than permitting it to upload a new file (causing the existing file to be hidden and potentially deleted by the lifecycle rules after 24 hours). Or if there were some way to make it so that when a new file uploaded "over" an existing file of the same name didn't make that old version "hidden", then it would also solve this issue. Please let me know how I can create an "append-only" application key in conjunction with lifecycle rules that implement our data retention policy--without letting the "append-only" key effectively delete all existing data.
- It's also important to verify that our append-only key doesn't have permission to mark an old version as hidden. Ugh, it does. IMHO, Backblaze should really create a distinct permission/capability for writing new files vs writing new versions of existing files (including this hide-file command)
[b2user@opensourceecology restore]$ ~/virtualenv/bin/b2 hide-file 'ose-server-backups' test.txt { "action": "hide", "fileId": "4_z5605817c251dadb96e4d0118_f118addadb89fa4a0_d20191202_m102714_c001_v0001130_t0058", "fileName": "test.txt", "size": 0, "uploadTimestamp": 1575282434000 } [b2user@opensourceecology restore]$ ~/virtualenv/bin/b2 ls --versions --long 'ose-server-backups' | grep -i test.txt 4_z5605817c251dadb96e4d0118_f118addadb89fa4a0_d20191202_m102714_c001_v0001130_t0058 hide 2019-12-02 10:27:14 0 test.txt 4_z5605817c251dadb96e4d0118_f1157fdf57c59dad1_d20191202_m080938_c001_v0001039_t0057 upload 2019-12-02 08:09:38 6 test.txt 4_z5605817c251dadb96e4d0118_f118d984fa5fe76bd_d20191202_m080016_c001_v0001131_t0058 upload 2019-12-02 08:00:16 6 test.txt [b2user@opensourceecology restore]$
- We should also validate that the append-only key cannot change the lifecycle rules. Let's test on the dev backups bucket to be safe. First we get the existing rules
[b2user@opensourceecology restore]$ ~/virtualenv/bin/b2 get-bucket 'ose-dev-server-backups' { "accountId": "OBFUSCATED", "bucketId": "OBFUSCATED", "bucketInfo": {}, "bucketName": "ose-dev-server-backups", "bucketType": "allPrivate", "corsRules": [], "lifecycleRules": [ { "daysFromHidingToDeleting": 1, "daysFromUploadingToHiding": 2, "fileNamePrefix": "daily_" }, { "daysFromHidingToDeleting": 1, "daysFromUploadingToHiding": 364, "fileNamePrefix": "monthly_" }, { "daysFromHidingToDeleting": 1, "daysFromUploadingToHiding": 30, "fileNamePrefix": "weekly_" } ], "options": [], "revision": 4 } [b2user@opensourceecology restore]$
- unfortunately, my attempts to update the lifecycle rules with the b2 cli always failed. It just printed out the help page for the command with no further information. Not sure what the issue is here..
- there aren't a whole lot of great references out there on append-only as a solution to ransomware, and it seems that very few cloud providers have actually implemented it. Especially in the open source space: it appears that it can't be natively set up with OpenStack's swift, Nextcloud, Owncloud, etc. Wikipedia has a great article comparing file hosting providers, but it doesn't have a column for append-only or lifecycle policies, so I spent some time adding these two columns and filling them in for the rows: Backblaze B2, Wasabi, and Borgbase https://en.wikipedia.org/wiki/Comparison_of_file_hosting_services
- Even though append-only has long been an attribute for filesystem permissions, data structures, databases, and now APIs for cloud storage providers, there's no article defining "append-only", so I created one as a draft for wikipedia https://en.wikipedia.org/wiki/Draft:Append-only
- ok, that's all I can do on append-only backups for now. I don't want to do my hackish swap of the hide & delete lifecycle rules just yet (especially since, if someone isn't aware of it and swaps them back, it would cause all the now-hidden backups to be deleted!) I'll wait to hear back from Backblaze..
- ...
- I also need to update our dev server's openvpn configuration to support 2FA before I train Marcin on it
- this guide uses the 'openvpn-auth-pam.so' plugin https://www.mikejonesey.co.uk/security/2fa/openvpn-with-2fa
- which then delegates to the google authenticator pam module https://github.com/google/google-authenticator/
- I went to check the dev server's repos, and I found a pam_2fa module already in the yum repos
[maltfield@osedev1 ~]$ yum search 2fa Loaded plugins: fastestmirror Could not set cachedir: [Errno 28] No space left on device: '/var/tmp/yum-maltfield-cVl2V7' Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast N/S matched: 2fa ================ pam_2fa.x86_64 : Second factor authentication for PAM Name and summary matches only, use "search all" for everything. [maltfield@osedev1 ~]$ yum search mfa Loaded plugins: fastestmirror Could not set cachedir: [Errno 28] No space left on device: '/var/tmp/yum-maltfield-E_Ggwe' Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast Warning: No matches found for: mfa No matches found [maltfield@osedev1 ~]$
- unrelated, the above command also informed me that we're out of disk space on the dev node :(
[maltfield@osedev1 ~]$ df -h Filesystem Size Used Avail Use% Mounted on /dev/sda1 19G 3.4G 15G 19% / devtmpfs 873M 0 873M 0% /dev tmpfs 896M 0 896M 0% /dev/shm tmpfs 896M 100M 796M 12% /run tmpfs 896M 0 896M 0% /sys/fs/cgroup /dev/mapper/ose_dev_volume_1 125G 120G 0 100% /mnt/ose_dev_volume_1 tmpfs 180M 0 180M 0% /run/user/1000 [maltfield@osedev1 ~]$
- so that's our "ebs" network volume that filled up; it holds the storage for our lxc staging node's container's rootfs in /mnt/ose_dev_volume_1/var/cache/lxc/centos/x86_64/7/rootfs/
[root@osedev1 ~]# du -sh /mnt/ose_dev_volume_1/var/cache/lxc/centos/x86_64/7/rootfs 433M /mnt/ose_dev_volume_1/var/cache/lxc/centos/x86_64/7/rootfs [root@osedev1 ~]#
- wtf? it shows only 433M usage? That sounds like some file got deleted but is still held open by some process, so it's stuck in purgatory
- no, it doesn't look like that on either dev or staging
[root@osedev1 ~]# lsof 2>&1 | grep '(deleted)$' | sort -rnk 7 | head -20 [root@osedev1 ~]#
[maltfield@osestaging1 ~]$ lsof 2>&1 | grep '(deleted)$' | sort -rnk 7 | head -20 [maltfield@osestaging1 ~]$
- also, our prod server is >100G and the staging server should be too, so that 433M just doesn't make sense at all..
- I can't blame docker entirely, but it definitely is partially to blame: its overlay2 dir is taking up 13G of space
[root@osestaging1 ~]# du -sh /var/lib/docker 13G /var/lib/docker [root@osestaging1 ~]#
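- as an aside, docker has built-in commands to inspect & reclaim that space (a sketch; note the prune deletes *all* unused images & containers, so use with care):
docker system df
docker system prune -a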
- But I'm mostly to blame; looks like I've been tailing verbose lxc logs to a file that grew to 47G
[root@osedev1 osestaging1]# pwd /var/lib/lxc/osestaging1 [root@osedev1 osestaging1]# du -sh * 4.0K config 8.0K dev 47G lxc-start.log 8.0K osestaging1 71G rootfs 0 rootfs.dev 4.0K ts [root@osedev1 osestaging1]# [root@osedev1 ~]# ps -ef | grep -i lxc root 3644 1798 1 Oct22 pts/1 12:22:17 /usr/bin/lua /bin/lxc-top root 20002 1760 0 Nov07 pts/2 03:48:55 lxc-start -n osestaging1 -f config -l trace -o lxc-start.log root 27165 2125 26 15:21 pts/8 00:00:04 du -sh config dev lxc-start.log osestaging1 rootfs rootfs.dev ts root 27203 27185 0 15:21 pts/17 00:00:00 grep --color=auto -i lxc [root@osedev1 ~]#
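- since lxc-start still holds the log's file descriptor open, truncating the file in-place (rather than rm'ing it) is what actually frees the space; a sketch:
truncate -s 0 /var/lib/lxc/osestaging1/lxc-start.log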
- ok, I truncated the log file and confirmed we now have 45G available
[root@osedev1 osestaging1]# du -sh * 4.0K config 8.0K dev 44K lxc-start.log 8.0K osestaging1 71G rootfs 0 rootfs.dev 4.0K ts [root@osedev1 osestaging1]# df -h Filesystem Size Used Avail Use% Mounted on /dev/sda1 19G 3.4G 15G 19% / devtmpfs 873M 0 873M 0% /dev tmpfs 896M 0 896M 0% /dev/shm tmpfs 896M 100M 796M 12% /run tmpfs 896M 0 896M 0% /sys/fs/cgroup /dev/mapper/ose_dev_volume_1 125G 74G 45G 63% /mnt/ose_dev_volume_1 tmpfs 180M 0 180M 0% /run/user/1000 [root@osedev1 osestaging1]#
- I went ahead and stopped the lxc staging container and restarted it *without* dumping trace-level logging to an ever-expanding file
[root@osedev1 osestaging1]# lxc-stop --name osestaging1 [root@osedev1 osestaging1]# lxc-start -n osestaging1 ... [ OK ] Started containerd container runtime. Starting Docker Application Container Engine... CentOS Linux 7 (Core) Kernel 3.10.0-957.21.3.el7.x86_64 on an x86_64 osestaging1 login:
- ok, now that's sane, continuing with openvpn 2fa..
- this site lists at least 3x pam 2fa modules https://cern-cert.github.io/pam_2fa/
- it looks like there's also a 'google-authenticator' package in the yum repos
[root@osedev1 etc]# yum search google-auth Loaded plugins: fastestmirror Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast Determining fastest mirrors * base: mirror.checkdomain.de * epel: ftp.plusline.net * extras: linux.darkpenguin.net * updates: mirror.ratiokontakt.de N/S matched: google-auth ======================== google-authenticator.x86_64 : One-time pass-code support using open standards python2-google-auth.noarch : Google Auth Python Library Name and summary matches only, use "search all" for everything. You have new mail in /var/spool/mail/root [root@osedev1 etc]#
- according to pkgs.org, the yum pam_2fa package is a package from epel that came from CERN above https://centos.pkgs.org/7/epel-x86_64/pam_2fa-1.0-1.el7.x86_64.rpm.html
- And the 'google-authenticator' package comes from this one https://centos.pkgs.org/7/epel-x86_64/google-authenticator-1.04-1.el7.x86_64.rpm.html
- let's compare
- https://github.com/CERN-CERT/pam_2fa
- - 1 contributor on github
- * Apache-2.0 license
- + 211 commits, most recently 1 month ago
- * first commit 2019-04
- + this repo is owned & maintained by the CERN CERT https://github.com/CERN-CERT
- https://github.com/google/google-authenticator-libpam/
- + 29 contributors on github
- * gnu license (?)
- * 114 commits, most recently 8 months ago
- + first commit on 2014-01
- + this repo is owned & maintained by Google, Inc (jesus, TIL there's a .google TLD) https://github.com/google
- all that considered, sorry CERN, I'm going to have to go with google's open source implementation.
- I went ahead and installed it
[root@osedev1 etc]# yum install google-authenticator ... Installed: google-authenticator.x86_64 0:1.04-1.el7 Complete! [root@osedev1 etc]# rpm -ql google-authenticator /usr/bin/google-authenticator /usr/lib64/security/pam_google_authenticator.la /usr/lib64/security/pam_google_authenticator.so /usr/share/doc/google-authenticator-1.04 /usr/share/doc/google-authenticator-1.04/CONTRIBUTING.md /usr/share/doc/google-authenticator/FILEFORMAT /usr/share/doc/google-authenticator/README.md /usr/share/doc/google-authenticator/totp.html /usr/share/licenses/google-authenticator-1.04 /usr/share/licenses/google-authenticator-1.04/LICENSE /usr/share/man/man1/google-authenticator.1.gz /usr/share/man/man8/pam_google_authenticator.8.gz [root@osedev1 etc]#
- Per the documentation on the github page, I just ran the /usr/bin/google-authenticator binary to generate a new 2fa secret key for myself.
- a few notes: I set a window size of 8, which makes 8 codes acceptable (spanning the past & future); that'll permit up to a 4-minute time drift.
- I've set the rate limit to 2 every 30 seconds. If someone tries to login with an OTP code 3 or more times in a given 30 second window, it will be denied
- I actually wanted to set the emergency codes to 0, but it wouldn't let me :(
- as for the issuer & label, on my phone andOTP displays this as "vpn.opensourceecology.org - maltfield@osedev1", which I think is the best way to explain to the user what it is: first, this is a code for the VPN on OSE's network; second, it was specifically generated on the osedev1 server for the user 'maltfield'.
google-authenticator --time-based --disallow-reuse --issuer "vpn.opensourceecology.org" --label "`whoami`@osedev1" --emergency-codes=1 --window-size=8 --rate-limit=2 --rate-time=30
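- note that this writes the secret and the above options to ~/.google_authenticator in the invoking user's home directory; that's the file the pam module reads at auth time, and its permissions should be restricted to the user:
ls -l ~/.google_authenticator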
- I created a new pam.d config file for openvpn
[root@osedev1 etc]# cat /etc/pam.d/openvpn # google auth auth required /usr/lib64/security/pam_google_authenticator.so account required pam_nologin.so account include system-auth password include system-auth session include system-auth [root@osedev1 etc]#
- And I updated the openvpn server config file to use the above file (the plugin's 'openvpn' argument is the PAM service name, matching /etc/pam.d/openvpn). And I restarted the openvpn server.
[root@osedev1 server]# tail /etc/openvpn/server/server.conf # Notify the client that when the server restarts so it # can automatically reconnect. explicit-exit-notify 1 # additional hardening --maltfield tls-version-min 1.2 tls-cipher TLS-DHE-RSA-WITH-AES-256-GCM-SHA384 # google-authenticator 2fa plugin /usr/lib64/openvpn/plugins/openvpn-plugin-auth-pam.so openvpn [root@osedev1 server]# [root@osedev1 server]# systemctl restart openvpn@server [root@osedev1 server]#
- I disconnected from the VPN on my laptop and attempted to reconnect, but I never got in. This is what I got on the client
Mon Dec 2 21:53:13 2019 ++ Certificate has key usage 00a0, expects 00a0 Mon Dec 2 21:53:13 2019 VERIFY KU OK Mon Dec 2 21:53:13 2019 Validating certificate extended key usage Mon Dec 2 21:53:13 2019 ++ Certificate has EKU (str) TLS Web Server Authentication, expects TLS Web Server Authentication Mon Dec 2 21:53:13 2019 VERIFY EKU OK Mon Dec 2 21:53:13 2019 VERIFY OK: depth=0, CN=server
- And this was the openvpn server's logs
Dec 02 17:08:13 osedev1 openvpn[24978]: Mon Dec 2 17:08:13 2019 182.74.197.50:58218 TLS: Initial packet from [AF_INET]182.74.197.50:58218, sid=8f76e1f9 1d391d64 Dec 02 17:08:13 osedev1 openvpn[24978]: Mon Dec 2 17:08:13 2019 182.74.197.50:58218 VERIFY OK: depth=1, CN=osedev1 Dec 02 17:08:13 osedev1 openvpn[24978]: Mon Dec 2 17:08:13 2019 182.74.197.50:58218 VERIFY OK: depth=0, CN=maltfield Dec 02 17:08:13 osedev1 openvpn[24978]: Mon Dec 2 17:08:13 2019 182.74.197.50:58218 peer info: IV_VER=2.4.0 Dec 02 17:08:13 osedev1 openvpn[24978]: Mon Dec 2 17:08:13 2019 182.74.197.50:58218 peer info: IV_PLAT=linux Dec 02 17:08:13 osedev1 openvpn[24978]: Mon Dec 2 17:08:13 2019 182.74.197.50:58218 peer info: IV_PROTO=2 Dec 02 17:08:13 osedev1 openvpn[24978]: Mon Dec 2 17:08:13 2019 182.74.197.50:58218 peer info: IV_NCP=2 Dec 02 17:08:13 osedev1 openvpn[24978]: Mon Dec 2 17:08:13 2019 182.74.197.50:58218 peer info: IV_LZ4=1 Dec 02 17:08:13 osedev1 openvpn[24978]: Mon Dec 2 17:08:13 2019 182.74.197.50:58218 peer info: IV_LZ4v2=1 Dec 02 17:08:13 osedev1 openvpn[24978]: Mon Dec 2 17:08:13 2019 182.74.197.50:58218 peer info: IV_LZO=1 Dec 02 17:08:13 osedev1 openvpn[24978]: Mon Dec 2 17:08:13 2019 182.74.197.50:58218 peer info: IV_COMP_STUB=1 Dec 02 17:08:13 osedev1 openvpn[24978]: Mon Dec 2 17:08:13 2019 182.74.197.50:58218 peer info: IV_COMP_STUBv2=1 Dec 02 17:08:13 osedev1 openvpn[24978]: Mon Dec 2 17:08:13 2019 182.74.197.50:58218 peer info: IV_TCPNL=1 Dec 02 17:08:13 osedev1 openvpn[24978]: Mon Dec 2 17:08:13 2019 182.74.197.50:58218 TLS Error: Auth Username/Password was not provided by peer Dec 02 17:08:13 osedev1 openvpn[24978]: Mon Dec 2 17:08:13 2019 182.74.197.50:58218 TLS Error: TLS handshake failed Dec 02 17:08:13 osedev1 openvpn[24978]: Mon Dec 2 17:08:13 2019 182.74.197.50:58218 SIGUSR1[soft,tls-error] received, client-instance restarting Dec 02 17:08:37 osedev1 kernel: docker0: port 1(vethdeb72fd) entered blocking state
- Turns out I have to add the 'auth-user-pass' field to my client config https://securityskittles.wordpress.com/2012/03/14/two-factor-authentication-for-openvpn-on-centos-using-google-authenticator/
user@ose:~$ tail openvpn/client.conf # hardening tls-cipher TLS-DHE-RSA-WITH-AES-256-GCM-SHA384 # dns for staging script-security 2 up /etc/openvpn/update-resolv-conf down /etc/openvpn/update-resolv-conf # 2fa auth-user-pass user@ose:~$
- And now it works!
user@ose:~$ sudo openvpn openvpn/client.conf Mon Dec 2 21:55:37 2019 OpenVPN 2.4.0 x86_64-pc-linux-gnu [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Oct 14 2018 Mon Dec 2 21:55:37 2019 library versions: OpenSSL 1.0.2t 10 Sep 2019, LZO 2.08 Enter Auth Username: maltfield Enter Auth Password: ****** Mon Dec 2 21:55:57 2019 NOTE: the current --script-security setting may allow this configuration to call user-defined scripts Enter Private Key Password: * Mon Dec 2 21:56:00 2019 WARNING: this configuration may cache passwords in memory -- use the auth-nocache option to prevent this Mon Dec 2 21:56:00 2019 Outgoing Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Mon Dec 2 21:56:00 2019 Incoming Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Mon Dec 2 21:56:00 2019 TCP/UDP: Preserving recently used remote address: [AF_INET]195.201.233.113:1194 Mon Dec 2 21:56:00 2019 Socket Buffers: R=[212992->212992] S=[212992->212992] Mon Dec 2 21:56:00 2019 UDP link local: (not bound) Mon Dec 2 21:56:00 2019 UDP link remote: [AF_INET]195.201.233.113:1194 Mon Dec 2 21:56:00 2019 TLS: Initial packet from [AF_INET]195.201.233.113:1194, sid=959cd30b 98bdba8b Mon Dec 2 21:56:01 2019 VERIFY OK: depth=1, CN=osedev1 Mon Dec 2 21:56:01 2019 Validating certificate key usage Mon Dec 2 21:56:01 2019 ++ Certificate has key usage 00a0, expects 00a0 Mon Dec 2 21:56:01 2019 VERIFY KU OK Mon Dec 2 21:56:01 2019 Validating certificate extended key usage Mon Dec 2 21:56:01 2019 ++ Certificate has EKU (str) TLS Web Server Authentication, expects TLS Web Server Authentication Mon Dec 2 21:56:01 2019 VERIFY EKU OK Mon Dec 2 21:56:01 2019 VERIFY OK: depth=0, CN=server Mon Dec 2 21:56:01 2019 Control Channel: TLSv1.2, cipher TLSv1/SSLv3 DHE-RSA-AES256-GCM-SHA384, 4096 bit RSA Mon Dec 2 21:56:01 2019 [server] Peer Connection Initiated with [AF_INET]195.201.233.113:1194 Mon Dec 2 21:56:02 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Mon Dec 2 21:56:02 2019 PUSH: Received control message: 'PUSH_REPLY,dhcp-option DNS 10.241.189.1,route 10.241.189.0 255.255.255.0,topology net30,ping 10,ping-restart 120,ifconfig 10.241.189.10 10.241.189.9,peer-id 1,cipher AES-256-GCM' Mon Dec 2 21:56:02 2019 OPTIONS IMPORT: timers and/or timeouts modified Mon Dec 2 21:56:02 2019 OPTIONS IMPORT: --ifconfig/up options modified Mon Dec 2 21:56:02 2019 OPTIONS IMPORT: route options modified Mon Dec 2 21:56:02 2019 OPTIONS IMPORT: --ip-win32 and/or --dhcp-option options modified Mon Dec 2 21:56:02 2019 OPTIONS IMPORT: peer-id set Mon Dec 2 21:56:02 2019 OPTIONS IMPORT: adjusting link_mtu to 1624 Mon Dec 2 21:56:02 2019 OPTIONS IMPORT: data channel crypto options modified Mon Dec 2 21:56:02 2019 Data Channel Encrypt: Cipher 'AES-256-GCM' initialized with 256 bit key Mon Dec 2 21:56:02 2019 Data Channel Decrypt: Cipher 'AES-256-GCM' initialized with 256 bit key Mon Dec 2 21:56:02 2019 ROUTE_GATEWAY 10.137.0.6 Mon Dec 2 21:56:02 2019 TUN/TAP device tun0 opened Mon Dec 2 21:56:02 2019 TUN/TAP TX queue length set to 100 Mon Dec 2 21:56:02 2019 do_ifconfig, tt->did_ifconfig_ipv6_setup=0 Mon Dec 2 21:56:02 2019 /sbin/ip link set dev tun0 up mtu 1500 Mon Dec 2 21:56:02 2019 /sbin/ip addr add dev tun0 local 10.241.189.10 peer 10.241.189.9 Mon Dec 2 21:56:02 2019 /etc/openvpn/update-resolv-conf tun0 1500 1552 10.241.189.10 10.241.189.9 init dhcp-option DNS 10.241.189.1 Mon Dec 2 21:56:02 2019 /sbin/ip route add 10.241.189.0/24 via 10.241.189.9 Mon Dec 
2 21:56:02 2019 Initialization Sequence Completed
- I found that I could also eliminate the need to type my username every time by giving 'auth-user-pass' an argument: the path to a file containing only 1 line with the username on it (omitting the second line, which is ordinarily the password, will make the openvpn client prompt the user for the password) https://openvpn.net/community-resources/reference-manual-for-openvpn-2-4/
user@ose:~$ tail openvpn/client.conf # hardening tls-cipher TLS-DHE-RSA-WITH-AES-256-GCM-SHA384 # dns for staging script-security 2 up /etc/openvpn/update-resolv-conf down /etc/openvpn/update-resolv-conf # 2fa auth-user-pass /home/user/openvpn/username.txt user@ose:~$ cat openvpn/username.txt maltfield user@ose:~$ sudo openvpn openvpn/client.conf Mon Dec 2 22:04:53 2019 WARNING: file '/home/user/openvpn/username.txt' is group or others accessible Mon Dec 2 22:04:53 2019 OpenVPN 2.4.0 x86_64-pc-linux-gnu [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Oct 14 2018 Mon Dec 2 22:04:53 2019 library versions: OpenSSL 1.0.2t 10 Sep 2019, LZO 2.08 Enter Auth Password:
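- side note: the 'group or others accessible' warning above is just about the username file's permissions; tightening them to owner-only should silence it:
chmod 600 /home/user/openvpn/username.txt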
- But that's not a great UX and it would certainly confuse Marcin. How do I replace "Enter Auth Password" with "Enter 2FA Token"?
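- one lead I haven't tested yet: the OpenVPN 2.4 client config supports a 'static-challenge' directive that prompts with arbitrary text (plus an echo flag) in addition to the password prompt, though it looks like the server-side auth plugin would also have to speak the challenge/response protocol for the response to be used; a sketch for the client config:
# UNTESTED: prompt for the OTP with custom text (the trailing 1 = echo the input)
static-challenge "Enter 2FA Token" 1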
Tue Nov 26, 2019
- continuing from yesterday: why are my iptables rules failing inside the docker container? I literally whitelisted every user on the system, but DNS resolution is still failing
root@osestaging1-discourse-ose:~/backups/iptables/20191125# cat /etc/passwd root:x:0:0:root:/root:/bin/bash daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin bin:x:2:2:bin:/bin:/usr/sbin/nologin sys:x:3:3:sys:/dev:/usr/sbin/nologin sync:x:4:65534:sync:/bin:/bin/sync games:x:5:60:games:/usr/games:/usr/sbin/nologin man:x:6:12:man:/var/cache/man:/usr/sbin/nologin lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin mail:x:8:8:mail:/var/mail:/usr/sbin/nologin news:x:9:9:news:/var/spool/news:/usr/sbin/nologin uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin proxy:x:13:13:proxy:/bin:/usr/sbin/nologin www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin backup:x:34:34:backup:/var/backups:/usr/sbin/nologin list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin _apt:x:100:65534::/nonexistent:/usr/sbin/nologin systemd-timesync:x:101:102:systemd Time Synchronization,,,:/run/systemd:/usr/sbin/nologin systemd-network:x:102:103:systemd Network Management,,,:/run/systemd:/usr/sbin/nologin systemd-resolve:x:103:104:systemd Resolver,,,:/run/systemd:/usr/sbin/nologin messagebus:x:104:105::/nonexistent:/usr/sbin/nologin Debian-exim:x:105:108::/var/spool/exim4:/usr/sbin/nologin postgres:x:106:110:PostgreSQL administrator,,,:/var/lib/postgresql:/bin/bash sshd:x:107:65534::/run/sshd:/usr/sbin/nologin runit-log:x:999:999::/nonexistent:/usr/sbin/nologin redis:x:108:111::/var/lib/redis:/usr/sbin/nologin discourse:x:1000:1000::/home/discourse:/bin/bash root@osestaging1-discourse-ose:~/backups/iptables/20191125# root@osestaging1-discourse-ose:~/backups/iptables/20191125# root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -F root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 0 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 1 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 2 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 3 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 4 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 5 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 6 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 7 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 8 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 9 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 10 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 13 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 33 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 34 -j ACCEPT 
root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 38 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 39 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 41 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 100 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 101 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 102 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 103 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -p tcp -m owner --uid-owner 65534 -j ACCEPT root@osestaging1-discourse-ose:~/backups/iptables/20191125# iptables -A OUTPUT -j DROP root@osestaging1-discourse-ose:~/backups/iptables/20191125# root@osestaging1-discourse-ose:/# root@osestaging1-discourse-ose:/# apt-get install iptables-persistent Reading package lists... Done Building dependency tree Reading state information... Done The following additional packages will be installed: netfilter-persistent The following NEW packages will be installed: iptables-persistent netfilter-persistent 0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded. Need to get 21.8 kB of archives. After this operation, 80.9 kB of additional disk space will be used. Do you want to continue? [Y/n] y Err:1 http://deb.debian.org/debian buster/main amd64 netfilter-persistent all 1.0.11 Temporary failure resolving 'deb.debian.org' Err:2 http://deb.debian.org/debian buster/main amd64 iptables-persistent all 1.0.11 Temporary failure resolving 'deb.debian.org' E: Failed to fetch http://deb.debian.org/debian/pool/main/i/iptables-persistent/netfilter-persistent_1.0.11_all.deb Temporary failure resolving 'deb.debian.org' E: Failed to fetch http://deb.debian.org/debian/pool/main/i/iptables-persistent/iptables-persistent_1.0.11_all.deb Temporary failure resolving 'deb.debian.org' E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing? root@osestaging1-discourse-ose:/#
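- in hindsight, a loop over /etc/passwd would have saved all that hand-typing; a sketch that produces the same whitelist as the rules above:
# one ACCEPT per local uid (mirrors the hand-typed tcp rules above), then DROP everything else
awk -F: '{print $3}' /etc/passwd | while read uid; do iptables -A OUTPUT -p tcp -m owner --uid-owner "$uid" -j ACCEPT; done
iptables -A OUTPUT -j DROP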
- oh, duh, those are all explicitly tcp rules, and DNS queries go out over udp x_x
- yeah, removing the tcp arg (to permit udp) worked
root@osestaging1-discourse-ose:/# iptables -F root@osestaging1-discourse-ose:/# iptables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT root@osestaging1-discourse-ose:/# iptables -A OUTPUT -m owner --uid-owner 100 -j ACCEPT root@osestaging1-discourse-ose:/# iptables -A OUTPUT -j DROP root@osestaging1-discourse-ose:/# root@osestaging1-discourse-ose:/# iptables-save # Generated by xtables-save v1.8.2 on Tue Nov 26 09:28:51 2019 *filter :OUTPUT ACCEPT [66:3802] :FORWARD ACCEPT [0:0] :INPUT ACCEPT [82:547247] -A OUTPUT -m owner --uid-owner 0 -j ACCEPT -A OUTPUT -m owner --uid-owner 100 -j ACCEPT -A OUTPUT -j DROP COMMIT # Completed on Tue Nov 26 09:28:51 2019 # Warning: iptables-legacy tables present, use iptables-legacy-save to see them root@osestaging1-discourse-ose:/# apt-get install iptables-persistent Reading package lists... Done Building dependency tree Reading state information... Done The following additional packages will be installed: netfilter-persistent The following NEW packages will be installed: iptables-persistent netfilter-persistent 0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded. Need to get 21.8 kB of archives. After this operation, 80.9 kB of additional disk space will be used. Do you want to continue? [Y/n] y Get:1 http://deb.debian.org/debian buster/main amd64 netfilter-persistent all 1.0.11 [10.1 kB] Get:2 http://deb.debian.org/debian buster/main amd64 iptables-persistent all 1.0.11 [11.7 kB] Fetched 21.8 kB in 0s (681 kB/s)
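- note that opening *all* output for _apt (uid 100) is broader than strictly necessary; an untested, tighter sketch would scope it to DNS plus the http(s) ports apt actually talks to:
# UNTESTED: allow _apt only DNS and http/https instead of all output
iptables -A OUTPUT -m owner --uid-owner 100 -p udp --dport 53 -j ACCEPT
iptables -A OUTPUT -m owner --uid-owner 100 -p tcp --dport 53 -j ACCEPT
iptables -A OUTPUT -m owner --uid-owner 100 -p tcp -m multiport --dports 80,443 -j ACCEPT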
- now that iptables-persistent is installed, there are files at /etc/iptables/rules.v*
root@osestaging1-discourse-ose:/# cat /etc/iptables/rules.v* # Generated by xtables-save v1.8.2 on Tue Nov 26 09:30:15 2019 *filter :OUTPUT ACCEPT [66:3802] :FORWARD ACCEPT [0:0] :INPUT ACCEPT [101:571562] -A OUTPUT -m owner --uid-owner 0 -j ACCEPT -A OUTPUT -m owner --uid-owner 100 -j ACCEPT -A OUTPUT -j DROP COMMIT # Completed on Tue Nov 26 09:30:15 2019 # Generated by xtables-save v1.8.2 on Tue Nov 26 09:30:15 2019 *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [0:0] COMMIT # Completed on Tue Nov 26 09:30:15 2019 root@osestaging1-discourse-ose:/#
- let's also add ipv6 rules
root@osestaging1-discourse-ose:/# ip6tables-save # Generated by xtables-save v1.8.2 on Tue Nov 26 09:31:56 2019 *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [0:0] COMMIT # Completed on Tue Nov 26 09:31:56 2019 # Warning: ip6tables-legacy tables present, use ip6tables-legacy-save to see them root@osestaging1-discourse-ose:/# ip6tables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT root@osestaging1-discourse-ose:/# ip6tables -A OUTPUT -m owner --uid-owner 100 -j ACCEPT root@osestaging1-discourse-ose:/# ip6tables -A OUTPUT -j DROP root@osestaging1-discourse-ose:/# root@osestaging1-discourse-ose:/# ip6tables-save # Generated by xtables-save v1.8.2 on Tue Nov 26 09:32:03 2019 *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [0:0] -A OUTPUT -m owner --uid-owner 0 -j ACCEPT -A OUTPUT -m owner --uid-owner 100 -j ACCEPT -A OUTPUT -j DROP COMMIT # Completed on Tue Nov 26 09:32:03 2019 # Warning: ip6tables-legacy tables present, use ip6tables-legacy-save to see them root@osestaging1-discourse-ose:/#
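- for the record, the files under /etc/iptables/ can be rewritten at any time with netfilter-persistent's save action (the same thing its plugins do at shutdown on a stock Debian):
netfilter-persistent save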
- unfortunately there's no init.d service or sv runit scripts for saving or restoring iptables rules!
- I tried stopping & starting the container, but it lost the iptables config, even though the config was stored in /etc/iptables/rules.v4 :\
[root@osestaging1 discourse]# ./launcher enter discourse_ose root@osestaging1-discourse-ose:/# iptables-save # Warning: iptables-legacy tables present, use iptables-legacy-save to see them root@osestaging1-discourse-ose:/# iptables -nvL Chain INPUT (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination Chain FORWARD (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination # Warning: iptables-legacy tables present, use iptables-legacy to see them root@osestaging1-discourse-ose:/# cat /etc/iptables/rules.v* # Generated by xtables-save v1.8.2 on Tue Nov 26 09:30:15 2019 *filter :OUTPUT ACCEPT [66:3802] :FORWARD ACCEPT [0:0] :INPUT ACCEPT [101:571562] -A OUTPUT -m owner --uid-owner 0 -j ACCEPT -A OUTPUT -m owner --uid-owner 100 -j ACCEPT -A OUTPUT -j DROP COMMIT # Completed on Tue Nov 26 09:30:15 2019 # Generated by xtables-save v1.8.2 on Tue Nov 26 09:30:15 2019 *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [0:0] COMMIT # Completed on Tue Nov 26 09:30:15 2019 root@osestaging1-discourse-ose:/#
- I've been debating baking this into the image at build time, but iptables rules live in the container's network namespace, not its filesystem, so they can't persist across a recreate; that means this'll *have* to live in a template. I really don't like having the box come up with networking and an open firewall, even for a microsecond, but I guess this is probably the best option
- I added a 'templates/iptables.template.yml' that just runs the iptables & ip6tables commands and attempted a bootstrap. It failed, saying that there was an issue with my iptables syntax?
[root@osestaging1 discourse]# cat templates/iptables.template.yml run: - exec: cmd: # run these every time since the container can't persist iptables rules - sudo apt-get install -y iptables - iptables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT - iptables -A OUTPUT -m owner --uid-owner 100 -j ACCEPT - iptables -A OUTPUT -j DROP - ip6tables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT - ip6tables -A OUTPUT -m owner --uid-owner 100 -j ACCEPT - ip6tables -A OUTPUT -j DROP [root@osestaging1 discourse]# [root@osestaging1 discourse]# docker stop discourse_ose discourse_ose [root@osestaging1 discourse]# ./launcher bootstrap discourse_ose ... I, [2019-11-26T09:51:45.398930 #1] INFO -- : > iptables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT iptables v1.8.2 (nf_tables): Couldn't load match `owner':No such file or directory Try `iptables -h' or 'iptables --help' for more information. I, [2019-11-26T09:51:45.409106 #1] INFO -- : FAILED -------------------- Pups::ExecError: iptables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT failed with return #<Process::Status: pid 190 exit 2> Location of failure: /pups/lib/pups/exec_command.rb:112:in `spawn' exec failed with the params {"cmd"=>["sudo apt-get install -y iptables", "iptables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT", "iptables -A OUTPUT -m owner --uid-owner 100 -j ACCEPT", "iptables -A OUTPUT -j DROP", "ip6tables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT", "ip6tables -A OUTPUT -m owner --uid-owner 100 -j ACCEPT", "ip6tables -A OUTPUT -j DROP"]} 57dee8aa8bba2e5e61f0978c14def84b5981de1be525dc36c3ad5fad191ddca3 FAILED TO BOOTSTRAP please scroll up and look for earlier error messages, there may be more than one. ./discourse-doctor may help diagnose the problem. [root@osestaging1 discourse]#
- I tried changing it from 'iptables' to 'iptables-legacy', since Buster's default iptables binary uses the nf_tables backend, which apparently can't load the 'owner' match against the CentOS 7 host's older kernel. Now I'm getting permission issues again
[root@osestaging1 discourse]# cat templates/iptables.template.yml run: - exec: cmd: # run these every time since the container can't persist iptables rules - sudo apt-get install -y iptables - iptables-legacy -A OUTPUT -m owner --uid-owner 0 -j ACCEPT - iptables-legacy -A OUTPUT -m owner --uid-owner 100 -j ACCEPT - iptables-legacy -A OUTPUT -j DROP - ip6tables-legacy -A OUTPUT -m owner --uid-owner 0 -j ACCEPT - ip6tables-legacy -A OUTPUT -m owner --uid-owner 100 -j ACCEPT - ip6tables-legacy -A OUTPUT -j DROP [root@osestaging1 discourse]# [root@osestaging1 discourse]# ./launcher bootstrap discourse_ose ... I, [2019-11-26T09:54:59.014399 #1] INFO -- : > iptables-legacy -A OUTPUT -m owner --uid-owner 0 -j ACCEPT getsockopt failed strangely: Operation not permitted I, [2019-11-26T09:54:59.023038 #1] INFO -- : FAILED -------------------- Pups::ExecError: iptables-legacy -A OUTPUT -m owner --uid-owner 0 -j ACCEPT failed with return #<Process::Status: pid 190 exit 1> Location of failure: /pups/lib/pups/exec_command.rb:112:in `spawn' exec failed with the params {"cmd"=>["sudo apt-get install -y iptables", "iptables-legacy -A OUTPUT -m owner --uid-owner 0 -j ACCEPT", "iptables-legacy -A OUTPUT -m owner --uid-owner 100 -j ACCEPT", "iptables-legacy -A OUTPUT -j DROP", "ip6tables-legacy -A OUTPUT -m owner --uid-owner 0 -j ACCEPT", "ip6tables-legacy -A OUTPUT -m owner --uid-owner 100 -j ACCEPT", "ip6tables-legacy -A OUTPUT -j DROP"]} beed5935a7fdbbc447610d57e295f93a133c0bf9dff6f82bc7afaea531d91526 FAILED TO BOOTSTRAP please scroll up and look for earlier error messages, there may be more than one. ./discourse-doctor may help diagnose the problem. [root@osestaging1 discourse]#
- I confirmed that the NET_ADMIN capability was still present
[root@osestaging1 discourse]# docker inspect discourse_ose | grep -iC3 CapAdd "AutoRemove": false, "VolumeDriver": "", "VolumesFrom": null, "CapAdd": [ "NET_ADMIN" ], "CapDrop": null, [root@osestaging1 discourse]#
- adding sudo didn't help either (my guess: `launcher bootstrap` runs these commands in a fresh temporary container, not in the patched discourse_ose one, so the NET_ADMIN capability never applies there). Someone else had this issue, but there's no solution https://stackoverflow.com/questions/50419819/adding-a-new-user-to-docker-and-limiting-its-permissions
- I moved this into a runit init script instead
[root@osestaging1 discourse]# cat templates/iptables.template.yml run: - exec: cmd: - sudo apt-get install -y iptables - file: path: /etc/runit/1.d/000-iptables contents: | #!/bin/bash sudo apt-get install -y iptables sudo iptables-legacy -A OUTPUT -m owner --uid-owner 0 -j ACCEPT sudo iptables-legacy -A OUTPUT -m owner --uid-owner 100 -j ACCEPT sudo iptables-legacy -A OUTPUT -j DROP sudo ip6tables-legacy -A OUTPUT -m owner --uid-owner 0 -j ACCEPT sudo ip6tables-legacy -A OUTPUT -m owner --uid-owner 100 -j ACCEPT sudo ip6tables-legacy -A OUTPUT -j DROP [root@osestaging1 discourse]#
- I also had issues with launcher killing my changes to the container's hostconfig.json file. The problem was solved by restarting the docker daemon, which is apparently necessary to make the config stick (docker only re-reads hostconfig.json when the daemon starts)
id=`docker inspect --format='{{.Id}}' discourse_ose`
grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json
sed -i 's/"CapAdd":null/"CapAdd":"NET_ADMIN"/' /var/lib/docker/containers/$id/hostconfig.json
grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json
systemctl restart docker
- now the box that comes up from `launcher start discourse_ose` has iptables permission! But there's no config :\
[root@osestaging1 discourse]# ./launcher enter discourse_ose root@osestaging1-discourse-ose:/var/www/discourse# iptables -L Chain INPUT (policy ACCEPT) target prot opt source destination Chain FORWARD (policy ACCEPT) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination # Warning: iptables-legacy tables present, use iptables-legacy to see them root@osestaging1-discourse-ose:/var/www/discourse#
- I'm having issues with the `launcher bootstrap` process killing my CapAdd settings; this is the process that works
/var/discourse/launcher stop discourse_ose
/var/discourse/launcher bootstrap discourse_ose
id=`docker inspect --format='{{.Id}}' discourse_ose`
grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json
sed -i 's/"CapAdd":null/"CapAdd":"NET_ADMIN"/' /var/lib/docker/containers/$id/hostconfig.json
grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json
systemctl restart docker
/var/discourse/launcher start discourse_ose
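- a quick sanity check that the capability actually landed inside the container is to look at PID 1's effective capability mask; cap_net_admin should appear when the mask is decoded (capsh lives in Debian's libcap2-bin package):
# inside the container: dump PID 1's effective capability bitmask
grep CapEff /proc/1/status
# then decode the hex mask; cap_net_admin should be in the list
capsh --decode=<hex-from-CapEff>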
- I had an issue where the docker container got stuck in a boot loop. `docker ps` just kept saying its state was "restarting". To figure out what the problem was, I had to tail the docker logs https://stackoverflow.com/questions/37471929/docker-container-keeps-on-restarting-again-on-again
2019-11-26T13:20:16.076166067Z run-parts: executing /etc/runit/1.d/00-ensure-links 2019-11-26T13:20:16.115041657Z run-parts: executing /etc/runit/1.d/00-fix-var-logs 2019-11-26T13:20:16.190732272Z run-parts: executing /etc/runit/1.d/000-iptables 2019-11-26T13:20:16.190786173Z run-parts: failed to exec /etc/runit/1.d/000-iptables: Exec format error 2019-11-26T13:20:16.190800674Z run-parts: /etc/runit/1.d/000-iptables exited with return code 1 [root@osestaging1 discourse]#
- this is really fucking frustrating, because the iteration time of trying to get this damn script correct is like 10-20 minutes on the discourse 'bootstrap' command!
- this is where I'm at now. Could it be that apt-get can't run yet in the early stage of runlevel 1? Another suspect: run-parts exec()s each file directly, so an 'Exec format error' usually means the '#!/bin/bash' shebang isn't the literal first line of the file, and in the template below the comment header sits above it
[root@osestaging1 discourse]# cat templates/iptables.template.yml run: - exec: cmd: - sudo apt-get install -y iptables - file: path: /etc/runit/1.d/000-iptables chmod: "+x" contents: | ################################################################################ # File: /etc/runit/1.d/000-iptables # Version: 0.1 # Purpose: installs & locks-down iptables # Author: Michael Altfield <michael@opensourceecology.org> # Created: 2019-11-26 # Updated: 2019-11-26 ################################################################################ #!/bin/bash sudo apt-get install -y iptables sudo iptableslegacy -A OUTPUT -m owner --uid-owner 0 -j ACCEPT sudo iptables -A OUTPUT -m owner --uid-owner 100 -j ACCEPT sudo iptables -A OUTPUT -j DROP sudo ip6tables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT sudo ip6tables -A OUTPUT -m owner --uid-owner 100 -j ACCEPT sudo ip6tables -A OUTPUT -j DROP [root@osestaging1 discourse]#
- ugh, there's a typo on the second command ('iptableslegacy'). Let's bootstrap again and wait another 10 minutes to see if that worked..
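- while waiting, here's an untested sketch of what I think this template needs to converge to: the shebang as the literal first line of the generated file (so run-parts' exec() doesn't choke), chmod +x, and the typo fixed:
run:
  - file:
      path: /etc/runit/1.d/000-iptables
      chmod: "+x"
      contents: |
        #!/bin/bash
        # installs iptables & locks down OUTPUT; must re-run at every boot
        # since the container's network namespace can't persist rules
        apt-get install -y iptables
        iptables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT
        iptables -A OUTPUT -m owner --uid-owner 100 -j ACCEPT
        iptables -A OUTPUT -j DROP
        ip6tables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT
        ip6tables -A OUTPUT -m owner --uid-owner 100 -j ACCEPT
        ip6tables -A OUTPUT -j DROP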
- I tried moving this to runlevel 2 instead of 1, but then I discovered that it *still* got stuck on the old 1.d runlevel file. Looks like it doesn't get deleted between bootstrap runs unless I do a destroy; so now these are my iteration commands
/var/discourse/launcher destroy discourse_ose
time nice /var/discourse/launcher bootstrap discourse_ose
id=`docker inspect --format='{{.Id}}' discourse_ose`
grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json
sed -i 's/"CapAdd":null/"CapAdd":"NET_ADMIN"/' /var/lib/docker/containers/$id/hostconfig.json
grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json
systemctl restart docker
/var/discourse/launcher start discourse_ose
- actually, the bootstrap starts the docker container too early, after it's gotten rid of our changes to the container's capabilities (NET_ADMIN), so it comes up without the ability for root to run `iptables`; a stop immediately after the bootstrap, followed by the change, followed by a start works
/var/discourse/launcher destroy discourse_ose
time nice /var/discourse/launcher bootstrap discourse_ose
/var/discourse/launcher stop discourse_ose
id=`docker inspect --format='{{.Id}}' discourse_ose`
grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json
sed -i 's/"CapAdd":null/"CapAdd":"NET_ADMIN"/' /var/lib/docker/containers/$id/hostconfig.json
grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json
systemctl restart docker
/var/discourse/launcher start discourse_ose
- actually, that *still* doesn't work. The hostconfig.json file does not exist after the `launcher destroy`, and it isn't created until after a `launcher start`. But then the first start necessarily can't have the NET_ADMIN capability
- I did verify that the script runs without issues if I manually trigger it from runlevel = 2
root@osestaging1-discourse-ose:/# /etc/runit/2.d/000-iptables Reading package lists... Done Building dependency tree Reading state information... Done iptables is already the newest version (1.8.2-4). 0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded. root@osestaging1-discourse-ose:/# iptables -L Chain INPUT (policy ACCEPT) target prot opt source destination Chain FORWARD (policy ACCEPT) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination ACCEPT all -- anywhere anywhere owner UID match root ACCEPT all -- anywhere anywhere owner UID match _apt DROP all -- anywhere anywhere # Warning: iptables-legacy tables present, use iptables-legacy to see them root@osestaging1-discourse-ose:/# cat /etc/runit/2.d/000-iptables ################################################################################ # File: /etc/runit/1.d/000-iptables # Version: 0.1 # Purpose: installs & locks-down iptables # Author: Michael Altfield <michael@opensourceecology.org> # Created: 2019-11-26 # Updated: 2019-11-26 ################################################################################ #!/bin/bash sudo apt-get install -y iptables sudo iptables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT sudo iptables -A OUTPUT -m owner --uid-owner 100 -j ACCEPT sudo iptables -A OUTPUT -j DROP sudo ip6tables -A OUTPUT -m owner --uid-owner 0 -j ACCEPT sudo ip6tables -A OUTPUT -m owner --uid-owner 100 -j ACCEPT sudo ip6tables -A OUTPUT -j DROProot@osestaging1-discourse-ose:/#
- ah, I think I understand at least why my script isn't being called when I put it in runlevel 2: as the stage files below show, stage 2 just execs runsvdir against /etc/service; only stages 1 and 3 get the run-parts treatment, so nothing ever executes files in /etc/runit/2.d/
root@osestaging1-discourse-ose:/etc/runit# cat 1 #!/bin/bash /bin/run-parts --verbose --exit-on-error /etc/runit/1.d || exit 100 root@osestaging1-discourse-ose:/etc/runit# cat 2 #!/bin/bash exec /usr/bin/runsvdir -P /etc/service root@osestaging1-discourse-ose:/etc/runit# cat 3 #!/bin/bash /bin/run-parts --verbose /etc/runit/3.d root@osestaging1-discourse-ose:/etc/runit#
- the actual command in `launcher` that finally creates the hostconfig.json file is this `docker run` one
+ id=sha256:940c0024cbd795d6f66c64ebbb1ab96b52a3051393d7d8dd85451f94b5f89c63 + grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/sha256:940c0024cbd795d6f66c64ebbb1ab96b52a3051393d7d8dd85451f94b5f89c63/hostconfig.json grep: /var/lib/docker/containers/sha256:940c0024cbd795d6f66c64ebbb1ab96b52a3051393d7d8dd85451f94b5f89c63/hostconfig.json: No such file or directory + /bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e UNICORN_WORKERS=2 -e DISCOURSE_HOSTNAME=discourse.opensourceecology.org -e DISCOURSE_DEVELOPER_EMAILS=osediscourse_2019@michaelaltfield.net -e DISCOURSE_SMTP_ADDRESS=172.17.0.1 -e DISCOURSE_SMTP_PORT=25 -e DISCOURSE_SMTP_AUTHENTICATION=none -e DISCOURSE_SMTP_OPENSSL_VERIFY_MODE=none -e DISCOURSE_SMTP_ENABLE_START_TLS=false -h osestaging1-discourse-ose -e DOCKER_HOST_IP=172.17.0.1 --name discourse_ose -t -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:fc:97:b8:b4:0d local_discourse/discourse_ose /sbin/boot acb146563bc67ad946d0ae3ec40ebb01c08d51f2b457676aba9a52a67bcb4896 ++ docker inspect '--format=.Id' discourse_ose + id=acb146563bc67ad946d0ae3ec40ebb01c08d51f2b457676aba9a52a67bcb4896 + grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/acb146563bc67ad946d0ae3ec40ebb01c08d51f2b457676aba9a52a67bcb4896/hostconfig.json {"Binds":["/var/discourse/shared/standalone:/shared","/var/discourse/shared/standalone/log/var-log:/var/log"],"ContainerIDFile":"","LogConfig":{"Type":"json-file","Config":{}},"NetworkMode":"default","PortBindings":{},"RestartPolicy":{"Name":"always","MaximumRetryCount":0},"AutoRemove":false,"VolumeDriver":"","VolumesFrom":null,"CapAdd":null,"CapDrop":null,"Capabilities":null,"Dns":[],"DnsOptions":[],"DnsSearch":[],"ExtraHosts":null,"GroupAdd":null,"IpcMode":"private","Cgroup":"","Links":null,"OomScoreAdj":0,"PidMode":"","Privileged":false,"PublishAllPorts":false,"ReadonlyRootfs":false,"SecurityOpt":null,"UTSMode":"","UsernsMode":"","ShmSize":536870912,"Runtime":"runc","ConsoleSize":[0,0],"Isolation":"","CpuShares":0,"Memory":0,"NanoCpus":0,"CgroupParent":"","BlkioWeight":0,"BlkioWeightDevice":[],"BlkioDeviceReadBps":null,"BlkioDeviceWriteBps":null,"BlkioDeviceReadIOps":null,"BlkioDeviceWriteIOps":null,"CpuPeriod":0,"CpuQuota":0,"CpuRealtimePeriod":0,"CpuRealtimeRuntime":0,"CpusetCpus":"","CpusetMems":"","Devices":[],"DeviceCgroupRules":null,"DeviceRequests":null,"KernelMemory":0,"KernelMemoryTCP":0,"MemoryReservation":0,"MemorySwap":0,"MemorySwappiness":null,"OomKillDisable":false,"PidsLimit":null,"Ulimits":null,"CpuCount":0,"CpuPercent":0,"IOMaximumIOps":0,"IOMaximumBandwidth":0,"MaskedPaths":["/proc/asound","/proc/acpi","/proc/kcore","/proc/keys","/proc/latency_stats","/proc/timer_list","/proc/timer_stats","/proc/sched_debug","/proc/scsi","/sys/firmware"],"ReadonlyPaths":["/proc/bus","/proc/fs","/proc/irq","/proc/sys","/proc/sysrq-trigger"]}
- I was finally able to trigger the creation of the hostconfig.json file with this
/bin/docker run --rm -i -a stdin -a stdout --name discourse_ose local_discourse/discourse_ose /sbin/boot
- for example I run this in one terminal
[root@osestaging1 discourse]# /var/discourse/launcher destroy discourse_ose + /bin/docker stop -t 10 discourse_ose Error response from daemon: No such container: discourse_ose discourse_ose was not found [root@osestaging1 discourse]# time nice /var/discourse/launcher bootstrap discourse_ose INFO: checking hostconfig capacities before 5 grep: /var/lib/docker/containers/sha256:940c0024cbd795d6f66c64ebbb1ab96b52a3051393d7d8dd85451f94b5f89c63/hostconfig.json: No such file or directory grep: /var/lib/docker/containers/sha256:940c0024cbd795d6f66c64ebbb1ab96b52a3051393d7d8dd85451f94b5f89c63/hostconfig.json: No such file or directory INFO: checking hostconfig capacities before 6 grep: /var/lib/docker/containers/sha256:940c0024cbd795d6f66c64ebbb1ab96b52a3051393d7d8dd85451f94b5f89c63/hostconfig.json: No such file or directory grep: /var/lib/docker/containers/sha256:940c0024cbd795d6f66c64ebbb1ab96b52a3051393d7d8dd85451f94b5f89c63/hostconfig.json: No such file or directory INFO: checking hostconfig capacities before 7 grep: /var/lib/docker/containers/sha256:940c0024cbd795d6f66c64ebbb1ab96b52a3051393d7d8dd85451f94b5f89c63/hostconfig.json: No such file or directory grep: /var/lib/docker/containers/sha256:940c0024cbd795d6f66c64ebbb1ab96b52a3051393d7d8dd85451f94b5f89c63/hostconfig.json: No such file or directory cd /pups && git pull && /pups/bin/pups --stdin Already up to date. I, [2019-11-26T15:00:50.276279 #1] INFO -- : Loading --stdin I, [2019-11-26T15:00:50.278427 #1] INFO -- : Skipped missing after_code hook I, [2019-11-26T15:00:50.308117 #1] INFO -- : File > /etc/runit/2.d/000-iptables chmod: +x chown: I, [2019-11-26T15:00:50.308310 #1] INFO -- : > echo "Beginning of custom commands" I, [2019-11-26T15:00:50.313145 #1] INFO -- : Beginning of custom commands I, [2019-11-26T15:00:50.313367 #1] INFO -- : > echo "End of custom commands" I, [2019-11-26T15:00:50.318130 #1] INFO -- : End of custom commands sha256:ed80d37d26774cf0d3a51cde9e08808814281f4ebb07cc7a8e44a7c1b333e3da 5e1526cce436c9c343cfc6c58a4b0421548836ac49d5b7f9a525d28a82dc1ed3 Successfully bootstrapped, to startup use ./launcher start discourse_ose real 0m44.255s user 0m1.832s sys 0m1.420s [root@osestaging1 discourse]# id=`docker inspect --format=".Id" discourse_ose` [root@osestaging1 discourse]# grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json grep: /var/lib/docker/containers/sha256:940c0024cbd795d6f66c64ebbb1ab96b52a3051393d7d8dd85451f94b5f89c63/hostconfig.json: No such file or directory [root@osestaging1 discourse]# id=`docker inspect --format=".Id" discourse_ose` [root@osestaging1 discourse]# grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json grep: /var/lib/docker/containers/sha256:940c0024cbd795d6f66c64ebbb1ab96b52a3051393d7d8dd85451f94b5f89c63/hostconfig.json: No such file or directory [root@osestaging1 discourse]# /bin/docker run --rm -i -a stdin -a stdout --name discourse_ose local_discourse/discourse_ose /sbin/boot Cleaning stale PID files Started runsvdir, PID is 25 chgrp: invalid group: ‘syslog’ rsyslogd: imklog: cannot open kernel log (/proc/kmsg): Operation not permitted. rsyslogd: activation of module imklog failed [v8.1901.0 try https://www.rsyslog.com/e/2145 ]
- which gets stuck, then in another terminal: boom!
[root@osestaging1 discourse]# id=`docker inspect --format=".Id" discourse_ose` [root@osestaging1 discourse]# grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json {"Binds":null,"ContainerIDFile":"","LogConfig":{"Type":"json-file","Config":{}},"NetworkMode":"default","PortBindings":{},"RestartPolicy":{"Name":"no","MaximumRetryCount":0},"AutoRemove":true,"VolumeDriver":"","VolumesFrom":null,"CapAdd":null,"CapDrop":null,"Capabilities":null,"Dns":[],"DnsOptions":[],"DnsSearch":[],"ExtraHosts":null,"GroupAdd":null,"IpcMode":"private","Cgroup":"","Links":null,"OomScoreAdj":0,"PidMode":"","Privileged":false,"PublishAllPorts":false,"ReadonlyRootfs":false,"SecurityOpt":null,"UTSMode":"","UsernsMode":"","ShmSize":67108864,"Runtime":"runc","ConsoleSize":[0,0],"Isolation":"","CpuShares":0,"Memory":0,"NanoCpus":0,"CgroupParent":"","BlkioWeight":0,"BlkioWeightDevice":[],"BlkioDeviceReadBps":null,"BlkioDeviceWriteBps":null,"BlkioDeviceReadIOps":null,"BlkioDeviceWriteIOps":null,"CpuPeriod":0,"CpuQuota":0,"CpuRealtimePeriod":0,"CpuRealtimeRuntime":0,"CpusetCpus":"","CpusetMems":"","Devices":[],"DeviceCgroupRules":null,"DeviceRequests":null,"KernelMemory":0,"KernelMemoryTCP":0,"MemoryReservation":0,"MemorySwap":0,"MemorySwappiness":null,"OomKillDisable":false,"PidsLimit":null,"Ulimits":null,"CpuCount":0,"CpuPercent":0,"IOMaximumIOps":0,"IOMaximumBandwidth":0,"MaskedPaths":["/proc/asound","/proc/acpi","/proc/kcore","/proc/keys","/proc/latency_stats","/proc/timer_list","/proc/timer_stats","/proc/sched_debug","/proc/scsi","/sys/firmware"],"ReadonlyPaths":["/proc/bus","/proc/fs","/proc/irq","/proc/sys","/proc/sysrq-trigger"]} [root@osestaging1 discourse]#
- this simpler command works too!
docker run --name discourse_ose local_discourse/discourse_ose whoami
- for example
[root@osestaging1 discourse]# /var/discourse/launcher destroy discourse_ose + /bin/docker stop -t 10 discourse_ose Error response from daemon: No such container: discourse_ose discourse_ose was not found [root@osestaging1 discourse]# time nice /var/discourse/launcher bootstrap discourse_ose ... sha256:2a321b9e0983a5134e9ca9459d4fc31b83eedb209ecf2a95697de6ed78e87542 8f0c04f55129ddc5bf7e9a85a5d724ce5b696f8c31dc525ea5625f1554887315 Successfully bootstrapped, to startup use ./launcher start discourse_ose real 0m26.934s user 0m1.899s sys 0m1.511s [root@osestaging1 discourse]# id=`docker inspect --format=".Id" discourse_ose` [root@osestaging1 discourse]# grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json grep: /var/lib/docker/containers/sha256:940c0024cbd795d6f66c64ebbb1ab96b52a3051393d7d8dd85451f94b5f89c63/hostconfig.json: No such file or directory [root@osestaging1 discourse]# docker run --name discourse_ose local_discourse/discourse_ose whoami root [root@osestaging1 discourse]# id=`docker inspect --format=".Id" discourse_ose` [root@osestaging1 discourse]# grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json {"Binds":null,"ContainerIDFile":"","LogConfig":{"Type":"json-file","Config":{}},"NetworkMode":"default","PortBindings":{},"RestartPolicy":{"Name":"no","MaximumRetryCount":0},"AutoRemove":false,"VolumeDriver":"","VolumesFrom":null,"CapAdd":null,"CapDrop":null,"Capabilities":null,"Dns":[],"DnsOptions":[],"DnsSearch":[],"ExtraHosts":null,"GroupAdd":null,"IpcMode":"private","Cgroup":"","Links":null,"OomScoreAdj":0,"PidMode":"","Privileged":false,"PublishAllPorts":false,"ReadonlyRootfs":false,"SecurityOpt":null,"UTSMode":"","UsernsMode":"","ShmSize":67108864,"Runtime":"runc","ConsoleSize":[0,0],"Isolation":"","CpuShares":0,"Memory":0,"NanoCpus":0,"CgroupParent":"","BlkioWeight":0,"BlkioWeightDevice":[],"BlkioDeviceReadBps":null,"BlkioDeviceWriteBps":null,"BlkioDeviceReadIOps":null,"BlkioDeviceWriteIOps":null,"CpuPeriod":0,"CpuQuota":0,"CpuRealtimePeriod":0,"CpuRealtimeRuntime":0,"CpusetCpus":"","CpusetMems":"","Devices":[],"DeviceCgroupRules":null,"DeviceRequests":null,"KernelMemory":0,"KernelMemoryTCP":0,"MemoryReservation":0,"MemorySwap":0,"MemorySwappiness":null,"OomKillDisable":false,"PidsLimit":null,"Ulimits":null,"CpuCount":0,"CpuPercent":0,"IOMaximumIOps":0,"IOMaximumBandwidth":0,"MaskedPaths":["/proc/asound","/proc/acpi","/proc/kcore","/proc/keys","/proc/latency_stats","/proc/timer_list","/proc/timer_stats","/proc/sched_debug","/proc/scsi","/sys/firmware"],"ReadonlyPaths":["/proc/bus","/proc/fs","/proc/irq","/proc/sys","/proc/sysrq-trigger"]} [root@osestaging1 discourse]#
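- in hindsight, `docker create` might be a cleaner way to get the same effect, since it writes out the container's config (including hostconfig.json) without ever starting a process; untested here:
# UNTESTED: create the container (and thus its hostconfig.json) without running anything
docker create --name discourse_ose local_discourse/discourse_ose /sbin/boot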
- so here's probably a better bootstrap method that hopefully gets the first boot with NET_ADMIN capabilities..
/var/discourse/launcher destroy discourse_ose
time nice /var/discourse/launcher bootstrap discourse_ose
# create hostconfig.json and grant the container NET_ADMIN permissions (for iptables)
# this hack is necessary because `docker start` doesn't take --cap-add
docker run --name discourse_ose local_discourse/discourse_ose whoami
id=`docker inspect --format='{{.Id}}' discourse_ose`
grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json
sed -i 's/"CapAdd":null/"CapAdd":"NET_ADMIN"/' /var/lib/docker/containers/$id/hostconfig.json
grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json
systemctl restart docker
/var/discourse/launcher start discourse_ose
- well fuck, now the last line `launcher start discourse_ose` doesn't actually start the docker instance!
- even manually running it doesn't work. da fuck?
[root@osestaging1 discourse]# docker start discourse_ose discourse_ose [root@osestaging1 discourse]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES [root@osestaging1 discourse]#
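- a sketch of how to see why it dies immediately: ask docker for the container's configured command and its last exit code:
docker inspect --format='{{.State.ExitCode}} {{.Config.Cmd}}' discourse_ose
docker ps -a --filter name=discourse_ose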
- ok, so I think what's happening is that I created the "discourse_ose" container from the "local_discourse/discourse_ose" image, but with `whoami` as its command, so when it's started, it runs only `whoami` and exits. This is a damn catch-22, because I need to make it run something that's *not* a boot so I can give it the NET_ADMIN capability before booting!!
- I tried to "overwrite" the "whoami" with "/sbin/boot" after setting the CapAdd, but it said I had to delete the old one first. Of course, when I delete the old one, then I loose the config
root@osestaging1 discourse]# id=`docker inspect --format=".Id" discourse_ose` [root@osestaging1 discourse]# grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json {"Binds":null,"ContainerIDFile":"","LogConfig":{"Type":"json-file","Config":{}},"NetworkMode":"default","PortBindings":{},"RestartPolicy":{"Name":"no","MaximumRetryCount":0},"AutoRemove":false,"VolumeDriver":"","VolumesFrom":null,"CapAdd":["NET_ADMIN"],"CapDrop":null,"Capabilities":null,"Dns":[],"DnsOptions":[],"DnsSearch":[],"ExtraHosts":null,"GroupAdd":null,"IpcMode":"private","Cgroup":"","Links":null,"OomScoreAdj":0,"PidMode":"","Privileged":false,"PublishAllPorts":false,"ReadonlyRootfs":false,"SecurityOpt":null,"UTSMode":"","UsernsMode":"","ShmSize":67108864,"Runtime":"runc","ConsoleSize":[0,0],"Isolation":"","CpuShares":0,"Memory":0,"NanoCpus":0,"CgroupParent":"","BlkioWeight":0,"BlkioWeightDevice":[],"BlkioDeviceReadBps":null,"BlkioDeviceWriteBps":null,"BlkioDeviceReadIOps":null,"BlkioDeviceWriteIOps":null,"CpuPeriod":0,"CpuQuota":0,"CpuRealtimePeriod":0,"CpuRealtimeRuntime":0,"CpusetCpus":"","CpusetMems":"","Devices":[],"DeviceCgroupRules":null,"DeviceRequests":null,"KernelMemory":0,"KernelMemoryTCP":0,"MemoryReservation":0,"MemorySwap":0,"MemorySwappiness":null,"OomKillDisable":false,"PidsLimit":null,"Ulimits":null,"CpuCount":0,"CpuPercent":0,"IOMaximumIOps":0,"IOMaximumBandwidth":0,"MaskedPaths":["/proc/asound","/proc/acpi","/proc/kcore","/proc/keys","/proc/latency_stats","/proc/timer_list","/proc/timer_stats","/proc/sched_debug","/proc/scsi","/sys/firmware"],"ReadonlyPaths":["/proc/bus","/proc/fs","/proc/irq","/proc/sys","/proc/sysrq-trigger"]} [root@osestaging1 discourse]# [root@osestaging1 discourse]# docker ps -a | grep -i local_discourse 419cfe8ed8bc local_discourse/discourse_ose "whoami" 11 minutes ago Exited (0) 6 minutes ago discourse_ose [root@osestaging1 discourse]# docker rm 419cfe8ed8bc 419cfe8ed8bc [root@osestaging1 discourse]# id=`docker inspect --format=".Id" discourse_ose` [root@osestaging1 discourse]# grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json grep: /var/lib/docker/containers/sha256:940c0024cbd795d6f66c64ebbb1ab96b52a3051393d7d8dd85451f94b5f89c63/hostconfig.json: No such file or directory [root@osestaging1 discourse]#
- ok, what if I create the first docker instance just for a second, then terminate it, then update the configs, then start it again? Something like this?
/var/discourse/launcher destroy discourse_ose
time nice /var/discourse/launcher bootstrap discourse_ose
# create hostconfig.json and grant the container NET_ADMIN permissions (for iptables)
# this hack is necessary because `docker start` doesn't take --cap-add
docker run -d --name discourse_ose local_discourse/discourse_ose /sbin/boot
id=`docker inspect --format='{{.Id}}' discourse_ose`
grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json
sed -i 's/"CapAdd":null/"CapAdd":"NET_ADMIN"/' /var/lib/docker/containers/$id/hostconfig.json
grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json
systemctl restart docker
/var/discourse/launcher start discourse_ose
- that didn't quite work. After the docker restart, the NET_ADMIN capability disappeared; maybe we have to stop the container before updating its config and restarting docker?
[root@osestaging1 discourse]# id=`docker inspect --format=".Id" discourse_ose` [root@osestaging1 discourse]# grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json grep: /var/lib/docker/containers/sha256:940c0024cbd795d6f66c64ebbb1ab96b52a3051393d7d8dd85451f94b5f89c63/hostconfig.json: No such file or directory [root@osestaging1 discourse]# docker run -d --name discourse_ose local_discourse/discourse_ose /sbin/boot 3337ce1d1833172a4543a105396179a102fec4969cf2b3d9208c441009dd0443 [root@osestaging1 discourse]# id=`docker inspect --format=".Id" discourse_ose` [root@osestaging1 discourse]# grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json {"Binds":null,"ContainerIDFile":"","LogConfig":{"Type":"json-file","Config":{}},"NetworkMode":"default","PortBindings":{},"RestartPolicy":{"Name":"no","MaximumRetryCount":0},"AutoRemove":false,"VolumeDriver":"","VolumesFrom":null,"CapAdd":null,"CapDrop":null,"Capabilities":null,"Dns":[],"DnsOptions":[],"DnsSearch":[],"ExtraHosts":null,"GroupAdd":null,"IpcMode":"private","Cgroup":"","Links":null,"OomScoreAdj":0,"PidMode":"","Privileged":false,"PublishAllPorts":false,"ReadonlyRootfs":false,"SecurityOpt":null,"UTSMode":"","UsernsMode":"","ShmSize":67108864,"Runtime":"runc","ConsoleSize":[0,0],"Isolation":"","CpuShares":0,"Memory":0,"NanoCpus":0,"CgroupParent":"","BlkioWeight":0,"BlkioWeightDevice":[],"BlkioDeviceReadBps":null,"BlkioDeviceWriteBps":null,"BlkioDeviceReadIOps":null,"BlkioDeviceWriteIOps":null,"CpuPeriod":0,"CpuQuota":0,"CpuRealtimePeriod":0,"CpuRealtimeRuntime":0,"CpusetCpus":"","CpusetMems":"","Devices":[],"DeviceCgroupRules":null,"DeviceRequests":null,"KernelMemory":0,"KernelMemoryTCP":0,"MemoryReservation":0,"MemorySwap":0,"MemorySwappiness":null,"OomKillDisable":false,"PidsLimit":null,"Ulimits":null,"CpuCount":0,"CpuPercent":0,"IOMaximumIOps":0,"IOMaximumBandwidth":0,"MaskedPaths":["/proc/asound","/proc/acpi","/proc/kcore","/proc/keys","/proc/latency_stats","/proc/timer_list","/proc/timer_stats","/proc/sched_debug","/proc/scsi","/sys/firmware"],"ReadonlyPaths":["/proc/bus","/proc/fs","/proc/irq","/proc/sys","/proc/sysrq-trigger"]} [root@osestaging1 discourse]# sed -i 's/"CapAdd":null/"CapAdd":"NET_ADMIN"/' /var/lib/docker/containers/$id/hostconfig.json [root@osestaging1 discourse]# grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json 
{"Binds":null,"ContainerIDFile":"","LogConfig":{"Type":"json-file","Config":{}},"NetworkMode":"default","PortBindings":{},"RestartPolicy":{"Name":"no","MaximumRetryCount":0},"AutoRemove":false,"VolumeDriver":"","VolumesFrom":null,"CapAdd":"NET_ADMIN","CapDrop":null,"Capabilities":null,"Dns":[],"DnsOptions":[],"DnsSearch":[],"ExtraHosts":null,"GroupAdd":null,"IpcMode":"private","Cgroup":"","Links":null,"OomScoreAdj":0,"PidMode":"","Privileged":false,"PublishAllPorts":false,"ReadonlyRootfs":false,"SecurityOpt":null,"UTSMode":"","UsernsMode":"","ShmSize":67108864,"Runtime":"runc","ConsoleSize":[0,0],"Isolation":"","CpuShares":0,"Memory":0,"NanoCpus":0,"CgroupParent":"","BlkioWeight":0,"BlkioWeightDevice":[],"BlkioDeviceReadBps":null,"BlkioDeviceWriteBps":null,"BlkioDeviceReadIOps":null,"BlkioDeviceWriteIOps":null,"CpuPeriod":0,"CpuQuota":0,"CpuRealtimePeriod":0,"CpuRealtimeRuntime":0,"CpusetCpus":"","CpusetMems":"","Devices":[],"DeviceCgroupRules":null,"DeviceRequests":null,"KernelMemory":0,"KernelMemoryTCP":0,"MemoryReservation":0,"MemorySwap":0,"MemorySwappiness":null,"OomKillDisable":false,"PidsLimit":null,"Ulimits":null,"CpuCount":0,"CpuPercent":0,"IOMaximumIOps":0,"IOMaximumBandwidth":0,"MaskedPaths":["/proc/asound","/proc/acpi","/proc/kcore","/proc/keys","/proc/latency_stats","/proc/timer_list","/proc/timer_stats","/proc/sched_debug","/proc/scsi","/sys/firmware"],"ReadonlyPaths":["/proc/bus","/proc/fs","/proc/irq","/proc/sys","/proc/sysrq-trigger"]} [root@osestaging1 discourse]# systemctl restart docker [root@osestaging1 discourse]# id=`docker inspect --format=".Id" discourse_ose` [root@osestaging1 discourse]# grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json {"Binds":null,"ContainerIDFile":"","LogConfig":{"Type":"json-file","Config":{}},"NetworkMode":"default","PortBindings":{},"RestartPolicy":{"Name":"no","MaximumRetryCount":0},"AutoRemove":false,"VolumeDriver":"","VolumesFrom":null,"CapAdd":null,"CapDrop":null,"Capabilities":null,"Dns":[],"DnsOptions":[],"DnsSearch":[],"ExtraHosts":null,"GroupAdd":null,"IpcMode":"private","Cgroup":"","Links":null,"OomScoreAdj":0,"PidMode":"","Privileged":false,"PublishAllPorts":false,"ReadonlyRootfs":false,"SecurityOpt":null,"UTSMode":"","UsernsMode":"","ShmSize":67108864,"Runtime":"runc","ConsoleSize":[0,0],"Isolation":"","CpuShares":0,"Memory":0,"NanoCpus":0,"CgroupParent":"","BlkioWeight":0,"BlkioWeightDevice":[],"BlkioDeviceReadBps":null,"BlkioDeviceWriteBps":null,"BlkioDeviceReadIOps":null,"BlkioDeviceWriteIOps":null,"CpuPeriod":0,"CpuQuota":0,"CpuRealtimePeriod":0,"CpuRealtimeRuntime":0,"CpusetCpus":"","CpusetMems":"","Devices":[],"DeviceCgroupRules":null,"DeviceRequests":null,"KernelMemory":0,"KernelMemoryTCP":0,"MemoryReservation":0,"MemorySwap":0,"MemorySwappiness":null,"OomKillDisable":false,"PidsLimit":null,"Ulimits":null,"CpuCount":0,"CpuPercent":0,"IOMaximumIOps":0,"IOMaximumBandwidth":0,"MaskedPaths":["/proc/asound","/proc/acpi","/proc/kcore","/proc/keys","/proc/latency_stats","/proc/timer_list","/proc/timer_stats","/proc/sched_debug","/proc/scsi","/sys/firmware"],"ReadonlyPaths":["/proc/bus","/proc/fs","/proc/irq","/proc/sys","/proc/sysrq-trigger"]} [root@osestaging1 discourse]#
- let's try this
/var/discourse/launcher destroy discourse_ose
time nice /var/discourse/launcher bootstrap discourse_ose
# create hostconfig.json and grant the container NET_ADMIN permissions (for iptables)
# this hack is necessary because `docker start` doesn't take --cap-add
docker run -d --name discourse_ose local_discourse/discourse_ose /sbin/boot
/var/discourse/launcher stop discourse_ose
id=`docker inspect --format='{{.Id}}' discourse_ose`
grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json
sed -i 's/"CapAdd":null/"CapAdd":"NET_ADMIN"/' /var/lib/docker/containers/$id/hostconfig.json
grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json
systemctl restart docker
/var/discourse/launcher start discourse_ose
- the config stuck after the restart this time!
root@osestaging1 discourse]# id=`docker inspect --format=".Id" discourse_ose` [root@osestaging1 discourse]# grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json grep: /var/lib/docker/containers/sha256:940c0024cbd795d6f66c64ebbb1ab96b52a3051393d7d8dd85451f94b5f89c63/hostconfig.json: No such file or directory [root@osestaging1 discourse]# [root@osestaging1 discourse]# [root@osestaging1 discourse]# [root@osestaging1 discourse]# docker run -d --name discourse_ose local_discourse/discourse_ose /sbin/boot 91549063df643166ed7ac82c8fb79a4db42562dc2a568cee2462e488a2ef5e4f [root@osestaging1 discourse]# /var/discourse/launcher stop discourse_ose ... [root@osestaging1 discourse]# id=`docker inspect --format=".Id" discourse_ose` [root@osestaging1 discourse]# grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json {"Binds":null,"ContainerIDFile":"","LogConfig":{"Type":"json-file","Config":{}},"NetworkMode":"default","PortBindings":{},"RestartPolicy":{"Name":"no","MaximumRetryCount":0},"AutoRemove":false,"VolumeDriver":"","VolumesFrom":null,"CapAdd":null,"CapDrop":null,"Capabilities":null,"Dns":[],"DnsOptions":[],"DnsSearch":[],"ExtraHosts":null,"GroupAdd":null,"IpcMode":"private","Cgroup":"","Links":null,"OomScoreAdj":0,"PidMode":"","Privileged":false,"PublishAllPorts":false,"ReadonlyRootfs":false,"SecurityOpt":null,"UTSMode":"","UsernsMode":"","ShmSize":67108864,"Runtime":"runc","ConsoleSize":[0,0],"Isolation":"","CpuShares":0,"Memory":0,"NanoCpus":0,"CgroupParent":"","BlkioWeight":0,"BlkioWeightDevice":[],"BlkioDeviceReadBps":null,"BlkioDeviceWriteBps":null,"BlkioDeviceReadIOps":null,"BlkioDeviceWriteIOps":null,"CpuPeriod":0,"CpuQuota":0,"CpuRealtimePeriod":0,"CpuRealtimeRuntime":0,"CpusetCpus":"","CpusetMems":"","Devices":[],"DeviceCgroupRules":null,"DeviceRequests":null,"KernelMemory":0,"KernelMemoryTCP":0,"MemoryReservation":0,"MemorySwap":0,"MemorySwappiness":null,"OomKillDisable":false,"PidsLimit":null,"Ulimits":null,"CpuCount":0,"CpuPercent":0,"IOMaximumIOps":0,"IOMaximumBandwidth":0,"MaskedPaths":["/proc/asound","/proc/acpi","/proc/kcore","/proc/keys","/proc/latency_stats","/proc/timer_list","/proc/timer_stats","/proc/sched_debug","/proc/scsi","/sys/firmware"],"ReadonlyPaths":["/proc/bus","/proc/fs","/proc/irq","/proc/sys","/proc/sysrq-trigger"]} [root@osestaging1 discourse]# sed -i 's/"CapAdd":null/"CapAdd":"NET_ADMIN"/' /var/lib/docker/containers/$id/hostconfig.json [root@osestaging1 discourse]# grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json 
{"Binds":null,"ContainerIDFile":"","LogConfig":{"Type":"json-file","Config":{}},"NetworkMode":"default","PortBindings":{},"RestartPolicy":{"Name":"no","MaximumRetryCount":0},"AutoRemove":false,"VolumeDriver":"","VolumesFrom":null,"CapAdd":"NET_ADMIN","CapDrop":null,"Capabilities":null,"Dns":[],"DnsOptions":[],"DnsSearch":[],"ExtraHosts":null,"GroupAdd":null,"IpcMode":"private","Cgroup":"","Links":null,"OomScoreAdj":0,"PidMode":"","Privileged":false,"PublishAllPorts":false,"ReadonlyRootfs":false,"SecurityOpt":null,"UTSMode":"","UsernsMode":"","ShmSize":67108864,"Runtime":"runc","ConsoleSize":[0,0],"Isolation":"","CpuShares":0,"Memory":0,"NanoCpus":0,"CgroupParent":"","BlkioWeight":0,"BlkioWeightDevice":[],"BlkioDeviceReadBps":null,"BlkioDeviceWriteBps":null,"BlkioDeviceReadIOps":null,"BlkioDeviceWriteIOps":null,"CpuPeriod":0,"CpuQuota":0,"CpuRealtimePeriod":0,"CpuRealtimeRuntime":0,"CpusetCpus":"","CpusetMems":"","Devices":[],"DeviceCgroupRules":null,"DeviceRequests":null,"KernelMemory":0,"KernelMemoryTCP":0,"MemoryReservation":0,"MemorySwap":0,"MemorySwappiness":null,"OomKillDisable":false,"PidsLimit":null,"Ulimits":null,"CpuCount":0,"CpuPercent":0,"IOMaximumIOps":0,"IOMaximumBandwidth":0,"MaskedPaths":["/proc/asound","/proc/acpi","/proc/kcore","/proc/keys","/proc/latency_stats","/proc/timer_list","/proc/timer_stats","/proc/sched_debug","/proc/scsi","/sys/firmware"],"ReadonlyPaths":["/proc/bus","/proc/fs","/proc/irq","/proc/sys","/proc/sysrq-trigger"]} [root@osestaging1 discourse]# systemctl restart docker [root@osestaging1 discourse]# id=`docker inspect --format=".Id" discourse_ose` [root@osestaging1 discourse]# grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json {"Binds":null,"ContainerIDFile":"","LogConfig":{"Type":"json-file","Config":{}},"NetworkMode":"default","PortBindings":{},"RestartPolicy":{"Name":"no","MaximumRetryCount":0},"AutoRemove":false,"VolumeDriver":"","VolumesFrom":null,"CapAdd":["NET_ADMIN"],"CapDrop":null,"Capabilities":null,"Dns":[],"DnsOptions":[],"DnsSearch":[],"ExtraHosts":null,"GroupAdd":null,"IpcMode":"private","Cgroup":"","Links":null,"OomScoreAdj":0,"PidMode":"","Privileged":false,"PublishAllPorts":false,"ReadonlyRootfs":false,"SecurityOpt":null,"UTSMode":"","UsernsMode":"","ShmSize":67108864,"Runtime":"runc","ConsoleSize":[0,0],"Isolation":"","CpuShares":0,"Memory":0,"NanoCpus":0,"CgroupParent":"","BlkioWeight":0,"BlkioWeightDevice":[],"BlkioDeviceReadBps":null,"BlkioDeviceWriteBps":null,"BlkioDeviceReadIOps":null,"BlkioDeviceWriteIOps":null,"CpuPeriod":0,"CpuQuota":0,"CpuRealtimePeriod":0,"CpuRealtimeRuntime":0,"CpusetCpus":"","CpusetMems":"","Devices":[],"DeviceCgroupRules":null,"DeviceRequests":null,"KernelMemory":0,"KernelMemoryTCP":0,"MemoryReservation":0,"MemorySwap":0,"MemorySwappiness":null,"OomKillDisable":false,"PidsLimit":null,"Ulimits":null,"CpuCount":0,"CpuPercent":0,"IOMaximumIOps":0,"IOMaximumBandwidth":0,"MaskedPaths":["/proc/asound","/proc/acpi","/proc/kcore","/proc/keys","/proc/latency_stats","/proc/timer_list","/proc/timer_stats","/proc/sched_debug","/proc/scsi","/sys/firmware"],"ReadonlyPaths":["/proc/bus","/proc/fs","/proc/irq","/proc/sys","/proc/sysrq-trigger"]} [root@osestaging1 discourse]#
- and now the first boot lets root use iptables, yay!
root@91549063df64:/# iptables -L Chain INPUT (policy ACCEPT) target prot opt source destination Chain FORWARD (policy ACCEPT) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination # Warning: iptables-legacy tables present, use iptables-legacy to see them root@91549063df64:/#
- problem is, I think we lost a lot of important stuff (like the persistent volume mounts) that the original run had:
+ /bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e UNICORN_WORKERS=2 -e DISCOURSE_HOSTNAME=discourse.opensourceecology.org -e DISCOURSE_DEVELOPER_EMAILS=osediscourse_2019@michaelaltfield.net -e DISCOURSE_SMTP_ADDRESS=172.17.0.1 -e DISCOURSE_SMTP_PORT=25 -e DISCOURSE_SMTP_AUTHENTICATION=none -e DISCOURSE_SMTP_OPENSSL_VERIFY_MODE=none -e DISCOURSE_SMTP_ENABLE_START_TLS=false -h osestaging1-discourse-ose -e DOCKER_HOST_IP=172.17.0.1 --name discourse_ose -t -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:fc:97:b8:b4:0d local_discourse/discourse_ose /sbin/boot
- let's just use launcher commands then; this works
/var/discourse/launcher destroy discourse_ose
time nice /var/discourse/launcher bootstrap discourse_ose

# create hostconfig.json and grant the container NET_ADMIN permissions (for iptables)
# this hack is necessary because `docker start` doesn't take --cap-add
/var/discourse/launcher start discourse_ose
sleep 1
/var/discourse/launcher stop discourse_ose

id=`docker inspect --format='{{.Id}}' discourse_ose`
grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json
sed -i 's/"CapAdd":null/"CapAdd":"NET_ADMIN"/' /var/lib/docker/containers/$id/hostconfig.json
grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/$id/hostconfig.json
systemctl restart docker
/var/discourse/launcher start discourse_ose
- but when I put my iptables script in runlevel 1, the container still gets stuck in a restart loop. Because of this, I can't actually use the launcher to enter the container and test it out
[root@osestaging1 discourse]# ./launcher enter discourse_ose Error response from daemon: Container bac54f66d4e39052bcede900d734380ef82725fe835f5e3b18da6c0a704e7e5b is restarting, wait until the container is running [root@osestaging1 discourse]#
- but I was able to start a shell in a fresh container from the 'local_discourse/discourse_ose' image (whatever that is) and confirm that it doesn't have the NET_ADMIN capability--whatever *that* is
[root@osestaging1 discourse]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES bac54f66d4e3 local_discourse/discourse_ose "/sbin/boot" 7 minutes ago Restarting (100) Less than a second ago discourse_ose [root@osestaging1 discourse]# docker run -it bac54f66d4e3 /bin/bash Unable to find image 'bac54f66d4e3:latest' locally docker: Error response from daemon: pull access denied for bac54f66d4e3, repository does not exist or may require 'docker login': denied: requested access to the resource is denied. See 'docker run --help'. [root@osestaging1 discourse]# docker run -it local_discourse/discourse_ose /bin/bash root@24a1f9f4c038:/# iptables -L bash: iptables: command not found root@24a1f9f4c038:/# apt-get install iptables ... root@24a1f9f4c038:/# iptables -L # Warning: iptables-legacy tables present, use iptables-legacy to see them iptables: Permission denied (you must be root). root@24a1f9f4c038:/#
- ok, so it looks like `docker run` does take a '--cap-add' option, and since `docker start` appears to just reference the container that's "created" (?) when the `launcher` script runs that long `docker run` command, perhaps we can add it there. This critical line occurs at the very end of the run_start() function in `launcher` https://docs.docker.com/engine/reference/commandline/run/
+ /bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e UNICORN_WORKERS=2 -e DISCOURSE_HOSTNAME=discourse.opensourceecology.org -e DISCOURSE_DEVELOPER_EMAILS=osediscourse_2019@michaelaltfield.net -e DISCOURSE_SMTP_ADDRESS=172.17.0.1 -e DISCOURSE_SMTP_PORT=25 -e DISCOURSE_SMTP_AUTHENTICATION=none -e DISCOURSE_SMTP_OPENSSL_VERIFY_MODE=none -e DISCOURSE_SMTP_ENABLE_START_TLS=false -h osestaging1-discourse-ose -e DOCKER_HOST_IP=172.17.0.1 --name discourse_ose -t -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:fc:97:b8:b4:0d local_discourse/discourse_ose /sbin/boot
- that appears to have worked, but my iptables runit script is still causing the container to restart in a loop
- let me remove that iptables template and connect to the machine on first boot to see if it really does have the NET_ADMIN capability; it does!
[root@osestaging1 discourse]# time nice /var/discourse/launcher bootstrap discourse_ose ... Successfully bootstrapped, to startup use ./launcher start discourse_ose real 0m24.815s user 0m1.690s sys 0m1.386s [root@osestaging1 discourse]# /var/discourse/launcher start discourse_ose ... grep: /var/lib/docker/containers/sha256:940c0024cbd795d6f66c64ebbb1ab96b52a3051393d7d8dd85451f94b5f89c63/hostconfig.json: No such file or directory + echo 'INFO: checking hostconfig capacities before 16' INFO: checking hostconfig capacities before 16 ++ docker inspect '--format=.Id' discourse_ose + id=sha256:940c0024cbd795d6f66c64ebbb1ab96b52a3051393d7d8dd85451f94b5f89c63 + grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/sha256:940c0024cbd795d6f66c64ebbb1ab96b52a3051393d7d8dd85451f94b5f89c63/hostconfig.json grep: /var/lib/docker/containers/sha256:940c0024cbd795d6f66c64ebbb1ab96b52a3051393d7d8dd85451f94b5f89c63/hostconfig.json: No such file or directory + echo 'run_image: local_discourse/discourse_ose' run_image: local_discourse/discourse_ose + /bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e UNICORN_WORKERS=2 -e DISCOURSE_HOSTNAME=discourse.opensourceecology.org -e DISCOURSE_DEVELOPER_EMAILS=osediscourse_2019@michaelaltfield.net -e DISCOURSE_SMTP_ADDRESS=172.17.0.1 -e DISCOURSE_SMTP_PORT=25 -e DISCOURSE_SMTP_AUTHENTICATION=none -e DISCOURSE_SMTP_OPENSSL_VERIFY_MODE=none -e DISCOURSE_SMTP_ENABLE_START_TLS=false -h osestaging1-discourse-ose --cap-add NET_ADMIN -e DOCKER_HOST_IP=172.17.0.1 --name discourse_ose -t -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:fc:97:b8:b4:0d local_discourse/discourse_ose /sbin/boot ed44b77cea78b62b9a2d7114d30eb3d4c9b4b2123296a0b019ecff132f743bd9 ++ docker inspect '--format=.Id' discourse_ose + id=ed44b77cea78b62b9a2d7114d30eb3d4c9b4b2123296a0b019ecff132f743bd9 + grep -E 'CapAdd|CapDrop|Capabilities' /var/lib/docker/containers/ed44b77cea78b62b9a2d7114d30eb3d4c9b4b2123296a0b019ecff132f743bd9/hostconfig.json 
{"Binds":["/var/discourse/shared/standalone:/shared","/var/discourse/shared/standalone/log/var-log:/var/log"],"ContainerIDFile":"","LogConfig":{"Type":"json-file","Config":{}},"NetworkMode":"default","PortBindings":{},"RestartPolicy":{"Name":"always","MaximumRetryCount":0},"AutoRemove":false,"VolumeDriver":"","VolumesFrom":null,"CapAdd":["NET_ADMIN"],"CapDrop":null,"Capabilities":null,"Dns":[],"DnsOptions":[],"DnsSearch":[],"ExtraHosts":null,"GroupAdd":null,"IpcMode":"private","Cgroup":"","Links":null,"OomScoreAdj":0,"PidMode":"","Privileged":false,"PublishAllPorts":false,"ReadonlyRootfs":false,"SecurityOpt":null,"UTSMode":"","UsernsMode":"","ShmSize":536870912,"Runtime":"runc","ConsoleSize":[0,0],"Isolation":"","CpuShares":0,"Memory":0,"NanoCpus":0,"CgroupParent":"","BlkioWeight":0,"BlkioWeightDevice":[],"BlkioDeviceReadBps":null,"BlkioDeviceWriteBps":null,"BlkioDeviceReadIOps":null,"BlkioDeviceWriteIOps":null,"CpuPeriod":0,"CpuQuota":0,"CpuRealtimePeriod":0,"CpuRealtimeRuntime":0,"CpusetCpus":"","CpusetMems":"","Devices":[],"DeviceCgroupRules":null,"DeviceRequests":null,"KernelMemory":0,"KernelMemoryTCP":0,"MemoryReservation":0,"MemorySwap":0,"MemorySwappiness":null,"OomKillDisable":false,"PidsLimit":null,"Ulimits":null,"CpuCount":0,"CpuPercent":0,"IOMaximumIOps":0,"IOMaximumBandwidth":0,"MaskedPaths":["/proc/asound","/proc/acpi","/proc/kcore","/proc/keys","/proc/latency_stats","/proc/timer_list","/proc/timer_stats","/proc/sched_debug","/proc/scsi","/sys/firmware"],"ReadonlyPaths":["/proc/bus","/proc/fs","/proc/irq","/proc/sys","/proc/sysrq-trigger"]} [root@osestaging1 discourse]# [root@osestaging1 discourse]# ./launcher enter discourse_ose root@osestaging1-discourse-ose:/# sudo apt-get install iptables root@osestaging1-discourse-ose:/# iptables -L Chain INPUT (policy ACCEPT) target prot opt source destination Chain FORWARD (policy ACCEPT) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination # Warning: iptables-legacy tables present, use iptables-legacy to see them root@osestaging1-discourse-ose:/#
- I manually added the runit script, and it executed fine!
... root@osestaging1-discourse-ose:/# chmod +x /etc/runit/1.d/000-iptables root@osestaging1-discourse-ose:/# /etc/runit/1.d/000-iptables Reading package lists... Done Building dependency tree Reading state information... Done iptables is already the newest version (1.8.2-4). 0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded. root@osestaging1-discourse-ose:/# echo $? 0 root@osestaging1-discourse-ose:/# iptables -L Chain INPUT (policy ACCEPT) target prot opt source destination Chain FORWARD (policy ACCEPT) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination ACCEPT all -- anywhere anywhere owner UID match root ACCEPT all -- anywhere anywhere owner UID match _apt DROP all -- anywhere anywhere # Warning: iptables-legacy tables present, use iptables-legacy to see them root@osestaging1-discourse-ose:/#
- well, now that I've fixed all possible issues with not having NET_ADMIN permissions on the first boot by adding the '--cap-add=NET_ADMIN' argument to the final `docker run` line of the `run_start()` function of the `launcher` script (which applies the NET_ADMIN capability to all subsequent `docker start` calls wrapped by `launcher start discourse_ose`), I should be able to isolate the restart loop to a single line in the runit script.
- tomorrow I'll try to just make my script a single line = `exit 0`. That should work. Then I can add the `apt-get` line. Then a single `iptables` line
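- to make that concrete, here's a minimal sketch of that plan (the script path is the one from the manual test above; the rule in step 3 is just an example):
# step 1: a no-op script; if the container still bootloops, the problem is
#         the file's mere presence, not its contents
cat > /etc/runit/1.d/000-iptables <<'EOF'
#!/bin/bash
exit 0
EOF
chmod +x /etc/runit/1.d/000-iptables
# step 2: append the `apt-get install -y iptables` line
# step 3: append a single iptables rule, e.g.
#         iptables -A OUTPUT -p tcp -m owner --uid-owner root -j ACCEPT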
Mon Nov 25, 2019
- Our post on offline wiki archiving, with the video on how to view the OSE wiki offline on a phone using kiwix, is live https://www.opensourceecology.org/wp-admin/post.php?post=11468&action=edit
- ...
- Because it was unclear to me when the Dockerfile is actually run (I thought it was run when doing a `docker pull`) and I didn't find an easy answer to this question on the Internet, I posted a question & answer to serverfault to help future docker n00bs (short answer: the Dockerfile is only executed by `docker build`) https://serverfault.com/questions/993177/what-is-responsible-for-calling-and-running-a-dockerfile/993178#993178
- ...
- I was about to post to the discourse forums describing my solution of relaying mail through the docker host via SMTP with no auth, but I found this https://meta.discourse.org/t/how-to-set-smtp-config-to-use-localhost/131464
- I'm still not sure how durable it is to hard-code the docker host IP as 172.17.0.1, but this other user had the same IP, so I guess it's pretty common at least (a sketch of deriving it instead of hard-coding it follows below)
- I ended up documenting this on the main topic for troubleshooting email on a new Discourse install here https://meta.discourse.org/t/troubleshooting-email-on-a-new-discourse-install/16326/375
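- for what it's worth, rather than hard-coding 172.17.0.1, the docker host's address can be derived inside the container from the default route; a minimal sketch (assumes the standard bridge network, where the host is the container's default gateway, and that iproute2 is installed in the container):
# inside the container: the docker host is the default gateway on the
# default bridge network (user-defined networks may differ)
DOCKER_HOST_IP="$(ip route | awk '/^default/ {print $3}')"
echo "$DOCKER_HOST_IP"    # expect 172.17.0.1 on a stock setup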
- ...
- there's a new update available for both discourse and the 'docker_manager' plugin. I've wanted to test the update procedure when the docker container has no internet access (only the docker host does), so let's figure that out
- Jesus, I wanted to simply note "we're on version X" and "the latest version is Y", but apparently Discourse is more fucking complex than that.
- In fact, we *are* running the latest release = v2.4.0.beta7, which is visible from the '/admin' page of our Discourse site https://github.com/discourse/discourse/releases
- But apparently the updates that we're behind on are 88 new commits to the 'discourse' repo https://github.com/discourse/discourse/compare/84107c61a7...22eb1828f6
- does that mean that if I update Discourse it would get all of these commits, rather than them being put into a properly tested release? Jesus, this is sketchy. Note that the Discourse team doesn't actually backport fixes to the previous stable release, so it's recommended that you stick with the beta branch. God this sucks https://meta.discourse.org/t/please-dont-pressure-self-installers-to-be-on-beta-branch/32237/4
- And 1 commit to the 'docker_manager' plugin. https://github.com/discourse/docker_manager/compare/e4c82d3...bc4318f
- this one makes sense, but wouldn't the 'discourse' repo just be part of the main release version?
- first up: let me reverse my changes that gave the docker containers internet access (which I'd made for my PoC setting up ModSecurity support in nginx, so I could, for example, download ModSecurity from within the container).
[root@osestaging1 discourse]# grep -ir ExecStart /usr/lib/systemd/system/docker.service #ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --iptables=false [root@osestaging1 discourse]# systemctl restart docker.service Warning: docker.service changed on disk. Run 'systemctl daemon-reload' to reload units. ^C [root@osestaging1 discourse]# ^C [root@osestaging1 discourse]# systemctl daemon-reload [root@osestaging1 discourse]# systemctl restart docker.service [root@osestaging1 discourse]# iptables-save # Generated by iptables-save v1.4.21 on Mon Nov 25 11:38:32 2019 *mangle :PREROUTING ACCEPT [1469:151806] :INPUT ACCEPT [1469:151806] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [1126:194886] :POSTROUTING ACCEPT [1120:194526] COMMIT # Completed on Mon Nov 25 11:38:32 2019 # Generated by iptables-save v1.4.21 on Mon Nov 25 11:38:32 2019 *nat :PREROUTING ACCEPT [0:0] :INPUT ACCEPT [0:0] :OUTPUT ACCEPT [4:304] :POSTROUTING ACCEPT [4:304] :DOCKER - [0:0] -A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER -A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE -A DOCKER -i docker0 -j RETURN COMMIT # Completed on Mon Nov 25 11:38:32 2019 # Generated by iptables-save v1.4.21 on Mon Nov 25 11:38:32 2019 *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [4:304] :DOCKER - [0:0] :DOCKER-ISOLATION-STAGE-1 - [0:0] :DOCKER-ISOLATION-STAGE-2 - [0:0] :DOCKER-USER - [0:0] -A INPUT -s 5.9.144.234/32 -j DROP -A INPUT -s 173.234.159.250/32 -j DROP -A INPUT -i lo -j ACCEPT -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A INPUT -p icmp -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 80 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 443 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 4443 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 4444 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 32415 -j ACCEPT -A INPUT -m limit --limit 5/min -j LOG --log-prefix "iptables IN denied: " --log-level 7 -A INPUT -j DROP -A FORWARD -j DOCKER-USER -A FORWARD -j DOCKER-ISOLATION-STAGE-1 -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT -A FORWARD -o docker0 -j DOCKER -A FORWARD -i docker0 ! -o docker0 -j ACCEPT -A FORWARD -i docker0 -o docker0 -j ACCEPT -A FORWARD -s 5.9.144.234/32 -j DROP -A FORWARD -s 173.234.159.250/32 -j DROP -A OUTPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A OUTPUT -s 127.0.0.1/32 -d 127.0.0.1/32 -j ACCEPT -A OUTPUT -d 213.133.98.98/32 -p udp -m udp --dport 53 -j ACCEPT -A OUTPUT -d 213.133.99.99/32 -p udp -m udp --dport 53 -j ACCEPT -A OUTPUT -d 213.133.100.100/32 -p udp -m udp --dport 53 -j ACCEPT -A OUTPUT -m limit --limit 5/min -j LOG --log-prefix "iptables OUT denied: " --log-level 7 -A OUTPUT -p tcp -m owner --uid-owner 48 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 27 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 995 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 994 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 993 -j DROP -A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2 -A DOCKER-ISOLATION-STAGE-1 -j RETURN -A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP -A DOCKER-ISOLATION-STAGE-2 -j RETURN -A DOCKER-USER -j RETURN COMMIT # Completed on Mon Nov 25 11:38:32 2019 [root@osestaging1 discourse]#
- hmm, that didn't work?
[root@osestaging1 discourse]# ./launcher enter discourse_ose root@osestaging1-discourse-ose:/var/www/discourse# ping 1.1.1.1 bash: ping: command not found root@osestaging1-discourse-ose:/var/www/discourse# curl 1.1.1.1 <html> <head><title>301 Moved Permanently</title></head> <body bgcolor="white"> <center><h1>301 Moved Permanently</h1></center> <hr><center>cloudflare-lb</center> </body> </html> root@osestaging1-discourse-ose:/var/www/discourse#
- stopping & starting the app didn't work either :\
[root@osestaging1 discourse]# ./launcher stop discourse_ose + /bin/docker stop -t 10 discourse_ose discourse_ose [root@osestaging1 discourse]# ./launcher start discourse_ose starting up existing container + /bin/docker start discourse_ose discourse_ose [root@osestaging1 discourse]# ./launcher enter discourse_ose root@osestaging1-discourse-ose:/var/www/discourse# curl 1.1.1.1 <html> <head><title>301 Moved Permanently</title></head> <body bgcolor="white"> <center><h1>301 Moved Permanently</h1></center> <hr><center>cloudflare-lb</center> </body> </html> root@osestaging1-discourse-ose:/var/www/discourse#
- destroy didn't do it either
[root@osestaging1 discourse]# ./launcher destroy discourse_ose + /bin/docker stop -t 10 discourse_ose discourse_ose + /bin/docker rm discourse_ose discourse_ose [root@osestaging1 discourse]# ./launcher start discourse_ose + /bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=2 -e UNICORN_SIDEKIQS=1 -e RUBY_GLOBAL_METHOD_CACHE_SIZE=131072 -e RUBY_GC_HEAP_GROWTH_MAX_SLOTS=40000 -e RUBY_GC_HEAP_INIT_SLOTS=400000 -e RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.5 -e DISCOURSE_DB_SOCKET=/var/run/postgresql -e DISCOURSE_DB_HOST= -e DISCOURSE_DB_PORT= -e DISCOURSE_HOSTNAME=discourse.opensourceecology.org -e DISCOURSE_DEVELOPER_EMAILS=osediscourse_2019@michaelaltfield.net -e DISCOURSE_SMTP_ADDRESS=172.17.0.1 -e DISCOURSE_SMTP_PORT=25 -e DISCOURSE_SMTP_AUTHENTICATION=none -e DISCOURSE_SMTP_OPENSSL_VERIFY_MODE=none -e DISCOURSE_SMTP_ENABLE_START_TLS=false -h osestaging1-discourse-ose -e DOCKER_HOST_IP=172.17.0.1 --name discourse_ose -t -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:fc:97:b8:b4:0d local_discourse/discourse_ose /sbin/boot d201a1a4c67111e84d095e2f712bc56fdaaab35375cfa09cf0865156c3b9b8f0 [root@osestaging1 discourse]# ./launcher enter discourse_ose root@osestaging1-discourse-ose:/var/www/discourse# curl 1.1.1.1 <html> <head><title>301 Moved Permanently</title></head> <body bgcolor="white"> <center><h1>301 Moved Permanently</h1></center> <hr><center>cloudflare-lb</center> </body> </html> root@osestaging1-discourse-ose:/var/www/discourse#
- I got it working; I think the key is to stop the docker containers first, then fix iptables, then start docker (the sequence is summarized in a sketch after the output below)
[root@osestaging1 discourse]# systemctl start docker [root@osestaging1 discourse]# iptables-save # Generated by iptables-save v1.4.21 on Mon Nov 25 11:47:14 2019 *mangle :PREROUTING ACCEPT [835:70141] :INPUT ACCEPT [831:69901] :FORWARD ACCEPT [4:240] :OUTPUT ACCEPT [487:67065] :POSTROUTING ACCEPT [490:67245] COMMIT # Completed on Mon Nov 25 11:47:14 2019 # Generated by iptables-save v1.4.21 on Mon Nov 25 11:47:14 2019 *nat :PREROUTING ACCEPT [1:60] :INPUT ACCEPT [0:0] :OUTPUT ACCEPT [10:728] :POSTROUTING ACCEPT [10:728] COMMIT # Completed on Mon Nov 25 11:47:14 2019 # Generated by iptables-save v1.4.21 on Mon Nov 25 11:47:14 2019 *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [2:152] :DOCKER-USER - [0:0] -A INPUT -s 5.9.144.234/32 -j DROP -A INPUT -s 173.234.159.250/32 -j DROP -A INPUT -i lo -j ACCEPT -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A INPUT -p icmp -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 80 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 443 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 4443 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 4444 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 32415 -j ACCEPT -A INPUT -m limit --limit 5/min -j LOG --log-prefix "iptables IN denied: " --log-level 7 -A INPUT -j DROP -A FORWARD -j DOCKER-USER -A FORWARD -s 5.9.144.234/32 -j DROP -A FORWARD -s 173.234.159.250/32 -j DROP -A OUTPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A OUTPUT -s 127.0.0.1/32 -d 127.0.0.1/32 -j ACCEPT -A OUTPUT -d 213.133.98.98/32 -p udp -m udp --dport 53 -j ACCEPT -A OUTPUT -d 213.133.99.99/32 -p udp -m udp --dport 53 -j ACCEPT -A OUTPUT -d 213.133.100.100/32 -p udp -m udp --dport 53 -j ACCEPT -A OUTPUT -m limit --limit 5/min -j LOG --log-prefix "iptables OUT denied: " --log-level 7 -A OUTPUT -p tcp -m owner --uid-owner 48 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 27 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 995 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 994 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 993 -j DROP -A DOCKER-USER -j RETURN COMMIT # Completed on Mon Nov 25 11:47:14 2019 [root@osestaging1 discourse]# ./launcher enter discourse_ose root@osestaging1-discourse-ose:/var/www/discourse# curl 1.1.1.1
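- in other words, the sequence that worked appears to be roughly this (a sketch; how the host's ruleset gets restored is an assumption--on this CentOS box I'm assuming the stock iptables service loading /etc/sysconfig/iptables):
# stop the app container first so docker doesn't re-add its NAT/FORWARD rules
/var/discourse/launcher stop discourse_ose
systemctl stop docker
# restore the host's own ruleset (substitute however your rules are persisted)
systemctl restart iptables
# bring docker (now with --iptables=false) and the app back up
systemctl start docker
/var/discourse/launcher start discourse_ose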
- damn, now browsing to the same '/upgrade' page tells me that we *are* up-to-date. Why does it lie rather than just saying it can't access the updates page? https://discourse.opensourceecology.org/admin/upgrade
- for example, the plugin 'docker_manager' says we're on commit 'e4c82d3', but clearly there's a new commit 'bc4318f' after that https://github.com/discourse/docker_manager/commits/master
- so the update process seems pretty straightforward: first do a `git pull` in /var/discourse, then my added step to rebuild the docker image locally, then rebuild the app with the launcher script. None of that should require the docker container itself to have internet access, but apparently something does: pups?
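- a rough sketch of that intended procedure (image tag and paths as used earlier in this log):
cd /var/discourse
git pull                           # update the discourse_docker scripts themselves
# my added step: rebuild the base image locally from the (modified) Dockerfile
docker build --tag 'discourse_ose' /var/discourse/image/base/
./launcher rebuild discourse_ose   # rebuild & restart the app container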
[root@osestaging1 discourse]# ${vhostDir}/launcher rebuild discourse_ose Ensuring launcher is up to date Fetching origin Launcher is up-to-date Stopping old container + /bin/docker stop -t 10 discourse_ose discourse_ose cd /pups && git pull && /pups/bin/pups --stdin fatal: unable to access 'https://github.com/discourse/pups.git/': Could not resolve host: github.com 12fca265a98f2b6eae828341d2d25f104a82282e1e8150c367751ff2240270a2 FAILED TO BOOTSTRAP please scroll up and look for earlier error messages, there may be more than one. ./discourse-doctor may help diagnose the problem. [root@osestaging1 discourse]#
- oddly, the only place where I saw this being done is in the Dockerfile, but I know from the past couple weeks of fighting with Discourse that this does *not* get run unless a `docker build` is called, which I've done on the *host* above
[root@osestaging1 ~]# cd /var/discourse/ [root@osestaging1 discourse]# grep -ir 'pups.git' * image/base/Dockerfile: cd / && git clone https://github.com/discourse/pups.git [root@osestaging1 discourse]# grep -irC4 'pups.git' * image/base/Dockerfile-RUN gem update --system image/base/Dockerfile- image/base/Dockerfile-RUN gem install bundler --force &&\ image/base/Dockerfile- rm -rf /usr/local/share/ri/2.6.5/system &&\ image/base/Dockerfile: cd / && git clone https://github.com/discourse/pups.git image/base/Dockerfile- image/base/Dockerfile-ADD install-redis /tmp/install-redis image/base/Dockerfile-RUN /tmp/install-redis image/base/Dockerfile- [root@osestaging1 discourse]#
- honestly, this shouldn't even be necessary since I just rebuilt the image above. But, a catch (!): docker caches image layers and skips a build step when its Dockerfile line hasn't changed. That's an issue if one of the steps does a git pull, since the `git pull` command itself won't change, but the actual source code at the endpoint on github.com did change!
- it looks like there's an option `--no-cache` for this https://stackoverflow.com/questions/35594987/how-to-force-docker-for-a-clean-build-of-an-image
- Unfortunately, the base image build itself now fails, since the containers have no internet access during the build
[root@osestaging1 base]# docker build --tag 'discourse_ose' /var/discourse/image/base/ ... Step 6/61 : RUN apt update && apt install -y gnupg sudo curl ---> Running in 56360e585353 WARNING: apt does not have a stable CLI interface. Use with caution in scripts. Err:1 http://security.debian.org/debian-security buster/updates InRelease Temporary failure resolving 'security.debian.org' Err:2 http://deb.debian.org/debian buster InRelease Temporary failure resolving 'deb.debian.org' Err:3 http://deb.debian.org/debian buster-updates InRelease Temporary failure resolving 'deb.debian.org' Reading package lists... Building dependency tree... Reading state information... All packages are up to date. W: Failed to fetch http://deb.debian.org/debian/dists/buster/InRelease Temporary failure resolving 'deb.debian.org' W: Failed to fetch http://security.debian.org/debian-security/dists/buster/updates/InRelease Temporary failure resolving 'security.debian.org' W: Failed to fetch http://deb.debian.org/debian/dists/buster-updates/InRelease Temporary failure resolving 'deb.debian.org' W: Some index files failed to download. They have been ignored, or old ones used instead. WARNING: apt does not have a stable CLI interface. Use with caution in scripts. Reading package lists... Building dependency tree... Reading state information... Package gnupg is not available, but is referred to by another package. This may mean that the package is missing, has been obsoleted, or is only available from another source E: Package 'gnupg' has no installation candidate E: Unable to locate package sudo E: Unable to locate package curl The command '/bin/sh -c apt update && apt install -y gnupg sudo curl' returned a non-zero code: 100 [root@osestaging1 base]#
- ugh, so we want the container to have internet access at build time, but not at run time.
- probably what we'll want to do long-term is do this process on staging only; then the procedure for updating prod will be to copy the docker image from staging to production and just run the final `launcher rebuild discourse_ose` command on production. this will need to be tested!
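- a sketch of that copy step (untested; 'oseprod' is a placeholder hostname):
# on the staging host: stream the freshly-built image to prod over ssh
docker save local_discourse/discourse_ose | gzip | ssh oseprod 'gunzip | docker load'
# then, on prod:
cd /var/discourse && ./launcher rebuild discourse_ose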
- I checked the config, and found that we have 3 networks that docker created https://docs.docker.com/network/
[root@osestaging1 discourse]# docker network ls NETWORK ID NAME DRIVER SCOPE 70656b6b4a55 bridge bridge local 330da33d9bfc host host local ae36b68b1cf6 none null local [root@osestaging1 discourse]#
- I got it to work by setting the docker network for the container at build time to "host"
[root@osestaging1 discourse]# docker build --no-cache --network=host --tag 'discourse_ose' /var/discourse/image/base/ ... Post-install message from discourse_image_optim: Rails image assets optimization is extracted into image_optim_rails gem You can safely remove `config.assets.image_optim = false` if you are not going to use that gem Downloading MaxMindDb's GeoLite2-City... Downloading MaxMindDb's GeoLite2-ASN... Removing intermediate container 4ac7f6e70f9b ---> 940c0024cbd7 Successfully built 940c0024cbd7 Successfully tagged discourse_ose:latest [root@osestaging1 base]#
- unfortunately the next rebuild step still failed!
[root@osestaging1 base]# popd /var/discourse [root@osestaging1 discourse]# ${vhostDir}/launcher rebuild discourse_ose Ensuring launcher is up to date Fetching origin Launcher is up-to-date Stopping old container + /bin/docker stop -t 10 discourse_ose discourse_ose cd /pups && git pull && /pups/bin/pups --stdin fatal: unable to access 'https://github.com/discourse/pups.git/': Could not resolve host: github.com 44d1a10e6e41e9ea314e911638ce8282c1ef43cf5d18079a4012cbce9e6445c9 FAILED TO BOOTSTRAP please scroll up and look for earlier error messages, there may be more than one. ./discourse-doctor may help diagnose the problem. [root@osestaging1 discourse]#
- tbqh, this pups update step is entirely unnecessary since we just baked those changes into the damn image. how do we strip it out? I couldn't find anything on the container itself (like a cron job or init script) that triggers this attempt to update pups
[root@osestaging1 discourse]# docker start discourse_ose discourse_ose [root@osestaging1 discourse]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES d201a1a4c671 bf23c0a7cb70 "/sbin/boot" 2 hours ago Up About a minute discourse_ose [root@osestaging1 discourse]# docker exec -it d201a1a4c671 /bin/bash root@osestaging1-discourse-ose:/#
- ah, it looks like it's being done by the launcher script itself
[root@osestaging1 discourse]# grep pups launcher update_pups=`cat $config_file | $docker_path run $user_args --rm -i -a stdin -a stdout $image ruby -e \ "require 'yaml'; puts YAML.load(STDIN.readlines.join)['update_pups']"` run_command="cd /pups &&" if ! "false" = $update_pups ; then run_command="$run_command /pups/bin/pups --stdin" [root@osestaging1 discourse]#
- commenting-out the git pull line worked
[root@osestaging1 discourse]# grep -C1 'git pull' launcher if ! "false" = $update_pups ; then #run_command="$run_command git pull &&" run_command="echo 'skipping pups git pull'" fi -- echo "Updating Launcher" git pull || (echo 'failed to update' && exit 1) [root@osestaging1 discourse]#
- that worked!
[root@osestaging1 discourse]# time ./launcher rebuild discourse_ose Ensuring launcher is up to date Fetching origin Launcher is up-to-date Stopping old container + /bin/docker stop -t 10 discourse_ose discourse_ose echo 'skipping pups git pull' /pups/bin/pups --stdin skipping pups git pull /pups/bin/pups --stdin sha256:3f97048def9ae8c80c689f8278db1f302cc6dedf611dbaa6a42d2ef600cf0407 23dd166f10cf6ed8f727ca6e8737bc7136bf33d0369c7acbd96d2ffff8fc555e Removing old container + /bin/docker rm discourse_ose discourse_ose + /bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=2 -e UNICORN_SIDEKIQS=1 -e RUBY_GLOBAL_METHOD_CACHE_SIZE=131072 -e RUBY_GC_HEAP_GROWTH_MAX_SLOTS=40000 -e RUBY_GC_HEAP_INIT_SLOTS=400000 -e RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.5 -e DISCOURSE_DB_SOCKET=/var/run/postgresql -e DISCOURSE_DB_HOST= -e DISCOURSE_DB_PORT= -e DISCOURSE_HOSTNAME=discourse.opensourceecology.org -e DISCOURSE_DEVELOPER_EMAILS=osediscourse_2019@michaelaltfield.net -e DISCOURSE_SMTP_ADDRESS=172.17.0.1 -e DISCOURSE_SMTP_PORT=25 -e DISCOURSE_SMTP_AUTHENTICATION=none -e DISCOURSE_SMTP_OPENSSL_VERIFY_MODE=none -e DISCOURSE_SMTP_ENABLE_START_TLS=false -h osestaging1-discourse-ose -e DOCKER_HOST_IP=172.17.0.1 --name discourse_ose -t -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:fc:97:b8:b4:0d local_discourse/discourse_ose /sbin/boot b7ac0e6856f621b632059dd8caec15582bf1a4a4092924aee2baa5c569d069cc real 1m22.172s user 0m2.603s sys 0m2.188s [root@osestaging1 discourse]#
- but, damn, the site didn't come up. Further digging shows that nginx isn't started. Not only that, but there's now no runit service for nginx!
root@osestaging1-discourse-ose:/# ps -ef | grep -i nginx root 73 35 0 14:02 pts/1 00:00:00 grep -i nginx root@osestaging1-discourse-ose:/# nginx -t nginx: the configuration file /etc/nginx/nginx.conf syntax is ok nginx: configuration file /etc/nginx/nginx.conf test is successful root@osestaging1-discourse-ose:/# sv start nginx fail: nginx: unable to change to service directory: file does not exist root@osestaging1-discourse-ose:/# ls -lah /etc/runit/* -rwxr-xr-x. 1 root root 81 Oct 28 12:07 /etc/runit/1 -rwxr-xr-x. 1 root root 51 Oct 28 12:07 /etc/runit/2 -rwxr-xr-x. 1 root root 53 Oct 28 12:07 /etc/runit/3 /etc/runit/1.d: total 24K drwxr-xr-x. 1 root root 4.0K Nov 25 13:22 . drwxr-xr-x. 1 root root 4.0K Nov 25 13:22 .. -rwxr-xr-x. 1 root root 321 Oct 28 12:07 00-fix-var-logs -rwxr-xr-x. 1 root root 33 Oct 28 12:07 anacron -rwxr-xr-x. 1 root root 75 Oct 28 12:07 cleanup-pids /etc/runit/3.d: total 16K drwxr-xr-x. 2 root root 4.0K Nov 25 13:22 . drwxr-xr-x. 1 root root 4.0K Nov 25 13:22 .. /etc/runit/runsvdir: total 24K drwxr-xr-x. 1 root root 4.0K Nov 25 12:49 . drwxr-xr-x. 1 root root 4.0K Nov 25 13:22 .. drwxr-xr-x. 1 root root 4.0K Nov 25 13:22 default root@osestaging1-discourse-ose:/# ls -lah /etc/runit/runsvdir/default/ total 32K drwxr-xr-x. 1 root root 4.0K Nov 25 13:22 . drwxr-xr-x. 1 root root 4.0K Nov 25 12:49 .. drwxr-xr-x. 1 root root 4.0K Nov 25 13:56 cron drwxr-xr-x. 1 root root 4.0K Nov 25 13:56 rsyslog root@osestaging1-discourse-ose:/#
- so my guess is that, for whatever reason, the templates defined in the discourse_ose.yaml file didn't get called
- I guess my solution for changing the run_command to an echo broke something later on; I found that it actually continued with the bootstrap if I instead just set update_pups to false before the if statement
run_command="cd /pups &&" update_pups="false" if ! "false" = $update_pups ; then run_command="$run_command git pull &&" fi run_command="$run_command /pups/bin/pups --stdin"
- but now there's a step in the templates/web.template.yml file that's failing; it's trying to do a `git pull`. again, we just did all that when we built the image, and we don't want the container to have internet access.
- I'm beginning to rethink our solution of preventing the container from having internet access. While it's not a VM, it does have its own set of packages. Therefore, if the debian base that this Discourse docker container is built on has an outdated package, it wouldn't be auto-updated (the Discourse docs make a point to enable debian's unattended-upgrades package so that the container [or maybe they meant the host?] gets security updates). Disabling the entire container's internet access was just a hack to prevent docker from fucking up our iptables rules. Usually what I do is block the web server's user at the firewall level. I should probably just do this in the container as well *sigh*
- ...
- ok, now let's see if I can get the dockerized Discourse behind varnish
- I skipped this while waiting for feedback from the Discourse community on the topic I posted about putting Discourse behind a cache like varnish, but I since discovered that the Discourse developer community is pretty toxic and essentially just wants to tell me "don't do that" and "pay me." So now we proceed on our own https://meta.discourse.org/t/discourse-purge-cache-method-on-content-changes/132917
- The only person who appears to have done this is Lee at Ars Technica. They finally responded saying that they only cached static assets https://meta.discourse.org/t/discourse-purge-cache-method-on-content-changes/132917/20
- I think I'll try to push it further: let's cache everything (for non-logged-in users, of course) for 1 minute (which is the default for Discourse's built-in ANON_CACHE_DURATION) and see what happens...
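- a quick way to verify that from the client side would be to watch the response headers as a logged-out client; a sketch (the header names are the ones varnish typically sets; our setup may differ):
# two anonymous requests in a row; a rising Age header (or a hit marker in
# X-Cache/X-Varnish, if we set one) on the second request indicates a cache hit
curl -sI https://discourse.opensourceecology.org/ | grep -iE '^(age|x-cache|x-varnish)'
curl -sI https://discourse.opensourceecology.org/ | grep -iE '^(age|x-cache|x-varnish)'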
- well damn, first off, it appears that varnish doesn't support connecting to backends over unix domain sockets until VCL 4.1, which was released in Varnish 6.0. We're only on Varnish 4.0.5-1, which is the latest in the CentOS repos https://varnish-cache.org/docs/6.0/whats-new/changes-6.0.html
[root@osestaging1 sites-enabled]# rpm -qa | grep -i varnish varnish-libs-4.0.5-1.el7.x86_64 varnish-libs-devel-4.0.5-1.el7.x86_64 varnish-4.0.5-1.el7.x86_64 [root@osestaging1 sites-enabled]# yum install varnish Loaded plugins: fastestmirror, replace Loading mirror speeds from cached hostfile * base: mirror.checkdomain.de * epel: mirror.23media.com * extras: mirror.checkdomain.de * updates: mirror.fra10.de.leaseweb.net * webtatic: uk.repo.webtatic.com Package varnish-4.0.5-1.el7.x86_64 already installed and latest version Nothing to do [root@osestaging1 sites-enabled]#
- so it looks like we'll have to have nginx listen on 127.0.0.1:8020. Note that our prod server currently has apache listening on port 8000 for name-based vhosts (defaulting to www.opensourceecology.org), and on 127.0.0.1:8010 for certbot validation of our private vhosts (munin, awstats, etc), which are exposed over port 443 while admins actually access them on port 4443. We'll just follow that standard and have discourse listen on 127.0.0.1:8020 and expose that to the docker host, hopefully accessible only from 127.0.0.1 on the docker host
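- at the docker level that just means publishing the container's web port to the host's loopback only; a minimal sketch (untested; the container-side port 80 is an assumption until I check what nginx listens on inside the container):
# bind host 127.0.0.1:8020 to the container's port 80; nothing is exposed
# on the public interfaces since the bind is loopback-only
docker run -d --name discourse_port_test -p 127.0.0.1:8020:80 \
  local_discourse/discourse_ose /sbin/boot
curl -sI 127.0.0.1:8020    # reachable from the docker host only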
- first I re-enabled docker to set iptables rules, reloaded the systemd config, and restarted the docker service
[root@osestaging1 templates]# systemctl stop docker [root@osestaging1 templates]# grep ExecStart /usr/lib/systemd/system/docker.service ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock #ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --iptables=false [root@osestaging1 templates]# [root@osestaging1 templates]# systemctl daemon-reload [root@osestaging1 templates]# systemctl start docker [root@osestaging1 templates]#
- I did this manually first on the container; I'll roll this into a template file soon. note that the base debian image for the docker container is so basic that it didn't come with ufw; no need to remove it before installing iptables
root@osestaging1-discourse-ose:/# apt-get install iptables
- oddly, I couldn't list the iptables rules as root. It said I wasn't root. hmm
root@osestaging1-discourse-ose:/# iptables -L # Warning: iptables-legacy tables present, use iptables-legacy to see them iptables: Permission denied (you must be root). root@osestaging1-discourse-ose:/# sudo iptables -L # Warning: iptables-legacy tables present, use iptables-legacy to see them iptables: Permission denied (you must be root). root@osestaging1-discourse-ose:/#
- ok, so it looks like I need to add the NET_ADMIN capability to the docker container https://stackoverflow.com/questions/41706983/installing-iptables-in-docker-container-based-on-alpinelinux
- that link doesn't make clear exactly how the hell to set the container to use that capability (what command do you add --cap-add=NET_ADMIN *to*?)
- this solution says you can edit the json file directly; fuck that https://stackoverflow.com/questions/38758627/how-can-we-add-capabilities-to-a-running-docker-container
- I found that I can pass the capability directly to `docker run`
[root@osestaging1 discourse]# docker run --cap-add=NET_ADMIN discourse_ose
- honestly, I don't think that edits containers launched from the image; I think it's only relevant to one-time commands
[root@osestaging1 discourse]# ./launcher stop discourse_ose + /bin/docker stop -t 10 discourse_ose discourse_ose [root@osestaging1 discourse]# docker run --cap-add=NET_ADMIN discourse_ose [root@osestaging1 discourse]# ./launcher start discourse_ose starting up existing container + /bin/docker start discourse_ose discourse_ose [root@osestaging1 discourse]# ./l launcher launcher.20191118_122249 launcher.20191118.orig launcher.new libbrotli/ [root@osestaging1 discourse]# ./launcher enter discourse_ose root@osestaging1-discourse-ose:/# iptables -L # Warning: iptables-legacy tables present, use iptables-legacy to see them iptables: Permission denied (you must be root). root@osestaging1-discourse-ose:/#
- indeed, the CapAdd in the json is still null
[root@osestaging1 discourse]# docker inspect discourse_ose | grep CapAdd "CapAdd": null, [root@osestaging1 discourse]#
- yeah, this did work
[root@osestaging1 discourse]# docker run --cap-add=NET_ADMIN discourse_ose /usr/bin/apt-get install -y iptables && iptables -L Reading package lists... Building dependency tree... Reading state information... The following additional packages will be installed: libip6tc0 libiptc0 libmnl0 libnetfilter-conntrack3 libnfnetlink0 libnftables0 libnftnl11 libxtables12 nftables Suggested packages: kmod The following NEW packages will be installed: iptables libip6tc0 libiptc0 libmnl0 libnetfilter-conntrack3 libnfnetlink0 libnftables0 libnftnl11 libxtables12 nftables 0 upgraded, 10 newly installed, 0 to remove and 0 not upgraded. Chain INPUT (policy ACCEPT) target prot opt source destination DROP all -- static.234.144.9.5.clients.your-server.de anywhere DROP all -- 173.234.159.250 anywhere ACCEPT all -- anywhere anywhere ACCEPT all -- anywhere anywhere state RELATED,ESTABLISHED ACCEPT icmp -- anywhere anywhere ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:http ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:https ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:pharos ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:krb524 ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:32415 LOG all -- anywhere anywhere limit: avg 5/min burst 5 LOG level debug prefix "iptables IN denied: " DROP all -- anywhere anywhere Chain FORWARD (policy ACCEPT) target prot opt source destination DOCKER-USER all -- anywhere anywhere DOCKER-ISOLATION-STAGE-1 all -- anywhere anywhere ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED DOCKER all -- anywhere anywhere ACCEPT all -- anywhere anywhere ACCEPT all -- anywhere anywhere DROP all -- static.234.144.9.5.clients.your-server.de anywhere DROP all -- 173.234.159.250 anywhere Chain OUTPUT (policy ACCEPT) target prot opt source destination ACCEPT all -- anywhere anywhere state RELATED,ESTABLISHED ACCEPT all -- localhost.localdomain localhost.localdomain ACCEPT udp -- anywhere ns1-coloc.hetzner.de udp dpt:domain ACCEPT udp -- anywhere ns2-coloc.hetzner.net udp dpt:domain ACCEPT udp -- anywhere ns3-coloc.hetzner.com udp dpt:domain LOG all -- anywhere anywhere limit: avg 5/min burst 5 LOG level debug prefix "iptables OUT denied: " DROP tcp -- anywhere anywhere owner UID match apache DROP tcp -- anywhere anywhere owner UID match mysql DROP tcp -- anywhere anywhere owner UID match varnish DROP tcp -- anywhere anywhere owner UID match hitch DROP tcp -- anywhere anywhere owner UID match nginx Chain DOCKER (1 references) target prot opt source destination Chain DOCKER-ISOLATION-STAGE-1 (1 references) target prot opt source destination DOCKER-ISOLATION-STAGE-2 all -- anywhere anywhere RETURN all -- anywhere anywhere Chain DOCKER-ISOLATION-STAGE-2 (1 references) target prot opt source destination DROP all -- anywhere anywhere RETURN all -- anywhere anywhere Chain DOCKER-USER (1 references) target prot opt source destination RETURN all -- anywhere anywhere [root@osestaging1 discourse]#
- actually, no, that ran the iptables command on my docker host; the "&&" isn't being passed inside the `docker run` command (my host's shell splits the line on it)
- I had issues stringing commands together with the `docker run` command (putting quotes around them didn't work), and found it much easier to just give myself a shell. This worked! (though see the `bash -c` sketch after this output)
[root@osestaging1 discourse]# docker run --cap-add=NET_ADMIN -it discourse_ose /bin/bash root@ef4f90be07e6:/# apt-get install -y iptables Reading package lists... Done Building dependency tree Reading state information... Done The following additional packages will be installed: libip6tc0 libiptc0 libmnl0 libnetfilter-conntrack3 libnfnetlink0 libnftables0 libnftnl11 libxtables12 nftables Suggested packages: kmod The following NEW packages will be installed: iptables libip6tc0 libiptc0 libmnl0 libnetfilter-conntrack3 libnfnetlink0 libnftables0 libnftnl11 libxtables12 nftables 0 upgraded, 10 newly installed, 0 to remove and 0 not upgraded. Need to get 982 kB of archives. ... root@ef4f90be07e6:/# iptables -L Chain INPUT (policy ACCEPT) target prot opt source destination Chain FORWARD (policy ACCEPT) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination # Warning: iptables-legacy tables present, use iptables-legacy to see them root@ef4f90be07e6:/#
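- for the record, the chaining problem can also be solved by handing the whole string to a shell inside the container; a sketch:
# quote the compound command so bash *inside* the container parses the "&&",
# instead of the host's shell splitting the line
docker run --cap-add=NET_ADMIN -it discourse_ose \
  /bin/bash -c 'apt-get install -y iptables && iptables -L'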
- I tried again without the capability added, and it failed as expected (note that iptables isn't installed at the start of the next session; it must be a fresh container)
[root@osestaging1 discourse]# docker run -it discourse_ose /bin/bash root@4a3b6e123460:/# iptables -L bash: iptables: command not found root@4a3b6e123460:/# apt-get install -y iptables ... root@4a3b6e123460:/# iptables -L # Warning: iptables-legacy tables present, use iptables-legacy to see them iptables: Permission denied (you must be root). root@4a3b6e123460:/#
- ok, but I can connect to the *existing* running container with `docker exec` (as opposed to `docker run`, which just starts a new container)
[root@osestaging1 discourse]# docker exec -it discourse_ose /bin/bash root@osestaging1-discourse-ose:/# iptables -L # Warning: iptables-legacy tables present, use iptables-legacy to see them iptables: Permission denied (you must be root). root@osestaging1-discourse-ose:/#
- unfortunately there's no '--cap-add' flag for `docker exec`
[root@osestaging1 discourse]# docker exec -it --cap-add=NET_ADMIN discourse_ose /bin/bash unknown flag: --cap-add See 'docker exec --help'. [root@osestaging1 discourse]#
- and it's the same for the `docker start` command that ran it initially
[root@osestaging1 discourse]# docker start --cap-add=NET_ADMIN discourse_ose unknown flag: --cap-add See 'docker start --help'. [root@osestaging1 discourse]#
- fucking hell, I really think I just have to edit this damn json file directly per https://stackoverflow.com/questions/38758627/how-can-we-add-capabilities-to-a-running-docker-container
- first we get the id of the container
[root@osestaging1 discourse]# id=`docker inspect --format='{{.Id}}' discourse_ose` [root@osestaging1 discourse]# echo $id d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e [root@osestaging1 discourse]#
- and now we use `sed` to change it
[root@osestaging1 discourse]# cd /var/lib/docker/containers/d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e[root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# cp hostconfig.json hostconfig.20191125.bak.orig [root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# #sed -i 's/"CapAdd":null/"CapAdd":"NET_ADMIN"/' /var/lib/docker/containers/$id/hostconfig.json [root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# cat hostconfig.json {"Binds":["/var/discourse/shared/standalone:/shared","/var/discourse/shared/standalone/log/var-log:/var/log"],"ContainerIDFile":"","LogConfig":{"Type":"json-file","Config":{}},"NetworkMode":"default","PortBindings":{},"RestartPolicy":{"Name":"always","MaximumRetryCount":0},"AutoRemove":false,"VolumeDriver":"","VolumesFrom":null,"CapAdd":null,"CapDrop":null,"Capabilities":null,"Dns":[],"DnsOptions":[],"DnsSearch":[],"ExtraHosts":null,"GroupAdd":null,"IpcMode":"private","Cgroup":"","Links":null,"OomScoreAdj":0,"PidMode":"","Privileged":false,"PublishAllPorts":false,"ReadonlyRootfs":false,"SecurityOpt":null,"UTSMode":"","UsernsMode":"","ShmSize":536870912,"Runtime":"runc","ConsoleSize":[0,0],"Isolation":"","CpuShares":0,"Memory":0,"NanoCpus":0,"CgroupParent":"","BlkioWeight":0,"BlkioWeightDevice":[],"BlkioDeviceReadBps":null,"BlkioDeviceWriteBps":null,"BlkioDeviceReadIOps":null,"BlkioDeviceWriteIOps":null,"CpuPeriod":0,"CpuQuota":0,"CpuRealtimePeriod":0,"CpuRealtimeRuntime":0,"CpusetCpus":"","CpusetMems":"","Devices":[],"DeviceCgroupRules":null,"DeviceRequests":null,"KernelMemory":0,"KernelMemoryTCP":0,"MemoryReservation":0,"MemorySwap":0,"MemorySwappiness":null,"OomKillDisable":false,"PidsLimit":null,"Ulimits":null,"CpuCount":0,"CpuPercent":0,"IOMaximumIOps":0,"IOMaximumBandwidth":0,"MaskedPaths":["/proc/asound","/proc/acpi","/proc/kcore","/proc/keys","/proc/latency_stats","/proc/timer_list","/proc/timer_stats","/proc/sched_debug","/proc/scsi","/sys/firmware"],"ReadonlyPaths":["/proc/bus","/proc/fs","/proc/irq","/proc/sys","/proc/sysrq-trigger"]} [root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# sed -i 's/"CapAdd":null/"CapAdd":"NET_ADMIN"/' /var/lib/docker/containers/$id/hostconfig.json [root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# cat hostconfig.json 
{"Binds":["/var/discourse/shared/standalone:/shared","/var/discourse/shared/standalone/log/var-log:/var/log"],"ContainerIDFile":"","LogConfig":{"Type":"json-file","Config":{}},"NetworkMode":"default","PortBindings":{},"RestartPolicy":{"Name":"always","MaximumRetryCount":0},"AutoRemove":false,"VolumeDriver":"","VolumesFrom":null,"CapAdd":"NET_ADMIN","CapDrop":null,"Capabilities":null,"Dns":[],"DnsOptions":[],"DnsSearch":[],"ExtraHosts":null,"GroupAdd":null,"IpcMode":"private","Cgroup":"","Links":null,"OomScoreAdj":0,"PidMode":"","Privileged":false,"PublishAllPorts":false,"ReadonlyRootfs":false,"SecurityOpt":null,"UTSMode":"","UsernsMode":"","ShmSize":536870912,"Runtime":"runc","ConsoleSize":[0,0],"Isolation":"","CpuShares":0,"Memory":0,"NanoCpus":0,"CgroupParent":"","BlkioWeight":0,"BlkioWeightDevice":[],"BlkioDeviceReadBps":null,"BlkioDeviceWriteBps":null,"BlkioDeviceReadIOps":null,"BlkioDeviceWriteIOps":null,"CpuPeriod":0,"CpuQuota":0,"CpuRealtimePeriod":0,"CpuRealtimeRuntime":0,"CpusetCpus":"","CpusetMems":"","Devices":[],"DeviceCgroupRules":null,"DeviceRequests":null,"KernelMemory":0,"KernelMemoryTCP":0,"MemoryReservation":0,"MemorySwap":0,"MemorySwappiness":null,"OomKillDisable":false,"PidsLimit":null,"Ulimits":null,"CpuCount":0,"CpuPercent":0,"IOMaximumIOps":0,"IOMaximumBandwidth":0,"MaskedPaths":["/proc/asound","/proc/acpi","/proc/kcore","/proc/keys","/proc/latency_stats","/proc/timer_list","/proc/timer_stats","/proc/sched_debug","/proc/scsi","/sys/firmware"],"ReadonlyPaths":["/proc/bus","/proc/fs","/proc/irq","/proc/sys","/proc/sysrq-trigger"]} [root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]#
- now we connect to it, but there's still no NET_ADMIN capability :(
[root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# docker start discourse_ose discourse_ose [root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# docker exec -it discourse_ose /bin/bash root@osestaging1-discourse-ose:/# iptables -L
# Warning: iptables-legacy tables present, use iptables-legacy to see them
iptables: Permission denied (you must be root). root@osestaging1-discourse-ose:/#
- and, ffs, it got rid of my changes to hostconfig.json
[root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# docker exec -it discourse_ose /bin/bash root@osestaging1-discourse-ose:/# iptables -L # Warning: iptables-legacy tables present, use iptables-legacy to see them iptables: Permission denied (you must be root). root@osestaging1-discourse-ose:/#
- I had not put the capability in an array; I tried that, but it still failed
[root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# docker stop discourse_ose discourse_ose [root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# sed -i 's/"CapAdd":null/"CapAdd":["NET_ADMIN"]/' /var/lib/docker/containers/$id/hostconfig.json [root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# cat hostconfig.json {"Binds":["/var/discourse/shared/standalone:/shared","/var/discourse/shared/standalone/log/var-log:/var/log"],"ContainerIDFile":"","LogConfig":{"Type":"json-file","Config":{}},"NetworkMode":"default","PortBindings":{},"RestartPolicy":{"Name":"always","MaximumRetryCount":0},"AutoRemove":false,"VolumeDriver":"","VolumesFrom":null,"CapAdd":["NET_ADMIN"],"CapDrop":null,"Capabilities":null,"Dns":[],"DnsOptions":[],"DnsSearch":[],"ExtraHosts":null,"GroupAdd":null,"IpcMode":"private","Cgroup":"","Links":null,"OomScoreAdj":0,"PidMode":"","Privileged":false,"PublishAllPorts":false,"ReadonlyRootfs":false,"SecurityOpt":null,"UTSMode":"","UsernsMode":"","ShmSize":536870912,"Runtime":"runc","ConsoleSize":[0,0],"Isolation":"","CpuShares":0,"Memory":0,"NanoCpus":0,"CgroupParent":"","BlkioWeight":0,"BlkioWeightDevice":[],"BlkioDeviceReadBps":null,"BlkioDeviceWriteBps":null,"BlkioDeviceReadIOps":null,"BlkioDeviceWriteIOps":null,"CpuPeriod":0,"CpuQuota":0,"CpuRealtimePeriod":0,"CpuRealtimeRuntime":0,"CpusetCpus":"","CpusetMems":"","Devices":[],"DeviceCgroupRules":null,"DeviceRequests":null,"KernelMemory":0,"KernelMemoryTCP":0,"MemoryReservation":0,"MemorySwap":0,"MemorySwappiness":null,"OomKillDisable":false,"PidsLimit":null,"Ulimits":null,"CpuCount":0,"CpuPercent":0,"IOMaximumIOps":0,"IOMaximumBandwidth":0,"MaskedPaths":["/proc/asound","/proc/acpi","/proc/kcore","/proc/keys","/proc/latency_stats","/proc/timer_list","/proc/timer_stats","/proc/sched_debug","/proc/scsi","/sys/firmware"],"ReadonlyPaths":["/proc/bus","/proc/fs","/proc/irq","/proc/sys","/proc/sysrq-trigger"]} [root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# docker start discourse_ose discourse_ose [root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# docker exec -it discourse_ose /bin/bash root@osestaging1-discourse-ose:/# iptables -L # Warning: iptables-legacy tables present, use iptables-legacy to see them iptables: Permission denied (you must be root). root@osestaging1-discourse-ose:/#
- ah, got it! I had to restart the docker service
[root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# docker stop discourse_ose discourse_ose [root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# sed -i 's/"CapAdd":null/"CapAdd":["NET_ADMIN"]/' /var/lib/docker/containers/$id/hostconfig.json [root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# service docker restart Redirecting to /bin/systemctl restart docker.service [root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# docker start discourse_ose discourse_ose [root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# docker exec -it discourse_ose /bin/bash root@osestaging1-discourse-ose:/# iptables -L Chain INPUT (policy ACCEPT) target prot opt source destination Chain FORWARD (policy ACCEPT) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination # Warning: iptables-legacy tables present, use iptables-legacy to see them root@osestaging1-discourse-ose:/#
- note that a hard restart *is* necessary; I confirmed that a reload won't work
[root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# docker stop discourse_ose discourse_ose [root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# sed -i 's/"CapAdd":null/"CapAdd":["NET_ADMIN"]/' /var/lib/docker/containers/$id/hostconfig.json [root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# service docker reload Redirecting to /bin/systemctl reload docker.service [root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# docker start discourse_ose discourse_ose [root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]# docker exec -it discourse_ose /bin/bash root@osestaging1-discourse-ose:/# iptables -L # Warning: iptables-legacy tables present, use iptables-legacy to see them iptables: Permission denied (you must be root). root@osestaging1-discourse-ose:/# exit [root@osestaging1 d5b6a3af666e868e3d574532fc81792c41ee7f13fbd547a82f367845009c490e]#
- cool, now I can actually iterate! Let's try rules that permit root and the _apt user and block all other users from accessing the internet
root@osestaging1-discourse-ose:/# curl 1.1.1.1 <html> <head><title>301 Moved Permanently</title></head> <body bgcolor="white"> <center><h1>301 Moved Permanently</h1></center> <hr><center>cloudflare-lb</center> </body> </html> root@osestaging1-discourse-ose:/# iptables -A OUTPUT -p tcp -m owner --uid-owner 0 -j ACCEPT root@osestaging1-discourse-ose:/# iptables -A OUTPUT -j DROP root@osestaging1-discourse-ose:/# curl 1.1.1.1 <html> <head><title>301 Moved Permanently</title></head> <body bgcolor="white"> <center><h1>301 Moved Permanently</h1></center> <hr><center>cloudflare-lb</center> </body> </html> root@osestaging1-discourse-ose:/#
- it works!
root@osestaging1-discourse-ose:/# su - discourse discourse@osestaging1-discourse-ose:~$ curl 1.1.1.1 <html> <head><title>301 Moved Permanently</title></head> <body bgcolor="white"> <center><h1>301 Moved Permanently</h1></center> <hr><center>cloudflare-lb</center> </body> </html> discourse@osestaging1-discourse-ose:~$ logout root@osestaging1-discourse-ose:/# grep -ir apt /etc/passwd _apt:x:100:65534::/nonexistent:/usr/sbin/nologin root@osestaging1-discourse-ose:/# iptables -A OUTPUT -p tcp -m owner --uid-owner 0 -j ACCEPT root@osestaging1-discourse-ose:/# iptables -A OUTPUT -p tcp -m owner --uid-owner 100 -j ACCEPT root@osestaging1-discourse-ose:/# iptables -A OUTPUT -j DROP root@osestaging1-discourse-ose:/# su - discourse discourse@osestaging1-discourse-ose:~$ curl 1.1.1.1
- I then tried to persist this by installing iptables-persistent, but it failed. I whitelisted _apt, so why did it fail?
root@osestaging1-discourse-ose:/# apt-get install -y iptables-persistent Reading package lists... Done Building dependency tree Reading state information... Done The following additional packages will be installed: netfilter-persistent The following NEW packages will be installed: iptables-persistent netfilter-persistent 0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded. Need to get 21.8 kB of archives. After this operation, 80.9 kB of additional disk space will be used. Err:1 http://deb.debian.org/debian buster/main amd64 netfilter-persistent all 1.0.11 Temporary failure resolving 'deb.debian.org' Err:2 http://deb.debian.org/debian buster/main amd64 iptables-persistent all 1.0.11 Temporary failure resolving 'deb.debian.org' E: Failed to fetch http://deb.debian.org/debian/pool/main/i/iptables-persistent/netfilter-persistent_1.0.11_all.deb Temporary failure resolving 'deb.debian.org' E: Failed to fetch http://deb.debian.org/debian/pool/main/i/iptables-persistent/iptables-persistent_1.0.11_all.deb Temporary failure resolving 'deb.debian.org' E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing? root@osestaging1-discourse-ose:/#
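- in hindsight, I suspect this failed because my ACCEPT rules only match TCP ('-p tcp'), so DNS lookups (UDP port 53) hit the catch-all DROP rule, hence "Temporary failure resolving". A hedged, untested sketch of a fix (inserted above the DROP rule with -I):
iptables -I OUTPUT -p udp --dport 53 -m owner --uid-owner 0 -j ACCEPT
iptables -I OUTPUT -p udp --dport 53 -m owner --uid-owner 100 -j ACCEPT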
Mon Nov 18, 2019
- I'm still trying to trace the ./launcher process to the point where it downloads the discourse_docker repo, so that I can modify the image/base/install-nginx script during a `./launcher rebuild discourse_ose` https://github.com/discourse/discourse_docker
- ./launcher has a function run_bootstrap() that runs `$docker_path pull $image` https://github.com/discourse/discourse_docker/blob/87fd7172af8f2848d5118fdebada646c5996821b/launcher#L660
+ /bin/docker pull discourse/base:2.0.20191013-2320 2.0.20191013-2320: Pulling from discourse/base Digest: sha256:77e010342aa5111c8c3b81d80de7d4bdb229793d595bbe373992cdb8f86ef41f Status: Image is up to date for discourse/base:2.0.20191013-2320 docker.io/discourse/base:2.0.20191013-2320
- so this appears to be the part responsible for pulling the discourse/base image from docker hub https://hub.docker.com/r/discourse/base/
- another important script appears to be auto_build.rb, which actually executes the `docker build` command to create docker images. Unfortunately, I don't think this script is called by any other scripts; it appears to be just a helper script for human developers when doing some testing https://github.com/discourse/discourse_docker/blob/master/image/auto_build.rb
- I'm beginning to think that `docker pull` doesn't actually use the Dockerfile to build the image at download. Or, perhaps more importantly, the Discourse scripts download a fresh copy of the discourse_docker repo from github when doing a rebuild such that I can't make modifications to the relevant Dockerfile (ie: updating install-nginx) to affect the image anyway.
- however, there *is* an argument passed to `launcher` named `--run-image`. So, perhaps, I can write a wrapper for the Discourse `launcher` script that first builds my own docker image, and then just tell `./launcher` to use my local image rather than the image from docker hub https://github.com/discourse/discourse_docker/blob/87fd7172af8f2848d5118fdebada646c5996821b/launcher#L22
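- concretely, the idea would be something like this (a sketch, assuming the Dockerfile lives in the discourse_docker repo's image/base dir under /var/discourse):
docker build --tag discourse_maltfield /var/discourse/image/base
/var/discourse/launcher rebuild discourse_ose --run-image discourse_maltfield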
- I did a test trying to build my own local docker image. To make *sure* that it used my local files, I updated the Dockerfile's FROM to use alpine linux. A good sign: it pulled alpine and then failed at the `apt` step (alpine uses apk, not apt)
[root@osestaging1 base]# docker build --tag 'alpine_discourse' . Sending build context to Docker daemon 38.4kB Step 1/61 : FROM alpine latest: Pulling from library/alpine 89d9c30c1d48: Pull complete Digest: sha256:c19173c5ada610a5989151111163d28a67368362762534d8a8121ce95cf2bd5a Status: Downloaded newer image for alpine:latest ---> 965ea09ff2eb Step 2/61 : ENV PG_MAJOR 10 ---> Running in e53144a7c7ee Removing intermediate container e53144a7c7ee ---> 895d2ca9cfe9 Step 3/61 : ENV RUBY_ALLOCATOR /usr/lib/libjemalloc.so.1 ---> Running in a3bcc3695df3 Removing intermediate container a3bcc3695df3 ---> 06e4d7476da0 Step 4/61 : ENV RAILS_ENV production ---> Running in 0ac24d9fb112 Removing intermediate container 0ac24d9fb112 ---> 4159869cf970 Step 5/61 : RUN echo 2.0.`date +%Y%m%d` > /VERSION ---> Running in 2e2c3136622a Removing intermediate container 2e2c3136622a ---> 9b33df0cef8e Step 6/61 : RUN apt update && apt install -y gnupg sudo curl ---> Running in 53bbee438a5e /bin/sh: apt: not found The command '/bin/sh -c apt update && apt install -y gnupg sudo curl' returned a non-zero code: 127 [root@osestaging1 base]#
- now let's check the list of docker images. it looks like it pulled-in the alpine image, but my new build above isn't present--probably because it failed to build
[root@osestaging1 base]# docker image ls REPOSITORY TAG IMAGE ID CREATED SIZE <none> <none> 9b33df0cef8e About a minute ago 5.55MB local_discourse/discourse_ose latest de5dc3e2af42 15 minutes ago 2.76GB alpine latest 965ea09ff2eb 3 weeks ago 5.55MB discourse/base 2.0.20191013-2320 09725007dc9e 5 weeks ago 2.3GB hello-world latest fce289e99eb9 10 months ago 1.84kB [root@osestaging1 base]#
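- side note: that '<none>:<none>' entry is presumably the dangling intermediate layer left over from the failed build; if needed, it could be removed by ID (hypothetical cleanup, not something I ran):
docker image rm 9b33df0cef8e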
- I changed the FROM line back to debian, and I tried the build again. it worked!
[root@osestaging1 base]# docker build --tag 'discourse_maltfield' . ... Please check your Rails app for 'config.i18n.fallbacks = true'. If you're using I18n (>= 1.1.0) and Rails (< 5.2.2), this should be 'config.i18n.fallbacks = [I18n.default_locale]'. If not, fallbacks will be broken in your app by I18n 1.1.x. For more info see: https://github.com/svenfuchs/i18n/releases/tag/v1.1.0 Post-install message from discourse_image_optim: Rails image assets optimization is extracted into image_optim_rails gem You can safely remove `config.assets.image_optim = false` if you are not going to use that gem Downloading MaxMindDb's GeoLite2-City... Downloading MaxMindDb's GeoLite2-ASN... Removing intermediate container 4ea4f5d6aa85 ---> 1a24fc6acacd Successfully built 1a24fc6acacd Successfully tagged discourse_maltfield:latest [root@osestaging1 base]#
- on a related note wrt understanding how docker works (and trying to differentiate the components of Discourse that are specific to Discourse vs docker itself), most of the documentation for docker is helpful to understand how to use the commands, but it's not especially useful for understanding the bigger picture. For example, when does the Dockerfile get executed? Is it during the pull? Or the push? Or the build? https://docs.docker.com/get-started/
- this documentation was very helpful in understanding the bigger picture of how docker images are built, pushed, pulled, and stored on the machine http://blog.thoward37.me/articles/where-are-docker-images-stored/
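- my takeaway, as a quick recap of the image lifecycle (general docker behavior): only `docker build` executes a Dockerfile; `docker pull` just downloads prebuilt image layers from a registry, and `docker push` uploads them. e.g.:
# build: executes the Dockerfile in the current directory into a new local image
docker build --tag my_test_image .
# pull: downloads prebuilt layers from docker hub; no Dockerfile is read
docker pull discourse/base:2.0.20191013-2320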
- this also provides a ton of info on the Dockerfile https://linuxhint.com/understand_dockerfile
- after the build above, there's now a new image
[root@osestaging1 base]# docker image ls REPOSITORY TAG IMAGE ID CREATED SIZE discourse_maltfield latest 1a24fc6acacd About an hour ago 2.35GB <none> <none> 9b33df0cef8e 2 hours ago 5.55MB local_discourse/discourse_ose latest de5dc3e2af42 2 hours ago 2.76GB alpine latest 965ea09ff2eb 3 weeks ago 5.55MB debian buster-slim 105ec214185d 4 weeks ago 69.2MB discourse/base 2.0.20191013-2320 09725007dc9e 5 weeks ago 2.3GB hello-world latest fce289e99eb9 10 months ago 1.84kB [root@osestaging1 base]#
- I ran the launcher script, passing it the new image that I built as the '--run-image'. It finished without error, and the new discourse_ose container is shown using the new 'discourse_maltfield' image
[root@osestaging1 discourse]# time ./launcher rebuild discourse_ose --run-image discourse_maltfield &> output.log real 10m49.785s user 0m2.751s sys 0m2.394s [root@osestaging1 discourse]# docker container ls CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 0438d3a342f4 discourse_maltfield "/sbin/boot" 20 seconds ago Up 19 seconds discourse_ose [root@osestaging1 discourse]#
- And I confirmed that the new container has nginx with the ModSecurity module working
[root@osestaging1 discourse]# ./launcher enter discourse_ose root@osestaging1-discourse-ose:/# ls -lah /etc/modsecurity/ total 84K drwxr-xr-x. 3 root root 4.0K Nov 18 08:28 . drwxr-xr-x. 1 root root 4.0K Nov 18 10:44 .. drwxr-xr-x. 2 root root 4.0K Nov 18 08:28 crs -rw-r--r--. 1 root root 8.3K Dec 10 2018 modsecurity.conf-recommended -rw-r--r--. 1 root root 52K Dec 10 2018 unicode.mapping root@osestaging1-discourse-ose:/# nginx -V nginx version: nginx/1.17.4 built by gcc 8.3.0 (Debian 8.3.0-6) built with OpenSSL 1.1.1d 10 Sep 2019 TLS SNI support enabled configure arguments: --with-cc-opt='-g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wno-deprecated-declarations' --with-ld-opt='-Wl,-Bsymbolic-functions -fPIE -pie -Wl,-z,relro -Wl,-z,now' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-ipv6 --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_addition_module --with-http_dav_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_v2_module --with-http_sub_module --with-stream --with-stream_ssl_module --with-mail --with-mail_ssl_module --with-threads --add-module=/tmp/ngx_brotli --add-module=/tmp/ModSecurity-nginx root@osestaging1-discourse-ose:/# ls -lah /tmp total 16K drwxrwxrwt. 1 root root 4.0K Nov 18 10:45 . drwxr-xr-x. 1 root root 4.0K Nov 18 10:44 .. root@osestaging1-discourse-ose:/#
- unfortunately the new container doesn't appear to have the nginx configs in-place, which should have been added by the templates
root@osestaging1-discourse-ose:/var/www/discourse# cd /etc/nginx root@osestaging1-discourse-ose:/etc/nginx# ls -lah total 108K drwxr-xr-x. 1 root root 4.0K Nov 18 10:50 . drwxr-xr-x. 1 root root 4.0K Nov 18 10:44 .. drwxr-xr-x. 2 root root 4.0K Aug 13 18:10 conf.d -rw-r--r--. 1 root root 1.1K Aug 13 18:10 fastcgi.conf -rw-r--r--. 1 root root 1.1K Nov 18 08:30 fastcgi.conf.default -rw-r--r--. 1 root root 1007 Aug 13 18:10 fastcgi_params -rw-r--r--. 1 root root 1007 Nov 18 08:30 fastcgi_params.default -rw-r--r--. 1 root root 2.8K Nov 18 08:30 koi-utf -rw-r--r--. 1 root root 2.2K Nov 18 08:30 koi-win -rw-r--r--. 1 root root 3.9K Aug 13 18:10 mime.types -rw-r--r--. 1 root root 5.2K Nov 18 08:30 mime.types.default drwxr-xr-x. 2 root root 4.0K Aug 13 18:10 modules-available drwxr-xr-x. 2 root root 4.0K Nov 18 08:30 modules-enabled -rw-r--r--. 1 root root 1.5K Aug 13 18:10 nginx.conf -rw-r--r--. 1 root root 2.6K Nov 18 08:30 nginx.conf.default -rw-r--r--. 1 root root 180 Aug 13 18:10 proxy_params -rw-r--r--. 1 root root 636 Aug 13 18:10 scgi_params -rw-r--r--. 1 root root 636 Nov 18 08:30 scgi_params.default drwxr-xr-x. 2 root root 4.0K Nov 18 08:28 sites-available drwxr-xr-x. 2 root root 4.0K Nov 18 08:28 sites-enabled drwxr-xr-x. 2 root root 4.0K Nov 18 08:28 snippets -rw-r--r--. 1 root root 664 Aug 13 18:10 uwsgi_params -rw-r--r--. 1 root root 664 Nov 18 08:30 uwsgi_params.default -rw-r--r--. 1 root root 3.6K Nov 18 08:30 win-utf root@osestaging1-discourse-ose:/etc/nginx# ls -lah conf.d total 12K drwxr-xr-x. 2 root root 4.0K Aug 13 18:10 . drwxr-xr-x. 1 root root 4.0K Nov 18 10:50 .. root@osestaging1-discourse-ose:/etc/nginx# ls -lah modules-available/ total 12K drwxr-xr-x. 2 root root 4.0K Aug 13 18:10 . drwxr-xr-x. 1 root root 4.0K Nov 18 10:50 .. root@osestaging1-discourse-ose:/etc/nginx# ls -lah modules-enabled/ total 12K drwxr-xr-x. 2 root root 4.0K Nov 18 08:30 . drwxr-xr-x. 1 root root 4.0K Nov 18 10:50 . root@osestaging1-discourse-ose:/etc/nginx# find . | grep -i discourse root@osestaging1-discourse-ose:/etc/nginx# grep -irl 'discourse' * root@osestaging1-discourse-ose:/etc/nginx#
- the templates are still in-place in my container yaml
[root@osestaging1 discourse]# head containers/discourse_ose.yml -n 20 ## this is the all-in-one, standalone Discourse Docker container template ## ## After making changes to this file, you MUST rebuild ## /var/discourse/launcher rebuild app ## ## BE *VERY* CAREFUL WHEN EDITING! ## YAML FILES ARE SUPER SUPER SENSITIVE TO MISTAKES IN WHITESPACE OR ALIGNMENT! ## visit http://www.yamllint.com/ to validate this file as needed templates: - "templates/postgres.template.yml" - "templates/redis.template.yml" - "templates/web.template.yml" - "templates/web.ratelimited.template.yml" - "templates/web.socketed.template.yml" # - "templates/web.modsecurity.template.yml" ## Uncomment these two lines if you wish to add Lets Encrypt (https) #- "templates/web.ssl.template.yml" #- "templates/web.letsencrypt.ssl.template.yml" [root@osestaging1 discourse]#
- for example, web.template should create the /etc/nginx/conf.d/discourse.conf file
[root@osestaging1 discourse]# grep -ir '/etc/nginx/conf.d/discourse.conf' templates/web.template.yml - "cp $home/config/nginx.sample.conf /etc/nginx/conf.d/discourse.conf" filename: "/etc/nginx/conf.d/discourse.conf" filename: "/etc/nginx/conf.d/discourse.conf" filename: "/etc/nginx/conf.d/discourse.conf" [root@osestaging1 discourse]#
- I checked the verbose output of the `./launcher rebuild...` execution above, and I *still* saw some `docker run` commands being executed against the 'discourse/base...' image, not the 'discourse_maltfield' image I explicitly gave it
[root@osestaging1 discourse]# grep -i 'docker run' output.log ++ /bin/docker run -i --rm -a stdout -a stderr discourse/base:2.0.20191013-2320 echo working ++ /bin/docker run --rm -i -a stdout -a stdin discourse/base:2.0.20191013-2320 ruby -e 'require '\yaml'\ +++ /bin/docker run --rm -i -a stdin -a stdout discourse/base:2.0.20191013-2320 ruby -e 'require '\yaml'\; puts YAML.load(STDIN.readlines.join)['\templates'\]' ++++ /bin/docker run --rm -i -a stdin -a stdout discourse/base:2.0.20191013-2320 ruby -e 'require '\yaml'\; puts YAML.load(STDIN.readlines.join)['\templates'\]' ++ /bin/docker run --rm -i -a stdin -a stdout discourse/base:2.0.20191013-2320 ruby -e 'require '\yaml'\ ++ /bin/docker run --rm -i -a stdin -a stdout discourse/base:2.0.20191013-2320 ruby -e 'require '\yaml'\ ++ /bin/docker run --rm -i -a stdin -a stdout discourse/base:2.0.20191013-2320 ruby -e 'require '\yaml'\ ++ /bin/docker run --rm -i -a stdout -a stdin discourse/base:2.0.20191013-2320 ruby -e 'require '\yaml'\; puts YAML.load(STDIN.readlines.join)['\docker_args'\]' +++ /bin/docker run --rm -i -a stdin -a stdout discourse/base:2.0.20191013-2320 ruby -e 'require '\yaml'\; puts YAML.load(STDIN.readlines.join)['\templates'\]' ++++ /bin/docker run --rm -i -a stdin -a stdout discourse/base:2.0.20191013-2320 ruby -e 'require '\yaml'\; puts YAML.load(STDIN.readlines.join)['\templates'\]' ++ /bin/docker run --rm -i -a stdin -a stdout discourse/base:2.0.20191013-2320 ruby -e 'require '\yaml'\ ++ /bin/docker run --rm -i -a stdin -a stdout discourse/base:2.0.20191013-2320 ruby -e 'require '\yaml'\ ++ /bin/docker run --rm -i -a stdin -a stdout discourse/base:2.0.20191013-2320 ruby -e 'require '\yaml'\ ++ /bin/docker run --rm -i -a stdout -a stdin discourse/base:2.0.20191013-2320 ruby -e 'require '\yaml'\; puts YAML.load(STDIN.readlines.join)['\docker_args'\]' + /bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=2 -e UNICORN_SIDEKIQS=1 -e RUBY_GLOBAL_METHOD_CACHE_SIZE=131072 -e RUBY_GC_HEAP_GROWTH_MAX_SLOTS=40000 -e RUBY_GC_HEAP_INIT_SLOTS=400000 -e RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.5 -e DISCOURSE_DB_SOCKET=/var/run/postgresql -e DISCOURSE_DB_HOST= -e DISCOURSE_DB_PORT= -e DISCOURSE_HOSTNAME=discourse.opensourceecology.org -e DISCOURSE_DEVELOPER_EMAILS=osediscourse_2019@michaelaltfield.net -e DISCOURSE_SMTP_ADDRESS=172.17.0.1 -e DISCOURSE_SMTP_PORT=25 -e DISCOURSE_SMTP_AUTHENTICATION=none -e DISCOURSE_SMTP_OPENSSL_VERIFY_MODE=none -e DISCOURSE_SMTP_ENABLE_START_TLS=false -h osestaging1-discourse-ose -e DOCKER_HOST_IP=172.17.0.1 --name discourse_ose -t -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:fc:97:b8:b4:0d discourse_maltfield /sbin/boot [root@osestaging1 discourse]#
- so it looks like all the yaml-parsing `docker run` executions use the discourse/base image, and only the final run uses my discourse_maltfield image
- it looks like the `launcher` script only takes '--run-image' and stores it in user_run_image, which set_run_image() then copies into run_image
user_run_image="" ... --run-image) user_run_image="$2" shift ;; esac ... set_run_image ... $docker_path run --shm-size=512m $links $attach_on_run $restart_policy "${env[@]}" "${labels[@]}" -h "$hostname" \ -e DOCKER_HOST_IP="$docker_ip" --name $config -t "${ports[@]}" $volumes $mac_address $user_args \ $run_image $boot_command ... set_run_image unset ERR (exec $docker_path run --rm --shm-size=512m $user_args $links "${env[@]}" -e DOCKER_HOST_IP="$docker_ip" -i -a stdin -a stdout -a stderr $volumes $run_image \ /bin/bash -c "$run_command") || ERR=$?
- also note that it looks like we can define the run_image in the container's yaml config file, rather than having to set it as a command line argument passed to `launcher`
set_run_image() { run_image=`cat $config_file | $docker_path run $user_args --rm -i -a stdin -a stdout $image ruby -e \ "require 'yaml'; puts YAML.load(STDIN.readlines.join)['run_image']"` if [ -n "$user_run_image" ]; then run_image=$user_run_image elif [ -z "$run_image" ]; then run_image="$local_discourse/$config" fi }
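- e.g. this hypothetical top-level key in containers/discourse_ose.yml should have the same effect as passing '--run-image discourse_maltfield' (inferred from set_run_image() above; untested):
run_image: "discourse_maltfield"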
- but the `launcher` script uses a distinct variable `image`, which appears to be hard-coded
image="discourse/base:2.0.20191013-2320" ... set_volumes() { volumes=`cat $config_file | $docker_path run $user_args --rm -i -a stdout -a stdin $image ruby -e \ "require 'yaml'; puts YAML.load(STDIN.readlines.join)['volumes'].map{|v| '-v ' << v['volume']['host'] << ':' << v['volume']['guest'] << ' '}.join"` } set_links() { links=`cat $config_file | $docker_path run $user_args --rm -i -a stdout -a stdin $image ruby -e \ "require 'yaml'; puts YAML.load(STDIN.readlines.join)['links'].map{|l| '--link ' << l['link']['name'] << ':' << l['link']['alias'] << ' '}.join"` }
- I replaced the hard-coded image with my own and re-ran the `launcher rebuild...` command
[root@osestaging1 discourse]# grep "image=" launcher user_run_image="" user_run_image="$2" #image="discourse/base:2.0.20191013-2320" image="discourse_maltfield" run_image=`cat $config_file | $docker_path run $user_args --rm -i -a stdin -a stdout $image ruby -e \ run_image=$user_run_image run_image="$local_discourse/$config" base_image=`cat $config_file | $docker_path run $user_args --rm -i -a stdin -a stdout $image ruby -e \ image=$base_image [root@osestaging1 discourse]# time ./launcher rebuild discourse_ose --run-image discourse_maltfield &> output.log real 8m26.645s user 0m2.800s sys 0m2.559s [root@osestaging1 discourse]#
- shit, that didn't work; there are still no nginx configs in the build. I'm thinking that specifying '--run-image' is breaking it: per set_run_image() above, it overrides the 'local_discourse/discourse_ose' image that the bootstrap builds with the template changes applied. what if I just override 'image' and don't specify the '--run-image'?
[root@osestaging1 discourse]# time ./launcher rebuild discourse_ose &> output.log real 8m10.725s user 0m2.691s sys 0m2.406s [root@osestaging1 discourse]#
- that worked!
[root@osestaging1 discourse]# ./launcher enter discourse_ose root@osestaging1-discourse-ose:/var/www/discourse# ls -lah /etc/nginx/conf.d total 24K drwxr-xr-x. 1 root root 4.0K Nov 18 11:41 . drwxr-xr-x. 1 root root 4.0K Nov 18 08:30 .. -rw-r--r--. 1 root root 8.4K Nov 18 11:41 discourse.conf -rw-r--r--. 1 root root 661 Nov 18 11:41 modsecurity.include root@osestaging1-discourse-ose:/var/www/discourse#
- I updated our documentation to use this solution https://wiki.opensourceecology.org/wiki/Discourse#Nginx_mod_security
- And I updated my topic on meta.discourse.org about using a WAF with Discourse for the benefit of the community https://meta.discourse.org/t/discourse-web-application-firewall-waf-mod-security/133612/7
Sun Nov 17, 2019
- my docker for-linux github issue requesting they upload their gpg key to a keyserver to assist their clients in validating the key's authenticity out-of-band from docker.com was closed as a duplicate of #602 https://github.com/docker/for-linux/issues/849
- but #602 is *not* a duplicate; it's a distinct request to add the fingerprint of the key to the get.docker.com install script--which I support, but it doesn't solve the out-of-band authenticity validation issue as both the install script *and* the key are downloaded from the docker.com domain. I requested that the above ticket be re-opened (it should only take a few minutes for them to action, anyway) https://github.com/docker/for-linux/issues/849
- ...
- I created a new topic specifically asking what WAF is recommended for Discourse https://meta.discourse.org/t/discourse-web-application-firewall-waf-mod-security/133612
- I pointed out that, even if the code is perfect (which is impossible anyway), a web application can be vulnerable to an exploit if one of its dependencies is vulnerable. I pointed to the example of CVE-2019-11043, which made servers with vulnerable php-fpm & nginx subject to critical remote code execution. But this is entirely mitigated by mod_security's CRS, which blocks the requests exploiting this CVE because it rejects queries containing \n or \r characters. https://www.nginx.com/blog/php-fpm-cve-2019-11043-vulnerability-nginx/
- I did a quick test of the site with mod_security enabled. first I did a regular request with nothing malicious
user@ose:/tmp$ curl -sik "https://discourse.opensourceecology.org/" | head -n1 HTTP/1.1 200 OK
- then I do an example of malicious query including a newline in the request
user@ose:/tmp$ curl -sik -X POST --data-binary $'line break test\n' "https://discourse.opensourceecology.org/" | head -n1 HTTP/1.1 403 Forbidden
- then I disabled modsecurity in nginx on the discourse docker container and re-executed the same query. note that this query was not stopped by modsecurity, and made it to the backend app. in this example, if we were running php-fpm and nginx we could be vulnerable. But with modsecurity enabled, we are protected both before and after the 0day is disclosed
user@ose:/tmp$ curl -sik -X POST --data-binary $'line break test\n' "https://discourse.opensourceecology.org/" | head -n1 HTTP/1.1 404 Not Found
Tue Nov 12, 2019
- I created an issue in the docker-ce github asking them to please upload their gpg key to a keyserver so that it can actually be validated in a more sane way
https://github.com/docker/for-linux/issues/849
- I also asked them to integrate it into the web of trust. currently it has no non-self signatures *facepalm*
- I also recommended that they create a keybase account and use their official twitter account to identify themselves on keybase, which would attest to their gpg key fingerprint
- ...
- I asked for more info about the Discourse built-in "anonymous" cache. For example, how do I see how many queries are cache hits vs misses? And is there a system in-place for cache invalidation? If I set the cache to 24 hours (it looks like it defaults to 60 seconds), would that mean it will serve stale content for a max of 24 hours? None of this is documented. https://meta.discourse.org/t/discourse-purge-cache-method-on-content-changes/132917/18
- ...
- regarding mod_security, there *are* risks of breaking future installs by me modifying the nginx install & config used by Discourse inside the Docker container. But the alternative of adding apache as a proxy before Discourse (nginx -> varnish -> apache -> nginx) not only is a ridiculous architecture that will make troubleshooting difficult for the future sysadmin, but it also looks like an apache proxy performs awfully for long polling, which is used by the Discourse message bus https://meta.discourse.org/t/how-to-run-discourse-in-apache-vhost-not-nginx/133112/13
- so I think the best option is still to update the nginx config inside the docker container to use mod_security
- this is probably the best guide for compiling nginx with mod_security; it's from nginx.com https://www.nginx.com/blog/compiling-and-installing-modsecurity-for-open-source-nginx/
- first, let's get the existing nginx config so we can validate that, after our changes, it includes mod_security. note that the configure line here shows 'ngx_brotli', which was added by the Discourse install-nginx script https://github.com/discourse/discourse_docker/blob/416467f6ead98f82342e8a926dc6e06f36dfbd56/image/base/install-nginx
root@osestaging1-discourse-ose:/var/www/discourse# nginx -V nginx version: nginx/1.17.4 built by gcc 8.3.0 (Debian 8.3.0-6) built with OpenSSL 1.1.1d 10 Sep 2019 TLS SNI support enabled configure arguments: --with-cc-opt='-g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wno-deprecated-declarations' --with-ld-opt='-Wl,-Bsymbolic-functions -fPIE -pie -Wl,-z,relro -Wl,-z,now' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-ipv6 --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_addition_module --with-http_dav_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_v2_module --with-http_sub_module --with-stream --with-stream_ssl_module --with-mail --with-mail_ssl_module --with-threads --add-module=/tmp/ngx_brotli root@osestaging1-discourse-ose:/var/www/discourse#
- so the install guide above first downloads, compiles, & installs the SpiderLabs ModSecurity package, *then* it downloads the 'ModSecurity-nginx.git' nginx module, which is added to the './configure' line as a dynamic module
- but the first one we may be able to get from the apt repos (the container is debian buster, so apt, not yum)
root@osestaging1-discourse-ose:/var/www/discourse# apt-cache search security ... libmodsecurity-dev - ModSecurity v3 library component (development files) libmodsecurity3 - ModSecurity v3 library component libapache2-mod-security2 - Tighten web applications security for Apache modsecurity-crs - OWASP ModSecurity Core Rule Set
- before I update 'install-nginx', let's see if I can get it working manually. First I installed the 'modsecurity-crs' package above
root@osestaging1-discourse-ose:/var/www/discourse# apt-get install modsecurity-crs Reading package lists... Done Building dependency tree Reading state information... Done The following additional packages will be installed: apache2-bin libapache2-mod-security2 libapr1 libaprutil1 libaprutil1-dbd-sqlite3 libaprutil1-ldap libbrotli1 libjansson4 liblua5.1-0 liblua5.2-0 libyajl2 Suggested packages: apache2-doc apache2-suexec-pristine | apache2-suexec-custom www-browser lua geoip-database-contrib ruby The following NEW packages will be installed: apache2-bin libapache2-mod-security2 libapr1 libaprutil1 libaprutil1-dbd-sqlite3 libaprutil1-ldap libbrotli1 libjansson4 liblua5.1-0 liblua5.2-0 libyajl2 modsecurity-crs 0 upgraded, 12 newly installed, 0 to remove and 3 not upgraded. Need to get 2,544 kB of archives. After this operation, 11.1 MB of additional disk space will be used. Do you want to continue? [Y/n] y Err:1 http://deb.debian.org/debian buster/main amd64 libapr1 amd64 1.6.5-1+b1 Temporary failure resolving 'deb.debian.org'
- ugh, that failed since I blocked the docker container from having internet access. for the purposes of testing, I'll undo that for now
[root@osestaging1 base]# vim /usr/lib/systemd/system/docker.service ... [root@osestaging1 base]# systemctl daemon-reload [root@osestaging1 base]# systemctl restart docker ... [root@osestaging1 discourse]# ./launcher enter discourse_ose root@osestaging1-discourse-ose:/var/www/discourse# apt-get install modsecurity-crs Reading package lists... Done Building dependency tree Reading state information... Done The following additional packages will be installed: apache2-bin libapache2-mod-security2 libapr1 libaprutil1 libaprutil1-dbd-sqlite3 libaprutil1-ldap libbrotli1 libjansson4 liblua5.1-0 liblua5.2-0 libyajl2 Suggested packages: apache2-doc apache2-suexec-pristine | apache2-suexec-custom www-browser lua geoip-database-contrib ruby The following NEW packages will be installed: apache2-bin libapache2-mod-security2 libapr1 libaprutil1 libaprutil1-dbd-sqlite3 libaprutil1-ldap libbrotli1 libjansson4 liblua5.1-0 liblua5.2-0 libyajl2 modsecurity-crs 0 upgraded, 12 newly installed, 0 to remove and 3 not upgraded. Need to get 2,544 kB of archives. After this operation, 11.1 MB of additional disk space will be used. Do you want to continue? [Y/n] y Get:1 http://security.debian.org/debian-security buster/updates/main amd64 apache2-bin amd64 2.4.38-3+deb10u3 [1,307 kB] ... Unpacking modsecurity-crs (3.1.0-1) ... Setting up libbrotli1:amd64 (1.0.7-2) ... Setting up libyajl2:amd64 (2.1.0-3) ... Setting up libapr1:amd64 (1.6.5-1+b1) ... Setting up modsecurity-crs (3.1.0-1) ... Setting up libjansson4:amd64 (2.12-1) ... Setting up liblua5.2-0:amd64 (5.2.4-1.1+b2) ... Setting up liblua5.1-0:amd64 (5.1.5-8.1+b2) ... Setting up libaprutil1:amd64 (1.6.1-4) ... Setting up libaprutil1-ldap:amd64 (1.6.1-4) ... Setting up libaprutil1-dbd-sqlite3:amd64 (1.6.1-4) ... Setting up apache2-bin (2.4.38-3+deb10u3) ... Setting up libapache2-mod-security2 (2.9.3-1) ... Processing triggers for libc-bin (2.28-10) ... root@osestaging1-discourse-ose:/var/www/discourse#
- unfortunately that installed apache, but fortunately it looks like apache doesn't get started so it's no biggie
- but it looks like that CRS package + the ModSecurity-nginx module source wasn't sufficient to compile nginx with the mod_security module. It still wants the ModSecurity library
root@osestaging1-discourse-ose:/tmp/nginx-1.17.4# ./configure --with-cc-opt='-g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wno-deprecated-declarations' --with-ld-opt='-Wl,-Bsymbolic-functions -fPIE -pie -Wl,-z,relro -Wl,-z,now' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-ipv6 --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_addition_module --with-http_dav_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_v2_module --with-http_sub_module --with-stream --with-stream_ssl_module --with-mail --with-mail_ssl_module --with-threads --add-module=/tmp/ngx_brotli --add-dynamic-module=/tmp/ModSecurity-nginx
- it looks like adding the 'libmodsecurity3' package was also not sufficient
root@osestaging1-discourse-ose:/tmp/nginx-1.17.4# apt-get install libmodsecurity3 Reading package lists... Done Building dependency tree Reading state information... Done The following additional packages will be installed: libfuzzy2 liblua5.3-0 libmaxminddb0 Suggested packages: mmdb-bin The following NEW packages will be installed: libfuzzy2 liblua5.3-0 libmaxminddb0 libmodsecurity3 0 upgraded, 4 newly installed, 0 to remove and 3 not upgraded. Need to get 682 kB of archives. After this operation, 3,010 kB of additional disk space will be used. Do you want to continue? [Y/n] y Get:1 http://deb.debian.org/debian buster/main amd64 libfuzzy2 amd64 2.14.1+git20180629.57fcfff-1 [19.3 kB] Get:2 http://deb.debian.org/debian buster/main amd64 liblua5.3-0 amd64 5.3.3-1.1 [120 kB] Get:3 http://deb.debian.org/debian buster/main amd64 libmaxminddb0 amd64 1.3.2-1 [30.7 kB] Get:4 http://deb.debian.org/debian buster/main amd64 libmodsecurity3 amd64 3.0.3-1 [513 kB] Fetched 682 kB in 0s (1,596 kB/s) debconf: delaying package configuration, since apt-utils is not installed Selecting previously unselected package libfuzzy2:amd64. (Reading database ... 45112 files and directories currently installed.) Preparing to unpack .../libfuzzy2_2.14.1+git20180629.57fcfff-1_amd64.deb ... Unpacking libfuzzy2:amd64 (2.14.1+git20180629.57fcfff-1) ... Selecting previously unselected package liblua5.3-0:amd64. Preparing to unpack .../liblua5.3-0_5.3.3-1.1_amd64.deb ... Unpacking liblua5.3-0:amd64 (5.3.3-1.1) ... Selecting previously unselected package libmaxminddb0:amd64. Preparing to unpack .../libmaxminddb0_1.3.2-1_amd64.deb ... Unpacking libmaxminddb0:amd64 (1.3.2-1) ... Selecting previously unselected package libmodsecurity3:amd64. Preparing to unpack .../libmodsecurity3_3.0.3-1_amd64.deb ... Unpacking libmodsecurity3:amd64 (3.0.3-1) ... Setting up libfuzzy2:amd64 (2.14.1+git20180629.57fcfff-1) ... Setting up libmaxminddb0:amd64 (1.3.2-1) ... Setting up liblua5.3-0:amd64 (5.3.3-1.1) ... Setting up libmodsecurity3:amd64 (3.0.3-1) ... Processing triggers for libc-bin (2.28-10) ... root@osestaging1-discourse-ose:/tmp/nginx-1.17.4# ./configure --with-cc-opt='-g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wno-deprecated-declarations' --with-ld-opt='-Wl,-Bsymbolic-functions -fPIE -pie -Wl,-z,relro -Wl,-z,now' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-ipv6 --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_addition_module --with-http_dav_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_v2_module --with-http_sub_module --with-stream --with-stream_ssl_module --with-mail --with-mail_ssl_module --with-threads --add-module=/tmp/ngx_brotli --add-dynamic-module=/tmp/ModSecurity-nginx ... configuring additional dynamic modules adding module in /tmp/ModSecurity-nginx checking for ModSecurity library ... not found checking for ModSecurity library in /usr/local/modsecurity ... not found ./configure: error: ngx_http_modsecurity_module requires the ModSecurity library. 
root@osestaging1-discourse-ose:/tmp/nginx-1.17.4#
- but the 'libmodsecurity-dev' package worked!
root@osestaging1-discourse-ose:/tmp/nginx-1.17.4# apt-get install libmodsecurity-dev Reading package lists... Done Building dependency tree Reading state information... Done The following NEW packages will be installed: libmodsecurity-dev 0 upgraded, 1 newly installed, 0 to remove and 3 not upgraded. Need to get 614 kB of archives. After this operation, 5,840 kB of additional disk space will be used. Get:1 http://deb.debian.org/debian buster/main amd64 libmodsecurity-dev amd64 3.0.3-1 [614 kB] Fetched 614 kB in 0s (1,242 kB/s) ydebconf: delaying package configuration, since apt-utils is not installed Selecting previously unselected package libmodsecurity-dev:amd64. (Reading database ... 45143 files and directories currently installed.) Preparing to unpack .../libmodsecurity-dev_3.0.3-1_amd64.deb ... Unpacking libmodsecurity-dev:amd64 (3.0.3-1) ... Setting up libmodsecurity-dev:amd64 (3.0.3-1) ... root@osestaging1-discourse-ose:/tmp/nginx-1.17.4# ./configure --with-cc-opt='-g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wno-deprecated-declarations' --with-ld-opt='-Wl,-Bsymbolic-functions -fPIE -pie -Wl,-z,relro -Wl,-z,now' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-ipv6 --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_addition_module --with-http_dav_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_v2_module --with-http_sub_module --with-stream --with-stream_ssl_module --with-mail --with-mail_ssl_module --with-threads --add-module=/tmp/ngx_brotli --add-dynamic-module=/tmp/ModSecurity-nginx ... + ngx_brotli was configured configuring additional dynamic modules adding module in /tmp/ModSecurity-nginx checking for ModSecurity library ... found + ngx_http_modsecurity_module was configured checking for PCRE library ... found checking for PCRE JIT support ... found checking for OpenSSL library ... found checking for zlib library ... found creating objs/Makefile Configuration summary + using threads + using system PCRE library + using system OpenSSL library + using system zlib library nginx path prefix: "/usr/share/nginx" nginx binary file: "/usr/share/nginx/sbin/nginx" nginx modules path: "/usr/share/nginx/modules" nginx configuration prefix: "/etc/nginx" nginx configuration file: "/etc/nginx/nginx.conf" nginx pid file: "/run/nginx.pid" nginx error log file: "/var/log/nginx/error.log" nginx http access log file: "/var/log/nginx/access.log" nginx http client request body temporary files: "/var/lib/nginx/body" nginx http proxy temporary files: "/var/lib/nginx/proxy" nginx http fastcgi temporary files: "/var/lib/nginx/fastcgi" nginx http uwsgi temporary files: "/var/lib/nginx/uwsgi" nginx http scgi temporary files: "/var/lib/nginx/scgi" ./configure: warning: the "--with-ipv6" option is deprecated root@osestaging1-discourse-ose:/tmp/nginx-1.17.4#
- but then the make install failed :(
root@osestaging1-discourse-ose:/tmp/nginx-1.17.4# make install ... objs/addon/src/ngx_http_modsecurity_rewrite.o \ objs/ngx_http_modsecurity_module_modules.o \ -Wl,-Bsymbolic-functions -fPIE -pie -Wl,-z,relro -Wl,-z,now -lmodsecurity \ -shared /usr/bin/ld: objs/addon/src/ngx_http_modsecurity_module.o: relocation R_X86_64_PC32 against symbol `stderr@@GLIBC_2.2.5' can not be used when making a shared object; recompile with -fPIC /usr/bin/ld: objs/addon/src/ngx_http_modsecurity_pre_access.o: relocation R_X86_64_PC32 against symbol `ngx_http_modsecurity_module' can not be used when making a shared object; recompile with -fPIC /usr/bin/ld: objs/addon/src/ngx_http_modsecurity_header_filter.o: relocation R_X86_64_PC32 against undefined symbol `ngx_http_core_module' can not be used when making a shared object; recompile with -fPIC /usr/bin/ld: objs/addon/src/ngx_http_modsecurity_body_filter.o: relocation R_X86_64_PC32 against symbol `ngx_http_modsecurity_module' can not be used when making a shared object; recompile with -fPIC /usr/bin/ld: objs/addon/src/ngx_http_modsecurity_log.o: relocation R_X86_64_PC32 against symbol `ngx_http_modsecurity_module' can not be used when making a shared object; recompile with -fPIC /usr/bin/ld: objs/addon/src/ngx_http_modsecurity_rewrite.o: relocation R_X86_64_PC32 against symbol `ngx_http_modsecurity_module' can not be used when making a shared object; recompile with -fPIC /usr/bin/ld: final link failed: nonrepresentable section on output collect2: error: ld returned 1 exit status make[1]: *** [objs/Makefile:1952: objs/ngx_http_modsecurity_module.so] Error 1 make[1]: Leaving directory '/tmp/nginx-1.17.4' make: *** [Makefile:11: install] Error 2 root@osestaging1-discourse-ose:/tmp/nginx-1.17.4#
- using '--add-module' instead of '--add-dynamic-module' fixes this issue (presumably because statically linking the module into the nginx binary sidesteps the -fPIC requirement for shared objects) *shrug* And now `nginx -V` shows the 'ModSecurity-nginx' module. sweet!
root@osestaging1-discourse-ose:/tmp/nginx-1.17.4# make ... objs/ngx_modules.o \ -Wl,-Bsymbolic-functions -fPIE -pie -Wl,-z,relro -Wl,-z,now -ldl -lpthread -lpthread -lcrypt -lm -lmodsecurity -lpcre -lssl -lcrypto -ldl -lpthread -lz \ -Wl,-E sed -e "s|%%PREFIX%%|/usr/share/nginx|" \ -e "s|%%PID_PATH%%|/run/nginx.pid|" \ -e "s|%%CONF_PATH%%|/etc/nginx/nginx.conf|" \ -e "s|%%ERROR_LOG_PATH%%|/var/log/nginx/error.log|" \ < man/nginx.8 > objs/nginx.8 make[1]: Leaving directory '/tmp/nginx-1.17.4' root@osestaging1-discourse-ose:/tmp/nginx-1.17.4# make install cp conf/nginx.conf '/etc/nginx/nginx.conf.default' test -d '/run' \ || mkdir -p '/run' test -d '/var/log/nginx' \ || mkdir -p '/var/log/nginx' test -d '/usr/share/nginx/html' \ || cp -R html '/usr/share/nginx' test -d '/var/log/nginx' \ || mkdir -p '/var/log/nginx' make[1]: Leaving directory '/tmp/nginx-1.17.4' root@osestaging1-discourse-ose:/tmp/nginx-1.17.4# nginx -V nginx version: nginx/1.17.4 built by gcc 8.3.0 (Debian 8.3.0-6) built with OpenSSL 1.1.1d 10 Sep 2019 TLS SNI support enabled configure arguments: --with-cc-opt='-g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wno-deprecated-declarations' --with-ld-opt='-Wl,-Bsymbolic-functions -fPIE -pie -Wl,-z,relro -Wl,-z,now' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-ipv6 --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_addition_module --with-http_dav_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_v2_module --with-http_sub_module --with-stream --with-stream_ssl_module --with-mail --with-mail_ssl_module --with-threads --add-module=/tmp/ngx_brotli root@osestaging1-discourse-ose:/tmp/nginx-1.17.4# mv /usr/share/nginx/sbin/nginx /usr/sbin root@osestaging1-discourse-ose:/tmp/nginx-1.17.4# nginx -V nginx version: nginx/1.17.4 built by gcc 8.3.0 (Debian 8.3.0-6) built with OpenSSL 1.1.1d 10 Sep 2019 TLS SNI support enabled configure arguments: --with-cc-opt='-g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wno-deprecated-declarations' --with-ld-opt='-Wl,-Bsymbolic-functions -fPIE -pie -Wl,-z,relro -Wl,-z,now' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-ipv6 --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_addition_module --with-http_dav_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_v2_module --with-http_sub_module --with-stream --with-stream_ssl_module --with-mail --with-mail_ssl_module --with-threads --add-module=/tmp/ngx_brotli 
--add-module=/tmp/ModSecurity-nginx root@osestaging1-discourse-ose:/tmp/nginx-1.17.4#
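- to recap, the full set of manual steps that produced a working ModSecurity-enabled nginx inside the container (condensed from the above; '<existing arguments>' stands for the same configure arguments shown in the `nginx -V` output):
apt-get install -y libmodsecurity-dev modsecurity-crs
cd /tmp
git clone --depth 1 https://github.com/SpiderLabs/ModSecurity-nginx.git
cd /tmp/nginx-1.17.4
./configure <existing arguments> --add-module=/tmp/ngx_brotli --add-module=/tmp/ModSecurity-nginx
make && make install
mv /usr/share/nginx/sbin/nginx /usr/sbin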
- meanwhile, it looks like the crs package I installed before dropped all of its rules into /usr/share/modsecurity-crs
root@osestaging1-discourse-ose:/tmp/nginx-1.17.4# ls -lah /etc/modsecurity/ total 84K drwxr-xr-x. 3 root root 4.0K Nov 12 08:22 . drwxr-xr-x. 1 root root 4.0K Nov 12 08:38 .. drwxr-xr-x. 2 root root 4.0K Nov 12 08:22 crs -rw-r--r--. 1 root root 8.3K Dec 10 2018 modsecurity.conf-recommended -rw-r--r--. 1 root root 52K Dec 10 2018 unicode.mapping root@osestaging1-discourse-ose:/tmp/nginx-1.17.4# ls -lah /usr/share/modsecurity-crs/ total 28K drwxr-xr-x. 4 root root 4.0K Nov 12 08:22 . drwxr-xr-x. 1 root root 4.0K Nov 12 08:22 .. -rw-r--r--. 1 root root 373 Nov 27 2018 owasp-crs.load drwxr-xr-x. 2 root root 4.0K Nov 12 08:22 rules drwxr-xr-x. 13 root root 4.0K Nov 12 08:22 util root@osestaging1-discourse-ose:/tmp/nginx-1.17.4# ls -lah /usr/share/modsecurity-crs/rules/ total 648K drwxr-xr-x. 2 root root 4.0K Nov 12 08:22 . drwxr-xr-x. 4 root root 4.0K Nov 12 08:22 .. -rw-r--r--. 1 root root 659 Nov 27 2018 crawlers-user-agents.data -rw-r--r--. 1 root root 551 Nov 27 2018 iis-errors.data -rw-r--r--. 1 root root 833 Nov 27 2018 java-classes.data -rw-r--r--. 1 root root 264 Nov 27 2018 java-code-leakages.data -rw-r--r--. 1 root root 240 Nov 27 2018 java-errors.data -rw-r--r--. 1 root root 31K Nov 27 2018 lfi-os-files.data -rw-r--r--. 1 root root 5.3K Nov 27 2018 php-config-directives.data -rw-r--r--. 1 root root 9.1K Nov 27 2018 php-errors.data -rw-r--r--. 1 root root 589 Nov 27 2018 php-function-names-933150.data -rw-r--r--. 1 root root 21K Nov 27 2018 php-function-names-933151.data -rw-r--r--. 1 root root 224 Nov 27 2018 php-variables.data -rw-r--r--. 1 root root 13K Nov 27 2018 REQUEST-901-INITIALIZATION.conf -rw-r--r--. 1 root root 12K Nov 27 2018 REQUEST-903.9001-DRUPAL-EXCLUSION-RULES.conf -rw-r--r--. 1 root root 22K Nov 27 2018 REQUEST-903.9002-WORDPRESS-EXCLUSION-RULES.conf -rw-r--r--. 1 root root 8.9K Nov 27 2018 REQUEST-903.9003-NEXTCLOUD-EXCLUSION-RULES.conf -rw-r--r--. 1 root root 7.3K Nov 27 2018 REQUEST-903.9004-DOKUWIKI-EXCLUSION-RULES.conf -rw-r--r--. 1 root root 1.8K Nov 27 2018 REQUEST-903.9005-CPANEL-EXCLUSION-RULES.conf -rw-r--r--. 1 root root 1.5K Nov 27 2018 REQUEST-905-COMMON-EXCEPTIONS.conf -rw-r--r--. 1 root root 11K Nov 27 2018 REQUEST-910-IP-REPUTATION.conf -rw-r--r--. 1 root root 2.8K Nov 27 2018 REQUEST-911-METHOD-ENFORCEMENT.conf -rw-r--r--. 1 root root 9.7K Nov 27 2018 REQUEST-912-DOS-PROTECTION.conf -rw-r--r--. 1 root root 7.7K Nov 27 2018 REQUEST-913-SCANNER-DETECTION.conf -rw-r--r--. 1 root root 52K Nov 27 2018 REQUEST-920-PROTOCOL-ENFORCEMENT.conf -rw-r--r--. 1 root root 11K Nov 27 2018 REQUEST-921-PROTOCOL-ATTACK.conf -rw-r--r--. 1 root root 6.3K Nov 27 2018 REQUEST-930-APPLICATION-ATTACK-LFI.conf -rw-r--r--. 1 root root 5.9K Nov 27 2018 REQUEST-931-APPLICATION-ATTACK-RFI.conf -rw-r--r--. 1 root root 54K Nov 27 2018 REQUEST-932-APPLICATION-ATTACK-RCE.conf -rw-r--r--. 1 root root 31K Nov 27 2018 REQUEST-933-APPLICATION-ATTACK-PHP.conf -rw-r--r--. 1 root root 41K Nov 27 2018 REQUEST-941-APPLICATION-ATTACK-XSS.conf -rw-r--r--. 1 root root 70K Nov 27 2018 REQUEST-942-APPLICATION-ATTACK-SQLI.conf -rw-r--r--. 1 root root 5.6K Nov 27 2018 REQUEST-943-APPLICATION-ATTACK-SESSION-FIXATION.conf -rw-r--r--. 1 root root 15K Nov 27 2018 REQUEST-944-APPLICATION-ATTACK-JAVA.conf -rw-r--r--. 1 root root 4.1K Nov 27 2018 REQUEST-949-BLOCKING-EVALUATION.conf -rw-r--r--. 1 root root 3.9K Nov 27 2018 RESPONSE-950-DATA-LEAKAGES.conf -rw-r--r--. 1 root root 19K Nov 27 2018 RESPONSE-951-DATA-LEAKAGES-SQL.conf -rw-r--r--. 
1 root root 3.7K Nov 27 2018 RESPONSE-952-DATA-LEAKAGES-JAVA.conf -rw-r--r--. 1 root root 5.2K Nov 27 2018 RESPONSE-953-DATA-LEAKAGES-PHP.conf -rw-r--r--. 1 root root 6.0K Nov 27 2018 RESPONSE-954-DATA-LEAKAGES-IIS.conf -rw-r--r--. 1 root root 3.8K Nov 27 2018 RESPONSE-959-BLOCKING-EVALUATION.conf -rw-r--r--. 1 root root 6.6K Nov 27 2018 RESPONSE-980-CORRELATION.conf -rw-r--r--. 1 root root 1.7K Nov 27 2018 restricted-files.data -rw-r--r--. 1 root root 390 Nov 27 2018 restricted-upload.data -rw-r--r--. 1 root root 216 Nov 27 2018 scanners-headers.data -rw-r--r--. 1 root root 418 Nov 27 2018 scanners-urls.data -rw-r--r--. 1 root root 4.2K Nov 27 2018 scanners-user-agents.data -rw-r--r--. 1 root root 717 Nov 27 2018 scripting-user-agents.data -rw-r--r--. 1 root root 1.9K Nov 27 2018 sql-errors.data -rw-r--r--. 1 root root 2.0K Nov 27 2018 sql-function-names.data -rw-r--r--. 1 root root 973 Nov 27 2018 unix-shell.data -rw-r--r--. 1 root root 3.9K Nov 27 2018 windows-powershell-commands.data root@osestaging1-discourse-ose:/tmp/nginx-1.17.4#
- I updated the 'install-nginx' script with the changes tested above; here's a diff (TODO: test this and write a sed line to make these changes, to add to our install guide on the wiki)
[root@osestaging1 base]# diff install-nginx.20191112.orig install-nginx 7a8,12 > # mod_security --maltfield > apt-get install -y libmodsecurity-dev modsecurity-crs > cd /tmp > git clone --depth 1 https://github.com/SpiderLabs/ModSecurity-nginx.git > 34c39 < ./configure --with-cc-opt='-g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wno-deprecated-declarations' --with-ld-opt='-Wl,-Bsymbolic-functions -fPIE -pie -Wl,-z,relro -Wl,-z,now' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-ipv6 --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_addition_module --with-http_dav_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_v2_module --with-http_sub_module --with-stream --with-stream_ssl_module --with-mail --with-mail_ssl_module --with-threads --add-module=/tmp/ngx_brotli --- > ./configure --with-cc-opt='-g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wno-deprecated-declarations' --with-ld-opt='-Wl,-Bsymbolic-functions -fPIE -pie -Wl,-z,relro -Wl,-z,now' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-ipv6 --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_addition_module --with-http_dav_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_v2_module --with-http_sub_module --with-stream --with-stream_ssl_module --with-mail --with-mail_ssl_module --with-threads --add-module=/tmp/ngx_brotli --add-module=/tmp/ModSecurity-nginx [root@osestaging1 base]#
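- a first stab at that TODO sed line (untested; the line number and the configure anchor are taken from the diff above, and this assumes GNU sed, which expands \n in appended text):
# insert the ModSecurity prep after line 7 of install-nginx (per the 7a8,12 hunk)
sed -i '7a # mod_security --maltfield\napt-get install -y libmodsecurity-dev modsecurity-crs\ncd /tmp\ngit clone --depth 1 https://github.com/SpiderLabs/ModSecurity-nginx.git\n' install-nginx
# append the new module to the ./configure line (per the 34c39 hunk)
sed -i 's%--add-module=/tmp/ngx_brotli$%--add-module=/tmp/ngx_brotli --add-module=/tmp/ModSecurity-nginx%' install-nginx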
- the next step of the nginx mod_security guide linked above (skipping over the bits for loading the dynamic module which I no longer compiled-in as dynamic) is to put modsecurity.conf-recommended in /etc/nginx/modsec/modsecurity.conf and change "SecRuleEngine DetectionOnly" to "SecRuleEngine On". Well, it looks like we don't have to get that from github; our crs package already created modsecurity.conf-recommended in /etc/modsecurity/
root@osestaging1-discourse-ose:/tmp/nginx-1.17.4# cp /etc/modsecurity/modsecurity.conf-recommended /etc/modsecurity/modsecurity.conf root@osestaging1-discourse-ose:/tmp/nginx-1.17.4# sed -i 's/SecRuleEngine DetectionOnly/SecRuleEngine On/' /etc/modsecurity/modsecurity.conf root@osestaging1-discourse-ose:/tmp/nginx-1.17.4# diff /etc/modsecurity/modsecurity.conf-recommended /etc/modsecurity/modsecurity.conf 7c7 < SecRuleEngine DetectionOnly --- > SecRuleEngine On root@osestaging1-discourse-ose:/tmp/nginx-1.17.4#
- now we create our nginx modsecurity config file, which can then be included in our nginx discourse vhost config file
cat << EOF > /etc/nginx/conf.d/modsecurity.include ################################################################################ # File: modsecurity.include # Version: 0.1 # Purpose: Defines mod_security rules for the discourse vhost # This should be included in the server{} blocks nginx vhosts. # Author: Michael Altfield <michael@opensourceecology.org> # Created: 2019-11-12 # Updated: 2019-11-12 ################################################################################ Include "/etc/modsecurity/modsecurity.conf" # Basic test rule SecRule ARGS:testparam "@contains test" "id:1234,deny,status:403" EOF
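- note: the heredoc above only creates the include file; the ModSecurity-nginx connector directives also have to be added inside the server{} block of /etc/nginx/conf.d/discourse.conf. That edit isn't captured verbatim in this log, but the nginx -t errors below point at discourse.conf:39, so presumably something like:
modsecurity on;
modsecurity_rules_file /etc/nginx/conf.d/modsecurity.include;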
- testing the nginx config failed
root@osestaging1-discourse-ose:/etc/nginx/conf.d# nginx -t nginx: [emerg] "modsecurity_rules_file" directive Rules error. File: /etc/modsecurity/modsecurity.conf. Line: 39. Column: 33. As of ModSecurity version 3.0, SecRequestBodyInMemoryLimit is no longer supported. Instead, you can use your web server configurations to control those values. ModSecurity will follow the web server decision. in /etc/nginx/conf.d/discourse.conf:39 nginx: configuration file /etc/nginx/nginx.conf test failed root@osestaging1-discourse-ose:/etc/nginx/conf.d#
- commenting out the offending SecRequestBodyInMemoryLimit line fixed it, but now there's a distinct issue with the log path
root@osestaging1-discourse-ose:/etc/nginx/conf.d# sed --in-place=.`date "+%Y%m%d_%H%M%S"` 's^\(\s*\)[^#]*SecRequestBodyInMemoryLimit\(.*\)^\1#SecRequestBodyInMemoryLimit\2^' /etc/modsecurity/modsecurity.conf root@osestaging1-discourse-ose:/etc/nginx/conf.d# nginx -t nginx: [emerg] "modsecurity_rules_file" directive Failed to open file: /var/log/apache2/modsec_audit.log in /etc/nginx/conf.d/discourse.conf:39 nginx: configuration file /etc/nginx/nginx.conf test failed root@osestaging1-discourse-ose:/etc/nginx/conf.d#
- I changed the log path from apache to nginx; that fixed the nginx config issues
root@osestaging1-discourse-ose:/etc/modsecurity# sed --in-place=.`date "+%Y%m%d_%H%M%S"` '/nginx/! s%^\(\s*\)[^#]*SecAuditLog \(.*\)%#\1SecAuditLog \2\n\1SecAuditLog /var/log/nginx/modsec_audit.log%' /etc/modsecurity/modsecurity.conf root@osestaging1-discourse-ose:/etc/modsecurity# nginx -t nginx: the configuration file /etc/nginx/nginx.conf syntax is ok nginx: configuration file /etc/nginx/nginx.conf test is successful root@osestaging1-discourse-ose:/etc/modsecurity#
- as a recap, here are the changes I made from modsecurity.conf-recommended in the new modsecurity.conf file
root@osestaging1-discourse-ose:/etc/modsecurity# diff /etc/modsecurity/modsecurity.conf-recommended /etc/modsecurity/modsecurity.conf 7c7 < SecRuleEngine DetectionOnly --- > SecRuleEngine On 45c45 < SecRequestBodyInMemoryLimit 131072 --- > #SecRequestBodyInMemoryLimit 131072 193c193,194 < SecAuditLog /var/log/apache2/modsec_audit.log --- > #SecAuditLog /var/log/apache2/modsec_audit.log > SecAuditLog /var/log/nginx/modsec_audit.log root@osestaging1-discourse-ose:/etc/modsecurity#
- I tried to restart nginx, but it failed with a message that it couldn't bind to an address already in use. I think it's because nginx is executed by 'runsv' instead of as a typical service inside the docker container
root@osestaging1-discourse-ose:/etc/modsecurity# ps -ef | grep -i nginx root 41 34 0 08:21 ? 00:00:00 runsv nginx root 23605 41 4 10:16 ? 00:00:00 /usr/sbin/nginx root 23608 4977 0 10:16 pts/1 00:00:00 grep -i nginx root@osestaging1-discourse-ose:/etc/modsecurity#
- I destoreyed & restarted the 'disocourse_ose' docker container because I don't know the proper way to restart nginx inside the docker container, but when it came back up all the configs I createad above were lost. Even the version of nginx is stale
root@osestaging1-discourse-ose:/etc/nginx/conf.d# nginx -V nginx version: nginx/1.17.4 built by gcc 8.3.0 (Debian 8.3.0-6) built with OpenSSL 1.1.1d 10 Sep 2019 TLS SNI support enabled configure arguments: --with-cc-opt='-g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wno-deprecated-declarations' --with-ld-opt='-Wl,-Bsymbolic-functions -fPIE -pie -Wl,-z,relro -Wl,-z,now' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-ipv6 --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_addition_module --with-http_dav_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_v2_module --with-http_sub_module --with-stream --with-stream_ssl_module --with-mail --with-mail_ssl_module --with-threads --add-module=/tmp/ngx_brotli root@osestaging1-discourse-ose:/etc/nginx/conf.d# ls -lah /etc/modsecurity ls: cannot access '/etc/modsecurity': No such file or directory root@osestaging1-discourse-ose:/etc/nginx/conf.d# ls -lah /etc/nginx/conf.d total 28K drwxr-xr-x. 1 root root 4.0K Nov 12 10:23 . drwxr-xr-x. 1 root root 4.0K Oct 13 22:41 .. -rw-r--r--. 1 root root 8.3K Nov 11 10:49 discourse.conf root@osestaging1-discourse-ose:/etc/nginx/conf.d# dpkg -l | grep -i modsecurity root@osestaging1-discourse-ose:/etc/nginx/conf.d#
- I should probably figure out how to restart nginx inside this container. Here's what I get from systemctl
root@osestaging1-discourse-ose:/etc/nginx/conf.d# systemctl restart nginx System has not been booted with systemd as init system (PID 1). Can't operate. Failed to connect to bus: Host is down root@osestaging1-discourse-ose:/etc/nginx/conf.d#
- ok, so there's an init.d script. but, yeah, when I attempt to restart nginx it hits the same issue: it can't bind to the socket already held by the running nginx
root@osestaging1-discourse-ose:/etc/nginx/conf.d# /etc/init.d/nginx restart [FAIL] Restarting nginx: nginx failed! root@osestaging1-discourse-ose:/etc/nginx/conf.d# tail /var/log/nginx/error.log 2019/11/12 10:32:45 [emerg] 1123#1123: bind() to unix:/shared/nginx.http.sock failed (98: Address already in use) 2019/11/12 10:32:45 [emerg] 1123#1123: bind() to unix:/shared/nginx.http.sock failed (98: Address already in use) 2019/11/12 10:32:45 [emerg] 1123#1123: bind() to unix:/shared/nginx.http.sock failed (98: Address already in use) 2019/11/12 10:32:45 [emerg] 1123#1123: bind() to unix:/shared/nginx.http.sock failed (98: Address already in use) 2019/11/12 10:32:45 [emerg] 1123#1123: bind() to unix:/shared/nginx.http.sock failed (98: Address already in use) 2019/11/12 10:32:45 [emerg] 1123#1123: still could not bind() 2019/11/12 10:32:47 [emerg] 1127#1127: bind() to unix:/shared/nginx.http.sock failed (98: Address already in use) 2019/11/12 10:32:47 [emerg] 1127#1127: bind() to unix:/shared/nginx.http.sock failed (98: Address already in use) 2019/11/12 10:32:47 [emerg] 1127#1127: bind() to unix:/shared/nginx.http.sock failed (98: Address already in use) 2019/11/12 10:32:47 [emerg] 1127#1127: bind() to unix:/shared/nginx.http.sock failed (98: Address already in use) root@osestaging1-discourse-ose:/etc/nginx/conf.d# ps -ef | grep -i nginx root 46 40 0 10:19 ? 00:00:00 runsv nginx root 1174 46 1 10:33 ? 00:00:00 /usr/sbin/nginx root 1177 108 0 10:33 pts/1 00:00:00 grep -i nginx root@osestaging1-discourse-ose:/etc/nginx/conf.d#
- so it looks like 'runsv' is part of runit https://en.wikipedia.org/wiki/Runit
- indeed, runit is what's supervising the services inside this (Ruby on Rails based) Discourse container
- I find that if I kill the 'runsv nginx' process, then it automatically re-runs itself after a few seconds. Perhaps this is how I restart it?
root@osestaging1-discourse-ose:/# ps -ef | grep -i nginx root 2798 40 0 10:51 ? 00:00:00 runsv nginx root 2852 2798 2 10:52 ? 00:00:00 /usr/sbin/nginx root 2855 108 0 10:52 pts/1 00:00:00 grep -i nginx root@osestaging1-discourse-ose:/# kill 2798 root@osestaging1-discourse-ose:/# ps -ef | grep -i nginx root 2871 108 0 10:52 pts/1 00:00:00 grep -i nginx root@osestaging1-discourse-ose:/# ps -ef | grep -i nginx root 2873 40 0 10:52 ? 00:00:00 runsv nginx root 2874 2873 2 10:52 ? 00:00:00 /usr/sbin/nginx root 2877 108 0 10:52 pts/1 00:00:00 grep -i nginx root@osestaging1-discourse-ose:/#
- but, actually, my discourse site is inaccessible now; I get a 502 Bad Gateway after that kill
- ah, I found the answer in the templates/web.template.yml file https://github.com/discourse/discourse_docker/blob/master/templates/web.template.yml
- it's the `sv` command
root@osestaging1-discourse-ose:/var/www/discourse# sv stop nginx ok: down: nginx: 1s, normally up root@osestaging1-discourse-ose:/var/www/discourse# sv start nginx ok: run: nginx: (pid 269) 0s root@osestaging1-discourse-ose:/var/www/discourse# sv --help usage: sv [-v] [-w sec] command service ... root@osestaging1-discourse-ose:/var/www/discourse#
- there's no man page on my docker container; here's the sv man page http://smarden.org/runit/sv.8.html
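- so, for future reference, restarting nginx inside the container from the docker host should presumably be a one-liner (per the sv man page, sv also accepts the LSB-style 'restart' command); untested as a single command:
# untested one-liner from the docker host: restart nginx inside the container
docker exec discourse_ose sv restart nginx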
- since it's not documented anywhere, I made a quick topic on the discourse meta forums with this question & solution https://meta.discourse.org/t/how-to-restart-discorse-nginx-in-docker-container/133187
- awkwardly, it appears that I can't mark my own reply to a topic as a solution? That's not very stack-exchange-like..
- also, I added a line to the 'install-nginx' script to delete the ModSecurity-nginx repo from /tmp/ along with all the other cleanup lines
[root@osestaging1 base]# diff install-nginx.20191112.orig install-nginx 7a8,12 > # mod_security --maltfield > apt-get install -y libmodsecurity-dev modsecurity-crs > cd /tmp > git clone --depth 1 https://github.com/SpiderLabs/ModSecurity-nginx.git > 34c39 < ./configure --with-cc-opt='-g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wno-deprecated-declarations' --with-ld-opt='-Wl,-Bsymbolic-functions -fPIE -pie -Wl,-z,relro -Wl,-z,now' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-ipv6 --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_addition_module --with-http_dav_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_v2_module --with-http_sub_module --with-stream --with-stream_ssl_module --with-mail --with-mail_ssl_module --with-threads --add-module=/tmp/ngx_brotli --- > ./configure --with-cc-opt='-g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wno-deprecated-declarations' --with-ld-opt='-Wl,-Bsymbolic-functions -fPIE -pie -Wl,-z,relro -Wl,-z,now' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-ipv6 --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_addition_module --with-http_dav_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_v2_module --with-http_sub_module --with-stream --with-stream_ssl_module --with-mail --with-mail_ssl_module --with-threads --add-module=/tmp/ngx_brotli --add-module=/tmp/ModSecurity-nginx 44a50 > rm -fr /tmp/ModSecurity-nginx [root@osestaging1 base]#
- I also discovered that the existing script has a minor logic error: it doesn't actually cleanup the nginx source code. I commented about this on the discourse meta forums https://meta.discourse.org/t/how-to-run-discourse-in-apache-vhost-not-nginx/133112/14
- I re-ran all the install-nginx commands in the docker container
- finally, I added the following lines inside the server{} block of /etc/nginx/conf.d/discourse.conf
modsecurity on; modsecurity_rules_file /etc/nginx/conf.d/modsecurity.include;
- and I stopped & started nginx
root@osestaging1-discourse-ose:/var/www/discourse# sv stop nginx ok: down: nginx: 0s, normally up root@osestaging1-discourse-ose:/var/www/discourse# sv start nginx ok: run: nginx: (pid 11237) 0s root@osestaging1-discourse-ose:/var/www/discourse#
- I confirmed that I can query the page and get a 200 with curl
user@ose:~$ curl -Iks https://discourse.opensourceecology.org/ HTTP/1.1 200 OK Server: nginx Date: Tue, 12 Nov 2019 11:53:34 GMT Content-Type: text/html; charset=utf-8 Connection: keep-alive Vary: Accept-Encoding X-Frame-Options: SAMEORIGIN X-XSS-Protection: 1; mode=block X-Content-Type-Options: nosniff X-Download-Options: noopen X-Permitted-Cross-Domain-Policies: none Referrer-Policy: strict-origin-when-cross-origin X-Discourse-Route: list/latest Cache-Control: no-cache, no-store Content-Security-Policy: base-uri 'none'; object-src 'none'; script-src 'unsafe-eval' 'report-sample' http://discourse.opensourceecology.org/logs/ http://discourse.opensourceecology.org/sidekiq/ http://discourse.opensourceecology.org/mini-profiler-resources/ http://discourse.opensourceecology.org/assets/ http://discourse.opensourceecology.org/brotli_asset/ http://discourse.opensourceecology.org/extra-locales/ http://discourse.opensourceecology.org/highlight-js/ http://discourse.opensourceecology.org/javascripts/ http://discourse.opensourceecology.org/plugins/ http://discourse.opensourceecology.org/theme-javascripts/ http://discourse.opensourceecology.org/svg-sprite/; worker-src 'self' blob: X-Request-Id: 26257fd3-a4f7-4926-95e7-01cc7373a5fa X-Runtime: 0.146294 Strict-Transport-Security: max-age=15552001 Public-Key-Pins: pin-sha256="UbSbHFsFhuCrSv9GNsqnGv4CbaVh5UV5/zzgjLgHh9c="; pin-sha256="YLh1dUR9y6Kja30RrAn7JKnbQG/uEtLMkBgFF2Fuihg="; pin-sha256="C5+lpZ7tcVwmwQIMcRtPbsQtWLABXhQzejna0wHFr8M="; pin-sha256="Vjs8r4z+80wjNcr1YKepWQboSIRi63WsWXhIMN+eWys="; pin-sha256="lCppFqbkrlJ3EcVFAkeip0+44VaoJUymbnOaEUk7tEU="; pin-sha256="K87oWBWM9UZfyddvDfoxL+8lpNyoUB2ptGtn0fv6G2Q="; pin-sha256="Y9mvm0exBk1JoQ57f9Vm28jKo5lFm/woKcVxrYxu80o="; pin-sha256="EGn6R6CqT4z3ERscrqNl7q7RCzJmDe9uBhS/rnCHU="; pin-sha256="NIdnza073SiyuN1TUa7DDGjOxc1p0nbfOCfbxPWAZGQ="; pin-sha256="fNZ8JI9p2D/C+bsB3LH3rWejY9BGBDeW0JhMOiMfa7A="; pin-sha256="oyD01TTXvpfBro3QSZc1vIlcMjrdLTiL/M9mLCPX+Zo="; pin-sha256="0cRTd+vc1hjNFlHcLgLCHXUeWqn80bNDH/bs9qMTSPo="; pin-sha256="MDhNnV1cmaPdDDONbiVionUHH2QIf2aHJwq/lshMWfA="; pin-sha256="OIZP7FgTBf7hUpWHIA7OaPVO2WrsGzTl9vdOHLPZmJU="; max-age=3600; includeSubDomains; report-uri="http:opensourceecology.org/hpkp-report" user@ose:~$
- And, finally, I successfully confirmed that modsecurity returns a 403 when the testparam rule from 'modsecurity.include' is triggered
user@ose:~$ curl -Iks https://discourse.opensourceecology.org/?testparam=test HTTP/1.1 403 Forbidden Server: nginx Date: Tue, 12 Nov 2019 11:53:48 GMT Content-Type: text/html Content-Length: 146 Connection: keep-alive user@ose:~$
- And the modsecurity log reflects this too!
root@osestaging1-discourse-ose:/var/www/discourse# tail -f /var/log/nginx/modsec_audit.log ... ---ZPTABsP3---A-- [12/Nov/2019:11:56:29 +0000] 157355978960.000481 10.241.189.10 0 10.241.189.10 12147 ---ZPTABsP3---B-- HEAD /?testparam=test HTTP/1.1 Host: discourse.opensourceecology.org X-Forwarded-For: 10.241.189.10 X-Forwarded-Proto: https Connection: close X-Real-IP: 10.241.189.10 User-Agent: curl/7.52.1 Accept: */* ---ZPTABsP3---D-- ---ZPTABsP3---F-- HTTP/1.1 403 Server: nginx Date: Tue, 12 Nov 2019 11:56:29 GMT Content-Length: 146 Content-Type: text/html Connection: close ---ZPTABsP3---H-- ---ZPTABsP3---I-- ---ZPTABsP3---J-- ---ZPTABsP3---Z-- ^C root@osestaging1-discourse-ose:/var/www/discourse#
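- for reference, the test version of 'modsecurity.include' isn't reproduced in this log; the testparam rule was of roughly this shape (a hedged reconstruction; the rule id 900001 is arbitrary):
# hedged reconstruction: the kind of testparam rule used above; deny with a
# 403 whenever a request's testparam GET var equals "test"
cat >> /etc/nginx/conf.d/modsecurity.include <<'EOF'
SecRule ARGS:testparam "@streq test" "id:900001,phase:1,deny,status:403,msg:'ModSecurity test rule'"
EOF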
- To actually enable the CRS, I updated the nginx modsecurity.include file
root@osestaging1-discourse-ose:/var/www/discourse# cat /etc/nginx/conf.d/modsecurity.include ################################################################################ # File: modsecurity.include # Version: 0.1 # Purpose: Defines mod_security rules for the discourse vhost # This should be included in the server{} blocks nginx vhosts. # Author: Michael Altfield <michael@opensourceecology.org> # Created: 2019-11-12 # Updated: 2019-11-12 ################################################################################ Include "/etc/modsecurity/modsecurity.conf" # OWASP Core Rule Set, installed from the 'modsecurity-crs' package in debian Include /etc/modsecurity/crs/crs-setup.conf Include /usr/share/modsecurity-crs/rules/*.conf root@osestaging1-discourse-ose:/var/www/discourse#
- and now it's blocking this sqli attack
user@ose:~$ curl -Iks 'https://discourse.opensourceecology.org/?sqli_attack="; DROP TABLE please;' HTTP/1.1 403 Forbidden Server: nginx Date: Tue, 12 Nov 2019 12:10:32 GMT Content-Type: text/html Content-Length: 146 Connection: keep-alive user@ose:~$
- compare this to meta.discourse.org, which--interestingly--gives me a 400 error?
user@ose:~$ curl -Iks 'https://meta.discourse.org/?sqli_attack="; DROP TABLE please;' HTTP/2 400 date: Tue, 12 Nov 2019 12:11:13 GMT server: nginx user@ose:~$
- so it looks like meta.discourse.org does have some sort of WAF in-place. The request succeeds until I include the word "DROP"
user@ose:~$ curl -Iks 'https://meta.discourse.org/?sqli_attack=";' HTTP/2 200 date: Tue, 12 Nov 2019 12:13:10 GMT content-type: text/html; charset=utf-8 server: nginx vary: Accept-Encoding x-frame-options: SAMEORIGIN x-xss-protection: 1; mode=block x-content-type-options: nosniff x-download-options: noopen x-permitted-cross-domain-policies: none referrer-policy: strict-origin-when-cross-origin x-discourse-route: list/latest cache-control: no-cache, no-store content-security-policy: base-uri 'none'; object-src 'none'; script-src 'unsafe-eval' 'report-sample' https://meta.discourse.org/logs/ https://meta.discourse.org/sidekiq/ https://meta.discourse.org/mini-profiler-resources/ https://d11a6trkgmumsb.cloudfront.net/assets/ https://d11a6trkgmumsb.cloudfront.net/brotli_asset/ https://meta.discourse.org/extra-locales/ https://d3bpeqsaub0i6y.cloudfront.net/highlight-js/ https://d3bpeqsaub0i6y.cloudfront.net/javascripts/ https://d3bpeqsaub0i6y.cloudfront.net/plugins/ https://d3bpeqsaub0i6y.cloudfront.net/theme-javascripts/ https://d3bpeqsaub0i6y.cloudfront.net/svg-sprite/ https://www.google-analytics.com/analytics.js; worker-src 'self' blob: x-request-id: ad7e08c2-85b9-4be5-ae91-68187d8814ad x-runtime: 0.063429 strict-transport-security: max-age=31536000 user@ose:~$ curl -Iks 'https://meta.discourse.org/?sqli_attack="; DROP' HTTP/2 400 date: Tue, 12 Nov 2019 12:13:14 GMT server: nginx user@ose:~$
- Interestingly, because of the way Discourse is designed, I can still browse my discourse site when querying something like "https://discourse.opensourceecology.org/t/test-topic-that-is-15-characters-or-more/14?sqli_attempt=%22DROP" in my web browser. It's just that a bunch of the JS queries return 403. Maybe it's because of the browser's cache?
- even when I clear the cache, I can still view the topic; hmm..
- Discourse of course makes a ton of requests for a single page load. All of them succeed except one POST it makes to the message-bus--not because of the URI (it strips away my sqli_attempt GET var), but because there's still a "Referer" header in the request that includes my malicious variable. So the server returns 403 on all those requests...but the page itself appears to be unaffected in the UX
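- this is easy enough to sanity-check from the shell: a request with a clean URI but a malicious Referer header should reproduce the same 403 (an untested sketch):
# sketch: reproduce the message-bus 403 by putting the malicious string in
# the Referer header instead of the URI
curl -Iks -H 'Referer: https://discourse.opensourceecology.org/?sqli_attack="; DROP TABLE please;' https://discourse.opensourceecology.org/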
- I updated our install guide with some commands to update the `install-nginx` script https://wiki.opensourceecology.org/wiki/Discourse#Nginx_mod_security
- I tried to add a new template for adding & updating the relevant nginx & modsecurity configuration files on the container during the bootstrap, but I kept getting an error on the (super long execution) `./launcher rebuild discourse_ose` step
[root@osestaging1 discourse]# ./launcher rebuild discourse_ose ... 2019-11-12 13:51:09.437 UTC [49] LOG: database system is shut down FAILED -------------------- Pups::ExecError: cp /etc/modsecurity/modsecurity.conf-recommended /etc/modsecurity/modsecurity.conf failed with return #<Process::Status: pid 11074 exit 1> Location of failure: /pups/lib/pups/exec_command.rb:112:in `spawn' exec failed with the params {"cmd"=>["cp /etc/modsecurity/modsecurity.conf-recommended /etc/modsecurity/modsecurity.conf"]} e744f7701026c015b115924bf758cec781552e2c65693ae198db8742416eb069 FAILED TO BOOTSTRAP please scroll up and look for earlier error messages, there may be more than one. ./discourse-doctor may help diagnose the problem. [root@osestaging1 discourse]#
- so my best guess is that the cp command fails because the '/etc/modsecurity/modsecurity.conf-recommended' file doesn't exist yet. That file is put in-place by installing the 'modsecurity-crs' package, which is supposed to happen in the 'install-nginx' script. So I guess that script doesn't get executed before the template is called? That's a problem.
- fuck it; I just added a line to install that package to the template's exec lines as well. Note, interestingly, that it picked up that I was running an apt command, and it delayed my template's execution until later *shrug*
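- for the record, the shape of that template at this point (reconstructed from the rebuild output that follows; the file path and exact contents are illustrative, not a verbatim copy of my actual yml) was roughly:
# reconstructed sketch of the pups template, written as a heredoc for
# illustration; not a verbatim copy of the actual template file
cat > templates/web.modsecurity.template.yml <<'EOF'
run:
  - exec: sudo apt-get install -y modsecurity-crs
  - exec: cp /etc/modsecurity/modsecurity.conf-recommended /etc/modsecurity/modsecurity.conf
  - file:
      path: /etc/nginx/conf.d/modsecurity.include
      contents: |
        Include "/etc/modsecurity/modsecurity.conf"
  - replace:
      filename: /etc/nginx/conf.d/discourse.conf
      from: /server.+{/
      to: |
        server {
          modsecurity on;
          modsecurity_rules_file /etc/nginx/conf.d/modsecurity.include;
EOF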
[root@osestaging1 discourse]# time ./launcher rebuild discourse_ose ... I, [2019-11-12T14:29:26.856387 #1] INFO -- : > sudo apt-get install -y modsecurity-crs debconf: delaying package configuration, since apt-utils is not installed I, [2019-11-12T14:29:45.766725 #1] INFO -- : Reading package lists... Building dependency tree... Reading state information... The following additional packages will be installed: apache2-bin libapache2-mod-security2 libapr1 libaprutil1 libaprutil1-dbd-sqlite3 libaprutil1-ldap libbrotli1 libjansson4 liblua5.1-0 liblua5.2-0 libyajl2 Suggested packages: apache2-doc apache2-suexec-pristine | apache2-suexec-custom www-browser lua geoip-database-contrib ruby The following NEW packages will be installed: apache2-bin libapache2-mod-security2 libapr1 libaprutil1 libaprutil1-dbd-sqlite3 libaprutil1-ldap libbrotli1 libjansson4 liblua5.1-0 liblua5.2-0 libyajl2 modsecurity-crs 0 upgraded, 12 newly installed, 0 to remove and 0 not upgraded. Need to get 2,544 kB of archives. After this operation, 11.1 MB of additional disk space will be used. Get:1 http://deb.debian.org/debian buster/main amd64 libapr1 amd64 1.6.5-1+b1 [102 kB] Get:2 http://deb.debian.org/debian buster/main amd64 libaprutil1 amd64 1.6.1-4 [91.8 kB] Get:3 http://deb.debian.org/debian buster/main amd64 libaprutil1-dbd-sqlite3 amd64 1.6.1-4 [18.7 kB] Get:4 http://deb.debian.org/debian buster/main amd64 libaprutil1-ldap amd64 1.6.1-4 [16.8 kB] Get:5 http://deb.debian.org/debian buster/main amd64 libbrotli1 amd64 1.0.7-2 [270 kB] Get:6 http://deb.debian.org/debian buster/main amd64 libjansson4 amd64 2.12-1 [38.0 kB] Get:7 http://deb.debian.org/debian buster/main amd64 liblua5.2-0 amd64 5.2.4-1.1+b2 [110 kB] Get:8 http://deb.debian.org/debian buster/main amd64 apache2-bin amd64 2.4.38-3+deb10u1 [1,307 kB] Get:9 http://deb.debian.org/debian buster/main amd64 liblua5.1-0 amd64 5.1.5-8.1+b2 [111 kB] Get:10 http://deb.debian.org/debian buster/main amd64 libyajl2 amd64 2.1.0-3 [23.8 kB] Get:11 http://deb.debian.org/debian buster/main amd64 libapache2-mod-security2 amd64 2.9.3-1 [257 kB] Get:12 http://deb.debian.org/debian buster/main amd64 modsecurity-crs all 3.1.0-1 [198 kB] Fetched 2,544 kB in 0s (20.6 MB/s) Selecting previously unselected package libapr1:amd64. (Reading database ... 44564 files and directories currently installed.) Preparing to unpack .../00-libapr1_1.6.5-1+b1_amd64.deb ... Unpacking libapr1:amd64 (1.6.5-1+b1) ... Selecting previously unselected package libaprutil1:amd64. Preparing to unpack .../01-libaprutil1_1.6.1-4_amd64.deb ... Unpacking libaprutil1:amd64 (1.6.1-4) ... Selecting previously unselected package libaprutil1-dbd-sqlite3:amd64. Preparing to unpack .../02-libaprutil1-dbd-sqlite3_1.6.1-4_amd64.deb ... Unpacking libaprutil1-dbd-sqlite3:amd64 (1.6.1-4) ... Selecting previously unselected package libaprutil1-ldap:amd64. Preparing to unpack .../03-libaprutil1-ldap_1.6.1-4_amd64.deb ... Unpacking libaprutil1-ldap:amd64 (1.6.1-4) ... Selecting previously unselected package libbrotli1:amd64. Preparing to unpack .../04-libbrotli1_1.0.7-2_amd64.deb ... Unpacking libbrotli1:amd64 (1.0.7-2) ... Selecting previously unselected package libjansson4:amd64. Preparing to unpack .../05-libjansson4_2.12-1_amd64.deb ... Unpacking libjansson4:amd64 (2.12-1) ... Selecting previously unselected package liblua5.2-0:amd64. Preparing to unpack .../06-liblua5.2-0_5.2.4-1.1+b2_amd64.deb ... Unpacking liblua5.2-0:amd64 (5.2.4-1.1+b2) ... Selecting previously unselected package apache2-bin. 
Preparing to unpack .../07-apache2-bin_2.4.38-3+deb10u1_amd64.deb ... Unpacking apache2-bin (2.4.38-3+deb10u1) ... Selecting previously unselected package liblua5.1-0:amd64. Preparing to unpack .../08-liblua5.1-0_5.1.5-8.1+b2_amd64.deb ... Unpacking liblua5.1-0:amd64 (5.1.5-8.1+b2) ... Selecting previously unselected package libyajl2:amd64. Preparing to unpack .../09-libyajl2_2.1.0-3_amd64.deb ... Unpacking libyajl2:amd64 (2.1.0-3) ... Selecting previously unselected package libapache2-mod-security2. Preparing to unpack .../10-libapache2-mod-security2_2.9.3-1_amd64.deb ... Unpacking libapache2-mod-security2 (2.9.3-1) ... Selecting previously unselected package modsecurity-crs. Preparing to unpack .../11-modsecurity-crs_3.1.0-1_all.deb ... Unpacking modsecurity-crs (3.1.0-1) ... Setting up libbrotli1:amd64 (1.0.7-2) ... Setting up libyajl2:amd64 (2.1.0-3) ... Setting up libapr1:amd64 (1.6.5-1+b1) ... Setting up modsecurity-crs (3.1.0-1) ... Setting up libjansson4:amd64 (2.12-1) ... Setting up liblua5.2-0:amd64 (5.2.4-1.1+b2) ... Setting up liblua5.1-0:amd64 (5.1.5-8.1+b2) ... Setting up libaprutil1:amd64 (1.6.1-4) ... Setting up libaprutil1-ldap:amd64 (1.6.1-4) ... Setting up libaprutil1-dbd-sqlite3:amd64 (1.6.1-4) ... Setting up apache2-bin (2.4.38-3+deb10u1) ... Setting up libapache2-mod-security2 (2.9.3-1) ... Processing triggers for libc-bin (2.28-10) ... I, [2019-11-12T14:29:45.767132 #1] INFO -- : > cp /etc/modsecurity/modsecurity.conf-recommended /etc/modsecurity/modsecurity.conf I, [2019-11-12T14:29:45.779094 #1] INFO -- : I, [2019-11-12T14:29:45.787697 #1] INFO -- : File > /etc/nginx/conf.d/modsecurity.include chmod: chown: I, [2019-11-12T14:29:45.788674 #1] INFO -- : Replacing (?-mix:server.+{) with server { modsecurity on; modsecurity_rules_file /etc/nginx/conf.d/modsecurity.include; in /etc/nginx/conf.d/discourse.conf I, [2019-11-12T14:29:45.791643 #1] INFO -- : > echo "Beginning of custom commands" I, [2019-11-12T14:29:45.799587 #1] INFO -- : Beginning of custom commands I, [2019-11-12T14:29:45.800032 #1] INFO -- : > echo "End of custom commands" I, [2019-11-12T14:29:45.806371 #1] INFO -- : End of custom commands I, [2019-11-12T14:29:45.806650 #1] INFO -- : Terminating async processes I, [2019-11-12T14:29:45.806739 #1] INFO -- : Sending INT to HOME=/var/lib/postgresql USER=postgres exec chpst -u postgres:postgres:ssl-cert -U postgres:postgres:ssl-cert /usr/lib/postgresql/10/bin/postmaster -D /etc/postgresql/10/main pid: 49 I, [2019-11-12T14:29:45.806965 #1] INFO -- : Sending TERM to exec chpst -u redis -U redis /usr/bin/redis-server /etc/redis/redis.conf pid: 166 2019-11-12 14:29:45.811 UTC [49] LOG: received fast shutdown request 166:signal-handler (1573568985) Received SIGTERM scheduling shutdown... 166:M 12 Nov 2019 14:29:45.829 # User requested shutdown... 166:M 12 Nov 2019 14:29:45.829 * Saving the final RDB snapshot before exiting. 2019-11-12 14:29:46.082 UTC [49] LOG: aborting any active transactions 166:M 12 Nov 2019 14:29:46.100 * DB saved on disk 166:M 12 Nov 2019 14:29:46.100 # Redis is now ready to exit, bye bye... 
2019-11-12 14:29:46.107 UTC [49] LOG: worker process: logical replication launcher (PID 58) exited with exit code 1 2019-11-12 14:29:46.109 UTC [53] LOG: shutting down 2019-11-12 14:29:46.234 UTC [49] LOG: database system is shut down sha256:76988657b060bb18646decb9b91710d4e265141cfaeebea727752ededa162796 c00e0459426ce7ba3b2539c1d411909702488c2ec99cd56aac4c30db4b436d91 + /bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=2 -e UNICORN_SIDEKIQS=1 -e RUBY_GLOBAL_METHOD_CACHE_SIZE=131072 -e RUBY_GC_HEAP_GROWTH_MAX_SLOTS=40000 -e RUBY_GC_HEAP_INIT_SLOTS=400000 -e RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.5 -e DISCOURSE_DB_SOCKET=/var/run/postgresql -e DISCOURSE_DB_HOST= -e DISCOURSE_DB_PORT= -e DISCOURSE_HOSTNAME=discourse.opensourceecology.org -e DISCOURSE_DEVELOPER_EMAILS=osediscourse_2019@michaelaltfield.net -e DISCOURSE_SMTP_ADDRESS=172.17.0.1 -e DISCOURSE_SMTP_PORT=25 -e DISCOURSE_SMTP_AUTHENTICATION=none -e DISCOURSE_SMTP_OPENSSL_VERIFY_MODE=none -e DISCOURSE_SMTP_ENABLE_START_TLS=false -h osestaging1-discourse-ose -e DOCKER_HOST_IP=172.17.0.1 --name discourse_ose -t -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:fc:97:b8:b4:0d local_discourse/discourse_ose /sbin/boot 14df01c72c4682248df91a31a5dc4d462167293e9809362076267975cc759465 real 10m12.144s user 0m2.632s sys 0m2.333s [root@osestaging1 discourse]#
- I added all the commands (before, it was just a shorter list of commands for testing, to isolate the error). The bootstrap finishes, but nginx now fails to come up.
I, [2019-11-12T14:43:44.756483 #1] INFO -- : > sed -i '/nginx/! s%^\(\s*\)[^#]*SecAuditLog \(.*\)%#\1SecAuditLog \2\n\1SecAuditLog /var/log/nginx/modsec_audit.log%' /etc/modsecurity/modsecurity.conf I, [2019-11-12T14:43:44.765192 #1] INFO -- : I, [2019-11-12T14:43:44.773608 #1] INFO -- : File > /etc/nginx/conf.d/modsecurity.include chmod: chown: I, [2019-11-12T14:43:44.774845 #1] INFO -- : Replacing (?-mix:server.+{) with server { modsecurity on; modsecurity_rules_file /etc/nginx/conf.d/modsecurity.include; in /etc/nginx/conf.d/discourse.conf I, [2019-11-12T14:43:44.776540 #1] INFO -- : > echo "Beginning of custom commands" I, [2019-11-12T14:43:44.780945 #1] INFO -- : Beginning of custom commands I, [2019-11-12T14:43:44.781118 #1] INFO -- : > echo "End of custom commands" I, [2019-11-12T14:43:44.784763 #1] INFO -- : End of custom commands I, [2019-11-12T14:43:44.785048 #1] INFO -- : Terminating async processes I, [2019-11-12T14:43:44.785132 #1] INFO -- : Sending INT to HOME=/var/lib/postgresql USER=postgres exec chpst -u postgres:postgres:ssl-cert -U postgres:postgres:ssl-cert /usr/lib/postgresql/10/bin/postmaster -D /etc/postgresql/10/main pid: 49 I, [2019-11-12T14:43:44.785268 #1] INFO -- : Sending TERM to exec chpst -u redis -U redis /usr/bin/redis-server /etc/redis/redis.conf pid: 166 2019-11-12 14:43:44.790 UTC [49] LOG: received fast shutdown request 166:signal-handler (1573569824) Received SIGTERM scheduling shutdown... 2019-11-12 14:43:44.803 UTC [49] LOG: aborting any active transactions 2019-11-12 14:43:44.826 UTC [49] LOG: worker process: logical replication launcher (PID 58) exited with exit code 1 2019-11-12 14:43:44.826 UTC [53] LOG: shutting down 166:M 12 Nov 2019 14:43:44.845 # User requested shutdown... 166:M 12 Nov 2019 14:43:44.845 * Saving the final RDB snapshot before exiting. 166:M 12 Nov 2019 14:43:44.932 * DB saved on disk 166:M 12 Nov 2019 14:43:44.932 # Redis is now ready to exit, bye bye... 
2019-11-12 14:43:44.982 UTC [49] LOG: database system is shut down sha256:f43a4c400a22a43bc380dd6da2b1dfd6c4b19abc441357830aac0b972a12420f b28b80fd026569e7ec05fad9fb6094b84bde6fc5e24ac7aa387f40b26acc14ee Removing old container + /bin/docker rm discourse_ose discourse_ose + /bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=2 -e UNICORN_SIDEKIQS=1 -e RUBY_GLOBAL_METHOD_CACHE_SIZE=131072 -e RUBY_GC_HEAP_GROWTH_MAX_SLOTS=40000 -e RUBY_GC_HEAP_INIT_SLOTS=400000 -e RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.5 -e DISCOURSE_DB_SOCKET=/var/run/postgresql -e DISCOURSE_DB_HOST= -e DISCOURSE_DB_PORT= -e DISCOURSE_HOSTNAME=discourse.opensourceecology.org -e DISCOURSE_DEVELOPER_EMAILS=osediscourse_2019@michaelaltfield.net -e DISCOURSE_SMTP_ADDRESS=172.17.0.1 -e DISCOURSE_SMTP_PORT=25 -e DISCOURSE_SMTP_AUTHENTICATION=none -e DISCOURSE_SMTP_OPENSSL_VERIFY_MODE=none -e DISCOURSE_SMTP_ENABLE_START_TLS=false -h osestaging1-discourse-ose -e DOCKER_HOST_IP=172.17.0.1 --name discourse_ose -t -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:fc:97:b8:b4:0d local_discourse/discourse_ose /sbin/boot 09ffc71da6507d7100a5a7329d27893ca6128b31b2f8c9c650f49b9a201a975d real 10m40.690s user 0m2.805s sys 0m2.466s [root@osestaging1 discourse]# ./launcher enter discourse_ose root@osestaging1-discourse-ose:/var/www/discourse# sv status nginx down: nginx: 1s, normally up, want up root@osestaging1-discourse-ose:/var/www/discourse# nginx -t nginx: [emerg] unknown directive "modsecurity" in /etc/nginx/conf.d/discourse.conf:37 nginx: configuration file /etc/nginx/nginx.conf test failed root@osestaging1-discourse-ose:/var/www/discourse#
- damn, nginx doesn't have ModSecurity. Did it even run 'install-nginx' at all?
root@osestaging1-discourse-ose:/var/www/discourse# nginx -V nginx version: nginx/1.17.4 built by gcc 8.3.0 (Debian 8.3.0-6) built with OpenSSL 1.1.1d 10 Sep 2019 TLS SNI support enabled configure arguments: --with-cc-opt='-g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wno-deprecated-declarations' --with-ld-opt='-Wl,-Bsymbolic-functions -fPIE -pie -Wl,-z,relro -Wl,-z,now' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-ipv6 --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_addition_module --with-http_dav_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_v2_module --with-http_sub_module --with-stream --with-stream_ssl_module --with-mail --with-mail_ssl_module --with-threads --add-module=/tmp/ngx_brotli root@osestaging1-discourse-ose:/var/www/discourse#
- it looks like 'install-nginx' is run by the Dockerfile, but it's still not clear why it didn't pick up my changes
[root@osestaging1 discourse]# grep -irl 'install-nginx' * image/base/Dockerfile [root@osestaging1 discourse]# grep 'install-nginx' image/base/Dockerfile ADD install-nginx /tmp/install-nginx RUN /tmp/install-nginx [root@osestaging1 discourse]#
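- (a later hunch: I suspect `./launcher rebuild` bootstraps on top of the prebuilt `discourse/base` image pulled from Docker Hub--the pinned `image=` variable near the top of the launcher script--so edits to image/base/install-nginx only take effect if the base image is rebuilt locally and launcher is pointed at that local tag; an untested sketch:)
# untested sketch: rebuild the base image locally so that edits to
# image/base/install-nginx actually get baked into the image launcher uses;
# the 'local_discourse/base' tag is a hypothetical name
cd /var/discourse/image/base
docker build -t local_discourse/base .
# then point the launcher script's image= variable at local_discourse/base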
- I asked the community why the `launcher rebuild app` command might not be using my updated `install-nginx` script https://meta.discourse.org/t/how-to-run-discourse-in-apache-vhost-not-nginx/133112/15
Mon Nov 11, 2019
- the discourse devs responded again https://meta.discourse.org/t/discourse-purge-cache-method-on-content-changes/132917/9
- one linked me to the source code for the "anonymous" caching in Discourse (I suppose they consider users who are not logged-in to be anonymous). So I guess there is no documentation. The source code itself has almost 0 comments.
- the dev also admitted that I could cut down the load time from a few ms to fractions of a ms. It's not substantiated, but that's a worthwhile gain for a thundering herd making queries to a single server. In any case, I don't think it's necessary to back up the logic: a RAM cache from something as tried & true as varnish, sitting in front of an app like rails, is obviously going to have immense gains.
- I was instructed to check out the howto articles on the discourse meta forums for writing plugins, but I wasn't given a link
- They recommended that I check out their github's plugins section, specifically the github plugin, which has logic in-place to respond to a new post to a topic.
- The devs re-iterated that they don't care about this since they already host huge customers. Of course they do. Of course you can host at almost an infinite scale when running elastically in the cloud. But you could run more efficiently on fewer nodes with a proper pre-backend cache..
- ...
- anyway, returning to my list of TODO items for this Discourse POC: backups are more important than varnish
- I was successfully able to trigger backups from within the staging node (host machine). It takes 20-30 seconds to run the backup now.
[root@osestaging1 discourse]# time docker exec app discourse backup ... [SUCCESS] Backup done. Output file is in: /var/www/discourse/public/backups/default/discourse-2019-11-11-101834-v20191108000414.tar.gz real 0m20.699s user 0m0.074s sys 0m0.084s [root@osestaging1 discourse]#
- and that creates a compressed tarball in /var/discourse/shared/standalone/backups/default/
[root@osestaging1 discourse]# ls -lah shared/standalone/backups/default/ total 46M drwxr-xr-x. 2 tgriffing 33 4.0K Nov 11 10:18 . drwxr-xr-x. 3 tgriffing 33 4.0K Nov 8 00:00 .. -rw-r--r--. 1 tgriffing 33 6.8M Nov 8 03:31 discourse-2019-11-08-033129-v20191101113230.tar.gz -rw-r--r--. 1 tgriffing tgriffing 9.6M Nov 8 12:22 discourse-2019-11-08-122241-v20191108000414.tar.gz -rw-r--r--. 1 tgriffing tgriffing 9.8M Nov 11 10:15 discourse-2019-11-11-101518-v20191108000414.tar.gz -rw-r--r--. 1 tgriffing tgriffing 9.8M Nov 11 10:16 discourse-2019-11-11-101616-v20191108000414.tar.gz -rw-r--r--. 1 tgriffing tgriffing 9.8M Nov 11 10:18 discourse-2019-11-11-101834-v20191108000414.tar.gz [root@osestaging1 discourse]#
- I validated the contents of the discourse backup; it's just a gzipped sql dump and an image uploads dir
[root@osestaging1 backup-test.20191111]# ls discourse-2019-11-11-101834-v20191108000414.tar.gz [root@osestaging1 backup-test.20191111]# tar -xzvf discourse-2019-11-11-101834-v20191108000414.tar.gz dump.sql.gz uploads/default/ uploads/default/original/ uploads/default/original/1X/ uploads/default/original/1X/e952cfd4c1bc58e77024e4c2b518531356319780.png [root@osestaging1 backup-test.20191111]# cd ^C [root@osestaging1 backup-test.20191111]# ls -lah total 20M drwxr-xr-x. 3 root root 4.0K Nov 11 10:23 . drwxrwxrwt. 42 root root 4.0K Nov 11 10:23 .. -rw-r--r--. 1 root root 9.8M Nov 11 10:23 discourse-2019-11-11-101834-v20191108000414.tar.gz -rw-r--r--. 1 tgriffing tgriffing 9.8M Nov 11 10:18 dump.sql.gz drwxr-xr-x. 3 root root 4.0K Nov 11 10:23 uploads [root@osestaging1 backup-test.20191111]# gunzip dump.sql.gz [root@osestaging1 backup-test.20191111]# ls -lah total 55M drwxr-xr-x. 3 root root 4.0K Nov 11 10:24 . drwxrwxrwt. 42 root root 4.0K Nov 11 10:23 .. -rw-r--r--. 1 root root 9.8M Nov 11 10:23 discourse-2019-11-11-101834-v20191108000414.tar.gz -rw-r--r--. 1 tgriffing tgriffing 45M Nov 11 10:18 dump.sql drwxr-xr-x. 3 root root 4.0K Nov 11 10:23 uploads [root@osestaging1 backup-test.20191111]# ls -lah uploads/ total 12K drwxr-xr-x. 3 root root 4.0K Nov 11 10:23 . drwxr-xr-x. 3 root root 4.0K Nov 11 10:24 .. drwxr-xr-x. 3 tgriffing 33 4.0K Nov 7 11:56 default [root@osestaging1 backup-test.20191111]# ls -lah uploads/default/ total 12K drwxr-xr-x. 3 tgriffing 33 4.0K Nov 7 11:56 . drwxr-xr-x. 3 root root 4.0K Nov 11 10:23 .. drwxr-xr-x. 3 tgriffing 33 4.0K Nov 7 11:56 original [root@osestaging1 backup-test.20191111]# ls -lah uploads/default/original/ total 12K drwxr-xr-x. 3 tgriffing 33 4.0K Nov 7 11:56 . drwxr-xr-x. 3 tgriffing 33 4.0K Nov 7 11:56 .. drwxr-xr-x. 2 tgriffing 33 4.0K Nov 10 14:17 1X [root@osestaging1 backup-test.20191111]# ls -lah uploads/default/original/1X/ total 20K drwxr-xr-x. 2 tgriffing 33 4.0K Nov 10 14:17 . drwxr-xr-x. 3 tgriffing 33 4.0K Nov 7 11:56 .. -rw-r--r--. 1 tgriffing 33 11K Nov 7 11:56 e952cfd4c1bc58e77024e4c2b518531356319780.png [root@osestaging1 backup-test.20191111]#
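- for future reference, the tarball's contents can also be listed without extracting it:
# list the backup tarball's contents without extracting anything
tar -tzvf discourse-2019-11-11-101834-v20191108000414.tar.gz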
- so this is script-able, but currently it's set to 'app'. Let's rename 'app' to 'discourse_ose' first. This meta discourse topic says to do a destroy & mv the yml file & rebuild https://meta.discourse.org/t/how-to-rename-app-yml-to-mycompanyname-yml-after-deploy/42625/2
[root@osestaging1 discourse]# pwd /var/discourse [root@osestaging1 discourse]# ls containers/ app.yml app.yml.2019-10-28-122901.bak app.yml.2019-10-28-124114.bak app.yml.2019-11-07-112654.bak app.yml.2019-10-28-121933.bak app.yml.2019-10-28-123029.bak app.yml.2019-10-28-130503.bak app.yml.2019-11-07-114148.bak app.yml.2019-10-28-122342.bak app.yml.2019-10-28-123104.bak app.yml.2019-10-28-131830.bak app.yml.2019-11-07-124719.bak app.yml.2019-10-28-122534.bak app.yml.2019-10-28-123312.bak app.yml.2019-11-07-094441.bak app.yml.2019-11-07-124745.bak app.yml.2019-10-28-122816.bak app.yml.2019-10-28-123827.bak app.yml.2019-11-07-104530.bak app.yml.2019-11-07-130735.bak [root@osestaging1 discourse]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 8d66b8c8ccce local_discourse/app "/sbin/boot" 11 seconds ago Up 9 seconds app [root@osestaging1 discourse]# ./launcher destroy app + /bin/docker stop -t 10 app app + /bin/docker rm app app [root@osestaging1 discourse]# mv containers/app.yml containers/discourse_ose.yml [root@osestaging1 discourse]# time ./launcher rebuild discourse_ose ... + /bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=2 -e UNICORN_SIDEKIQS=1 -e RUBY_GLOBAL_METHOD_CACHE_SIZE=131072 -e RUBY_GC_HEAP_GROWTH_MAX_SLOTS=40000 -e RUBY_GC_HEAP_INIT_SLOTS=400000 -e RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.5 -e DISCOURSE_DB_SOCKET=/var/run/postgresql -e DISCOURSE_DB_HOST= -e DISCOURSE_DB_PORT= -e DISCOURSE_HOSTNAME=discourse.opensourceecology.org -e DISCOURSE_DEVELOPER_EMAILS=osediscourse_2019@michaelaltfield.net -e DISCOURSE_SMTP_ADDRESS=172.17.0.1 -e DISCOURSE_SMTP_PORT=25 -e DISCOURSE_SMTP_AUTHENTICATION=none -e DISCOURSE_SMTP_OPENSSL_VERIFY_MODE=none -e DISCOURSE_SMTP_ENABLE_START_TLS=false -h osestaging1-discourse-ose -e DOCKER_HOST_IP=172.17.0.1 --name discourse_ose -t -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:fc:97:b8:b4:0d local_discourse/discourse_ose /sbin/boot f01f52d2dcbaae157b75f2a43732b7b1e2d4125cd71103b841d1e36d768658b8 real 10m14.414s user 0m2.779s sys 0m2.386s [root@osestaging1 discourse]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES f01f52d2dcba local_discourse/discourse_ose "/sbin/boot" 51 seconds ago Up 50 seconds discourse_ose [root@osestaging1 discourse]#
- backups are still present; good
[root@osestaging1 discourse]# ls -lah shared/standalone/backups/default/ total 46M drwxr-xr-x. 2 tgriffing 33 4.0K Nov 11 10:18 . drwxr-xr-x. 3 tgriffing 33 4.0K Nov 8 00:00 .. -rw-r--r--. 1 tgriffing 33 6.8M Nov 8 03:31 discourse-2019-11-08-033129-v20191101113230.tar.gz -rw-r--r--. 1 tgriffing 33 9.6M Nov 8 12:22 discourse-2019-11-08-122241-v20191108000414.tar.gz -rw-r--r--. 1 tgriffing 33 9.8M Nov 11 10:15 discourse-2019-11-11-101518-v20191108000414.tar.gz -rw-r--r--. 1 tgriffing 33 9.8M Nov 11 10:16 discourse-2019-11-11-101616-v20191108000414.tar.gz -rw-r--r--. 1 tgriffing 33 9.8M Nov 11 10:18 discourse-2019-11-11-101834-v20191108000414.tar.gz [root@osestaging1 discourse]#
- and I can still kick-off backups using the new name 'discourse_ose' instead of 'app'
[root@osestaging1 discourse]# time docker exec discourse_ose discourse backup ... [SUCCESS] Backup done. Output file is in: /var/www/discourse/public/backups/default/discourse-2019-11-11-110452-v20191108000414.tar.gz real 0m37.206s user 0m0.065s sys 0m0.051s [root@osestaging1 discourse]# ls -lah shared/standalone/backups/default/ total 50M drwxr-xr-x. 2 tgriffing 33 4.0K Nov 11 11:04 . drwxr-xr-x. 3 tgriffing 33 4.0K Nov 8 00:00 .. -rw-r--r--. 1 tgriffing 33 9.6M Nov 8 12:22 discourse-2019-11-08-122241-v20191108000414.tar.gz -rw-r--r--. 1 tgriffing 33 9.8M Nov 11 10:15 discourse-2019-11-11-101518-v20191108000414.tar.gz -rw-r--r--. 1 tgriffing 33 9.8M Nov 11 10:16 discourse-2019-11-11-101616-v20191108000414.tar.gz -rw-r--r--. 1 tgriffing 33 9.8M Nov 11 10:18 discourse-2019-11-11-101834-v20191108000414.tar.gz -rw-r--r--. 1 tgriffing tgriffing 12M Nov 11 11:04 discourse-2019-11-11-110452-v20191108000414.tar.gz [root@osestaging1 discourse]#
- unfortunately the staging node doesn't actually have /root/backups copied from prod (this was intentional; /root on prod wasn't copied over). Anyway, here's basically what I *should* add to /root/backups/backup.sh (untested)
#############
# DISCOURSE #
#############

# cleanup old backups
$NICE $RM -rf /var/discourse/shared/standalone/backups/default/*.tar.gz

# kick-off a fresh backup inside the container
time $NICE $DOCKER exec discourse_ose discourse backup

# move the new backup into today's archive dir
$MKDIR "${backupDirPath}/${archiveDirName}/discourse_ose"
$NICE $MV /var/discourse/shared/standalone/backups/default/*.tar.gz "${backupDirPath}/${archiveDirName}/discourse_ose/"
- ...
- ok, so now I want to harden Discourse. The first logical step is to prevent Discourse from being able to initiate connections to the Internet. As a web server, it should only be able to *respond* to queries--not initiate them like some piece of malware. We do this with our other apps using iptables to permit RELATED,ESTABLISHED connections through the OUTPUT chain, but then blocking all other traffic from specific uids--such as the nginx & apache users.
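- in shorthand, that egress-lockdown pattern (mirrored in the OUTPUT rules visible in the iptables-save output further below; uid 48 is apache's conventional uid on CentOS) boils down to:
# permit replies to established connections, then drop any tcp traffic that a
# given service uid tries to initiate (uid 48 = apache, as an example)
iptables -A OUTPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -p tcp -m owner --uid-owner 48 -j DROP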
- iptables on the host machine seems like the best solution for docker as well, but I'm not sure if docker containers are run as a specific uid. It looks like the daemon is run as root. Some services are running as 'docker' or 'dockerd' user, but they don't actually exist in /etc/passwd. They're probably some systemd temp user shit.
[root@osestaging1 backup-test.20191111]# ps -ef | grep -i docker root 2739 1 0 12:10 ? 00:00:00 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --iptables=false root 2758 1689 0 12:10 ? 00:00:00 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/f01f52d2dcbaae157b75f2a43732b7b1e2d4125cd71103b841d1e36d768658b8 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc root 3414 24689 0 12:13 pts/5 00:00:00 /bin/docker exec -it discourse_ose /bin/bash --login root 4678 23960 0 12:16 pts/8 00:00:00 grep --color=auto -i docker [root@osestaging1 backup-test.20191111]# ss -plan | grep -i docker u_str LISTEN 0 128 /var/run/docker/libnetwork/8d9f7d64fe798ffae545b7cf474d68140ccb5d56dda86273c3dd44cb10d6b2ff.sock 469029893 * 0 users:(("dockerd",pid=2739,fd=14)) u_str LISTEN 0 128 /var/run/docker.sock 469029709 * 0 users:(("dockerd",pid=2739,fd=6),("systemd",pid=1,fd=24)) u_str LISTEN 0 128 /var/run/docker/metrics.sock 469029865 * 0 users:(("dockerd",pid=2739,fd=3)) u_str ESTAB 0 0 * 469029851 * 469029852 users:(("dockerd",pid=2739,fd=2),("dockerd",pid=2739,fd=1)) u_str ESTAB 0 0 * 469115654 * 469115655 users:(("docker",pid=3414,fd=3)) u_str ESTAB 0 0 * 469115657 * 469115658 users:(("docker",pid=3414,fd=5)) u_dgr UNCONN 0 0 * 469029861 * 370237218 users:(("dockerd",pid=2739,fd=4)) u_str ESTAB 0 0 /var/run/docker.sock 469115655 * 469115654 users:(("dockerd",pid=2739,fd=25)) u_str ESTAB 0 0 /var/run/docker.sock 469115658 * 469115657 users:(("dockerd",pid=2739,fd=26)) u_str ESTAB 0 0 * 469029868 * 469029875 users:(("dockerd",pid=2739,fd=7)) u_str ESTAB 0 0 * 469029877 * 469029878 users:(("dockerd",pid=2739,fd=8)) [root@osestaging1 backup-test.20191111]# grep -ir 'docker' /etc/passwd dockerroot:x:989:985:Docker User:/var/lib/docker:/sbin/nologin [root@osestaging1 backup-test.20191111]#
- it's also worth pointing out that docker already fucked up all our iptables rules by injecting some tables and rules of its own: not ideal. There also appears to be no way to tell docker to clean that shit up. I would *expect* stopping docker to do that, but it doesn't. Google says that people just had to manually delete the rules to get them to go away. So I did that. https://stackoverflow.com/questions/50084582/cant-delete-docker-containers-default-iptables-rule
[root@osestaging1 lib]# iptables-save # Generated by iptables-save v1.4.21 on Mon Nov 11 12:09:41 2019 *mangle :PREROUTING ACCEPT [10080:1184525] :INPUT ACCEPT [10080:1184525] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [7512:1518444] :POSTROUTING ACCEPT [7506:1518084] COMMIT # Completed on Mon Nov 11 12:09:41 2019 # Generated by iptables-save v1.4.21 on Mon Nov 11 12:09:41 2019 *nat :PREROUTING ACCEPT [1:52] :INPUT ACCEPT [1:52] :OUTPUT ACCEPT [140:9693] :POSTROUTING ACCEPT [134:9333] :DOCKER - [0:0] -A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER -A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE -A DOCKER -i docker0 -j RETURN COMMIT # Completed on Mon Nov 11 12:09:41 2019 # Generated by iptables-save v1.4.21 on Mon Nov 11 12:09:41 2019 *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [0:0] :DOCKER - [0:0] :DOCKER-ISOLATION-STAGE-1 - [0:0] :DOCKER-ISOLATION-STAGE-2 - [0:0] :DOCKER-USER - [0:0] -A INPUT -p tcp -m state --state NEW -m tcp --dport 25 -j ACCEPT -A INPUT -s 5.9.144.234/32 -j DROP -A INPUT -s 173.234.159.250/32 -j DROP -A INPUT -i lo -j ACCEPT -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A INPUT -p icmp -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 80 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 443 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 4443 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 4444 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 32415 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 8020 -j ACCEPT -A INPUT -m limit --limit 5/min -j LOG --log-prefix "iptables IN denied: " --log-level 7 -A INPUT -j DROP -A FORWARD -j DOCKER-USER -A FORWARD -j DOCKER-ISOLATION-STAGE-1 -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT -A FORWARD -o docker0 -j DOCKER -A FORWARD -i docker0 ! -o docker0 -j ACCEPT -A FORWARD -i docker0 -o docker0 -j ACCEPT -A OUTPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A OUTPUT -s 127.0.0.1/32 -d 127.0.0.1/32 -j ACCEPT -A OUTPUT -d 213.133.98.98/32 -p udp -m udp --dport 53 -j ACCEPT -A OUTPUT -d 213.133.99.99/32 -p udp -m udp --dport 53 -j ACCEPT -A OUTPUT -d 213.133.100.100/32 -p udp -m udp --dport 53 -j ACCEPT -A OUTPUT -m limit --limit 5/min -j LOG --log-prefix "iptables OUT denied: " --log-level 7 -A OUTPUT -p tcp -m owner --uid-owner 48 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 27 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 995 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 994 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 993 -j DROP -A DOCKER-ISOLATION-STAGE-1 -i docker0 ! 
-o docker0 -j DOCKER-ISOLATION-STAGE-2 -A DOCKER-ISOLATION-STAGE-1 -j RETURN -A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP -A DOCKER-ISOLATION-STAGE-2 -j RETURN -A DOCKER-USER -j RETURN COMMIT # Completed on Mon Nov 11 12:09:41 2019 [root@osestaging1 lib]# cd /root/backups/iptables/20191111 [root@osestaging1 20191111]# iptables-save > iptablesa [root@osestaging1 20191111]# cp iptablesa iptablesb [root@osestaging1 20191111]# vim iptablesb [root@osestaging1 20191111]# iptables-restore < iptablesb [root@osestaging1 20191111]# iptables-save # Generated by iptables-save v1.4.21 on Mon Nov 11 12:10:17 2019 *mangle :PREROUTING ACCEPT [129:14009] :INPUT ACCEPT [129:14009] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [107:12837] :POSTROUTING ACCEPT [107:12837] COMMIT # Completed on Mon Nov 11 12:10:17 2019 # Generated by iptables-save v1.4.21 on Mon Nov 11 12:10:17 2019 *nat :PREROUTING ACCEPT [0:0] :INPUT ACCEPT [0:0] :OUTPUT ACCEPT [0:0] :POSTROUTING ACCEPT [0:0] COMMIT # Completed on Mon Nov 11 12:10:17 2019 # Generated by iptables-save v1.4.21 on Mon Nov 11 12:10:17 2019 *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [0:0] -A INPUT -p tcp -m state --state NEW -m tcp --dport 25 -j ACCEPT -A INPUT -s 5.9.144.234/32 -j DROP -A INPUT -s 173.234.159.250/32 -j DROP -A INPUT -i lo -j ACCEPT -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A INPUT -p icmp -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 80 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 443 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 4443 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 4444 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 32415 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 8020 -j ACCEPT -A INPUT -m limit --limit 5/min -j LOG --log-prefix "iptables IN denied: " --log-level 7 -A INPUT -j DROP -A OUTPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A OUTPUT -s 127.0.0.1/32 -d 127.0.0.1/32 -j ACCEPT -A OUTPUT -d 213.133.98.98/32 -p udp -m udp --dport 53 -j ACCEPT -A OUTPUT -d 213.133.99.99/32 -p udp -m udp --dport 53 -j ACCEPT -A OUTPUT -d 213.133.100.100/32 -p udp -m udp --dport 53 -j ACCEPT -A OUTPUT -m limit --limit 5/min -j LOG --log-prefix "iptables OUT denied: " --log-level 7 -A OUTPUT -p tcp -m owner --uid-owner 48 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 27 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 995 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 994 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 993 -j DROP COMMIT # Completed on Mon Nov 11 12:10:17 2019 [root@osestaging1 20191111]#
- I read that I can prevent docker from injecting rules into iptables by using the 'iptables=false' argument, but it's unclear where to put this value https://docs.docker.com/network/iptables/
- the above doc says to put it in /etc/docker/daemon.json, but that file does not exist
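- (per the docker docs, creating that file would presumably look like the following; I didn't end up going this route)
# untested: the daemon.json approach documented by docker
# (the file was absent on this host)
mkdir -p /etc/docker
cat > /etc/docker/daemon.json <<'EOF'
{ "iptables": false }
EOF
systemctl restart docker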
- other places say to define DOCKER_OPTS. I could find no place to put this in /var/discourse
- other places said /etc/default/docker; that doesn't exist either
- other places said /etc/sysconfig/docker; that doesn't exist either
- finally, I found /usr/lib/systemd/system/docker.service that defines 'ExecStart'. I added the '--iptables=false' arg to that line
[root@osestaging1 system]# cat docker.service [Unit] Description=Docker Application Container Engine Documentation=https://docs.docker.com BindsTo=containerd.service After=network-online.target firewalld.service containerd.service Wants=network-online.target Requires=docker.socket [Service] Type=notify # the default is not to use systemd for cgroups because the delegate issues still # exists and systemd currently does not support the cgroup feature set required # for containers run by docker #ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --iptables=false ExecReload=/bin/kill -s HUP $MAINPID TimeoutSec=0 RestartSec=2 Restart=always # Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229. # Both the old, and new location are accepted by systemd 229 and up, so using the old location # to make them work for either version of systemd. StartLimitBurst=3 # Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230. # Both the old, and new name are accepted by systemd 230 and up, so using the old name to make # this option work for either version of systemd. StartLimitInterval=60s # Having non-zero Limit*s causes performance problems due to accounting overhead # in the kernel. We recommend using cgroups to do container-local accounting. LimitNOFILE=infinity LimitNPROC=infinity LimitCORE=infinity # Comment TasksMax if your systemd version does not support it. # Only systemd 226 and above support this option. TasksMax=infinity # set delegate yes so that systemd does not reset the cgroups of docker containers Delegate=yes # kill only the docker process, not all processes in the cgroup KillMode=process [Install] WantedBy=multi-user.target [root@osestaging1 system]#
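- (note to self: a systemd drop-in would survive package updates to docker.service, unlike editing the unit file under /usr/lib directly; an untested sketch:)
# untested alternative: a drop-in override so rpm updates to docker.service
# don't clobber the --iptables=false change; the empty ExecStart= resets the
# unit's original value before redefining it
mkdir -p /etc/systemd/system/docker.service.d
cat > /etc/systemd/system/docker.service.d/iptables.conf <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --iptables=false
EOF
systemctl daemon-reload
systemctl restart docker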
- note that I had to reload the systemd daemon to apply this change. When I restarted docker, it mangled our iptables *less*, but it still added 3x lines
[root@osestaging1 system]# systemctl daemon-reload [root@osestaging1 system]# systemctl start docker [root@osestaging1 system]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES f01f52d2dcba local_discourse/discourse_ose "/sbin/boot" 2 hours ago Up 4 seconds discourse_ose [root@osestaging1 system]# iptables-save # Generated by iptables-save v1.4.21 on Mon Nov 11 12:28:47 2019 *mangle :PREROUTING ACCEPT [7821:1020843] :INPUT ACCEPT [7810:1020199] :FORWARD ACCEPT [11:644] :OUTPUT ACCEPT [6531:1681597] :POSTROUTING ACCEPT [6530:1681521] COMMIT # Completed on Mon Nov 11 12:28:47 2019 # Generated by iptables-save v1.4.21 on Mon Nov 11 12:28:47 2019 *nat :PREROUTING ACCEPT [11:584] :INPUT ACCEPT [9:468] :OUTPUT ACCEPT [153:10197] :POSTROUTING ACCEPT [143:9593] COMMIT # Completed on Mon Nov 11 12:28:47 2019 # Generated by iptables-save v1.4.21 on Mon Nov 11 12:28:47 2019 *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [1:76] :DOCKER-USER - [0:0] -A INPUT -p tcp -m state --state NEW -m tcp --dport 25 -j ACCEPT -A INPUT -s 5.9.144.234/32 -j DROP -A INPUT -s 173.234.159.250/32 -j DROP -A INPUT -i lo -j ACCEPT -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A INPUT -p icmp -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 80 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 443 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 4443 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 4444 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 32415 -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 8020 -j ACCEPT -A INPUT -m limit --limit 5/min -j LOG --log-prefix "iptables IN denied: " --log-level 7 -A INPUT -j DROP -A FORWARD -j DOCKER-USER -A OUTPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A OUTPUT -s 127.0.0.1/32 -d 127.0.0.1/32 -j ACCEPT -A OUTPUT -d 213.133.98.98/32 -p udp -m udp --dport 53 -j ACCEPT -A OUTPUT -d 213.133.99.99/32 -p udp -m udp --dport 53 -j ACCEPT -A OUTPUT -d 213.133.100.100/32 -p udp -m udp --dport 53 -j ACCEPT -A OUTPUT -m limit --limit 5/min -j LOG --log-prefix "iptables OUT denied: " --log-level 7 -A OUTPUT -p tcp -m owner --uid-owner 48 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 27 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 995 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 994 -j DROP -A OUTPUT -p tcp -m owner --uid-owner 993 -j DROP -A DOCKER-USER -j RETURN COMMIT # Completed on Mon Nov 11 12:28:47 2019 [root@osestaging1 system]#
- now when I enter the docker container, it cannot curl google.com (or the ip address for google.com). success!
[root@osestaging1 discourse]# ./launcher enter discourse_ose root@osestaging1-discourse-ose:/var/www/discourse# curl google.com curl: (6) Could not resolve host: google.com root@osestaging1-discourse-ose:/var/www/discourse# curl 216.58.207.78 curl: (7) Failed to connect to 216.58.207.78 port 80: Connection timed out root@osestaging1-discourse-ose:/var/www/discourse#
- meanwhile, I confirmed that I can still browse around on the discourse wui without issues; this is probably because the host server has an nginx config mapped to a socket file--avoiding the need for me to poke a hole in the firewall for RELATED,ESTABLISHED connections. sweet.
- the next step for hardening is getting a WAF. We use mod_security and the OWASP CRS for all our other sites, but all our other sites' backends are running apache. Unfortunately, getting mod_security set up in Nginx (which is what Discourse runs in docker) requires compiling Nginx from source D:
- A search for 'apache' in the meta.discourse.org forums shows a lot of info on how to run discourse on a server with apache already running. I already followed these guides to get Discourse to listen on a socket instead of a port to avoid port binding conflicts. Other topics on the forums are guides to run apache on the host that proxies back to Nginx. I couldn't find anyone who actually got Discourse's backend to run on Apache.
- I created a new topic asking if anyone has actually gotten the docker backend running Discourse to run in an Apache vhost, hopefully someone has already translated the Nginx config into Apache so I can just copy their config https://meta.discourse.org/t/how-to-run-discourse-in-apache-vhost-not-nginx/133112
- good lord, it looks like Discourse may already be compiling nginx from source?? https://github.com/discourse/discourse_docker/blob/416467f6ead98f82342e8a926dc6e06f36dfbd56/image/base/install-nginx
- if this is confirmed, then I should ask the community which is least likely to break my Discourse config in the future: updating the above install-nginx script to include mod_security or updating the container's config to run Discourse behind Apache instead of Nginx
- ok, so it looks like $home is /var/www/discourse, which is itself a git clone of the main 'discourse/discourse' repo https://github.com/discourse/discourse/
root@osestaging1-discourse-ose:/var/www/discourse# cat .git/config [core] repositoryformatversion = 0 filemode = true bare = false logallrefupdates = true [remote "origin"] url = https://github.com/discourse/discourse.git fetch = +refs/heads/*:refs/remotes/origin/* fetch = +refs/heads/tests-passed:refs/remotes/origin/tests-passed fetch = +refs/heads/master:refs/remotes/origin/master [branch "master"] remote = origin merge = refs/heads/master [branch "tests-passed"] remote = origin merge = refs/heads/tests-passed root@osestaging1-discourse-ose:/var/www/discourse#
- that repo appears to be created (git cloned) at the end of this Dockerfile execution from the 'discourse_docker' repo that lives on the host at /var/discourse https://github.com/discourse/discourse_docker/blob/ceffc4433e1bd6fcbd101f2427e17232fc99ab14/image/base/Dockerfile#L132
[root@osestaging1 discourse]# cd /var/discourse/ [root@osestaging1 discourse]# cat .git/config [core] repositoryformatversion = 0 filemode = true bare = false logallrefupdates = true [remote "origin"] url = https://github.com/discourse/discourse_docker.git fetch = +refs/heads/*:refs/remotes/origin/* [branch "master"] remote = origin merge = refs/heads/master [root@osestaging1 discourse]# ls -lah image/base/Dockerfile -rw-r--r--. 1 root root 5.6K Oct 28 12:07 image/base/Dockerfile [root@osestaging1 discourse]#
- so then $home (in the template yamls) appears to be equal to the '/var/www/discourse/' dir in the container, and then "$home/config/nginx.sample.conf" is '/var/www/discourse/config/nginx.sample.conf', which is then the file in the 'discourse' repo here https://github.com/discourse/discourse/blob/1d1dd2a4d436944a7b088f2d4a471c62b8fa4de2/config/nginx.sample.conf
- well, I wanted to avoid compiling nginx from source, but if Discourse is *already* doing that, then it's probably easier to just update that relatively simple 'install-nginx' script to add the mod_security module to the container's build of nginx rather than translate all the nginx vhost configs to apache
- so then the next steps for getting a WAF in-front of Discourse would be to
- update the install-nginx script so that it compiles nginx with mod_security (and probably downloads the OWASP CRS as well); a rough sketch follows this list https://github.com/discourse/discourse_docker/blob/416467f6ead98f82342e8a926dc6e06f36dfbd56/image/base/install-nginx
- add a new templates/web.modsecurity.yml file that updates the /etc/nginx/conf.d/discourse.conf file to enable mod_security (and add some blacklisted rules as-needed), similar to the existing web.socketed.template.yml file https://github.com/discourse/discourse_docker/blob/416467f6ead98f82342e8a926dc6e06f36dfbd56/templates/web.socketed.template.yml
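- here's that rough, untested sketch of the kind of change I'd make to install-nginx (the package name, paths, and connector repo are my assumptions; the Discourse base image is debian-based, so apt should be available):
# sketch only: bolt the ModSecurity v3 nginx connector onto the existing source build
apt-get install -y libmodsecurity-dev
git clone --depth 1 https://github.com/SpiderLabs/ModSecurity-nginx.git /tmp/ModSecurity-nginx
# then append to the existing ./configure invocation in install-nginx:
#   --add-module=/tmp/ModSecurity-nginx
# and enable it later in the vhost config with 'modsecurity on;' plus an OWASP CRS rules file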
Sun Nov 10, 2019
- I'm still getting a lot of friction from the Discourse team; they don't seem to understand front-end caching at all. But at least they are actively responding https://meta.discourse.org/t/discourse-purge-cache-method-on-content-changes/132917/9
- I was informed that there is a caching layer built-into Discourse for users that are not logged-in. I asked for a link to the documentation, but I'm afraid that it may not exist.
- I was chastised by one of their staff testers for not understanding how Discourse works...but I still can't find any documentation that describes how it works *shrug*
- in any case, they said I was wrong when I stated that Discourse produces HTML, CSS, & JS and sends that to a browser. What? That's literally what Discourse does, and all of it can be cached by varnish.
- this person seems to think I want to cache logged-in users. I'm not sure why they would assume that...I clarified that people usually don't cache logged-in users. I don't think the Discourse team has ever worked with caching web apps before...
- so I'm hoping I get 2x responses:
- links to documentation on what caching Discourse does built-in
- links to documentation on how I can write a Discourse plugin to call a function that I write when a new post is added to a topic (what that function would boil down to is sketched below)
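- that plugin function would essentially just send an HTTP PURGE request to varnish whenever a topic changes; a minimal sketch (the topic URL is hypothetical; 127.0.0.1:6081 is where our varnish listens; varnish would also need to be configured to accept PURGE in vcl_recv):
# hypothetical cache-invalidation call a plugin hook could make on topic change
curl -X PURGE "http://127.0.0.1:6081/t/some-topic-slug/123"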
Sat Nov 09, 2019
- oh boy, a founder of the Discourse project responded to my question about how to have the Discourse app send PURGE requests to our varnish caching layer on content changes. He erroneously suggested that caching doesn't make sense for today's JS apps--as if Discourse's function isn't to produce HTML, CSS, and JS (all of which can be cached!). This is *not* good. https://meta.discourse.org/t/discourse-purge-cache-method-on-content-changes/132917
- I did some research into scaling Discourse. I got a ton of info in this thread. Important to note is that they recommend running redis & postgresql *outside* of docker; then just elastically scaling the ruby docker containers as needed based on some calculations https://meta.discourse.org/t/performance-scaling-and-ha-requirements/60098/8
- it makes a point to note that these calculations are based on read operations, which again suggests that we could significantly reduce our hardware requirements by putting a fucking varnish cache before the app. That's not surprising...
- Unfortunately, it looks like I'd have to write a damn plugin for Discourse to get it to invalidate the cache. And, worse, I'd have no support from their development team, which doesn't see the point in adding a cache before their app. Is it worth it?
Fri Nov 08, 2019
- I got a response from the discourse community on my query regarding connecting discourse to the same server running discourse over smtp unauth https://meta.discourse.org/t/troubleshooting-email-on-a-new-discourse-install/16326
- I was erroneously told that my config is rare (it's literally the default postfix config on rhel/centos) and that it isn't supported using the 'discourse-setup' install script *facepalm*
- I really, really want to abort this Discourse POC. Their install tools are shit. Their community support and documentation are shit & non-existent. Their developer stack is ruby shit that's unnecessarily complex, to the point that the most basic smtp config breaks its default install (localhost:25 without auth. could it get any simpler?!?). This isn't a problem for most open source web projects; they just connect to localhost:25 and send email without any configuration. They just work! But not Discourse...Alas, everyone is using Discourse now. And it still, from a user perspective, seems to be the best tool. It'll just be a huge pain to install & manage *sigh*
- anyway, continuing from yesterday, the only port that's visible on the docker host from within the discourse docker container is 10000
root@osestaging1-app:/# nmap 172.17.0.1 Starting Nmap 7.70 ( https://nmap.org ) at 2019-11-07 15:00 UTC Nmap scan report for 172.17.0.1 Host is up (0.000019s latency). Not shown: 999 closed ports PORT STATE SERVICE 10000/tcp open snet-sensor-mgmt MAC Address: 02:42:80:35:65:A1 (Unknown) Nmap done: 1 IP address (1 host up) scanned in 1.85 seconds root@osestaging1-app:/#
- I checked the host, and there is something listening on port 10000
[root@osestaging1 20191107]# ss -plan | grep -i 1000 udp UNCONN 0 0 *:10000 *:* users:(("miniserv.pl",pid=620,fd=6)) tcp LISTEN 0 128 *:10000 *:* users:(("miniserv.pl",pid=620,fd=5)) [root@osestaging1 20191107]#
- smtp, however, is only listening on 127.0.0.1. This is ideal; all servers on prod were set up to bind to a specific IP, not all interfaces. I guess we'll have to add the docker interface for postfix, though
- I updated the staging server's postfix config to use inet_interfaces "127.0.0.1, 172.17.0.1" instead of just "localhost"
#inet_interfaces = localhost inet_interfaces = 127.0.0.1, 172.17.0.1
- I restarted the postfix service & confirmed the changes
[root@osestaging1 postfix]# service postfix restart Redirecting to /bin/systemctl restart postfix.service [root@osestaging1 postfix]# [root@osestaging1 postfix]# ss -plan | grep -i ':25' | grep -i LISTEN tcp LISTEN 0 100 172.17.0.1:25 *:* users:(("master",pid=27738,fd=14)) tcp LISTEN 0 100 127.0.0.1:25 *:* users:(("master",pid=27738,fd=13)) [root@osestaging1 postfix]#
- cool, that worked. now my Discourse docker instance can see an open port 25 on the staging server
root@osestaging1-app:/# nmap 172.17.0.1 Starting Nmap 7.70 ( https://nmap.org ) at 2019-11-08 07:57 UTC Nmap scan report for 172.17.0.1 Host is up (0.000035s latency). Not shown: 998 closed ports PORT STATE SERVICE 25/tcp open smtp 10000/tcp open snet-sensor-mgmt MAC Address: 02:42:80:35:65:A1 (Unknown) Nmap done: 1 IP address (1 host up) scanned in 1.56 seconds root@osestaging1-app:/#
- I tested this access via telnet; the connection succeeded, but the relay was rejected by the mail server
root@osestaging1-app:/# telnet 172.17.0.1 25 Trying 172.17.0.1... Connected to 172.17.0.1. Escape character is '^]'. 220 mailer.opensourceecology.org ESMTP Postfix HELO from osestaging1-app.opensourceecology.org 250 mailer.opensourceecology.org mail from: discourse@opensourceecology.org 250 2.1.0 Ok rcpt to: michael@opensourceecology.org 454 4.7.1 <michael@opensourceecology.org>: Relay access denied
- the postfix log at /var/log/maillog on the staging server shows that it's rejected. That's probably because the docker IP is not in the 'mynetworks'
[root@osestaging1 nginx]# tail -f /var/log/maillog ... Nov 8 08:02:15 osestaging1 postfix/smtpd[28964]: connect from unknown[172.17.0.2] Nov 8 08:02:52 osestaging1 postfix/smtpd[28964]: NOQUEUE: reject: RCPT from unknown[172.17.0.2]: 454 4.7.1 <michael@opensourceecology.org>: Relay access denied; from=<discourse@opensourceecology.org> to=<michael@opensourceecology.org> proto=SMTP helo=<from?osestaging1-app.opensourceecology.org>
- I updated the mynetworks_style && mynetworks variables in postfix's /etc/postfix/main.cf config to include the docker0 subnet
[root@osestaging1 postfix]# ip address show docker0 3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP link/ether 02:42:80:35:65:a1 brd ff:ff:ff:ff:ff:ff inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0 valid_lft forever preferred_lft forever inet6 fe80::42:80ff:fe35:65a1/64 scope link valid_lft forever preferred_lft forever [root@osestaging1 postfix]# cat /etc/postfix/main.cf ... #mynetworks_style = host ... mynetworks = 127.0.0.0/8, 172.17.0.0/16 ...
- this appears to work now
root@osestaging1-app:/# telnet 172.17.0.1 25 Trying 172.17.0.1... Connected to 172.17.0.1. Escape character is '^]'. 220 mailer.opensourceecology.org ESMTP Postfix HELO from osestaging1-app.opensourceecology.org 250 mailer.opensourceecology.org mail from: discourse@opensourceecology.org 250 2.1.0 Ok rcpt to: vt6t5up@mail.ru 250 2.1.5 Ok DATA 354 End data with <CR><LF>.<CR><LF> subject: test Hi, this is a test can you see it? . 250 2.0.0 Ok: queued as 434F25E2279 QUIT 221 2.0.0 Bye Connection closed by foreign host. root@osestaging1-app:/#
- postfix accepted it, but the logs show mail.ru rejected my mail as spam
Nov 8 08:14:55 osestaging1 postfix/cleanup[31776]: 434F25E2279: message-id=<> Nov 8 08:14:55 osestaging1 postfix/qmgr[31738]: 434F25E2279: from=<discourse@opensourceecology.org>, size=268, nrcpt=1 (queue active) Nov 8 08:14:56 osestaging1 postfix/smtp[31842]: 434F25E2279: to=<vt6t5up@mail.ru>, relay=mxs.mail.ru[94.100.180.31]:25, delay=67, delays=66/0.03/0.16/1, dsn=5.0.0, status=bounced (host mxs.mail.ru[94.100.180.31] said: 550 spam message rejected. Please visit http://help.mail.ru/notspam-support/id?c=clN2o_Yz744aT6kkUmr_G_iDdemyJDJvS9gQkItq7hI-s9yFa787o8FmNc74dWmdKwAAAJhXAADjzkkU or report details to abuse@corp.mail.ru. Error code: A37653728EEF33F624A94F1A1BFF6A52E97583F86F3224B29010D84B12EE6A8B85DCB33EA33BBF6BCE3566C19D6975F8. ID: 0000002B000057981449CEE3. (in reply to end of DATA command)) Nov 8 08:14:56 osestaging1 postfix/cleanup[31776]: D2DAC5E227D: message-id=<20191108081456.D2DAC5E227D@mailer.opensourceecology.org> Nov 8 08:14:56 osestaging1 postfix/qmgr[31738]: D2DAC5E227D: from=<>, size=2976, nrcpt=1 (queue active) Nov 8 08:14:56 osestaging1 postfix/bounce[31844]: 434F25E2279: sender non-delivery notification: D2DAC5E227D Nov 8 08:14:56 osestaging1 postfix/qmgr[31738]: 434F25E2279: removed Nov 8 08:14:57 osestaging1 postfix/smtp[31842]: D2DAC5E227D: to=<discourse@opensourceecology.org>, relay=aspmx.l.google.com[74.125.140.27]:25, delay=0.63, delays=0.01/0/0.13/0.5, dsn=2.0.0, status=sent (250 2.0.0 OK 1573200896 l8si4236714wmg.78 - gsmtp) Nov 8 08:14:57 osestaging1 postfix/qmgr[31738]: D2DAC5E227D: removed
- I tried again, sending to my personal domain. And it worked!
Nov 8 08:21:22 osestaging1 postfix/smtpd[2027]: connect from unknown[172.17.0.2] Nov 8 08:22:10 osestaging1 postfix/smtpd[2027]: 8168E5E2279: client=unknown[172.17.0.2] Nov 8 08:22:25 osestaging1 postfix/cleanup[31776]: 8168E5E2279: message-id=<> Nov 8 08:22:25 osestaging1 postfix/qmgr[31738]: 8168E5E2279: from=<discourse@opensourceecology.org>, size=304, nrcpt=1 (queue active) Nov 8 08:22:26 osestaging1 postfix/smtp[2095]: 8168E5E2279: to=<osediscourse_2019@michaelaltfield.net>, relay=mail.michaelaltfield.net[176.56.237.113]:25, delay=23, delays=22/0.02/0.48/0.43, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 605471016) Nov 8 08:22:26 osestaging1 postfix/qmgr[31738]: 8168E5E2279: removed
- now, I changed the SMTP address from 'localhost' to '172.17.0.1' in the discourse app.yml config file & rebuilt the app
[root@osestaging1 discourse]# grep SMTP_ADDRESS containers/app.yml DISCOURSE_SMTP_ADDRESS: 172.17.0.1 [root@osestaging1 discourse]# ./launcher destroy app && ./launcher rebuild app
- finally, discourse came back up. Now hitting the "Resend Activation Email" button produces postfix logs on the staging server
Nov 8 08:43:35 osestaging1 postfix/smtpd[21052]: connect from unknown[172.17.0.2] Nov 8 08:43:35 osestaging1 postfix/smtpd[21052]: D330D5E227D: client=unknown[172.17.0.2] Nov 8 08:43:35 osestaging1 postfix/cleanup[21056]: D330D5E227D: message-id=<968ccef2-0b84-4807-940d-bf075e21f260@discourse.opensourceecology.org> Nov 8 08:43:35 osestaging1 postfix/qmgr[31738]: D330D5E227D: from=<noreply@discourse.opensourceecology.org>, size=2790, nrcpt=1 (queue active) Nov 8 08:43:35 osestaging1 postfix/smtpd[21052]: disconnect from unknown[172.17.0.2] Nov 8 08:43:36 osestaging1 postfix/smtp[21057]: D330D5E227D: host mail.michaelaltfield.net[176.56.237.113] said: 450 4.1.8 <noreply@discourse.opensourceecology.org>: Sender address rejected: Domain not found (in reply to RCPT TO command) Nov 8 08:43:38 osestaging1 postfix/smtp[21057]: connect to mail.michaelaltfield.net[2a00:d880:5:82b::329a]:25: Network is unreachable Nov 8 08:43:38 osestaging1 postfix/smtp[21057]: D330D5E227D: to=<osediscorse_2019@michaelaltfield.net>, relay=none, delay=2.4, delays=0.06/0/2.3/0, dsn=4.4.1, status=deferred (connect to mail.michaelaltfield.net[2a00:d880:5:82b::329a]:25: Network is unreachable)
- I checked the logs on my personal mail server. yeah, my server rejected them since the sender domain 'discourse.opensourceecology.org' is not defined in DNS
Nov 8 08:45:03 mail postfix/smtpd[14653]: connect from static.113.233.201.195.clients.your-server.de[195.201.233.113] Nov 8 08:45:03 mail postfix/smtpd[14653]: NOQUEUE: reject: RCPT from static.113.233.201.195.clients.your-server.de[195.201.233.113]: 450 4.1.8 <noreply@discourse.opensourceecology.org>: Sender address rejected: Domain not found; from=<noreply@discourse.opensourceecology.org> to=<osediscorse_2019@michaelaltfield.net> proto=ESMTP helo=<mailer.opensourceecology.org> Nov 8 08:45:07 mail postfix/smtpd[14653]: disconnect from static.113.233.201.195.clients.your-server.de[195.201.233.113]
- in my testing, I apparently triggered an active response from ossec, temporarily banning the ose staging server from accessing my personal michaelaltfield.net server; here's the command to fix it
[root@mail etc]# /var/ossec/active-response/bin/firewall-drop.sh delete - 195.201.233.113/32
- ugh, I had a typo of <osediscorse_2019@michaelaltfield.net> != <osediscourse_2019@michaelaltfield.net>
- I fixed it & rebuilt the app. god this takes so long. I timed it: a simple change to a single variable followed by a restart of Discourse takes 10 minutes and 40 seconds *facepalm*
[root@osestaging1 discourse]# time ( ./launcher destroy app && ./launcher rebuild app ) 2019-11-08 09:57:15.837 UTC [49] LOG: database system is shut down sha256:e54ffb1ec9b28fda8a13807a4147fcc2bc06f1558e5212c324a8952071602967 2f988a76dde4cc8f151979a0cb321116abe94fcf5159d9630450178b3fef72b3 + /bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=2 -e UNICORN_SIDEKIQS=1 -e RUBY_GLOBAL_METHOD_CACHE_SIZE=131072 -e RUBY_GC_HEAP_GROWTH_MAX_SLOTS=40000 -e RUBY_GC_HEAP_INIT_SLOTS=400000 -e RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.5 -e DISCOURSE_DB_SOCKET=/var/run/postgresql -e DISCOURSE_DB_HOST= -e DISCOURSE_DB_PORT= -e DISCOURSE_HOSTNAME=discourse.opensourceecology.org -e DISCOURSE_DEVELOPER_EMAILS=osediscourse_2019@michaelaltfield.net -e DISCOURSE_SMTP_ADDRESS=172.17.0.1 -e DISCOURSE_SMTP_PORT=25 -e DISCOURSE_SMTP_AUTHENTICATION=none -e DISCOURSE_SMTP_OPENSSL_VERIFY_MODE=none -e DISCOURSE_SMTP_ENABLE_START_TLS=false -h osestaging1-app -e DOCKER_HOST_IP=172.17.0.1 --name app -t -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:53:2a:01:9b:c2 local_discourse/app /sbin/boot dc9a16388e615187594c0fa5919a8d691f0e0a4bb7a55d63d51c5404bca15de7 real 10m40.754s user 0m3.380s sys 0m3.095s [root@osestaging1 discourse]#
- that fixed it! I got an email with an activation link, and I was able to begin the install in the discourse wui!
- I set the Community Name = Open Source Ecology
- I set the "Describe your community in one short sentence" = "We’re developing open source industrial machines that can be made for a fraction of commercial costs, and sharing our designs online for free. The goal of Open Source Ecology is to create an open source economy an efficient economy which increases innovation by open collaboration."
- I set the "Describe your community in few words" = "Open Source Blueprints for Civilization. Build Yourself."
- I set the "Welcome Topic" paragraph to
We’re developing open source industrial machines that can be made for a fraction of commercial costs, and sharing our designs online for free. The goal of Open Source Ecology is to create an open source economy an efficient economy which increases innovation by open collaboration. For more information, see our Founder's TED talk https://www.ted.com/talks/marcin_jakubowski Our website is at https://www.opensourceecology.org And our wiki is at https://wiki.opensourceecology.org
- I set the site to "Public"
- I set the new user signup to "Users can sign up on their own, but must be approved by staff."
- I set the (contact) "Web Page" to https://www.opensourceecology.org/contact/
- I set the city & state regarding laws to "Maysville, MO"
- I selected the top-left-most (first?) theme
- the next step asked for two logos
- one that's 120 pixels high and 3x (or more) wider than that
- one that's square, larger than 512x512
- I used this one for both *shrug*
- It then asked for another two icons: a favicon and, I guess, the apple touch icon? Again, I used the Yellowlogo.png from above for both
- I left the default to show Latest Topics on the main page
- Emoji preferences? I left it at the default of Twitter. Which one is the most open-source? *shrug*
- I added Marcin, but he won't be able to access the site until he's set up with the VPN anyway..
- ok, now I can click around the Discourse wui
- first I noticed that our Discourse version is 2.4.0.beta7. Da fuk? How did we end up on some beta version?
- this article discusses it. they say that the stable branch is actually neglected and not maintained, so the Discourse model is to use the beta branch by default for production installs *facepalm* https://meta.discourse.org/t/please-dont-pressure-self-installers-to-be-on-beta-branch/32237/4
- well, fuck, everything I set up in the wizard (as logged above) is totally not set! why?
- I set all the Admin -> Required fields manually; they saved.
- But then the "Branding" section refused to take my icons. It uploaded them, but then when I clicked the green check-box, it just dissolved my logo and replaced it with Discourse's with no error in the wui or in the logs (they *do* show 200 success logs). the fuck?
- I had to go to the "Login" tab to re-enable the "Staff must approve all new user accounts before they are allowed to access the site." checkbox
- the fuck? there's a default list of domains banned by Discourse, including mailinator.com. Why are we blocking mailinator? I removed it.
- there's also a ban of all cryptographic signatures? This is just perplexing; I created a topic about it, tagging the dev that made this the default back in 2016-08-03 https://meta.discourse.org/t/why-does-discourse-block-cryptographic-signatures-by-default/132912
- I removed the S/MIME & PGP signature ban
- I changed the max image size from 4096 to 1024 KB (this matches our MediaWiki settings)
- I also created a topic on meta.discourse.org asking how to setup Discourse to send PURGE requests to our defined cache server when its content changes, so that we can put Discourse behind varnish https://meta.discourse.org/t/discourse-purge-cache-method-on-content-changes/132917
- there's a security tab in the admin section. The default cookie policy is "lax"; I changed it to "strict"
- I found a docs subdomain on discourse.org, but it appears to be just for documenting the API? Also, it freezes my damn browser, it's so slow.. https://docs.discourse.org/
- I'm not the only one that is puzzled by the lack of documentation https://meta.discourse.org/t/basic-product-documentation/35719/6
- someone responded suggesting threads in meta.discourse.org tagged with #howto https://meta.discourse.org/c/10-howto
- and the faq https://meta.discourse.org/c/howto/faq/4
- this new user guide seems like an obvious place to start https://meta.discourse.org/t/discourse-new-user-guide/96331
- ok, since there's no fucking documentation I'm left googling the Discourse forums trying to figure out how the fuck I should cobble together a backup solution. First, it looks like discourse-triggered backups are stored in /var/discourse/shared/standalone/backups/default https://meta.discourse.org/t/where-do-the-local-backups-go-when-s3-backups-arent-enabled/26591
[root@osestaging1 discourse]# ls -lah /var/discourse/shared/standalone/backups/default/ total 6.9M drwxr-xr-x. 2 tgriffing 33 4.0K Nov 8 03:31 . drwxr-xr-x. 3 tgriffing 33 4.0K Nov 8 00:00 .. -rw-r--r--. 1 tgriffing 33 6.8M Nov 8 03:31 discourse-2019-11-08-033129-v20191101113230.tar.gz [root@osestaging1 discourse]#
- so that's 6.8M. And it's owned by tgriffing? The ownership is wrong, ugh. Permissions start to get fucked under the 'standalone' dir it seems. Maybe that's just some internal docker UIDs or sth..
[root@osestaging1 discourse]# ls -lah /var/discourse/shared/standalone/ total 44K drwxr-xr-x. 11 root root 4.0K Nov 8 09:59 . drwxr-xr-x. 3 root root 4.0K Nov 7 11:27 .. drwxr-xr-x. 3 tgriffing 33 4.0K Nov 8 00:00 backups drwxr-xr-x. 4 root root 4.0K Nov 7 11:28 log srw-rw-rw-. 1 root root 0 Nov 8 09:59 nginx.http.sock drwxr-xr-x. 2 106 109 4.0K Nov 7 11:28 postgres_backup drwx------. 19 106 109 4.0K Nov 8 09:59 postgres_data drwxrwxr-x. 3 106 109 4.0K Nov 8 09:59 postgres_run drwxr-xr-x. 2 108 111 4.0K Nov 8 12:14 redis_data drwxr-xr-x. 4 root root 4.0K Nov 7 11:54 state drwxr-xr-x. 4 tgriffing 33 4.0K Nov 8 09:59 tmp drwxr-xr-x. 3 tgriffing 33 4.0K Nov 7 11:30 uploads [root@osestaging1 discourse]#
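- following that hunch: UID 33 probably belongs to 'tgriffing' on the centos host and to 'www-data' inside the debian-based container. An untested sanity check:
getent passwd 33                      # on the host; should show tgriffing
docker exec app getent passwd 33      # in the container; should show www-data (assumption)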
- fwiw that standalone dir (which is where all our state is stored) is 102M. Most of it is postgres data
[root@osestaging1 discourse]# du -sh /var/discourse/shared/standalone/ 102M /var/discourse/shared/standalone/ [root@osestaging1 discourse]# du -sh /var/discourse/shared/standalone/* 6.9M /var/discourse/shared/standalone/backups 1.1M /var/discourse/shared/standalone/log 0 /var/discourse/shared/standalone/nginx.http.sock 4.0K /var/discourse/shared/standalone/postgres_backup 93M /var/discourse/shared/standalone/postgres_data 216K /var/discourse/shared/standalone/postgres_run 404K /var/discourse/shared/standalone/redis_data 28K /var/discourse/shared/standalone/state 12K /var/discourse/shared/standalone/tmp 216K /var/discourse/shared/standalone/uploads [root@osestaging1 discourse]#
- this post suggests a method to kick-off a backup, but it requires shutting down the discourse server to do so; not an option. https://meta.discourse.org/t/backup-discourse-from-the-command-line/64364/7
- another user suggested to run a two-liner to trigger a backup via the cli which (I think) doesn't require taking down the app
[root@osestaging1 discourse]# ./launcher enter app root@osestaging1-app:/var/www/discourse# discourse backup [SUCCESS] Backup done. Output file is in: /var/www/discourse/public/backups/default/discourse-2019-11-08-122241-v20191108000414.tar.gz root@osestaging1-app:/var/www/discourse#
- but can it be done from a script?
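- probably; an untested sketch of what a cron-able version could look like from the host (assumes the container is still named 'app' and the same 'discourse backup' cli command as above):
docker exec app bash -c 'cd /var/www/discourse && discourse backup'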
- related: Discourse is a Javascript-heavy beast that's basically crashing my browser. Just typing into a textarea while updating the wiki in one tab suffers significant delays when I have only 8 tabs open on meta.discourse.org. I need a way to better throttle backgrounded tabs, but I could never make sense of how to do this in firefox. So I posted a question about this to SuperUser. I want backgrounded tabs to have exactly 0.00% CPU usage for at least 30 minutes once they're no longer in the foreground. If I'm not looking at it, it shouldn't be slowing down my computer. https://superuser.com/questions/1500289/how-to-aggressively-throttle-background-tabs-in-firefox-using-dom-min-background
Thr Nov 07, 2019
- Chris made a video tutorial for how to download our Aug 2019 .zim wiki archive from archive.org, put it on an sd card, and view it from Kiwix on android.
- I wrote a draft of an article and asked Chris to publish his video on youtube so we can embed it and publish the article on www.opensourceecology.org
- ...
- continuing from oct 28, let's do our best attempt to validate the damn unsigned docker install script
- I downloaded it again from a vpn connection; the non-cryptographic integrity hash matches from my last download
root@disp3084:~# curl --tlsv1.2 --proto =https --location https://get.docker.com/ > get-docker.sh % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 13216 100 13216 0 0 6487 0 0:00:02 0:00:02 --:--:-- 6487 root@disp3084:~# sha384sum get-docker.sh 68041f4b75f5485834c53c549d1682f1d36af864ac2fde5eba1d7bf401fd44db3a6c79ba32d7f85c6778aea5897182c4 get-docker.sh root@disp3084:~#
- I downloaded it again from the staging server through the clearnet; it matches again
[maltfield@osestaging1 tmp]$ curl --tlsv1.2 --proto =https --location https://get.docker.com/ > get-docker.sh % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 13216 100 13216 0 0 50878 0 --:--:-- --:--:-- --:--:-- 50830 [maltfield@osestaging1 tmp]$ sha384sum get-docker.sh 68041f4b75f5485834c53c549d1682f1d36af864ac2fde5eba1d7bf401fd44db3a6c79ba32d7f85c6778aea5897182c4 get-docker.sh [maltfield@osestaging1 tmp]$
- and, finally, I did another download from the tor network; it matches too
user@host:~$ curl --tlsv1.2 --proto =https --location https://get.docker.com/ > get-docker.sh % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 13216 100 13216 0 0 4673 0 0:00:02 0:00:02 --:--:-- 4673 user@host:~$ sha384sum get-docker.sh 68041f4b75f5485834c53c549d1682f1d36af864ac2fde5eba1d7bf401fd44db3a6c79ba32d7f85c6778aea5897182c4 get-docker.sh user@host:~$
- ok, I'm satisfied that I got the correct file that's being served by get.docker.com, though I cannot have any greater than 0% confidence that it's actually produced by the docker team, since it has no cryptographic signature. Next step is to read the file's contents and see what it's doing.
- ugh, the install script escalates its privileges to root. not great, but reasonable for an install script.
- ok, so the installer sideloads a gpg key into the apt/yum keyring then attempts to install its packages. For centos, the gpg key comes from here
$DOWNLOAD_URL/linux/$lsb_dist/$REPO_FILE
- which should translate at runtime to
https://download.docker.com/linux/centos/docker-ce.repo
- I did the same thing with this file as above; here's from the clearnet on the staging box directly
[root@osestaging1 discourse]# curl --tlsv1.2 --proto =https --location --remote-name https://download.docker.com/linux/centos/docker-ce.repo % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 2424 100 2424 0 0 11862 0 --:--:-- --:--:-- --:--:-- 11940 [root@osestaging1 discourse]# sha384sum docker-ce.repo 483187126d28ca55ff4c6554ab8847c8dcdf3f06b211ea6f800cc7b216088c785373830897ce0d7b202ad1f33edc1dc1 docker-ce.repo [root@osestaging1 discourse]#
- here's from the VPN; it matches
user@disp8990:~$ curl --tlsv1.2 --proto =https --location --remote-name https://download.docker.com/linux/centos/docker-ce.repo % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 2424 100 2424 0 0 602 0 0:00:04 0:00:04 --:--:-- 602 user@disp8990:~$ sha384sum docker-ce.repo 483187126d28ca55ff4c6554ab8847c8dcdf3f06b211ea6f800cc7b216088c785373830897ce0d7b202ad1f33edc1dc1 docker-ce.repo user@disp8990:~$
- and from TOR; it matches too
user@host:~$ curl --tlsv1.2 --proto =https --location --remote-name https://download.docker.com/linux/centos/docker-ce.repo % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 2424 100 2424 0 0 536 0 0:00:04 0:00:04 --:--:-- 536 user@host:~$ sha384sum docker-ce.repo 483187126d28ca55ff4c6554ab8847c8dcdf3f06b211ea6f800cc7b216088c785373830897ce0d7b202ad1f33edc1dc1 docker-ce.repo user@host:~$
- the file itself defines a ton of repos, but only the first one is enabled
[root@osestaging1 discourse]# grep enabled docker-ce.repo enabled=1 enabled=0 enabled=0 enabled=0 enabled=0 enabled=0 enabled=0 enabled=0 enabled=0 enabled=0 enabled=0 enabled=0 [root@osestaging1 discourse]# head docker-ce.repo [docker-ce-stable] name=Docker CE Stable - $basearch baseurl=https://download.docker.com/linux/centos/7/$basearch/stable enabled=1 gpgcheck=1 gpgkey=https://download.docker.com/linux/centos/gpg [docker-ce-stable-debuginfo] name=Docker CE Stable - Debuginfo $basearch baseurl=https://download.docker.com/linux/centos/7/debug-$basearch/stable [root@osestaging1 discourse]#
- and again, the same tests for the gpg key
[root@osestaging1 discourse]# curl --tlsv1.2 --proto =https --location https://download.docker.com/linux/centos/gpg > docker.gpg % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 1627 100 1627 0 0 8413 0 --:--:-- --:--:-- --:--:-- 8430 [root@osestaging1 discourse]# sha384sum docker.gpg e3837773edabb1aef62d8b89bbfe3a3c80e008fa312c1d7791606cd303e35d9c17208598fad4eb47fa0374ce027e4c17 docker.gpg [root@osestaging1 discourse]# gpg --keyid-format 0xlong docker.gpg pub 4096R/0xC52FEB6B621E9F35 2017-02-22 Docker Release (CE rpm) <docker@docker.com> [root@osestaging1 discourse]# cat docker.gpg | gpg --keyid-format 0xlong --list-packets :public key packet: version 4, algo 1, created 1487791233, expires 0 pkey[0]: [4096 bits] pkey[1]: [17 bits] keyid: C52FEB6B621E9F35 :user ID packet: "Docker Release (CE rpm) <docker@docker.com>" :signature packet: algo 1, keyid C52FEB6B621E9F35 version 4, created 1487792760, md5len 0, sigclass 0x13 digest algo 10, begin of digest e8 2d hashed subpkt 2 len 4 (sig created 2017-02-22) hashed subpkt 27 len 1 (key flags: 2F) hashed subpkt 11 len 4 (pref-sym-algos: 9 8 7 3) hashed subpkt 21 len 4 (pref-hash-algos: 10 9 8 11) hashed subpkt 22 len 4 (pref-zip-algos: 2 3 1 0) hashed subpkt 30 len 1 (features: 01) hashed subpkt 23 len 1 (key server preferences: 80) subpkt 16 len 8 (issuer key ID C52FEB6B621E9F35) data: [4094 bits] [root@osestaging1 discourse]#
- and from the vpn; it matches
user@disp8990:~$ curl --tlsv1.2 --proto =https --location https://download.docker.com/linux/centos/gpg > docker.gpg % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 1627 100 1627 0 0 545 0 0:00:02 0:00:02 --:--:-- 545 user@disp8990:~$ sha384sum docker.gpg e3837773edabb1aef62d8b89bbfe3a3c80e008fa312c1d7791606cd303e35d9c17208598fad4eb47fa0374ce027e4c17 docker.gpg user@disp8990:~$ gpg --keyid-format 0xlong docker.gpg gpg: keybox '/home/user/.gnupg/pubring.kbx' created gpg: WARNING: no command supplied. Trying to guess what you mean ... pub rsa4096/0xC52FEB6B621E9F35 2017-02-22 [SCEA] 060A61C51B558A7F742B77AAC52FEB6B621E9F35 uid Docker Release (CE rpm) <docker@docker.com> user@disp8990:~$ cat docker.gpg | gpg --keyid-format 0xlong --list-packets # off=0 ctb=99 tag=6 hlen=3 plen=525 :public key packet: version 4, algo 1, created 1487791233, expires 0 pkey[0]: [4096 bits] pkey[1]: [17 bits] keyid: C52FEB6B621E9F35 # off=528 ctb=b4 tag=13 hlen=2 plen=43 :user ID packet: "Docker Release (CE rpm) <docker@docker.com>" # off=573 ctb=89 tag=2 hlen=3 plen=567 :signature packet: algo 1, keyid C52FEB6B621E9F35 version 4, created 1487792760, md5len 0, sigclass 0x13 digest algo 10, begin of digest e8 2d hashed subpkt 2 len 4 (sig created 2017-02-22) hashed subpkt 27 len 1 (key flags: 2F) hashed subpkt 11 len 4 (pref-sym-algos: 9 8 7 3) hashed subpkt 21 len 4 (pref-hash-algos: 10 9 8 11) hashed subpkt 22 len 4 (pref-zip-algos: 2 3 1 0) hashed subpkt 30 len 1 (features: 01) hashed subpkt 23 len 1 (keyserver preferences: 80) subpkt 16 len 8 (issuer key ID C52FEB6B621E9F35) data: [4094 bits] user@disp8990:~$
- and from tor; it matches too
user@host:~$ curl --tlsv1.2 --proto =https --location https://download.docker.com/linux/centos/gpg > docker.gpg % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 1627 100 1627 0 0 819 0 0:00:01 0:00:01 --:--:-- 819 user@host:~$ sha384sum docker.gpg e3837773edabb1aef62d8b89bbfe3a3c80e008fa312c1d7791606cd303e35d9c17208598fad4eb47fa0374ce027e4c17 docker.gpg user@host:~$ gpg --keyid-format 0xlong docker.gpg gpg: keybox '/home/user/.gnupg/pubring.kbx' created gpg: WARNING: no command supplied. Trying to guess what you mean ... pub rsa4096/0xC52FEB6B621E9F35 2017-02-22 [SCEA] 060A61C51B558A7F742B77AAC52FEB6B621E9F35 uid Docker Release (CE rpm) <docker@docker.com> user@host:~$ cat docker.gpg | gpg --keyid-format 0xlong --list-packets # off=0 ctb=99 tag=6 hlen=3 plen=525 :public key packet: version 4, algo 1, created 1487791233, expires 0 pkey[0]: [4096 bits] pkey[1]: [17 bits] keyid: C52FEB6B621E9F35 # off=528 ctb=b4 tag=13 hlen=2 plen=43 :user ID packet: "Docker Release (CE rpm) <docker@docker.com>" # off=573 ctb=89 tag=2 hlen=3 plen=567 :signature packet: algo 1, keyid C52FEB6B621E9F35 version 4, created 1487792760, md5len 0, sigclass 0x13 digest algo 10, begin of digest e8 2d hashed subpkt 2 len 4 (sig created 2017-02-22) hashed subpkt 27 len 1 (key flags: 2F) hashed subpkt 11 len 4 (pref-sym-algos: 9 8 7 3) hashed subpkt 21 len 4 (pref-hash-algos: 10 9 8 11) hashed subpkt 22 len 4 (pref-zip-algos: 2 3 1 0) hashed subpkt 30 len 1 (features: 01) hashed subpkt 23 len 1 (keyserver preferences: 80) subpkt 16 len 8 (issuer key ID C52FEB6B621E9F35) data: [4094 bits] user@host:~$
- I imported this key into my personal keyring for future safe-keeping. Here's the full fingerprint
user@personal:~$ gpg --list-keys docker pub rsa4096/0xC52FEB6B621E9F35 2017-02-22 [SCEA] Key fingerprint = 060A 61C5 1B55 8A7F 742B 77AA C52F EB6B 621E 9F35 uid [ unknown] Docker Release (CE rpm) <docker@docker.com> user@personal:~$
- there were no entries for the uid 'docker@docker.com' on the new keyserver https://keys.openpgp.org/search?q=docker%40docker.com
- while the original sks key server's entry for the uid is fucking huge https://sks-keyservers.net/pks/lookup?op=get&search=docker@docker.com
- I'm not a huge fan of specifying the location of a gpg key as a URL; our other repo files specify gpg key files that are located at /etc/pki/rpm-gpg/ on disk
[root@osestaging1 yum.repos.d]# grep 'gpgkey=' * | sort -u CentOS-Base.repo:gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7 CentOS-CR.repo:gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7 CentOS-Debuginfo.repo:gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-Debug-7 CentOS-fasttrack.repo:gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7 CentOS-Media.repo:gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7 CentOS-Sources.repo:gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7 CentOS-Vault.repo:gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7 epel.repo:gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7 epel-testing.repo:gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7 webmin.repo:gpgkey=http://www.webmin.com/jcameron-key.asc webtatic-archive.repo:gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-webtatic-el7 webtatic.repo:gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-webtatic-el7 webtatic-testing.repo:gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-webtatic-el7 [root@osestaging1 yum.repos.d]# ls -lah /etc/pki/rpm-gpg total 28K drwxr-xr-x. 2 root root 4.0K Oct 27 2017 . drwxr-xr-x. 11 root root 4.0K Sep 22 2017 .. -rw-r--r--. 1 root root 1.7K Aug 30 2017 RPM-GPG-KEY-CentOS-7 -rw-r--r--. 1 root root 1004 Aug 30 2017 RPM-GPG-KEY-CentOS-Debug-7 -rw-r--r--. 1 root root 1.7K Aug 30 2017 RPM-GPG-KEY-CentOS-Testing-7 -rw-r--r--. 1 root root 1.7K Oct 2 2017 RPM-GPG-KEY-EPEL-7 -rw-r--r--. 1 root root 1.6K Oct 8 2014 RPM-GPG-KEY-webtatic-el7 [root@osestaging1 yum.repos.d]#
- I found an issue about this recommending to add the full fingerprint to the install script; the issue was closed, but my install script has no fingerprint var in it.. https://github.com/moby/moby/issues/17436
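- for my own notes, the check I wish the install script did is a fingerprint pin; a hedged sketch (assumes a modern gnupg >= 2.1 for the 'show-only' import option; the expected value is the fingerprint I recorded above):
EXPECTED="060A61C51B558A7F742B77AAC52FEB6B621E9F35"
ACTUAL=$(gpg --with-colons --import-options show-only --import < docker.gpg | awk -F: '$1=="fpr" {print $10; exit}')
[ "$ACTUAL" = "$EXPECTED" ] || { echo "FINGERPRINT MISMATCH"; exit 1; }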
- the sks query finally finished downloading; it's 192M!
user@disp8990:~$ curl --tlsv1.2 --proto =https --location "https://sks-keyservers.net/pks/lookup?op=get&search=docker@docker.com" > docker-sks.gpg % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 191M 100 191M 0 0 308k 0 0:10:34 0:10:34 --:--:-- 319k user@disp8990:~$ user@disp8990:~$ du -sh docker-sks.gpg 192M docker-sks.gpg user@disp8990:~$
- it doesn't look like a spammed packet, and it doesn't even include the key from above; I got 4x keys with a ton of signatures it seems
user@disp8990:~$ gpg --import docker-sks.gpg gpg: /home/user/.gnupg/trustdb.gpg: trustdb created gpg: key D7B2C1254AE90ACE: public key "vicky <vicky.kwan@docker.com>" imported gpg: key 48B1173B7CDE3ACB: public key "Elias Uriegas <eliasuriegas@gmail.com>" imported gpg: key 8E66A2E3A1C1CAD6: public key "ekino-gradle-docker-plugin <opensource@ekino.com>" imported gpg: key C755EC7A05D64F1E: public key "Oliver Faßbender <olli@intrepid.de>" imported gpg: packet(13) too large gpg: read_block: read error: Invalid packet gpg: no valid OpenPGP data found. gpg: import from 'docker-sks.gpg' failed: Invalid keyring gpg: Total number processed: 4 gpg: imported: 4 user@disp8990:~$ gpg --list-keys /home/user/.gnupg/pubring.kbx ----------------------------- pub rsa4096 2019-08-12 [SC] [expires: 2035-08-08] 47545A72C300CAB7A05A4E92D7B2C1254AE90ACE uid [ unknown] vicky <vicky.kwan@docker.com> sub rsa4096 2019-08-12 [E] [expires: 2035-08-08] pub rsa4096 2019-07-11 [SC] DEC839EFD2644EC6CE93CB8948B1173B7CDE3ACB uid [ unknown] Elias Uriegas <eliasuriegas@gmail.com> uid [ unknown] Elias Uriegas <eli.uriegas@docker.com> sub rsa2048 2019-07-11 [SA] [expires: 2027-07-09] sub rsa2048 2019-07-11 [E] [expires: 2027-07-09] pub rsa2048 2019-06-13 [SC] [expires: 2021-06-12] FF3F45A5AF2FC3FE927BF3338E66A2E3A1C1CAD6 uid [ unknown] ekino-gradle-docker-plugin <opensource@ekino.com> sub rsa2048 2019-06-13 [E] [expires: 2021-06-12] pub rsa4096 2019-04-12 [SC] [expires: 2024-04-12] 594DD4D9F6E9CA5738617BF6C755EC7A05D64F1E uid [ unknown] Oliver Faßbender <olli@intrepid.de> uid [ unknown] Oliver Faßbender <docker@intrepid.de> uid [ unknown] Oliver Faßbender <github@intrepid.de> uid [ unknown] Oliver Faßbender <foxromeo75@gmail.com> sub rsa4096 2019-04-12 [E] user@disp8990:~$
- ugh, TOFU sucks. There is no path to validation here; I'll just copy the one download.docker.com gave me and put it in /etc/pki/rpm-gpg :(
[root@osestaging1 discourse]# sha384sum docker.gpg e3837773edabb1aef62d8b89bbfe3a3c80e008fa312c1d7791606cd303e35d9c17208598fad4eb47fa0374ce027e4c17 docker.gpg [root@osestaging1 discourse]# cat docker.gpg -----BEGIN PGP PUBLIC KEY BLOCK----- mQINBFit5IEBEADDt86QpYKz5flnCsOyZ/fk3WwBKxfDjwHf/GIflo+4GWAXS7wJ 1PSzPsvSDATV10J44i5WQzh99q+lZvFCVRFiNhRmlmcXG+rk1QmDh3fsCCj9Q/yP w8jn3Hx0zDtz8PIB/18ReftYJzUo34COLiHn8WiY20uGCF2pjdPgfxE+K454c4G7 gKFqVUFYgPug2CS0quaBB5b0rpFUdzTeI5RCStd27nHCpuSDCvRYAfdv+4Y1yiVh KKdoe3Smj+RnXeVMgDxtH9FJibZ3DK7WnMN2yeob6VqXox+FvKYJCCLkbQgQmE50 uVK0uN71A1mQDcTRKQ2q3fFGlMTqJbbzr3LwnCBE6hV0a36t+DABtZTmz5O69xdJ WGdBeePCnWVqtDb/BdEYz7hPKskcZBarygCCe2Xi7sZieoFZuq6ltPoCsdfEdfbO +VBVKJnExqNZCcFUTEnbH4CldWROOzMS8BGUlkGpa59Sl1t0QcmWlw1EbkeMQNrN spdR8lobcdNS9bpAJQqSHRZh3cAM9mA3Yq/bssUS/P2quRXLjJ9mIv3dky9C3udM +q2unvnbNpPtIUly76FJ3s8g8sHeOnmYcKqNGqHq2Q3kMdA2eIbI0MqfOIo2+Xk0 rNt3ctq3g+cQiorcN3rdHPsTRSAcp+NCz1QF9TwXYtH1XV24A6QMO0+CZwARAQAB tCtEb2NrZXIgUmVsZWFzZSAoQ0UgcnBtKSA8ZG9ja2VyQGRvY2tlci5jb20+iQI3 BBMBCgAhBQJYrep4AhsvBQsJCAcDBRUKCQgLBRYCAwEAAh4BAheAAAoJEMUv62ti Hp816C0P/iP+1uhSa6Qq3TIc5sIFE5JHxOO6y0R97cUdAmCbEqBiJHUPNQDQaaRG VYBm0K013Q1gcJeUJvS32gthmIvhkstw7KTodwOM8Kl11CCqZ07NPFef1b2SaJ7l TYpyUsT9+e343ph+O4C1oUQw6flaAJe+8ATCmI/4KxfhIjD2a/Q1voR5tUIxfexC /LZTx05gyf2mAgEWlRm/cGTStNfqDN1uoKMlV+WFuB1j2oTUuO1/dr8mL+FgZAM3 ntWFo9gQCllNV9ahYOON2gkoZoNuPUnHsf4Bj6BQJnIXbAhMk9H2sZzwUi9bgObZ XO8+OrP4D4B9kCAKqqaQqA+O46LzO2vhN74lm/Fy6PumHuviqDBdN+HgtRPMUuao xnuVJSvBu9sPdgT/pR1N9u/KnfAnnLtR6g+fx4mWz+ts/riB/KRHzXd+44jGKZra IhTMfniguMJNsyEOO0AN8Tqcl0eRBxcOArcri7xu8HFvvl+e+ILymu4buusbYEVL GBkYP5YMmScfKn+jnDVN4mWoN1Bq2yMhMGx6PA3hOvzPNsUoYy2BwDxNZyflzuAi g59mgJm2NXtzNbSRJbMamKpQ69mzLWGdFNsRd4aH7PT7uPAURaf7B5BVp3UyjERW 5alSGnBqsZmvlRnVH5BDUhYsWZMPRQS9rRr4iGW0l+TH+O2VJ8aQ =0Zqq -----END PGP PUBLIC KEY BLOCK----- [root@osestaging1 discourse]# cp docker.gpg /etc/pki/rpm-gpg/ [root@osestaging1 discourse]# chown root:root /etc/pki/rpm-gpg/docker.gpg [root@osestaging1 discourse]# chmod 0644 /etc/pki/rpm-gpg/docker.gpg [root@osestaging1 discourse]#
- and I replaced the repo file to use this gpg key
[root@osestaging1 discourse]# sed 's^gpgkey=\(.*\)^gpgkey=file:///etc/pki/rpm-gpg/docker.gpg^' docker-ce.repo > /etc/yum.repos.d/docker-ce.repo [root@osestaging1 discourse]# head /etc/yum.repos.d/docker-ce.repo [docker-ce-stable] name=Docker CE Stable - $basearch baseurl=https://download.docker.com/linux/centos/7/$basearch/stable enabled=1 gpgcheck=1 gpgkey=file:///etc/pki/rpm-gpg/docker.gpg [docker-ce-stable-debuginfo] name=Docker CE Stable - Debuginfo $basearch baseurl=https://download.docker.com/linux/centos/7/debug-$basearch/stable [root@osestaging1 discourse]#
- following the script, I installed the 'yum-utils' package; it also looks like I set up the docker-ce repo files correctly
[root@osestaging1 yum.repos.d]# yum install yum-utils Loaded plugins: fastestmirror, replace docker-ce-stable | 3.5 kB 00:00:00 (1/2): docker-ce-stable/x86_64/primary_db | 37 kB 00:00:00 (2/2): docker-ce-stable/x86_64/updateinfo | 55 B 00:00:00 Loading mirror speeds from cached hostfile * base: linux.darkpenguin.net * epel: mirror.23media.com * extras: mirror.softaculous.com * updates: mirror.alpix.eu * webtatic: uk.repo.webtatic.com Resolving Dependencies --> Running transaction check ---> Package yum-utils.noarch 0:1.1.31-52.el7 will be installed --> Processing Dependency: python-kitchen for package: yum-utils-1.1.31-52.el7.noarch --> Processing Dependency: libxml2-python for package: yum-utils-1.1.31-52.el7.noarch --> Running transaction check ---> Package libxml2-python.x86_64 0:2.9.1-6.el7_2.3 will be installed ---> Package python-kitchen.noarch 0:1.1.1-5.el7 will be installed --> Finished Dependency Resolution Dependencies Resolved ============================================================================================================================= Package Arch Version Repository Size ============================================================================================================================= Installing: yum-utils noarch 1.1.31-52.el7 base 121 k Installing for dependencies: libxml2-python x86_64 2.9.1-6.el7_2.3 base 247 k python-kitchen noarch 1.1.1-5.el7 base 267 k Transaction Summary ============================================================================================================================= Install 1 Package (+2 Dependent packages) Total download size: 635 k Installed size: 3.2 M Is this ok [y/d/N]: y ... Installed: yum-utils.noarch 0:1.1.31-52.el7 Dependency Installed: libxml2-python.x86_64 0:2.9.1-6.el7_2.3 python-kitchen.noarch 0:1.1.1-5.el7 Complete! [root@osestaging1 yum.repos.d]#
- and I installed the docker-ce package
[root@osestaging1 yum.repos.d]# yum install docker-ce ... Install 1 Package (+2 Dependent packages) Total download size: 87 M Installed size: 362 M Is this ok [y/d/N]: y Downloading packages: warning: /var/cache/yum/x86_64/7/docker-ce-stable/packages/docker-ce-19.03.4-3.el7.x86_64.rpm: Header V4 RSA/SHA512 Signature, key ID 621e9f35: NOKEY Public key for docker-ce-19.03.4-3.el7.x86_64.rpm is not installed (1/3): docker-ce-19.03.4-3.el7.x86_64.rpm | 24 MB 00:00:01 (2/3): containerd.io-1.2.10-3.2.el7.x86_64.rpm | 23 MB 00:00:01 (3/3): docker-ce-cli-19.03.4-3.el7.x86_64.rpm | 39 MB 00:00:00 ----------------------------------------------------------------------------------------------------------------------------- Total 40 MB/s | 87 MB 00:00:02 Retrieving key from file:///etc/pki/rpm-gpg/docker.gpg Importing GPG key 0x621E9F35: Userid : "Docker Release (CE rpm) <docker@docker.com>" Fingerprint: 060a 61c5 1b55 8a7f 742b 77aa c52f eb6b 621e 9f35 From : /etc/pki/rpm-gpg/docker.gpg Is this ok [y/N]: y Running transaction check Running transaction test Transaction test succeeded Running transaction Installing : containerd.io-1.2.10-3.2.el7.x86_64 1/3 Installing : 1:docker-ce-cli-19.03.4-3.el7.x86_64 2/3 Installing : 3:docker-ce-19.03.4-3.el7.x86_64 3/3 Verifying : 1:docker-ce-cli-19.03.4-3.el7.x86_64 1/3 Verifying : 3:docker-ce-19.03.4-3.el7.x86_64 2/3 Verifying : containerd.io-1.2.10-3.2.el7.x86_64 3/3 Installed: docker-ce.x86_64 3:19.03.4-3.el7 Dependency Installed: containerd.io.x86_64 0:1.2.10-3.2.el7 docker-ce-cli.x86_64 1:19.03.4-3.el7 Complete! [root@osestaging1 yum.repos.d]#
- this time I have docker v19.03.4
[root@osestaging1 yum.repos.d]# docker -v Docker version 19.03.4, build 9013bf583a [root@osestaging1 yum.repos.d]#
- and I started the docker daemon & attempted to run the discourse setup again; it failed
[root@osestaging1 discourse]# service docker start Redirecting to /bin/systemctl start docker.service [root@osestaging1 discourse]# [root@osestaging1 discourse]# ./discourse-setup ... Hostname for your Discourse? [discourse.opensourceecology.org]: Email address for admin account(s)? [michael@opensourceecology.org]: SMTP server address? [localhost]: SMTP port? [25]: SMTP user name? [discourse@opensouceecology.org]: SMTP password? [none]: Optional email address for setting up Let's Encrypt? (ENTER to skip) [me@example.com]: Does this look right? Hostname : discourse.opensourceecology.org Email : michael@opensourceecology.org SMTP address : localhost SMTP port : 25 SMTP username : discourse@opensouceecology.org SMTP password : none ENTER to continue, 'n' to try again, Ctrl+C to exit: Configuration file at updated successfully! Updates successful. Rebuilding in 5 seconds. Building app /bin/docker: Error response from daemon: OCI runtime create failed: container_linux.go:346: starting container process caused "apply caps: operation not permitted": unknown. Your Docker installation is not working correctly See: https://meta.discourse.org/t/docker-error-on-bootstrap/13657/18?u=sam [root@osestaging1 discourse]#
- google suggests that this is because we're running a container inside a container. Namely docker in an lxc container. god damn it.
- the link from the output suggests a test of hello world in docker; that fails too
[root@osestaging1 discourse]# docker run -it --rm hello-world Unable to find image 'hello-world:latest' locally latest: Pulling from library/hello-world Digest: sha256:c3b4ada4687bbaa170745b3e4dd8ac3f194ca95b2d0518b417fb47e5879d9b5f Status: Downloaded newer image for hello-world:latest docker: Error response from daemon: OCI runtime create failed: container_linux.go:346: starting container process caused "apply caps: operation not permitted": unknown. [root@osestaging1 discourse]#
- I added a line to the osestaging1 lxc config file (/var/lib/lxc/osestaging1/config) to keep the mknod capability https://serverfault.com/questions/946854/docker-inside-lxc-starting-container-process-caused-apply-caps-operation-not-p
lxc.cap.keep = mknod
- well, that failed; it wants either keep or drop. I tried again with an empty 'drop' setting to clear all drops https://linuxcontainers.org/fr/lxc/manpages/man5/lxc.container.conf.5.html
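- i.e. the line I ended up with in /var/lib/lxc/osestaging1/config (per my reading of the man page, an empty value clears the default list of dropped capabilities):
lxc.cap.drop =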
- I started the osestaging1 lxc container again, and that appears to have worked
[root@osestaging1 ~]# service docker start Redirecting to /bin/systemctl start docker.service [root@osestaging1 ~]# docker run -it --rm hello-world Hello from Docker! This message shows that your installation appears to be working correctly. To generate this message, Docker took the following steps: 1. The Docker client contacted the Docker daemon. 2. The Docker daemon pulled the "hello-world" image from the Docker Hub. (amd64) 3. The Docker daemon created a new container from that image which runs the executable that produces the output you are currently reading. 4. The Docker daemon streamed that output to the Docker client, which sent it to your terminal. To try something more ambitious, you can run an Ubuntu container with: $ docker run -it ubuntu bash Share images, automate workflows, and more with a free Docker ID: https://hub.docker.com/ For more examples and ideas, visit: https://docs.docker.com/get-started/ [root@osestaging1 ~]#
- I ran the discourse setup script again; it ran for a *while* but ultimately died because it tried to bind on port 443, where nginx is listening
[root@osestaging1 discourse]# ./discourse-setup ... sha256:45984f2db03ab095892062799571bef5ec7b89a66e05fe9389677e135884cd32 Error response from daemon: container a23752a126c179518a4ad5bdeeb431082167f2d4102875d07651e12fabf046da: driver "overlay2" failed to remove root filesystem: unlinkat /var/lib/docker/overlay2/af51149685f75bb3e62c402c45a6683a49d5e254d620a919fa3497843d9b6aec/merged: device or resource busy + /bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=2 -e UNICORN_SIDEKIQS=1 -e RUBY_GLOBAL_METHOD_CACHE_SIZE=131072 -e RUBY_GC_HEAP_GROWTH_MAX_SLOTS=40000 -e RUBY_GC_HEAP_INIT_SLOTS=400000 -e RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.5 -e DISCOURSE_DB_SOCKET=/var/run/postgresql -e DISCOURSE_DB_HOST= -e DISCOURSE_DB_PORT= -e DISCOURSE_HOSTNAME=discourse.opensourceecology.org -e DISCOURSE_DEVELOPER_EMAILS=michael@opensourceecology.org -e DISCOURSE_SMTP_ADDRESS=localhost -e DISCOURSE_SMTP_PORT=25 -e DISCOURSE_SMTP_USER_NAME=discourse@opensouceecology.org -e DISCOURSE_SMTP_PASSWORD=none -e DISCOURSE_SMTP_AUTHENTICATION=none -e DISCOURSE_SMTP_OPENSSL_VERIFY_MODE=none -e DISCOURSE_SMTP_ENABLE_START_TLS=false -h osestaging1-app -e DOCKER_HOST_IP=172.17.0.1 --name app -t -p 80:80 -p 443:443 -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:53:2a:01:9b:c2 local_discourse/app /sbin/boot 1ba1c1a440db8884093bd001d60955df85cd8b3e655e00b4a0c8c8659f56b9e0 /bin/docker: Error response from daemon: driver failed programming external connectivity on endpoint app (28487c717e14e86e50b3f3caa3e2d015d5a56248249587f6e96d00302f5becb9): Error starting userland proxy: listen tcp 0.0.0.0:443: bind: address already in use. [root@osestaging1 discourse]#
- I manually edited the containers/app.yml file and replaced the "expose" ports of 80 & 443 with 8020, similar to all our other apache backends (which are 8000 & 8010 so far)
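- reconstructing from the docker run output below (I didn't capture the diff), the edited section of containers/app.yml should now read something like:
expose:
  - "8020:8020"   # was "80:80" and "443:443"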
- I re-ran the discourse-setup script. this time no error
[root@osestaging1 discourse]# ./discourse-setup ... 166:M 07 Nov 2019 11:51:25.124 # Redis is now ready to exit, bye bye... 2019-11-07 11:51:25.255 UTC [49] LOG: database system is shut down sha256:201efc3c86c4597373ac995d85d9470e0765ef3a1efcf65720724adeda96e6ce d9e23c6be259dcda6aa66756c207c12cb45c4cdcb886619ce7d6a8ccd114ebb5 Removing old container + /bin/docker rm app app + /bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=2 -e UNICORN_SIDEKIQS=1 -e RUBY_GLOBAL_METHOD_CACHE_SIZE=131072 -e RUBY_GC_HEAP_GROWTH_MAX_SLOTS=40000 -e RUBY_GC_HEAP_INIT_SLOTS=400000 -e RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.5 -e DISCOURSE_DB_SOCKET=/var/run/postgresql -e DISCOURSE_DB_HOST= -e DISCOURSE_DB_PORT= -e DISCOURSE_HOSTNAME=discourse.opensourceecology.org -e DISCOURSE_DEVELOPER_EMAILS=michael@opensourceecology.org -e DISCOURSE_SMTP_ADDRESS=localhost -e DISCOURSE_SMTP_PORT=25 -e DISCOURSE_SMTP_USER_NAME=discourse@opensouceecology.org -e DISCOURSE_SMTP_PASSWORD=none -e DISCOURSE_SMTP_AUTHENTICATION=none -e DISCOURSE_SMTP_OPENSSL_VERIFY_MODE=none -e DISCOURSE_SMTP_ENABLE_START_TLS=false -h osestaging1-app -e DOCKER_HOST_IP=172.17.0.1 --name app -t -p 8020:8020 -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:53:2a:01:9b:c2 local_discourse/app /sbin/boot 9363d1d1e1944b8100eebb033446c261fde7c08850877c30d755e7d9faf7c633 [root@osestaging1 discourse]#
- it's running, but--ugh--it's bound to all interfaces
[root@osestaging1 discourse]# ss -plan | grep -i 8020 tcp LISTEN 0 128 :::8020 :::* users:(("docker-proxy",pid=18381,fd=4)) [root@osestaging1 discourse]#
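- untested note-to-self: docker publishes ports on 0.0.0.0 by default; if we were keeping a published port (we won't be; the socket setup below avoids it entirely), the app.yml 'expose' entry could pin it to loopback, assuming the launcher passes the string through to docker's -p verbatim:
expose:
  - "127.0.0.1:8020:8020"   # i.e. -p 127.0.0.1:8020:8020 instead of -p 8020:8020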
- can I visit the site? I created a new dns entry in the osedev1:/etc/hosts file
[root@osedev1 etc]# tail /etc/hosts 127.0.0.1 localhost.localdomain localhost 127.0.0.1 localhost4.localdomain4 localhost4 # The following lines are desirable for IPv6 capable hosts ::1 osedev1 osedev1 ::1 localhost.localdomain localhost ::1 localhost6.localdomain6 localhost6 # staging 10.241.189.11 www.opensourceecology.org opensourceecology.org awstats.opensourceecology.org fef.opensourceecology.org forum.opensourceecology.org microfactory.opensourceecology.org munin.opensourceecology.org opensourceecology.org oswh.opensourceecology.org phplist.opensourceecology.org store.opensourceecology.org wiki.opensourceecology.org awstats.openbuildinginstitute.org openbuildinginstitute.org seedhome.openbuildinginstitute.org www.openbuildinginstitute.org discourse.opensourceecology.org [root@osedev1 etc]# service dnsmasq restart Redirecting to /bin/systemctl restart dnsmasq.service [root@osedev1 etc]# dig @127.0.0.1 discourse.opensourceecology.org ; <<>> DiG 9.11.4-P2-RedHat-9.11.4-9.P2.el7 <<>> @127.0.0.1 discourse.opensourceecology.org ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29089 ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ;; QUESTION SECTION: ;discourse.opensourceecology.org. IN A ;; ANSWER SECTION: discourse.opensourceecology.org. 0 IN A 10.241.189.11 ;; Query time: 0 msec ;; SERVER: 127.0.0.1#53(127.0.0.1) ;; WHEN: Thu Nov 07 12:57:18 CET 2019 ;; MSG SIZE rcvd: 76 [root@osedev1 etc]#
- according to the install guide, I should now just be able to access the site. well, it doesn't work https://github.com/discourse/discourse/blob/master/docs/INSTALL-cloud.md
- I added the hostname to /etc/hosts on osestaging1 too, and even *there* it fails
[root@osestaging1 20191107]# curl -I discourse.opensourceecology.org:8020/ curl: (7) Failed connect to discourse.opensourceecology.org:8020; Connection refused [root@osestaging1 20191107]#
- so I guess the 'expose' section only pokes holes in the discourse docker firewall; it doesn't actually change the config? Anyway, there's a guide on how to set this up https://meta.discourse.org/t/running-other-websites-on-the-same-machine-as-discourse/17247
- I added the 'templates/web.socketed.template.yml' template to the app.yml template list per the link above
- "templates/web.socketed.template.yml"
- then I can set up my nginx proxy to forward to a socket file (unix:/var/discourse/shared/standalone/nginx.http.sock) as opposed to a port; that works. I created a new nginx config file for the vhost at /etc/nginx/conf.d/discourse.opensourceecology.org from the fef config file and made some changes, such as commenting out the varnish bits (I'll try to get that working later after this basic POC is operational) and replacing them with the recommended proxy lines from the above link
[root@osestaging1 conf.d]# cat discourse.opensourceecology.org ################################################################################ # File: discourse.opensourceecology.org.conf # Version: 0.1 # Purpose: Internet-listening web server for truncating https, basic DOS # protection, and passing to varnish cache (varnish then passes to # apache) # Author: Michael Altfield <michael@opensourceecology.org> # Created: 2019-11-07 # Updated: 2019-11-07 ################################################################################ # this whole site is a subdomain, so the below block for redirecting a naked # domain does not apply here #server { # # redirect the naked domain to 'www' # #log_format main '$remote_addr - $remote_user [$time_local] "$request" ' # # '$status $body_bytes_sent "$http_referer" ' # # '"$http_user_agent" "$http_x_forwarded_for"'; # #access_log /var/log/nginx/www.opensourceecology.org/access.log main; # #error_log /var/log/nginx/www.opensourceecology.org/error.log main; # include conf.d/secure.include; # include conf.d/ssl.opensourceecology.org.include; # listen 10.241.189.11:443; # server_name opensourceecology.org; # return 301 https://www.opensourceecology.org$uri; # #} server { access_log /var/log/nginx/discourse.opensourceecology.org/access.log main; error_log /var/log/nginx/discourse.opensourceecology.org/error.log; include conf.d/secure.include; include conf.d/ssl.opensourceecology.org.include; #include conf.d/ssl.openbuildinginstitute.org.include; listen 10.241.189.11:443; #listen [2a01:4f8:172:209e::2]:443; server_name discourse.opensourceecology.org; ############# # SITE_DOWN # ############# # uncomment this block && restart nginx prior to apache work to display the # "SITE DOWN" webpage for our clients # root /var/www/html/SITE_DOWN/htdocs/; # index index.html index.htm; # # # force all requests to load exactly this page # location / { # try_files $uri /index.html; # } ################### # SEND TO VARNISH # ################### # location / { # proxy_pass http://127.0.0.1:6081; # proxy_set_header X-Real-IP $remote_addr; # proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; # proxy_set_header X-Forwarded-Proto https; # proxy_set_header X-Forwarded-Port 443; # proxy_set_header Host $host; # } ################## # SEND TO DOCKER # ################## location / { proxy_pass http://unix:/var/discourse/shared/standalone/nginx.http.sock:; proxy_set_header Host $http_host; proxy_http_version 1.1; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-Proto https; proxy_set_header X-Real-IP $remote_addr; } } [root@osestaging1 conf.d]#
- and per the docs I reloaded the nginx config & rebuilt the docker discourse app
[root@osestaging1 discourse]# service nginx reload Redirecting to /bin/systemctl reload nginx.service [root@osestaging1 discourse]# ./launcher rebuild app Ensuring launcher is up to date ...
- meanwhile, I created the necessary nginx log dirs
[root@osestaging1 conf.d]# mkdir /var/log/nginx/discourse.opensourceecology.org [root@osestaging1 conf.d]# chown nginx:nginx /var/log/nginx/discourse.opensourceecology.org/ [root@osestaging1 conf.d]# chmod 0755 /var/log/nginx/discourse.opensourceecology.org/ [root@osestaging1 conf.d]#
- note: discourse iteration with docker takes forever! I thought docker was supposed to make iteration times faster?!? Just this change from a port to a socket && the necessary "rebuild app" step causes it to do a *ton* of opaque shit..
- ah, I had an issue with my nginx config file: it must end in '.conf'. I moved it to 'discourse.opensourceecology.org.conf' && restarted nginx
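- presumably that's because the stock nginx.conf only globs '*.conf' out of conf.d; a quick way to confirm that assumption:

grep 'include' /etc/nginx/nginx.conf
# expect a line like the following in the http{} block:
#   include /etc/nginx/conf.d/*.conf;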
- now I can access the discourse site in my browser! hooray!! Note that I had to start a private firefox window to create an exception to the hsts rules because 'discourse.opensourceecology.org' is not yet a valid subject alt name in our let's encrypt cert
- well, fuck, I can't log in to the site because it's trying to send an email to michael@opensourceecology.org that never arrives. This is likely because google is blocking the email on their end, correctly noticing that it's coming from the wrong server. I don't want to fuck with our SPF records, etc and break production, so I think I'll just rebuild the discourse app so my email address is not hosted on google..
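- a sketch of what that change amounts to (the new address is the one visible in the `docker run` output further down; rebuilds are slow, as noted above):

# edit the admin address in the env section of containers/app.yml, e.g.
#   DISCOURSE_DEVELOPER_EMAILS: 'osediscorse_2019@michaelaltfield.net'
cd /var/discourse
vim containers/app.yml
./launcher rebuild app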
- this time it didn't come up because "container is marked for removal and cannot be started" ??
2019-11-07 12:54:31.695 UTC [49] LOG: database system is shut down sha256:aa9684393a90a88c1bad7a780d3350a7abd9345d36066649f03110858f9abdef f7e67420146dd837bdcc5110c2849029fac5c7f7b795356143fdab0e7b0bddd4 Removing old container + /bin/docker rm app Error response from daemon: container 1ccf7cb96a6b4f099dbe5292041007f9639b128f5130270986ff44977e3d95fb: driver "overlay2" failed to remove root filesystem: unlinkat /var/lib/docker/overlay2/ce4f659013a7d25723e7e38f905e458b1b103a3009cd0fc4cd8d21e053c5e437/merged: device or resource busy
starting up existing container + /bin/docker start app Error response from daemon: container is marked for removal and cannot be started Error: failed to start containers: app [root@osestaging1 discourse]#
- `docker info` shows 3 docker containers are in the 'stopped' state
[root@osestaging1 discourse]# docker info Client: Debug Mode: false Server: Containers: 3 Running: 0 Paused: 0 Stopped: 3 Images: 6 Server Version: 19.03.4 Storage Driver: overlay2 Backing Filesystem: extfs Supports d_type: true Native Overlay Diff: true Logging Driver: json-file Cgroup Driver: cgroupfs Plugins: Volume: local Network: bridge host ipvlan macvlan null overlay Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog Swarm: inactive Runtimes: runc Default Runtime: runc Init Binary: docker-init containerd version: b34a5c8af56e510852c35414db4c1f4fa6172339 runc version: 3e425f80a8c931f88e6d94a8c831b9d5aa481657 init version: fec3683 Security Options: seccomp Profile: default Kernel Version: 3.10.0-957.21.3.el7.x86_64 Operating System: CentOS Linux 7 (Core) (containerized) OSType: linux Architecture: x86_64 CPUs: 1 Total Memory: 1.748GiB Name: osestaging1 ID: 7RXD:GHAW:C4IE:IOYN:BOPN:4OTK:UO2R:VNNC:KGST:B72A:J5ML:EFFF Docker Root Dir: /var/lib/docker Debug Mode: false Registry: https://index.docker.io/v1/ Labels: Experimental: false Insecure Registries: 127.0.0.0/8 Live Restore Enabled: false WARNING: bridge-nf-call-iptables is disabled WARNING: bridge-nf-call-ip6tables is disabled [root@osestaging1 discourse]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES [root@osestaging1 discourse]#
- it looks like if I pass -a to `docker ps` then it will list the stopped containers too
[root@osestaging1 discourse]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 1ccf7cb96a6b f241e5ea3321 "/sbin/boot" 27 minutes ago Removal In Progress app a23752a126c1 discourse/base:2.0.20191013-2320 "/bin/bash -c 'cd /p…" 2 hours ago Removal In Progress jovial_mirzakhani 3c77792ab6b5 hello-world "/hello" 10 days ago Created hardcore_goodall [root@osestaging1 discourse]#
- is this docker shit production ready? this docker issue says to just delete the dir. god I wouldn't want this to happen on prod.. https://github.com/moby/moby/issues/22312
- first I safely got rid of the one that wasn't stuck in "Removal In Progress"; it worked
[root@osestaging1 discourse]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 1ccf7cb96a6b f241e5ea3321 "/sbin/boot" 32 minutes ago Removal In Progress app a23752a126c1 discourse/base:2.0.20191013-2320 "/bin/bash -c 'cd /p…" 2 hours ago Removal In Progress jovial_mirzakhani 3c77792ab6b5 hello-world "/hello" 10 days ago Created hardcore_goodall [root@osestaging1 discourse]# docker rm 3c77792ab6b5 3c77792ab6b5 [root@osestaging1 discourse]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 1ccf7cb96a6b f241e5ea3321 "/sbin/boot" 32 minutes ago Removal In Progress app a23752a126c1 discourse/base:2.0.20191013-2320 "/bin/bash -c 'cd /p…" 2 hours ago Removal In Progress jovial_mirzakhani [root@osestaging1 discourse]#
- ok, stopping docker & force `rm`ing the container dirs worked
Redirecting to /bin/systemctl stop docker.service [root@osestaging1 discourse]# rm f /var/lib/docker/containers/ 1ccf7cb96a6b4f099dbe5292041007f9639b128f5130270986ff44977e3d95fb/ a23752a126c179518a4ad5bdeeb431082167f2d4102875d07651e12fabf046da/ [root@osestaging1 discourse]# rm -rf /var/lib/docker/containers/* [root@osestaging1 discourse]# service docker start Redirecting to /bin/systemctl start docker.service [root@osestaging1 discourse]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES [root@osestaging1 discourse]#
- ok, now it's working
[root@osestaging1 discourse]# ./launcher start app + /bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=2 -e UNICORN_SIDEKIQS=1 -e RUBY_GLOBAL_METHOD_CACHE_SIZE=131072 -e RUBY_GC_HEAP_GROWTH_MAX_SLOTS=40000 -e RUBY_GC_HEAP_INIT_SLOTS=400000 -e RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.5 -e DISCOURSE_DB_SOCKET=/var/run/postgresql -e DISCOURSE_DB_HOST= -e DISCOURSE_DB_PORT= -e DISCOURSE_HOSTNAME=discourse.opensourceecology.org -e DISCOURSE_DEVELOPER_EMAILS=osediscorse_2019@michaelaltfield.net -e DISCOURSE_SMTP_ADDRESS=localhost -e DISCOURSE_SMTP_PORT=25 -e DISCOURSE_SMTP_USER_NAME=discourse@opensouceecology.org -e DISCOURSE_SMTP_PASSWORD=none -e DISCOURSE_SMTP_AUTHENTICATION=none -e DISCOURSE_SMTP_OPENSSL_VERIFY_MODE=none -e DISCOURSE_SMTP_ENABLE_START_TLS=false -h osestaging1-app -e DOCKER_HOST_IP=172.17.0.1 --name app -t -p 8020:8020 -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:53:2a:01:9b:c2 local_discourse/app /sbin/boot d90b039776439ea5caf969b5bbc202cb1d90fc657b8e6e1949b0365b5ff6f8cb [root@osestaging1 discourse]# [root@osestaging1 discourse]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES d90b03977643 local_discourse/app "/sbin/boot" 13 seconds ago Up 12 seconds 0.0.0.0:8020->8020/tcp app [root@osestaging1 discourse]#
- I tried again with my new email address, but it still didn't come through. Note also that it said my username 'maltfield' wasn't unique. How do I do a fresh install? Whatever. It appears that the email issue may be between discourse and postfix; postfix logs don't even show anything when I click the "Resend Activation Email" button. Here are the relevant docs https://meta.discourse.org/t/troubleshooting-email-on-a-new-discourse-install/16326/2
- as noted in the above thread, the discourse log files are here: /var/discourse/shared/standalone/log/rails/production.log
- when I click on the "Resend Activation Email" button, this pops up in the above log file
Started PUT "/finish-installation/resend-email" for 127.0.0.1 at 2019-11-07 13:15:31 +0000 Processing by FinishInstallationController#resend_email as HTML Parameters: {"authenticity_token"=>"SzQCvRWiqdXsBKzOjIB0X7KkvXro7Od6SdP8Qa8vvrskPeNYZNos5ORHJfyDUrHiKShZR/txM6NHuqHHCQCR1w=="} Rendering finish_installation/resend_email.html.erb within layouts/finish_installation Rendered finish_installation/resend_email.html.erb within layouts/finish_installation (Duration: 0.7ms | Allocations: 103) Rendered layouts/_head.html.erb (Duration: 0.5ms | Allocations: 103) Completed 200 OK in 98ms (Views: 3.0ms | ActiveRecord: 0.0ms | Allocations: 4763) Rendering layouts/email_template.html.erb Rendered layouts/email_template.html.erb (Duration: 0.5ms | Allocations: 141) Delivered mail c4ca58ca-345e-46c4-81bc-6d0eac7afa04@discourse.opensourceecology.org (11.3ms) Job exception: wrong authentication type none
- aw ffs, back to this smtp auth shit again. We *don't* have auth on our smtp server; it's not exposed to the Internet, and it runs on localhost only; auth is not necessary. I set it to "none" to *not* use smtp auth. Apparently it doesn't like that *facepalm*
- I removed the username & password fields and rebuilt the app (the best way I've found is `./launcher destroy app && ./launcher rebuild app` which still takes for fucking ever to run); now it gets a bit further, but complains that localhost is refusing the connection.
Rendering layouts/email_template.html.erb Rendered layouts/email_template.html.erb (Duration: 0.6ms | Allocations: 139) Delivered mail ca01baae-880e-4448-81fd-bacfc71cfab3@discourse.opensourceecology.org (3.5ms) Job exception: Connection refused - connect(2) for "localhost" port 25
- so I think there are a few possible issues here:
- iptables is blocking traffic from the container to the host
- I attempted to fix this by adding an iptables rule permitting traffic from the 'docker0' interface into INPUT. Note that it seems docker had already modified these rules itself.
[root@osestaging1 discourse]# iptables-save | head -n 40 # Generated by iptables-save v1.4.21 on Thu Nov 7 14:22:31 2019 *mangle :PREROUTING ACCEPT [1584:199946] :INPUT ACCEPT [1576:198394] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [1410:267329] :POSTROUTING ACCEPT [1404:266969] COMMIT # Completed on Thu Nov 7 14:22:31 2019 # Generated by iptables-save v1.4.21 on Thu Nov 7 14:22:31 2019 *nat :PREROUTING ACCEPT [10:1656] :INPUT ACCEPT [2:104] :OUTPUT ACCEPT [44:3026] :POSTROUTING ACCEPT [38:2666] :DOCKER - [0:0] -A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER -A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE -A POSTROUTING -s 172.17.0.2/32 -d 172.17.0.2/32 -p tcp -m tcp --dport 8020 -j MASQUERADE -A DOCKER -i docker0 -j RETURN -A DOCKER ! -i docker0 -p tcp -m tcp --dport 8020 -j DNAT --to-destination 172.17.0.2:8020 COMMIT # Completed on Thu Nov 7 14:22:31 2019 # Generated by iptables-save v1.4.21 on Thu Nov 7 14:22:31 2019 *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [28:2240] :DOCKER - [0:0] :DOCKER-ISOLATION-STAGE-1 - [0:0] :DOCKER-ISOLATION-STAGE-2 - [0:0] :DOCKER-USER - [0:0] -A INPUT -p tcp -m state --state NEW -m tcp --dport 25 -j ACCEPT -A INPUT -i docker0 -j ACCEPT -A INPUT -s 5.9.144.234/32 -j DROP -A INPUT -s 173.234.159.250/32 -j DROP -A INPUT -i lo -j ACCEPT -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A INPUT -p icmp -j ACCEPT [root@osestaging1 discourse]#
- 'localhost' inside the docker container means the docker container itself; not the docker host running my smtp server
- according to this article, I can use the special DNS name 'host.docker.internal' to resolve to the host's ip address as addressable from the docker container https://stackoverflow.com/questions/31324981/how-to-access-host-port-from-docker-container
- it would also be great if I could debug from within the container itself; that's a bit tricky, but I found I can get a shell in a container (with a stupidly small subset of commands available) like so
[root@osestaging1 discourse]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 7bf55da425bc local_discourse/app "/sbin/boot" 35 minutes ago Up 35 minutes app [root@osestaging1 discourse]# docker exec -it app /bin/bash root@osestaging1-app:/# ping host.docker.internal bash: ping: command not found root@osestaging1-app:/# dig bash: dig: command not found root@osestaging1-app:/# nslookup host.docker.internal bash: nslookup: command not found
- I installed 'adnshost', but--fuck--the dns entry doesn't work; looks like linux support for it is still pending *facepalm* https://github.com/docker/for-linux/issues/264
root@osestaging1-app:/# apt-get install adns-tools Reading package lists... Done Building dependency tree Reading state information... Done The following additional packages will be installed: libadns1 The following NEW packages will be installed: adns-tools libadns1 0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded. Need to get 107 kB of archives. After this operation, 276 kB of additional disk space will be used. Do you want to continue? [Y/n] y Get:1 http://deb.debian.org/debian buster/main amd64 libadns1 amd64 1.5.0~rc1-1.1 [66.2 kB] Get:2 http://deb.debian.org/debian buster/main amd64 adns-tools amd64 1.5.0~rc1-1.1 [40.3 kB] Fetched 107 kB in 0s (450 kB/s) debconf: delaying package configuration, since apt-utils is not installed Selecting previously unselected package libadns1. (Reading database ... 44574 files and directories currently installed.) Preparing to unpack .../libadns1_1.5.0~rc1-1.1_amd64.deb ... Unpacking libadns1 (1.5.0~rc1-1.1) ... Selecting previously unselected package adns-tools. Preparing to unpack .../adns-tools_1.5.0~rc1-1.1_amd64.deb ... Unpacking adns-tools (1.5.0~rc1-1.1) ... Setting up libadns1 (1.5.0~rc1-1.1) ... Setting up adns-tools (1.5.0~rc1-1.1) ... Processing triggers for libc-bin (2.28-10) ... root@osestaging1-app:/# ad add-apt-repository addgroup addr2line adduser adnshost adnsresfilter advmng advzip addgnupghome addpart add-shell adnsheloex adnslogres advdef advpng root@osestaging1-app:/# ad add-apt-repository addgroup addr2line adduser adnshost adnsresfilter advmng advzip addgnupghome addpart add-shell adnsheloex adnslogres advdef advpng root@osestaging1-app:/# ad add-apt-repository addgroup addr2line adduser adnshost adnsresfilter advmng advzip addgnupghome addpart add-shell adnsheloex adnslogres advdef advpng root@osestaging1-app:/# adns adnsheloex adnshost adnslogres adnsresfilter root@osestaging1-app:/# adnshost adnshost usage error: no domains given, and -f/--pipe not used; try --help root@osestaging1-app:/# adnshost google.com google.com A INET 172.217.22.78 google.com A INET6 2a00:1450:4001:800::200e root@osestaging1-app:/# adnshost host.docker.internal host.docker.internal does not exist root@osestaging1-app:/#
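- one possible workaround on Linux (untested here; it assumes the default bridge network, where the host side of docker0 is 172.17.0.1, as confirmed further down) is to inject the name manually at container-creation time:

# --add-host writes the mapping into the container's /etc/hosts
docker run --rm --add-host=host.docker.internal:172.17.0.1 \
 debian:buster getent hosts host.docker.internal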
- dns is the robust option, but can we at least prove connectivity from within the container to the host over IP, for testing? I installed telnet and couldn't get it to work..
root@osestaging1-app:/# cat /etc/hosts 127.0.0.1 localhost ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters 172.17.0.2 osestaging1-app root@osestaging1-app:/# telnet 172.17.0.2 25 Trying 172.17.0.2... telnet: Unable to connect to remote host: Connection refused root@osestaging1-app:/# telnet 172.17.0.1 25 Trying 172.17.0.1... telnet: Unable to connect to remote host: Connection refused root@osestaging1-app:/#
- here's the damn package that provides the `ip` command: it's 'iproute2' https://stackoverflow.com/questions/51834978/ip-command-is-missing-from-ubuntu-docker-image
root@osestaging1-app:/# apt-get install iproute2 Reading package lists... Done Building dependency tree Reading state information... Done The following additional packages will be installed: libatm1 libmnl0 libxtables12 Suggested packages: iproute2-doc The following NEW packages will be installed: iproute2 libatm1 libmnl0 libxtables12 0 upgraded, 4 newly installed, 0 to remove and 0 not upgraded. Need to get 990 kB of archives. After this operation, 2,954 kB of additional disk space will be used. Do you want to continue? [Y/n] y Get:1 http://deb.debian.org/debian buster/main amd64 libmnl0 amd64 1.0.4-2 [12.2 kB] Get:2 http://deb.debian.org/debian buster/main amd64 libxtables12 amd64 1.8.2-4 [80.0 kB] Get:3 http://deb.debian.org/debian buster/main amd64 iproute2 amd64 4.20.0-2 [827 kB] Get:4 http://deb.debian.org/debian buster/main amd64 libatm1 amd64 1:2.5.1-2 [71.0 kB] Fetched 990 kB in 0s (13.2 MB/s) debconf: delaying package configuration, since apt-utils is not installed Selecting previously unselected package libmnl0:amd64. (Reading database ... 45481 files and directories currently installed.) Preparing to unpack .../libmnl0_1.0.4-2_amd64.deb ... Unpacking libmnl0:amd64 (1.0.4-2) ... Selecting previously unselected package libxtables12:amd64. Preparing to unpack .../libxtables12_1.8.2-4_amd64.deb ... Unpacking libxtables12:amd64 (1.8.2-4) ... Selecting previously unselected package iproute2. Preparing to unpack .../iproute2_4.20.0-2_amd64.deb ... Unpacking iproute2 (4.20.0-2) ... Selecting previously unselected package libatm1:amd64. Preparing to unpack .../libatm1_1%3a2.5.1-2_amd64.deb ... Unpacking libatm1:amd64 (1:2.5.1-2) ... Setting up libatm1:amd64 (1:2.5.1-2) ... Setting up libmnl0:amd64 (1.0.4-2) ... Setting up libxtables12:amd64 (1.8.2-4) ... Setting up iproute2 (4.20.0-2) ... Processing triggers for libc-bin (2.28-10) ... root@osestaging1-app:/# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 299: eth0@if300: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default link/ether 02:53:2a:01:9b:c2 brd ff:ff:ff:ff:ff:ff link-netnsid 0 inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0 valid_lft forever preferred_lft forever inet6 fe80::53:2aff:fe01:9bc2/64 scope link valid_lft forever preferred_lft forever root@osestaging1-app:/#
- so it looks like the IP of the docker host, as seen from the docker container, is '172.17.0.1'. The only open port there is 10000; 25 is not visible :\
root@osestaging1-app:/# nmap 172.17.0.1 Starting Nmap 7.70 ( https://nmap.org ) at 2019-11-07 15:00 UTC Nmap scan report for 172.17.0.1 Host is up (0.000019s latency). Not shown: 999 closed ports PORT STATE SERVICE 10000/tcp open snet-sensor-mgmt MAC Address: 02:42:80:35:65:A1 (Unknown) Nmap done: 1 IP address (1 host up) scanned in 1.85 seconds root@osestaging1-app:/# ip r default via 172.17.0.1 dev eth0 172.17.0.0/16 dev eth0 proto kernel scope link src 172.17.0.2 root@osestaging1-app:/#
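- that would explain the refused connections: postfix is presumably bound to loopback only (inet_interfaces = localhost), so nothing answers on 172.17.0.1:25. A sketch of one possible fix on the docker host (an assumption that needs testing; it presumes postfix and the default bridge subnet):

# also bind the docker0 gateway address, not just loopback
postconf -e 'inet_interfaces = 127.0.0.1, 172.17.0.1'
# permit relaying from the container subnet
postconf -e 'mynetworks = 127.0.0.0/8, 172.17.0.0/16'
# changing inet_interfaces requires a full restart, not just a reload
systemctl restart postfix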
- I added a question to the thread on how this fucking simple config is supposed to be achieved https://meta.discourse.org/t/troubleshooting-email-on-a-new-discourse-install/16326/372
Mon Oct 28, 2019
- I stumbled on buttplug.io and sent an email to Marcin about the market size of 3d-printable sex toys, a potential addition to our HeroX challenge. He responded that it's a > 23-billion-dollar industry https://www.businesswire.com/news/home/20181002005775/en/Global-Adult-Toys-Market-Worth-23.7Bn-2017
- https://buttplug.io/
- Buttplug
- Market Size
- I expect that it would be a fairly trivial design, and the marketing potential is compelling
- and potential modular interoperability with the hammer drill?!? no, nevermind. back to sysadmin work..
- ...
- time to install discourse on staging and see if it breaks our existing sites
- the easy install guide provided by discourse is in their git repo's docs dir, named 'INSTALL-cloud.md' https://github.com/discourse/discourse/blob/master/docs/INSTALL-cloud.md
- this guide says to first check out the discourse docker repo to /var/discourse https://github.com/discourse/discourse_docker
- the next step is to execute a `curl https://get.docker.com/ | sh`. God help us.
- fortunately, the insanely insecure step above is wrapped in an if-condition that only executes if docker isn't already installed, and only after the user presses enter to proceed (giving the user the option to safely ctrl+c out and prevent the above command from running)
- I found docker exists in the yum repos, so I installed it from there first
[root@osestaging1 discourse]# yum install docker ... Installed: docker.x86_64 2:1.13.1-103.git7f2769b.el7.centos Dependency Installed: atomic-registries.x86_64 1:1.22.1-29.gitb507039.el7 container-selinux.noarch 2:2.107-3.el7 container-storage-setup.noarch 0:0.11.0-2.git5eaf76c.el7 containers-common.x86_64 1:0.1.37-3.el7.centos docker-client.x86_64 2:1.13.1-103.git7f2769b.el7.centos docker-common.x86_64 2:1.13.1-103.git7f2769b.el7.centos oci-register-machine.x86_64 1:0-6.git2b44233.el7 oci-systemd-hook.x86_64 1:0.2.0-1.git05e6923.el7_6 oci-umount.x86_64 2:2.5-3.el7 python-pytoml.noarch 0:0.1.14-1.git7dea353.el7 subscription-manager-rhsm-certificates.x86_64 0:1.24.13-3.el7.centos yajl.x86_64 0:2.0.4-4.el7 Dependency Updated: libselinux.x86_64 0:2.5-14.1.el7 libselinux-python.x86_64 0:2.5-14.1.el7 libselinux-utils.x86_64 0:2.5-14.1.el7 libsemanage.x86_64 0:2.5-14.el7 libsemanage-python.x86_64 0:2.5-14.el7 libsepol.x86_64 0:2.5-10.el7 policycoreutils.x86_64 0:2.5-33.el7 policycoreutils-python.x86_64 0:2.5-33.el7 selinux-policy.noarch 0:3.13.1-252.el7.1 selinux-policy-targeted.noarch 0:3.13.1-252.el7.1 setools-libs.x86_64 0:3.3.8-4.el7 Complete! [root@osestaging1 discourse]#
- that appeared to be the only curl-piped-to-shell line in the discourse-setup script, so I proceeded with the install. The first thing I noticed was that it yelled at me for having <2G of swap space. If this becomes an issue, I'll just create a 2-4G swap file somewhere on '/' (not in the ebs volume); a manual version would be something like the sketch below
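- a minimal sketch of creating such a swap file by hand (size & path are assumptions; note that discourse-setup ends up creating a 2G /swapfile itself, per the transcript below):

dd if=/dev/zero of=/swapfile bs=1M count=2048
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
echo '/swapfile swap swap defaults 0 0' >> /etc/fstab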
[root@osestaging1 discourse]# ./discourse-setup which: no docker.io in (/sbin:/bin:/usr/sbin:/usr/bin) which: no docker.io in (/sbin:/bin:/usr/sbin:/usr/bin) WARNING: Discourse requires at least 2GB of swap when running with 2GB of RAM or less. This system does not appear to have sufficient swap space. Without sufficient swap space, your site may not work properly, and future upgrades of Discourse may not complete successfully. Ctrl+C to exit or wait 5 seconds to have a 2GB swapfile created. Setting up swapspace version 1, size = 2097148 KiB no label, UUID=7e132ae9-7b1b-429c-8c11-d55310818030 /swapfile swap swap auto 0 0 sysctl: setting key "vm.swappiness": Read-only file system ./discourse-setup: line 277: netstat: command not found ./discourse-setup: line 277: netstat: command not found Ports 80 and 443 are free for use ‘samples/standalone.yml’ -> ‘containers/app.yml’ Found 1GB of memory and 1 physical CPU cores setting db_shared_buffers = 128MB setting UNICORN_WORKERS = 2 containers/app.yml memory parameters updated. Hostname for your Discourse? [discourse.example.com]: discourse.opensourceecology.org Email address for admin account(s)? [me@example.com,you@example.com]: michael@opensourceecology.org SMTP server address? [smtp.example.com]: localhost SMTP port? [587]: SMTP user name? [user@example.com]: SMTP password? [pa$$word]: Optional email address for setting up Let's Encrypt? (ENTER to skip) [me@example.com]: Does this look right? Hostname : discourse.opensourceecology.org Email : michael@opensourceecology.org SMTP address : localhost SMTP port : 587 SMTP username : user@example.com SMTP password : pa$$word ENTER to continue, 'n' to try again, Ctrl+C to exit:
- continuing failed! I entered no username & password for the smtp server, as auth should be unnecessary coming from localhost. Apparently that doesn't override the defaults, though?
Configuration file at updated successfully! DISCOURSE_SMTP_USER_NAME left at incorrect default of user@example.com DISCOURSE_SMTP_PASSWORD left at incorrect default of pa$$word Sorry, these containers/app.yml settings aren't valid -- can't continue! If you have unusual requirements, edit containers/app.yml and then: ./launcher bootstrap app [root@osestaging1 discourse]#
- on second thought, maybe that's because I set the port to the default of 587; it should be 25
[root@osestaging1 discourse]# nmap localhost Starting Nmap 6.40 ( http://nmap.org ) at 2019-10-28 12:18 UTC Nmap scan report for localhost (127.0.0.1) Host is up (0.000010s latency). rDNS record for 127.0.0.1: localhost.localdomain Not shown: 996 closed ports PORT STATE SERVICE 25/tcp open smtp 8000/tcp open http-alt 8010/tcp open xmpp 10000/tcp open snet-sensor-mgmt Nmap done: 1 IP address (1 host up) scanned in 0.11 seconds [root@osestaging1 discourse]# ss -plan | grep -i ':25' tcp LISTEN 0 100 127.0.0.1:25 *:* users:(("master",pid=782,fd=13)) [root@osestaging1 discourse]#
- I tried again (the installer was a bit automagic at remembering previous args, which is nice), but changing to 25 still asked for creds
[root@osestaging1 discourse]# ./discourse-setup which: no docker.io in (/sbin:/bin:/usr/sbin:/usr/bin) which: no docker.io in (/sbin:/bin:/usr/sbin:/usr/bin) The configuration file containers/app.yml already exists! . . . reconfiguring . . . Saving old file as app.yml.2019-10-28-121933.bak Stopping existing container in 5 seconds or Control-C to cancel. Device "docker0" does not exist. Cannot connect to the docker daemon - verify it is running and you have access Found 1GB of memory and 1 physical CPU cores setting db_shared_buffers = 128MB setting UNICORN_WORKERS = 2 containers/app.yml memory parameters updated. Hostname for your Discourse? [discourse.opensourceecology.org]: Email address for admin account(s)? [michael@opensourceecology.org]: SMTP server address? [localhost]: SMTP port? [587]: 25 SMTP user name? [user@example.com]: SMTP password? [pa$$word]: Optional email address for setting up Let's Encrypt? (ENTER to skip) [me@example.com]: Does this look right? Hostname : discourse.opensourceecology.org Email : michael@opensourceecology.org SMTP address : localhost SMTP port : 25 SMTP username : user@example.com SMTP password : pa$$word ENTER to continue, 'n' to try again, Ctrl+C to exit: Configuration file at updated successfully! DISCOURSE_SMTP_USER_NAME left at incorrect default of user@example.com DISCOURSE_SMTP_PASSWORD left at incorrect default of pa$$word Sorry, these containers/app.yml settings aren't valid -- can't continue! If you have unusual requirements, edit containers/app.yml and then: ./launcher bootstrap app [root@osestaging1 discourse]#
- I manually edited the config file, blanking-out these default values
[root@osestaging1 discourse]# vim containers/app.yml ... [root@osestaging1 discourse]# grep SMTP_PORT containers/app.yml | head -n1 DISCOURSE_SMTP_PORT: 25 [root@osestaging1 discourse]# grep SMTP_USER containers/app.yml | head -n1 DISCOURSE_SMTP_USER_NAME: "" [root@osestaging1 discourse]# grep SMTP_PASSWORD containers/app.yml | head -n1 DISCOURSE_SMTP_PASSWORD: "" [root@osestaging1 discourse]#
- It was still pretty unhappy with me
[root@osestaging1 discourse]# ./discourse-setup which: no docker.io in (/sbin:/bin:/usr/sbin:/usr/bin) which: no docker.io in (/sbin:/bin:/usr/sbin:/usr/bin) The configuration file containers/app.yml already exists! . . . reconfiguring . . . Saving old file as app.yml.2019-10-28-122342.bak Stopping existing container in 5 seconds or Control-C to cancel. Device "docker0" does not exist. Cannot connect to the docker daemon - verify it is running and you have access Found 1GB of memory and 1 physical CPU cores setting db_shared_buffers = 128MB setting UNICORN_WORKERS = 2 containers/app.yml memory parameters updated. Hostname for your Discourse? [discourse.opensourceecology.org]: Email address for admin account(s)? [michael@opensourceecology.org]: SMTP server address? [localhost]: SMTP port? [25]: SMTP password? []: Optional email address for setting up Let's Encrypt? (ENTER to skip) [me@example.com]: Does this look right? Hostname : discourse.opensourceecology.org Email : michael@opensourceecology.org SMTP address : localhost SMTP port : 25 SMTP username : SMTP password : ENTER to continue, 'n' to try again, Ctrl+C to exit: Configuration file at updated successfully! DISCOURSE_SMTP_USER_NAME not present DISCOURSE_SMTP_PASSWORD not present Sorry, these containers/app.yml settings aren't valid -- can't continue! If you have unusual requirements, edit containers/app.yml and then: ./launcher bootstrap app [root@osestaging1 discourse]#
- so much for docker not requiring sysadmins; this is not exactly a trivial install. There's nothing in the comments of that config file that states how to set it up if you have auth-less smtp, but I followed all the recommendations in a few relevant threads to no avail
[root@osestaging1 discourse]# ./discourse-setup which: no docker.io in (/sbin:/bin:/usr/sbin:/usr/bin) which: no docker.io in (/sbin:/bin:/usr/sbin:/usr/bin) The configuration file containers/app.yml already exists! . . . reconfiguring . . . Saving old file as app.yml.2019-10-28-123312.bak Stopping existing container in 5 seconds or Control-C to cancel. Device "docker0" does not exist. Cannot connect to the docker daemon - verify it is running and you have access Found 1GB of memory and 1 physical CPU cores setting db_shared_buffers = 128MB setting UNICORN_WORKERS = 2 containers/app.yml memory parameters updated. Hostname for your Discourse? [discourse.opensourceecology.org]: Email address for admin account(s)? [michael@opensourceecology.org]: SMTP server address? [localhost]: SMTP port? [25]: SMTP password? []: Optional email address for setting up Let's Encrypt? (ENTER to skip) [me@example.com]: Does this look right? Hostname : discourse.opensourceecology.org Email : michael@opensourceecology.org SMTP address : localhost SMTP port : 25 SMTP username : SMTP password : ENTER to continue, 'n' to try again, Ctrl+C to exit: Configuration file at updated successfully! DISCOURSE_SMTP_USER_NAME not present DISCOURSE_SMTP_PASSWORD not present Sorry, these containers/app.yml settings aren't valid -- can't continue! If you have unusual requirements, edit containers/app.yml and then: ./launcher bootstrap app
- per the last line, I tried running `launcher bootstrap app`, and that failed. Docker isn't running?
[root@osestaging1 discourse]# ./launcher bootstrap app Device "docker0" does not exist. Cannot connect to the docker daemon - verify it is running and you have access [root@osestaging1 discourse]# ps -ef | grep -i docker root 7839 692 0 12:35 pts/9 00:00:00 grep --color=auto -i docker [root@osestaging1 discourse]#
- I got a bit further by adding some options that *should* prevent auth to smtp, but I still had to set a value for the smtp password, else it yells at me and exits. This time it says it had connection issues to docker. Do I have to manually start it?
[root@osestaging1 discourse]# ./discourse-setup which: no docker.io in (/sbin:/bin:/usr/sbin:/usr/bin) which: no docker.io in (/sbin:/bin:/usr/sbin:/usr/bin) The configuration file containers/app.yml already exists! . . . reconfiguring . . . Saving old file as app.yml.2019-10-28-130503.bak Stopping existing container in 5 seconds or Control-C to cancel. Device "docker0" does not exist. Cannot connect to the docker daemon - verify it is running and you have access Found 1GB of memory and 1 physical CPU cores setting db_shared_buffers = 128MB setting UNICORN_WORKERS = 2 containers/app.yml memory parameters updated. Hostname for your Discourse? [discourse.opensourceecology.org]: Email address for admin account(s)? [michael@opensourceecology.org]: SMTP server address? [localhost]: SMTP port? [25]: SMTP user name? [discourse@opensouceecology.org]: SMTP password? [none]: Optional email address for setting up Let's Encrypt? (ENTER to skip) [me@example.com]: Does this look right? Hostname : discourse.opensourceecology.org Email : michael@opensourceecology.org SMTP address : localhost SMTP port : 25 SMTP username : discourse@opensouceecology.org SMTP password : none ENTER to continue, 'n' to try again, Ctrl+C to exit: Configuration file at updated successfully! Updates successful. Rebuilding in 5 seconds. Building app Device "docker0" does not exist. Cannot connect to the docker daemon - verify it is running and you have access [root@osestaging1 discourse]# [root@osestaging1 discourse]# grep SMTP containers/app.yml ## TODO: The SMTP mail server used to validate new accounts and send notifications # SMTP ADDRESS, username, and password are required # WARNING the char '#' in SMTP password can cause problems! DISCOURSE_SMTP_ADDRESS: localhost DISCOURSE_SMTP_PORT: 25 DISCOURSE_SMTP_USER_NAME: discourse@opensouceecology.org DISCOURSE_SMTP_PASSWORD: "none" DISCOURSE_SMTP_AUTHENTICATION: none DISCOURSE_SMTP_OPENSSL_VERIFY_MODE: none DISCOURSE_SMTP_ENABLE_START_TLS: false #DISCOURSE_SMTP_ENABLE_START_TLS: true # (optional, default true) [root@osestaging1 discourse]#
- it does appear that the docker services are disabled. There are 4 of them. Which do I use?
[root@osestaging1 discourse]# systemctl list-units | grep -i docker [root@osestaging1 discourse]# systemctl list-unit-files | grep -i docker docker-cleanup.service disabled docker-storage-setup.service disabled docker.service disabled docker-cleanup.timer disabled [root@osestaging1 discourse]#
- a few tests fail https://meta.discourse.org/t/cant-run-the-launcher-to-install-discourse-on-centos-7/23095/21
[root@osestaging1 discourse]# docker run hello-world /usr/bin/docker-current: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?. See '/usr/bin/docker-current run --help'. [root@osestaging1 discourse]# ps -ef | grep -i docker root 14744 692 0 13:15 pts/9 00:00:00 grep --color=auto -i docker
- I started just the simplest service, 'docker.service'
[root@osestaging1 discourse]# systemctl start docker.service [root@osestaging1 discourse]# systemctl list-unit-files | grep -i docker docker-cleanup.service disabled docker-storage-setup.service disabled docker.service disabled docker-cleanup.timer disabled [root@osestaging1 discourse]# systemctl status | grep -i docker │ ├─docker.service │ │ ├─15302 /usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --default-runtime=docker-runc --exec-opt native.cgroupdriver=systemd --userland-proxy-path=/usr/libexec/docker/docker-proxy-current --init-path=/usr/libexec/docker/docker-init-current --seccomp-profile=/etc/docker/seccomp.json --selinux-enabled --log-driver=journald --signature-verification=false --storage-driver overlay2 │ │ └─15307 /usr/bin/docker-containerd-current -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --shim docker-containerd-shim --runtime docker-runc --runtime-args --systemd-cgroup=true │ ├─15386 grep --color=auto -i docker [root@osestaging1 discourse]# systemctl | grep -i docker var-lib-docker-containers.mount loaded active mounted /var/lib/docker/containers var-lib-docker-overlay2.mount loaded active mounted /var/lib/docker/overlay2 docker.service loaded active running Docker Application Container Engine docker-cleanup.timer loaded active waiting Run docker-cleanup every hour [root@osestaging1 discourse]#
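- note that docker.service is still 'disabled', so it won't come back after a reboot; if we keep docker around, persisting it would just be:

systemctl enable docker.service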
- now it's running, but the hello-world test failed
[root@osestaging1 discourse]# ps -ef | grep -i docker root 15302 1 0 13:15 ? 00:00:00 /usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --default-runtime=docker-runc --exec-opt native.cgroupdriver=systemd --userland-proxy-path=/usr/libexec/docker/docker-proxy-current --init-path=/usr/libexec/docker/docker-init-current --seccomp-profile=/etc/docker/seccomp.json --selinux-enabled --log-driver=journald --signature-verification=false --storage-driver overlay2 root 15307 15302 0 13:15 ? 00:00:00 /usr/bin/docker-containerd-current -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --shim docker-containerd-shim --runtime docker-runc --runtime-args --systemd-cgroup=true root 15509 692 0 13:17 pts/9 00:00:00 grep --color=auto -i docker [root@osestaging1 discourse]# docker run hello-world Unable to find image 'hello-world:latest' locally Trying to pull repository docker.io/library/hello-world ... latest: Pulling from docker.io/library/hello-world 1b930d010525: Pull complete Digest: sha256:c3b4ada4687bbaa170745b3e4dd8ac3f194ca95b2d0518b417fb47e5879d9b5f Status: Downloaded newer image for docker.io/hello-world:latest container_linux.go:235: starting container process caused "process_linux.go:327: setting cgroup config for procHooks process caused \"failed to write c 5:1 rwm to devices.allow: write /sys/fs/cgroup/devices/lxc/osestaging1/system.slice/docker-3c77792ab6b5f23d727daf392b5b8d33a8713849da4e7f30e8cfcd2197c7ec0c.scope/devices.allow: operation not permitted\"" /usr/bin/docker-current: Error response from daemon: oci runtime error: container_linux.go:235: starting container process caused "process_linux.go:327: setting cgroup config for procHooks process caused \"failed to write c 5:1 rwm to devices.allow: write /sys/fs/cgroup/devices/lxc/osestaging1/system.slice/docker-3c77792ab6b5f23d727daf392b5b8d33a8713849da4e7f30e8cfcd2197c7ec0c.scope/devices.allow: operation not permitted\"". [root@osestaging1 discourse]#
- attempting the install failed again, stating that my version of docker as installed from the CentOS repos is too old :(
[root@osestaging1 discourse]# ./discourse-setup ... Configuration file at updated successfully! Updates successful. Rebuilding in 5 seconds. Building app ERROR: Docker version 1.13.1 not supported, please upgrade to at least 17.03.1, or recommended 17.06.2 [root@osestaging1 discourse]#
- I removed the docker installed from yum
[root@osestaging1 discourse]# yum remove docker Loaded plugins: fastestmirror, replace Resolving Dependencies --> Running transaction check ---> Package docker.x86_64 2:1.13.1-103.git7f2769b.el7.centos will be erased --> Finished Dependency Resolution Dependencies Resolved ================================================================================================================================================== Package Arch Version Repository Size ================================================================================================================================================== Removing: docker x86_64 2:1.13.1-103.git7f2769b.el7.centos @extras 65 M Transaction Summary ================================================================================================================================================== Remove 1 Package Installed size: 65 M Is this ok [y/N]: y Downloading packages: Running transaction check Running transaction test Transaction test succeeded Running transaction Erasing : 2:docker-1.13.1-103.git7f2769b.el7.centos.x86_64 1/1 warning: /etc/sysconfig/docker-storage saved as /etc/sysconfig/docker-storage.rpmsave Verifying : 2:docker-1.13.1-103.git7f2769b.el7.centos.x86_64 1/1 Removed: docker.x86_64 2:1.13.1-103.git7f2769b.el7.centos Complete! [root@osestaging1 discourse]#
- but, umm, docker is still installed?
[root@osestaging1 discourse]# docker -v Docker version 1.13.1, build 7f2769b/1.13.1 [root@osestaging1 discourse]# yum remove docker Loaded plugins: fastestmirror, replace No Match for argument: docker No Packages marked for removal [root@osestaging1 discourse]#
- I removed docker-client & docker-common too
[root@osestaging1 discourse]# rpm -qa | grep -i docker docker-client-1.13.1-103.git7f2769b.el7.centos.x86_64 docker-common-1.13.1-103.git7f2769b.el7.centos.x86_64 [root@osestaging1 discourse]# yum remove docker-client docker-common Loaded plugins: fastestmirror, replace Existing lock /var/run/yum.pid: another copy is running as pid 16965. Another app is currently holding the yum lock; waiting for it to exit... The other application is: yum Memory : 224 M RSS (986 MB VSZ) Started: Mon Oct 28 13:21:58 2019 - 00:03 ago State : Running, pid: 16965 Resolving Dependencies --> Running transaction check ---> Package docker-client.x86_64 2:1.13.1-103.git7f2769b.el7.centos will be erased ---> Package docker-common.x86_64 2:1.13.1-103.git7f2769b.el7.centos will be erased --> Finished Dependency Resolution Dependencies Resolved ================================================================================================================================================== Package Arch Version Repository Size ================================================================================================================================================== Removing: docker-client x86_64 2:1.13.1-103.git7f2769b.el7.centos @extras 13 M docker-common x86_64 2:1.13.1-103.git7f2769b.el7.centos @extras 4.4 k Transaction Summary ================================================================================================================================================== Remove 2 Packages Installed size: 13 M Is this ok [y/N]: y Downloading packages: Running transaction check Running transaction test Transaction test succeeded Running transaction Erasing : 2:docker-client-1.13.1-103.git7f2769b.el7.centos.x86_64 1/2 Erasing : 2:docker-common-1.13.1-103.git7f2769b.el7.centos.x86_64 2/2 Verifying : 2:docker-common-1.13.1-103.git7f2769b.el7.centos.x86_64 1/2 Verifying : 2:docker-client-1.13.1-103.git7f2769b.el7.centos.x86_64 2/2 Removed: docker-client.x86_64 2:1.13.1-103.git7f2769b.el7.centos docker-common.x86_64 2:1.13.1-103.git7f2769b.el7.centos Complete! [root@osestaging1 discourse]#
- Aaaand now it's gone
[root@osestaging1 discourse]# docker -v bash: /bin/docker: No such file or directory [root@osestaging1 discourse]#
- alright, let's get that damn unsigned https install script and see what it does
[root@osestaging1 discourse]# wget https://get.docker.com/ -O installDocker.sh --2019-10-28 13:26:37-- https://get.docker.com/ Resolving get.docker.com (get.docker.com)... 143.204.101.29, 143.204.101.37, 143.204.101.126, ... Connecting to get.docker.com (get.docker.com)|143.204.101.29|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 13216 (13K) [text/plain] Saving to: ‘installDocker.sh’ 100%[========================================================================================================>] 13,216 --.-K/s in 0s 2019-10-28 13:26:37 (32.0 MB/s) - ‘installDocker.sh’ saved [13216/13216] [root@osestaging1 discourse]#
- good christ, it's 476 lines long. The source (and some related documentation) is found on their github here. Note that even they say not to use this script on production systems :( https://github.com/docker/docker-install
- I see nothing there about verifying a cryptographic signature of the file. ffs, it's not hard to implement. I remember hitting this wall when I was researching discourse & docker in the past, and I was stunned to discover that even the whonix project used discourse. I asked the founder (Patrick Schleizer) about his thoughts on the security of Discourse back in 2018-09, and the best suggestion he offered for authenticating the install script was some args for curl https://forums.whonix.org/t/change-whonix-forum-software-to-discourse/1181/15
curl --remote-name --tlsv1.2 --proto =https --location --remote-name https://get.docker.com/
- his command was untested and had some issues; I fixed it, having to bruteforce the name of the script. It's actually 'install.sh' (not get-docker.sh as the comments suggest), and attempts to grab anything else return a 403 from cloudflare. I did this one in whonix (meta) through tor
user@host:~$ curl -i --tlsv1.2 --proto =https --location --remote-name https://get.docker.com/install.sh % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 76963 100 76963 0 0 13326 0 0:00:05 0:00:05 --:--:-- 17836 user@host:~$ sha384sum install.sh da1bb77df1cc6aea926b893cb67780c492ca8fcaf52edd5328819732ce914c894f2fed8c210aec92a9df1c03de51107b install.sh user@host:~$
- compare that to what I downloaded through the internet from our staging server, and we have... a mismatch?
[root@osestaging1 discourse]# sha384sum installDocker.sh 68041f4b75f5485834c53c549d1682f1d36af864ac2fde5eba1d7bf401fd44db3a6c79ba32d7f85c6778aea5897182c4 installDocker.sh [root@osestaging1 discourse]#
- so somehow install.sh obtained through tor is actually a binary file? Did I really just get MITM'd?
user@host:~$ less install.sh "install.sh" may be a binary file. See it anyway? user@host:~$
- well, when I tried this command from our server, I got the same binary
[root@osestaging1 discourse]# curl --tlsv1.2 --proto=https --location --remote-name https://get.docker.com/install.sh curl: option --proto=https: is unknown curl: try 'curl --help' or 'curl --manual' for more information [root@osestaging1 discourse]# curl --tlsv1.2 --proto =https --location --remote-name https://get.docker.com/install.sh % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 76963 100 76963 0 0 230k 0 --:--:-- --:--:-- --:--:-- 231k [root@osestaging1 discourse]# sha384sum install.sh da1bb77df1cc6aea926b893cb67780c492ca8fcaf52edd5328819732ce914c894f2fed8c210aec92a9df1c03de51107b install.sh [root@osestaging1 discourse]# less install.sh "install.sh" may be a binary file. See it anyway? [root@osestaging1 discourse]#
- this is hard because there appears to be no endpoint file name in the URI, which curl wants (I guess for security reasons; it would be good to ensure there are no redirects), but I can't find a file name that's correct. The script is just spat out on a query for '/', and otherwise I get a 403 Access Denied. If I write to stdout instead of using --remote-name, then it works
user@host:~$ curl --tlsv1.2 --proto =https --location https://get.docker.com/ > get-docker.sh % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 13216 100 13216 0 0 3649 0 0:00:03 0:00:03 --:--:-- 3648 user@host:~$ sha384sum get-docker.sh 68041f4b75f5485834c53c549d1682f1d36af864ac2fde5eba1d7bf401fd44db3a6c79ba32d7f85c6778aea5897182c4 get-docker.sh user@host:~$
- I downloaded it again through a distinct path on a vpn
user@disp5412:~$ curl --tlsv1.2 --proto https --location https://get.docker.com/ > get-docker.sh % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 13216 100 13216 0 0 1606 0 0:00:08 0:00:08 --:--:-- 3561 user@disp5412:~$ sha384sum get-docker.sh 68041f4b75f5485834c53c549d1682f1d36af864ac2fde5eba1d7bf401fd44db3a6c79ba32d7f85c6778aea5897182c4 get-docker.sh user@disp5412:~$ user@disp5412:~$ curl ifconfig.co/json {"ip":"5.254.96.242","ip_decimal":100557042,"country":"Romania","country_eu":true,"country_iso":"RO","city":"Bucharest","latitude":44.4354,"longitude":26.1033,"asn":"AS3223","asn_org":"Voxility LLP","user_agent":{"product":"curl","version":"7.52.1","raw_value":"curl/7.52.1"}}user@disp5412:~$
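- since the same sha384 came back over three independent network paths (clearnet, tor, and a vpn), the best available option absent a real signature seems to be pinning that hash; a sketch that gates the install on the checksum observed above (which will of course break whenever docker updates the script):

expected="68041f4b75f5485834c53c549d1682f1d36af864ac2fde5eba1d7bf401fd44db3a6c79ba32d7f85c6778aea5897182c4"
curl --tlsv1.2 --proto '=https' --location https://get.docker.com/ > get-docker.sh
echo "${expected}  get-docker.sh" | sha384sum -c - && sh get-docker.sh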
Wed Oct 25, 2019
- now that I've finished a script to automate the sync from prod to staging, I can finally proceed with a POC of Discourse or AskBot
- I emailed Marcin asking which was the higher priority; I'll begin it next week
- Marcin said he's getting 414 request-uri too large issues from wordpress when attempting spam moderation. I checked our nginx config, which uses a 10M limit on 'client_max_body_size' which is 10x the default of 1M.
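- note that 'client_max_body_size' caps the request *body*; a 414 is about the request line (the URI itself), which nginx limits via 'large_client_header_buffers'. If the bulk spam-moderation requests really are overflowing that, the knob would be something like this in the http{} or server{} block (sizes here are an assumption, up from the 4x8k default):

large_client_header_buffers 4 16k;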
- I responded to our old email chain with Christian from almost 2 months ago, asking if he'd heard back from kiwix regarding our offline zim wiki archive, and if he could write an article about this archive (a howto for users to use it on android) to publish on osemain
- Marcin confirmed: I should work on the Discourse POC next week
- I updated my TODO list https://wiki.opensourceecology.org/wiki/OSE_Server#TODO
- namely, in addition to this Discourse POC, I also need to add 2FA support to our VPN and put together guides for OSE devs to gain access to the VPN and also guides for the OSE sysadmin to grant them access
Tue Oct 24, 2019
- Marcin mentioned yesterday that the ajax signup form for osemail on our phplist post on osemain is broken https://www.opensourceecology.org/moving-to-open-source-email-list-software/
- looks like it's wordpress wrapping our javascript in paragraph tags again; I fixed this back in January by using the wpautop-control plugin https://wiki.opensourceecology.org/wiki/Maltfield_Log/2019_Q1#Sat_Jan_26.2C_2019
- Marcin said he couldn't fix it today by doing a restore (fucking wordpress doesn't do an actual restore; it *still* tries to parse the old content & add paragraph tags??), so he just made an image of the form and linked to the signup page on phplist.opensourceecology.org. That sucks.
- I logged into osemain's wordpress wui. Oh no, the 'wpautop-control' plugin isn't activated anymore. I'm assuming that Marcin disabled it when doing some cleanup to debug the slowdown after we added the social media and seo plugins https://wiki.opensourceecology.org/wiki/Maltfield_Log/2019_Q3#Mon_Sep_10.2C_2019
- I activated the plugin, and I restored to the most recent revision that was made by me. And it worked! https://wiki.opensourceecology.org/wiki/Maltfield_Log/2019_Q3#Mon_Sep_10.2C_2019
- ...
- continuing from yesterday, I need to create a new non-root user (which will have to exist on both staging & production) that I'll [a] give NOPASSWD sudo access to on the staging server only and [b] grant ssh key authorized access to on the staging server only
- I named this user stagingsync. On staging, I added an authorized_keys file with the root-owned public key for the 4096-bit passwordless rsa ssh key that I generated on prod yesterday
[root@osestaging1 ~]# adduser stagingsync [root@osestaging1 ~]# ls -lah /home/stagingsync total 20K drwx------. 2 stagingsync stagingsync 4.0K Oct 24 12:12 . drwxr-xr-x. 14 root root 4.0K Oct 24 12:12 .. -rw-r--r--. 1 stagingsync stagingsync 18 Sep 6 2017 .bash_logout -rw-r--r--. 1 stagingsync stagingsync 193 Sep 6 2017 .bash_profile -rw-r--r--. 1 stagingsync stagingsync 231 Sep 6 2017 .bashrc [root@osestaging1 ~]# su - stagingsync [stagingsync@osestaging1 ~]$ mkdir .ssh [stagingsync@osestaging1 ~]$ chmod 0700 .ssh [stagingsync@osestaging1 ~]$ echo "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQC4FqRKRYw8qgLqbgfH1Yze+EWQ9wJudNU4+jrHHsatKag3yl90zE557NukZGfIcNP6sFp6+f8VeK0W9g6yhMiAq9wrsS6VrgZw1frjsFaflBaDPwPQb8s5uvj5O6P+9R0jg05t5kiHtkSrgXD7uFXkYbXeUm7xaeQRgOk+0Lt1tnVcT8g+EJDnQ7XlChLd+AXGUCiyRv+kLYCO9014Yd0Q4zlLfpRvHwXgE2gPjJDUqjiVM4SDtCqP1wSSp6JvW+bGAnFKEof/n1MyuYWajicJBijLkooCamI6VY20Qed1mv0V4E/9q2E3eQa/itd/Ai3SiEHxZURl3sVL3MPpKWqX9SG7ygZYIcnfnRah/JRjEkS84drIhdPgvF+W+X8r9i3/jRduP4H5nY9giqQBkchgZ+zixduVsjJk69oaxW3bMsJDH/UfX96gKl4HZaboJecBbKm3ZZi1YKsmAWBl6FdfsLT2FERHxWpb3PUsrfUGza187N9UHnPQESqyhpI0SRd+xMF/nZypDQEv1dSHnl4W/d6iaotZ4/RSMUF+nNHzbL/hjtusnd0f9llaEkc+v0IzRMtL6DB5XMmp9wWVkfE0Mg9qWIaqWgJKu1/wp4GABjpt2T5D2OkksgePWUQgHzXVC7By0I3XoEswFfFV/FTpp4r16lZc36s4dkDGsXT/6Q== root@opensourceecology.org" > .ssh/authorized_keys [stagingsync@osestaging1 ~]$ chmod 0600 .ssh/authorized_keys [stagingsync@osestaging1 ~]$
- then, per requirement, I added the stagingsync user to the sshaccess group
[root@osestaging1 ~]# gpasswd -a stagingsync sshaccess Adding user stagingsync to group sshaccess [root@osestaging1 ~]#
- I don't want stagingsync to have ssh access to prod (which, without an authorized_keys file on prod, it wouldn't be able to do anyway--but it would be wise to leave it out of the sshaccess group on prod regardless), so I'll *not* do this on prod. Because I do want to sync the /etc/group file from prod to staging, I'll add a step in the sync script that appends ',stagingsync' to the 'sshaccess' line in /etc/group
- cool, it works
[root@opensourceecology ~]# ssh -i /root/.ssh/id_rsa.201910 -p 32415 stagingsync@10.241.189.11 hostname osestaging1 [root@opensourceecology ~]#
- now I added the 'stagingsync' user to have NOPASSWD rights on staging only; note that this will not get overwritten as our rsync command explicitly excludes the sudo config
[root@osestaging1 ~]# tail /etc/sudoers # %users ALL=/sbin/mount /mnt/cdrom, /sbin/umount /mnt/cdrom ## Allows members of the users group to shutdown this system # %users localhost=/sbin/shutdown -h now ## Read drop-in files from /etc/sudoers.d (the # here does not mean a comment) #includedir /etc/sudoers.d maltfield ALL=(ALL) NOPASSWD: ALL stagingsync ALL=(ALL) NOPASSWD: ALL [root@osestaging1 ~]#
- I'm having issues with connections to staging suddenly failing from other vpn clients (my laptop and the prod server) after some time, even though my own connection appears to stay up. Closing & reconnecting re-enables me to access staging.
- I initiated a new rsync using my new script. Here's what it looks like now
############
# SETTINGS #
############

STAGING_HOST=10.241.189.11
STAGING_SSH_PORT=32415
SYNC_USERNAME=stagingsync

#########
# RSYNC #
#########

# bwlimit prevents saturating the network on prod
# rsync-path makes a non-root ssh user become root on the staging side
# exclude /home/b2user just saves space & time
# exclude /home/stagingsync because 'stagingsync' should be able to access
#  staging but not production
# exclude /etc/sudo* as we want 'stagingsync' NOPASSWD on staging, not root
time nice rsync \
 -e "ssh -p ${STAGING_SSH_PORT} -i /root/.ssh/id_rsa.201910" \
 --bwlimit=3000 \
 --numeric-ids \
 --rsync-path="sudo rsync" \
 --exclude=/root \
 --exclude=/run \
 --exclude=/home/b2user/sync* \
 --exclude=/home/stagingsync* \
 --exclude=/etc/sudo* \
 --exclude=/etc/openvpn \
 --exclude=/usr/share/easy-rsa \
 --exclude=/dev \
 --exclude=/sys \
 --exclude=/proc \
 --exclude=/boot/ \
 --exclude=/etc/sysconfig/network* \
 --exclude=/tmp \
 --exclude=/var/tmp \
 --exclude=/etc/fstab \
 --exclude=/etc/mtab \
 --exclude=/etc/mdadm.conf \
 --exclude=/etc/hostname \
 -av \
 --progress \
 / ${SYNC_USERNAME}@${STAGING_HOST}:/
- it works!
[root@opensourceecology bin]# ./syncToStaging.sh ... var/www/html/www.opensourceecology.org/htdocs/wp-content/uploads/2019/10/workshop9sm.jpg 97794 100% 113.29kB/s 0:00:00 (xfer#4820, to-check=898/518063) sent 810748552 bytes received 2196940 bytes 1910565.20 bytes/sec total size is 41443449279 speedup is 50.98 + exit 0 [root@opensourceecology bin]#
- A double-tap fails, probably because the sync updated /etc/group, removing 'stagingsync' from the 'sshaccess' group
[root@opensourceecology bin]# ./syncToStaging.sh + STAGING_HOST=10.241.189.11 + STAGING_SSH_PORT=32415 + SYNC_USERNAME=stagingsync + nice rsync -e 'ssh -p 32415 -i /root/.ssh/id_rsa.201910' --bwlimit=3000 --numeric-ids '--rsync-path=sudo rsync' --exclude=/root --exclude=/run '--exclude=/home/b2user/sync*' '--exclude=/home/stagingsync*' '--exclude=/etc/sudo*' --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ '--exclude=/etc/sysconfig/network*' --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf --exclude=/etc/hostname -av --progress / stagingsync@10.241.189.11:/ Permission denied (publickey). rsync: connection unexpectedly closed (0 bytes received so far) [sender] rsync error: unexplained error (code 255) at io.c(605) [sender=3.0.9] real 0m0.147s user 0m0.031s sys 0m0.007s + exit 0 [root@opensourceecology bin]#
- I ran this sed command, which I'll add to the script
[root@osestaging1 ~]# grep sshaccess /etc/group sshaccess:x:1006:cmota,marcin,tgriffing,maltfield,lberezhny,crupp [root@osestaging1 ~]# sed -i 's/^sshaccess:\(.*\)$/sshaccess:\1,stagingsync/' /etc/group [root@osestaging1 ~]# grep sshaccess /etc/group sshaccess:x:1006:cmota,marcin,tgriffing,maltfield,lberezhny,crupp,stagingsync [root@osestaging1 ~]#
- now the second tap works; it wasn't quite as fast as I'd like, though; it spent a lot of time on mysql, log, ossec, munin, etc. files that changed in just the few minutes since the first run
[root@opensourceecology bin]# ./syncToStaging.sh ... var/www/html/munin/static/zoom.js 4760 100% 422.59kB/s 0:00:00 (xfer#773, to-check=1002/322356) sent 61086821 bytes received 1431809 bytes 454680.95 bytes/sec total size is 41445400614 speedup is 662.93 real 2m17.019s user 0m22.964s sys 0m8.157s + exit 0 [root@opensourceecology bin]#
- I went to add the sed command to be executed after the rsync, but--well--that's necessarily a new line with a new ssh connection. And I can't connect after rsync has copied-over the /etc/group file. I'm in a catch-22.
- my solution: exclude /etc/group from the rsync, and update it manually by piping a sed of prod's copy to the file over ssh. This stays idempotent, since the input is always prod's pristine /etc/group.
sed 's/^sshaccess:\(.*\)$/sshaccess:\1,stagingsync/' /etc/group | ssh -p32415 -i /root/.ssh/id_rsa.201910 stagingsync@10.241.189.11 'sudo tee /etc/group'
- I was also able to fix all the nginx configs by adding this to the script
############ # SETTINGS # ############ ... PRODUCTION_IP1=138.201.84.243 PRODUCTION_IP2=138.201.84.223 PRODUCTION_IPv6='2a01:4f8:172:209e::2' ... ############# # FUNCTIONS # ############# runOnStaging () { ssh -p ${STAGING_SSH_PORT} -i '/root/.ssh/id_rsa.201910' ${SYNC_USERNAME}@${STAGING_HOST} $1 } ... ################## # NGINX BINDINGS # ################## # nginx configs must be updated to bind to our staging server's VPN address # instead of the prod server's internet-facing IP addresses # update the listen lines to use the VPN IP runOnStaging "sudo sed -i 's/${PRODUCTION_IP1}/${STAGING_HOST}/g' /etc/nginx/conf.d/*" runOnStaging "sudo sed -i 's/${PRODUCTION_IP2}/${STAGING_HOST}/g' /etc/nginx/conf.d/*" # since the main config file has both listens (for redirecting port 80 to port 443), we just do it once & comment-out the second one to avoid err runOnStaging "sudo sed -i 's/${PRODUCTION_IP1}/${STAGING_HOST}/g' /etc/nginx/nginx.conf" runOnStaging "sudo sed -i 's^\(\s*\)[^#]*listen ${PRODUCTION_IP2}\(.*\)^\1#listen ${PRODUCTION_IP2}\2^' /etc/nginx/nginx.conf" # just remove all of the ipv6 listens runOnStaging "sudo sed -i 's^\(\s*\)[^#]*listen \[${PRODUCTION_IPv6}\(.*\)^\1#listen \[${PRODUCTION_IPv6}\2^' /etc/nginx/nginx.conf" runOnStaging "sudo sed -i 's^\(\s*\)[^#]*listen \[${PRODUCTION_IPv6}\(.*\)^\1#listen \[${PRODUCTION_IPv6}\2^' /etc/nginx/conf.d/*" # since we went from 2 prod IPs to 1 staging IP, we must remove one of the # default_server entries. We choose to make OSE default & remove it from OBI runOnStaging "sudo sed -i 's^listen \(.*\) default_server^listen \1^' /etc/nginx/conf.d/www.openbuildinginstitute.org.conf"
- because the websites necessarily look exactly the same, I decided to add a quick one-liner that drops an 'is_staging' file (with the contents 'true') into each vhost's docroot on the staging box after the sync. On prod, a GET for '/is_staging' should return a 404.
for docroot in $(find /var/www/html/* -maxdepth 1 -name htdocs -type d); do echo 'true' > "$docroot/is_staging"; done
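- a quick spot-check of the mark from any VPN-connected client would look something like this (a sketch; -k because staging's certs may be stale)
curl -sk https://www.opensourceecology.org/is_staging
# expect 'true' when dns points at staging; prod should return its 404 page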
- ok, I finished the sync script! I haven't added it to a cron yet (which I would also have to comment-out on the staging box; super meta), but here's what I got so far
[root@opensourceecology bin]# cat syncToStaging.sh #!/bin/bash set -x ################################################################################ # Author: Michael Altfield <michael at opensourceecology dot org> # Created: 2019-10-23 # Updated: 2019-10-24 # Version: 0.1 # Purpose: Syncs 99% of the prod node state to staging & staging-ifys it ################################################################################ ############ # SETTINGS # ############ STAGING_HOST=10.241.189.11 STAGING_SSH_PORT=32415 SYNC_USERNAME=stagingsync PRODUCTION_IP1=138.201.84.243 PRODUCTION_IP2=138.201.84.223 PRODUCTION_IPv6='2a01:4f8:172:209e::2' ############# # FUNCTIONS # ############# runOnStaging () { ssh -p ${STAGING_SSH_PORT} -i '/root/.ssh/id_rsa.201910' ${SYNC_USERNAME}@${STAGING_HOST} $1 } ######### # RSYNC # ######### # bwlimit prevents saturating the network on prod # rsync-path makes a non-root ssh user become root on the staging side # exclude /home/b2user/sync* just saves space & time # exclude /home/stagingsync* because 'stagingsync' should be able to access # staging but not production # exclude /etc/group so 'stagingsync' is in the 'sshaccess' group on staging # but not on prod # exclude /etc/sudo* as we want 'stagingsync' NOPASSWD on staging, not root time nice rsync \ -e "ssh -p ${STAGING_SSH_PORT} -i /root/.ssh/id_rsa.201910" \ --bwlimit=3000 \ --numeric-ids \ --rsync-path="sudo rsync" \ --exclude=/root \ --exclude=/run \ --exclude=/home/b2user/sync* \ --exclude=/home/stagingsync* \ --exclude=/etc/sudo* \ --exclude=/etc/group \ --exclude=/etc/openvpn \ --exclude=/usr/share/easy-rsa \ --exclude=/dev \ --exclude=/sys \ --exclude=/proc \ --exclude=/boot/ \ --exclude=/etc/sysconfig/network* \ --exclude=/tmp \ --exclude=/var/tmp \ --exclude=/etc/fstab \ --exclude=/etc/mtab \ --exclude=/etc/mdadm.conf \ --exclude=/etc/hostname \ -av \ --progress \ / ${SYNC_USERNAME}@${STAGING_HOST}:/ ################## # NGINX BINDINGS # ################## # nginx configs must be updated to bind to our staging server's VPN address # instead of the prod server's internet-facing IP addresses # update the listen lines to use the VPN IP runOnStaging "sudo sed -i 's/${PRODUCTION_IP1}/${STAGING_HOST}/g' /etc/nginx/conf.d/*" runOnStaging "sudo sed -i 's/${PRODUCTION_IP2}/${STAGING_HOST}/g' /etc/nginx/conf.d/*" # since the main config file has both listens (for redirecting port 80 to port 443), we just do it once & comment-out the second one to avoid err runOnStaging "sudo sed -i 's/${PRODUCTION_IP1}/${STAGING_HOST}/g' /etc/nginx/nginx.conf" runOnStaging "sudo sed -i 's^\(\s*\)[^#]*listen ${PRODUCTION_IP2}\(.*\)^\1#listen ${PRODUCTION_IP2}\2^' /etc/nginx/nginx.conf" # just remove all of the ipv6 listens runOnStaging "sudo sed -i 's^\(\s*\)[^#]*listen \[${PRODUCTION_IPv6}\(.*\)^\1#listen \[${PRODUCTION_IPv6}\2^' /etc/nginx/nginx.conf" runOnStaging "sudo sed -i 's^\(\s*\)[^#]*listen \[${PRODUCTION_IPv6}\(.*\)^\1#listen \[${PRODUCTION_IPv6}\2^' /etc/nginx/conf.d/*" # since we went from 2 prod IPs to 1 staging IP, we must remove one of the # default_server entries. We choose to make OSE default & remove it from OBI runOnStaging "sudo sed -i 's^listen \(.*\) default_server^listen \1^' /etc/nginx/conf.d/www.openbuildinginstitute.org.conf" # finally, restart nginx runOnStaging "sudo systemctl restart nginx.service" ######################### # MAKE THE STAGING MARK # ######################### # we leave a mark so we can test to see if we're looking at staging by doing a # GET request against '/is_staging'.
It should 404 on prod but return 200 on # staging runOnStaging 'for docroot in $(sudo find /var/www/html/* -maxdepth 1 -name htdocs -type d); do echo 'true' | sudo tee "$docroot/is_staging"; done' ################### # OSSEC SILENCING # ################### # we don't need ossec email alerts from our staging server runOnStaging "sudo sed -i 's^<email_notification>yes</email_notification>^<email_notification>no</email_notification>^' /var/ossec/etc/ossec.conf" ################## # CRON DISABLING # ################## # disable certbot cron runOnStaging "sudo sed -i 's^\(\s*\)\([^#]\)\(.*\)^\1#\2\3^' /etc/cron.d/letsencrypt" # disable backups cron runOnStaging "sudo sed -i 's^\(\s*\)\([^#]\)\(.*\)^\1#\2\3^' /etc/cron.d/backup_to_backblaze" ############### # USER/GROUPS # ############### # append ',stagingsync' to the 'sshaccess' line in /etc/groups to permit this # user to be able to ssh into staging (we don't do this on prod so they can't # ssh into prod) sed 's/^sshaccess:\(.*\)$/sshaccess:\1,stagingsync/' /etc/group | ssh -p${STAGING_SSH_PORT} -i /root/.ssh/id_rsa.201910 ${SYNC_USERNAME}@${STAGING_HOST} 'sudo tee /etc/group' ######## # EXIT # ######## # clean exit exit 0 [root@opensourceecology bin]#
Mon Oct 23, 2019
- I updated the wiki documentation on the development server, added an article on the staging server, and added some bits about the /var network block mount and the vpn config
- ...
- it does not appear that I can simply add items to the client's /etc/hosts file or otherwise override names on a per-IP or per-domain basis. It appears that I can only add a "dhcp-option DNS" item to the server (or client) configs to override the DNS server used on the client https://openvpn.net/community-resources/pushing-dhcp-options-to-clients/
- so then I can run a dns server on osedev1 that has an entry for each of our websites pointing to the VPN IP of osestaging1 (10.241.189.11), and defers everything else to 1.1.1.1 or something.
- this question suggests using dnsmasq https://askubuntu.com/questions/885497/openvpn-and-dns
- cool, dnsmasq-2.76-9 is already installed on our cent7 osedev1 box. Let's take that low-hanging fruit
[root@osedev1 3]# rpm -qa | grep -i dns dnsmasq-2.76-9.el7.x86_64 [root@osedev1 3]# cat /etc/redhat-release CentOS Linux release 7.6.1810 (Core) [root@osedev1 3]#
- It also appears to already be running
[root@osedev1 3]# ps -ef | grep dnsmasq nobody 1346 1 0 Oct22 ? 00:00:00 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper root 1347 1346 0 Oct22 ? 00:00:00 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper root 18856 14405 0 13:55 pts/10 00:00:00 grep --color=auto dnsmasq [root@osedev1 3]#
- oh, shit, it appears to be only listening on 192.168.122.1:53 which is our lxc network
[root@osedev1 etc]# ss -plan | grep -i dnsmasq u_dgr UNCONN 0 0 * 18757 * 8150 users:(("dnsmasq",pid=1346,fd=10)) udp UNCONN 0 0 192.168.122.1:53 *:* users:(("dnsmasq",pid=1346,fd=5)) udp UNCONN 0 0 *%virbr0:67 *:* users:(("dnsmasq",pid=1346,fd=3)) tcp LISTEN 0 5 192.168.122.1:53 *:* users:(("dnsmasq",pid=1346,fd=6)) [root@osedev1 etc]# ss -planu | grep -i dnsmasq UNCONN 0 0 192.168.122.1:53 *:* users:(("dnsmasq",pid=1346,fd=5)) UNCONN 0 0 *%virbr0:67 *:* users:(("dnsmasq",pid=1346,fd=3)) [root@osedev1 etc]#
- I could find no entries in the dnsmasq.conf file for the bind address
[root@osedev1 etc]# ls -lah /etc/dnsmasq.* -rw-r--r--. 1 root root 27K Aug 9 01:12 /etc/dnsmasq.conf /etc/dnsmasq.d: total 8.0K drwxr-xr-x. 2 root root 4.0K Aug 9 01:12 . drwxr-xr-x. 86 root root 4.0K Oct 23 14:20 .. [root@osedev1 etc]# grep '192.168.122' /etc/dnsmasq.conf [root@osedev1 etc]#
- I found two unrelated files that specify this network--unless dnsmasq is somehow configured by libvirt?
[root@osedev1 etc]# grep -irl '192.168.122' /etc /etc/libvirt/qemu/networks/default.xml /etc/openvpn/openvpn-status.log [root@osedev1 etc]# cat /etc/libvirt/qemu/networks/default.xml <!-- WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE OVERWRITTEN AND LOST. Changes to this xml configuration should be made using: virsh net-edit default or other application using the libvirt API. --> <network> <name>default</name> <uuid>a11767e5-cc15-4acd-9443-bbffc220fa4d</uuid> <forward mode='nat'/> <bridge name='virbr0' stp='on' delay='0'/> <mac address='52:54:00:7d:01:71'/> <ip address='192.168.122.1' netmask='255.255.255.0'> <dhcp> <range start='192.168.122.2' end='192.168.122.254'/> </dhcp> </ip> </network> [root@osedev1 etc]# cat /etc/openvpn/openvpn-status.log OpenVPN CLIENT LIST Updated,Wed Oct 23 14:28:10 2019 Common Name,Real Address,Bytes Received,Bytes Sent,Connected Since hetzner2,138.201.84.223:34914,8122227,496427,Tue Oct 22 17:48:40 2019 osestaging1,192.168.122.201:51674,2340646,949941,Tue Oct 22 18:07:40 2019 maltfield,27.7.149.58:51080,44891,39735,Wed Oct 23 13:28:30 2019 ROUTING TABLE Virtual Address,Common Name,Real Address,Last Ref 10.241.189.10,maltfield,27.7.149.58:51080,Wed Oct 23 14:28:09 2019 10.241.189.11,osestaging1,192.168.122.201:51674,Wed Oct 23 13:48:08 2019 GLOBAL STATS Max bcast/mcast queue length,1 END [root@osedev1 etc]#
- I think it is libvirt; this libvirt guide describes how to avoid conflicts when trying to use a distinct "global" dnsmasq config https://wiki.libvirt.org/page/Libvirtd_and_dnsmasq
- I made a backup of the existing /etc/dnsmasq.conf file and added the lines to bind dnsmasq only on tun0 to the config
[root@osedev1 etc]# cp dnsmasq.conf dnsmasq.20191023.orig.conf [root@osedev1 etc]# vim dnsmasq.conf ... [root@osedev1 etc]# tail /etc/dnsmasq.conf #conf-dir=/etc/dnsmasq.d,.bak # Include all files in a directory which end in .conf #conf-dir=/etc/dnsmasq.d/,*.conf # Include all files in /etc/dnsmasq.d except RPM backup files conf-dir=/etc/dnsmasq.d,.rpmnew,.rpmsave,.rpmorig interface=tun0 bind-interfaces [root@osedev1 etc]#
- And then I verified that the dnsmasq.service is disabled
[root@osedev1 etc]# systemctl list-units | grep -i dns unbound-anchor.timer loaded active waiting daily update of the root trust anchor for DNSSEC [root@osedev1 etc]# systemctl list-unit-files | grep -i dns chrony-dnssrv@.service static dnsmasq.service disabled chrony-dnssrv@.timer disabled [root@osedev1 etc]#
- I started it
[root@osedev1 etc]# systemctl status dnsmasq.service ● dnsmasq.service - DNS caching server. Loaded: loaded (/usr/lib/systemd/system/dnsmasq.service; disabled; vendor preset: disabled) Active: inactive (dead) [root@osedev1 etc]# systemctl start dnsmasq.service [root@osedev1 etc]# systemctl status dnsmasq.service ● dnsmasq.service - DNS caching server. Loaded: loaded (/usr/lib/systemd/system/dnsmasq.service; disabled; vendor preset: disabled) Active: active (running) since Wed 2019-10-23 14:40:58 CEST; 2s ago Main PID: 29666 (dnsmasq) Tasks: 1 CGroup: /system.slice/dnsmasq.service └─29666 /usr/sbin/dnsmasq -k Oct 23 14:40:58 osedev1 systemd[1]: Started DNS caching server.. Oct 23 14:40:58 osedev1 dnsmasq[29666]: started, version 2.76 cachesize 150 Oct 23 14:40:58 osedev1 dnsmasq[29666]: compile time options: IPv6 GNU-getopt DBus no-i18n IDN DHCP DHCPv6 no-Lua TFTP no-...inotify Oct 23 14:40:58 osedev1 dnsmasq[29666]: reading /etc/resolv.conf Oct 23 14:40:58 osedev1 dnsmasq[29666]: using nameserver 213.133.100.100#53 Oct 23 14:40:58 osedev1 dnsmasq[29666]: using nameserver 213.133.99.99#53 Oct 23 14:40:58 osedev1 dnsmasq[29666]: using nameserver 213.133.98.98#53 Oct 23 14:40:58 osedev1 dnsmasq[29666]: read /etc/hosts - 6 addresses Hint: Some lines were ellipsized, use -l to show in full. [root@osedev1 etc]#
- cool, now it looks like it's running on both the 192.168.122 virbr0 lxc network and the 10.241.189 tun0 vpn network
[root@osedev1 etc]# ss -plan | grep -i dnsmasq u_dgr UNCONN 0 0 * 20573881 * 8150 users:(("dnsmasq",pid=29666,fd=15)) u_dgr UNCONN 0 0 * 18757 * 8150 users:(("dnsmasq",pid=1346,fd=10)) udp UNCONN 0 0 127.0.0.1:53 *:* users:(("dnsmasq",pid=29666,fd=6)) udp UNCONN 0 0 10.241.189.1:53 *:* users:(("dnsmasq",pid=29666,fd=4)) udp UNCONN 0 0 192.168.122.1:53 *:* users:(("dnsmasq",pid=1346,fd=5)) udp UNCONN 0 0 *%virbr0:67 *:* users:(("dnsmasq",pid=1346,fd=3)) udp UNCONN 0 0 ::1:53 :::* users:(("dnsmasq",pid=29666,fd=10)) udp UNCONN 0 0 fe80::fd4a:7df9:169:e7e2%tun0:53 :::* users:(("dnsmasq",pid=29666,fd=8)) tcp LISTEN 0 5 127.0.0.1:53 *:* users:(("dnsmasq",pid=29666,fd=7)) tcp LISTEN 0 5 10.241.189.1:53 *:* users:(("dnsmasq",pid=29666,fd=5)) tcp LISTEN 0 5 192.168.122.1:53 *:* users:(("dnsmasq",pid=1346,fd=6)) tcp LISTEN 0 5 ::1:53 :::* users:(("dnsmasq",pid=29666,fd=11)) tcp LISTEN 0 5 fe80::fd4a:7df9:169:e7e2%tun0:53 :::* users:(("dnsmasq",pid=29666,fd=9)) [root@osedev1 etc]# ss -planu | grep -i dnsmasq UNCONN 0 0 127.0.0.1:53 *:* users:(("dnsmasq",pid=29666,fd=6)) UNCONN 0 0 10.241.189.1:53 *:* users:(("dnsmasq",pid=29666,fd=4)) UNCONN 0 0 192.168.122.1:53 *:* users:(("dnsmasq",pid=1346,fd=5)) UNCONN 0 0 *%virbr0:67 *:* users:(("dnsmasq",pid=1346,fd=3)) UNCONN 0 0 ::1:53 :::* users:(("dnsmasq",pid=29666,fd=10)) UNCONN 0 0 fe80::fd4a:7df9:169:e7e2%tun0:53 :::* users:(("dnsmasq",pid=29666,fd=8)) [root@osedev1 etc]#
- cool, from my laptop the 53 udp port on osedev1's vpn address appears to be open. or, uh, filtered?
user@ose:~/openvpn$ sudo nmap -Pn -sU -p53 10.137.0.1 Starting Nmap 7.40 ( https://nmap.org ) at 2019-10-23 18:31 +0545 Nmap scan report for 10.137.0.1 (10.137.0.1) Host is up. PORT STATE SERVICE 53/udp open|filtered domain Nmap done: 1 IP address (1 host up) scanned in 2.12 seconds user@ose:~/openvpn$
- nope, fail.
user@ose:~/openvpn$ dig @10.137.0.1 google.com ; <<>> DiG 9.10.3-P4-Debian <<>> @10.137.0.1 google.com ; (1 server found) ;; global options: +cmd ;; connection timed out; no servers could be reached user@ose:~/openvpn$
- Indeed, all ports are reported as "filtered"
user@ose:~/openvpn$ nmap -Pn 10.241.189.1 Starting Nmap 7.40 ( https://nmap.org ) at 2019-10-23 18:29 +0545 Nmap scan report for 10.241.189.1 (10.241.189.1) Host is up. All 1000 scanned ports on 10.241.189.1 (10.241.189.1) are filtered Nmap done: 1 IP address (1 host up) scanned in 201.47 seconds user@ose:~/openvpn$
- I bet this is an iptables issue. And, christ, the iptables config looks more complex than anything I built; I guess this is libvirt's doing?
[root@osedev1 etc]# iptables -nvL Chain INPUT (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination 2929 205K ACCEPT udp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 udp dpt:53 0 0 ACCEPT tcp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:53 54 17712 ACCEPT udp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 udp dpt:67 0 0 ACCEPT tcp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:67 101 15196 ACCEPT all -- lo * 0.0.0.0/0 0.0.0.0/0 107K 15M ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED 17 706 ACCEPT icmp -- * * 0.0.0.0/0 0.0.0.0/0 4 628 ACCEPT tcp -- * * 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:32415 9 804 ACCEPT udp -- * * 0.0.0.0/0 0.0.0.0/0 state NEW udp dpt:1194 11218 621K DROP all -- * * 0.0.0.0/0 0.0.0.0/0 Chain FORWARD (policy ACCEPT 320 packets, 26880 bytes) pkts bytes target prot opt in out source destination 7135 30M ACCEPT all -- * virbr0 0.0.0.0/0 192.168.122.0/24 ctstate RELATED,ESTABLISHED 7756 935K ACCEPT all -- virbr0 * 192.168.122.0/24 0.0.0.0/0 0 0 ACCEPT all -- virbr0 virbr0 0.0.0.0/0 0.0.0.0/0 0 0 REJECT all -- * virbr0 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable 0 0 REJECT all -- virbr0 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable Chain OUTPUT (policy ACCEPT 38266 packets, 6557K bytes) pkts bytes target prot opt in out source destination 54 18295 ACCEPT udp -- * virbr0 0.0.0.0/0 0.0.0.0/0 udp dpt:68 [root@osedev1 etc]# [root@osedev1 etc]# iptables-save # Generated by iptables-save v1.4.21 on Wed Oct 23 14:50:44 2019 *mangle :PREROUTING ACCEPT [136754:47136918] :INPUT ACCEPT [121507:15710076] :FORWARD ACCEPT [15247:31426842] :OUTPUT ACCEPT [38360:6581630] :POSTROUTING ACCEPT [53607:38008472] -A POSTROUTING -o virbr0 -p udp -m udp --dport 68 -j CHECKSUM --checksum-fill COMMIT # Completed on Wed Oct 23 14:50:44 2019 # Generated by iptables-save v1.4.21 on Wed Oct 23 14:50:44 2019 *nat :PREROUTING ACCEPT [13853:810289] :INPUT ACCEPT [1821:140336] :OUTPUT ACCEPT [2275:162484] :POSTROUTING ACCEPT [2276:162568] -A POSTROUTING -s 192.168.122.0/24 -d 224.0.0.0/24 -j RETURN -A POSTROUTING -s 192.168.122.0/24 -d 255.255.255.255/32 -j RETURN -A POSTROUTING -s 192.168.122.0/24 ! -d 192.168.122.0/24 -p tcp -j MASQUERADE --to-ports 1024-65535 -A POSTROUTING -s 192.168.122.0/24 ! -d 192.168.122.0/24 -p udp -j MASQUERADE --to-ports 1024-65535 -A POSTROUTING -s 192.168.122.0/24 ! -d 192.168.122.0/24 -j MASQUERADE COMMIT # Completed on Wed Oct 23 14:50:44 2019 # Generated by iptables-save v1.4.21 on Wed Oct 23 14:50:44 2019 *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [320:26880] :OUTPUT ACCEPT [38306:6563335] -A INPUT -i virbr0 -p udp -m udp --dport 53 -j ACCEPT -A INPUT -i virbr0 -p tcp -m tcp --dport 53 -j ACCEPT -A INPUT -i virbr0 -p udp -m udp --dport 67 -j ACCEPT -A INPUT -i virbr0 -p tcp -m tcp --dport 67 -j ACCEPT -A INPUT -i lo -j ACCEPT -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A INPUT -p icmp -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 32415 -j ACCEPT -A INPUT -p udp -m state --state NEW -m udp --dport 1194 -j ACCEPT -A INPUT -j DROP -A FORWARD -d 192.168.122.0/24 -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT -A FORWARD -s 192.168.122.0/24 -i virbr0 -j ACCEPT -A FORWARD -i virbr0 -o virbr0 -j ACCEPT -A FORWARD -o virbr0 -j REJECT --reject-with icmp-port-unreachable -A FORWARD -i virbr0 -j REJECT --reject-with icmp-port-unreachable -A OUTPUT -o virbr0 -p udp -m udp --dport 68 -j ACCEPT COMMIT # Completed on Wed Oct 23 14:50:44 2019 [root@osedev1 etc]#
- I just added a single line before the drop to permit udp packets to 53 from tun0
[root@osedev1 20191023]# service iptables save iptables: Saving firewall rules to /etc/sysconfig/iptables:[ OK ] [root@osedev1 20191023]# iptables-save # Generated by iptables-save v1.4.21 on Wed Oct 23 15:23:13 2019 *mangle :PREROUTING ACCEPT [279:24925] :INPUT ACCEPT [279:24925] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [106:13445] :POSTROUTING ACCEPT [106:13445] -A POSTROUTING -o virbr0 -p udp -m udp --dport 68 -j CHECKSUM --checksum-fill COMMIT # Completed on Wed Oct 23 15:23:13 2019 # Generated by iptables-save v1.4.21 on Wed Oct 23 15:23:13 2019 *nat :PREROUTING ACCEPT [30:1478] :INPUT ACCEPT [3:218] :OUTPUT ACCEPT [4:304] :POSTROUTING ACCEPT [4:304] -A POSTROUTING -s 192.168.122.0/24 -d 224.0.0.0/24 -j RETURN -A POSTROUTING -s 192.168.122.0/24 -d 255.255.255.255/32 -j RETURN -A POSTROUTING -s 192.168.122.0/24 ! -d 192.168.122.0/24 -p tcp -j MASQUERADE --to-ports 1024-65535 -A POSTROUTING -s 192.168.122.0/24 ! -d 192.168.122.0/24 -p udp -j MASQUERADE --to-ports 1024-65535 -A POSTROUTING -s 192.168.122.0/24 ! -d 192.168.122.0/24 -j MASQUERADE COMMIT # Completed on Wed Oct 23 15:23:13 2019 # Generated by iptables-save v1.4.21 on Wed Oct 23 15:23:13 2019 *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [106:13445] -A INPUT -i virbr0 -p udp -m udp --dport 53 -j ACCEPT -A INPUT -i virbr0 -p tcp -m tcp --dport 53 -j ACCEPT -A INPUT -i virbr0 -p udp -m udp --dport 67 -j ACCEPT -A INPUT -i virbr0 -p tcp -m tcp --dport 67 -j ACCEPT -A INPUT -i lo -j ACCEPT -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A INPUT -p icmp -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 32415 -j ACCEPT -A INPUT -p udp -m state --state NEW -m udp --dport 1194 -j ACCEPT -A INPUT -i tun0 -p udp -m udp --dport 53 -j ACCEPT -A INPUT -j DROP -A FORWARD -d 192.168.122.0/24 -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT -A FORWARD -s 192.168.122.0/24 -i virbr0 -j ACCEPT -A FORWARD -i virbr0 -o virbr0 -j ACCEPT -A FORWARD -o virbr0 -j REJECT --reject-with icmp-port-unreachable -A FORWARD -i virbr0 -j REJECT --reject-with icmp-port-unreachable -A OUTPUT -o virbr0 -p udp -m udp --dport 68 -j ACCEPT COMMIT # Completed on Wed Oct 23 15:23:13 2019 [root@osedev1 20191023]#
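- for the record, the actual insert command isn't captured above; it would have been something like this (the rule position is an assumption based on the chain listing, so it lands just above the final DROP)
iptables -I INPUT 10 -i tun0 -p udp -m udp --dport 53 -j ACCEPT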
- And now it works!
user@ose:~/openvpn$ dig @10.241.189.1 michaelaltfield.net ; <<>> DiG 9.10.3-P4-Debian <<>> @10.241.189.1 michaelaltfield.net ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 34648 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ;; QUESTION SECTION: ;michaelaltfield.net. IN A ;; ANSWER SECTION: michaelaltfield.net. 3554 IN A 176.56.237.113 ;; Query time: 148 msec ;; SERVER: 10.241.189.1#53(10.241.189.1) ;; WHEN: Wed Oct 23 19:07:28 +0545 2019 ;; MSG SIZE rcvd: 64 user@ose:~/openvpn$
- now let's see if I can hardcode www.opensourceecology.org. By default, it returns the internet IP address of our prod server per our public DNS records
user@ose:~/openvpn$ dig @10.241.189.1 www.opensourceecology.org ; <<>> DiG 9.10.3-P4-Debian <<>> @10.241.189.1 www.opensourceecology.org ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 40391 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ;; QUESTION SECTION: ;www.opensourceecology.org. IN A ;; ANSWER SECTION: www.opensourceecology.org. 120 IN A 138.201.84.243 ;; Query time: 214 msec ;; SERVER: 10.241.189.1#53(10.241.189.1) ;; WHEN: Wed Oct 23 19:09:07 +0545 2019 ;; MSG SIZE rcvd: 70 user@ose:~/openvpn$
- The dnsmasq.conf config says that it reads from /etc/hosts, so I just added a line to osedev1:/etc/hosts
[root@osedev1 20191023]# tail /etc/hosts 127.0.0.1 localhost.localdomain localhost 127.0.0.1 localhost4.localdomain4 localhost4 # The following lines are desirable for IPv6 capable hosts ::1 osedev1 osedev1 ::1 localhost.localdomain localhost ::1 localhost6.localdomain6 localhost6 # staging 10.241.189.11 www.opensourceecology.org [root@osedev1 20191023]#
- I tried the query again, but I still got the 138.201.84.243 address
user@ose:~/openvpn$ dig @10.241.189.1 www.opensourceecology.org ; <<>> DiG 9.10.3-P4-Debian <<>> @10.241.189.1 www.opensourceecology.org ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62221 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ;; QUESTION SECTION: ;www.opensourceecology.org. IN A ;; ANSWER SECTION: www.opensourceecology.org. 87 IN A 138.201.84.243 ;; Query time: 158 msec ;; SERVER: 10.241.189.1#53(10.241.189.1) ;; WHEN: Wed Oct 23 19:11:46 +0545 2019 ;; MSG SIZE rcvd: 70 user@ose:~/openvpn$
- I gave dnsmasq a restart (maybe caching issue?)
[root@osedev1 20191023]# service dnsmasq restart Redirecting to /bin/systemctl restart dnsmasq.service [root@osedev1 20191023]#
- And I tried again; it worked this time!
user@ose:~/openvpn$ dig @10.241.189.1 www.opensourceecology.org ; <<>> DiG 9.10.3-P4-Debian <<>> @10.241.189.1 www.opensourceecology.org ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 34890 ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ;; QUESTION SECTION: ;www.opensourceecology.org. IN A ;; ANSWER SECTION: www.opensourceecology.org. 0 IN A 10.241.189.11 ;; Query time: 146 msec ;; SERVER: 10.241.189.1#53(10.241.189.1) ;; WHEN: Wed Oct 23 19:11:56 +0545 2019 ;; MSG SIZE rcvd: 70 user@ose:~/openvpn$
- cool, so now to push that option in the vpn; I added the "push dhcp-option" line to /etc/openvpn/server/server.conf
push "dhcp-option DNS 10.241.189.1"
- I reconnected to the vpn from my laptop, but there were no changes to my /etc/resolv.conf. I tried to restart the openvpn server on osedev1
[root@osedev1 server]# systemctl restart openvpn@server.service [root@osedev1 server]#
- I still have no changes on my resolv.conf, but I do see the option in the output of the client
Wed Oct 23 19:20:42 2019 PUSH: Received control message: 'PUSH_REPLY,dhcp-option DNS 10.241.189.1,route 10.241.189.0 255.255.255.0,topology net30,ping 10,ping-restart 120,ifconfig 10.241.189.10 10.241.189.9,peer-id 2,cipher AES-256-GCM'
- ah, fuck, apparently the linux client of openvpn doesn't support the dhcp-option push https://unix.stackexchange.com/questions/201946/how-to-define-dns-server-in-openvpn
- this archlinux wiki has a solution for linux, but the pull-resolv-conf scripts live in a distinct location on centos https://wiki.archlinux.org/index.php/OpenVPN#DNS
[root@osedev1 server]# find / | grep -i pull-resolv-conf /usr/share/doc/openvpn-2.4.7/contrib/pull-resolv-conf /usr/share/doc/openvpn-2.4.7/contrib/pull-resolv-conf/client.down /usr/share/doc/openvpn-2.4.7/contrib/pull-resolv-conf/client.up ...
- these scripts actually have to live client-side, though. My client is debian-9. It doesn't have the 'pull-resolv-conf' scripts on it. But it does have 'update-resolv-conf' and 'systemd-resolved'. The latter isn't openvpn-specific, however. I think I should use '/etc/openvpn/update-resolv-conf'
root@ose:~# find / | grep -i pull-resolv-conf root@ose:~# find / | grep -i resolv-conf /etc/openvpn/update-resolv-conf root@ose:~# find / | grep -i systemd-resolved /usr/share/man/man8/systemd-resolved.service.8.gz /usr/share/man/man8/systemd-resolved.8.gz /lib/systemd/system/systemd-resolved.service.d /lib/systemd/system/systemd-resolved.service.d/resolvconf.conf /lib/systemd/system/systemd-resolved.service /lib/systemd/systemd-resolved root@ose:~# cat /etc/issue Debian GNU/Linux 9 \n \l root@ose:~# ls -lah /etc/openvpn/update-resolv-conf -rwxr-xr-x 1 root root 1.3K Oct 15 2018 /etc/openvpn/update-resolv-conf root@ose:~#
- I added the needful to my client.conf file, but it didn't do anything when I reconnected to the vpn
root@ose:/home/user/openvpn# tail client.conf # Silence repeating messages ;mute 20 # hardening tls-cipher TLS-DHE-RSA-WITH-AES-256-GCM-SHA384 # dns for staging script-security 2 up /etc/openvpn/update-resolv-conf down /etc/openvpn/update-resolv-conf root@ose:/home/user/openvpn#
- well, the first non-commented line in that script checks for the existence of /sbin/resolvconf and does `exit 0` if it doesn't exist. Yeah, it doesn't exist.
root@ose:/home/user/openvpn# grep resolvconf /etc/openvpn/update-resolv-conf # Used snippets of resolvconf script by Thomas Hood and Chris Hanson. [ -x /sbin/resolvconf ] || exit 0 echo -n "$R" | /sbin/resolvconf -a "${dev}.openvpn" /sbin/resolvconf -d "${dev}.openvpn" root@ose:/home/user/openvpn# ls -lah /sbin/resolvconf ls: cannot access '/sbin/resolvconf': No such file or directory root@ose:/home/user/openvpn#
- per the archlinux guide linked above, I installed the 'openresolv' package via apt-get. This time it worked!
user@ose:~/openvpn$ sudo openvpn client.conf ... Wed Oct 23 19:43:00 2019 PUSH: Received control message: 'PUSH_REPLY,dhcp-option DNS 10.241.189.1,route 10.241.189.0 255.255.255.0,topology net30,ping 10,ping-restart 120,ifconfig 10.241.189.10 10.241.189.9,peer-id 2,cipher AES-256-GCM' Wed Oct 23 19:43:00 2019 OPTIONS IMPORT: timers and/or timeouts modified Wed Oct 23 19:43:00 2019 OPTIONS IMPORT: --ifconfig/up options modified Wed Oct 23 19:43:00 2019 OPTIONS IMPORT: route options modified Wed Oct 23 19:43:00 2019 OPTIONS IMPORT: --ip-win32 and/or --dhcp-option options modified Wed Oct 23 19:43:00 2019 OPTIONS IMPORT: peer-id set Wed Oct 23 19:43:00 2019 OPTIONS IMPORT: adjusting link_mtu to 1624 Wed Oct 23 19:43:00 2019 OPTIONS IMPORT: data channel crypto options modified Wed Oct 23 19:43:00 2019 Data Channel Encrypt: Cipher 'AES-256-GCM' initialized with 256 bit key Wed Oct 23 19:43:00 2019 Data Channel Decrypt: Cipher 'AES-256-GCM' initialized with 256 bit key Wed Oct 23 19:43:00 2019 ROUTE_GATEWAY 10.137.0.6 Wed Oct 23 19:43:00 2019 TUN/TAP device tun0 opened Wed Oct 23 19:43:00 2019 TUN/TAP TX queue length set to 100 Wed Oct 23 19:43:00 2019 do_ifconfig, tt->did_ifconfig_ipv6_setup=0 Wed Oct 23 19:43:00 2019 /sbin/ip link set dev tun0 up mtu 1500 Wed Oct 23 19:43:00 2019 /sbin/ip addr add dev tun0 local 10.241.189.10 peer 10.241.189.9 Wed Oct 23 19:43:00 2019 /etc/openvpn/update-resolv-conf tun0 1500 1552 10.241.189.10 10.241.189.9 init dhcp-option DNS 10.241.189.1 Too few arguments. Wed Oct 23 19:43:00 2019 /sbin/ip route add 10.241.189.0/24 via 10.241.189.9 Wed Oct 23 19:43:00 2019 Initialization Sequence Completed
- And my laptop's new resolv.conf file
user@ose:~/openvpn$ cat /etc/resolv.conf # Generated by resolvconf nameserver 10.241.189.1 user@ose:~/openvpn$
- I refreshed the 'www.opensourceecology.org' page on my browser, and--boom--it's now showing staging! Success!!1one
- now, I finished adding the other hostnames to osedev1:/etc/hosts. Unfortunately, this list will have to be updated as-needed in the future (though see the dnsmasq alternative below)
[root@osedev1 pull-resolv-conf]# tail /etc/hosts 127.0.0.1 localhost.localdomain localhost 127.0.0.1 localhost4.localdomain4 localhost4 # The following lines are desirable for IPv6 capable hosts ::1 osedev1 osedev1 ::1 localhost.localdomain localhost ::1 localhost6.localdomain6 localhost6 # staging 10.241.189.11 www.opensourceecology.org opensourceecology.org awstats.opensourceecology.org fef.opensourceecology.org forum.opensourceecology.org microfactory.opensourceecology.org munin.opensourceecology.org opensourceecology.org oswh.opensourceecology.org phplist.opensourceecology.org store.opensourceecology.org wiki.opensourceecology.org awstats.openbuildinginstitute.org openbuildinginstitute.org seedhome.openbuildinginstitute.org www.openbuildinginstitute.org [root@osedev1 pull-resolv-conf]#
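- side note: dnsmasq's address= directive matches a domain and all of its subdomains, so two wildcard lines in /etc/dnsmasq.conf (or a /etc/dnsmasq.d/ file) could replace this whole /etc/hosts list and its maintenance burden; untested here, but it would look like
address=/opensourceecology.org/10.241.189.11
address=/openbuildinginstitute.org/10.241.189.11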
- I restarted dnsmasq and attempted to test www.openbuildinginstitute.org. Well, it kinda worked. It pointed to the staging server--which has an expired certificate. This means that I need to do another sync & automate this nginx config sed process. But it also means that I need to somehow kill the certbot cron on staging
- ...
- meanwhile, I logged into backblaze b2 to check the status of our backups of the dev node
- first of all, the prod 'ose-server-backups' bucket has 19 files totaling 300G. One file appears to be uploading at the moment. There are two from 2018-11 & 2018-12 at <20M, but the others vary in size from 17.5G to 18.4G.
- as for the new dev-specific 'ose-dev-server-backups' bucket, there's 0 fucking files
- I kicked off a backup; it completed relatively fast. There were no obvious errors during the upload, but the file is not visible on the wui
INFO: moving encrypted backup file to b2user's sync dir INFO: Beginning upload to backblaze b2 URL by file name: https://f001.backblazeb2.com/file/ose-dev-server-backups/daily_osedev1_20191023_144309.tar.gpg URL by fileId: https://f001.backblazeb2.com/b2api/v2/b2_download_file_by_id?fileId=4_z2675c17c55dd1d696edd0118_f10281b8779570cee_d20191023_m144325_c001_v0001130_t0041 { "action": "upload", "fileId": "4_z2675c17c55dd1d696edd0118_f10281b8779570cee_d20191023_m144325_c001_v0001130_t0041", "fileName": "daily_osedev1_20191023_144309.tar.gpg", "size": 18465051, "uploadTimestamp": 1571841805000 } real 0m27.979s user 0m1.037s sys 0m0.321s [root@osedev1 backups]# [root@osedev1 backups]# ./backup.sh
- the last upload appears to be from 20 days ago
[root@osedev1 backups]# ls -lah /home/b2user/sync total 18M drwxr-xr-x. 2 root root 4.0K Oct 23 16:43 . drwx------. 8 b2user b2user 4.0K Oct 23 16:43 .. -rw-r--r--. 1 b2user root 18M Oct 23 16:43 daily_osedev1_20191023_144309.tar.gpg [root@osedev1 backups]# ls -lah /home/b2user/sync.old total 17M drwxr-xr-x. 2 root root 4.0K Oct 3 07:24 . drwx------. 8 b2user b2user 4.0K Oct 23 16:43 .. -rw-r--r--. 1 b2user root 17M Oct 3 07:24 daily_osedev1_20191003_052448.tar.gpg [root@osedev1 backups]#
- the cron job looks good
[root@osedev1 backups]# cat /etc/cron.d/backup_to_backblaze 20 07 * * * root time /bin/nice /root/backups/backup.sh &>> /var/log/backups/backup.log 20 04 03 * * root time /bin/nice /root/backups/backupReport.sh [root@osedev1 backups]#
- but the logging dir doesn't exist; I created it
[root@osedev1 backups]# ls -lah /var/log/backups ls: cannot access /var/log/backups: No such file or directory [root@osedev1 backups]# mkdir /var/log/backups [root@osedev1 backups]#
- actually, after some time, the b2 wui now shows the files I just uploaded, totaling 36.9M. Wasn't the dev server in a broken state recently? That's probably what happened..
- well, I'll follow up in a few days. Hopefully it'll be stable for ~10 days through the monthly backup on 2019-11-01, which will have a 1-year retention time.
- ..
- ok, back to the sync. First, I fixed the hostname of the staging node so I don't do the sync the wrong way (!)
[root@opensourceecology ~]# vim /etc/hostname [root@opensourceecology ~]# cat /etc/hostname osestaging1 [root@opensourceecology ~]# [root@opensourceecology ~]# hostname osestaging1 [root@opensourceecology ~]# exit logout [maltfield@osestaging1 ~]$
- oh, shit, weird. I went to ssh into the prod server using `ssh opensourceecology.org`, but it ssh'd into staging because of the new dns changes. I fixed this by updating my .ssh/config file for the 'oseprod' Host line
user@ose:~$ head .ssh/config # OSE Host oseprod HostName 138.201.84.243 Port 32415 ForwardAgent yes IdentityFile /home/user/.ssh/id_rsa.ose User maltfield Host osedev1 HostName 195.201.233.113 user@ose:~$ user@ose:~$ ssh oseprod Last login: Wed Oct 23 15:01:19 2019 from 116.75.124.97 [maltfield@opensourceecology ~]$
- so I think I should put this sync & sed process into a script that lives on prod. This was the last command I see executed in screen on prod
[ "`hostname`" = "opensourceecology.org" ] && time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --numeric-ids --rsync-path="sudo rsync" --exclude=/root --exclude=/run --exclude=/home/b2user/sync* --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/
- In order to automate this, I'll also need to give root an ssh key that lives on prod and has the ability to ssh into the staging node as some sync user with NOPASSWD sudo rights. Of course, I do *not* want *any* such config that permits someone to do such a thing to our prod node, but granting prod access to staging in this way seems fair enough. If someone gains this locked-down key file from the prod server, we have bigger problems..
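- one extra lock-down I could add later (not implemented): pin the key in staging's authorized_keys to connections from the VPN subnet with an OpenSSH 'from=' option, e.g.
from="10.241.189.0/24" ssh-rsa AAAA...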
- I created a new script for this & locked it down
[root@opensourceecology bin]# date Wed Oct 23 15:10:06 UTC 2019 [root@opensourceecology bin]# pwd /root/bin [root@opensourceecology bin]# ls -lah syncToStaging.sh -rwx------ 1 root root 469 Oct 23 15:09 syncToStaging.sh [root@opensourceecology bin]# cat syncToStaging.sh #!/bin/bash set -x ################################################################################ # Author: Michael Altfield <michael at opensourceecology dot org> # Created: 2019-10-23 # Updated: 2019-10-23 # Version: 0.1 # Purpose: Syncs 99% of the prod node state to staging & staging-ifys it ################################################################################ ############ # SETTINGS # ############ ######## # EXIT # ######## # clean exit exit 0 [root@opensourceecology bin]#
- There is an existing rsa key for the root user on our prod server, but it's only 2048-bits. I think this was used to auth to our dreamhost server for scp-ing backups back in the day. In any case, it's too small; I generated a new one. Note that this key should only be used for ssh-ing into the staging server as a non-root (on the staging server). It should *not* be used to ssh into the prod server. And, of course, we should *never* allow root to ssh into any server anywhere. Oh, and, the staging server is also not exposed on the Internet; it's only accessible behind the VPN..
[root@opensourceecology bin]# ssh-keygen -lf /root/.ssh/id_rsa.pub 2048 SHA256:/LpjdDSJFVAt0a4d2PM3fWu7ci3VVwqQT0UxobZel2s root@CentOS-72-64-minimal (RSA) [root@opensourceecology bin]# ssh-keygen -t rsa -b 4096 -o -a 100 Generating public/private rsa key pair. Enter file in which to save the key (/root/.ssh/id_rsa): /root/.ssh/id_rsa.201910 Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /root/.ssh/id_rsa.201910. Your public key has been saved in /root/.ssh/id_rsa.201910.pub. ... [root@opensourceecology bin]# ls -lah /root/.ssh/id_rsa.201910* -rw------- 1 root root 3.4K Oct 23 15:27 /root/.ssh/id_rsa.201910 -rw-r--r-- 1 root root 752 Oct 23 15:27 /root/.ssh/id_rsa.201910.pub [root@opensourceecology bin]#
- now I need a non-root user (which will have to exist on both staging & production) that I'll both [a] give NOPASSWD sudo access on the staging server only and [b] grant ssh key authorized access to only on the staging server
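- a rough sketch of what that will look like on the staging side (not run yet; note the sudoers file deliberately lands under the /etc/sudo* rsync exclude, so the prod->staging sync can never clobber it)
useradd stagingsync
mkdir /home/stagingsync/.ssh
# paste in the id_rsa.201910.pub generated on prod above
vim /home/stagingsync/.ssh/authorized_keys
chown -R stagingsync:stagingsync /home/stagingsync/.ssh
chmod 700 /home/stagingsync/.ssh
chmod 600 /home/stagingsync/.ssh/authorized_keys
# NOPASSWD sudo on staging only
echo 'stagingsync ALL=(ALL) NOPASSWD:ALL' > /etc/sudoers.d/stagingsync
chmod 0440 /etc/sudoers.d/stagingsync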
Mon Oct 21, 2019
- earlier this month a critical vulnerability was fixed in sudo 1.8.28 https://www.sudo.ws/alerts/minus_1_uid.html
- I configured this server to auto-update security-related packages, but I didn't see any changes to `sudo` since I've been away. I *did* see updates to nginx, so why didn't sudo update? Indeed, it's stuck at 1.8.19p2-11
[root@opensourceecology ~]# rpm -qa | grep -i sudo sudo-1.8.19p2-11.el7_4.x86_64 [root@opensourceecology ~]#
- fortunately the issue is an edge-case that doesn't affect us; it only applies when the sudo config is set up to allow a defined user to run a defined command as any user except root https://access.redhat.com/security/cve/cve-2019-14287
- the fucking redhat solution is to fix your config, not to update sudo. A check-update run shows there *is* a newer version of sudo available
[root@opensourceecology ~]# yum check-update sudo Loaded plugins: fastestmirror, replace Loading mirror speeds from cached hostfile * base: mirror.checkdomain.de * epel: mirror.wiuwiu.de * extras: centosmirror.netcup.net * updates: mirror.checkdomain.de * webtatic: uk.repo.webtatic.com sudo.x86_64 1.8.23-4.el7 base [root@opensourceecology ~]#
- it looks like the '--changelog' arg to `rpm` only shows changes for what's installed, not prospective updates. So I updated
[root@opensourceecology ~]# yum install sudo Loaded plugins: fastestmirror, replace Loading mirror speeds from cached hostfile * base: mirror.checkdomain.de * epel: mirror.wiuwiu.de * extras: centosmirror.netcup.net * updates: mirror.checkdomain.de * webtatic: uk.repo.webtatic.com Resolving Dependencies --> Running transaction check ---> Package sudo.x86_64 0:1.8.19p2-11.el7_4 will be updated ---> Package sudo.x86_64 0:1.8.23-4.el7 will be an update --> Finished Dependency Resolution Dependencies Resolved ================================================================================ Package Arch Version Repository Size ================================================================================ Updating: sudo x86_64 1.8.23-4.el7 base 841 k Transaction Summary ================================================================================ Upgrade 1 Package Total download size: 841 k Is this ok [y/d/N]: y Downloading packages: Delta RPMs disabled because /usr/bin/applydeltarpm not installed. sudo-1.8.23-4.el7.x86_64.rpm | 841 kB 00:00 Running transaction check Running transaction test Transaction test succeeded Running transaction Updating : sudo-1.8.23-4.el7.x86_64 1/2 warning: /etc/sudoers created as /etc/sudoers.rpmnew Cleanup : sudo-1.8.19p2-11.el7_4.x86_64 2/2 Verifying : sudo-1.8.23-4.el7.x86_64 1/2 Verifying : sudo-1.8.19p2-11.el7_4.x86_64 2/2 Updated: sudo.x86_64 0:1.8.23-4.el7 Complete! [root@opensourceecology ~]#
- apparently the update doesn't patch this bug. ugh, I'm losing faith in cent/rhel over debian..
[root@opensourceecology ~]# rpm -q --changelog sudo | head * Wed Feb 20 2019 Radovan Sroka <rsroka@redhat.com> 1.8.23-4 - RHEL-7.7 erratum Resolves: rhbz#1672876 - Backporting sudo bug with expired passwords Resolves: rhbz#1665285 - Problem with sudo-1.8.23 and 'who am i' * Mon Sep 24 2018 Daniel Kopecek <dkopecek@redhat.com> 1.8.23-3 - RHEL-7.6 erratum Resolves: rhbz#1547974 - Rebase sudo to latest stable upstream version * Fri Sep 21 2018 Daniel Kopecek <dkopecek@redhat.com> 1.8.23-2 [root@opensourceecology ~]#
- well, that's all I can do for now on sudo
- regarding the package that *did* update, I got an email from ossec about changed packages two days ago on Oct 20th, and about the binaries' checksums changing
OSSEC HIDS Notification. 2019 Oct 20 04:39:44 Received From: opensourceecology->/var/log/messages Rule: 2932 fired (level 7) -> "New Yum package installed." Portion of the log(s): Oct 20 04:39:42 opensourceecology yum[29637]: Installed: nginx.x86_64 1:1.16.1-1.el7
- the changelog shows a sec update from 2 months ago. why so delayed?
[root@opensourceecology ~]# rpm -q --changelog nginx | head * Sun Sep 15 2019 Warren Togami <warren@blockstream.com> - add conditionals for EPEL7, see rhbz#1750857 * Tue Aug 13 2019 Jamie Nguyen <jamielinux@fedoraproject.org> - 1:1.16.1-1 - Update to upstream release 1.16.1 - Fixes CVE-2019-9511, CVE-2019-9513, CVE-2019-9516 * Thu Jul 25 2019 Fedora Release Engineering <releng@fedoraproject.org> - 1:1.16.0-5 - Rebuilt for https://fedoraproject.org/wiki/Fedora_31_Mass_Rebuild [root@opensourceecology ~]#
- the yum-cron package is responsible for updating security packages; it's kicked off daily
[root@opensourceecology log]# ls -lah /etc/cron.daily/0yum-daily.cron -rwxr-xr-x 1 root root 332 Aug 5 2017 /etc/cron.daily/0yum-daily.cron [root@opensourceecology log]# cat /etc/cron.daily/0yum-daily.cron #!/bin/bash # Only run if this flag is set. The flag is created by the yum-cron init # script when the service is started -- this allows one to use chkconfig and # the standard "service stop|start" commands to enable or disable yum-cron. if [[ ! -f /var/lock/subsys/yum-cron ]]; then exit 0 fi # Action! exec /usr/sbin/yum-cron [root@opensourceecology log]#
- the logs show that it was only updated on Oct 20
[root@opensourceecology log]# grep -ir nginx yum.log May 26 06:30:47 Updated: nginx-filesystem.noarch 1:1.12.2-3.el7 May 26 06:30:47 Updated: nginx-mod-http-perl.x86_64 1:1.12.2-3.el7 May 26 06:30:47 Updated: nginx-mod-mail.x86_64 1:1.12.2-3.el7 May 26 06:30:47 Updated: nginx-mod-stream.x86_64 1:1.12.2-3.el7 May 26 06:30:47 Updated: nginx-mod-http-image-filter.x86_64 1:1.12.2-3.el7 May 26 06:30:48 Updated: nginx-mod-http-xslt-filter.x86_64 1:1.12.2-3.el7 May 26 06:30:48 Updated: nginx-mod-http-geoip.x86_64 1:1.12.2-3.el7 May 26 06:30:48 Updated: nginx-all-modules.noarch 1:1.12.2-3.el7 May 26 06:30:48 Updated: nginx.x86_64 1:1.12.2-3.el7 Oct 20 04:39:42 Updated: nginx-filesystem.noarch 1:1.16.1-1.el7 Oct 20 04:39:42 Updated: nginx-mod-mail.x86_64 1:1.16.1-1.el7 Oct 20 04:39:42 Updated: nginx-mod-http-image-filter.x86_64 1:1.16.1-1.el7 Oct 20 04:39:42 Updated: nginx-mod-stream.x86_64 1:1.16.1-1.el7 Oct 20 04:39:42 Updated: nginx-mod-http-xslt-filter.x86_64 1:1.16.1-1.el7 Oct 20 04:39:42 Installed: nginx.x86_64 1:1.16.1-1.el7 Oct 20 04:39:42 Updated: nginx-mod-http-perl.x86_64 1:1.16.1-1.el7 Oct 20 04:39:42 Updated: nginx-all-modules.noarch 1:1.16.1-1.el7 Oct 20 04:39:42 Erased: nginx-mod-http-geoip [root@opensourceecology log]#
- and the yum-cron config looks sane
[root@opensourceecology log]# head /etc/yum/yum-cron.conf [commands] # What kind of update to use: # default = yum upgrade # security = yum --security upgrade # security-severity:Critical = yum --sec-severity=Critical upgrade # minimal = yum --bugfix update-minimal # minimal-security = yum --security update-minimal # minimal-security-severity:Critical = --sec-severity=Critical update-minimal update_cmd = minimal-security [root@opensourceecology log]#
- I still don't understand why it was delayed, but everything seems to be set up properly..
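- one possible culprit worth checking later: yum's --security filtering depends on updateinfo metadata, which the CentOS base & updates repos notoriously don't ship (EPEL does); that would explain why EPEL's nginx auto-updated while base's sudo sat still. These commands show what yum thinks is pending
yum updateinfo list security
yum --security check-update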
- ...
- anyway, returning to the dev/staging server setup; it looks like I can't VPN into our dev server anymore
user@ose:~/openvpn$ Tue Oct 22 21:24:32 2019 OpenVPN 2.4.0 x86_64-pc-linux-gnu [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Oct 14 2018 Tue Oct 22 21:24:32 2019 library versions: OpenSSL 1.0.2t 10 Sep 2019, LZO 2.08 Enter Private Key Password: * Tue Oct 22 21:24:35 2019 WARNING: this configuration may cache passwords in memory -- use the auth-nocache option to prevent this Tue Oct 22 21:24:35 2019 Outgoing Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Tue Oct 22 21:24:35 2019 Incoming Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Tue Oct 22 21:24:35 2019 TCP/UDP: Preserving recently used remote address: [AF_INET]195.201.233.113:1194 Tue Oct 22 21:24:35 2019 Socket Buffers: R=[212992->212992] S=[212992->212992] Tue Oct 22 21:24:35 2019 UDP link local: (not bound) Tue Oct 22 21:24:35 2019 UDP link remote: [AF_INET]195.201.233.113:1194 Tue Oct 22 21:25:35 2019 TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity) Tue Oct 22 21:25:35 2019 TLS Error: TLS handshake failed Tue Oct 22 21:25:35 2019 SIGUSR1[soft,tls-error] received, process restarting Tue Oct 22 21:25:35 2019 Restart pause, 5 second(s)
- And I can't ping the server either
user@ose:~$ ping 195.201.233.113 PING 195.201.233.113 (195.201.233.113) 56(84) bytes of data. ^C --- 195.201.233.113 ping statistics --- 104 packets transmitted, 0 received, 100% packet loss, time 105449ms user@ose:~$
- and ssh fails
user@ose:~$ ssh -vvvv osedev1 OpenSSH_7.4p1 Debian-10+deb9u7, OpenSSL 1.0.2t 10 Sep 2019 debug1: Reading configuration data /home/user/.ssh/config debug1: /home/user/.ssh/config line 8: Applying options for osedev1 debug1: Reading configuration data /etc/ssh/ssh_config debug1: /etc/ssh/ssh_config line 19: Applying options for * debug2: resolving "195.201.233.113" port 32415 debug2: ssh_connect_direct: needpriv 0 debug1: Connecting to 195.201.233.113 [195.201.233.113] port 32415. debug1: connect to address 195.201.233.113 port 32415: Connection timed out ssh: connect to host 195.201.233.113 port 32415: Connection timed out user@ose:~$
- logging into the hetzner cloud console shows that the box is online and sitting on the login screen. I tried to login, but after typing the username it freezes. Now my dev node is acting like my damn staging node was.
- I gave the dev server a reboot
- after a few minutes, I could ssh-in.
- and I could VPN-in as well.
- now when I start the staging container, I still get timeout issues
opensourceecology login: maltfield Password: login: timed out after 60 seconds CentOS Linux 7 (Core) Kernel 3.10.0-957.21.3.el7.x86_64 on an x86_64 opensourceecology login:
- it's worth noting that systemd-journal is chewing up >90% of the CPU on the host osedev1 server
- I added these 2x lines to the lxc container's config file per https://serverfault.com/questions/658052/systemd-journal-in-debian-jessie-lxc-container-eats-100-cpu
lxc.autodev = 1 lxc.kmsg = 0
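- for reference, the change only takes effect on a full container restart, i.e. something like (container name assumed)
lxc-stop --name osestaging1
lxc-start --name osestaging1 --daemon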
- I stopped the container & started it again; this time systemd on the host was <10% CPU usage, and I was able to login without any delay!
- I became root too, and that worked great!
- I had issues with ssh-ing in from my laptop, but after I disconnected from the VPN and reconnected, I was able to ssh into osestaging1 from my laptop!
- this time I was also able to become root and poke around at our new, shiny production clone server, cool!
user@ose:~/openvpn$ ssh -vvvvv osestaging1 ... Last login: Tue Oct 22 16:08:03 2019 [maltfield@opensourceecology ~]$ sudo su - Last login: Tue Oct 22 16:09:19 UTC 2019 on lxc/console [root@opensourceecology ~]# ls -lah /var/www/html/ | head total 100K drwxr-xr-x. 25 root root 4.0K Apr 9 2019 . drwxr-xr-x. 5 root root 4.0K Aug 23 2017 .. d---r-x---. 3 not-apache apache 4.0K Aug 8 2018 3dp.opensourceecology.org drwxr-xr-x. 3 root root 4.0K Dec 24 2017 awstats.openbuildinginstitute.org drwxr-xr-x. 3 root root 4.0K Feb 9 2018 awstats.opensourceecology.org drwxr-xr-x. 2 root root 4.0K Mar 2 2018 cacti.opensourceecology.org.old drwxr-xr-x. 3 apache apache 4.0K Feb 9 2018 certbot d---r-x---. 3 not-apache apache 4.0K Aug 7 2018 d3d.opensourceecology.org d---r-x---. 3 not-apache apache 4.0K Apr 9 2019 fef.opensourceecology.org [root@opensourceecology ~]#
- ss shows that varnish & apache are listening
[root@opensourceecology ~]# ss -plan | grep -i LISTEN u_str LISTEN 0 100 private/proxymap 183064 * 0 users:(("master",pid=782,fd=49)) u_str LISTEN 0 100 public/pickup 183032 * 0 users:(("pickup",pid=791,fd=6),("master",pid=782,fd=17)) u_str LISTEN 0 100 public/cleanup 183036 * 0 users:(("master",pid=782,fd=21)) u_str LISTEN 0 100 public/qmgr 183039 * 0 users:(("qmgr",pid=792,fd=6),("master",pid=782,fd=24)) u_str LISTEN 0 100 private/tlsmgr 183043 * 0 users:(("master",pid=782,fd=28)) u_str LISTEN 0 100 private/rewrite 183046 * 0 users:(("master",pid=782,fd=31)) u_str LISTEN 0 100 private/bounce 183049 * 0 users:(("master",pid=782,fd=34)) u_str LISTEN 0 100 private/defer 183052 * 0 users:(("master",pid=782,fd=37)) u_str LISTEN 0 100 private/trace 183055 * 0 users:(("master",pid=782,fd=40)) u_str LISTEN 0 128 /run/systemd/private 174128 * 0 users:(("systemd",pid=1,fd=12)) u_str LISTEN 0 128 /run/lvm/lvmpolld.socket 174135 * 0 users:(("systemd",pid=1,fd=20)) u_str LISTEN 0 128 /run/lvm/lvmetad.socket 174138 * 0 users:(("lvmetad",pid=24,fd=3),("systemd",pid=1,fd=21)) u_str LISTEN 0 128 /run/systemd/journal/stdout 174140 * 0 users:(("systemd-journal",pid=18,fd=3),("systemd",pid=1,fd=22)) u_str LISTEN 0 100 private/verify 183058 * 0 users:(("master",pid=782,fd=43)) u_str LISTEN 0 128 /tmp/ssh-bd3GlfYKNm/agent.1751 223092 * 0 users:(("sshd",pid=1751,fd=9)) u_str LISTEN 0 100 private/retry 183082 * 0 users:(("master",pid=782,fd=67)) u_str LISTEN 0 50 /var/lib/mysql/mysql.sock 187559 * 0 users:(("mysqld",pid=1011,fd=14)) u_str LISTEN 0 100 private/discard 183085 * 0 users:(("master",pid=782,fd=70)) u_str LISTEN 0 100 public/flush 183061 * 0 users:(("master",pid=782,fd=46)) u_str LISTEN 0 100 private/local 183088 * 0 users:(("master",pid=782,fd=73)) u_str LISTEN 0 100 private/virtual 183091 * 0 users:(("master",pid=782,fd=76)) u_str LISTEN 0 100 private/lmtp 183094 * 0 users:(("master",pid=782,fd=79)) u_str LISTEN 0 100 private/anvil 183097 * 0 users:(("master",pid=782,fd=82)) u_str LISTEN 0 100 private/scache 183100 * 0 users:(("master",pid=782,fd=85)) u_str LISTEN 0 100 private/proxywrite 183067 * 0 users:(("master",pid=782,fd=52)) u_str LISTEN 0 100 private/smtp 183070 * 0 users:(("master",pid=782,fd=55)) u_str LISTEN 0 100 private/relay 183073 * 0 users:(("master",pid=782,fd=58)) u_str LISTEN 0 100 public/showq 183076 * 0 users:(("master",pid=782,fd=61)) u_str LISTEN 0 100 private/error 183079 * 0 users:(("master",pid=782,fd=64)) u_str LISTEN 0 10 /var/run/acpid.socket 176097 * 0 users:(("acpid",pid=48,fd=5)) u_str LISTEN 0 128 /var/run/dbus/system_bus_socket 175844 * 0 users:(("dbus-daemon",pid=51,fd=3),("systemd",pid=1,fd=31)) tcp LISTEN 0 128 127.0.0.1:8000 *:* users:(("httpd",pid=520,fd=3),("httpd",pid=519,fd=3),("httpd",pid=518,fd=3),("httpd",pid=517,fd=3),("httpd",pid=516,fd=3),("httpd",pid=314,fd=3)) tcp LISTEN 0 128 127.0.0.1:6081 *:* users:(("varnishd",pid=1165,fd=6)) tcp LISTEN 0 10 127.0.0.1:6082 *:* users:(("varnishd",pid=1109,fd=5)) tcp LISTEN 0 128 127.0.0.1:8010 *:* users:(("httpd",pid=520,fd=4),("httpd",pid=519,fd=4),("httpd",pid=518,fd=4),("httpd",pid=517,fd=4),("httpd",pid=516,fd=4),("httpd",pid=314,fd=4)) tcp LISTEN 0 128 *:10000 *:* users:(("miniserv.pl",pid=533,fd=5)) tcp LISTEN 0 100 127.0.0.1:25 *:* users:(("master",pid=782,fd=13)) tcp LISTEN 0 128 *:32415 *:* users:(("sshd",pid=326,fd=3)) tcp LISTEN 0 128 :::4949 :::* users:(("munin-node",pid=379,fd=5)) tcp LISTEN 0 128 :::32415 :::* users:(("sshd",pid=326,fd=4)) [root@opensourceecology ~]#
- as expected, nginx is failing because it can't bind to the hardcoded external ip addresses that don't exist on this distinct server; we'll have to sed this later
[root@opensourceecology ~]# nginx -t nginx: the configuration file /etc/nginx/nginx.conf syntax is ok nginx: [emerg] bind() to 138.201.84.223:4443 failed (99: Cannot assign requested address) nginx: configuration file /etc/nginx/nginx.conf test failed [root@opensourceecology ~]#
- note that the hostname above is an exact match of the production server's. This is confusing for my logs and creates a risk of running commands on the wrong server. If possible, I should try to sed this back to 'osestaging1' or exclude the relevant configs from the rsync as well
- so it looks like apache is listening on 127.0.0.1:8000 for name-based vhosts, except certbot, which listens on 127.0.0.1:8010
[root@opensourceecology conf.d]# grep VirtualHost * 000-www.opensourceecology.org.conf:<VirtualHost 127.0.0.1:8000> 000-www.opensourceecology.org.conf:</VirtualHost> 00-fef.opensourceecology.org.conf:<VirtualHost 127.0.0.1:8000> 00-fef.opensourceecology.org.conf:</VirtualHost> 00-forum.opensourceecology.org.conf:<VirtualHost 127.0.0.1:8000> 00-forum.opensourceecology.org.conf:</VirtualHost> 00-microfactory.opensourceecology.org.conf:<VirtualHost 127.0.0.1:8000> 00-microfactory.opensourceecology.org.conf:</VirtualHost> 00-oswh.opensourceecology.org.conf:<VirtualHost 127.0.0.1:8000> 00-oswh.opensourceecology.org.conf:</VirtualHost> 00-phplist.opensourceecology.org.conf:<VirtualHost 127.0.0.1:8000> 00-phplist.opensourceecology.org.conf:</VirtualHost> 00-seedhome.openbuildinginstitute.org.conf:<VirtualHost 127.0.0.1:8000> 00-seedhome.openbuildinginstitute.org.conf:</VirtualHost> 00-store.opensourceecology.org.conf:<VirtualHost 127.0.0.1:8000> 00-store.opensourceecology.org.conf:</VirtualHost> 00-wiki.opensourceecology.org.conf:<VirtualHost 127.0.0.1:8000> 00-wiki.opensourceecology.org.conf:</VirtualHost> 00-www.openbuildinginstitute.org.conf:<VirtualHost 127.0.0.1:8000> 00-www.openbuildinginstitute.org.conf:</VirtualHost> awstats.openbuildinginstitute.org.conf:<VirtualHost 127.0.0.1:8000> awstats.openbuildinginstitute.org.conf:</VirtualHost> awstats.opensourceecology.org.conf:<VirtualHost 127.0.0.1:8000> awstats.opensourceecology.org.conf:</VirtualHost> certbot.conf:<VirtualHost 127.0.0.1:8010> certbot.conf:</VirtualHost> munin.opensourceecology.org.conf:<VirtualHost 127.0.0.1:8000> munin.opensourceecology.org.conf:</VirtualHost> ssl.conf.disabled:<VirtualHost _default_:443> ssl.conf.disabled:# moved outside VirtualHost block (see below) ssl.conf.disabled:# moved outside VirtualHost block (see below) ssl.conf.disabled:</VirtualHost> ssl.conf.orig:<VirtualHost _default_:443> ssl.conf.orig:</VirtualHost> ssl.openbuildinginstitute.org:# Purpose: To be included inside the <VirtualHost> block for all ssl.opensourceecology.org:# Purpose: To be included inside the <VirtualHost> block for all staging.openbuildinginstitute.org.conf.bak:<VirtualHost staging.openbuildinginstitute.org:8000> staging.openbuildinginstitute.org.conf.bak:</VirtualHost> staging.opensourceecology.org.conf:<VirtualHost 127.0.0.1:8000> staging.opensourceecology.org.conf:</VirtualHost> varnishTest.conf.disabled:<VirtualHost 127.0.0.1:8000> varnishTest.conf.disabled:</VirtualHost> [root@opensourceecology conf.d]#
- unfortunately I get 403 forbiddens for both with curl
[root@opensourceecology conf.d]# curl 127.0.0.1:8000/ <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <html><head> <title>403 Forbidden</title> </head><body> <h1>Forbidden</h1> <p>You don't have permission to access / on this server.</p> </body></html> [root@opensourceecology conf.d]# curl 127.0.0.1:8010/ <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <html><head> <title>403 Forbidden</title> </head><body> <h1>Forbidden</h1> <p>You don't have permission to access / on this server.</p> </body></html> [root@opensourceecology conf.d]#
- tailing the logs shows modsec blocking us from the fef vhost because we specified the Host as a numeric IP address. Well, ok.
==> fef.opensourceecology.org/error_log <== [Tue Oct 22 16:20:34.573535 2019] [:error] [pid 518] [client 127.0.0.1] ModSecurity: Access denied with code 403 (phase 2). Pattern match "^[\\\\d.:]+$" at REQUEST_HEADERS:Host. [file "/etc/httpd/modsecurity.d/activated_rules/modsecurity_crs_21_protocol_anomalies.conf"] [line "98"] [id "960017"] [rev "2"] [msg "Host header is a numeric IP address"] [data "127.0.0.1:8000"] [severity "WARNING"] [ver "OWASP_CRS/2.2.9"] [maturity "9"] [accuracy "9"] [tag "OWASP_CRS/PROTOCOL_VIOLATION/IP_HOST"] [tag "WASCTC/WASC-21"] [tag "OWASP_TOP_10/A7"] [tag "PCI/6.5.10"] [tag "http://technet.microsoft.com/en-us/magazine/2005.01.hackerbasher.aspx"] [hostname "127.0.0.1"] [uri "/"] [unique_id "Xa8sUlvJZ8GVfznr1gxo6AAAAAI"] ==> modsec_audit.log <== --cbc91b75-A-- [22/Oct/2019:16:20:34 +0000] Xa8sUlvJZ8GVfznr1gxo6AAAAAI 127.0.0.1 33594 127.0.0.1 8000 --cbc91b75-B-- GET / HTTP/1.1 User-Agent: curl/7.29.0 Host: 127.0.0.1:8000 Accept: */* --cbc91b75-F-- HTTP/1.1 403 Forbidden Content-Length: 202 Content-Type: text/html; charset=iso-8859-1 --cbc91b75-E-- --cbc91b75-H-- Message: Access denied with code 403 (phase 2). Pattern match "^[\\d.:]+$" at REQUEST_HEADERS:Host. [file "/etc/httpd/modsecurity.d/activated_rules/modsecurity_crs_21_protocol_anomalies.conf"] [line "98"] [id "960017"] [rev "2"] [msg "Host header is a numeric IP address"] [data "127.0.0.1:8000"] [severity "WARNING"] [ver "OWASP_CRS/2.2.9"] [maturity "9"] [accuracy "9"] [tag "OWASP_CRS/PROTOCOL_VIOLATION/IP_HOST"] [tag "WASCTC/WASC-21"] [tag "OWASP_TOP_10/A7"] [tag "PCI/6.5.10"] [tag "http://technet.microsoft.com/en-us/magazine/2005.01.hackerbasher.aspx"] Action: Intercepted (phase 2) Stopwatch: 1571761234559472 14421 (- - -) Stopwatch2: 1571761234559472 14421; combined=4661, p1=4516, p2=113, p3=0, p4=0, p5=32, sr=3976, sw=0, l=0, gc=0 Response-Body-Transformed: Dechunked Producer: ModSecurity for Apache/2.7.3 (http://www.modsecurity.org/); OWASP_CRS/2.2.9. Server: Apache Engine-Mode: "ENABLED" --cbc91b75-Z-- ==> fef.opensourceecology.org/access_log <== 127.0.0.1 - - [22/Oct/2019:16:20:34 +0000] "GET / HTTP/1.1" 403 202 "-" "curl/7.29.0"
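- since rule 960017 just rejects a numeric Host header, a workaround for local testing (a sketch; I haven't run this yet) would be to pass the vhost's real name as the Host header so modsec is satisfied:
# hypothetical local test: supply a name-based Host header instead of a bare IP
curl -si -H 'Host: fef.opensourceecology.org' http://127.0.0.1:8000/ | head -n 5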
- attempting 8000 does a redirect that strips the port from the URL; attempting 8010 works! The latter is just an empty docroot that gets populated by `certbot` for renewing certs on some complicated non-public vhost sites
[root@opensourceecology ~]# curl -i http://localhost:8000/ HTTP/1.1 301 Moved Permanently Date: Tue, 22 Oct 2019 16:22:24 GMT Server: Apache X-VC-Enabled: true X-VC-TTL: 86400 Location: http://localhost/ X-XSS-Protection: 1; mode=block Content-Length: 0 Content-Type: text/html; charset=UTF-8 [root@opensourceecology ~]# curl -i http://localhost:8010/ HTTP/1.1 200 OK Date: Tue, 22 Oct 2019 16:23:43 GMT Server: Apache Last-Modified: Fri, 09 Feb 2018 20:56:47 GMT Accept-Ranges: bytes Content-Length: 18 X-XSS-Protection: 1; mode=block Content-Type: text/html; charset=UTF-8 can you see this? [root@opensourceecology ~]#
- this is going to be a pain; let's see if I can get nginx working; we have to fix '138.201.84.223'
[root@opensourceecology nginx]# nginx -t nginx: the configuration file /etc/nginx/nginx.conf syntax is ok nginx: [emerg] bind() to 138.201.84.223:4443 failed (99: Cannot assign requested address) nginx: configuration file /etc/nginx/nginx.conf test failed [root@opensourceecology nginx]# [root@opensourceecology nginx]# grep -irl '138.201.84.223' * conf.d/www.openbuildinginstitute.org.conf conf.d/wiki.opensourceecology.org.conf conf.d/seedhome.openbuildinginstitute.org.conf conf.d/www.opensourceecology.org.conf conf.d/awstats.openbuildinginstitute.org.conf nginx.conf [root@opensourceecology nginx]#
- I replaced the first IP for OBI with our VPN IP
[root@opensourceecology nginx]# sed -i 's/138.201.84.223/10.241.189.11/g' nginx.conf [root@opensourceecology nginx]# sed -i 's/138.201.84.223/10.241.189.11/g' conf.d/* [root@opensourceecology nginx]#
- And then I replaced the second IP, for OSE, with our VPN IP as well
[root@opensourceecology nginx]# nginx -t nginx: the configuration file /etc/nginx/nginx.conf syntax is ok nginx: [emerg] bind() to 138.201.84.243:4443 failed (99: Cannot assign requested address) nginx: configuration file /etc/nginx/nginx.conf test failed [root@opensourceecology nginx]# sed -i 's/138.201.84.243/10.241.189.11/g' nginx.conf [root@opensourceecology nginx]# sed -i 's/138.201.84.243/10.241.189.11/g' conf.d/* [root@opensourceecology nginx]#
- well, now there's a duplicate line listening on this same IP; I removed it from nginx.conf
- And now I'm having issues with a duplicate default_server line. Oh, right: now that OBI and OSE share the same IP, I'll make OSE the default server and remove that directive from OBI
[root@opensourceecology conf.d]# nginx -t nginx: [emerg] a duplicate default server for 10.241.189.11:443 in /etc/nginx/conf.d/www.opensourceecology.org.conf:58 nginx: configuration file /etc/nginx/nginx.conf test failed [root@opensourceecology conf.d]# grep -irl 'default_server' * www.openbuildinginstitute.org.conf www.opensourceecology.org.conf [root@opensourceecology conf.d]# vim www.openbuildinginstitute.org.conf
- Aaand now it's failing on the same issue but for the IPv6 addresses. I'm just going to comment those out entirely for the staging server
[root@opensourceecology conf.d]# nginx -t nginx: [warn] conflicting server name "_" on 10.241.189.11:443, ignored nginx: the configuration file /etc/nginx/nginx.conf syntax is ok nginx: [emerg] bind() to [2a01:4f8:172:209e::2]:443 failed (99: Cannot assign requested address) nginx: configuration file /etc/nginx/nginx.conf test failed [root@opensourceecology conf.d]# grep -irl '2a01:4f8:172:209e::2' * awstats.opensourceecology.org.conf fef.opensourceecology.org.conf forum.opensourceecology.org.conf microfactory.opensourceecology.org munin.opensourceecology.org.conf oswh.opensourceecology.org.conf store.opensourceecology.org.conf wiki.opensourceecology.org.conf www.opensourceecology.org.conf
- This last sed fixed it!
[root@opensourceecology conf.d]# sed -i 's^\(\s*\)[^#]*listen \[2a01:4f8:172:209e::2\(.*\)^\1#listen \[2a01:4f8:172:209e::2\2^' * [root@opensourceecology conf.d]# nginx -t nginx: [warn] conflicting server name "_" on 10.241.189.11:443, ignored nginx: the configuration file /etc/nginx/nginx.conf syntax is ok nginx: configuration file /etc/nginx/nginx.conf test is successful [root@opensourceecology conf.d]#
- I added these lines to /etc/hosts to make a new domain 'staging.www.opensourceecology.org' point to this IP address; it works!
[root@opensourceecology conf.d]# tail /etc/hosts fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters ff02::3 ip6-allhosts 2a01:4f8:172:209e::2 hetzner2.opensourceecology.org hetzner2 # staging 127.0.0.1 staging.www.opensourceecology.org [root@opensourceecology conf.d]# [root@opensourceecology conf.d]# curl -si https://staging.opensourceecology.org | tail var mo_theme = {"name_required":"Please provide your name","name_format":"Your name must consist of at least 5 characters","email_required":"Please provide a valid email address","url_required":"Please provide a valid URL","phone_required":"Minimum 5 characters required","human_check_failed":"The input the correct value for the equation above","message_required":"Please input the message","message_format":"Your message must be at least 15 characters long","success_message":"Your message has been sent. Thanks!","blog_url":"https:\/\/staging.opensourceecology.org","loading_portfolio":"Loading the next set of posts...","finished_loading":"No more items to load..."}; /* ]]> */ </script> <script type='text/javascript' src='https://staging.opensourceecology.org/wp-content/themes/enigmatic/js/main.js?ver=1.6'></script> <script type='text/javascript' src='https://staging.opensourceecology.org/wp-includes/js/wp-embed.min.js?ver=4.9.4'></script> </body> </html> [root@opensourceecology conf.d]#
- I then added lines for 'staging.www.opensourceecology.org' and 'www.opensourceecology.org' pointing to my staging server's VPN IP address on my laptop and fired up firefox; I was successfully able to access the staging site's nginx -> varnish -> httpd stack!
10.241.189.11 www.opensourceecology.org 10.241.189.11 staging.www.opensourceecology.org
- note that, of course, I get a cert error when attempting to access 'staging.www.opensourceecology.org', but it loads fine when hitting 'www.opensourceecology.org'. I'll have to think more about how I want to fix this. If one is on the VPN, should they automatically be forced to the staging site? That seems like it could create confusion, but if the names are *not* the same, then I'm sure lots of errors will be encountered with links and such; so perhaps that *is* the most logical thing to do...
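- note that curl can test the same thing without touching /etc/hosts at all; a sketch using its --resolve option (-k tolerates the staging cert mismatch):
# pin the prod name to the staging VPN IP for this one request only
curl -sik --resolve www.opensourceecology.org:443:10.241.189.11 https://www.opensourceecology.org/ | head -n 5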
- oh fuck. now, somehow, I am getting emails from OSSEC on the staging server. I'll have to fix that too. For now I just stopped the ossec service on the staging server
Tue Oct 08, 2019
- continuing from yesterday, I checked-up on the rsync running from prod to staging, and it appears to have stalled
75497472 100% 2.90MB/s 0:00:24 (xfer#4297, to-check=1538/7463) run/log/journal/34a04596e14a410d9f2f816d507c55ab/system@fb40211581a0421d8abbe026c6a270ac-00000000000d4a57-00058e887df34962.journal 75497472 100% 2.80MB/s 0:00:25 (xfer#4298, to-check=1537/7463) run/log/journal/34a04596e14a410d9f2f816d507c55ab/system@fb40211581a0421d8abbe026c6a270ac-00000000000e7f5a-00058ec8f2c8422b.journal 23429120 31% 2.91MB/s 0:00:17
- it's probably not a good idea to sync the /run dir..
- attempting to ssh into the server fails
user@ose:~/openvpn$ ssh osestaging1 The authenticity of host '[10.241.189.11]:32415 ([10.241.189.11]:32415)' can't be established. ECDSA key fingerprint is SHA256:HclF8ZQOjGqx+9TmwL111kZ7QxgKkoEw8g3l2YxV0gk. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '[10.241.189.11]:32415' (ECDSA) to the list of known hosts. Permission denied (publickey). user@ose:~/openvpn$
- I _can_ get into the staging server from the lxc-console on the dev server, but it doesn't look like anything is wrong with the setup of my user
[root@osestaging1 ~]# grep maltfield /etc/passwd maltfield:x:1005:1005::/home/maltfield:/bin/bash [root@osestaging1 ~]# grep maltfield /etc/shadow maltfield:TRUNCATED [root@osestaging1 ~]# grep maltfield /etc/group wheel:x:10:maltfield,crupp,tgriffing,root apache:x:48:cmota,crupp,maltfield,wp,apache,marcin maltfield:x:1005:apache sshaccess:x:1006:cmota,marcin,tgriffing,maltfield,lberezhny,crupp keepass:x:993:maltfield,marcin,cmota,crupp apache-admins:x:1012:cmota,maltfield,marcin,crupp,tgriffing,wp,apache [root@osestaging1 ~]# ls -lah /home/maltfield/.ssh total 16K drwxr-xr-x. 2 tgriffing maltfield 4.0K Jan 19 2018 . drwx------. 10 tgriffing maltfield 4.0K Oct 3 07:06 .. -rw-r--r--. 1 root root 750 Jun 20 2017 authorized_keys -rw-r--r--. 1 tgriffing tgriffing 1.1K Oct 3 13:44 known_hosts [root@osestaging1 ~]# cat /home/maltfield/.ssh/authorized_keys ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDGNYjR7UKiJSAG/AbP+vlCBqNfQZ2yuSXfsEDuM7cEU8PQNJyuJnS7m0VcA48JRnpUpPYYCCB0fqtIEhpP+szpMg2LByfTtbU0vDBjzQD9mEfwZ0mzJsfzh1Nxe86l/d6h6FhxAqK+eG7ljYBElDhF4l2lgcMAl9TiSba0pcqqYBRsvJgQoAjlZOIeVEvM1lyfWfrmDaFK37jdUCBWq8QeJ98qpNDX4A76f9T5Y3q5EuSFkY0fcU+zwFxM71bGGlgmo5YsMMdSsW+89fSG0652/U4sjf4NTHCpuD0UaSPB876NJ7QzeDWtOgyBC4nhPpS8pgjsnl48QZuVm6FNDqbXr9bVk5BdntpBgps+gXdSL2j0/yRRayLXzps1LCdasMCBxCzK+lJYWGalw5dNaIDHBsEZiK55iwPp0W3lU9vXFO4oKNJGFgbhNmn+KAaW82NBwlTHo/tOlj2/VQD9uaK5YLhQqAJzIq0JuWZWFLUC2FJIIG0pJBIonNabANcN+vq+YJqjd+JXNZyTZ0mzuj3OAB/Z5zS6lT9azPfnEjpcOngFs46P7S/1hRIrSWCvZ8kfECpa8W+cTMus4rpCd40d1tVKzJA/n0MGJjEs2q4cK6lC08pXxq9zAyt7PMl94PHse2uzDFhrhh7d0ManxNZE+I5/IPWOnG1PJsDlOe4Yqw== guttersnipe@guttersnipe [root@osestaging1 ~]#
- ssh appears to be running too
[root@osestaging1 ~]# systemctl list-units | grep -i ssh sshd.service loaded active running OpenSSH server daemon [root@osestaging1 ~]# ss -plan | grep -i ssh u_str ESTAB 0 0 * 32621 * 32622 users:(("sshd",pid=350,fd=5)) u_dgr UNCONN 0 0 * 32618 * 29344 users:(("sshd",pid=350,fd=4),("sshd",pid=348,fd=4)) u_str ESTAB 0 0 * 31143 * 0 users:(("sshd",pid=274,fd=2),("sshd",pid=274,fd=1)) u_str ESTAB 0 0 * 32622 * 32621 users:(("sshd",pid=348,fd=7)) tcp LISTEN 0 128 *:32415 *:* users:(("sshd",pid=274,fd=3)) tcp ESTAB 0 0 10.241.189.11:32415 10.241.189.10:41270 users:(("sshd",pid=350,fd=3),("sshd",pid=348,fd=3)) tcp LISTEN 0 128 [::]:32415 [::]:* users:(("sshd",pid=274,fd=4)) [root@osestaging1 ~]#
- the ssh server logs say that the client just disconnects
Oct 8 05:57:01 localhost sshd[3586]: Connection closed by 10.241.189.10 port 41334 [preauth]
- the ssh client says that the server rejected our public key
user@ose:~/openvpn$ ssh -vvv osestaging1 ... debug1: Next authentication method: publickey debug1: Offering RSA public key: /home/user/.ssh/id_rsa.ose debug3: send_pubkey_test debug3: send packet: type 50 debug2: we sent a publickey packet, wait for reply debug3: receive packet: type 51 debug1: Authentications that can continue: publickey debug2: we did not send a packet, disable method debug1: No more authentication methods to try. Permission denied (publickey). user@ose:~/openvpn$
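- a likely culprit (an educated guess at this point): sshd's StrictModes option, which is on by default and makes sshd silently refuse keys when the user's home dir or ~/.ssh is owned by the wrong user or is group/world-writable. Some commands to check it, run on the staging server:
# show the numeric ownership & modes that StrictModes cares about
ls -ldn /home/maltfield /home/maltfield/.ssh /home/maltfield/.ssh/authorized_keys
# optionally run a one-off debug sshd on a spare port to see its refusal reason
/usr/sbin/sshd -d -p 2222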
- I did notice that the ownership of the relevant /home/maltfield/.ssh dir differs on the prod & staging servers
[maltfield@opensourceecology ~]$ ls -lahn /home/maltfield/.ssh total 16K drwxr-xr-x 2 1005 1005 4.0K Jan 19 2018 . drwx------ 10 1005 1005 4.0K Oct 3 07:06 .. -rw-r--r-- 1 0 0 750 Jun 20 2017 authorized_keys -rw-r--r-- 1 1005 1005 1.1K Oct 3 13:44 known_hosts [maltfield@opensourceecology ~]$
[root@osestaging1 ~]# ls -lahn /home/maltfield/.ssh total 16K drwxr-xr-x. 2 1000 1005 4.0K Jan 19 2018 . drwx------. 10 1000 1005 4.0K Oct 3 07:06 .. -rw-r--r--. 1 0 0 750 Jun 20 2017 authorized_keys -rw-r--r--. 1 1000 1000 1.1K Oct 3 13:44 known_hosts [root@osestaging1 ~]#
- while the passwd, group, and shadow files all match
[root@opensourceecology ~]# md5sum /etc/passwd cabf495ca12f7f32605eb764dd12c861 /etc/passwd [root@opensourceecology ~]# md5sum /etc/group 04a70553d59a646406ecb89f2f7b17b5 /etc/group [root@opensourceecology ~]# md5sum /etc/shadow 6f27deaf639ae2db1a1d94739a8bb834 /etc/shadow [root@opensourceecology ~]#
[root@osestaging1 ~]# md5sum /etc/passwd cabf495ca12f7f32605eb764dd12c861 /etc/passwd [root@osestaging1 ~]# md5sum /etc/group 04a70553d59a646406ecb89f2f7b17b5 /etc/group [root@osestaging1 ~]# md5sum /etc/shadow 6f27deaf639ae2db1a1d94739a8bb834 /etc/shadow [root@osestaging1 ~]#
- for some reason my '/home/maltfield' dir was also owned by 'tgriffing'. I was able to ssh in again after fixing this
[root@osestaging1 ~]# chown -R maltfield:maltfield /home/maltfield/ [root@osestaging1 ~]# ls -lah /home total 52K drwxr-xr-x. 13 root root 4.0K Jul 28 2018 . dr-xr-xr-x. 20 root root 4.0K Oct 7 10:05 .. drwx------. 7 b2user b2user 4.0K Oct 7 07:46 b2user drwx------. 5 cmota cmota 4.0K Jul 14 2017 cmota drwx------. 5 crupp crupp 4.0K Aug 12 2017 crupp drwx------. 2 Flipo Flipo 4.0K Sep 20 2016 Flipo drwx------. 2 hart hart 4.0K Mar 30 2017 hart drwx------. 3 lberezhny lberezhny 4.0K Jul 20 2017 lberezhny drwx------. 10 maltfield maltfield 4.0K Oct 3 07:06 maltfield drwx------. 4 marcin marcin 4.0K Jul 6 2017 marcin drwx------. 2 not-apache not-apache 4.0K Feb 12 2018 not-apache drwx------. 5 tgriffing tgriffing 4.0K Aug 1 09:19 tgriffing drwx------. 5 wp wp 4.0K Oct 7 2017 wp [root@osestaging1 ~]#
- I re-opened the screen for the rsync and found that it had exited
75497472 100% 2.90MB/s 0:00:24 (xfer#4297, to-check=1538/7463) run/log/journal/34a04596e14a410d9f2f816d507c55ab/system@fb40211581a0421d8abbe026c6a270ac-00000000000d4a57-00058e887df34962.journal 75497472 100% 2.80MB/s 0:00:25 (xfer#4298, to-check=1537/7463) run/log/journal/34a04596e14a410d9f2f816d507c55ab/system@fb40211581a0421d8abbe026c6a270ac-00000000000e7f5a-00058ec8f2c8422b.journal 23429120 31% 2.91MB/s 0:00:17 packet_write_wait: Connection to 10.241.189.11 port 32415: Broken pipe rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32) rsync: connection unexpectedly closed (119371 bytes received so far) [sender] rsync error: unexplained error (code 255) at io.c(605) [sender=3.0.9] real 1059m42.282s user 12m34.775s sys 3m5.253s [maltfield@opensourceecology ~]$ [maltfield@opensourceecology ~]$ [maltfield@opensourceecology ~]$ [maltfield@opensourceecology ~]$ [maltfield@opensourceecology ~]$ [maltfield@opensourceecology ~]$ [maltfield@opensourceecology ~]$ time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --rsync-path="sudo rsync" --exclude=/root --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/
- I updated the rsync command to exclude /run, and I kicked-off the rsync again
time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --rsync-path="sudo rsync" --exclude=/root --exclude=/run --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/
- ah, ffs! my internet connection here failed me, and I was silently disconnected from my ssh session with the prod node and dumped into a local shell. So I ended up kicking off this rsync not from the prod node on which I had been ssh'd, but from my personal laptop (when I was dropped out of the prod server's ssh shell into my laptop's shell). By the time I realized it, the fucking staging server was broken!
- fucking hell, I had successfully copied 35G overnight; now I have to restore from snapshot and start over.
- I prepended a fucking hostname check to make sure this stupid shit doesn't happen again
[ "`hostname`" = "opensourceecology.org" ] && time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --rsync-path="sudo rsync" --exclude=/root --exclude=/run --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/
- I had a bunch of issues restoring from snapshot; eventually I just did an rsync of the '/var/lib/lxcsnaps/osestaging1/snap1' dir to '/var/lib/lxc/osestaging1', and I was finally able to `lxc-start -n osestaging1` successfully
- I did the `visudo` and install of rsync and re-initiated the rsync from prod to staging using the above command. I noticed that I forgot to exclude the backups; here's what I should use next time
[ "`hostname`" = "opensourceecology.org" ] && time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --rsync-path="sudo rsync" --exclude=/root --exclude=/run --exclude=/home/b2user/sync --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/
- while that ran, I checked our munin graphs. I nice'd & bwlimit'd the above rsync, but it's still good to check.
- there's a spike in varnish requests, which is a bit odd
- there was a shift in memory usage, but no issues there
- load spiked to ~2, but our box has 8; no problems
- there was a spike in 'nice' to ~100% cpu usage; cool
- firewall throughput, eth0 traffic spiked to about the same level as our backups. excellent
- there's a huge spike in disk usage read, disk IO that's much higher than backups; hmm
- I also noted that the apache graphs that I added some time ago are blank; I probably have to setup an apache stats vhost for munin to scrape
- munin processing graphs are also blank; hmm
- all mysql graphs are also blank
- even nginx graphs are all blank
- I also added plugins for monitoring the 'mysqld' process and the memory of a bunch of processes
[root@opensourceecology plugins]# ls apache_access if_err_eth0 mysql_slowqueries uptime varnish_memory_usage.bak apache_processes if_eth0 mysql_threads users varnish_objects apache_volume interrupts nginx_request varnish4_ varnish_objects.bak cpu irqstats nginx_status varnish_backend_traffic varnish_request_rate df load open_files varnish_backend_traffic.bak varnish_request_rate.bak df_inode memory open_inodes varnish_bad varnish_threads diskstats munin_stats postfix_mailqueue varnish_bad.bak varnish_threads.bak entropy mysql_ postfix_mailvolume varnish_expunge varnish_transfer_rates forks mysql_bytes processes varnish_expunge.bak varnish_transfer_rates.bak fw_conntrack mysql_innodb proc_pri varnish_hit_rate varnish_uptime fw_forwarded_local mysql_isam_space_ swap varnish_hit_rate.bak varnish_uptime.bak fw_packets mysql_queries threads varnish_memory_usage vmstat [root@opensourceecology plugins]# ls -lah | head -n 5 total 36K drwxr-xr-x 2 root root 4.0K Sep 7 07:37 . drwxr-xr-x 8 root root 4.0K Jun 24 16:05 .. lrwxrwxrwx 1 root root 38 Sep 7 07:36 apache_access -> /usr/share/munin/plugins/apache_access lrwxrwxrwx 1 root root 41 Sep 7 07:36 apache_processes -> /usr/share/munin/plugins/apache_processes [root@opensourceecology plugins]# ln -s /usr/share/munin/plugins/multip multiping multips multips_memory [root@opensourceecology plugins]# ln -s /usr/share/munin/plugins/multips_memory [root@opensourceecology plugins]# ln -s /usr/share/munin/plugins/ps_ ps_mysqld [root@opensourceecology plugins]#
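- side note: munin ships a helper that can propose these symlinks automatically; a sketch (worth reviewing its output before piping anything to a shell):
# list the plugins munin thinks this node should enable
munin-node-configure --suggest
# print the corresponding ln -s commands without executing them
munin-node-configure --shell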
- for the munin mysql graphs, it looks like I need to grant access for the 'munin' user
[root@opensourceecology plugin-conf.d]# munin-run --debug mysql_queries # Processing plugin configuration from /etc/munin/plugin-conf.d/amavis # Processing plugin configuration from /etc/munin/plugin-conf.d/df # Processing plugin configuration from /etc/munin/plugin-conf.d/fw_ # Processing plugin configuration from /etc/munin/plugin-conf.d/hddtemp_smartctl # Processing plugin configuration from /etc/munin/plugin-conf.d/munin-node # Processing plugin configuration from /etc/munin/plugin-conf.d/postfix # Processing plugin configuration from /etc/munin/plugin-conf.d/postgres # Processing plugin configuration from /etc/munin/plugin-conf.d/sendmail # Processing plugin configuration from /etc/munin/plugin-conf.d/zzz-ose # Setting /rgid/ruid/ to /99/99/ # Setting /egid/euid/ to /99 99/99/ # Setting up environment # Environment mysqlopts = -u munin # About to run '/etc/munin/plugins/mysql_queries' mysqladmin: connect to server at 'localhost' failed error: 'Access denied for user 'munin'@'localhost' (using password: NO)' [root@opensourceecology plugin-conf.d]#
- woah, this guide suggests that there are a ton more graphs available than just what's symlink-able by default https://blog.penumbra.be/2010/04/monitoring-mysql-munin-directadmin/
[root@opensourceecology plugins]# ls -lah mysql_* lrwxrwxrwx 1 root root 31 Sep 7 07:36 mysql_ -> /usr/share/munin/plugins/mysql_ lrwxrwxrwx 1 root root 36 Sep 7 07:36 mysql_bytes -> /usr/share/munin/plugins/mysql_bytes lrwxrwxrwx 1 root root 37 Sep 7 07:36 mysql_innodb -> /usr/share/munin/plugins/mysql_innodb lrwxrwxrwx 1 root root 42 Sep 7 07:36 mysql_isam_space_ -> /usr/share/munin/plugins/mysql_isam_space_ lrwxrwxrwx 1 root root 38 Sep 7 07:36 mysql_queries -> /usr/share/munin/plugins/mysql_queries lrwxrwxrwx 1 root root 42 Sep 7 07:36 mysql_slowqueries -> /usr/share/munin/plugins/mysql_slowqueries lrwxrwxrwx 1 root root 38 Sep 7 07:36 mysql_threads -> /usr/share/munin/plugins/mysql_threads [root@opensourceecology plugins]# ls -lah /usr/share/munin/plugins/mysql_* -rwxr-xr-x 1 root root 33K Mar 3 2017 /usr/share/munin/plugins/mysql_ -rwxr-xr-x 1 root root 1.8K Mar 3 2017 /usr/share/munin/plugins/mysql_bytes -rwxr-xr-x 1 root root 5.4K Mar 3 2017 /usr/share/munin/plugins/mysql_innodb -rwxr-xr-x 1 root root 5.7K Mar 3 2017 /usr/share/munin/plugins/mysql_isam_space_ -rwxr-xr-x 1 root root 2.5K Mar 3 2017 /usr/share/munin/plugins/mysql_queries -rwxr-xr-x 1 root root 1.5K Mar 3 2017 /usr/share/munin/plugins/mysql_slowqueries -rwxr-xr-x 1 root root 1.7K Mar 3 2017 /usr/share/munin/plugins/mysql_threads [root@opensourceecology plugins]# /usr/share/munin/plugins/mysql_ suggest bin_relay_log commands connections files_tables innodb_bpool innodb_bpool_act innodb_insert_buf innodb_io innodb_io_pend innodb_log innodb_rows innodb_semaphores innodb_tnx myisam_indexes network_traffic qcache qcache_mem replication select_types slow sorts table_locks tmp_tables [root@opensourceecology plugins]#
- I added all the mysql things
[root@opensourceecology plugins]# ls -lah mysql_* lrwxrwxrwx 1 root root 31 Sep 7 07:36 mysql_ -> /usr/share/munin/plugins/mysql_ lrwxrwxrwx 1 root root 36 Sep 7 07:36 mysql_bytes -> /usr/share/munin/plugins/mysql_bytes lrwxrwxrwx 1 root root 37 Sep 7 07:36 mysql_innodb -> /usr/share/munin/plugins/mysql_innodb lrwxrwxrwx 1 root root 42 Sep 7 07:36 mysql_isam_space_ -> /usr/share/munin/plugins/mysql_isam_space_ lrwxrwxrwx 1 root root 38 Sep 7 07:36 mysql_queries -> /usr/share/munin/plugins/mysql_queries lrwxrwxrwx 1 root root 42 Sep 7 07:36 mysql_slowqueries -> /usr/share/munin/plugins/mysql_slowqueries lrwxrwxrwx 1 root root 38 Sep 7 07:36 mysql_threads -> /usr/share/munin/plugins/mysql_threads [root@opensourceecology plugins]# rm -rf mysql_* [root@opensourceecology plugins]# ln -sf /usr/share/munin/plugins/mysql_ mysql_ [root@opensourceecology plugins]# for i in `./mysql_ suggest`; \ > do ln -sf /usr/share/munin/plugins/mysql_ $i; done [root@opensourceecology plugins]# ls -lah mysql_* lrwxrwxrwx 1 root root 31 Oct 8 08:06 mysql_ -> /usr/share/munin/plugins/mysql_ [root@opensourceecology plugins]# ls -lah commands lrwxrwxrwx 1 root root 31 Oct 8 08:06 commands -> /usr/share/munin/plugins/mysql_ [root@opensourceecology plugins]#
- according to this guide, the munin mysql user doesn't need any GRANTs to any databases; merely existing is sufficient http://www.mbrando.com/2007/08/06/how-to-get-your-mysql-munin-graphs-working/
create user munin@localhost identified by 'CHANGEME'; flush privileges;
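- a quick sanity check that the new user can read the status counters munin needs (a sketch; SHOW GLOBAL STATUS requires no special grants, and I'm using the name from the create statement above, though the log below shows 'munin_user' as well):
mysql -u munin -p -e 'SHOW GLOBAL STATUS LIKE "Questions";'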
- and I added this stanza to /etc/munin/plugin-conf.d/zzz-ose
[mysql*] user root group wheel env.mysqlopts -u munin_user -pOBFUSCATED
- test worked
[root@opensourceecology plugins]# munin-run --debug mysql_queries # Processing plugin configuration from /etc/munin/plugin-conf.d/amavis # Processing plugin configuration from /etc/munin/plugin-conf.d/df # Processing plugin configuration from /etc/munin/plugin-conf.d/fw_ # Processing plugin configuration from /etc/munin/plugin-conf.d/hddtemp_smartctl # Processing plugin configuration from /etc/munin/plugin-conf.d/munin-node # Processing plugin configuration from /etc/munin/plugin-conf.d/postfix # Processing plugin configuration from /etc/munin/plugin-conf.d/postgres # Processing plugin configuration from /etc/munin/plugin-conf.d/sendmail # Processing plugin configuration from /etc/munin/plugin-conf.d/zzz-ose # Setting /rgid/ruid/ to /99/0/ # Setting /egid/euid/ to /99 99 10/0/ # Setting up environment # Environment mysqlopts = -u munin_user -pqd2qQiFdeNGepvhv5dsQx4rVt7pRyFJ # About to run '/etc/munin/plugins/mysql_queries' delete.value 837242 insert.value 896145 replace.value 1197242 select.value 148647861 update.value 1721521 cache_hits.value 0 [root@opensourceecology plugins]#
- now for nginx, I confirmed that we do have the ability to spit out the status page
[root@opensourceecology plugins]# nginx -V 2>&1 | grep -o with-http_stub_status_module with-http_stub_status_module [root@opensourceecology plugins]#
- I tried adding a block for '/nginx_status' only accessible to '127.0.0.1', but I still got 403'd when attempting to access it via curl on the local machine
- the access logs showed it being accessed from an ipv6 address
2a01:4f8:172:209e::2 - - [08/Oct/2019:08:37:49 +0000] "GET /nginx_status HTTP/1.1" 403 162 "-" "curl/7.29.0" "-"
- I guess the request has to go out over eth0 because the server name resolves to the external IP that nginx is bound to (it's not bound to 127.0.0.1)
- I used the following block
# stats for munin location /nginx_status { stub_status on; access_log off; allow 127.0.0.1/32; allow 138.201.84.223/32; allow 138.201.84.243/32; allow ::1/128; allow 2a01:4f8:172:209e::2/128; allow fe80::921b:eff:fe94:7c4/128; deny all; }
- and it worked!
[root@opensourceecology conf.d]# nginx -t nginx: the configuration file /etc/nginx/nginx.conf syntax is ok nginx: configuration file /etc/nginx/nginx.conf test is successful [root@opensourceecology conf.d]# service nginx reload Redirecting to /bin/systemctl reload nginx.service [root@opensourceecology conf.d]# curl https://www.opensourceecology.org/nginx_status Active connections: 1 server accepts handled requests 16063989 16063989 27383851 Reading: 0 Writing: 1 Waiting: 0 [root@opensourceecology conf.d]#
- I found that my munin nginx plugin wouldn't work unless I installed the 'perl-LWP-Protocol-https' package
[root@opensourceecology plugins]# yum install perl-LWP-Protocol-https ... Installed: perl-LWP-Protocol-https.noarch 0:6.04-4.el7 Dependency Installed: perl-Mozilla-CA.noarch 0:20130114-5.el7 Complete! [root@opensourceecology plugins]#
- I added nginx configs for both the wiki & osemain. If all is well, I'll add the configs for our other vhosts
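- for reference, the per-vhost wiring presumably looks something like this (a sketch; the exact stanza name & conf file are assumptions on my part):
# one pair of symlinks per vhost, named so munin treats them as distinct plugins
ln -s /usr/share/munin/plugins/nginx_status /etc/munin/plugins/nginx_www.opensourceecology.org_status
ln -s /usr/share/munin/plugins/nginx_request /etc/munin/plugins/nginx_www.opensourceecology.org_request
# then point each at its vhost's stub_status URL, e.g. in /etc/munin/plugin-conf.d/zzz-ose
[nginx_www.opensourceecology.org_*]
env.url https://www.opensourceecology.org/nginx_status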
- I didn't bother with apache for now (also, the acl will be confusing since it sees all traffic coming from 127.0.0.1 via varnish)
- meanwhile, some of the mysql graphs are populating. good!
- and meanwhile, the rsync is still going; it's currently at "var/lib/mysql", copying our mysql databases' data. cool.
- ...
- after a few hours, I checked-up on rsync; it was stuck again
var/www/html/wiki.opensourceecology.org/htdocs/images/archive/5/5f/20170722193549!CEBPressJuneGroup.fcstd 4840012 100% 2.56MB/s 0:00:01 (xfer#344966, to-check=1043/396314) var/www/html/wiki.opensourceecology.org/htdocs/images/archive/5/5f/20170722195024!CEBPressJuneGroup.fcstd 950272 19% 879.62kB/s 0:00:04
- the vpn client appears to have disconnected, and I can't ping the staging host at all from prod
[maltfield@opensourceecology ~]$ ping 10.241.189.11 PING 10.241.189.11 (10.241.189.11) 56(84) bytes of data. ^C --- 10.241.189.11 ping statistics --- 59 packets transmitted, 0 received, 100% packet loss, time 57999ms [maltfield@opensourceecology ~]$
- I manually exited-out of the openvpn connection & reinitiated it; pings now work. After about 60 seconds, the rsync started outputting again..
- when I went to check the size of the lxc container, I was told <1G, which can't be right
[root@osedev1 lxc]# du -sh /var/lib/lxc/osestaging1 604M /var/lib/lxc/osestaging1 [root@osedev1 lxc]#
- ncdu pointed me to the snap1 dir, which is currently 48G
[root@osedev1 lxc]# du -sh /var/lib/lxcsnaps/osestaging1/snap1 48G /var/lib/lxcsnaps/osestaging1/snap1 [root@osedev1 lxc]#
- apparently this is the consequence of restoring a snapshot just by doing an rsync; the snapshot's config file has a new line that identifies the rootfs path explicitly as the snapshot's rootfs
[root@osedev1 lxc]# tail /var/lib/lxc/osestaging1/config lxc.cap.drop = mac_admin lxc.cap.drop = mac_override lxc.cap.drop = setfcap lxc.cap.drop = sys_module lxc.cap.drop = sys_nice lxc.cap.drop = sys_pacct lxc.cap.drop = sys_rawio lxc.cap.drop = sys_time lxc.hook.clone = /usr/share/lxc/hooks/clonehostname lxc.rootfs = /var/lib/lxcsnaps/osestaging1/snap1/rootfs [root@osedev1 lxc]#
- perhaps that means the snap1 dir now actually holds my *real* (live) container data, not just a snapshot
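- if that's true, the cleanup (a sketch I haven't run yet) would be to copy the snapshot's rootfs back into the container's own dir while it's stopped, and point lxc.rootfs back at it:
# hypothetical cleanup, run on osedev1 with the container stopped
rsync -a --delete /var/lib/lxcsnaps/osestaging1/snap1/rootfs/ /var/lib/lxc/osestaging1/rootfs/
sed -i 's|^lxc.rootfs = .*|lxc.rootfs = /var/lib/lxc/osestaging1/rootfs|' /var/lib/lxc/osestaging1/config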
- while rsync continued, I noted that my nginx graphs are appearing, but there's no label differentiating the wiki's graphs from osemain's
- I can see a list of variables defined by my plugin by default with the `munin-run <plugin> config` command https://munin.opensourceecology.org:4443/nginx-day.html
[root@opensourceecology plugins]# munin-run nginx_www.opensourceecology.org_status config graph_title NGINX status graph_args --base 1000 graph_category nginx graph_vlabel Connections total.label Active connections total.info Active connections total.draw LINE2 reading.label Reading reading.info Reading reading.draw LINE2 writing.label Writing writing.info Writing writing.draw LINE2 waiting.label Waiting waiting.info Waiting waiting.draw LINE2 [root@opensourceecology plugins]#
- so it looks like I can set this as 'graph_title' or 'graph_info'
- I restarted munin-node and triggered the munin-cron to update the html pages
[root@opensourceecology plugins]# service munin-node restart Redirecting to /bin/systemctl restart munin-node.service [root@opensourceecology plugins]# [root@opensourceecology plugins]# sudo -u munin /usr/bin/munin-cron
- the new variables didn't affect anything, so I started grepping the logs
- unrelated, the logs complained about mysql auth failures for the plugins below (my guess: these suggest-generated symlinks aren't named with a 'mysql' prefix, so the '[mysql*]' stanza never applies env.mysqlopts to them):
- network_traffic
- select_types
- innodb_tnx
- innodb_log
- sorts
- myisam_indexes
- qcache_mem
- innodb_io
- connections
- qcache
- innodb_insert_buf
- replication
- bin_relay_log
- mysql_queries
- innodb_rows
- innodb_bpool_act
- files_table
- commands
- innodb_bpool
- tmp_tables
- innodb_semaphores
- innodb_io_pend
- table_locks
- slow
- but there was nothing related to nginx
- I tried overriding the graph_title in the plugins, but it didn't work
- I found the datafile for munin in /var/lib/munin/datafile. This is clearly where the graph title is defined before being generated into html files
[root@opensourceecology plugins]# grep nginx /var/lib/munin/datafile | grep -i graph_title localhost;localhost:nginx_wiki_opensourceecology_org_request.graph_title Nginx requests localhost;localhost:nginx_wiki_opensourceecology_org_status.graph_title NGINX status localhost;localhost:nginx_www_opensourceecology_org_status.graph_title NGINX status localhost;localhost:nginx_www_opensourceecology_org_request.graph_title Nginx requests [root@opensourceecology plugins]#
- I found that I *could* override the title in /etc/munin/munin.conf https://www.aroundmyroom.com/2015/01/10/munin-help-needed/
[localhost] address 127.0.0.1 use_node_name yes nginx_www_opensourceecology_org_status.graph_title Nginx Status (www.opensourceecology.org) nginx_wiki_opensourceecology_org_status.graph_title Nginx Status (wiki.opensourceecology.org)
- ...
- meanwhile, the rsync finished!
[maltfield@opensourceecology ~]$ [ "`hostname`" = "opensourceecology.org" ] && time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --rsync-path="sudo rsync" --exclude=/root --exclude=/run --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/ ... var/www/html/www.opensourceecology.org/htdocs/wp-includes/widgets/class-wp-widget-text.php 20735 100% 21.05kB/s 0:00:00 (xfer#450852, to-check=0/517755) var/yp/ sent 59229738371 bytes received 11198208 bytes 2959309.47 bytes/sec total size is 77965794338 speedup is 1.32 rsync warning: some files vanished before they could be transferred (code 24) at main.c(1052) [sender=3.0.9] real 333m37.655s user 19m50.292s sys 6m0.997s [maltfield@opensourceecology ~]$
- but I still can't ssh into it; again, my home dir is owned by the wrong user
[root@osestaging1 ~]# ls -lah /home/maltfield/.ssh total 16K drwxr-xr-x. 2 tgriffing tgriffing 4.0K Jan 19 2018 . drwx------. 10 tgriffing tgriffing 4.0K Oct 3 07:06 .. -rw-r--r--. 1 root root 750 Jun 20 2017 authorized_keys -rw-r--r--. 1 tgriffing tgriffing 1.1K Oct 3 13:44 known_hosts [root@osestaging1 ~]#
- maybe I should add the '--numeric-ids' option, since by default rsync maps uids & gids between the two systems by user/group name rather than preserving the numeric ids?
[ "`hostname`" = "opensourceecology.org" ] && time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --numeric-ids --rsync-path="sudo rsync" --exclude=/root --exclude=/run --exclude=/home/b2user/sync --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/
- I found that the 'sync.old' dir was still getting synced, so I updated the command to add a wildcard to the exclude; it worked
[ "`hostname`" = "opensourceecology.org" ] && time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --numeric-ids --rsync-path="sudo rsync" --exclude=/root --exclude=/run --exclude=/home/b2user/sync* --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/
- this time the double-tap took only 3 minutes wall time
[maltfield@opensourceecology ~]$ [ "`hostname`" = "opensourceecology.org" ] && time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --numeric-ids --rsync-path="sudo rsync" --exclude=/root --exclude=/run --exclude=/home/b2user/sync* --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/ ... var/www/html/munin/static/zoom.js 4760 100% 1.13MB/s 0:00:00 (xfer#2239, to-check=1002/321739) sent 224884435 bytes received 1668273 bytes 1352553.48 bytes/sec total size is 41283867704 speedup is 182.23 real 2m46.967s user 0m32.382s sys 0m8.095s [maltfield@opensourceecology ~]$
- this time the permissions of my home dir didn't break, and I was able to ssh-in.
- I'd like to take a snapshot of the staging server, but at this point we don't have space for it
[root@osedev1 lxc]# df -h Filesystem Size Used Avail Use% Mounted on /dev/sda1 19G 3.4G 15G 19% / devtmpfs 873M 0 873M 0% /dev tmpfs 896M 0 896M 0% /dev/shm tmpfs 896M 17M 879M 2% /run tmpfs 896M 0 896M 0% /sys/fs/cgroup /dev/mapper/ose_dev_volume_1 125G 94G 25G 80% /mnt/ose_dev_volume_1 tmpfs 180M 0 180M 0% /run/user/1000 [root@osedev1 lxc]#
- ok, now, drum roll: did we break the staging server? let's try to shut it down & start it again.
- aaaaand: IT CAME BACK UP! Now it said its hostname isn't 'osestaging1' but 'opensourceecology'. Coolz.
- I was successfully able to ssh into it, but then it froze. And my attempts to login to the lxc-console all end in timeouts
opensourceecology login: maltfield Password: login: timed out after 60 seconds CentOS Linux 7 (Core) Kernel 3.10.0-957.21.3.el7.x86_64 on an x86_64 opensourceecology login:
- if I attempt to login as root, then it just times-out before it even asks me for a password
opensourceecology login: root login: timed out after 60 seconds CentOS Linux 7 (Core) Kernel 3.10.0-957.21.3.el7.x86_64 on an x86_64 opensourceecology login:
- ssh auth succeeds, but it also fails before I get a shell
... debug1: Authentication succeeded (publickey). Authenticated to 10.241.189.11 ([10.241.189.11]:32415). debug1: channel 0: new [client-session] debug3: ssh_session2_open: channel_new: 0 debug2: channel 0: send open debug3: send packet: type 90 debug1: Requesting no-more-sessions@openssh.com debug3: send packet: type 80 debug1: Entering interactive session. debug1: pledge: network
- I stopped the container again. This time when I tried to start it, I got an error (note the stray leading hyphen in '-osestaging1' below; lxc-start presumably treated that as the container name and thus found no config)
[root@osedev1 ~]# lxc-start -n -osestaging1 lxc-start: lxc_start.c: main: 290 Executing '/sbin/init' with no configuration file may crash the host [root@osedev1 ~]#
- I moved some dirs around so that I'm no longer using the 'rootfs' dir from the snaps dir, but now I get this damn message. duckduck searches are dead-ends
[root@osedev1 lxc]# lxc-start -n osestaging1 lxc-start: sync.c: __sync_wake: 74 sync wake failure : Broken pipe lxc-start: start.c: __lxc_start: 1149 failed to spawn 'osestaging1' [root@osedev1 lxc]# lxc-start -P /var/lib/lxc/ -n osestaging1 lxc-start: sync.c: __sync_wake: 74 sync wake failure : Broken pipe lxc-start: start.c: __lxc_start: 1149 failed to spawn 'osestaging1' [root@osedev1 lxc]#
- I tried rebooting the dev server. after it came up, I still got the same error when attempting to `lxc-start`
- I found I could get debug logs by adding `-l log -o <file>` https://github.com/lxc/lxc/issues/1555
[root@osedev1 ~]# lxc-start -n osestaging1 -l debug -o lxc-start.log lxc-start: sync.c: __sync_wake: 74 sync wake failure : Broken pipe lxc-start: start.c: __lxc_start: 1149 failed to spawn 'osestaging1' [root@osedev1 ~]# cat lxc-start.log ...
- all the god damn google results on this "sync wake failure" shit (which are already few) are regarding configs of multiple containers sharing a network. I'll destroy the whole network namespace if needed. but how? why does nobody else encounter this damn issue?
- well, I found the source code. could be an issue with an open file descriptor or something? https://fossies.org/linux/lxc/src/lxc/sync.c
- my best guess is that it's an issue with the 'rootfs.dev' symlink
[root@osedev1 lxc]# ls -lah osestaging1 total 28K drwxrwx---. 5 root root 4.0K Oct 8 16:17 . drwxr-xr-x. 6 root root 4.0K Oct 8 16:05 .. -rw-r--r--. 1 root root 1.1K Oct 8 15:46 config drwxr-xr-x. 3 root root 4.0K Oct 8 15:46 dev drwxr-xr-x. 2 root root 4.0K Oct 8 15:52 osestaging1 dr-xr-xr-x. 20 root root 4.0K Oct 8 15:21 rootfs lrwxrwxrwx. 1 root root 38 Oct 8 16:17 rootfs.dev -> /dev/.lxc/osestaging1.72930b02843095eb -rw-r--r--. 1 root root 19 Oct 3 15:40 ts [root@osedev1 lxc]#
- I commented-out every fucking line in the config file that had the word 'dev' in it...and the system started! Except that, umm, I couldn't connect to its console?
[root@osedev1 lxc]# lxc-start -n osestaging1 -f osestaging1/config -l trace -o lxc-start.log Failed to create unit file /run/systemd/generator.late/netconsole.service: File exists Failed to create unit file /run/systemd/generator.late/network.service: File exists Running in a container, ignoring fstab device entry for /dev/disk/by-uuid/1e457b76-5100-4b53-bcdc-667ca122b941. Running in a container, ignoring fstab device entry for /dev/mapper/ose_dev_volume_1. Failed to create unit file /run/systemd/generator/systemd-cryptsetup@ose_dev_volume_1.service: File exists lxc-start: console.c: lxc_console_peer_proxy_alloc: 315 console not set up
- I found that if I commented-out the first line and added-back a rootfs line, I could get it to boot again, but I couldn't login from the console (same 60 second timeout) or ssh in (or ping it)
#lxc.mount.entry = /dev/net dev/net none bind,create=dir ... lxc.rootfs = /var/lib/lxc/osestaging1/rootfs
- I uncommented the first line, and it still started! looks like the issue was that I didn't explicitly define a rootfs..
- this time I could ping the server from my laptop over the vpn
- I was able to login as 'maltfield' from the console, but it locked-up when I tried to `sudo su -`
- on the next reboot, I tailed all the files in /var/log from the osedev1 server (inside the staging container's rootfs dir); I saw some interesting results
==> osestaging1/rootfs/var/log/messages <== Oct 8 14:50:00 opensourceecology NET[248]: /usr/sbin/dhclient-script : updated /etc/resolv.conf Oct 8 14:50:00 opensourceecology dhclient[201]: bound to 192.168.122.201 -- renewal in 1588 seconds. Oct 8 14:50:00 opensourceecology network: Determining IP information for eth0... done. Oct 8 14:50:00 opensourceecology network: [ OK ] Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/kernel/yama/ptrace_scope': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '16' to '/proc/sys/kernel/sysrq': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '1' to '/proc/sys/kernel/core_uses_pid': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '1' to '/proc/sys/net/ipv4/conf/default/rp_filter': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '1' to '/proc/sys/net/ipv4/conf/all/rp_filter': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv4/conf/default/accept_source_route': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv4/conf/all/accept_source_route': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '1' to '/proc/sys/net/ipv4/conf/default/promote_secondaries': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '1' to '/proc/sys/net/ipv4/conf/all/promote_secondaries': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '1' to '/proc/sys/fs/protected_hardlinks': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '1' to '/proc/sys/fs/protected_symlinks': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '1' to '/proc/sys/net/ipv4/icmp_echo_ignore_broadcasts': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/default/autoconf': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/default/accept_dad': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/default/accept_ra': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/default/accept_ra_defrtr': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/default/accept_ra_rtr_pref': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/default/accept_ra_pinfo': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/default/accept_source_route': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/default/accept_redirects': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/default/forwarding': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/all/autoconf': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/all/accept_dad': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/all/accept_ra': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/all/accept_ra_defrtr': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/all/accept_ra_rtr_pref': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/all/accept_ra_pinfo': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/all/accept_source_route': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/all/accept_redirects': Read-only file system Oct 8 14:50:00 opensourceecology systemd-sysctl: Failed to write '0' to '/proc/sys/net/ipv6/conf/all/forwarding': Read-only file system Oct 8 14:50:01 opensourceecology systemd: Started LSB: Bring up/down networking.
- and issues with /run
Oct 8 14:50:05 opensourceecology systemd-logind: Failed to remove runtime directory /run/user/0: Device or resource busy
Mon Oct 07, 2019
- I added a comment to our long-standing feature request with the Libre Office Online CODE project for the ability to draw lines & arrows in their online version of "present" https://bugs.documentfoundation.org/show_bug.cgi?id=113386#c4
- wiki updates & logging
- I tried to login to my hetzner cloud account, but I got "Account is disabled". fucking hell, so much for user-specific auditing. I logged-in with our shared account..
- I confirmed that our osedev1 node has a 20G disk + 10G volume.
- we're currently using 3.4G of 19G on osedev1; I never set up the 10G volume, which appears to be at /mnt/HC_Volume_3110278. It has 10G avail
[maltfield@osedev1 ~]$ df -h Filesystem Size Used Avail Use% Mounted on /dev/sda1 19G 3.4G 15G 19% / devtmpfs 873M 0 873M 0% /dev tmpfs 896M 0 896M 0% /dev/shm tmpfs 896M 25M 871M 3% /run tmpfs 896M 0 896M 0% /sys/fs/cgroup /dev/sdb 9.8G 37M 9.3G 1% /mnt/HC_Volume_3110278 tmpfs 180M 0 180M 0% /run/user/1000 [maltfield@osedev1 ~]$ ls -lah /mnt/HC_Volume_3110278/ total 24K drwxr-xr-x. 3 root root 4.0K Aug 20 11:50 . drwxr-xr-x. 3 root root 4.0K Aug 20 12:16 .. drwx------. 2 root root 16K Aug 20 11:50 lost+found [maltfield@osedev1 ~]$
- the RAID1'd disk on prod is 197G with 75G used
[maltfield@opensourceecology ~]$ df -h Filesystem Size Used Avail Use% Mounted on /dev/md2 197G 75G 113G 40% / devtmpfs 32G 0 32G 0% /dev tmpfs 32G 8.0K 32G 1% /dev/shm tmpfs 32G 2.6G 29G 9% /run tmpfs 32G 0 32G 0% /sys/fs/cgroup /dev/md1 488M 289M 174M 63% /boot tmpfs 6.3G 0 6.3G 0% /run/user/0 tmpfs 6.3G 0 6.3G 0% /run/user/1005 [maltfield@opensourceecology ~]$
- a quick duckduck pulled up this guide for using luks to create an encrypted volume out of hetzner block volumes; this is a good idea https://angristan.xyz/how-to-use-encrypted-block-storage-volumes-hetzner-cloud/
- the guide shows a method for resizing the encrypted volume. I didn't think that would be trivial, but it appears that resize2fs can increase the size of a luks-encrypted volume without issue; this is good to know. if we run out of space (or maybe we create a second staging node or ad-hoc dev nodes), we should be able to shut down all our lxc containers, unmount the block volume, resize it, and remount it. That said, I don't think we'll be making backups of these (dev/staging) containers, so if we fuck up, it would be bad.
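- for future reference, the grow sequence should be roughly this (a sketch based on that guide; assumes the 'ose_dev_volume_1' mapping name used below):
# after growing the volume in the hetzner wui:
cryptsetup resize ose_dev_volume_1 # grow the LUKS mapping to fill the enlarged device
resize2fs /dev/mapper/ose_dev_volume_1 # grow ext4 to fill the mapping (growing works online)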
- our 10G hetzner cloud block volume has been costing 0.48 EUR/mo = 5.76 EUR/yr
- the min needed for our current prod server is 75G. The slider on the product page has weird increments, but the actual "resize volume" option in the cloud console wui permits resizing in 1G increments. A 75G volume would cost 3.00 EUR/mo = 36 EUR/yr
- A much more sane choice would be equal to the disk on prod = 197G = 7.88 EUR/mo = 94.56 EUR/yr
- fuck, I asked Marcin for $100/yr. Currently we're spending 2.49/mo on the osedev1 instance alone. That's 29.88 EUR/yr = 32.81 USD/yr. For a 100 USD/yr budget, that leaves 67.19 USD for disk space = 61.19 EUR/yr. That's 5.09 EUR/mo, which will buy us a 127G volume at 5.08 EUR/mo.
- 127/197 = 0.64. Therefore, a 127G block volume will allow the lxc staging node to replicate our prod node until prod grows beyond 64% capacity. 70% is a good general high-water-mark at which we'd need to look at migrating prod anyway. This (127G) seems like a reasonable low-budget solution that meets the 100 USD/yr line.
- I resized our 10G 'ose-dev-volume-1' volume to 127G in the hetzner WUI.
- I clicked the 'enable protection' option, which prevents it from being deleted until the protection is manually removed
- the 'show configuration' window in the wui tells us that the volume is '/dev/disk/by-id/scsi-0HC_Volume_3110278' on osedev1
- the box itself looks like it's really /dev/sdb
[maltfield@osedev1 ~]$ mount sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel) proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) devtmpfs on /dev type devtmpfs (rw,nosuid,seclabel,size=893568k,nr_inodes=223392,mode=755) securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,seclabel) devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000) tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,mode=755) tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,seclabel,mode=755) cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd) pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,net_prio,net_cls) cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,perf_event) cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,memory) cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,pids) cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,cpuset,clone_children) cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,cpuacct,cpu) cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,blkio) cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,freezer) cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,hugetlb) cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,devices) configfs on /sys/kernel/config type configfs (rw,relatime) /dev/sda1 on / type ext4 (rw,relatime,seclabel,data=ordered) selinuxfs on /sys/fs/selinux type selinuxfs (rw,relatime) debugfs on /sys/kernel/debug type debugfs (rw,relatime) systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=27,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=11033) hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel) mqueue on /dev/mqueue type mqueue (rw,relatime,seclabel) /dev/sdb on /mnt/HC_Volume_3110278 type ext4 (rw,relatime,seclabel,discard,data=ordered) tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,seclabel,size=183308k,mode=700,uid=1000,gid=1000) [maltfield@osedev1 ~]$
- but the other name appears in fstab
[root@osedev1 ~]# cat /etc/fstab # # /etc/fstab # Created by anaconda on Sun Jul 14 04:14:25 2019 # # Accessible filesystems, by reference, are maintained under '/dev/disk' # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info # UUID=1e457b76-5100-4b53-bcdc-667ca122b941 / ext4 defaults 1 1 /dev/disk/by-id/scsi-0HC_Volume_3110278 /mnt/HC_Volume_3110278 ext4 discard,nofail,defaults 0 0 [root@osedev1 ~]#
- ah, indeed, the above disk is just a symlink back to /dev/sdb (the by-id name is stable across reboots, unlike /dev/sdb, which is why fstab references it)
[root@osedev1 ~]# ls -lah /dev/disk/by-id/scsi-0HC_Volume_3110278 lrwxrwxrwx. 1 root root 9 Oct 7 10:31 /dev/disk/by-id/scsi-0HC_Volume_3110278 -> ../../sdb [root@osedev1 ~]#
- before I rebuild this volume, the cryptsetup command raises the question: where do I store the key?
- assuming I want the server to be able to restart by itself without user interaction, the key should probably be stored in a file somewhere on '/root' on 'osedev1' but while my OS would lock-down the permissions to that file, the key file itself would likely be stored unencrypted on some hetzner drive somewhere. Is it worth encrypting the contents of the block volume when the encryption key itself might be stored unencrypted somewhere at hetzner's datacenter?
- as a test, I ran `testdisk` to see if I could find, in the 10G volume that hetzner gave us, any deleted files from previous customers; I couldn't.
- someone asked about this, but there wasn't much great discussion on how hetzner provisions their disks https://serverfault.com/questions/950790/cloud-server-vulnerability-analysis?noredirect=1
- so risk assessment: when working in a cloud, we have to accept the integrity of the cloud provider. If a rogue hetzner employee wants to steal all our data, they can. There's absolutely nothing we can do about that other than building the servers ourselves and physically locking them down. The decision to use hetzner predates me, but I agree with it. It does not make sense for OSE to buy a server rack and host our equipment at FeF. So, I accept the risk and trust that hetzner not do something malicious that will put our data at risk
- the real concern here is that we resize our volume (or hetzner in the background shuffles some abstracted blocks around physical devices that's black-boxed to us), and a different customer suddently gets, for example, our user's PII in their new volume. Or a malicious hetzner cloud user triggers some shuffling and is successfully able to exfiltrate our data from their cloud without breaking into our server. This is the risk that we're trying to prevent. In this case, I think it *is* worthwhile to encrypt our block volume. The chances that someone is able to get chunks of our data from an old 127G block volume that lacked encryption is significantly higher than them able to get those *and* the key from our server *and* be able to use the key to extract meaninful data from the likely non-contiguious bits that may be extracted from our recycled block volume data.
- hetzner does not have a clean record, but hardly anybody does. This is only customer data, though. Not the their customer's server contents data https://mybroadband.co.za/news/cloud-hosting/279181-hetzner-client-data-exposed-after-attack.html
- so, while recognizing that it has limitations, I also recognize that there are sufficient benefits to justfy encrypting this block volume with a key stored unencrypted on our cloud instance
- meanwhile, I found a guide for how to migrate the contents of /var to a block volume. It suggested doing so from a resuce disk, then editing fstab for the next reboot https://serverfault.com/questions/947732/how-to-add-hetzner-cloud-disk-volume-to-extend-var-partition
- I created a new key file on my laptop, stored it in our shared keepass, and uploaded it to the server at /root/keys/ose-dev-volume-1.201910.key
- let's shutdown osedev1 and migrate its /var/ to a block volume. First I'll shutdown the osestagng1 staging lxc container then the host osedev1
[root@osedev1 ~]# lxc-stop -n osestaging1 [root@osedev1 ~]# shutdown -h now Connection to 195.201.233.113 closed by remote host. Connection to 195.201.233.113 closed. user@ose:~$
- I confirmed that the server was off in the hetzner cloud console wui
- I clicked on the server. I'm not clear if I should mount a rescue disk or click the "rescue" option. No idea what the latter is, so I navigated to "ISO IMAGES", found SystemRescueCD, and clicked the "MOUNT" button next to it. I went back to the "servers"# I added a comment to our long-standing feature request with the Libre Office Online CODE project for the ability to draw lines & arrows in their online version of "present" https://bugs.documentfoundation.org/show_bug.cgi?id=113386#c4
- wiki updates & logging
- I tried to log in to my hetzner cloud account, but I got "Account is disabled". Fucking hell; so much for user-specific auditing. I logged in with our shared account instead.
- I confirmed that our osedev1 node has a 20G disk + 10G volume.
- we're currently using 3.4G of 19G on osedev1; I never set up the 10G volume, which appears to be mounted at /mnt/HC_Volume_3110278. It has 10G avail
[maltfield@osedev1 ~]$ df -h Filesystem Size Used Avail Use% Mounted on /dev/sda1 19G 3.4G 15G 19% / devtmpfs 873M 0 873M 0% /dev tmpfs 896M 0 896M 0% /dev/shm tmpfs 896M 25M 871M 3% /run tmpfs 896M 0 896M 0% /sys/fs/cgroup /dev/sdb 9.8G 37M 9.3G 1% /mnt/HC_Volume_3110278 tmpfs 180M 0 180M 0% /run/user/1000 [maltfield@osedev1 ~]$ ls -lah /mnt/HC_Volume_3110278/ total 24K drwxr-xr-x. 3 root root 4.0K Aug 20 11:50 . drwxr-xr-x. 3 root root 4.0K Aug 20 12:16 .. drwx------. 2 root root 16K Aug 20 11:50 lost+found [maltfield@osedev1 ~]$
- the RAID1'd disk on prod is 197G with 75G used
[maltfield@opensourceecology ~]$ df -h Filesystem Size Used Avail Use% Mounted on /dev/md2 197G 75G 113G 40% / devtmpfs 32G 0 32G 0% /dev tmpfs 32G 8.0K 32G 1% /dev/shm tmpfs 32G 2.6G 29G 9% /run tmpfs 32G 0 32G 0% /sys/fs/cgroup /dev/md1 488M 289M 174M 63% /boot tmpfs 6.3G 0 6.3G 0% /run/user/0 tmpfs 6.3G 0 6.3G 0% /run/user/1005 [maltfield@opensourceecology ~]$
- a quick duckduck pulled up this guide for using luks to create an encrypted volume out of hetzner block volumes; this is a good idea https://angristan.xyz/how-to-use-encrypted-block-storage-volumes-hetzner-cloud/
- the guide shows a method for resizing the encrypted volume. I didn't think that would be trivial, but it appears that resize2fs can increase the size of a luks-encrypted volume without issue. This is good to know: if we run out of space (or if we create a second staging node or ad-hoc dev nodes), we should be able to shut down all our lxc containers, unmount the block drive, resize it, and remount it; a sketch follows below. That said, I don't think we'll be making backups of these (dev/staging) containers, so if we fuck up it would be bad.
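- a minimal sketch of that grow procedure for future reference, assuming the volume has already been enlarged in the hetzner wui and is opened as /dev/mapper/ose_dev_volume_1 (the mapper name we end up using below):
# stop anything using the volume, then unmount it
umount /mnt/ose_dev_volume_1
# extend the dm-crypt mapping to the new size of the underlying device
cryptsetup resize ose_dev_volume_1
# fsck is required before an offline grow; then grow ext4 to fill the mapping
e2fsck -f /dev/mapper/ose_dev_volume_1
resize2fs /dev/mapper/ose_dev_volume_1
mount /dev/mapper/ose_dev_volume_1 /mnt/ose_dev_volume_1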
- our 10G hetzner cloud block volume has been costing 0.48 EUR/mo = 5.76 EUR/yr
- the minimum needed to hold our current prod server is 75G. The slider on the product page has weird increments, but the actual "resize volume" option in the cloud console wui permits resizing in 1G increments. A 75G volume would cost 3.00 EUR/mo = 36 EUR/yr
- a much saner choice would be a volume equal to the disk on prod = 197G = 7.88 EUR/mo = 94.56 EUR/yr
- fuck, I asked Marcin for $100/yr. Currently we're spending 2.49 EUR/mo on the osedev1 instance alone. That's 29.88 EUR/yr = 32.81 USD/yr. For a 100 USD/yr budget, that leaves 67.19 USD for disk space = 61.19 EUR/yr. That's 5.09 EUR/mo, which will buy us a 127G volume at 5.08 EUR/mo.
- 127/197 = 0.64. Therefore, a 127G block volume will allow an lxc staging node to replicate our prod node until prod grows beyond 64% capacity. 70% is a good general high-water-mark at which we'd need to look at migrating prod anyway, so this (127G) seems like a reasonable low-budget solution that fits the 100 USD/yr line.
- I resized our 10G 'ose-dev-volume-1' volume to 127G in the hetzner WUI.
- I clicked the 'enable protection' option, which prevents it from being deleted until the protection is manually removed
- the 'show configuration' window in the wui tells us that the volume is '/dev/disk/by-id/scsi-0HC_Volume_3110278' on osedev1
- the box itself looks like it's really /dev/sdb
[maltfield@osedev1 ~]$ mount sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel) proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) devtmpfs on /dev type devtmpfs (rw,nosuid,seclabel,size=893568k,nr_inodes=223392,mode=755) securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,seclabel) devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000) tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,mode=755) tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,seclabel,mode=755) cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd) pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,net_prio,net_cls) cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,perf_event) cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,memory) cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,pids) cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,cpuset,clone_children) cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,cpuacct,cpu) cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,blkio) cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,freezer) cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,hugetlb) cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,devices) configfs on /sys/kernel/config type configfs (rw,relatime) /dev/sda1 on / type ext4 (rw,relatime,seclabel,data=ordered) selinuxfs on /sys/fs/selinux type selinuxfs (rw,relatime) debugfs on /sys/kernel/debug type debugfs (rw,relatime) systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=27,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=11033) hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel) mqueue on /dev/mqueue type mqueue (rw,relatime,seclabel) /dev/sdb on /mnt/HC_Volume_3110278 type ext4 (rw,relatime,seclabel,discard,data=ordered) tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,seclabel,size=183308k,mode=700,uid=1000,gid=1000) [maltfield@osedev1 ~]$
- but the other name appears in fstab
[root@osedev1 ~]# cat /etc/fstab # # /etc/fstab # Created by anaconda on Sun Jul 14 04:14:25 2019 # # Accessible filesystems, by reference, are maintained under '/dev/disk' # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info # UUID=1e457b76-5100-4b53-bcdc-667ca122b941 / ext4 defaults 1 1 /dev/disk/by-id/scsi-0HC_Volume_3110278 /mnt/HC_Volume_3110278 ext4 discard,nofail,defaults 0 0 [root@osedev1 ~]#
- ah, indeed, the above disk is just a link back to /dev/sdb
[root@osedev1 ~]# ls -lah /dev/disk/by-id/scsi-0HC_Volume_3110278 lrwxrwxrwx. 1 root root 9 Oct 7 10:31 /dev/disk/by-id/scsi-0HC_Volume_3110278 -> ../../sdb [root@osedev1 ~]#
- before I rebuild this volume, the cryptsetup step raises the question: where do I store the key?
- assuming I want the server to be able to restart by itself without user interaction, the key should probably be stored in a file somewhere under '/root' on 'osedev1'. But while the OS would lock down the permissions on that file, the key file itself would likely sit unencrypted on some hetzner drive somewhere. Is it worth encrypting the contents of the block volume when the encryption key itself might be stored unencrypted somewhere in hetzner's datacenter?
- as a test, I ran `testdisk` to see if I could find any deleted files from previous customers in the 10G volume that hetzner gave us; I couldn't.
- someone asked about this, but there wasn't much great discussion on how hetzner provisions their disks https://serverfault.com/questions/950790/cloud-server-vulnerability-analysis?noredirect=1
- so, risk assessment: when working in a cloud, we have to accept the integrity of the cloud provider. If a rogue hetzner employee wants to steal all our data, they can. There's absolutely nothing we can do about that other than building the servers ourselves and physically locking them down. The decision to use hetzner predates me, but I agree with it: it does not make sense for OSE to buy a server rack and host our equipment at FeF. So I accept the risk and trust that hetzner will not do something malicious that would put our data at risk
- the real concern here is that we resize our volume (or hetzner shuffles some abstracted blocks around physical devices in the background, black-boxed to us), and a different customer suddenly gets, for example, our users' PII in their new volume. Or a malicious hetzner cloud user triggers some shuffling and successfully exfiltrates our data from the cloud without breaking into our server. This is the risk that we're trying to prevent, and in this case I think it *is* worthwhile to encrypt our block volume. The chances that someone is able to recover chunks of our data from an old, unencrypted 127G block volume are significantly higher than the chances of them recovering those chunks *and* getting the key from our server *and* being able to use that key to extract meaningful data from the likely non-contiguous bits recovered from our recycled block volume.
- hetzner does not have a clean record, but hardly anybody does. That incident exposed hetzner's own customer account data, though, not the contents of their customers' servers https://mybroadband.co.za/news/cloud-hosting/279181-hetzner-client-data-exposed-after-attack.html
- so, while recognizing that it has limitations, I also recognize that there are sufficient benefits to justify encrypting this block volume with a key stored unencrypted on our cloud instance
- meanwhile, I found a guide for how to migrate the contents of /var to a block volume. It suggested doing so from a rescue disk, then editing fstab for the next reboot https://serverfault.com/questions/947732/how-to-add-hetzner-cloud-disk-volume-to-extend-var-partition
- I created a new key file on my laptop, stored it in our shared keepass, and uploaded it to the server at /root/keys/ose-dev-volume-1.201910.key
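- the exact key-generation command isn't recorded here; it was presumably something like this (the 4096-byte key size is an assumption):
# generate random key material for the luks volume & lock down its permissions
dd if=/dev/urandom of=ose-dev-volume-1.201910.key bs=4096 count=1
chmod 0400 ose-dev-volume-1.201910.key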
- let's shut down osedev1 and migrate its /var/ to the block volume. First I'll shut down the osestaging1 staging lxc container, then the host osedev1
[root@osedev1 ~]# lxc-stop -n osestaging1 [root@osedev1 ~]# shutdown -h now Connection to 195.201.233.113 closed by remote host. Connection to 195.201.233.113 closed. user@ose:~$
- I confirmed that the server was off in the hetzner cloud console wui
- I clicked on the server. I'm not clear if I should mount a rescue disk or click the "rescue" option. No idea what the latter is, so I navigated to "ISO IMAGES", found SystemRescueCD, and clicked the "MOUNT" button next to it. I went back to the "servers" page, opened a console for 'osedev1', and clicked "Power on"
- the console showed the boot options for the rescue cd. I chose the first menu item = "SystemRescueCd: default boot options"
- I can't copy & paste from the console, but I basically found 5x items in /dev/disk/by-id/
- the DVD for systemrescue
- my 127G block volume with the same name shown above (scsi-0HC_Volume_3110278)
- scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-0
- scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-0-part1
- another DVD?
- so 3 & 4 must be our osedev1 disk. Both are 19.1G
- attempting to mount the one without '-part1' failed, but the one with '-part1' succeeded, and all my data was there. It was mounted to '/mnt/osedev1-part/'
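- since I couldn't copy from the console, the exact commands aren't recorded; they were presumably something like this (assuming the '/mnt/osedev1' mount point that the key-file paths below reference):
mkdir /mnt/osedev1
mount /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-0-part1 /mnt/osedev1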
- I formatted the new 127G ebs volume using cryptsetup
cryptsetup luksFormat /dev/disk/by-id/scsi-0HC_Volume_3110278 /mnt/osedev1/root/keys/ose-dev-volume-1.201910.key
- I opened the new encrypted luks volume and created its ext4 partition
cryptsetup luksOpen --key-file /mnt/osedev1/root/keys/ose-dev-volume-1.201910.key /dev/disk/by-id/scsi-0HC_Volume_3110278 ebs
mkfs.ext4 -j /dev/mapper/ebs
- I mounted the new FS & began a sync of osedev1's 'var' dir (now only 2.3G) to it
mkdir /mnt/ebs
mount /dev/mapper/ebs /mnt/ebs
rsync -av --progress /mnt/osedev1/var /mnt/ebs/
- I added entries for fstab & crypttab to auto-mount the volume to /mnt/ose_dev_volume_1/
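- the log doesn't capture the exact entries, but they were presumably along these lines (mapper name & key path as above; the 'nofail' option is an assumption so boot doesn't hang if the volume is missing):
# /etc/crypttab: <name> <device> <key file> <options>
ose_dev_volume_1 /dev/disk/by-id/scsi-0HC_Volume_3110278 /root/keys/ose-dev-volume-1.201910.key luks
# /etc/fstab
/dev/mapper/ose_dev_volume_1 /mnt/ose_dev_volume_1 ext4 defaults,nofail 0 0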
- I moved the existing /var/ dir to /var.old and made a symlink from /var/ to /mnt/ose_dev_volume_1/var
- I safely unmounted & closed all the disks and shut down
- I removed the systemrescue iso from the server and started it up again
- I was able to ssh in, and the new '/var/' dir *appeared* to be set up properly
[maltfield@osedev1 /]$ ls -lah /var lrwxrwxrwx. 1 root root 25 Oct 7 13:47 /var -> /mnt/ose_dev_volume_1/var [maltfield@osedev1 /]$ ls -lah /var/ total 80K drwxr-xr-x. 19 root root 4.0K Jul 14 06:18 . drwxr-xr-x. 4 root root 4.0K Oct 7 13:22 .. drwxr-xr-x. 2 root root 4.0K Apr 11 2018 adm drwxr-xr-x. 7 root root 4.0K Oct 2 14:24 cache drwxr-xr-x. 2 root root 4.0K Apr 24 16:03 crash drwxr-xr-x. 3 root root 4.0K Jul 14 06:15 db drwxr-xr-x. 3 root root 4.0K Jul 14 06:15 empty drwxr-xr-x. 2 root root 4.0K Apr 11 2018 games drwxr-xr-x. 2 root root 4.0K Apr 11 2018 gopher drwxr-xr-x. 3 root root 4.0K Jul 14 06:14 kerberos drwxr-xr-x. 34 root root 4.0K Oct 2 15:34 lib drwxr-xr-x. 2 root root 4.0K Apr 11 2018 local lrwxrwxrwx. 1 root root 11 Jul 14 06:14 lock -> ../run/lock drwxr-xr-x. 11 root root 4.0K Oct 7 13:49 log lrwxrwxrwx. 1 root root 10 Jul 14 06:14 mail -> spool/mail drwxr-xr-x. 2 root root 4.0K Apr 11 2018 nis drwxr-xr-x. 2 root root 4.0K Apr 11 2018 opt drwxr-xr-x. 2 root root 4.0K Apr 11 2018 preserve lrwxrwxrwx. 1 root root 6 Jul 14 06:14 run -> ../run drwxr-xr-x. 8 root root 4.0K Oct 3 08:06 spool drwxrwxrwt. 4 root root 4.0K Oct 7 13:49 tmp -rw-r--r--. 1 root root 163 Jul 14 06:14 .updated drwxr-xr-x. 2 root root 4.0K Apr 11 2018 yp [maltfield@osedev1 /]$
- but I immediately noticed that, for example, screen wasn't working
[maltfield@osedev1 /]$ screen -S ebs Cannot make directory '/var/run/screen': No such file or directory [maltfield@osedev1 /]$
- oh, damn, '/var/run' is a relative symlink to '../run', which won't work anymore: now that /var is itself a symlink into /mnt/ose_dev_volume_1, the relative target resolves to /mnt/ose_dev_volume_1/run, which doesn't exist
[maltfield@osedev1 /]$ ls -lah /var/run lrwxrwxrwx. 1 root root 6 Jul 14 06:14 /var/run -> ../run [maltfield@osedev1 /]$
- I made it an absolute symlink instead
[root@osedev1 var]# rm -rf lock [root@osedev1 var]# rm -rf run [root@osedev1 var]# ln -s /run [root@osedev1 var]# ln -s /run/lock [root@osedev1 var]# ls -lah run lrwxrwxrwx. 1 root root 4 Oct 7 13:54 run -> /run [root@osedev1 var]# ls -lah lock lrwxrwxrwx. 1 root root 9 Oct 7 13:54 lock -> /run/lock [root@osedev1 var]#
- screen still failed, but the symlinks looked ok; I gave the system a reboot
- when the system came back up, `screen` had no issues, and everything looked good.
[maltfield@osedev1 ~]$ screen -ls There is a screen on: 4362.ebs (Attached) 1 Socket in /var/run/screen/S-maltfield. [maltfield@osedev1 ~]$ sudo su - Last login: Mon Oct 7 13:54:28 CEST 2019 on pts/0 [root@osedev1 ~]# df -h Filesystem Size Used Avail Use% Mounted on /dev/sda1 19G 3.4G 15G 19% / devtmpfs 873M 0 873M 0% /dev tmpfs 896M 0 896M 0% /dev/shm tmpfs 896M 17M 879M 2% /run tmpfs 896M 0 896M 0% /sys/fs/cgroup /dev/mapper/ose_dev_volume_1 125G 2.5G 116G 3% /mnt/ose_dev_volume_1 tmpfs 180M 0 180M 0% /run/user/1000 [root@osedev1 ~]# ls -lah /var lrwxrwxrwx. 1 root root 25 Oct 7 13:47 /var -> /mnt/ose_dev_volume_1/var [root@osedev1 ~]# ls -lah /mnt/ose_dev_volume_1/ total 28K drwxr-xr-x. 4 root root 4.0K Oct 7 13:22 . drwxr-xr-x. 4 root root 4.0K Oct 7 13:46 .. drwx------. 2 root root 16K Oct 7 13:18 lost+found drwxr-xr-x. 19 root root 4.0K Oct 7 13:54 var [root@osedev1 ~]#
- I started the staging server, connected to the vpn from my laptop, and was successfully able to ssh into it (though there was a long delay)
- I ssh'd into prod and kicked-off the rsync!
time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --rsync-path="sudo rsync" --exclude=/root --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/
- that also copied the old backups, which is probably unnecessary. On the next run I should also exclude (amended command sketched below):
- home/b2user/sync
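- i.e. the next run would presumably be the same command as above with the one extra exclude added:
time sudo -E nice rsync -e 'ssh -p 32415' --bwlimit=3000 --rsync-path="sudo rsync" --exclude=/home/b2user/sync --exclude=/root --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/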
- this sync is going at a rate of about 1G every 5 minutes. I expect it'll be done in 5-10 hours. I'll check on it tomorrow.
Sat Oct 05, 2019
Fri Oct 04, 2019
Thr Oct 03, 2019
- continuing from yesterday, I copied the dev-specific backup encryption key from our shared keepass to the dev node
[root@osedev1 backups]# mv /home/maltfield/ose-dev-backups-cron.201910.key /root/backups/ [root@osedev1 backups]# chown root:root ose-dev-backups-cron.201910.key [root@osedev1 backups]# chmod 0400 ose-dev-backups-cron.201910.key [root@osedev1 backups]# ls -lah total 32K drwxr-xr-x. 4 root root 4.0K Oct 3 07:09 . dr-xr-x---. 7 root root 4.0K Oct 3 07:03 .. -rw-r--r--. 1 root root 747 Oct 2 15:57 backup.settings -rwxr-xr-x. 1 root root 5.7K Oct 3 07:03 backup.sh drwxr-xr-x. 3 root root 4.0K Sep 9 09:02 iptables -r--------. 1 root root 4.0K Oct 3 07:05 ose-dev-backups-cron.201910.key drwxr-xr-x. 2 root root 4.0K Oct 3 07:04 sync [root@osedev1 backups]#
- note that I also had to install `trickle` on the dev node
[root@osedev1 backups]# ./backup.sh ================================================================================ INFO: Beginning Backup Run on 20191003_051037 INFO: Cleaning up old backup files ... INFO: moving encrypted backup file to b2user's sync dir INFO: Beginning upload to backblaze b2 sudo: /bin/trickle: command not found real 0m0.030s user 0m0.009s sys 0m0.021s [root@osedev1 backups]# yum install trickle ... Installed: trickle.x86_64 0:1.07-19.el7 Complete! [root@osedev1 backups]#
- note that something changed in the b2 cli's install process that required me to use the '--user' flag, which changed the path to the b2 binary. To keep the mods to the backup.sh script minimal, I just created a symlink
[root@osedev1 backups]# ./backup.sh ... + echo 'INFO: Beginning upload to backblaze b2' INFO: Beginning upload to backblaze b2 + /bin/sudo -u b2user /bin/trickle -s -u 3000 /home/b2user/virtualenv/bin/b2 upload-file --noProgress --threads 1 ose-dev-server-backups /home/b2user/sync/daily_osedev120191003_051511.tar.gpg daily_osedev120191003_051511.tar.gpg trickle: exec(): No such file or directory real 0m0.040s user 0m0.012s sys 0m0.020s + exit 0 [root@osedev1 backups]# /bin/sudo -u b2user /bin/trickle -s -u 3000 /home/b2user/virtualenv/bin/b2 upload-file --noProgress --threads 1 ose-dev-server-backups /home/b2user/sync/daily_osedev120191003_051511.tar.gpg daily_osedev120191003_051511.tar.gpg trickle: exec(): No such file or directory [root@osedev1 b2user]# ln -s /home/b2user/.local/bin/b2 /home/b2user/virtualenv/bin/b2 [root@osedev1 b2user]#
- the backup script still failed at the upload to b2
[root@osedev1 backups]# ./backup.sh ... INFO: Beginning upload to backblaze b2 + /bin/sudo -u b2user /bin/trickle -s -u 3000 /home/b2user/virtualenv/bin/b2 upload-file --noProgress --threads 1 ose-dev-server-backups /home/b2user/sync/daily_osedev1_20191003_052059.tar.gpg daily_osedev1_20191003_052059.tar.gpg ERROR: Missing account data: 'NoneType' object has no attribute 'getitem' Use: b2 authorize-account real 0m0.363s user 0m0.281s sys 0m0.076s + exit 0 [root@osedev1 b2user]# [root@osedev1 b2user]# /bin/sudo -u b2user /bin/trickle -s -u 3000 /home/b2user/virtualenv/bin/b2 upload-file --noProgress --threads 1 ose-dev-server-backups /home/b2user/sync/daily_osedev1_20191003_052059.tar.gpg daily_osedev1_20191003_052059.tar.gpg ERROR: Missing account data: 'NoneType' object has no attribute 'getitem' Use: b2 authorize-account [root@osedev1 b2user]#
- per the error, I used `b2 authorize-account` and added my creds for the user 'b2user'
[root@osedev1 b2user]# su - b2user Last login: Wed Oct 2 16:15:28 CEST 2019 on pts/8 [b2user@osedev1 ~]$ .local/bin/b2 authorize-account Using https://api.backblazeb2.com Backblaze application key ID: XXXXXXXXXXXXXXXXXXXXXXXXX Backblaze application key: [b2user@osedev1 ~]$
- this time the backup succeeded!
[root@osedev1 b2user]# /root/backups/backup.sh ... INFO: moving encrypted backup file to b2user's sync dir + /bin/mv /root/backups/sync/daily_osedev1_20191003_052448.tar.gpg /home/b2user/sync/daily_osedev1_20191003_052448.tar.gpg + /bin/chown b2user /home/b2user/sync/daily_osedev1_20191003_052448.tar.gpg + echo 'INFO: Beginning upload to backblaze b2' INFO: Beginning upload to backblaze b2 + /bin/sudo -u b2user /bin/trickle -s -u 3000 /home/b2user/virtualenv/bin/b2 upload-file --noProgress --threads 1 ose-dev-server-backups /home/b2user/sync/daily_osedev1_20191003_052448.tar.gpg daily_osedev1_20191003_052448.tar.gpg URL by file name: https://f001.backblazeb2.com/file/ose-dev-server-backups/daily_osedev1_20191003_052448.tar.gpg URL by fileId: https://f001.backblazeb2.com/b2api/v2/b2_download_file_by_id?fileId=4_z2675c17c55dd1d696edd0118_f1082387e9ca2c0d4_d20191003_m052459_c001_v0001109_t0038 { "action": "upload", "fileId": "4_z2675c17c55dd1d696edd0118_f1082387e9ca2c0d4_d20191003_m052459_c001_v0001109_t0038", "fileName": "daily_osedev1_20191003_052448.tar.gpg", "size": 17233113, "uploadTimestamp": 1570080299000 } real 0m26.435s user 0m0.706s sys 0m0.251s + exit 0 [root@osedev1 b2user]#
- as an out-of-band restore validation, I downloaded the 17.2M backup file from the backblaze b2 wui onto my laptop
- again, I downloaded the encryption key from our shared keepass
user@disp5653:~/Downloads$ gpg --batch --passphrase-file ose-dev-backups-cron.201910.key --output daily_osedev1_20191003_052448.tar ose-dev-backups-cron.201910.key gpg: WARNING: no command supplied. Trying to guess what you mean ... gpg: no valid OpenPGP data found. gpg: processing message failed: Unknown system error user@disp5653:~/Downloads$ gpg --batch --passphrase-file ose-dev-backups-cron.201910.key --output daily_osedev1_20191003_052448.tar daily_osedev1_20191003_052448.tar.gpg gpg: WARNING: no command supplied. Trying to guess what you mean ... gpg: AES256 encrypted data gpg: encrypted with 1 passphrase user@disp5653:~/Downloads$ tar -xf daily_osedev1_20191003_052448.tar user@disp5653:~/Downloads$ ls daily_osedev1_20191003_052448.tar ose-dev-backups-cron.201910.key daily_osedev1_20191003_052448.tar.gpg root user@disp5653:~/Downloads$ find root/backups/sync/daily_osedev1_20191003_052448/ -type f root/backups/sync/daily_osedev1_20191003_052448/www/www.20191003_052448.tar.gz root/backups/sync/daily_osedev1_20191003_052448/root/root.20191003_052448.tar.gz root/backups/sync/daily_osedev1_20191003_052448/log/log.20191003_052448.tar.gz root/backups/sync/daily_osedev1_20191003_052448/etc/etc.20191003_052448.tar.gz root/backups/sync/daily_osedev1_20191003_052448/home/home.20191003_052448.tar.gz user@disp5653:~/Downloads$
- it looks like it's working; here are the contents of the backup file (note there are some varnish config files in here from when I did my test rsync back on Sep 9th: Maltfield_Log/2019_Q3#Mon_Sep_09.2C_2019)
user@disp5653:~/Downloads$ find root/backups/sync/daily_osedev1_20191003_052448/ -type f -exec tar -tvf '{}' \; | awk '{print $6}' | cut -d/ -f 1-2 | sort -u etc/adjtime etc/aliases etc/alternatives etc/anacrontab etc/audisp etc/audit etc/bash_completion.d etc/bashrc etc/binfmt.d etc/centos-release etc/centos-release-upstream etc/chkconfig.d etc/chrony.conf etc/chrony.keys etc/cloud etc/cron.d etc/cron.daily etc/cron.deny etc/cron.hourly etc/cron.monthly etc/crontab etc/cron.weekly etc/crypttab etc/csh.cshrc etc/csh.login etc/dbus-1 etc/default etc/depmod.d etc/dhcp etc/DIR_COLORS etc/DIR_COLORS.256color etc/DIR_COLORS.lightbgcolor etc/dnsmasq.conf etc/dnsmasq.d etc/dracut.conf etc/dracut.conf.d etc/e2fsck.conf etc/environment etc/ethertypes etc/exports etc/exports.d etc/filesystems etc/firewalld etc/fstab etc/gcrypt etc/GeoIP.conf etc/GeoIP.conf.default etc/gnupg etc/GREP_COLORS etc/groff etc/group etc/group- etc/grub2.cfg etc/grub.d etc/gshadow etc/gshadow- etc/gss etc/gssproxy etc/host.conf etc/hostname etc/hosts etc/hosts.allow etc/hosts.deny etc/idmapd.conf etc/init.d etc/inittab etc/inputrc etc/iproute2 etc/iscsi etc/issue etc/issue.net etc/kdump.conf etc/kernel etc/krb5.conf etc/krb5.conf.d etc/ld.so.cache etc/ld.so.conf etc/ld.so.conf.d etc/libaudit.conf etc/libnl etc/libuser.conf etc/libvirt etc/locale.conf etc/localtime etc/login.defs etc/logrotate.conf etc/logrotate.d etc/lvm etc/lxc etc/machine-id etc/magic etc/makedumpfile.conf.sample etc/man_db.conf etc/mke2fs.conf etc/modprobe.d etc/modules-load.d etc/motd etc/mtab etc/netconfig etc/NetworkManager etc/networks etc/nfs.conf etc/nfsmount.conf etc/nsswitch.conf etc/nsswitch.conf.bak etc/numad.conf etc/openldap etc/openvpn etc/opt etc/os-release etc/pam.d etc/passwd etc/passwd- etc/pkcs11 etc/pki etc/pm etc/polkit-1 etc/popt.d etc/ppp etc/prelink.conf.d etc/printcap etc/profile etc/profile.d etc/protocols etc/python etc/qemu-ga etc/radvd.conf etc/rc0.d etc/rc1.d etc/rc2.d etc/rc3.d etc/rc4.d etc/rc5.d etc/rc6.d etc/rc.d etc/rc.local etc/redhat-release etc/request-key.conf etc/request-key.d etc/resolv.conf etc/rpc etc/rpm etc/rsyncd.conf etc/rsyslog.conf etc/rsyslog.d etc/rwtab etc/rwtab.d etc/sasl2 etc/screenrc etc/securetty etc/security etc/selinux etc/services etc/sestatus.conf etc/shadow etc/shadow- etc/shells etc/skel etc/ssh etc/ssl etc/statetab etc/statetab.d etc/subgid etc/subuid etc/sudo.conf etc/sudoers etc/sudoers.d etc/sudo-ldap.conf etc/sysconfig etc/sysctl.conf etc/sysctl.d etc/systemd etc/system-release etc/system-release-cpe etc/tcsd.conf etc/terminfo etc/timezone etc/tmpfiles.d etc/trickled.conf etc/tuned etc/udev etc/unbound etc/varnish etc/vconsole.conf etc/vimrc etc/virc etc/wpa_supplicant etc/X11 etc/xdg etc/xinetd.d etc/yum etc/yum.conf etc/yum.repos.d home/b2user home/maltfield root/anaconda-ks.cfg root/backups root/Finished root/original-ks.cfg root/Package root/pki root/Running var/log user@disp5653:~/Downloads$
- and as a true end-to-end test, I restored the sshd_config file
user@disp5653:~/Downloads/root/backups/sync/daily_osedev1_20191003_052448/etc$ pwd /home/user/Downloads/root/backups/sync/daily_osedev1_20191003_052448/etc user@disp5653:~/Downloads/root/backups/sync/daily_osedev1_20191003_052448/etc$ date Thu Oct 3 11:37:49 +0545 2019 user@disp5653:~/Downloads/root/backups/sync/daily_osedev1_20191003_052448/etc$ ls etc.20191003_052448.tar.gz user@disp5653:~/Downloads/root/backups/sync/daily_osedev1_20191003_052448/etc$ tar -xzf etc.20191003_052448.tar.gz user@disp5653:~/Downloads/root/backups/sync/daily_osedev1_20191003_052448/etc$ tail etc/ssh/sshd_config # override default of no subsystems Subsystem sftp /usr/libexec/openssh/sftp-server # Example of overriding settings on a per-user basis #Match User anoncvs # X11Forwarding no # AllowTcpForwarding no # PermitTTY no # ForceCommand cvs server user@disp5653:~/Downloads/root/backups/sync/daily_osedev1_20191003_052448/etc$
- I also copied the cron job and the backup report script to the dev node
[root@opensourceecology ~]# cat /etc/cron.d/backup_to_backblaze 20 07 * * * root time /bin/nice /root/backups/backup.sh &>> /var/log/backups/backup.log 20 04 03 * * root time /bin/nice /root/backups/backupReport.sh [root@opensourceecology ~]#
- I tried testing the backup report script, but it complained that the `mail` command was absent. Otherwise it appears to work without modifications
[root@osedev1 backups]# ./backupReport.sh ./backupReport.sh: line 90: /usr/bin/mail: No such file or directory INFO: email body below ATTENTION: BACKUPS MISSING! WARNING: First of this month's backup (20191001) is missing! WARNING: First of last month's backup (20190901) is missing! WARNING: Yesterday's backup (20191002) is missing! WARNING: The day before yesterday's backup (20191001) is missing! See below for the contents of the backblaze b2 bucket = ose-dev-server-backups daily_osedev1_20191003_052448.tar.gpg --- Note: This report was generated on 20191003_060036 UTC by script '/root/backups/backupReport.sh' This script was triggered by '/etc/cron.d/backup_to_backblaze' For more information about OSE backups, please see the relevant documentation pages on the wiki: * https://wiki.opensourceecology.org/wiki/Backblaze * https://wiki.opensourceecology.org/wiki/OSE_Server#Backups [root@osedev1 backups]#
- I installed mailx and re-ran the script
[root@osedev1 backups]# yum install mailx ... Installed: mailx.x86_64 0:12.5-19.el7 Complete! [root@osedev1 backups]#
- this time it failed because sendmail is not installed; I *could* install postfix, but I decided just to install sendmail
[root@osedev1 backups]# ./backupReport.sh ... /usr/sbin/sendmail: No such file or directory "/root/dead.letter" 30/1215 . . . message not sent. [root@osedev1 backups]# rpm -qa | grep postfix [root@osedev1 backups]# rpm -qa | grep exim [root@osedev1 backups]# yum install sendmail ... Installed: sendmail.x86_64 0:8.14.7-5.el7 Dependency Installed: hesiod.x86_64 0:3.2.1-3.el7 procmail.x86_64 0:3.22-36.el7_4.1 Complete! [root@osedev1 backups]#
- this time it ran without error, but I never got an email. This is probably because gmail is rejecting it; we don't have DNS set up properly for this server to send mail. Anyway, this is good enough for our dev node's backups for now.
- I also added the same lifecycle rules that we have for the 'ose-server-backups' bucket to the 'ose-dev-server-backups' bucket in the backblaze b2 wui
- let's proceed with getting openvpn clients configured for the prod node (and its clone the staging node, which will use the same client cert)
- as I did on Sep 9 when I created my client cert for 'maltfield', I created a new cert for 'hetzner2' (Maltfield_Log/2019_Q3#Mon_Sep_09.2C_2019)
- again, the ca and cert files are located in /usr/share/easy-rsa/3/pki/
- I documented this dir on the wiki OpenVPN
- interestingly, I could only execute these commands from the dir above the pki dir
[root@osedev1 pki]# /usr/share/easy-rsa/3.0.6/easyrsa build-client-full hetzner2 Easy-RSA error: EASYRSA_PKI does not exist (perhaps you need to run init-pki)? Expected to find the EASYRSA_PKI at: /usr/share/easy-rsa/3/pki/pki Run easyrsa without commands for usage and command help. [root@osedev1 pki]# [root@osedev1 pki]# cd .. [root@osedev1 3]# /usr/share/easy-rsa/3.0.6/easyrsa build-client-full hetzner2 Using SSL: openssl OpenSSL 1.0.2k-fips 26 Jan 2017 Generating a 2048 bit RSA private key .......................................................................+++ ............................................+++ writing new private key to '/usr/share/easy-rsa/3/pki/private/hetzner2.key.7F3A32KzES' Enter PEM pass phrase:
- note I appended the option 'nopass' so that the hetzner2 prod server can connect to the vpn automatically, using only its private certificate file and no password (it may be a good idea to look into whether we can whitelist specific source IPs for this client, since this hetzner2 client will only ever connect from the prod or staging server's static ip addresses; see the sketch below)
[root@osedev1 3]# /usr/share/easy-rsa/3.0.6/easyrsa help build-client-full build-client-full <filename_base> [ cmd-opts ] build-server-full <filename_base> [ cmd-opts ] build-serverClient-full <filename_base> [ cmd-opts ] Generate a keypair and sign locally for a client and/or server This mode uses the <filename_base> as the X509 CN. cmd-opts is an optional set of command options from this list: nopass - do not encrypt the private key (default is encrypted) [root@osedev1 3]# /usr/share/easy-rsa/3.0.6/easyrsa build-client-full hetzner2 nopass Using SSL: openssl OpenSSL 1.0.2k-fips 26 Jan 2017 Generating a 2048 bit RSA private key ..................................................................................................+++ .....+++ writing new private key to '/usr/share/easy-rsa/3/pki/private/hetzner2.key.qQ1HGf7ovg' ----- Using configuration from /usr/share/easy-rsa/3/pki/safessl-easyrsa.cnf Enter pass phrase for /usr/share/easy-rsa/3/pki/private/ca.key: Check that the request matches the signature Signature ok The Subject's Distinguished Name is as follows commonName :ASN.1 12:'hetzner2' Certificate is to be certified until Sep 17 06:42:28 2022 GMT (1080 days) Write out database with 1 new entries Data Base Updated [root@osedev1 3]#
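- a sketch of what that whitelist could look like with iptables on osedev1 (not implemented; note this is a port-level restriction rather than per-certificate, so my roaming laptop client's addresses would need to be accepted too; prod's static IP is taken from the `ip a` output below):
# only accept OpenVPN packets from known static client IPs, drop the rest
iptables -A INPUT -p udp --dport 1194 -s 138.201.84.223 -j ACCEPT
iptables -A INPUT -p udp --dport 1194 -j DROP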
- I copied the necessary files to the prod server
[root@osedev1 3]# cp pki/private/hetzner2.key /home/maltfield/ [root@osedev1 3]# cp pki/issued/hetzner2.crt /home/maltfield/ [root@osedev1 3]# cp pki/private/ta.key /home/maltfield/ [root@osedev1 3]# cp pki/ca.crt /home/maltfield/ [root@osedev1 3]# chown maltfield /home/maltfield/*.cert [root@osedev1 3]# chown maltfield /home/maltfield/*.key [root@osedev1 3]# logout [maltfield@osedev1 ~]$ scp -P32415 /home/maltfield/hetzner2* opensourceecology.org: hetzner2.crt 100% 5675 2.8MB/s 00:00 hetzner2.key 100% 1708 1.0MB/s 00:00 [maltfield@osedev1 ~]$ scp -P32415 /home/maltfield/*.key opensourceecology.org: hetzner2.key 100% 1708 1.0MB/s 00:00 ta.key 100% 636 368.9KB/s 00:00 [maltfield@osedev1 ~]$ shred -u /home/maltfield/*.key [maltfield@osedev1 ~]$ shred -u /home/maltfield/hetzner2.* [maltfield@osedev1 ~]$
- and I moved them to '/root/openvpn' and locked-down the files on the prod hetzner2 server
[root@opensourceecology maltfield]# cd /root [root@opensourceecology ~]# ls backups bin iptables output.json rsyncTest sandbox staging.opensourceecology.org tmp [root@opensourceecology ~]# mkdir openvpn [root@opensourceecology ~]# cd openvpn [root@opensourceecology openvpn]# mv /home/maltfield/hetzner2* . [root@opensourceecology openvpn]# mv /home/maltfield/*.key . [root@opensourceecology openvpn]# mv /home/maltfield/ca.crt . [root@opensourceecology openvpn]# ls -lah total 28K drwxr-xr-x 2 root root 4.0K Oct 3 06:53 . dr-xr-x---. 20 root root 4.0K Oct 3 06:53 .. -rw------- 1 maltfield maltfield 3.3K Oct 3 06:51 ca.crt -rw------- 1 maltfield maltfield 5.6K Oct 3 06:51 hetzner2.crt -rw------- 1 maltfield maltfield 1.7K Oct 3 06:51 hetzner2.key -rw------- 1 maltfield maltfield 636 Oct 3 06:51 ta.key [root@opensourceecology openvpn]# chown root:root * [root@opensourceecology openvpn]# ls -lah total 28K drwxr-xr-x 2 root root 4.0K Oct 3 06:53 . dr-xr-x---. 20 root root 4.0K Oct 3 06:53 .. -rw------- 1 root root 3.3K Oct 3 06:51 ca.crt -rw------- 1 root root 5.6K Oct 3 06:51 hetzner2.crt -rw------- 1 root root 1.7K Oct 3 06:51 hetzner2.key -rw------- 1 root root 636 Oct 3 06:51 ta.key [root@opensourceecology openvpn]# chmod 0700 . [root@opensourceecology openvpn]# ls -lah total 28K drwx------ 2 root root 4.0K Oct 3 06:53 . dr-xr-x---. 20 root root 4.0K Oct 3 06:53 .. -rw------- 1 root root 3.3K Oct 3 06:51 ca.crt -rw------- 1 root root 5.6K Oct 3 06:51 hetzner2.crt -rw------- 1 root root 1.7K Oct 3 06:51 hetzner2.key -rw------- 1 root root 636 Oct 3 06:51 ta.key [root@opensourceecology openvpn]#
- then I created a client.conf file from my personal client.conf file & modified it to use the new cert & key files
[root@opensourceecology openvpn]# vim client.conf [root@opensourceecology openvpn]# ls -lah client.conf -rw-r--r-- 1 root root 3.6K Oct 3 06:56 client.conf [root@opensourceecology openvpn]# chmod 0600 client.conf [root@opensourceecology openvpn]# cat client.conf ############################################## # Sample client-side OpenVPN 2.0 config file # # for connecting to multi-client server. # # # # This configuration can be used by multiple # # clients, however each client should have # # its own cert and key files. # # # # On Windows, you might want to rename this # # file so it has a .ovpn extension # ############################################## # Specify that we are a client and that we # will be pulling certain config file directives # from the server. client # Use the same setting as you are using on # the server. # On most systems, the VPN will not function # unless you partially or fully disable # the firewall for the TUN/TAP interface. ;dev tap dev tun # Windows needs the TAP-Win32 adapter name # from the Network Connections panel # if you have more than one. On XP SP2, # you may need to disable the firewall # for the TAP adapter. ;dev-node MyTap # Are we connecting to a TCP or # UDP server? Use the same setting as # on the server. ;proto tcp proto udp # The hostname/IP and port of the server. # You can have multiple remote entries # to load balance between the servers. remote 195.201.233.113 1194 ;remote my-server-2 1194 # Choose a random host from the remote # list for load-balancing. Otherwise # try hosts in the order specified. ;remote-random # Keep trying indefinitely to resolve the # host name of the OpenVPN server. Very useful # on machines which are not permanently connected # to the internet such as laptops. resolv-retry infinite # Most clients don't need to bind to # a specific local port number. nobind # Downgrade privileges after initialization (non-Windows only) ;user nobody ;group nobody # Try to preserve some state across restarts. persist-key persist-tun # If you are connecting through an # HTTP proxy to reach the actual OpenVPN # server, put the proxy server/IP and # port number here. See the man page # if your proxy server requires # authentication. ;http-proxy-retry # retry on connection failures ;http-proxy [proxy server] [proxy port #] # Wireless networks often produce a lot # of duplicate packets. Set this flag # to silence duplicate packet warnings. ;mute-replay-warnings # SSL/TLS parms. # See the server config file for more # description. It's best to use # a separate .crt/.key file pair # for each client. A single ca # file can be used for all clients. ca ca.crt cert hetzner2.crt key hetzner2.key # Verify server certificate by checking that the # certicate has the correct key usage set. # This is an important precaution to protect against # a potential attack discussed here: # http://openvpn.net/howto.html#mitm # # To use this feature, you will need to generate # your server certificates with the keyUsage set to # digitalSignature, keyEncipherment # and the extendedKeyUsage to # serverAuth # EasyRSA can do this for you. remote-cert-tls server # If a tls-auth key is used on the server # then every client must also have the key. tls-auth ta.key 1 # Select a cryptographic cipher. # If the cipher option is used on the server # then you must also specify it here. # Note that v2.4 client/server will automatically # negotiate AES-256-GCM in TLS mode. # See also the ncp-cipher option in the manpage cipher AES-256-GCM # Enable compression on the VPN link. 
# Don't enable this unless it is also # enabled in the server config file. #comp-lzo # Set log file verbosity. verb 3 # Silence repeating messages ;mute 20 # hardening tls-cipher TLS-DHE-RSA-WITH-AES-256-GCM-SHA384 [root@opensourceecology openvpn]#
- I installed the 'openvpn' package on the production hetzner2 server
[root@opensourceecology openvpn]# yum install openvpn ... Installed: openvpn.x86_64 0:2.4.7-1.el7 Dependency Installed: lz4.x86_64 0:1.7.5-3.el7 pkcs11-helper.x86_64 0:1.11-3.el7 Complete! [root@opensourceecology openvpn]#
- I was successfully able to connect to the vpn on the dev node from the prod node
[root@opensourceecology openvpn]# openvpn client.conf Thu Oct 3 07:06:45 2019 OpenVPN 2.4.7 x86_64-redhat-linux-gnu [Fedora EPEL patched] [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Feb 20 2019 Thu Oct 3 07:06:45 2019 library versions: OpenSSL 1.0.2k-fips 26 Jan 2017, LZO 2.06 Thu Oct 3 07:06:45 2019 Outgoing Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Thu Oct 3 07:06:45 2019 Incoming Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Thu Oct 3 07:06:45 2019 TCP/UDP: Preserving recently used remote address: [AF_INET]195.201.233.113:1194 Thu Oct 3 07:06:45 2019 Socket Buffers: R=[212992->212992] S=[212992->212992] Thu Oct 3 07:06:45 2019 UDP link local: (not bound) Thu Oct 3 07:06:45 2019 UDP link remote: [AF_INET]195.201.233.113:1194 Thu Oct 3 07:06:45 2019 TLS: Initial packet from [AF_INET]195.201.233.113:1194, sid=865b6fa1 7dcf4731 Thu Oct 3 07:06:45 2019 VERIFY OK: depth=1, CN=osedev1 Thu Oct 3 07:06:45 2019 VERIFY KU OK Thu Oct 3 07:06:45 2019 Validating certificate extended key usage Thu Oct 3 07:06:45 2019 ++ Certificate has EKU (str) TLS Web Server Authentication, expects TLS Web Server Authentication Thu Oct 3 07:06:45 2019 VERIFY EKU OK Thu Oct 3 07:06:45 2019 VERIFY OK: depth=0, CN=server Thu Oct 3 07:06:45 2019 Control Channel: TLSv1.2, cipher TLSv1/SSLv3 DHE-RSA-AES256-GCM-SHA384, 4096 bit RSA Thu Oct 3 07:06:45 2019 [server] Peer Connection Initiated with [AF_INET]195.201.233.113:1194 Thu Oct 3 07:06:46 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 07:06:46 2019 PUSH: Received control message: 'PUSH_REPLY,route 10.241.189.1,topology net30,ping 10,ping-restart 120,ifconfig 10.241.189.10 10.241.189.9,peer-id 0,cipher AES-256-GCM' Thu Oct 3 07:06:46 2019 OPTIONS IMPORT: timers and/or timeouts modified Thu Oct 3 07:06:46 2019 OPTIONS IMPORT: --ifconfig/up options modified Thu Oct 3 07:06:46 2019 OPTIONS IMPORT: route options modified Thu Oct 3 07:06:46 2019 OPTIONS IMPORT: peer-id set Thu Oct 3 07:06:46 2019 OPTIONS IMPORT: adjusting link_mtu to 1624 Thu Oct 3 07:06:46 2019 OPTIONS IMPORT: data channel crypto options modified Thu Oct 3 07:06:46 2019 Outgoing Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key Thu Oct 3 07:06:46 2019 Incoming Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key Thu Oct 3 07:06:46 2019 ROUTE_GATEWAY 138.201.84.193 Thu Oct 3 07:06:46 2019 TUN/TAP device tun0 opened Thu Oct 3 07:06:46 2019 TUN/TAP TX queue length set to 100 Thu Oct 3 07:06:46 2019 /sbin/ip link set dev tun0 up mtu 1500 Thu Oct 3 07:06:46 2019 /sbin/ip addr add dev tun0 local 10.241.189.10 peer 10.241.189.9 Thu Oct 3 07:06:46 2019 /sbin/ip route add 10.241.189.1/32 via 10.241.189.9 Thu Oct 3 07:06:46 2019 WARNING: this configuration may cache passwords in memory -- use the auth-nocache option to prevent this Thu Oct 3 07:06:46 2019 Initialization Sequence Completed
- the prod server now has a tun0 interface with an ip address of 10.241.189.10 on the VPN private network subnet
[root@opensourceecology ~]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000 link/ether 90:1b:0e:94:07:c4 brd ff:ff:ff:ff:ff:ff inet 138.201.84.223 peer 138.201.84.193/32 brd 138.201.84.223 scope global eth0 valid_lft forever preferred_lft forever inet 138.201.84.223/32 scope global eth0 valid_lft forever preferred_lft forever inet 138.201.84.243/16 scope global eth0 valid_lft forever preferred_lft forever inet 138.201.84.243 peer 138.201.84.193/32 brd 138.201.255.255 scope global secondary eth0 valid_lft forever preferred_lft forever inet6 2a01:4f8:172:209e::2/64 scope global valid_lft forever preferred_lft forever inet6 fe80::921b:eff:fe94:7c4/64 scope link valid_lft forever preferred_lft forever 4: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 100 link/none inet 10.241.189.10 peer 10.241.189.9/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::a834:c77a:f65f:76fc/64 scope link flags 800 valid_lft forever preferred_lft forever [root@opensourceecology ~]#
- I confirmed that the website didn't break ☺
- now I created the same dir on the staging node (note this weird systemd journal corruption error that slowed things down quite a bit)
[root@osedev1 ~]# lxc-start -n osestaging1 systemd 219 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN) Detected virtualization lxc. Detected architecture x86-64. Welcome to CentOS Linux 7 (Core)! ... Kernel 3.10.0-957.21.3.el7.x86_64 on an x86_64 osestaging1 login: maltfield Password: Last login: Wed Oct 2 13:01:56 on lxc/console [maltfield@osestaging1 ~]$ sudo su - [sudo] password for maltfield: <44>systemd-journald[297]: File /run/log/journal/dd9978e8797e4112832634fa4d174c7b/system.journal corrupted or uncleanly shut down, renaming and replacing. Last login: Wed Oct 2 13:15:46 UTC 2019 on lxc/console Last failed login: Thu Oct 3 07:11:57 UTC 2019 on lxc/console There was 1 failed login attempt since the last successful login. [root@osestaging1 ~]#
- on the dev node again
[root@osedev1 pki]# cp private/hetzner2.key /home/maltfield/ [root@osedev1 pki]# cp issued/hetzner2.crt /home/maltfield/ [root@osedev1 pki]# cp private/ta.key /home/maltfield/ [root@osedev1 pki]# chown maltfield /home/maltfield/*.key [root@osedev1 pki]# chown maltfield /home/maltfield/*.crt [root@osedev1 pki]# logout [maltfield@osedev1 ~]$ scp -P 32415 /home/maltfield/*.key 192.168.122.201: hetzner2.key 100% 1708 2.4MB/s 00:00 ta.key 100% 636 1.2MB/s 00:00 [maltfield@osedev1 ~]$ scp -P 32415 /home/maltfield/*.crt 192.168.122.201: ca.crt 100% 1850 2.6MB/s 00:00 hetzner2.crt 100% 5675 9.0MB/s 00:00 [maltfield@osedev1 ~]$ shred -u /home/maltfield/*.key [maltfield@osedev1 ~]$ shred -u /home/maltfield/*.crt [maltfield@osedev1 ~]$
- and back on the staging container node
[root@osestaging1 ~]# cd /root/openvpn [root@osestaging1 openvpn]# ls [root@osestaging1 openvpn]# mv /home/maltfield/*.crt . [root@osestaging1 openvpn]# mv /home/maltfield/*.key . [root@osestaging1 openvpn]# ls -lah total 28K drwxr-xr-x. 2 root root 4.0K Oct 3 07:23 . dr-xr-x---. 3 root root 4.0K Oct 3 07:18 .. -rw-------. 1 maltfield maltfield 1.9K Oct 3 07:21 ca.crt -rw-------. 1 maltfield maltfield 5.6K Oct 3 07:21 hetzner2.crt -rw-------. 1 maltfield maltfield 1.7K Oct 3 07:21 hetzner2.key -rw-------. 1 maltfield maltfield 636 Oct 3 07:21 ta.key [root@osestaging1 openvpn]# chown root:root * [root@osestaging1 openvpn]# chmod 0700 . [root@osestaging1 openvpn]# ls -lah total 28K drwx------. 2 root root 4.0K Oct 3 07:23 . dr-xr-x---. 3 root root 4.0K Oct 3 07:18 .. -rw-------. 1 root root 1.9K Oct 3 07:21 ca.crt -rw-------. 1 root root 5.6K Oct 3 07:21 hetzner2.crt -rw-------. 1 root root 1.7K Oct 3 07:21 hetzner2.key -rw-------. 1 root root 636 Oct 3 07:21 ta.key [root@osestaging1 openvpn]#
- I also installed vim, epel-release, and openvpn on the staging node
- I had an issue connecting to the vpn from within the staging node; this appears to be a known issue when trying to connect to a vpn from within a docker or lxc container https://serverfault.com/questions/429461/no-tun-device-in-lxc-guest-for-openvpn
[root@osestaging1 openvpn]# openvpn client.conf Thu Oct 3 07:29:17 2019 OpenVPN 2.4.7 x86_64-redhat-linux-gnu [Fedora EPEL patched] [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Feb 20 2019 Thu Oct 3 07:29:17 2019 library versions: OpenSSL 1.0.2k-fips 26 Jan 2017, LZO 2.06 Thu Oct 3 07:29:17 2019 Outgoing Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Thu Oct 3 07:29:17 2019 Incoming Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Thu Oct 3 07:29:17 2019 TCP/UDP: Preserving recently used remote address: [AF_INET]195.201.233.113:1194 Thu Oct 3 07:29:17 2019 Socket Buffers: R=[212992->212992] S=[212992->212992] Thu Oct 3 07:29:17 2019 UDP link local: (not bound) Thu Oct 3 07:29:17 2019 UDP link remote: [AF_INET]195.201.233.113:1194 Thu Oct 3 07:29:17 2019 TLS: Initial packet from [AF_INET]195.201.233.113:1194, sid=f2e8fcad efdb9311 Thu Oct 3 07:29:17 2019 VERIFY OK: depth=1, CN=osedev1 Thu Oct 3 07:29:17 2019 VERIFY KU OK Thu Oct 3 07:29:17 2019 Validating certificate extended key usage Thu Oct 3 07:29:17 2019 ++ Certificate has EKU (str) TLS Web Server Authentication, expects TLS Web Server Authentication Thu Oct 3 07:29:17 2019 VERIFY EKU OK Thu Oct 3 07:29:17 2019 VERIFY OK: depth=0, CN=server Thu Oct 3 07:29:17 2019 Control Channel: TLSv1.2, cipher TLSv1/SSLv3 DHE-RSA-AES256-GCM-SHA384, 4096 bit RSA Thu Oct 3 07:29:17 2019 [server] Peer Connection Initiated with [AF_INET]195.201.233.113:1194 Thu Oct 3 07:29:18 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 07:29:18 2019 PUSH: Received control message: 'PUSH_REPLY,route 10.241.189.1,topology net30,ping 10,ping-restart 120,ifconfig 10.241.189.10 10.241.189.9,peer-id 0,cipher AES-256-GCM' Thu Oct 3 07:29:18 2019 OPTIONS IMPORT: timers and/or timeouts modified Thu Oct 3 07:29:18 2019 OPTIONS IMPORT: --ifconfig/up options modified Thu Oct 3 07:29:18 2019 OPTIONS IMPORT: route options modified Thu Oct 3 07:29:18 2019 OPTIONS IMPORT: peer-id set Thu Oct 3 07:29:18 2019 OPTIONS IMPORT: adjusting link_mtu to 1624 Thu Oct 3 07:29:18 2019 OPTIONS IMPORT: data channel crypto options modified Thu Oct 3 07:29:18 2019 Outgoing Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key Thu Oct 3 07:29:18 2019 Incoming Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key Thu Oct 3 07:29:18 2019 ROUTE_GATEWAY 192.168.122.1/255.255.255.0 IFACE=eth0 HWADDR=fe:07:06:a6:5f:1d Thu Oct 3 07:29:18 2019 ERROR: Cannot open TUN/TAP dev /dev/net/tun: No such file or directory (errno=2) Thu Oct 3 07:29:18 2019 Exiting due to fatal error [root@osestaging1 openvpn]#
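- for reference, the common fix for this class of error is to permit the tun device in the container's config on the host and create the device node inside the container; a sketch, assuming the container config lives at /var/lib/lxc/osestaging1/config:
# on the host (osedev1): add this line to /var/lib/lxc/osestaging1/config & restart the container
lxc.cgroup.devices.allow = c 10:200 rwm
# inside the container: create the tun device node
mkdir -p /dev/net
mknod /dev/net/tun c 10 200
chmod 0666 /dev/net/tun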
- the above link suggests following the arch linux guide to create an openvpn client systemd unit within the container
[root@osestaging1 openvpn]# ls /usr/lib/systemd/system/openvpn-client\@.service /usr/lib/systemd/system/openvpn-client@.service [root@osestaging1 openvpn]# ls /etc/systemd/system/ basic.target.wants default.target.wants local-fs.target.wants sysinit.target.wants default.target getty.target.wants multi-user.target.wants system-update.target.wants [root@osestaging1 openvpn]# cp /usr/lib/systemd/system/openvpn-client\@.service /etc/systemd/system/ [root@osestaging1 openvpn]# grep /etc/systemd/system/openvpn-client\@.service LimitNPROC grep: LimitNPROC: No such file or directory [root@osestaging1 openvpn]# grep LimitNPROC /etc/systemd/system/openvpn-client\@.service LimitNPROC=10 [root@osestaging1 openvpn]# vim /etc/systemd/system/openvpn-client\@.service [root@osestaging1 openvpn]# grep LimitNPROC /etc/systemd/system/openvpn-client\@.service #LimitNPROC=10 [root@osestaging1 openvpn]#
- that didn't work; it wants something after the '@'. I did that, and then realized that I'd also need to modify the unit to point at the correct config file
[root@osestaging1 openvpn]# cd /etc/systemd/system [root@osestaging1 system]# ls basic.target.wants getty.target.wants openvpn-client@.service default.target local-fs.target.wants sysinit.target.wants default.target.wants multi-user.target.wants system-update.target.wants [root@osestaging1 system]# mv openvpn-client\@.service openvpn-client\@dev.service [root@osestaging1 system]# systemctl status openvpn-client\@dev.service ● openvpn-client@dev.service - OpenVPN tunnel for dev Loaded: loaded (/etc/systemd/system/openvpn-client@dev.service; disabled; vendor preset: disabled) Active: inactive (dead) Docs: man:openvpn(8) https://community.openvpn.net/openvpn/wiki/Openvpn24ManPage https://community.openvpn.net/openvpn/wiki/HOWTO [root@osestaging1 system]# systemctl start openvpn-client\@dev.service Job for openvpn-client@dev.service failed because the control process exited with error code. See "systemctl status openvpn-client@dev.service" and "journalctl -xe" for details. [root@osestaging1 system]# systemctl status openvpn-client\@dev.service ● openvpn-client@dev.service - OpenVPN tunnel for dev Loaded: loaded (/etc/systemd/system/openvpn-client@dev.service; disabled; vendor preset: disabled) Active: failed (Result: exit-code) since Thu 2019-10-03 07:44:09 UTC; 16s ago Docs: man:openvpn(8) https://community.openvpn.net/openvpn/wiki/Openvpn24ManPage https://community.openvpn.net/openvpn/wiki/HOWTO Process: 557 ExecStart=/usr/sbin/openvpn --suppress-timestamps --nobind --config %i.conf (code=exited, status=1/FAILURE) Main PID: 557 (code=exited, status=1/FAILURE) Oct 03 07:44:08 osestaging1 systemd[1]: Starting OpenVPN tunnel for dev... Oct 03 07:44:09 osestaging1 openvpn[557]: Options error: In [CMD-LINE]:1: Error opening configuration file: dev.conf Oct 03 07:44:09 osestaging1 openvpn[557]: Use --help for more information. Oct 03 07:44:09 osestaging1 systemd[1]: openvpn-client@dev.service: main process exited, code=exited, status=...ILURE Oct 03 07:44:09 osestaging1 systemd[1]: Failed to start OpenVPN tunnel for dev. Oct 03 07:44:09 osestaging1 systemd[1]: Unit openvpn-client@dev.service entered failed state. Oct 03 07:44:09 osestaging1 systemd[1]: openvpn-client@dev.service failed. Hint: Some lines were ellipsized, use -l to show in full. [root@osestaging1 system]# vim openvpn-client\@dev.service
- I updated the working dir and changed the service name to match the name of the config file in there
[root@osestaging1 system]# cat openvpn-client\@dev.service [Unit] Description=OpenVPN tunnel for %I After=syslog.target network-online.target Wants=network-online.target Documentation=man:openvpn(8) Documentation=https://community.openvpn.net/openvpn/wiki/Openvpn24ManPage Documentation=https://community.openvpn.net/openvpn/wiki/HOWTO [Service] Type=notify PrivateTmp=true WorkingDirectory=/etc/openvpn/client ExecStart=/usr/sbin/openvpn --suppress-timestamps --nobind --config %i.conf CapabilityBoundingSet=CAP_IPC_LOCK CAP_NET_ADMIN CAP_NET_RAW CAP_SETGID CAP_SETUID CAP_SYS_CHROOT CAP_DAC_OVERRIDE #LimitNPROC=10 DeviceAllow=/dev/null rw DeviceAllow=/dev/net/tun rw ProtectSystem=true ProtectHome=true KillMode=process [Install] WantedBy=multi-user.target [root@osestaging1 system]# vim openvpn-client\@dev.service [root@osestaging1 system]# mv openvpn-client\@dev.service openvpn-client\@client.service [root@osestaging1 system]# cat openvpn-client\@client.service [Unit] Description=OpenVPN tunnel for %I After=syslog.target network-online.target Wants=network-online.target Documentation=man:openvpn(8) Documentation=https://community.openvpn.net/openvpn/wiki/Openvpn24ManPage Documentation=https://community.openvpn.net/openvpn/wiki/HOWTO [Service] Type=notify PrivateTmp=true #WorkingDirectory=/etc/openvpn/client WorkingDirectory=/root/openvpn ExecStart=/usr/sbin/openvpn --suppress-timestamps --nobind --config %i.conf CapabilityBoundingSet=CAP_IPC_LOCK CAP_NET_ADMIN CAP_NET_RAW CAP_SETGID CAP_SETUID CAP_SYS_CHROOT CAP_DAC_OVERRIDE #LimitNPROC=10 DeviceAllow=/dev/null rw DeviceAllow=/dev/net/tun rw ProtectSystem=true ProtectHome=true KillMode=process [Install] WantedBy=multi-user.target [root@osestaging1 system]#
- this failed; I gave up and went with manually creating the tun interface per the guide, even though someone else commented that this would no longer work; it worked!
[root@osestaging1 openvpn]# openvpn client.conf Thu Oct 3 08:02:50 2019 OpenVPN 2.4.7 x86_64-redhat-linux-gnu [Fedora EPEL patched] [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Feb 20 2019 Thu Oct 3 08:02:50 2019 library versions: OpenSSL 1.0.2k-fips 26 Jan 2017, LZO 2.06 Thu Oct 3 08:02:50 2019 Outgoing Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Thu Oct 3 08:02:50 2019 Incoming Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Thu Oct 3 08:02:50 2019 TCP/UDP: Preserving recently used remote address: [AF_INET]195.201.233.113:1194 Thu Oct 3 08:02:50 2019 Socket Buffers: R=[212992->212992] S=[212992->212992] Thu Oct 3 08:02:50 2019 UDP link local: (not bound) Thu Oct 3 08:02:50 2019 UDP link remote: [AF_INET]195.201.233.113:1194 Thu Oct 3 08:02:50 2019 TLS: Initial packet from [AF_INET]195.201.233.113:1194, sid=10846fe0 74bf0345 Thu Oct 3 08:02:50 2019 VERIFY OK: depth=1, CN=osedev1 Thu Oct 3 08:02:50 2019 VERIFY KU OK Thu Oct 3 08:02:50 2019 Validating certificate extended key usage Thu Oct 3 08:02:50 2019 ++ Certificate has EKU (str) TLS Web Server Authentication, expects TLS Web Server Authentication Thu Oct 3 08:02:50 2019 VERIFY EKU OK Thu Oct 3 08:02:50 2019 VERIFY OK: depth=0, CN=server Thu Oct 3 08:02:50 2019 Control Channel: TLSv1.2, cipher TLSv1/SSLv3 DHE-RSA-AES256-GCM-SHA384, 4096 bit RSA Thu Oct 3 08:02:50 2019 [server] Peer Connection Initiated with [AF_INET]195.201.233.113:1194 Thu Oct 3 08:02:51 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:02:51 2019 PUSH: Received control message: 'PUSH_REPLY,route 10.241.189.1,topology net30,ping 10,ping-restart 120,ifconfig 10.241.189.10 10.241.189.9,peer-id 0,cipher AES-256-GCM' Thu Oct 3 08:02:51 2019 OPTIONS IMPORT: timers and/or timeouts modified Thu Oct 3 08:02:51 2019 OPTIONS IMPORT: --ifconfig/up options modified Thu Oct 3 08:02:51 2019 OPTIONS IMPORT: route options modified Thu Oct 3 08:02:51 2019 OPTIONS IMPORT: peer-id set Thu Oct 3 08:02:51 2019 OPTIONS IMPORT: adjusting link_mtu to 1624 Thu Oct 3 08:02:51 2019 OPTIONS IMPORT: data channel crypto options modified Thu Oct 3 08:02:51 2019 Outgoing Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key Thu Oct 3 08:02:51 2019 Incoming Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key Thu Oct 3 08:02:51 2019 ROUTE_GATEWAY 192.168.122.1/255.255.255.0 IFACE=eth0 HWADDR=fe:07:06:a6:5f:1d Thu Oct 3 08:02:51 2019 ERROR: Cannot open TUN/TAP dev /dev/net/tun: No such file or directory (errno=2) Thu Oct 3 08:02:51 2019 Exiting due to fatal error [root@osestaging1 openvpn]# mkdir /dev/net [root@osestaging1 openvpn]# mknod /dev/net/tun c 10 200 [root@osestaging1 openvpn]# chmod 666 /dev/net/tun [root@osestaging1 openvpn]# openvpn client.conf Thu Oct 3 08:03:42 2019 OpenVPN 2.4.7 x86_64-redhat-linux-gnu [Fedora EPEL patched] [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Feb 20 2019 Thu Oct 3 08:03:42 2019 library versions: OpenSSL 1.0.2k-fips 26 Jan 2017, LZO 2.06 Thu Oct 3 08:03:42 2019 Outgoing Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Thu Oct 3 08:03:42 2019 Incoming Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication Thu Oct 3 08:03:42 2019 TCP/UDP: Preserving recently used remote address: [AF_INET]195.201.233.113:1194 Thu Oct 3 08:03:42 2019 Socket Buffers: R=[212992->212992] S=[212992->212992] Thu Oct 3 08:03:42 2019 
UDP link local: (not bound) Thu Oct 3 08:03:42 2019 UDP link remote: [AF_INET]195.201.233.113:1194 Thu Oct 3 08:03:42 2019 TLS: Initial packet from [AF_INET]195.201.233.113:1194, sid=dcadaef9 7ebea8f1 Thu Oct 3 08:03:42 2019 VERIFY OK: depth=1, CN=osedev1 Thu Oct 3 08:03:42 2019 VERIFY KU OK Thu Oct 3 08:03:42 2019 Validating certificate extended key usage Thu Oct 3 08:03:42 2019 ++ Certificate has EKU (str) TLS Web Server Authentication, expects TLS Web Server Authentication Thu Oct 3 08:03:42 2019 VERIFY EKU OK Thu Oct 3 08:03:42 2019 VERIFY OK: depth=0, CN=server Thu Oct 3 08:03:42 2019 Control Channel: TLSv1.2, cipher TLSv1/SSLv3 DHE-RSA-AES256-GCM-SHA384, 4096 bit RSA Thu Oct 3 08:03:42 2019 [server] Peer Connection Initiated with [AF_INET]195.201.233.113:1194 Thu Oct 3 08:03:43 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:03:48 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:03:53 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:03:59 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:04 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:09 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:15 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:20 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:25 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:30 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:35 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:41 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:46 2019 No reply from server after sending 12 push requests Thu Oct 3 08:04:46 2019 SIGUSR1[soft,no-push-reply] received, process restarting Thu Oct 3 08:04:46 2019 Restart pause, 5 second(s) Thu Oct 3 08:04:51 2019 TCP/UDP: Preserving recently used remote address: [AF_INET]195.201.233.113:1194 Thu Oct 3 08:04:51 2019 Socket Buffers: R=[212992->212992] S=[212992->212992] Thu Oct 3 08:04:51 2019 UDP link local: (not bound) Thu Oct 3 08:04:51 2019 UDP link remote: [AF_INET]195.201.233.113:1194 Thu Oct 3 08:04:51 2019 TLS: Initial packet from [AF_INET]195.201.233.113:1194, sid=c3f6bcfa 04f701bb Thu Oct 3 08:04:51 2019 VERIFY OK: depth=1, CN=osedev1 Thu Oct 3 08:04:51 2019 VERIFY KU OK Thu Oct 3 08:04:51 2019 Validating certificate extended key usage Thu Oct 3 08:04:51 2019 ++ Certificate has EKU (str) TLS Web Server Authentication, expects TLS Web Server Authentication Thu Oct 3 08:04:51 2019 VERIFY EKU OK Thu Oct 3 08:04:51 2019 VERIFY OK: depth=0, CN=server Thu Oct 3 08:04:51 2019 Control Channel: TLSv1.2, cipher TLSv1/SSLv3 DHE-RSA-AES256-GCM-SHA384, 4096 bit RSA Thu Oct 3 08:04:51 2019 [server] Peer Connection Initiated with [AF_INET]195.201.233.113:1194 Thu Oct 3 08:04:53 2019 SENT CONTROL [server]: 'PUSH_REQUEST' (status=1) Thu Oct 3 08:04:53 2019 PUSH: Received control message: 'PUSH_REPLY,route 10.241.189.1,topology net30,ping 10,ping-restart 120,ifconfig 10.241.189.10 10.241.189.9,peer-id 1,cipher AES-256-GCM' Thu Oct 3 08:04:53 2019 OPTIONS IMPORT: timers and/or timeouts modified Thu Oct 3 08:04:53 2019 OPTIONS IMPORT: --ifconfig/up options modified Thu Oct 3 08:04:53 2019 OPTIONS IMPORT: route options modified Thu Oct 3 08:04:53 2019 OPTIONS IMPORT: peer-id set Thu Oct 3 08:04:53 2019 OPTIONS IMPORT: adjusting link_mtu to 1624 Thu Oct 3 08:04:53 2019 OPTIONS IMPORT: data channel crypto options modified Thu Oct 3 08:04:53 2019 Outgoing Data 
Channel: Cipher 'AES-256-GCM' initialized with 256 bit key Thu Oct 3 08:04:53 2019 Incoming Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key Thu Oct 3 08:04:53 2019 ROUTE_GATEWAY 192.168.122.1/255.255.255.0 IFACE=eth0 HWADDR=fe:07:06:a6:5f:1d Thu Oct 3 08:04:53 2019 TUN/TAP device tun0 opened Thu Oct 3 08:04:53 2019 TUN/TAP TX queue length set to 100 Thu Oct 3 08:04:53 2019 /sbin/ip link set dev tun0 up mtu 1500 Thu Oct 3 08:04:53 2019 /sbin/ip addr add dev tun0 local 10.241.189.10 peer 10.241.189.9 Thu Oct 3 08:04:53 2019 /sbin/ip route add 10.241.189.1/32 via 10.241.189.9 Thu Oct 3 08:04:53 2019 WARNING: this configuration may cache passwords in memory -- use the auth-nocache option to prevent this Thu Oct 3 08:04:53 2019 Initialization Sequence Completed
- I found that I'd become stuck in an lxc console, since the escape keyboard sequence uses the same keystroke as screen (ctrl-a). The solution is to define an alternate escape sequence (i.e. ctrl-e) using `-e'^e'` https://serverfault.com/questions/567696/byobu-how-to-disconnect-from-lxc-console
[root@osedev1 ~]# lxc-console -e '^e' -n osestaging1 Connected to tty 1 Type <Ctrl+e q> to exit the console, <Ctrl+e Ctrl+e> to enter Ctrl+e itself [root@osedev1 ~]#
- I also had to change the tty to 0 to actually get access
[root@osedev1 ~]# lxc-console -e '^e' -n osestaging1 -t 0 lxc_container: commands.c: lxc_cmd_console: 724 Console 0 invalid, busy or all consoles busy. [root@osedev1 ~]# [root@osedev1 ~]#
- I went ahead and connected to the vpn from 3x clients: my laptop, the staging container, and the prod server
- oddly, I noticed that the ip addresses given to the staging server and the prod server were the same (they do use the same client cert, but I expected them to have distinct ip addresses)
user@ose:~/openvpn$ ip address show dev tun0 4: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100 link/none inet 10.241.189.6 peer 10.241.189.5/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::2ab6:3617:63cc:c654/64 scope link flags 800 valid_lft forever preferred_lft forever user@ose:~/openvpn$
[root@opensourceecology openvpn]# ip address show dev tun0 4: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 100 link/none inet 10.241.189.10 peer 10.241.189.9/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::a834:c77a:f65f:76fc/64 scope link flags 800 valid_lft forever preferred_lft forever [root@opensourceecology openvpn]#
[root@osestaging1 ~]# ip address show dev tun0 2: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100 link/none inet 10.241.189.10 peer 10.241.189.9/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::5e8c:3af2:2e6:4aea/64 scope link flags 800 valid_lft forever preferred_lft forever [root@osestaging1 ~]#
- I noticed a few relevant options in our openvpn server config
- by default, I have 'ifconfig-pool-persist ipp.txt' defined, which makes clients keep the same virtual ip address across server restarts; we appear to be using '/etc/openvpn/ipp.txt' here. The one in the 'server' dir appears to be stale, probably from when I started the server manually rather than through systemd. Interestingly, the entries don't match the addresses above: my 'maltfield' user actually has '.6' (the file says '.4') and 'hetzner2' has '.10' (the file says '.8'). This is probably just the 'topology net30' addressing seen in the PUSH_REPLY above, where each client gets a /30 block and ipp.txt records the block's base address, so the client's usable ip is the base plus two. Hmm.
[root@osedev1 server]# grep -iB5 ipp server.conf # Maintain a record of client <-> virtual IP address # associations in this file. If OpenVPN goes down or # is restarted, reconnecting clients can be assigned # the same virtual IP address from the pool that was # previously assigned. ifconfig-pool-persist ipp.txt [root@osedev1 server]# find /etc/openvpn | grep -i ipp.txt /etc/openvpn/server/ipp.txt /etc/openvpn/ipp.txt [root@osedev1 server]# cat /etc/openvpn/server/ipp.txt maltfield,10.241.189.4 [root@osedev1 server]# cat /etc/openvpn/ipp.txt maltfield,10.241.189.4 hetzner2,10.241.189.8
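- (a way to sanity-check which ipp.txt the running server actually uses: 'ifconfig-pool-persist ipp.txt' is a relative path, so it resolves against the openvpn process's working directory; inspecting the systemd unit should settle it. A sketch, assuming the same openvpn@server.service unit that gets restarted elsewhere in this log)
systemctl cat openvpn@server.service | grep -Ei 'WorkingDirectory|ExecStart'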
- there's also an option that I have commented-out whose comments say it should be uncommented if multiple clients will share the same cert
[root@osedev1 server]# grep -iB5 duplicate server.conf # # IF YOU HAVE NOT GENERATED INDIVIDUAL # CERTIFICATE/KEY PAIRS FOR EACH CLIENT, # EACH HAVING ITS OWN UNIQUE "COMMON NAME", # UNCOMMENT THIS LINE OUT. ;duplicate-cn [root@osedev1 server]#
- I uncommented the above 'duplicate-cn' line and restarted openvpn on the dev node
[root@osedev1 server]# vim server.conf [root@osedev1 server]# systemctl restart openvpn@server.service
- I reconnected to the vpn from the staging & prod servers; they got new IP addresses
[root@opensourceecology openvpn]# ip address show dev tun0 5: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 100 link/none inet 10.241.189.14 peer 10.241.189.13/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::e5fb:f261:801b:1c3d/64 scope link flags 800 valid_lft forever preferred_lft forever [root@opensourceecology openvpn]#
[root@osestaging1 openvpn]# ip address show dev tun0 4: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100 link/none inet 10.241.189.18 peer 10.241.189.17/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::27f3:9643:5530:bd0e/64 scope link flags 800 valid_lft forever preferred_lft forever [root@osestaging1 openvpn]#
- I confirmed that each client could ping itself, but not the others, so I uncommented the 'client-to-client' line and restarted the openvpn server again
- after that, I confirmed that staging could ping prod, prod could ping staging, and my laptop could ping both staging & prod. Cool!
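- (the ping tests themselves weren't captured; a minimal sketch, using the post-restart addresses shown above, i.e. .14 = prod and .18 = staging)
ping -c 1 10.241.189.18    # from prod: can we reach staging?
ping -c 1 10.241.189.14    # from staging: can we reach prod?
ping -c 1 10.241.189.14; ping -c 1 10.241.189.18    # from the laptop: can we reach both?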
- for some reason the servers could still not ping my laptop; maybe that's some complication in my quad-NAT'd QubesOS networking stack flowing through two nested VPN connections. Anyway, that shouldn't be required *shrug*
- and, holy shit, I was successfully able to ssh into the staging node from the production node through the private VPN IP
[maltfield@opensourceecology ~]$ ssh -p 32415 10.241.189.18 The authenticity of host '[10.241.189.18]:32415 ([10.241.189.18]:32415)' can't be established. ECDSA key fingerprint is SHA256:a6NpVsq/qdOCV8o7u3TXeVfZIxp7hpgMqXFOifTuNrI. ECDSA key fingerprint is MD5:ab:eb:7f:f2:bb:83:a1:e5:21:49:1e:22:93:17:70:d6. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '[10.241.189.18]:32415' (ECDSA) to the list of known hosts. Last login: Thu Oct 3 08:56:23 2019 from gateway [maltfield@osestaging1 ~]$
- but I was unable to ssh into our staging node from my laptop. oddly, it *is* able to establish a connection, but it gets stuck at some handshake step
user@ose:~/openvpn$ ssh -vvvvvvp 32415 maltfield@10.241.189.18 OpenSSH_7.4p1 Debian-10+deb9u7, OpenSSL 1.0.2t 10 Sep 2019 debug1: Reading configuration data /home/user/.ssh/config debug1: Reading configuration data /etc/ssh/ssh_config debug1: /etc/ssh/ssh_config line 19: Applying options for * debug2: resolving "10.241.189.18" port 32415 debug2: ssh_connect_direct: needpriv 0 debug1: Connecting to 10.241.189.18 [10.241.189.18] port 32415. debug1: Connection established. ... debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none debug3: send packet: type 30 debug1: expecting SSH2_MSG_KEX_ECDH_REPLY Connection closed by 10.241.189.18 port 32415 user@ose:~/openvpn$
- ok, I fixed this by removing the second VPN (qubes was configured to use a vpn qube as this VM's NetVM; changing it to 'sys-firewall' did the trick)
user@ose:~/openvpn$ ssh -p 32415 maltfield@10.241.189.18 Last login: Thu Oct 3 09:20:50 2019 from 10.241.189.6 [maltfield@osestaging1 ~]$
- on second thought, I really should have unique static ip addresses for both the prod & staging nodes. To achieve this, I can't share the same cert; I'll just make '/root/openvpn' one of those dirs (like the networking config dirs) that is not changed by the rsync
- I commented-out the 'duplicate-cn' line again in the openvpn server config & restarted the openvpn server
[root@osedev1 openvpn]# systemctl restart openvpn@server.service (reverse-i-search)`grep': ss -plan | ^Cep -i 8080 [root@osedev1 openvpn]# grep -B5 duplicate-cn server.conf # # IF YOU HAVE NOT GENERATED INDIVIDUAL # CERTIFICATE/KEY PAIRS FOR EACH CLIENT, # EACH HAVING ITS OWN UNIQUE "COMMON NAME", # UNCOMMENT THIS LINE OUT. ;duplicate-cn [root@osedev1 openvpn]# systemctl restart openvpn@server.service
- and I created a distinct cert for 'osestaging1'
[root@osedev1 3]# /usr/share/easy-rsa/3.0.6/easyrsa build-client-full osestaging1 nopass Using SSL: openssl OpenSSL 1.0.2k-fips 26 Jan 2017 Generating a 2048 bit RSA private key ....+++ ...........................+++ writing new private key to '/usr/share/easy-rsa/3/pki/private/osestaging1.key.WsJhUsDCny' ----- Using configuration from /usr/share/easy-rsa/3/pki/safessl-easyrsa.cnf Enter pass phrase for /usr/share/easy-rsa/3/pki/private/ca.key: Check that the request matches the signature Signature ok The Subject's Distinguished Name is as follows commonName :ASN.1 12:'osestaging1' Certificate is to be certified until Sep 17 10:34:03 2022 GMT (1080 days) Write out database with 1 new entries Data Base Updated [root@osedev1 3]# cp pki/private/osestaging1.key /home/maltfield/ [root@osedev1 3]# cp pki/private/ta.key /home/maltfield/ [root@osedev1 3]# cp pki/issued/osestaging1.crt /home/maltfield/ [root@osedev1 3]# cp pki/ca.crt /home/maltfield/ [root@osedev1 3]# chown maltfield /home/maltfield/*.key [root@osedev1 3]# chown maltfield /home/maltfield/*.crt [root@osedev1 3]# logout
- and on the staging server
[root@osestaging1 ~]# cd /root/openvpn/ [root@osestaging1 openvpn]# mv /home/maltfield/*.key . mv: overwrite './ta.key'? y [root@osestaging1 openvpn]# mv /home/maltfield/*.crt . mv: overwrite './ca.crt'? y [root@osestaging1 openvpn]# ls ca.crt hetzner2.crt osestaging1.crt ta.key client.conf hetzner2.key osestaging1.key [root@osestaging1 openvpn]# shred -u hetzner2.* [root@osestaging1 openvpn]# ls -lah total 32K drwx------. 2 root root 4.0K Oct 3 10:40 . dr-xr-x---. 4 root root 4.0K Oct 3 07:59 .. -rw-------. 1 maltfield maltfield 1.9K Oct 3 10:36 ca.crt -rw-r--r--. 1 root root 3.6K Oct 3 07:27 client.conf -rw-------. 1 maltfield maltfield 5.6K Oct 3 10:36 osestaging1.crt -rw-------. 1 maltfield maltfield 1.7K Oct 3 10:36 osestaging1.key -rw-------. 1 maltfield maltfield 636 Oct 3 10:36 ta.key [root@osestaging1 openvpn]# chown root:root *.crt [root@osestaging1 openvpn]# chown root:root *.key [root@osestaging1 openvpn]# chmod 0600 client.conf [root@osestaging1 openvpn]# ls -lah total 32K drwx------. 2 root root 4.0K Oct 3 10:40 . dr-xr-x---. 4 root root 4.0K Oct 3 07:59 .. -rw-------. 1 root root 1.9K Oct 3 10:36 ca.crt -rw-------. 1 root root 3.6K Oct 3 07:27 client.conf -rw-------. 1 root root 5.6K Oct 3 10:36 osestaging1.crt -rw-------. 1 root root 1.7K Oct 3 10:36 osestaging1.key -rw-------. 1 root root 636 Oct 3 10:36 ta.key [root@osestaging1 openvpn]# vim client.conf
- I decided to make the following static IPs
- 10.241.189.10 hetzner2 (prod)
- 10.241.189.11 osestaging1
- I did this by uncommenting the line 'client-config-dir ccd', creating a client-specific config file in the '/etc/openvpn/ccd/' dir whose name matches the CN (Common Name) on the client cert, and restarting the openvpn server service
[root@osedev1 openvpn]# vim server.conf [root@osedev1 openvpn]# grep -Ei '^client-config-dir ccd' server.conf client-config-dir ccd [root@osedev1 openvpn]# echo "ifconfig-push 10.241.189.11 255.255.255.255" > ccd/osestaging1 [root@osedev1 openvpn]# systemctl restart openvpn@server.service [root@osedev1 openvpn]#
- I did the same for prod
[root@osedev1 openvpn]# echo "ifconfig-push 10.241.189.10 255.255.255.255" > ccd/hetzner2 [root@osedev1 openvpn]# systemctl restart openvpn@server.service [root@osedev1 openvpn]#
- now that it's static, I can update my ssh config to make connecting to the staging node easy after connecting to the vpn from my laptop
user@ose:~/openvpn$ vim ~/.ssh/config user@ose:~/openvpn$ head -n21 ~/.ssh/config # OSE Host openbuildinginstitute.org *.openbuildinginstitute.org opensourceecology.org *.opensourceecology.org Port 32415 ForwardAgent yes IdentityFile /home/user/.ssh/id_rsa.ose User maltfield Host osedev1 HostName 195.201.233.113 Port 32415 ForwardAgent yes IdentityFile /home/user/.ssh/id_rsa.ose User maltfield Host osestaging1 HostName 10.241.189.11 Port 32415 ForwardAgent yes IdentityFile /home/user/.ssh/id_rsa.ose User maltfield user@ose:~/openvpn$ ssh osestaging1 The authenticity of host '[10.241.189.11]:32415 ([10.241.189.11]:32415)' can't be established. ECDSA key fingerprint is SHA256:a6NpVsq/qdOCV8o7u3TXeVfZIxp7hpgMqXFOifTuNrI. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '[10.241.189.11]:32415' (ECDSA) to the list of known hosts. Last login: Thu Oct 3 10:42:40 2019 from 10.241.189.10 [maltfield@osestaging1 ~]$
- another issue remains: we need the staging node to connect to the vpn on startup, but I can't get the fucking systemd unit to work
[root@osestaging1 system]# systemctl start openvpn-client\@client.service Job for openvpn-client@client.service failed because the control process exited with error code. See "systemctl status openvpn-client@client.service" and "journalctl -xe" for details. [root@osestaging1 system]# systemctl status openvpn-client\@client.service ● openvpn-client@client.service - OpenVPN tunnel for client Loaded: loaded (/etc/systemd/system/openvpn-client@client.service; disabled; vendor preset: disabled) Active: failed (Result: exit-code) since Thu 2019-10-03 12:34:56 UTC; 8s ago Docs: man:openvpn(8) https://community.openvpn.net/openvpn/wiki/Openvpn24ManPage https://community.openvpn.net/openvpn/wiki/HOWTO Process: 1295 ExecStart=/usr/sbin/openvpn --suppress-timestamps --nobind --config client.conf (code=exited, status=200/CHDIR) Main PID: 1295 (code=exited, status=200/CHDIR) Oct 03 12:34:56 osestaging1 systemd[1]: Starting OpenVPN tunnel for client... Oct 03 12:34:56 osestaging1 systemd[1]: openvpn-client@client.service: main process exited, code=exited, status=200/CHDIR Oct 03 12:34:56 osestaging1 systemd[1]: Failed to start OpenVPN tunnel for client. Oct 03 12:34:56 osestaging1 systemd[1]: Unit openvpn-client@client.service entered failed state. Oct 03 12:34:56 osestaging1 systemd[1]: openvpn-client@client.service failed. [root@osestaging1 system]# tail -n 7 /var/log/messages Oct 3 12:29:29 localhost systemd: openvpn-client@client.service failed. Oct 3 12:34:56 localhost systemd: Starting OpenVPN tunnel for client... Oct 3 12:34:56 localhost systemd: Failed at step CHDIR spawning /usr/sbin/openvpn: No such file or directory Oct 3 12:34:56 localhost systemd: openvpn-client@client.service: main process exited, code=exited, status=200/CHDIR Oct 3 12:34:56 localhost systemd: Failed to start OpenVPN tunnel for client. Oct 3 12:34:56 localhost systemd: Unit openvpn-client@client.service entered failed state. Oct 3 12:34:56 localhost systemd: openvpn-client@client.service failed. [root@osestaging1 system]#
- the /usr/sbin/openvpn file definitely exists; at the time I thought the issue was with tun0 not existing or something, but note the status is 200/CHDIR: systemd failed to chdir into the unit's WorkingDirectory, and the 'No such file or directory' message refers to that, not to the openvpn binary
- I gave the osestaging1 container a reboot
- after a reboot, osestaging1 now says that the openvpn-client@client.service doesn't exist!
[maltfield@osestaging1 ~]$ systemctl start openvpn-client\@client.service Failed to start openvpn-client@client.service: The name org.freedesktop.PolicyKit1 was not provided by any .service files See system logs and 'systemctl status openvpn-client@client.service' for details. [maltfield@osestaging1 ~]$ systemctl list-unit-files | grep -i vpn openvpn-client@.service disabled openvpn-client@client.service disabled openvpn-server@.service disabled openvpn@.service disabled [maltfield@osestaging1 ~]$
- attempting to enable it fails
[maltfield@osestaging1 ~]$ systemctl enable /etc/systemd/system/openvpn-client\@client.service Failed to execute operation: The name org.freedesktop.PolicyKit1 was not provided by any .service files [maltfield@osestaging1 ~]$
- oh, duh, I wasn't root
[root@osestaging1 ~]# systemctl status openvpn-client\@client.service ● openvpn-client@client.service - OpenVPN tunnel for client Loaded: loaded (/etc/systemd/system/openvpn-client@client.service; disabled; vendor preset: disabled) Active: inactive (dead) Docs: man:openvpn(8) https://community.openvpn.net/openvpn/wiki/Openvpn24ManPage https://community.openvpn.net/openvpn/wiki/HOWTO [root@osestaging1 ~]# systemctl start openvpn-client\@client.service Job for openvpn-client@client.service failed because the control process exited with error code. See "systemctl status openvpn-client@client.service" and "journalctl -xe" for details. [root@osestaging1 ~]# systemctl status openvpn-client\@client.service ● openvpn-client@client.service - OpenVPN tunnel for client Loaded: loaded (/etc/systemd/system/openvpn-client@client.service; disabled; vendor preset: disabled) Active: failed (Result: exit-code) since Thu 2019-10-03 12:52:39 UTC; 7s ago Docs: man:openvpn(8) https://community.openvpn.net/openvpn/wiki/Openvpn24ManPage https://community.openvpn.net/openvpn/wiki/HOWTO Process: 379 ExecStart=/usr/sbin/openvpn --suppress-timestamps --nobind --config client.conf (code=exited, status=200/CHDIR) Main PID: 379 (code=exited, status=200/CHDIR) Oct 03 12:52:38 osestaging1 systemd[1]: Starting OpenVPN tunnel for client... Oct 03 12:52:39 osestaging1 systemd[1]: openvpn-client@client.service: main process exited, code=exited, status=200/CHDIR Oct 03 12:52:39 osestaging1 systemd[1]: Failed to start OpenVPN tunnel for client. Oct 03 12:52:39 osestaging1 systemd[1]: Unit openvpn-client@client.service entered failed state. Oct 03 12:52:39 osestaging1 systemd[1]: openvpn-client@client.service failed. [root@osestaging1 ~]# tail -n 7 /var/log/messages Oct 3 12:52:38 localhost systemd: Created slice system-openvpn\x2dclient.slice. Oct 3 12:52:38 localhost systemd: Starting OpenVPN tunnel for client... Oct 3 12:52:39 localhost systemd: Failed at step CHDIR spawning /usr/sbin/openvpn: No such file or directory Oct 3 12:52:39 localhost systemd: openvpn-client@client.service: main process exited, code=exited, status=200/CHDIR Oct 3 12:52:39 localhost systemd: Failed to start OpenVPN tunnel for client. Oct 3 12:52:39 localhost systemd: Unit openvpn-client@client.service entered failed state. Oct 3 12:52:39 localhost systemd: openvpn-client@client.service failed. [root@osestaging1 ~]#
- after fighting with this shit for hours, I finally just copied all my files from /root/openvpn into /etc/openvpn/client/ and it worked! (which makes sense in retrospect: the unit sets ProtectHome=true, which hides /root from the service, so the WorkingDirectory=/root/openvpn hack was doomed to fail with status=200/CHDIR)
[root@osestaging1 system]# cp /root/openvpn/* /etc/openvpn/client [root@osestaging1 system]# vim openvpn-client\@client.service ... [root@osestaging1 system]# systemctl daemon-reload <30>systemd-fstab-generator[425]: Running in a container, ignoring fstab device entry for /dev/root. [root@osestaging1 system]# systemctl restart openvpn-client\@client.service [root@osestaging1 system]# systemctl status openvpn-client\@client.service ● openvpn-client@client.service - OpenVPN tunnel for client Loaded: loaded (/etc/systemd/system/openvpn-client@client.service; enabled; vendor preset: disabled) Active: active (running) since Thu 2019-10-03 13:33:32 UTC; 1s ago Docs: man:openvpn(8) https://community.openvpn.net/openvpn/wiki/Openvpn24ManPage https://community.openvpn.net/openvpn/wiki/HOWTO Main PID: 432 (openvpn) Status: "Initialization Sequence Completed" CGroup: /user.slice/user-1000.slice/session-582.scope/system.slice/system-openvpn\x2dclient.slice/openvpn-client@client.service └─432 /usr/sbin/openvpn --suppress-timestamps --nobind --config client.conf Oct 03 13:33:33 osestaging1 openvpn[432]: Incoming Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key Oct 03 13:33:33 osestaging1 openvpn[432]: WARNING: Since you are using --dev tun with a point-to-point topology, the second arg...nowarn) Oct 03 13:33:33 osestaging1 openvpn[432]: ROUTE_GATEWAY 192.168.122.1/255.255.255.0 IFACE=eth0 HWADDR=fe:07:06:a6:5f:1d Oct 03 13:33:33 osestaging1 openvpn[432]: TUN/TAP device tun0 opened Oct 03 13:33:33 osestaging1 openvpn[432]: TUN/TAP TX queue length set to 100 Oct 03 13:33:33 osestaging1 openvpn[432]: /sbin/ip link set dev tun0 up mtu 1500 Oct 03 13:33:33 osestaging1 openvpn[432]: /sbin/ip addr add dev tun0 local 10.241.189.11 peer 255.255.255.255 Oct 03 13:33:33 osestaging1 openvpn[432]: /sbin/ip route add 10.241.189.0/24 via 255.255.255.255 Oct 03 13:33:33 osestaging1 openvpn[432]: WARNING: this configuration may cache passwords in memory -- use the auth-nocache opt...nt this Oct 03 13:33:33 osestaging1 openvpn[432]: Initialization Sequence Completed Hint: Some lines were ellipsized, use -l to show in full. [root@osestaging1 system]# ip address show dev tun0 2: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100 link/none inet 10.241.189.11 peer 255.255.255.255/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::927:fae4:1356:9b90/64 scope link flags 800 valid_lft forever preferred_lft forever [root@osestaging1 system]#
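- (the vim edit above is elided; presumably it just reverted the earlier WorkingDirectory hack now that the files live in the stock location, so the relevant [Service] lines ended up something like this)
WorkingDirectory=/etc/openvpn/client
ExecStart=/usr/sbin/openvpn --suppress-timestamps --nobind --config client.conf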
- I confirmed that I could ssh into the staging node from my laptop
- I rebooted the staging node
- I confirmed that I could ssh into the staging node again after the reboot!
- I'm not going to bother with trying to set this up on the prod node for now; I'm not in a place where I want to make & test that prod change by rebooting the server.
- this is a good stopping point; I created another snapshot of the staging node
[root@osedev1 ~]# lxc-stop -n osestaging1 [root@osedev1 ~]# lxc-snapshot --name osestaging1 --list snap0 (/var/lib/lxcsnaps/osestaging1) 2019:10:02 15:37:58 [root@osedev1 ~]# lxc-snapshot --name osestaging1 afterVPN lxc_container: lxccontainer.c: lxcapi_snapshot: 2891 Snapshot of directory-backed container requested. lxc_container: lxccontainer.c: lxcapi_snapshot: 2892 Making a copy-clone. If you do want snapshots, then lxc_container: lxccontainer.c: lxcapi_snapshot: 2893 please create an aufs or overlayfs clone first, snapshot that lxc_container: lxccontainer.c: lxcapi_snapshot: 2894 and keep the original container pristine. [root@osedev1 ~]# lxc-snapshot --name osestaging1 --list snap1 (/var/lib/lxcsnaps/osestaging1) 2019:10:03 15:40:16 snap0 (/var/lib/lxcsnaps/osestaging1) 2019:10:02 15:37:58 [root@osedev1 ~]#
- I started the staging container again, and I tested an rsync from prod to staging; first let's see the contents of /etc/varnish on staging
[root@osestaging1 ~]# ls -lah /etc | grep -i varnish [root@osestaging1 ~]#
- and the rsync; it failed. Right, I need to set up passwordless sudo on the staging node
[maltfield@opensourceecology ~]$ sudo -E rsync -e 'ssh -p 32415' --rsync-path="sudo rsync" -av --progress /etc/varnish maltfield@10.241.189.10:/etc/ [sudo] password for maltfield: The authenticity of host '[10.241.189.10]:32415 ([10.241.189.10]:32415)' can't be established. ECDSA key fingerprint is SHA256:HclF8ZQOjGqx+9TmwL111kZ7QxgKkoEw8g3l2YxV0gk. ECDSA key fingerprint is MD5:cd:87:b1:bb:c1:3e:d1:d1:d4:5d:16:c9:e8:30:6a:71. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '[10.241.189.10]:32415' (ECDSA) to the list of known hosts. sudo: no tty present and no askpass program specified rsync: connection unexpectedly closed (0 bytes received so far) [sender] rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9] [maltfield@opensourceecology ~]$
- I added this line to the end of the sudoers file on the staging node with 'visudo'
maltfield ALL=(ALL) NOPASSWD: ALL
- doh, I gotta install rsync on the staging node. so many prereqs...
[maltfield@opensourceecology ~]$ sudo -E rsync -e 'ssh -p 32415' --rsync-path="sudo rsync" -av --progress /etc/varnish maltfield@10.241.189.11:/etc/ The authenticity of host '[10.241.189.11]:32415 ([10.241.189.11]:32415)' can't be established. ECDSA key fingerprint is SHA256:a6NpVsq/qdOCV8o7u3TXeVfZIxp7hpgMqXFOifTuNrI. ECDSA key fingerprint is MD5:ab:eb:7f:f2:bb:83:a1:e5:21:49:1e:22:93:17:70:d6. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '[10.241.189.11]:32415' (ECDSA) to the list of known hosts. sudo: rsync: command not found rsync: connection unexpectedly closed (0 bytes received so far) [sender] rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9] [maltfield@opensourceecology ~]$
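- (the install itself isn't shown; presumably just this on the staging node, since rsync lives in the CentOS base repo)
yum -y install rsync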
- this time the rsync worked!
[maltfield@opensourceecology ~]$ sudo -E rsync -e 'ssh -p 32415' --rsync-path="sudo rsync" -av --progress /etc/varnish maltfield@10.241.189.11:/etc/ ... sent 192211 bytes received 503 bytes 128476.00 bytes/sec total size is 190106 speedup is 0.99 [maltfield@opensourceecology ~]$
- here's the dir on the staging node's side
[root@osestaging1 ~]# ls -lah /etc/varnish total 44K drwxr-xr-x. 5 root root 4.0K Aug 27 06:19 . drwxr-xr-x. 63 root root 4.0K Oct 3 13:52 .. -rw-r--r--. 1 root root 1.4K Apr 9 19:10 all-vhosts.vcl -rw-r--r--. 1 root root 697 Nov 19 2017 catch-all.vcl drwxr-xr-x. 2 root root 4.0K Aug 27 06:17 conf -rw-rw-r--. 1 1011 1011 737 Nov 23 2017 default.vcl drwxr-xr-x. 2 root root 4.0K Apr 12 2018 lib -rw-------. 1 root root 129 Apr 12 2018 secret -rw-------. 1 root root 129 Apr 12 2018 secret.20180412.bak drwxr-xr-x. 2 root root 4.0K Aug 27 06:18 sites-enabled -rw-r--r--. 1 root root 1.1K Oct 21 2017 varnish.params [root@osestaging1 ~]#
- again, here are the dirs we want to exclude; the openvpn configs are already preserved
/root /etc/sudo* /etc/openvpn /usr/share/easy-rsa /dev /sys /proc /boot/ /etc/sysconfig/network* /tmp /var/tmp /etc/fstab /etc/mtab /etc/mdadm.conf
- aaaand *fingers crossed* I kicked off the rsync
[maltfield@opensourceecology ~]$ time sudo -E rsync -e 'ssh -p 32415' --rsync-path="sudo rsync" --exclude=/root --exclude=/etc/sudo* --exclude=/etc/openvpn --exclude=/usr/share/easy-rsa --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/boot/ --exclude=/etc/sysconfig/network* --exclude=/tmp --exclude=/var/tmp --exclude=/etc/fstab --exclude=/etc/mtab --exclude=/etc/mdadm.conf -av --progress / maltfield@10.241.189.11:/ ...
- whoops, I got ahead of myself! I killed it & left the staging server in a broken state, so I restored from snapshot & re-did the visudo & rsync-install steps. But before we actually kick off this whole-system rsync, I need to attach a hetzner cloud volume and mount it to /var; else the dev node's little disk will fill up!
[root@osedev1 ~]# lxc-snapshot --name osestaging1 -r snap1 [root@osedev1 ~]# lxc-start -n osestaging1
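- (a hypothetical sketch of that volume step; the device path is an assumption, since Hetzner cloud volumes show up under /dev/disk/by-id/ with an 'HC_Volume' name, and the container should be stopped while /var is moved)
mkfs.ext4 /dev/disk/by-id/scsi-0HC_Volume_12345678
mount /dev/disk/by-id/scsi-0HC_Volume_12345678 /mnt
rsync -a /var/ /mnt/
umount /mnt
mount /dev/disk/by-id/scsi-0HC_Volume_12345678 /var
echo '/dev/disk/by-id/scsi-0HC_Volume_12345678 /var ext4 defaults 0 0' >> /etc/fstab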
Wed Oct 02, 2019
- continuing on the dev node, I want to create an lxc container. First I installed 'lxc'
[root@osedev1 ~]# yum install lxc Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile epel/x86_64/metalink | 27 kB 00:00:00 * base: mirror.alpix.eu * epel: mirror.wiuwiu.de * extras: centosmirror.netcup.net * updates: mirror.alpix.eu base | 3.6 kB 00:00:00 epel | 5.3 kB 00:00:00 extras | 2.9 kB 00:00:00 updates | 2.9 kB 00:00:00 (1/6): base/7/x86_64/group_gz | 165 kB 00:00:00 (2/6): base/7/x86_64/primary_db | 6.0 MB 00:00:00 (3/6): epel/x86_64/updateinfo | 1.0 MB 00:00:00 (4/6): updates/7/x86_64/primary_db | 1.1 MB 00:00:00 (5/6): epel/x86_64/primary_db | 6.8 MB 00:00:00 (6/6): extras/7/x86_64/primary_db | 152 kB 00:00:00 Resolving Dependencies --> Running transaction check ---> Package lxc.x86_64 0:1.0.11-2.el7 will be installed --> Processing Dependency: lua-lxc(x86-64) = 1.0.11-2.el7 for package: lxc-1.0.11-2.el7.x86_64 --> Processing Dependency: lua-alt-getopt for package: lxc-1.0.11-2.el7.x86_64 --> Processing Dependency: liblxc.so.1()(64bit) for package: lxc-1.0.11-2.el7.x86_64 --> Running transaction check ---> Package lua-alt-getopt.noarch 0:0.7.0-4.el7 will be installed ---> Package lua-lxc.x86_64 0:1.0.11-2.el7 will be installed --> Processing Dependency: lua-filesystem for package: lua-lxc-1.0.11-2.el7.x86_64 ---> Package lxc-libs.x86_64 0:1.0.11-2.el7 will be installed --> Running transaction check ---> Package lua-filesystem.x86_64 0:1.6.2-2.el7 will be installed --> Finished Dependency Resolution Dependencies Resolved ========================================================================================================================================= Package Arch Version Repository Size ========================================================================================================================================= Installing: lxc x86_64 1.0.11-2.el7 epel 140 k Installing for dependencies: lua-alt-getopt noarch 0.7.0-4.el7 epel 7.4 k lua-filesystem x86_64 1.6.2-2.el7 epel 28 k lua-lxc x86_64 1.0.11-2.el7 epel 17 k lxc-libs x86_64 1.0.11-2.el7 epel 276 k Transaction Summary ========================================================================================================================================= Install 1 Package (+4 Dependent packages) Total download size: 468 k Installed size: 1.0 M Is this ok [y/d/N]: y Downloading packages: (1/5): lua-alt-getopt-0.7.0-4.el7.noarch.rpm | 7.4 kB 00:00:00 (2/5): lua-filesystem-1.6.2-2.el7.x86_64.rpm | 28 kB 00:00:00 (3/5): lua-lxc-1.0.11-2.el7.x86_64.rpm | 17 kB 00:00:00 (4/5): lxc-1.0.11-2.el7.x86_64.rpm | 140 kB 00:00:00 (5/5): lxc-libs-1.0.11-2.el7.x86_64.rpm | 276 kB 00:00:00 ----------------------------------------------------------------------------------------------------------------------------------------- Total 717 kB/s | 468 kB 00:00:00 Running transaction check Running transaction test Transaction test succeeded Running transaction Installing : lxc-libs-1.0.11-2.el7.x86_64 1/5 Installing : lua-filesystem-1.6.2-2.el7.x86_64 2/5 Installing : lua-lxc-1.0.11-2.el7.x86_64 3/5 Installing : lua-alt-getopt-0.7.0-4.el7.noarch 4/5 Installing : lxc-1.0.11-2.el7.x86_64 5/5 Verifying : lua-lxc-1.0.11-2.el7.x86_64 1/5 Verifying : lua-alt-getopt-0.7.0-4.el7.noarch 2/5 Verifying : lxc-1.0.11-2.el7.x86_64 3/5 Verifying : lua-filesystem-1.6.2-2.el7.x86_64 4/5 Verifying : lxc-libs-1.0.11-2.el7.x86_64 5/5 Installed: lxc.x86_64 0:1.0.11-2.el7 Dependency Installed: lua-alt-getopt.noarch 0:0.7.0-4.el7 lua-filesystem.x86_64 0:1.6.2-2.el7 lua-lxc.x86_64 0:1.0.11-2.el7 lxc-libs.x86_64 0:1.0.11-2.el7 Complete! 
[root@osedev1 ~]#
- by default, it appears that we have no lxc templates installed
[root@osedev1 ~]# ls -lah /usr/share/lxc/templates/ total 8.0K drwxr-xr-x. 2 root root 4.0K Mar 7 2019 . drwxr-xr-x. 6 root root 4.0K Oct 2 12:16 .. [root@osedev1 ~]#
- I installed the 'lxc-templates' package (also from epel), and it gave me templates for many distros, including centos
[root@osedev1 ~]# yum -y install lxc-templates Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile * base: mirror.alpix.eu * epel: mirror.wiuwiu.de * extras: centosmirror.netcup.net * updates: mirror.alpix.eu Resolving Dependencies --> Running transaction check ---> Package lxc-templates.x86_64 0:1.0.11-2.el7 will be installed --> Finished Dependency Resolution Dependencies Resolved ========================================================================================================================================= Package Arch Version Repository Size ========================================================================================================================================= Installing: lxc-templates x86_64 1.0.11-2.el7 epel 81 k Transaction Summary ========================================================================================================================================= Install 1 Package Total download size: 81 k Installed size: 333 k Downloading packages: lxc-templates-1.0.11-2.el7.x86_64.rpm | 81 kB 00:00:00 Running transaction check Running transaction test Transaction test succeeded Running transaction Installing : lxc-templates-1.0.11-2.el7.x86_64 1/1 Verifying : lxc-templates-1.0.11-2.el7.x86_64 1/1 Installed: lxc-templates.x86_64 0:1.0.11-2.el7 Complete! [root@osedev1 ~]# ls -lah /usr/share/lxc/templates/ total 348K drwxr-xr-x. 2 root root 4.0K Oct 2 12:29 . drwxr-xr-x. 6 root root 4.0K Oct 2 12:16 .. -rwxr-xr-x. 1 root root 11K Mar 7 2019 lxc-alpine -rwxr-xr-x. 1 root root 14K Mar 7 2019 lxc-altlinux -rwxr-xr-x. 1 root root 11K Mar 7 2019 lxc-archlinux -rwxr-xr-x. 1 root root 9.5K Mar 7 2019 lxc-busybox -rwxr-xr-x. 1 root root 30K Mar 7 2019 lxc-centos -rwxr-xr-x. 1 root root 11K Mar 7 2019 lxc-cirros -rwxr-xr-x. 1 root root 18K Mar 7 2019 lxc-debian -rwxr-xr-x. 1 root root 18K Mar 7 2019 lxc-download -rwxr-xr-x. 1 root root 49K Mar 7 2019 lxc-fedora -rwxr-xr-x. 1 root root 28K Mar 7 2019 lxc-gentoo -rwxr-xr-x. 1 root root 14K Mar 7 2019 lxc-openmandriva -rwxr-xr-x. 1 root root 14K Mar 7 2019 lxc-opensuse -rwxr-xr-x. 1 root root 35K Mar 7 2019 lxc-oracle -rwxr-xr-x. 1 root root 12K Mar 7 2019 lxc-plamo -rwxr-xr-x. 1 root root 6.7K Mar 7 2019 lxc-sshd -rwxr-xr-x. 1 root root 24K Mar 7 2019 lxc-ubuntu -rwxr-xr-x. 1 root root 12K Mar 7 2019 lxc-ubuntu-cloud [root@osedev1 ~]#
- now I was successfully able to create an lxc container for our staging node named 'osestaging1' from the template 'centos'. I didn't specify the version, but it does appear to be centos7
[root@osedev1 ~]# lxc-create -n osestaging1 -t centos Host CPE ID from /etc/os-release: cpe:/o:centos:centos:7 Checking cache download in /var/cache/lxc/centos/x86_64/7/rootfs ... Downloading CentOS minimal ... ... Download complete. Copy /var/cache/lxc/centos/x86_64/7/rootfs to /var/lib/lxc/osestaging1/rootfs ... Copying rootfs to /var/lib/lxc/osestaging1/rootfs ... sed: can't read /var/lib/lxc/osestaging1/rootfs/etc/init/tty.conf: No such file or directory Storing root password in '/var/lib/lxc/osestaging1/tmp_root_pass' Expiring password for user root. passwd: Success sed: can't read /var/lib/lxc/osestaging1/rootfs/etc/rc.sysinit: No such file or directory sed: can't read /var/lib/lxc/osestaging1/rootfs/etc/rc.d/rc.sysinit: No such file or directory Container rootfs and config have been created. Edit the config file to check/enable networking setup. The temporary root password is stored in: '/var/lib/lxc/osestaging1/tmp_root_pass' The root password is set up as expired and will require it to be changed at first login, which you should do as soon as possible. If you lose the root password or wish to change it without starting the container, you can change it from the host by running the following command (which will also reset the expired flag): chroot /var/lib/lxc/osestaging1/rootfs passwd [root@osedev1 ~]#
- the rsync from prod to staging is going to overwrite the staging root password, so I won't bother creating & setting a distinct root password for this staging container
- `lxc-top` shows that we have 0 containers running
[root@osedev1 ~]# lxc-top Container CPU CPU CPU BlkIO Mem Name Used Sys User Total Used TOTAL (0 ) 0.00 0.00 0.00 0.00 0.00
- I tried to start the staging container, but I got a networking error
[root@osedev1 ~]# lxc-start -n osestaging1 lxc-start: conf.c: instantiate_veth: 3115 failed to attach 'vethWX1L1G' to the bridge 'virbr0': No such device lxc-start: conf.c: lxc_create_network: 3407 failed to create netdev lxc-start: start.c: lxc_spawn: 875 failed to create the network lxc-start: start.c: __lxc_start: 1149 failed to spawn 'osestaging1' lxc-start: lxc_start.c: main: 336 The container failed to start. lxc-start: lxc_start.c: main: 340 Additional information can be obtained by setting the --logfile and --logpriority options. [root@osedev1 ~]#
- it looks like there is no 'virbr0' device; we only have the loopback, ethernet, and openvpn tun devices
[root@osedev1 ~]# ip -all address show 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 link/ether 96:00:00:2e:48:9d brd ff:ff:ff:ff:ff:ff inet 195.201.233.113/32 brd 195.201.233.113 scope global dynamic eth0 valid_lft 56775sec preferred_lft 56775sec inet6 2a01:4f8:c010:3ca0::1/64 scope global valid_lft forever preferred_lft forever inet6 fe80::9400:ff:fe2e:489d/64 scope link valid_lft forever preferred_lft forever 3: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100 link/none inet 10.241.189.1 peer 10.241.189.2/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::4ca6:2d27:e97f:1a66/64 scope link flags 800 valid_lft forever preferred_lft forever [root@osedev1 ~]#
- Ideally, the container would not be given an internet-facing ip address, anyway. It would be better to give it a bridge on the tun0 openvpn network
- it looks like the relevant files for containers are in /var/lib/lxc/<containerName>/
[root@osedev1 osestaging1]# date Wed Oct 2 12:47:07 CEST 2019 [root@osedev1 osestaging1]# pwd /var/lib/lxc/osestaging1 [root@osedev1 osestaging1]# ls config rootfs tmp_root_pass [root@osedev1 osestaging1]#
- here is the default config
[root@osedev1 osestaging1]# cat config # Template used to create this container: /usr/share/lxc/templates/lxc-centos # Parameters passed to the template: # For additional config options, please look at lxc.container.conf(5) lxc.network.type = veth lxc.network.flags = up lxc.network.link = virbr0 lxc.network.hwaddr = fe:07:06:a6:5f:1d lxc.rootfs = /var/lib/lxc/osestaging1/rootfs # Include common configuration lxc.include = /usr/share/lxc/config/centos.common.conf lxc.arch = x86_64 lxc.utsname = osestaging1 lxc.autodev = 1 # When using LXC with apparmor, uncomment the next line to run unconfined: #lxc.aa_profile = unconfined # example simple networking setup, uncomment to enable #lxc.network.type = veth #lxc.network.flags = up #lxc.network.link = lxcbr0 #lxc.network.name = eth0 # Additional example for veth network type # static MAC address, #lxc.network.hwaddr = 00:16:3e:77:52:20 # persistent veth device name on host side # Note: This may potentially collide with other containers of same name! #lxc.network.veth.pair = v-osestaging1-e0 [root@osedev1 osestaging1]#
- to my horror, I discovered that iptables was disabled on the dev server! why!?!
[root@osedev1 osestaging1]# iptables-save [root@osedev1 osestaging1]# ip6tables-save [root@osedev1 osestaging1]# service iptables status Redirecting to /bin/systemctl status iptables.service ● iptables.service - IPv4 firewall with iptables Loaded: loaded (/usr/lib/systemd/system/iptables.service; disabled; vendor preset: disabled) Active: inactive (dead) [root@osedev1 osestaging1]# service iptables start Redirecting to /bin/systemctl start iptables.service [root@osedev1 osestaging1]# iptables-save # Generated by iptables-save v1.4.21 on Wed Oct 2 12:58:21 2019 *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [17:1396] -A INPUT -i lo -j ACCEPT -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A INPUT -p icmp -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 32415 -j ACCEPT -A INPUT -p udp -m state --state NEW -m udp --dport 1194 -j ACCEPT -A INPUT -j DROP COMMIT # Completed on Wed Oct 2 12:58:21 2019 [root@osedev1 osestaging1]# ip6tables-save root@osedev1 osestaging1]# service ip6tables start Redirecting to /bin/systemctl start ip6tables.service [root@osedev1 osestaging1]# ip6tables-save # Generated by ip6tables-save v1.4.21 on Wed Oct 2 12:59:51 2019 *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [0:0] -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT -A INPUT -p ipv6-icmp -j ACCEPT -A INPUT -i lo -j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT -A INPUT -d fe80::/64 -p udp -m udp --dport 546 -m state --state NEW -j ACCEPT -A INPUT -j REJECT --reject-with icmp6-adm-prohibited -A FORWARD -j REJECT --reject-with icmp6-adm-prohibited COMMIT # Completed on Wed Oct 2 12:59:51 2019 [root@osedev1 osestaging1]#
- systemd says that both iptables.service & ip6tables.service are 'loaded active exited'
[root@osedev1 osestaging1]# systemctl list-units | grep -Ei 'iptables|ip6tables' ip6tables.service loaded active exited IPv6 firewall with ip6tables iptables.service loaded active exited IPv4 firewall with iptables [root@osedev1 osestaging1]#
- systemd status shows both services are 'disabled'
[root@osedev1 osestaging1]# systemctl status iptables.service ● iptables.service - IPv4 firewall with iptables Loaded: loaded (/usr/lib/systemd/system/iptables.service; disabled; vendor preset: disabled) Active: active (exited) since Wed 2019-10-02 12:58:17 CEST; 7min ago Process: 29121 ExecStart=/usr/libexec/iptables/iptables.init start (code=exited, status=0/SUCCESS) Main PID: 29121 (code=exited, status=0/SUCCESS) CGroup: /system.slice/iptables.service Oct 02 12:58:17 osedev1 systemd[1]: Starting IPv4 firewall with iptables... Oct 02 12:58:17 osedev1 iptables.init[29121]: iptables: Applying firewall rules: [ OK ] Oct 02 12:58:17 osedev1 systemd[1]: Started IPv4 firewall with iptables. [root@osedev1 osestaging1]# systemctl status ip6tables.service ● ip6tables.service - IPv6 firewall with ip6tables Loaded: loaded (/usr/lib/systemd/system/ip6tables.service; disabled; vendor preset: disabled) Active: active (exited) since Wed 2019-10-02 12:59:46 CEST; 6min ago Process: 29233 ExecStart=/usr/libexec/iptables/ip6tables.init start (code=exited, status=0/SUCCESS) Main PID: 29233 (code=exited, status=0/SUCCESS) Oct 02 12:59:46 osedev1 systemd[1]: Starting IPv6 firewall with ip6tables... Oct 02 12:59:46 osedev1 ip6tables.init[29233]: ip6tables: Applying firewall rules: [ OK ] Oct 02 12:59:46 osedev1 systemd[1]: Started IPv6 firewall with ip6tables. [root@osedev1 osestaging1]#
- I enabled both, and I confirmed that they're now set to 'enabled' (see second line)
[root@osedev1 osestaging1]# systemctl enable iptables.service Created symlink from /etc/systemd/system/basic.target.wants/iptables.service to /usr/lib/systemd/system/iptables.service. [root@osedev1 osestaging1]# systemctl enable ip6tables.service Created symlink from /etc/systemd/system/basic.target.wants/ip6tables.service to /usr/lib/systemd/system/ip6tables.service. [root@osedev1 osestaging1]# systemctl status iptables.service ● iptables.service - IPv4 firewall with iptables Loaded: loaded (/usr/lib/systemd/system/iptables.service; enabled; vendor preset: disabled) Active: active (exited) since Wed 2019-10-02 12:58:17 CEST; 8min ago Main PID: 29121 (code=exited, status=0/SUCCESS) CGroup: /system.slice/iptables.service Oct 02 12:58:17 osedev1 systemd[1]: Starting IPv4 firewall with iptables... Oct 02 12:58:17 osedev1 iptables.init[29121]: iptables: Applying firewall rules: [ OK ] Oct 02 12:58:17 osedev1 systemd[1]: Started IPv4 firewall with iptables. [root@osedev1 osestaging1]# systemctl status ip6tables.service ● ip6tables.service - IPv6 firewall with ip6tables Loaded: loaded (/usr/lib/systemd/system/ip6tables.service; enabled; vendor preset: disabled) Active: active (exited) since Wed 2019-10-02 12:59:46 CEST; 7min ago Main PID: 29233 (code=exited, status=0/SUCCESS) Oct 02 12:59:46 osedev1 systemd[1]: Starting IPv6 firewall with ip6tables... Oct 02 12:59:46 osedev1 ip6tables.init[29233]: ip6tables: Applying firewall rules: [ OK ] Oct 02 12:59:46 osedev1 systemd[1]: Started IPv6 firewall with ip6tables. [root@osedev1 osestaging1]#
- actually, it doesn't make sense for the staging server to only have an ip address on the openvpn subnet; if that were the case, then it couldn't access the internet...which would make developing a POC nearly impossible. We want to prevent forwarding ports from the internet to the machine, but we do want to let it reach OUT to the internet. Perhaps we should set up the bridge per normal and then just have the openvpn client running on the staging server. Indeed, we'll need the prod server to be running an openvpn client too, so we should be able to just duplicate this config (they'll be the same anyway!)
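- (a sketch of that firewall intent, assuming libvirt's default 192.168.122.0/24 subnet, which matches the ROUTE_GATEWAY seen in the openvpn logs above, and eth0 as the internet-facing interface: NAT the container's outbound traffic, allow replies back in, and drop any new inbound forwards)
iptables -t nat -A POSTROUTING -s 192.168.122.0/24 -o eth0 -j MASQUERADE
iptables -A FORWARD -i virbr0 -o eth0 -j ACCEPT
iptables -A FORWARD -i eth0 -o virbr0 -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A FORWARD -i eth0 -o virbr0 -j DROP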
- I looked into what options are available for 'lxc.network.type', which is documented in section 5 of the man page for 'lxc.container.conf' (`man 5 lxc.container.conf`)
lxc.network.type specify what kind of network virtualization to be used for the container. Each time a lxc.network.type field is found a new round of network configuration begins. In this way, several network virtualization types can be specified for the same container, as well as assigning several network interfaces for one container. The different virtualization types can be:
none: will cause the container to share the host's network namespace. This means the host network devices are usable in the container. It also means that if both the container and host have upstart as init, 'halt' in a container (for instance) will shut down the host.
empty: will create only the loopback interface.
veth: a virtual ethernet pair device is created with one side assigned to the container and the other side attached to a bridge specified by the lxc.network.link option. If the bridge is not specified, then the veth pair device will be created but not attached to any bridge. Otherwise, the bridge has to be created on the system before starting the container. lxc won't handle any configuration outside of the container. By default, lxc chooses a name for the network device belonging to the outside of the container, but if you wish to handle this name yourselves, you can tell lxc to set a specific name with the lxc.network.veth.pair option (except for unprivileged containers where this option is ignored for security reasons).
vlan: a vlan interface is linked with the interface specified by the lxc.network.link and assigned to the container. The vlan identifier is specified with the option lxc.network.vlan.id.
macvlan: a macvlan interface is linked with the interface specified by the lxc.network.link and assigned to the container. lxc.network.macvlan.mode specifies the mode the macvlan will use to communicate between different macvlan on the same upper device. The accepted modes are private, the device never communicates with any other device on the same upper_dev (default), vepa, the new Virtual Ethernet Port Aggregator (VEPA) mode, it assumes that the adjacent bridge returns all frames where both source and destination are local to the macvlan port, i.e. the bridge is set up as a reflective relay. Broadcast frames coming in from the upper_dev get flooded to all macvlan interfaces in VEPA mode, local frames are not delivered locally, or bridge, it provides the behavior of a simple bridge between different macvlan interfaces on the same port. Frames from one interface to another one get delivered directly and are not sent out externally. Broadcast frames get flooded to all other bridge ports and to the external interface, but when they come back from a reflective relay, we don't deliver them again. Since we know all the MAC addresses, the macvlan bridge mode does not require learning or STP like the bridge module does.
phys: an already existing interface specified by the lxc.network.link is assigned to the container.
- we want the container to be able to touch the internet, so that rules out 'empty'
- we don't have a spare physical interface on the server for each container, so that rules out 'phys'
- I'm unclear on the distinction between macvlan, vlan, veth, and none. Probably we want veth, and we need to get the 'virbr0' interface actually working (a sketch of the resulting config is below)
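- based on the man page above, the veth-attached-to-a-bridge approach would mean lines like the following in the container's config (a minimal sketch only; it assumes the libvirt-provided 'virbr0' bridge that gets created below, and the hwaddr value is illustrative)
# sketch of the network stanza in /var/lib/lxc/osestaging1/config (values illustrative)
lxc.network.type = veth
lxc.network.link = virbr0
lxc.network.flags = up
# pinning the MAC keeps the container's DHCP lease stable across restarts
lxc.network.hwaddr = fe:xx:xx:xx:xx:xx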
- google says our error may be caused by libvirt not being installed
- I confirmed that libvirt wasn't installed, so I installed it
[root@osedev1 osestaging1]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 link/ether 96:00:00:2e:48:9d brd ff:ff:ff:ff:ff:ff inet 195.201.233.113/32 brd 195.201.233.113 scope global dynamic eth0 valid_lft 50735sec preferred_lft 50735sec inet6 2a01:4f8:c010:3ca0::1/64 scope global valid_lft forever preferred_lft forever inet6 fe80::9400:ff:fe2e:489d/64 scope link valid_lft forever preferred_lft forever 3: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100 link/none inet 10.241.189.1 peer 10.241.189.2/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::4ca6:2d27:e97f:1a66/64 scope link flags 800 valid_lft forever preferred_lft forever [root@osedev1 osestaging1]# rpm -qa | grep -i libvirt [root@osedev1 osestaging1]# yum -y install libvirt Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile * base: mirror.alpix.eu * epel: mirror.wiuwiu.de * extras: centosmirror.netcup.net * updates: mirror.alpix.eu Resolving Dependencies ... Complete! [root@osedev1 osestaging1]#
- but there didn't appear to be any changes until I manually started the libvirtd service; after that, `ip a` shows two new interfaces: 'virbr0' & 'virbr0-nic'
[root@osedev1 osestaging1]# systemctl status libvirtd ● libvirtd.service - Virtualization daemon Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled) Active: inactive (dead) Docs: man:libvirtd(8) https://libvirt.org [root@osedev1 osestaging1]# systemctl start libvirtd [root@osedev1 osestaging1]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 link/ether 96:00:00:2e:48:9d brd ff:ff:ff:ff:ff:ff inet 195.201.233.113/32 brd 195.201.233.113 scope global dynamic eth0 valid_lft 50619sec preferred_lft 50619sec inet6 2a01:4f8:c010:3ca0::1/64 scope global valid_lft forever preferred_lft forever inet6 fe80::9400:ff:fe2e:489d/64 scope link valid_lft forever preferred_lft forever 3: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100 link/none inet 10.241.189.1 peer 10.241.189.2/32 scope global tun0 valid_lft forever preferred_lft forever inet6 fe80::4ca6:2d27:e97f:1a66/64 scope link flags 800 valid_lft forever preferred_lft forever 6: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000 link/ether 52:54:00:7d:01:71 brd ff:ff:ff:ff:ff:ff inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0 valid_lft forever preferred_lft forever 7: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN group default qlen 1000 link/ether 52:54:00:7d:01:71 brd ff:ff:ff:ff:ff:ff [root@osedev1 osestaging1]#
- and there are some changes to the routing table too
[root@osedev1 osestaging1]# ip link 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000 link/ether 96:00:00:2e:48:9d brd ff:ff:ff:ff:ff:ff 3: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 100 link/none 6: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000 link/ether 52:54:00:7d:01:71 brd ff:ff:ff:ff:ff:ff 7: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN mode DEFAULT group default qlen 1000 link/ether 52:54:00:7d:01:71 brd ff:ff:ff:ff:ff:ff [root@osedev1 osestaging1]# ip r default via 172.31.1.1 dev eth0 10.241.189.0/24 via 10.241.189.2 dev tun0 10.241.189.2 dev tun0 proto kernel scope link src 10.241.189.1 169.254.0.0/16 dev eth0 scope link metric 1002 172.31.1.1 dev eth0 scope link 192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 [root@osedev1 osestaging1]#
- now I was successfully able to start the 'osestaging1' container
[root@osedev1 osestaging1]# lxc-start -n osestaging1 systemd 219 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN) Detected virtualization lxc. Detected architecture x86-64. Welcome to CentOS Linux 7 (Core)! Running in a container, ignoring fstab device entry for /dev/root. Cannot add dependency job for unit display-manager.service, ignoring: Unit not found. [ OK ] Reached target Remote File Systems. [ OK ] Reached target Swap. [ OK ] Started Forward Password Requests to Wall Directory Watch. [ OK ] Created slice Root Slice. [ OK ] Created slice User and Session Slice. [ OK ] Listening on /dev/initctl Compatibility Named Pipe. [ OK ] Listening on Journal Socket. [ OK ] Started Dispatch Password Requests to Console Directory Watch. [ OK ] Reached target Local Encrypted Volumes. [ OK ] Reached target Paths. [ OK ] Listening on Delayed Shutdown Socket. [ OK ] Created slice System Slice. [ OK ] Created slice system-getty.slice. Starting Journal Service... Mounting POSIX Message Queue File System... [ OK ] Reached target Slices. Starting Read and set NIS domainname from /etc/sysconfig/network... Mounting Huge Pages File System... Starting Remount Root and Kernel File Systems... [ OK ] Mounted Huge Pages File System. [ OK ] Mounted POSIX Message Queue File System. [ OK ] Started Journal Service. [ OK ] Started Read and set NIS domainname from /etc/sysconfig/network. [ OK ] Started Remount Root and Kernel File Systems. [ OK ] Reached target Local File Systems (Pre). Starting Configure read-only root support... Starting Rebuild Hardware Database... Starting Flush Journal to Persistent Storage... <46>systemd-journald[14]: Received request to flush runtime journal from PID 1 [ OK ] Started Flush Journal to Persistent Storage. [ OK ] Started Configure read-only root support. Starting Load/Save Random Seed... [ OK ] Reached target Local File Systems. Starting Rebuild Journal Catalog... Starting Mark the need to relabel after reboot... Starting Create Volatile Files and Directories... [ OK ] Started Load/Save Random Seed. [ OK ] Started Rebuild Journal Catalog. [ OK ] Started Mark the need to relabel after reboot. [ OK ] Started Create Volatile Files and Directories. Starting Update UTMP about System Boot/Shutdown... [ OK ] Started Update UTMP about System Boot/Shutdown. [ OK ] Started Rebuild Hardware Database. Starting Update is Completed... [ OK ] Started Update is Completed. [ OK ] Reached target System Initialization. [ OK ] Listening on D-Bus System Message Bus Socket. [ OK ] Reached target Sockets. [ OK ] Reached target Basic System. Starting LSB: Bring up/down networking... Starting Permit User Sessions... Starting Login Service... Starting OpenSSH Server Key Generation... [ OK ] Started D-Bus System Message Bus. [ OK ] Started Daily Cleanup of Temporary Directories. [ OK ] Reached target Timers. [ OK ] Started Permit User Sessions. Starting Cleanup of Temporary Directories... [ OK ] Started Command Scheduler. [ OK ] Started Console Getty. [ OK ] Reached target Login Prompts. [ OK ] Started Cleanup of Temporary Directories. [ OK ] Started Login Service. [ OK ] Started OpenSSH Server Key Generation. 
CentOS Linux 7 (Core) Kernel 3.10.0-957.21.3.el7.x86_64 on an x86_64 osestaging1 login:
- I was able to log in as root, but it made me change the password immediately. I just set it to the same root password as our prod server
osestaging1 login: root Password: You are required to change your password immediately (root enforced) Changing password for root. (current) UNIX password: New password: Retype new password: [root@osestaging1 ~]#
- this new container has an ip address of '192.168.122.201', and it does have access to the internet
[root@osestaging1 ~]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 8: eth0@if9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether fe:07:06:a6:5f:1d brd ff:ff:ff:ff:ff:ff link-netnsid 0 inet 192.168.122.201/24 brd 192.168.122.255 scope global dynamic eth0 valid_lft 3310sec preferred_lft 3310sec inet6 fe80::fc07:6ff:fea6:5f1d/64 scope link valid_lft forever preferred_lft forever [root@osestaging1 ~]# ping 1.1.1.1 PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data. 64 bytes from 1.1.1.1: icmp_seq=1 ttl=55 time=5.46 ms 64 bytes from 1.1.1.1: icmp_seq=2 ttl=55 time=5.48 ms --- 1.1.1.1 ping statistics --- 2 packets transmitted, 2 received, 0% packet loss, time 1001ms rtt min/avg/max/mdev = 5.468/5.474/5.480/0.006 ms [root@osestaging1 ~]#
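- note the finite 'valid_lft' above: this address is a DHCP lease handed out by libvirt's dnsmasq on its 'default' network, so it could change across restarts; if we ever need it pinned, something like this should work (a sketch, not run here; the MAC is the container's eth0 MAC from the output above, which itself may change unless pinned in the lxc config)
# sketch: reserve 192.168.122.201 for the container's current MAC in libvirt's dnsmasq
virsh net-update default add ip-dhcp-host "<host mac='fe:07:06:a6:5f:1d' ip='192.168.122.201'/>" --live --config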
- on the dev node host, we can also see the bridge with `brctl`
[root@osedev1 osestaging1]# brctl show bridge name bridge id STP enabled interfaces virbr0 8000.5254007d0171 yes vethYMJVGD virbr0-nic [root@osedev1 osestaging1]#
- now I think we're about ready to initiate this sync. Interesting decision: we could either rsync (via ssh) to the dev node or to the staging container. I think it would be safer to go to the container, as you can't fuck up the host dev node in that case.
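- for reference, that sync would look something like the following, run from inside the staging container (a sketch only; the paths are illustrative, and it assumes prod's sshd listens on the same non-standard port as our hardened config)
# sketch (not yet run): pull prod's files into the staging container over ssh
rsync -av -e "ssh -p 32415" maltfield@opensourceecology.org:/var/www/ /var/www/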
- I confirmed that ssh is listening on the default install of the staging container
[root@osestaging1 ~]# ss -plan | grep -i ssh u_str ESTAB 0 0 * 162265 * 0 users:(("sshd",pid=298,fd=2),("sshd",pid=298,fd=1)) tcp LISTEN 0 128 *:22 *:* users:(("sshd",pid=298,fd=3)) tcp LISTEN 0 128 [::]:22 [::]:* users:(("sshd",pid=298,fd=4)) [root@osestaging1 ~]#
- I did some basic bootstrap config of the staging container, following my documentation for doing the same to its host dev server Maltfield_Log/2019_Q3#Tue_Aug_20.2C_2019
[root@osestaging1 ~]# useradd maltfield [root@osestaging1 ~]# su - maltfield [maltfield@osestaging1 ~]$ mkdir .ssh [maltfield@osestaging1 ~]$ echo "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDGNYjR7UKiJSAG/AbP+vlCBqNfQZ2yuSXfsEDuM7cEU8PQNJyuJnS7m0VcA48JRnpUpPYYCCB0fqtIEhpP+szpMg2LByfTtbU0vDBjzQD9mEfwZ0mzJsfzh1Nxe86l/d6h6FhxAqK+eG7ljYBElDhF4l2lgcMAl9TiSba0pcqqYBRsvJgQoAjlZOIeVEvM1lyfWfrmDaFK37jdUCBWq8QeJ98qpNDX4A76f9T5Y3q5EuSFkY0fcU+zwFxM71bGGlgmo5YsMMdSsW+89fSG0652/U4sjf4NTHCpuD0UaSPB876NJ7QzeDWtOgyBC4nhPpS8pgjsnl48QZuVm6FNDqbXr9bVk5BdntpBgps+gXdSL2j0/yRRayLXzps1LCdasMCBxCzK+lJYWGalw5dNaIDHBsEZiK55iwPp0W3lU9vXFO4oKNJGFgbhNmn+KAaW82NBwlTHo/tOlj2/VQD9uaK5YLhQqAJzIq0JuWZWFLUC2FJIIG0pJBIonNabANcN+vq+YJqjd+JXNZyTZ0mzuj3OAB/Z5zS6lT9azPfnEjpcOngFs46P7S/1hRIrSWCvZ8kfECpa8W+cTMus4rpCd40d1tVKzJA/n0MGJjEs2q4cK6lC08pXxq9zAyt7PMl94PHse2uzDFhrhh7d0ManxNZE+I5/IPWOnG1PJsDlOe4Yqw== michael@opensourceecology.org" > .ssh/authorized_keys [maltfield@osestaging1 ~]$ chmod 0700 .ssh [maltfield@osestaging1 ~]$ chmod 0600 .ssh/authorized_keys [maltfield@osestaging1 ~]$
- I confirmed that I could now successfully ssh into staging from within dev as 'maltfield' using my key
user@ose:~$ ssh -A osedev1 Last login: Wed Oct 2 12:09:35 2019 from 5.254.96.238 [maltfield@osedev1 ~]$ ssh maltfield@192.168.122.201 hostname The authenticity of host '192.168.122.201 (192.168.122.201)' can't be established. ECDSA key fingerprint is SHA256:a6NpVsq/qdOCV8o7u3TXeVfZIxp7hpgMqXFOifTuNrI. ECDSA key fingerprint is MD5:ab:eb:7f:f2:bb:83:a1:e5:21:49:1e:22:93:17:70:d6. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '192.168.122.201' (ECDSA) to the list of known hosts. osestaging1 [maltfield@osedev1 ~]$
- and continued with the bootstrap of my user, giving myself sudo rights
[root@osestaging1 ~]# yum -y install sudo ... Installed: sudo.x86_64 0:1.8.23-4.el7 Complete! [root@osestaging1 ~]# passwd maltfield Changing password for user maltfield. New password: Retype new password: passwd: all authentication tokens updated successfully. [root@osestaging1 ~]# gpasswd -a maltfield wheel Adding user maltfield to group wheel [root@osestaging1 ~]# su - maltfield Last login: Wed Oct 2 13:00:29 UTC 2019 on lxc/console [maltfield@osestaging1 ~]$ sudo su - We trust you have received the usual lecture from the local System Administrator. It usually boils down to these three things: #1) Respect the privacy of others. #2) Think before you type. #3) With great power comes great responsibility. [sudo] password for maltfield: Last login: Wed Oct 2 12:33:00 UTC 2019 on lxc/console [root@osestaging1 ~]#
- this time I took the hardened config from dev and gave it to staging; first on dev I ran:
user@ose:~$ ssh osedev1 Last login: Wed Oct 2 14:57:15 2019 from 5.254.96.238 [maltfield@osedev1 ~]$ sudo cp /etc/ssh/sshd_config . [maltfield@osedev1 ~]$ sudo chown maltfield sshd_config [maltfield@osedev1 ~]$ scp sshd_config 192.168.122.201: sshd_config 100% 4455 5.7MB/s 00:00 [maltfield@osedev1 ~]$
- and then in staging
[maltfield@osestaging1 ~]$ ls sshd_config [maltfield@osestaging1 ~]$ sudo su - [sudo] password for maltfield: Last login: Wed Oct 2 13:02:02 UTC 2019 on lxc/console [root@osestaging1 ~]# cd /etc/ssh [root@osestaging1 ssh]# mv sshd_config sshd_config.20191002.orig [root@osestaging1 ssh]# mv /home/maltfield/sshd_config . [root@osestaging1 ssh]# ls -lah total 620K drwxr-xr-x. 2 root root 4.0K Oct 2 13:16 . drwxr-xr-x. 60 root root 4.0K Oct 2 13:01 .. -rw-r--r--. 1 root root 569K Aug 9 01:40 moduli -rw-r--r--. 1 root root 2.3K Aug 9 01:40 ssh_config -rw-r-----. 1 root ssh_keys 227 Oct 2 12:28 ssh_host_ecdsa_key -rw-r--r--. 1 root root 162 Oct 2 12:28 ssh_host_ecdsa_key.pub -rw-r-----. 1 root ssh_keys 387 Oct 2 12:28 ssh_host_ed25519_key -rw-r--r--. 1 root root 82 Oct 2 12:28 ssh_host_ed25519_key.pub -rw-r-----. 1 root ssh_keys 1.7K Oct 2 12:28 ssh_host_rsa_key -rw-r--r--. 1 root root 382 Oct 2 12:28 ssh_host_rsa_key.pub -rw-------. 1 maltfield maltfield 4.4K Oct 2 13:07 sshd_config -rw-------. 1 root root 3.9K Aug 9 01:40 sshd_config.20191002.orig [root@osestaging1 ssh]# chown root:root sshd_config [root@osestaging1 ssh]# ls -lah total 620K drwxr-xr-x. 2 root root 4.0K Oct 2 13:16 . drwxr-xr-x. 60 root root 4.0K Oct 2 13:01 .. -rw-r--r--. 1 root root 569K Aug 9 01:40 moduli -rw-r--r--. 1 root root 2.3K Aug 9 01:40 ssh_config -rw-r-----. 1 root ssh_keys 227 Oct 2 12:28 ssh_host_ecdsa_key -rw-r--r--. 1 root root 162 Oct 2 12:28 ssh_host_ecdsa_key.pub -rw-r-----. 1 root ssh_keys 387 Oct 2 12:28 ssh_host_ed25519_key -rw-r--r--. 1 root root 82 Oct 2 12:28 ssh_host_ed25519_key.pub -rw-r-----. 1 root ssh_keys 1.7K Oct 2 12:28 ssh_host_rsa_key -rw-r--r--. 1 root root 382 Oct 2 12:28 ssh_host_rsa_key.pub -rw-------. 1 root root 4.4K Oct 2 13:07 sshd_config -rw-------. 1 root root 3.9K Aug 9 01:40 sshd_config.20191002.orig [root@osestaging1 ssh]# grep AllowGroups sshd_config AllowGroups sshaccess [root@osestaging1 ssh]# grep sshaccess /etc/group [root@osestaging1 ssh]# groupadd sshaccess [root@osestaging1 ssh]# gpasswd -a maltfield sshaccess Adding user maltfield to group sshaccess [root@osestaging1 ssh]# grep sshaccess /etc/group sshaccess:x:1001:maltfield [root@osestaging1 ssh]# systemctl restart sshd [root@osestaging1 ssh]#
- I confirmed that I could still ssh in on the new non-standard port from dev to staging
user@ose:~$ ssh osedev1 Last login: Wed Oct 2 15:13:21 2019 from 5.254.96.225 [maltfield@osedev1 ~]$ ssh maltfield@192.168.122.201 hostname ssh: connect to host 192.168.122.201 port 22: Connection refused [maltfield@osedev1 ~]$ ssh -p 32415 maltfield@192.168.122.201 hostname osestaging1 [maltfield@osedev1 ~]$
- I could go on to set up iptables to block incoming traffic, but the beauty of this being a container with a NAT'd private ip address on a host whose internet-facing ip is already locked down with iptables is that we really don't need to do that. It's already inaccessible from the internet, and it will only be accessible from the dev node--onto which our developers must vpn as a necessary prerequisite to reach this staging node
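- a quick sanity check of that claim would be to confirm that no DNAT/port-forwarding rules on the dev node reference the container's address, e.g.
# sketch: any output here would indicate a port-forward into the container
iptables -t nat -S | grep 192.168.122.201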
- let's make it so that prod can touch staging; we'll create a cert for openvpn for our prod node, and install it on both our prod & staging nodes. Then we'll update our openvpn config to include the client-to-client option https://openvpn.net/community-resources/how-to/#scope
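- roughly, that plan translates to something like the following (a sketch only; 'oseprod' is a hypothetical client name, the openvpn unit name may differ, and the easy-rsa 3 syntax assumes that's what our CA uses)
# on the openvpn server (osedev1): allow VPN clients to reach each other
echo 'client-to-client' >> /etc/openvpn/server.conf
systemctl restart openvpn@server
# generate a client cert/key for the prod node from our existing CA
cd /etc/openvpn/easy-rsa && ./easyrsa build-client-full oseprod nopass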
- before continuing, it would be wise to create a snapshot of the staging container
[root@osedev1 ssh]# lxc-snapshot --name osestaging1 --list No snapshots [root@osedev1 ssh]# lxc-snapshot --name osestaging1 afterBootstrap lxc_container: lxccontainer.c: lxcapi_snapshot: 2891 Snapshot of directory-backed container requested. lxc_container: lxccontainer.c: lxcapi_snapshot: 2892 Making a copy-clone. If you do want snapshots, then lxc_container: lxccontainer.c: lxcapi_snapshot: 2893 please create an aufs or overlayfs clone first, snapshot that lxc_container: lxccontainer.c: lxcapi_snapshot: 2894 and keep the original container pristine. lxc_container: lxccontainer.c: lxcapi_clone: 2643 error: Original container (osestaging1) is running lxc_container: lxccontainer.c: lxcapi_snapshot: 2899 clone of /var/lib/lxc:osestaging1 failed lxc_container: lxc_snapshot.c: do_snapshot: 55 Error creating a snapshot [root@osedev1 ssh]#
- I tried to create a snapshot; it told me that it can't do deltas unless I use overlayfs or aufs (or probably also zfs, btrfs, etc). It failed because the container was still running, so I stopped it and tried again.
[root@osedev1 ssh]# lxc-snapshot --name osestaging1 afterBootstrap lxc_container: lxccontainer.c: lxcapi_snapshot: 2891 Snapshot of directory-backed container requested. lxc_container: lxccontainer.c: lxcapi_snapshot: 2892 Making a copy-clone. If you do want snapshots, then lxc_container: lxccontainer.c: lxcapi_snapshot: 2893 please create an aufs or overlayfs clone first, snapshot that lxc_container: lxccontainer.c: lxcapi_snapshot: 2894 and keep the original container pristine. [root@osedev1 ssh]# lxc-snapshot --name osestaging1 --list snap0 (/var/lib/lxcsnaps/osestaging1) 2019:10:02 15:37:58 [root@osedev1 ssh]#
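- (the stop itself isn't captured above; the working sequence was roughly the following -- note the snapshot got the auto-generated name 'snap0' regardless of the 'afterBootstrap' argument)
lxc-stop --name osestaging1
lxc-snapshot --name osestaging1
# bring the container back up in the background afterwards
lxc-start --name osestaging1 --daemon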
- so our container is ~0.5G, and so is our single snapshot (as warned above, it's a full copy-clone, not a delta)
[root@osedev1 ssh]# du -sh /var/lib/lxcsnaps/* 459M /var/lib/lxcsnaps/osestaging1 [root@osedev1 ssh]# du -sh /var/lib/lxc/* 459M /var/lib/lxc/osestaging1 [root@osedev1 ssh]#
- eventually we'll need to mount the external block volume to /var/, especially before the sync from prod
[root@osedev1 ssh]# df -h Filesystem Size Used Avail Use% Mounted on /dev/sda1 19G 2.4G 16G 14% / devtmpfs 873M 0 873M 0% /dev tmpfs 896M 0 896M 0% /dev/shm tmpfs 896M 17M 879M 2% /run tmpfs 896M 0 896M 0% /sys/fs/cgroup /dev/sdb 9.8G 37M 9.3G 1% /mnt/HC_Volume_3110278 tmpfs 180M 0 180M 0% /run/user/1000 [root@osedev1 ssh]#
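- one way that move could go (a sketch, untested; the mountpoint comes from the df output above, and a bind mount is just one option vs. mounting the volume directly at /var)
# stop the container before moving its rootfs onto the external volume
lxc-stop --name osestaging1
rsync -aHAX /var/lib/lxc/ /mnt/HC_Volume_3110278/lxc/
mv /var/lib/lxc /var/lib/lxc.orig && mkdir /var/lib/lxc
echo '/mnt/HC_Volume_3110278/lxc /var/lib/lxc none bind 0 0' >> /etc/fstab
mount /var/lib/lxc
lxc-start --name osestaging1 --daemon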
- as for backups, I created new API keys that have access to only the 'ose-dev-server-backups' bucket.
- because ransomware is a topic of concern (specifically the scenario where the ransomware deletes your backups), I also noticed that when we create the api key, we can remove the 'deleteFiles' and 'deleteBuckets' capabilities (the cleanup is actually done by the lifecycle rules on backblaze's side--not our script's logic). Apparently there's no way to edit the capabilities of existing keys, so this would be a non-trivial change.
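- for reference, creating a bucket-restricted key without the delete capabilities looks something like this in the b2 cli (a sketch; double-check the exact syntax against `b2 --help` for our installed version)
# sketch: a key limited to one bucket, with no deleteFiles/deleteBuckets capability
b2 create-key --bucket ose-dev-server-backups ose-dev-backups-cron listBuckets,listFiles,readFiles,writeFiles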
- I wrote the api key creds to osedev1:/root/scripts/backup.settings
- And I created a new 4K encryption key. To make it clearer, I named it 'ose-dev-backups-cron.201910.key'. I added it to the shared ose keepass db under "backups" (files attached are under the "Advanced" tab)
- I also installed the b2 cli's dependencies on the dev node; unfortunately I hit some issues https://wiki.opensourceecology.org/wiki/Backblaze#Install_CLI
[root@osedev1 backups]# yum install python-virtualenv ... Installed: python-virtualenv.noarch 0:15.1.0-2.el7 Dependency Installed: python-devel.x86_64 0:2.7.5-86.el7 python-rpm-macros.noarch 0:3-32.el7 python-srpm-macros.noarch 0:3-32.el7 python2-rpm-macros.noarch 0:3-32.el7 Dependency Updated: python.x86_64 0:2.7.5-86.el7 python-libs.x86_64 0:2.7.5-86.el7 Complete! [root@osedev1 backups]# yum install python-setuptools Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile * base: mirror.alpix.eu * epel: mirror.wiuwiu.de * extras: centosmirror.netcup.net * updates: mirror.alpix.eu Package python-setuptools-0.9.8-7.el7.noarch already installed and latest version Nothing to do [root@osedev1 backups]# yum install git ... Installed: git.x86_64 0:1.8.3.1-20.el7 Dependency Installed: perl-Error.noarch 1:0.17020-2.el7 perl-Git.noarch 0:1.8.3.1-20.el7 perl-TermReadKey.x86_64 0:2.30-20.el7 Complete! [root@osedev1 backups]# adduser b2user [root@osedev1 backups]# sudo su - b2user [b2user@osedev1 ~]$ mkdir virtualenv [b2user@osedev1 ~]$ cd virtualenv/ [b2user@osedev1 virtualenv]$ virtualenv . New python executable in /home/b2user/virtualenv/bin/python Installing setuptools, pip, wheel...done. [b2user@osedev1 virtualenv]$ cd .. [b2user@osedev1 ~]$ mkdir sandbox [b2user@osedev1 ~]$ cd sandbox/ [b2user@osedev1 sandbox]$ git clone https://github.com/Backblaze/B2_Command_Line_Tool.git Cloning into 'B2_Command_Line_Tool'... remote: Enumerating objects: 151, done. remote: Counting objects: 100% (151/151), done. remote: Compressing objects: 100% (93/93), done. remote: Total 7130 (delta 90), reused 102 (delta 55), pack-reused 6979 Receiving objects: 100% (7130/7130), 1.80 MiB | 3.35 MiB/s, done. Resolving deltas: 100% (5127/5127), done. [b2user@osedev1 sandbox]$ cd B2_Command_Line_Tool/ [b2user@osedev1 B2_Command_Line_Tool]$ python setup.py install setuptools 20.2 or later is required. To fix, try running: pip install "setuptools>=20.2" [b2user@osedev1 B2_Command_Line_Tool]$
- I hate using pip; it often breaks the OS and installed apps, but I bit my tongue & proceeded (I wouldn't do this on prod)
[root@osedev1 backups]# yum install python3-setuptools Installed: python3-setuptools.noarch 0:39.2.0-10.el7 Dependency Installed: python3.x86_64 0:3.6.8-10.el7 python3-libs.x86_64 0:3.6.8-10.el7 python3-pip.noarch 0:9.0.3-5.el7 Complete! [root@osedev1 backups]# [root@osedev1 backups]# pip install "setuptools>=20.2" -bash: pip: command not found [root@osedev1 backups]# yum install python-pip ... Installed: python2-pip.noarch 0:8.1.2-10.el7 Complete! [root@osedev1 backups]# pip install "setuptools>=20.2" Collecting setuptools>=20.2 Downloading https://files.pythonhosted.org/packages/b2/86/095d2f7829badc207c893dd4ac767e871f6cd547145df797ea26baea4e2e/setuptools-41.2.0-py2.py3-none-any.whl (576kB) 100% || 583kB 832kB/s Installing collected packages: setuptools Found existing installation: setuptools 0.9.8 Uninstalling setuptools-0.9.8: Successfully uninstalled setuptools-0.9.8 Successfully installed setuptools-41.2.0 You are using pip version 8.1.2, however version 19.2.3 is available. You should consider upgrading via the 'pip install --upgrade pip' command. [root@osedev1 backups]# pip install --upgrade pip Collecting pip Downloading https://files.pythonhosted.org/packages/30/db/9e38760b32e3e7f40cce46dd5fb107b8c73840df38f0046d8e6514e675a1/pip-19.2.3-py2.py3-none-any.whl (1.4MB) 100% || 1.4MB 511kB/s Installing collected packages: pip Found existing installation: pip 8.1.2 Uninstalling pip-8.1.2: Successfully uninstalled pip-8.1.2 Successfully installed pip-19.2.3 [root@osedev1 backups]#
- when it came time to install it, I had to add the '--user' flag
[b2user@osedev1 B2_Command_Line_Tool]$ python setup.py install --user ... Installed /home/b2user/.local/lib/python2.7/site-packages/python_dateutil-2.8.0-py2.7.egg Searching for setuptools==41.2.0 Best match: setuptools 41.2.0 Adding setuptools 41.2.0 to easy-install.pth file Installing easy_install script to /home/b2user/.local/bin Installing easy_install-3.6 script to /home/b2user/.local/bin Using /usr/lib/python2.7/site-packages Finished processing dependencies for b2==1.4.1 [b2user@osedev1 B2_Command_Line_Tool]$ [b2user@osedev1 B2_Command_Line_Tool]$ ^C [b2user@osedev1 B2_Command_Line_Tool]$ ~/.local/bin/b2 version b2 command line tool, version 1.4.1 [b2user@osedev1 B2_Command_Line_Tool]$
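- in hindsight, the virtualenv created above could have isolated all of this from the system python instead of upgrading the OS-level setuptools; a sketch (untested on this box):
# as b2user, install the b2 cli into the virtualenv rather than system-wide
su - b2user
source ~/virtualenv/bin/activate
pip install --upgrade pip setuptools
pip install ~/sandbox/B2_Command_Line_Tool/
b2 version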