Web server configuration
This document will describe how our web server is configured.
We should refrain from actually posting configuration files here, lest we create a documentation maintenance nightmare (which would invariably result in stale, useless content). The source of truth for the contents of our ever-changing configuration files should be the server itself.
Rather, in this document, we will describe the overall architecture. For specific directories and configuration files that are relevant, we will simply name their location on the server.
The files on the server are backed up daily. This wiki does not and should not serve as a backup of our configuration files.
For more information on how to access our server and on our backups, see OSE Server.
Architecture
Our http(s) content is served using the following web servers:
- Nginx
- Varnish
- Apache
The traffic flows in order, i.e. Internet -> nginx -> varnish -> apache
And then back out to the client in the reverse order: apache -> varnish -> nginx -> Internet
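One rough way to observe this chain from the outside is to look at the response headers. The sketch below assumes each layer still emits its default headers (nginx setting `Server`, varnish adding `Via`, `X-Varnish`, and `Age`); our actual configuration may strip or rewrite them, and the function name `proxy_headers` is purely illustrative.

```shell
# Keep only the response headers that usually reveal the proxy layers.
# Assumption: nginx and varnish still emit their default headers.
proxy_headers() {
  tr -d '\r' | grep -iE '^(server|via|x-varnish|age):'
}

# Example usage (from any workstation):
#   curl -sI https://www.opensourceecology.org/ | proxy_headers
```

Apache will usually not show up here, since nginx replaces the backend's `Server` header with its own by default.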
Additionally, the following software assists in the configuration of the above web servers:
- logrotate
- awstats
Why??
OSE's principles aim for simplicity--so you ask, "Why aren't we simply using only Apache? Why Nginx as well? And why Varnish?" Great question!
Before simplicity, OSE is radically committed to using FLOSS. We're also an ecologically-aware, low-budget, non-profit with limited financial & computational resources. Keeping this in mind, below are the reasons for the complexity described in this documentation:
- Varnish is a cache. It's an essential component that allows us to serve a very high volume of requests on many websites from a single server. Unfortunately, the free version of Varnish does not speak https.
- Nginx is our tls-terminator. It listens to our encrypted traffic over https & passes unencrypted http traffic internally onto varnish.
- Nginx has great DOS protection and rate-limiting built-in.
- Nginx being distinct from Apache gives us the ability to serve a SITE_DOWN vhost to users for a specific domain, while devs are still able to iterate & test changes to the backend Apache server. Note that we have only 1 dedicated host, and we don't have a load balancer.
- Most people who use https + varnish specifically use Nginx to terminate their https. Therefore, there is better documentation & a user-support-base for this architecture.
- History. I (Michael Altfield) came on board in 2017 with only Apache running. I added https to protect our users' passwords, which were being sent in cleartext, and--in doing so--I had to abandon CloudFront so that a third party didn't have our private keys. At the time, CF was our CDN cache, so I had to implement a self-hosted cache. I chose varnish, and then had to add nginx in front of it for https termination.
Ports
What is the network and port architecture of our web servers?
Well, rather than documenting the ports that we're currently using for these web servers (which will probably be stale/outdated by the time you read this), here are the commands you can run on production to see which ports our web servers' daemons are listening on:
ss -plan | grep LISTEN | grep -i nginx
ss -plan | grep LISTEN | grep -i varnish
ss -plan | grep LISTEN | grep -i httpd
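If you only want the addresses and ports rather than the full socket lines, the same filter can be wrapped in a small helper. This is a minimal sketch: `listen_addrs` is a hypothetical name, and it assumes that `ss -plan` prints the local address in the fifth column, which may differ between `ss` versions.

```shell
# Print only the local address:port of listening sockets owned by a daemon.
# Assumption: `ss -plan` prints the local address in the 5th column.
listen_addrs() {
  grep LISTEN | grep -i -- "$1" | awk '{print $5}' | sort -u
}

# Example usage (as root on the server):
#   for d in nginx varnish httpd; do echo "== $d"; ss -plan | listen_addrs "$d"; done
```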
Nginx
This section will describe our use of the Nginx server in our web server configuration, which serves as our https terminator, basic DOS protection, and SITE_DOWN tool.
Why Nginx?
Nginx is necessary to terminate https prior to our varnish cache, as the free version of varnish does not speak https. Hitch was an option as well, but it lacked many features:
- Nginx has DOS protection. Hitch does not.
- Nginx has powerful rewrite/redirect rules, such as http->https or for subdir-to-subdomain redirects. Hitch can't do redirects.
- Nginx is generally very popular and very well documented. Hitch is generally very poorly documented.
- Specifically for the case of terminating https for varnish, more varnish users use Nginx for this than Hitch.
- Nginx allows you to define a dhparams file. Hitch requires a silly process of concatenating the file into a hitch-specific pem file, which complicates our every-90-day Let's Encrypt cert renewal process.
- Nginx permits us to "return 444", a special nginx-only status code that closes the connection without sending any response, dropping the request entirely. Neither Apache, Varnish, nor Hitch has this awesome feature.
Important Files & Directories
For more information about our nginx web server's configuration, please see the following files & directories on the server:
/etc/nginx/nginx.conf
/etc/nginx/conf.d/<vhost_fqdn>.conf
/var/log/nginx/<vhost_fqdn>/access.log
/var/log/nginx/<vhost_fqdn>/error.log
/var/log/nginx/access.log
/var/log/nginx/error.log
https
In 2017 & 2018, Michael Altfield migrated OSE sites to use https with Let's Encrypt certificates.
Because the free version of varnish does not speak https, we terminate https using nginx.
Nginx's https config was hardened using Mozilla's ssl-config-generator and the Qualys ssllabs.com SSL Server Test.
Let's Encrypt
We use Let's Encrypt, a FLOSS CA, to generate our free https certificate.
Let's Encrypt certificates are valid for 90 days, and they are automatically renewed using the `certbot` tool via a cron job. For more information, see the following files & directories:
/etc/letsencrypt/
/var/log/letsencrypt/
/etc/cron.d/letsencrypt
/root/bin/letsencrypt/renew.sh
/var/log/letsEncryptRenew.log
/etc/nginx/conf.d/ssl.<domain>.<tld>.include
/etc/letsencrypt/live/<domain>.<tld>/
/etc/pki/tls/hpkpBackupKeys/
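To confirm that the cron job is actually keeping a certificate fresh, you can ask openssl whether the certificate will still be valid some number of days from now. This is a minimal sketch; the helper name `cert_valid_for_days` and the 30-day threshold in the usage line are assumptions, not something taken from our renewal script.

```shell
# Succeed (exit 0) if the certificate is still valid N days from now.
# openssl's -checkend option takes a number of seconds.
cert_valid_for_days() {
  cert="$1"
  days="$2"
  openssl x509 -in "$cert" -noout -checkend "$(( days * 86400 ))" >/dev/null
}

# Example usage (as root on the server):
#   cert_valid_for_days /etc/letsencrypt/live/opensourceecology.org/cert.pem 30 \
#     || echo "certificate expires within 30 days; check the renewal logs"
```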
To add a new subdomain (actually a SAN = Subject Alternative Name) to a certificate, you must reissue the certificate, listing all the existing domains plus the new one. First get a list of all the existing domains:
[root@hetzner2 htdocs]# certbot certificates
Saving debug log to /var/log/letsencrypt/letsencrypt.log
-------------------------------------------------------------------------------
Found the following certs:
  Certificate Name: opensourceecology.org
    Domains: fef.opensourceecology.org osemain.opensourceecology.org oswh.opensourceecology.org
    Expiry Date: 2018-04-03 22:37:19+00:00 (VALID: 74 days)
    Certificate Path: /etc/letsencrypt/live/opensourceecology.org/fullchain.pem
    Private Key Path: /etc/letsencrypt/live/opensourceecology.org/privkey.pem
  Certificate Name: openbuildinginstitute.org
    Domains: openbuildinginstitute.org awstats.openbuildinginstitute.org seedhome.openbuildinginstitute.org www.openbuildinginstitute.org
    Expiry Date: 2018-03-18 03:46:32+00:00 (VALID: 57 days)
    Certificate Path: /etc/letsencrypt/live/openbuildinginstitute.org/fullchain.pem
    Private Key Path: /etc/letsencrypt/live/openbuildinginstitute.org/privkey.pem
-------------------------------------------------------------------------------
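When a certificate covers many domains, it can help to pull out just the domain list for one certificate name. The following is a small sketch that parses the `certbot certificates` output format shown above; `cert_domains` is a hypothetical helper name, and the parsing assumes the "Certificate Name:" and "Domains:" labels do not change between certbot versions.

```shell
# Print one domain per line for the given certificate name,
# reading `certbot certificates` output on stdin.
cert_domains() {
  awk -v name="$1" '
    $1 == "Certificate" && $2 == "Name:" { current = $3 }
    $1 == "Domains:" && current == name {
      for (i = 2; i <= NF; i++) print $i
    }
  '
}

# Example usage (as root on the server):
#   certbot certificates 2>/dev/null | cert_domains opensourceecology.org
```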
The webroots for existing domains can be determined by checking the letsencrypt domain's config files:
[root@hetzner2 htdocs]# cat /etc/letsencrypt/renewal/*.conf
# renew_before_expiry = 30 days
version = 0.19.0
archive_dir = /etc/letsencrypt/archive/openbuildinginstitute.org
cert = /etc/letsencrypt/live/openbuildinginstitute.org/cert.pem
privkey = /etc/letsencrypt/live/openbuildinginstitute.org/privkey.pem
chain = /etc/letsencrypt/live/openbuildinginstitute.org/chain.pem
fullchain = /etc/letsencrypt/live/openbuildinginstitute.org/fullchain.pem
# Options used in the renewal process
[renewalparams]
authenticator = webroot
...
webroot_path = /var/www/html/www.openbuildinginstitute.org/htdocs, /var/www/html/seedhome.openbuildinginstitute.org
[[webroot_map]]
openbuildinginstitute.org = /var/www/html/www.openbuildinginstitute.org/htdocs
awstats.openbuildinginstitute.org = /var/www/html/www.openbuildinginstitute.org/htdocs
seedhome.openbuildinginstitute.org = /var/www/html/seedhome.openbuildinginstitute.org
www.openbuildinginstitute.org = /var/www/html/www.openbuildinginstitute.org/htdocs
# renew_before_expiry = 30 days
version = 0.19.0
archive_dir = /etc/letsencrypt/archive/opensourceecology.org
cert = /etc/letsencrypt/live/opensourceecology.org/cert.pem
privkey = /etc/letsencrypt/live/opensourceecology.org/privkey.pem
chain = /etc/letsencrypt/live/opensourceecology.org/chain.pem
fullchain = /etc/letsencrypt/live/opensourceecology.org/fullchain.pem
# Options used in the renewal process
[renewalparams]
authenticator = webroot
...
webroot_path = /var/www/html/fef.opensourceecology.org/htdocs, /var/www/html/oswh.opensourceecology.org/htdocs, /var/www/html/www.opensourceecology.org/htdocs
[[webroot_map]]
fef.opensourceecology.org = /var/www/html/fef.opensourceecology.org/htdocs
www.opensourceecology.org = /var/www/html/www.opensourceecology.org/htdocs
oswh.opensourceecology.org = /var/www/html/oswh.opensourceecology.org/htdocs
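Since certbot expects each domain as a `-w <webroot> -d <domain>` pair, the `[[webroot_map]]` section of a renewal file can be turned into those pairs mechanically. This is only a sketch: `webroot_args` is a hypothetical helper, and it assumes the map entries are written as `domain = webroot`, one per line, as in the output above.

```shell
# Read one certbot renewal .conf file on stdin and print a
# "-w <webroot> -d <domain>" pair for every entry in [[webroot_map]].
webroot_args() {
  awk -F ' *= *' '
    /^[ \t]*#/             { next }                 # skip comment lines
    /^\[\[webroot_map\]\]/ { in_map = 1; next }     # the map section starts
    /^\[/                  { in_map = 0 }           # any other section ends it
    in_map && NF == 2      { print "-w", $2, "-d", $1 }
  '
}

# Example usage (as root on the server, one renewal file at a time):
#   webroot_args < /etc/letsencrypt/renewal/opensourceecology.org.conf
```

Review the printed pairs by hand and append the pair for the new subdomain before running certbot.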
Then expand the certificate to include the new domain, fix the permissions on the private keys, and reload nginx:
certbot -nv --expand --cert-name opensourceecology.org certonly -v --webroot \
  -w /var/www/html/fef.opensourceecology.org/htdocs/ -d fef.opensourceecology.org \
  -w /var/www/html/www.opensourceecology.org/htdocs -d www.opensourceecology.org -d opensourceecology.org \
  -w /var/www/html/oswh.opensourceecology.org/htdocs/ -d oswh.opensourceecology.org \
  -w /var/www/html/forum.opensourceecology.org/htdocs -d forum.opensourceecology.org \
  -w /var/www/html/store.opensourceecology.org/htdocs -d store.opensourceecology.org \
  -w /var/www/html/phplist.opensourceecology.org/public_html -d phplist.opensourceecology.org \
  -w /var/www/html/certbot/htdocs -d munin.opensourceecology.org -d awstats.opensourceecology.org \
  -w /var/www/html/microfactory.opensourceecology.org/htdocs -d microfactory.opensourceecology.org \
  -w /var/www/html/wiki.opensourceecology.org/htdocs -d wiki.opensourceecology.org \
  -w /var/www/html/staging.opensourceecology.org/htdocs -d staging.opensourceecology.org
/bin/chmod 0400 /etc/letsencrypt/archive/*/pri*
nginx -t && service nginx reload
HPKP
HTTP Public Key Pinning (HPKP) can brick your domain if not done properly. For safety, 14 keys were pinned following the Let's Encrypt HPKP Best Practices Guide, including:
- Two distinct, pre-generated backup keys' CSRs @ /etc/pki/tls/hpkpBackupKeys/
- Our leaf certificate issued by Let's Encrypt using certbot @ /etc/letsencrypt/live/opensourceecology.org/cert.pem
- The intermediate Let's Encrypt certificate that signed our certificate @ /etc/letsencrypt/live/opensourceecology.org/chain.pem
- The Internet Security Research Group (ISRG) Root Certificate for Let's Encrypt
- The IdenTrust Root Certificate, which cross-signed the Let's Encrypt Root Certificate
- In case Let's Encrypt is no longer usable in the future, all root certificates & the root certificates of their cross-signers for CloudFlare, since they offer free certificates. This includes digicert, addtrust, globalsign, and gtecybertrust (now digicert)
- In case Let's Encrypt is no longer usable in the future, all root certificates & the root certificates of their cross-signers for SSL.com, since they offer free certificates for 90 days.
Moreover, apache was configured with a report-uri, which can be checked on the server to debug potential client-side HPKP issues:
report-uri="http://opensourceecology.org/hpkp-report"
For more information on our HPKP config, see the following files & directories:
/etc/nginx/conf.d/ssl.<domain>.<tld>.include
/etc/pki/tls/hpkpBackupKeys/
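When debugging a pinning problem, it is useful to compute the pin of a given certificate and compare it against the `pin-sha256` values in the include file. The sketch below uses the commonly documented openssl pipeline for this (hash of the DER-encoded public key, base64-encoded); `cert_pin_sha256` is a hypothetical helper name, and which of the pinned values it should match depends on what is actually configured on the server.

```shell
# Print the base64-encoded SHA-256 hash of a certificate's public key (SPKI),
# i.e. the value that goes inside pin-sha256="...".
cert_pin_sha256() {
  openssl x509 -in "$1" -pubkey -noout \
    | openssl pkey -pubin -outform der \
    | openssl dgst -sha256 -binary \
    | openssl enc -base64
}

# Example usage (as root on the server):
#   cert_pin_sha256 /etc/letsencrypt/live/opensourceecology.org/cert.pem
#   cert_pin_sha256 /etc/letsencrypt/live/opensourceecology.org/chain.pem
```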
Varnish
This section will describe our use of the Varnish server in our web server configuration, which serves as our in-memory cache.
Why Varnish?
Our biggest site is this wiki (running on MediaWiki). As of 2017, Wikipedia (the largest site running on MediaWiki) has chosen Varnish as its cache of choice, after experimenting with Squid & Nginx caching. If the biggest user of our biggest site's application backend is using Varnish, we should use it too. And I found good WordPress plugins that play nicely with Varnish as well.
Useful Commands
Below are some useful commands for working with varnish on our server:
# check for valid configuration
varnishd -Cf /etc/varnish/default.vcl

# reload configuration
service varnish reload

# check for valid config + reload if OK
varnishd -Cf /etc/varnish/default.vcl &> /dev/null && service varnish reload

# see current varnish requests
varnishlog

# see varnish requests for a specific client ip address
varnishlog -q "ReqHeader eq 'X-Forwarded-For: 209.208.216.133'"

# see recent varnish statistics
varnishstat

# purge the entire varnish cache for all vhosts
varnishadm 'ban req.url ~ "."'

# purge the varnish cache for urls containing a specific string
varnishadm 'ban req.http.host ~ "www.opensourceecology.org"'
varnishadm 'ban req.url ~ "css"'
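Note that the last two bans above are independent: the first invalidates everything for that host, and the second invalidates matching URLs on every vhost. To limit a purge to matching URLs on a single vhost, the two conditions can be combined with `&&` in one ban expression. A minimal sketch, where `ban_host_urls` is a hypothetical helper name:

```shell
# Ban cached objects whose URL matches a regex, but only for one vhost.
# $1 = exact Host header value, $2 = regex matched against the URL
ban_host_urls() {
  host="$1"
  pattern="$2"
  varnishadm "ban req.http.host == \"${host}\" && req.url ~ \"${pattern}\""
}

# Example usage (as root on the server):
#   ban_host_urls www.opensourceecology.org css
```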
Important Files & Directories
For more information about our varnish web server's configuration, please see the following files & directories on the server:
/etc/varnish/
Apache
This section will describe our use of the Apache server in our web server configuration, which serves as our backend application web server.
Why Apache?
While Apache is not without its issues, it is extraordinarily popular. If at any time we were to ask all of the active OSE devs which web servers they have experience with, probably more hands would go up for Apache than for any other web server. This keeps the barrier to entry low, which is extremely important when choosing the software to run a long-lived nonprofit with short-lived volunteers.
Debugging Apache Directly
Sometimes when debugging a site, it may be useful to isolate tests to just apache, in order to eliminate potential issues with nginx/https or the varnish cache. This section will describe how to use ssh tcp port forwarding to test a vhost on apache directly over 127.0.0.1:8000.
Step 1: /etc/hosts
Edit the hosts file on your workstation to point the domain you're testing to 127.0.0.1.
user@workstation:~$ cat /etc/hosts
127.0.0.1 opensourceecology.org www.opensourceecology.org
...
Step 2: SSH Port Forward
Forward your workstation's 127.0.0.1:80 to the server's 127.0.0.1:8000. Run this on your workstation.
user@workstation:~$ sudo sh -c 'ssh -F /home/${SUDO_USER}/.ssh/config -p 32415 -L 80:127.0.0.1:8000 openbuildinginstitute.org'
Note that, because we're binding a port <1024 on the workstation, this requires administrator privileges on the workstation, so we use sudo. A side effect is that we have to explicitly tell ssh to use the normal user's ssh config.
Step 3: Visit site in Browser
You should now be able to access the website from your workstation's browser. SSH is listening for traffic on your workstation's port 80 & seamlessly forwarding it to 127.0.0.1:8000 on the server. Therefore, you're hitting Apache directly without going through nginx or varnish.
For example, open "http://www.opensourceecology.org" in your browser.
Tip: Use private/incognito browsing to avoid cached DNS addresses.
Note: In order for this to work, the protocol (http:// or https://) must *not* be specified in the wp-config.php file's WP_HOME & WP_SITEURL variables.
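For a quick command-line check that does not require editing /etc/hosts or using sudo, you can forward an unprivileged local port instead and set the Host header yourself with curl. This is a sketch under assumptions: the local port 8000 is an arbitrary choice that mirrors the server side, and `apache_head` is a hypothetical helper name. It only shows the response headers, so the browser method above is still needed for real page testing.

```shell
# Send a HEAD request for a vhost straight to the forwarded Apache port.
# $1 = vhost name, $2 = local forwarded port (defaults to 8000, an assumption)
apache_head() {
  curl -sI -H "Host: $1" "http://127.0.0.1:${2:-8000}/"
}

# Example usage, in two terminals on the workstation:
#   ssh -p 32415 -L 8000:127.0.0.1:8000 openbuildinginstitute.org
#   apache_head www.opensourceecology.org
```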
Useful Commands
This section will describe useful commands when working with Apache.
# see all running vhosts
httpd -S
mod_security
Our OSE Server uses mod_security & the OWASP Core Rule Set (CRS) for additional web application security in Apache. This can interfere with some applications' normal & expected behaviour. If mod_security is blocking requests, your browser's debugger will show "403 Forbidden" responses to your requests. These will correspond to entries in the mod_security log file at /var/log/httpd/modsec_audit.log. Below is an example entry from modsec_audit.log:
--df82886e-A--
[11/Aug/2017:22:56:32 +0000] WY42IEb1WWRl5vtNXLPk4QAAAA4 216.244.66.245 41996 138.201.84.223 80
--df82886e-B--
GET /?s=%E5%B0%8F%E6%81%92%E6%8C%8720%E5%85%83%E6%89%8B%E7%BB%AD%E8%B4%B9%E5%A4%9A%E5%B0%91cpyx18.com HTTP/1.1
Host: openbuildinginstitute.org
Accept: */*
User-agent: Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, help@moz.com)
Accept-Charset: utf-8;q=0.7,iso-8859-1;q=0.2,*;q=0.1
--df82886e-F--
HTTP/1.1 403 Forbidden
X-Frame-Options: SAMEORIGIN
Last-Modified: Thu, 16 Oct 2014 13:20:58 GMT
Accept-Ranges: bytes
Content-Length: 4897
X-XSS-Protection: 1; mode=block
Content-Type: text/html; charset=UTF-8
--df82886e-H--
Message: Access denied with code 403 (phase 2). Pattern match "\\W{4,}" at ARGS:s. [file "/etc/httpd/modsecurity.d/activated_rules/modsecurity_crs_40_generic_attacks.conf"] [line "37"] [id "960024"] [rev "2"] [msg "Meta-Character Anomaly Detection Alert - Repetative Non-Word Characters"] [data "Matched Data: \xe5\xb0\x8f\xe6\x81\x92\xe6\x8c\x87 found within ARGS:s: \xe5\xb0\x8f\xe6\x81\x92\xe6\x8c\x8720\xe5\x85\x83\xe6\x89\x8b\xe7\xbb\xad\xe8\xb4\xb9\xe5\xa4\x9a\xe5\xb0\x91cpyx18.com"] [ver "OWASP_CRS/2.2.9"] [maturity "9"] [accuracy "8"]
Action: Intercepted (phase 2)
Stopwatch: 1502492192808118 605 (- - -)
Stopwatch2: 1502492192808118 605; combined=235, p1=145, p2=76, p3=0, p4=0, p5=14, sr=42, sw=0, l=0, gc=0
Producer: ModSecurity for Apache/2.7.3 (http://www.modsecurity.org/); OWASP_CRS/2.2.9.
Server: Apache
Engine-Mode: "ENABLED"
--df82886e-Z--
The above entry shows that mod_security rule id 960024 blocked a request to openbuildinginstitute.org because the request contained an anomaly of "Repetative Non-Word Characters". In this case, the block appears valid. If a block is a false positive, you can disable the offending rules by id in the apache vhost file, for example /etc/httpd/conf.d/00-openbuildinginstitute.org.conf:
<Location "/wp-admin/">
  <IfModule security2_module>
    SecRuleRemoveById 960015 981173 960024 960904 960015 960017
  </IfModule>
</Location>
Or, if needed, disable mod_security for the entire vhost:
<Location "/">
  <IfModule security2_module>
    SecRuleEngine Off
  </IfModule>
</Location>
But try not to disable mod_security entirely.
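Before removing any rule, it helps to see which rule ids are actually firing and how often. The sketch below counts the `[id "..."]` tags found in audit log text; `modsec_rule_counts` is a hypothetical helper name, and the output only tells you which rules triggered, not whether each hit was a false positive.

```shell
# Read mod_security audit log text on stdin and print "<rule id> <count>",
# most frequently triggered rule first.
modsec_rule_counts() {
  grep -o '\[id "[0-9]*"\]' \
    | tr -cd '0-9\n' \
    | sort | uniq -c | sort -rn \
    | awk '{print $2, $1}'
}

# Example usage (as root on the server):
#   modsec_rule_counts < /var/log/httpd/modsec_audit.log | head
```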
Web Applications
Our apache server runs the following Web Applications. Please see their corresponding articles for more info:
Important Files & Directories
For more information about our apache web server's configuration, please see the following files & directories on the server:
/etc/httpd/conf/httpd.conf
/etc/httpd/conf.d/
/var/www/html/<vhost>/
/var/log/httpd/
/var/log/httpd/modsec_audit.log
/var/log/httpd/<vhost>/
Logrotate
Logrotate is an essential tool for any production server. If log files aren't rotated, sooner or later your disks will fill and the server will malfunction.
We have configured logrotate to manage the log files on the server, including those written by all the software described in this document.
Important Files & Directories
For more information about our logrotate configuration, please see the following files & directories on the server:
/etc/logrotate.conf
/etc/logrotate.d/
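After changing any of these files, logrotate's debug mode (`-d`) can be used to preview what would be rotated without modifying any log files. A minimal sketch, where `logrotate_dry_run` is a hypothetical helper name:

```shell
# Show what logrotate would do for a config file without rotating anything.
# $1 = config file (defaults to the main config)
logrotate_dry_run() {
  logrotate -d "${1:-/etc/logrotate.conf}" 2>&1
}

# Example usage (as root on the server):
#   logrotate_dry_run | less
#   logrotate_dry_run /etc/logrotate.d/nginx   # assumes a file with this name exists
```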
awstats
See Awstats