Web server configuration
Architecture
Our http(s) content is served using the following web servers:
- Nginx
- Varnish
- Apache
The traffic flows in order, i.e.: Internet -> nginx -> varnish -> apache
And then back out to the client in the reverse order: apache -> varnish -> nginx -> Internet
Why??
OSE's principles aim for simplicity, so you might ask, "Why aren't we simply using only Apache? Why Nginx as well? And why Varnish?" Great question!
Before simplicity, OSE is radically committed to using FLOSS. We're also an ecologically-aware, low-budget non-profit with limited financial & computational resources. Keeping this in mind, below are the reasons for the complexity described in this documentation:
- Varnish is a cache. It's an essential component that allows us to serve a very high volume of requests on many websites from a single server. Unfortunately, the free version of Varnish does not speak https.
- Nginx is our tls-terminator. It listens to our encrypted traffic over https & passes unencrypted http traffic internally onto varnish.
- Nginx has great DoS protection and rate-limiting built-in.
- Nginx being distinct from Apache gives us the ability to serve a SITE_DOWN vhost to users for a specific domain, while devs are still able to iterate & test changes to the backend Apache server. Note that we have only 1 dedicated host, and we don't have a load balancer.
- Most people who use https + varnish specifically use Nginx to terminate their https. Therefore, this architecture has better documentation & a larger user support base.
- History. I (Michael Altfield) came on board in 2017 with only Apache running. I added https to protect our users' passwords, which were being sent in cleartext. In doing so, I had to abandon CloudFront so that a third party didn't have our private keys. At the time, CF was our CDN cache, so I had to implement a self-hosted cache. I chose varnish, and then had to add nginx in front of it for https termination.
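The division of labour described above can be illustrated with a toy model. This is only a sketch in plain Python, not our actual configuration: the function names, the example.org URL, and the rule that non-https requests are rejected are all assumptions made for illustration. It shows the one property that matters here: the cache sits between the TLS terminator and the backend, so repeat requests never reach the backend.

```python
from typing import Callable, Dict

# A handler takes a request string and returns a response body.
Handler = Callable[[str], str]


def make_backend(counter: Dict[str, int]) -> Handler:
    """Stand-in for Apache: does the 'expensive' work of building a page."""
    def backend(host_and_path: str) -> str:
        counter["backend_hits"] += 1
        return f"<html>page for {host_and_path}</html>"
    return backend


def make_cache(upstream: Handler) -> Handler:
    """Stand-in for Varnish: answers repeat requests from memory."""
    store: Dict[str, str] = {}

    def cache(host_and_path: str) -> str:
        if host_and_path not in store:
            # Cache miss: ask the backend once and remember the answer.
            store[host_and_path] = upstream(host_and_path)
        return store[host_and_path]
    return cache


def make_tls_terminator(upstream: Handler) -> Handler:
    """Stand-in for Nginx: accepts https from outside, passes plain requests inward."""
    prefix = "https://"

    def terminator(url: str) -> str:
        if not url.startswith(prefix):
            # Assumption of this toy model only; real behaviour may differ.
            raise ValueError("only https is accepted from the Internet")
        return upstream(url[len(prefix):])
    return terminator


# Internet -> nginx -> varnish -> apache
counter = {"backend_hits": 0}
site = make_tls_terminator(make_cache(make_backend(counter)))

for _ in range(1000):
    site("https://example.org/wiki/Main_Page")

print("requests served: 1000, backend hits:", counter["backend_hits"])
```

A real cache also has to deal with expiry, cookies, and purging, none of which this model attempts.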
Nginx
This section will describe our use of the Nginx server in our web server configuration, which serves as our https terminator, basic DoS protection, and SITE_DOWN tool.
Why Nginx?
Important Files & Directories
https
Let's Encrypt
HPKP
HTTP Public Key Pinning (HPKP) can brick your domain if not done properly. For safety, 14 keys were pinned following the Let's Encrypt HPKP Best Practices Guide, including:
- Two distinct, pre-generated backup keys' CSRs @ /etc/pki/tls/hpkpBackupKeys/
- Our leaf certificate issued by Let's Encrypt using certbot @ /etc/letsencrypt/live/opensourceecology.org/cert.pem
- The intermediate Let's Encrypt certificate that signed our certificate @ /etc/letsencrypt/live/opensourceecology.org/chain.pem
- The Internet Security Research Group (ISRG) Root Certificate for Let's Encrypt
- The IdenTrust Root Certificate, which cross-signed the Let's Encrypt Root Certificate
- In case Let's Encrypt is no longer usable in the future, all root certificates & the root certificates of their cross-signers for CloudFlare, since they offer free certificates. This includes digicert, addtrust, globalsign, and gtecybertrust (now digicert)
- In case Let's Encrypt is no longer usable in the future, all root certificates & the root certificates of their cross-signers for SSL.com, since they offer free certificates for 90 days.
Moreover, Apache was configured with a report-uri, which can be checked on the server to debug potential client-side HPKP issues:
report-uri="http://opensourceecology.org/hpkp-report"
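For reference, an HPKP pin is the base64 encoding of the SHA-256 digest of a key's DER-encoded SubjectPublicKeyInfo (RFC 7469). The sketch below shows how a pin and a Public-Key-Pins header value could be assembled from such bytes. The function names, the max-age value, and the placeholder byte strings are assumptions for illustration; real input would be extracted from each certificate or CSR, for example with `openssl x509 -in cert.pem -pubkey -noout | openssl pkey -pubin -outform der`.

```python
import base64
import hashlib
from typing import Iterable


def spki_pin(spki_der: bytes) -> str:
    """Return the pin-sha256 value for one DER-encoded SubjectPublicKeyInfo."""
    digest = hashlib.sha256(spki_der).digest()
    return base64.b64encode(digest).decode("ascii")


def hpkp_header(spki_ders: Iterable[bytes], max_age: int, report_uri: str) -> str:
    """Build a Public-Key-Pins header line from several public keys."""
    pins = [f'pin-sha256="{spki_pin(der)}"' for der in spki_ders]
    if len(pins) < 2:
        # RFC 7469 requires a backup pin in addition to a pin in the served chain.
        raise ValueError("at least two pins (one of them a backup) are required")
    directives = pins + [f"max-age={max_age}", f'report-uri="{report_uri}"']
    return "Public-Key-Pins: " + "; ".join(directives)


# PLACEHOLDER bytes, not real keys: substitute the DER output of openssl.
example_keys = [b"placeholder-spki-leaf", b"placeholder-spki-backup"]

header = hpkp_header(
    example_keys,
    max_age=3600,  # hypothetical value, not taken from our configuration
    report_uri="http://opensourceecology.org/hpkp-report",
)
print(header)
```

Pinning the public key rather than the certificate means a pin stays valid when a certificate is renewed with the same key.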
Varnish
This section will describe our use of the Varnish server in our web server configuration, which serves as our in-memory cache.
Why Varnish?
Our biggest site is this wiki (running on MediaWiki). As of 2017, Wikipedia (the largest site running on MediaWiki) has chosen Varnish as its cache of choice, after experimenting with Squid & Nginx caching. If the biggest user of our biggest site's application backend is using Varnish, we should use it too. I also found good WordPress plugins that play nicely with Varnish.
Important Files & Directories
Apache
This section will describe our use of the Apache server in our web server configuration, which serves as our backend application web server.
Why Apache?
mod_security
Our OSE Server uses mod_security & the CRS for additional web application security. This can interfere with some applications' normal & expected behaviour. If mod_security is blocking requests, your browser's debugger will show "403 Forbidden" responses to your requests. These will correspond to log entries in the mod_security log file at /var/log/httpd/modsec_audit.log. Below is an example entry from modsec_audit.log:
--df82886e-A--
[11/Aug/2017:22:56:32 +0000] WY42IEb1WWRl5vtNXLPk4QAAAA4 216.244.66.245 41996 138.201.84.223 80
--df82886e-B--
GET /?s=%E5%B0%8F%E6%81%92%E6%8C%8720%E5%85%83%E6%89%8B%E7%BB%AD%E8%B4%B9%E5%A4%9A%E5%B0%91cpyx18.com HTTP/1.1
Host: openbuildinginstitute.org
Accept: */*
User-agent: Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, help@moz.com)
Accept-Charset: utf-8;q=0.7,iso-8859-1;q=0.2,*;q=0.1
--df82886e-F--
HTTP/1.1 403 Forbidden
X-Frame-Options: SAMEORIGIN
Last-Modified: Thu, 16 Oct 2014 13:20:58 GMT
Accept-Ranges: bytes
Content-Length: 4897
X-XSS-Protection: 1; mode=block
Content-Type: text/html; charset=UTF-8
--df82886e-H--
Message: Access denied with code 403 (phase 2). Pattern match "\\W{4,}" at ARGS:s. [file "/etc/httpd/modsecurity.d/activated_rules/modsecurity_crs_40_generic_attacks.conf"] [line "37"] [id "960024"] [rev "2"] [msg "Meta-Character Anomaly Detection Alert - Repetative Non-Word Characters"] [data "Matched Data: \xe5\xb0\x8f\xe6\x81\x92\xe6\x8c\x87 found within ARGS:s: \xe5\xb0\x8f\xe6\x81\x92\xe6\x8c\x8720\xe5\x85\x83\xe6\x89\x8b\xe7\xbb\xad\xe8\xb4\xb9\xe5\xa4\x9a\xe5\xb0\x91cpyx18.com"] [ver "OWASP_CRS/2.2.9"] [maturity "9"] [accuracy "8"]
Action: Intercepted (phase 2)
Stopwatch: 1502492192808118 605 (- - -)
Stopwatch2: 1502492192808118 605; combined=235, p1=145, p2=76, p3=0, p4=0, p5=14, sr=42, sw=0, l=0, gc=0
Producer: ModSecurity for Apache/2.7.3 (http://www.modsecurity.org/); OWASP_CRS/2.2.9.
Server: Apache
Engine-Mode: "ENABLED"
--df82886e-Z--
The above request shows that mod_security rule id 960024 blocked a request to openbuildinginstitute.org because the request contained an anomaly of "Repetative Non-Word Characters". In this case, the block appears valid. If a block is a false positive, you can disable the offending rules by id in the Apache vhost file, such as /etc/httpd/conf.d/obi.conf:
<Location "/wp-admin/">
    <IfModule security2_module>
        SecRuleRemoveById 960015 981173 960024 960904 960015 960017
    </IfModule>
</Location>
Or, if needed, disable mod_security for the entire vhost:
<Location "/">
    <IfModule security2_module>
        SecRuleEngine Off
    </IfModule>
</Location>
But try not to disable mod_security entirely.
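Before disabling anything, it helps to see which rule ids are actually firing and for which vhost. The following is a rough sketch, not an existing OSE tool: it assumes the audit log uses the serial, multi-line layout in which each section starts with a `--<id>-<letter>--` boundary line, and the function name and the abbreviated sample entry (cut down from the example in this section) are illustrative only.

```python
import re
from collections import Counter
from typing import Iterable

# Section boundary such as "--df82886e-B--"; the letter names the section.
BOUNDARY = re.compile(r"^--[0-9a-fA-F]+-([A-Z])--$")
HOST = re.compile(r"^Host:\s*(\S+)", re.IGNORECASE)
RULE_ID = re.compile(r'\[id "(\d+)"\]')


def count_blocked_rules(lines: Iterable[str]) -> Counter:
    """Count (host, rule id) pairs found in ModSecurity audit log lines."""
    counts: Counter = Counter()
    host = "unknown"
    section = ""
    for raw in lines:
        line = raw.strip()
        boundary = BOUNDARY.match(line)
        if boundary:
            section = boundary.group(1)
            if section == "A":
                host = "unknown"  # a new entry starts; forget the old host
            continue
        if section == "B":
            # Section B holds the request headers, including Host.
            found = HOST.match(line)
            if found:
                host = found.group(1)
        elif section == "H" and line.startswith("Message:"):
            # Section H holds one Message line per rule that matched.
            for rule_id in RULE_ID.findall(line):
                counts[(host, rule_id)] += 1
    return counts


# Abbreviated sample entry for demonstration; real entries carry more fields.
SAMPLE = r'''--df82886e-A--
[11/Aug/2017:22:56:32 +0000] WY42IEb1WWRl5vtNXLPk4QAAAA4 216.244.66.245 41996 138.201.84.223 80
--df82886e-B--
GET /?s=example HTTP/1.1
Host: openbuildinginstitute.org
--df82886e-H--
Message: Access denied with code 403 (phase 2). Pattern match "\\W{4,}" at ARGS:s. [id "960024"] [rev "2"]
Action: Intercepted (phase 2)
--df82886e-Z--
'''

for (host, rule_id), hits in count_blocked_rules(SAMPLE.splitlines()).most_common():
    print(f"{hits}\t{host}\t{rule_id}")
```

To run it against the real log, pass an open file object for /var/log/httpd/modsec_audit.log (opened with `errors="replace"`, since requests may contain arbitrary bytes) instead of the sample lines. A rule id that shows up only on legitimate admin requests for one vhost is a candidate for SecRuleRemoveById in that vhost's Location block.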