Objectives
I have a DigitalOcean droplet and wanted to:
- Install Nginx;
- Put it in front of two distinct web services running on this server;
- Enable HTTP Basic Auth on some URIs;
- One of these services dynamically creates HTTP ports, e.g. service2.example.com:20001. I wanted to map each such port to a name, like 20001.service2.example.com.
OS
Ubuntu 20.04 on a DigitalOcean droplet.
DNS Config
I created the following A records, all pointing to the droplet's static IP:
service1.example.com
service2.example.com
*.service2.example.com
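Before touching Nginx, it is worth confirming the records resolve; a quick check with dig (hostnames are the example ones above):
# each query should return the droplet's static IP
dig +short service1.example.com
dig +short service2.example.com
dig +short 20001.service2.example.com   # answered by the *.service2.example.com wildcard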
Nginx install
Basically:
sudo apt update
sudo apt install nginx
More details here
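On Ubuntu the package starts and enables the service automatically; a quick sanity check:
# confirm the service is running and check the installed version
systemctl status nginx --no-pager
nginx -v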
Nginx configuration
The basic reverse proxy configuration was created with the help of DigitalOcean's online Nginx config generator (nginxconfig.io).
Some nice command snippets to highlight:
# In the Nginx configuration directory (/etc/nginx), create a backup of your current configuration:
tar -czvf nginx_$(date +'%F_%H-%M-%S').tar.gz nginx.conf sites-available/ sites-enabled/ nginxconfig.io/
# Extract the new compressed configuration archive using tar:
tar -xzvf nginxconfig.io-wiki.uniprojetec.com.br.tar.gz | xargs chmod 0644
# Reload the Nginx configuration
sudo systemctl reload nginx
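Before reloading, it is worth validating the new configuration for syntax errors:
# test the configuration; reload only if this reports "syntax is ok"
sudo nginx -t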
Basic Auth
First, we need to install htpasswd, which is included in apache2-utils:
sudo apt update
sudo apt install apache2-utils
To enable HTTP Basic Auth, just add the auth_basic and auth_basic_user_file lines inside an Nginx location block:
location / {
    proxy_pass http://127.0.0.1:6875;
    include nginxconfig.io/proxy.conf;
    auth_basic           "Restricted";
    auth_basic_user_file /etc/nginx/.htpasswd;
}
Then, add user credentials to the specified file, /etc/nginx/.htpasswd:
cd /etc/nginx
# -c creates the file; omit -c when adding further users to an existing file
htpasswd -c .htpasswd user-name
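A quick way to confirm the protection works, assuming the protected service is the one published at service1.example.com (user-name is the example user created above):
# without credentials Nginx should answer 401 Unauthorized
curl -s -o /dev/null -w "%{http_code}\n" http://service1.example.com/
# with valid credentials (curl prompts for the password) the request reaches the backend
curl -s -o /dev/null -w "%{http_code}\n" -u user-name http://service1.example.com/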
Robots.txt and X-Robots-Tag
Now that the website has public DNS pointing to it, Google and other search engine crawlers will try to index it. However, this site should be kept private and not appear in Google search results. For this purpose, robots.txt and noindex exist: robots.txt is a text file served at the root of the site, and noindex may be an HTML meta tag or an HTTP header.
When a crawler gathers page information, it first tries to fetch robots.txt and, based on its contents, will respectfully follow its directives, for instance ignoring pages under a given path.
However, we do not even need to create a robots.txt file, because Nginx can serve it directly with the following rule:
location = /robots.txt {
    log_not_found off;
    access_log off;
    add_header Content-Type text/plain;
    return 200 "User-agent: *\nDisallow: /\n";
}
We must comment out the following lines in /etc/nginx/nginxconfig.io/general.conf to prevent a duplicate location conflict:
# robots.txt
# location = /robots.txt {
#     log_not_found off;
#     access_log off;
# }
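After reloading Nginx, the generated robots.txt can be verified from outside (again using the example hostname):
# should print the blanket disallow served directly by the Nginx rule:
#   User-agent: *
#   Disallow: /
curl -s http://service1.example.com/robots.txt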
Regarding noindex, the following rule adds the X-Robots-Tag header to all resources (pages and files):
# reverse proxy
location / {
    add_header X-Robots-Tag "noindex, nofollow, nosnippet, noarchive";
    proxy_pass http://127.0.0.1:6875;
    include nginxconfig.io/proxy.conf;
    auth_basic           "Restricted";
    auth_basic_user_file /etc/nginx/.htpasswd;
}
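To confirm the header is actually being sent (example hostname and user as above; by default add_header only applies to 2xx/3xx responses, so valid credentials are needed to get past the 401):
# the response headers should include the X-Robots-Tag line configured above
curl -sI -u user-name http://service1.example.com/ | grep -i x-robots-tag
If the header should also appear on error responses such as the 401, the always parameter can be appended to the add_header directive.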
Dynamic Port to URL mapping
The following config maps port-prefixed hostnames to local ports and reverse proxies them. For instance, 1234.service2.example.com is mapped to 127.0.0.1:1234.
server {
    listen 80;
    listen [::]:80;
    server_name ~^(\d+)\.service2\.example\.com$;

    # security
    include nginxconfig.io/security.conf;

    # reverse proxy
    location / {
        proxy_pass http://127.0.0.1:$1;
        include nginxconfig.io/proxy.conf;
    }

    # additional config
    include nginxconfig.io/general.conf;
    include nginxconfig.io/robots-disallow.conf;
}
This approach is preferable to the alternative of listening on a whole port range with the listen directive, because the latter opens sockets regardless of incoming connections and may exhaust Nginx's or the OS's socket limits.
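A simple way to exercise the mapping end to end, assuming the *.service2.example.com wildcard record is in place (the Python server below is just a throwaway backend for the test):
# on the droplet: start a throwaway backend on a dynamic port
python3 -m http.server 20001 --bind 127.0.0.1 &
# from anywhere: the port-prefixed hostname is proxied to that backend
curl -s http://20001.service2.example.com/ | head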