
HTTP X-Forwarded-For Introduction

X-Forwarded-For (XFF) is an HTTP extension header. The HTTP/1.1 specification (RFC 2616) does not define it; it was first introduced by Squid, a caching proxy, to carry the real IP of the client that originated an HTTP request. It has since become a de facto standard, widely used by HTTP proxies, load balancers, and other forwarding services. RFC 7239 (Forwarded HTTP Extension) later standardized the same information in a new Forwarded header, while X-Forwarded-For itself remains a convention rather than a formal standard.

The format of the X-Forwarded-For request header is very simple, like this:

X-Forwarded-For: client, proxy1, proxy2

As you can see, the content of XFF consists of multiple parts separated by "comma + space," starting with the IP of the device farthest from the server, followed by the IP of each proxy device.

If an HTTP request passes through three proxies (Proxy1, Proxy2, Proxy3) with IPs IP1, IP2, and IP3 respectively before reaching the server, and the real user IP is IP0, then by the XFF convention the server receives the following header:

X-Forwarded-For: IP0, IP1, IP2

Proxy3 connects directly to the server and appends IP2 to XFF to indicate that it is forwarding the request on behalf of Proxy2. IP3 does not appear in XFF, but the server can obtain it from the Remote Address. HTTP runs on top of a TCP connection, and the HTTP protocol itself has no concept of an IP address. Remote Address comes from the TCP connection and is the IP of the device that established the TCP connection with the server. In this example, it is IP3.
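As a rough illustration of the chain described above, the following sketch splits an X-Forwarded-For value into its parts and appends the Remote Address to reconstruct the full path. The function names parseXForwardedFor and forwardingChain are hypothetical, and IP0 through IP3 are the placeholders from the example rather than real addresses.

```javascript
// Hypothetical helper: split an X-Forwarded-For value into its parts.
function parseXForwardedFor(headerValue) {
    if (typeof headerValue !== 'string') {
        return [];
    }
    return headerValue
        .split(',')
        .map(function (part) { return part.trim(); })
        .filter(function (part) { return part.length > 0; });
}

// Hypothetical helper: the parts of XFF followed by the TCP peer address,
// i.e. the path of the request as seen from the server.
function forwardingChain(headerValue, remoteAddress) {
    return parseXForwardedFor(headerValue).concat([remoteAddress]);
}

// Placeholders from the example: client IP0, then Proxy1..Proxy3.
console.log(forwardingChain('IP0, IP1, IP2', 'IP3'));
```

Only the final element comes from the TCP connection; everything before it is whatever the header happened to contain.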

Remote Address cannot be forged in practice, because establishing a TCP connection requires a three-way handshake. If the source IP is forged, the server's reply goes to the forged address, the handshake never completes, and no HTTP request can follow. Different languages have different ways to obtain Remote Address. For example, PHP uses $_SERVER["REMOTE_ADDR"], and Node.js uses req.socket.remoteAddress (req.connection is an older, deprecated alias for the same socket), but the principle is the same.


Issue

With that background, let's look at the issue. I wrote a simple web server in Node.js for testing. HTTP is language-independent; Node.js is used here only for demonstration, and any other language leads to the same conclusions. Likewise, this article uses Nginx as the reverse proxy, but you can substitute Apache or another web server if you prefer.

The following code listens on port 9009 and outputs some information upon receiving an HTTP request:

var http = require('http');

http.createServer(function (req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});
    // IP of the TCP peer; req.socket replaces the deprecated req.connection alias
    res.write('remoteAddress: ' + req.socket.remoteAddress + '\n');
    // Both headers are client-controlled unless a trusted proxy sets them
    res.write('x-forwarded-for: ' + req.headers['x-forwarded-for'] + '\n');
    res.write('x-real-ip: ' + req.headers['x-real-ip'] + '\n');
    res.end();
}).listen(9009, '0.0.0.0');

In addition to Remote Address and X-Forwarded-For, introduced above, the code also prints X-Real-IP, another custom header. An HTTP proxy typically uses X-Real-IP to report the IP of the device that established the TCP connection with it, which may be another proxy or the actual request origin. Note that X-Real-IP is not part of any standard; a proxy and a web application can agree on any custom header to pass this information.

The Node.js service can now be accessed directly using the domain name plus port number. Next, set up an Nginx reverse proxy in front of it:

location / {
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Host $http_host;
    proxy_set_header X-NginX-Proxy true;

    proxy_pass http://127.0.0.1:9009/;
    proxy_redirect off;
}

My Nginx listens on port 80, so I can access the service forwarded by Nginx without specifying a port.

Testing direct access to the Node service:

curl http://t1.imququ.com:9009/

remoteAddress: 114.248.238.236
x-forwarded-for: undefined
x-real-ip: undefined

Since my computer is directly connected to the Node.js service, the Remote Address is my IP. Also, I did not specify any additional custom headers, so the last two fields are undefined.

Now let's access the service forwarded by Nginx:

curl http://t1.imququ.com/

remoteAddress: 127.0.0.1
x-forwarded-for: 114.248.238.236
x-real-ip: 114.248.238.236

This time, my computer accesses the Node.js service through Nginx, and the obtained Remote Address is actually the local IP of Nginx. The following two lines in the Nginx configuration take effect, adding two custom headers to the request:

proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

In production, web applications are generally deployed the second way, behind a reverse proxy, which has many advantages. However, this introduces a hidden risk: many web applications read the user's real IP from the HTTP request headers.

HTTP request headers can be arbitrarily constructed. We use the -H parameter of curl to construct X-Forwarded-For and X-Real-IP and test again.

Direct access to the Node.js service:

curl http://t1.imququ.com:9009/ -H 'X-Forwarded-For: 1.1.1.1' -H 'X-Real-IP: 2.2.2.2'

remoteAddress: 114.248.238.236
x-forwarded-for: 1.1.1.1
x-real-ip: 2.2.2.2

For the web application, X-Forwarded-For and X-Real-IP are just two ordinary request headers, so they are output as is without any processing. This indicates that for direct deployment, except for the Remote Address obtained from the TCP connection, any IP information carried in the request headers cannot be trusted.

Accessing the service forwarded by Nginx:

curl http://t1.imququ.com/ -H 'X-Forwarded-For: 1.1.1.1' -H 'X-Real-IP: 2.2.2.2'

remoteAddress: 127.0.0.1
x-forwarded-for: 1.1.1.1, 114.248.238.236
x-real-ip: 114.248.238.236

This time, Nginx appends my IP to X-Forwarded-For and overwrites the X-Real-IP request header with my IP. In other words, once a request has passed through Nginx, the last part of X-Forwarded-For and the entire content of X-Real-IP cannot be forged by the client and can be used to obtain the user's IP.
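The behaviour observed in the two Nginx tests can be modelled in a few lines of JavaScript. This is only a sketch of what the two proxy_set_header lines do, not Nginx's implementation; buildForwardedHeaders is a hypothetical name, and the incoming header object is assumed to use lowercase keys, as Node.js does.

```javascript
// Hypothetical model of:
//   proxy_set_header X-Real-IP $remote_addr;
//   proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
function buildForwardedHeaders(incomingHeaders, remoteAddr) {
    var existing = incomingHeaders['x-forwarded-for'];
    return {
        // Append the TCP peer address to whatever the client sent,
        // or use the peer address alone when the header is absent.
        'x-forwarded-for': existing ? existing + ', ' + remoteAddr : remoteAddr,
        // Overwrite unconditionally: the client-supplied value is discarded.
        'x-real-ip': remoteAddr
    };
}

// The forged request from the test above.
console.log(buildForwardedHeaders(
    { 'x-forwarded-for': '1.1.1.1', 'x-real-ip': '2.2.2.2' },
    '114.248.238.236'
));
```

The difference between appending and overwriting is why the forged 1.1.1.1 survives at the front of X-Forwarded-For while the forged 2.2.2.2 disappears from X-Real-IP.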

User IP is often used in security-related scenarios, such as checking a user's login location or rate limiting by IP. In these scenarios it matters most that the IP cannot be forged. Based on the tests and analysis above: a web application that serves users directly must use the Remote Address obtained from the TCP connection; a web application behind an Nginx reverse proxy, with the proxy_set_header directives configured correctly, can use X-Real-IP or the last part of X-Forwarded-For (in this setup the two are equivalent).

So how does the web application itself determine whether a request arrived directly or was forwarded by a proxy you control? Having the proxy add an extra request header is one way, but it is not very secure, because request headers are too easy to forge. If you must use this method, the custom header should be long and obscure, and it must be kept confidential.

Another method is to check whether the Remote Address is the local IP. This is not perfect either: for a request made from the server itself, the Remote Address is 127.0.0.1 whether the request goes directly to the application or through Nginx. That case is usually negligible. The bigger problem is that the reverse proxy and the web application may not be deployed on the same server. A more reasonable approach is therefore to maintain a list of all proxy server IPs and have the web application compare the Remote Address against that list to determine how the request arrived.
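A minimal sketch of the proxy-list approach might look like the following. The names TRUSTED_PROXIES, normalizeAddress, and trustedClientIp are hypothetical, the addresses in the list are made-up examples, and the code assumes every trusted proxy appends its peer address to X-Forwarded-For the way the Nginx configuration above does.

```javascript
// Hypothetical list of reverse proxy addresses under your control.
var TRUSTED_PROXIES = ['127.0.0.1', '10.0.0.5'];

// Dual-stack sockets may report an IPv4 peer as "::ffff:a.b.c.d".
function normalizeAddress(addr) {
    return String(addr || '').replace(/^::ffff:/i, '');
}

// Returns the IP to use for security decisions.
function trustedClientIp(remoteAddress, headers, trustedProxies) {
    var remote = normalizeAddress(remoteAddress);
    if (trustedProxies.indexOf(remote) === -1) {
        // Direct access: request headers prove nothing, use the TCP peer.
        return remote;
    }
    var xff = headers['x-forwarded-for'];
    if (typeof xff !== 'string' || xff.trim() === '') {
        return remote;
    }
    // Forwarded by a trusted proxy: only the last part was written by it.
    var parts = xff.split(',');
    return parts[parts.length - 1].trim() || remote;
}

console.log(trustedClientIp(
    '127.0.0.1',
    { 'x-forwarded-for': '1.1.1.1, 114.248.238.236' },
    TRUSTED_PROXIES
));
```

With several trusted proxies chained in front of the application, a single "last part" is no longer enough; this sketch deliberately covers only the one-proxy deployment tested in this article.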

To simplify the logic, production environments typically block direct access to the web application's port and only allow access through Nginx. Does that solve the problem? Not necessarily.

First, if the user really does reach Nginx through a proxy, the last part of X-Forwarded-For and X-Real-IP will both contain that proxy's IP. Security-related scenarios have no choice but to use this value. Other scenarios, such as displaying the weather based on the user's location, want the user's real IP as far as possible, and there the first IP in X-Forwarded-For can be useful. There is a pitfall to be aware of here, shown with the same setup as before:

curl http://t1.imququ.com/ -H 'X-Forwarded-For: unknown, <>"1.1.1.1'

remoteAddress: 127.0.0.1
x-forwarded-for: unknown, <>"1.1.1.1, 114.248.238.236
x-real-ip: 114.248.238.236

The last part of X-Forwarded-For is appended by Nginx, but the earlier parts come from the request headers that Nginx received, which are completely untrusted user input. Take special care when using them: only accept values that are well-formed IP addresses, otherwise they can lead to security vulnerabilities such as SQL injection or XSS.
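One way to enforce the format check is Node.js's built-in net.isIP, which returns 0 for anything that is not a valid IPv4 or IPv6 address. The sketch below is an assumption about how an application might apply it, and firstValidForwardedIp is a hypothetical name; it returns the first well-formed IP in the header and skips everything else.

```javascript
var net = require('net');

// Hypothetical helper: first part of X-Forwarded-For that is a
// syntactically valid IPv4 or IPv6 address, or null if there is none.
function firstValidForwardedIp(headerValue) {
    if (typeof headerValue !== 'string') {
        return null;
    }
    var parts = headerValue.split(',');
    for (var i = 0; i < parts.length; i++) {
        var candidate = parts[i].trim();
        // net.isIP returns 4 or 6 for valid addresses and 0 otherwise.
        if (net.isIP(candidate) !== 0) {
            return candidate;
        }
    }
    return null;
}

// The header the server received in the test above.
console.log(firstValidForwardedIp('unknown, <>"1.1.1.1, 114.248.238.236'));
```

A value that passes this check is well-formed, not trustworthy: it is safe to store or display, but a client can still put any valid IP it likes at the front of the header.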

Conclusion

PS: Some online articles suggest configuring Nginx like this, which is not reasonable:

proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $remote_addr;

This configuration does improve security, but it also erases all proxy information collected before the request reached Nginx, making it impossible to provide better service for users who genuinely use proxies. It is still necessary to understand the underlying principles and analyze each scenario on its own terms.

Original article: https://imququ.com/post/x-forwarded-for-header-in-http.html
