Override the Host: header when using PHP’s readfile
I ran into an odd issue today using readfile() in PHP. One server needed to contact another server via a URL, but the hostname of the second server wasn’t available via the DNS servers that the first server used. This is by design: the load balancer we use requires it. The second server itself could be reached, either by IP address or by a different hostname. But because of the way virtual hosts work, either of those options produced a response from the wrong virtual host.
Name-based virtual hosts allow one computer with one IP address to respond with different sites to multiple hostnames[1]. For example, I have https://www.hypocritae.com/ and https://itisntmurder.com/: two hostnames, but if you look them up using “dig” you’ll see that they have the same IP address[2].
The client converts the hostname into an IP address, connects to that address, and then requests a page. The connection itself doesn’t tell the server which hostname was used to reach it; all the server sees at that level is the IP address. It knows which site the client wants because the client names the hostname in the “Host:” header of the request.
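To make that concrete, here is a rough sketch of the request text a client sends. The buildRequest() function is a hypothetical helper written only for this illustration; it is not part of PHP, and a real client sends more headers than this.

```php
<?php
// buildRequest() is a hypothetical helper for illustration only.
// It returns the text of a minimal HTTP/1.1 GET request.
function buildRequest(string $hostname, string $path = '/'): string
{
    return "GET {$path} HTTP/1.1\r\n"
         . "Host: {$hostname}\r\n"
         . "Connection: close\r\n"
         . "\r\n";
}

// Both requests would be sent to the same IP address;
// only the Host: line tells the server which site is wanted.
echo buildRequest('www.hypocritae.com');
echo buildRequest('itisntmurder.com');
```

The two requests are identical except for the “Host:” line, which is the only thing a name-based virtual host has to go on.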
This is the basis for the “Host:” exercises I used in HTTP headers. But readfile() is meant for reading files; it just happens to also support reading via URL. Is it possible to manipulate the HTTP headers it sends when it makes a request?
Yes, although it’s a bit obscure. Most of PHP’s file functions, such as readfile() and file_get_contents(), accept a context parameter. You create contexts using stream_context_create(), and they can include custom headers.
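As a minimal sketch of what a context looks like, the following builds one with a single extra header and reads the options back with stream_context_get_options(), without making any request. The Accept-Language header is just an arbitrary example I picked, not something the scripts in this article need.

```php
<?php
// Options are grouped by wrapper name; 'http' covers http:// URLs.
// The 'header' entry holds extra request headers, each ending in \r\n.
$options = array(
    'http' => array(
        'method' => 'GET',
        'header' => "Accept-Language: en\r\n",
    ),
);
$context = stream_context_create($options);

// Read the options back out of the context to confirm what was stored.
// Nothing is sent over the network here.
$stored = stream_context_get_options($context);
echo $stored['http']['header'];
```

The context is then passed as the third argument to file_get_contents(), or as the third argument to readfile().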
Here’s a simple PHP script that gets the title of a web page:
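    <?php
    $page = 'http://www.hypocritae.com/';
    $html = file_get_contents($page);
    // Escape bare ampersands so the page has a better chance of parsing as XML.
    $html = str_replace('&', '&amp;', $html);
    // This only works if the page is well-formed enough to load as XML.
    $document = simplexml_load_string($html);
    echo $document->head->title, "\n";
    ?>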
Save this as “showTitle.php” and you can run this on the command line, if you have PHP installed, using “php showTitle.php”.
    solis$ php showTitle.php
    The Walkerville Weekly Reader
    solis$
If, however, we want to get the site of a different host on that server, but we don’t have the ability to use that hostname for whatever reason, we can specify it in a context:
    <?php
    $page = 'http://www.hypocritae.com/';
    // Send a different Host: header than the hostname in the URL.
    $headers = array(
        'http' => array(
            'header' => "Host: itisntmurder.com\r\n"
        )
    );
    $context = stream_context_create($headers);
    $html = file_get_contents($page, false, $context);
    // Escape bare ampersands so the page has a better chance of parsing as XML.
    $html = str_replace('&', '&amp;', $html);
    $document = simplexml_load_string($html);
    echo $document->head->title, "\n";
    ?>
Run that, and see the results:
    solis$ php showTitle.php
    February 27, 1993: It Isn’t Murder If They’re Yankees
    solis$
Even though it’s using the hostname “www.hypocritae.com” it’s getting the site for “itisntmurder.com”. Specifying the right host when you have to use a different hostname or a raw IP address really is as simple as that: the server looks at the “Host:” HTTP header.
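For the raw IP address case, the same idea can be wrapped up in a small function. This is only a sketch: hostOverride() is a hypothetical helper of my own naming, 192.0.2.10 is a documentation-only placeholder address rather than the real server, and it assumes a plain http URL that parse_url() can handle.

```php
<?php
// hostOverride() is a hypothetical helper: it swaps the hostname in $url
// for $address and returns the new URL plus a context whose Host: header
// carries the original hostname.
function hostOverride(string $url, string $address): array
{
    $parts = parse_url($url);
    $scheme = $parts['scheme'] ?? 'http';
    $port = isset($parts['port']) ? ':' . $parts['port'] : '';
    $path = $parts['path'] ?? '/';
    $query = isset($parts['query']) ? '?' . $parts['query'] : '';

    $context = stream_context_create(array(
        'http' => array(
            'header' => "Host: {$parts['host']}\r\n",
        ),
    ));

    return array("{$scheme}://{$address}{$port}{$path}{$query}", $context);
}

// 192.0.2.10 is a placeholder, not the address of any real server.
list($target, $context) = hostOverride('http://itisntmurder.com/', '192.0.2.10');
echo $target, "\n";

// With a real address in place of the placeholder, the fetch would be:
// $html = file_get_contents($target, false, $context);
```

The calling code keeps the URL it would normally use, and only the address to connect to is supplied separately.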
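[1] Note that name-based virtual hosting doesn’t work this way with secure servers: traditionally a secure server needs its own dedicated IP address (or, more specifically, there can only be one secure web site on an IP address), because the certificate is negotiated before the “Host:” header is sent. Servers and clients that support Server Name Indication (SNI) lift that restriction.
[2] At least at the time I’m writing this. It may change in the future, of course.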
HTTP
- HTTP headers
- It’s hard to understand how cookies work and how much information from web visitors can be trusted without understanding how browsers and servers communicate.
- Virtual hosting at Wikipedia
- “Virtual hosting is a method for hosting multiple domain names on a computer using a single IP address. This allows one machine to share its resources, such as memory and processor cycles, to use its resources more efficiently.”
PHP
- readfile
- “Reads a file and writes it to the output buffer.”
- stream_context_create
- “Creates and returns a stream context with any options supplied in options preset.”
satire
- It Isn’t Murder If They’re Yankees
- “The true story of rural Virginia schoolteacher Carolyn Purcell, the small town of Walkerville, and the Washington, DC foolkiller known as the Quiet Man, as told by one of the Quiet Man’s famous victims.”
- The Walkerville Weekly Reader
- In the end times, one newspaper dared to call God to task for His hypocrisy. That newspaper was not us, we swear it. Not the eternal flames!