mod_cdn is an apache2 module developed at Voxel that makes VoxCAST and other CDNs easier to use by automatically "CDNifying" some HTML content and "originifying" the origin apache server to send proper headers, validate authentication tokens, etc. mod_cdn is meant to be installed and configured on a CDN customer's origin server. With mod_cdn, turning up a CDN for your website is as easy as setting up a simple apache module.
The source code for mod_cdn is available under the GPL v2 license.
Please direct technical questions about mod_cdn to support@voxel.net. Also, check out the FAQ to see if your questions have already been addressed.
Features
mod_cdn currently has the following features:
- Find links to content (e.g. img src tags) in HTML and rewrite the URLs to point to a different server (e.g., a VoxCAST host)
- Automatically add voxtoken authentication token query string arguments to links in HTML that are being CDNified
- Automatically add query string ignore tokens in the query strings of URLs being CDNified, relative to some other query string argument
- Add proper Expires headers to static content for which the server is the CDN origin
- Add Vox-Authorization: required headers to static content for which the server is the CDN origin, and which requires authentication
- Verify the authentication tokens in query string arguments of static content requests for which the server is the CDN origin
Most sites probably don't need all of these features; they can each be turned on and off and configured at a fairly high granularity.
Another major feature in the works is dynamic threshold based switching between CDN and non-CDN that takes into account server load. Let us know if you think of other features we should add to mod_cdn -- or send us a patch!
While some of mod_cdn's features are specific to VoxCAST, mod_cdn is almost as useful for customers of other CDNs: server remapping, Expires headers, and other mod_cdn features address CDNification tasks common to most CDNs.
Downloads
mod_cdn 1.1.0 (source) [27 KB] -- 2009-11-29
Please note: mod_cdn has mainly been tested on Voxel's apache deployments and may need some tweaking to work on other setups. Please let us know if you encounter problems (or find solutions)!
Installing mod_cdn
There's no package yet for mod_cdn, but we'll make some soon. The module consists of an apache module (dynamic library) that must be put in /var/lib/apache2/modules/mod_cdn.so (the path might be slightly different on your system). To load the module, place cdn.load and cdn.conf in /etc/apache2/mods-available and link to them from mods-enabled. (Again, the procedure might be slightly different for you -- our instructions are based on a Debian installation.)
cdn.conf
The global (apache-wide) configuration for mod_cdn in cdn.conf includes a few directives that should work as-is for pretty much everyone. These are:
CDNHTMLContentType
Defines the set of content types that will be parsed as (X)HTML and CDNified. The defaults are text/html and application/xhtml+xml. If other content types should be CDNified (and are really just HTML) then you can add them to the list. You can also just include the directive elsewhere (e.g. within a VirtualHost or Directory/Location section) and any new values will be added to the list.
CDNHTMLLinks
Defines the set of HTML tag/attribute pairs that we expect to contain links to content. mod_cdn will look for these tags/attributes and, when found, will CDNify the link if necessary. There are some tag/attribute pairs that contain URLs, but which we probably don't want to CDNify, like form/action. These are left out of the list in cdn.conf. New tag/attribute pairs can be added either in cdn.conf or elsewhere in the apache config tree.
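As a rough illustration, a global cdn.conf might look something like the sketch below. The argument syntax is an assumption (a space-separated list of content types, and one tag/attribute pair per CDNHTMLLinks line); the cdn.conf shipped with mod_cdn is the authoritative reference.

```apache
# Sketch only: argument syntax is assumed, not copied from the shipped cdn.conf.
# Content types that are parsed as (X)HTML and CDNified.
CDNHTMLContentType text/html application/xhtml+xml

# Tag/attribute pairs whose URLs may be CDNified.
# form/action is deliberately left out, as described above.
CDNHTMLLinks img src
CDNHTMLLinks script src
CDNHTMLLinks link href
```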
Configuration
To configure mod_cdn, there are a variety of apache config directives that control HTML parsing/replacing and other originification options. Almost always, these directives should be applied within a VirtualHost section, and probably even within a Directory/Location section, especially in the case of CDNActAsOrigin, where often the content you want offloaded to the CDN is within a particular directory (e.g., /images).
See the example.conf shipped with mod_cdn for some sample uses of these directives.
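A minimal sketch of how these directives might be scoped, using only directives described on this page. The hostnames are placeholders, and the expiration time is assumed to be in seconds:

```apache
<VirtualHost *:80>
    ServerName www.example.com

    # Rewrite matching links in this site's HTML to point at the CDN host.
    CDNHTMLToServer http://1234.voxcdn.com
    CDNHTMLRemapURLServer \.png$ i

    # Treat /images as content for which this server is the CDN origin,
    # and add Expires headers to matching responses.
    <Location /images>
        CDNActAsOrigin \.png$ ie 3600
    </Location>
</VirtualHost>
```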
CDNHTMLFromServers
CDNHTMLFromServers static.example.com images.example.com ...
By default, any relative links (e.g. "../blah.png") and locally absolute links ("/path/to/blah.png") that match a regex defined with CDNHTMLRemapURLServer will be CDNified. In addition, globally absolute links ("http://example.com/path/to/blah.gif") where the server name matches the current VHost's ServerName or one of its ServerAliases will be CDNified if they match. It may be the case that some of the content linked to in the HTML has URLs on a different server (e.g., "static.example.com"). CDNHTMLFromServers specifies a list of servernames for which we'll CDNify globally absolute links that match one of the regexes.
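For instance, a configuration along these lines (hostnames are placeholders) would be expected to extend CDNification to absolute links on the two listed servers:

```apache
# Also CDNify globally absolute links that point at these hostnames.
CDNHTMLFromServers static.example.com images.example.com
CDNHTMLToServer http://1234.voxcdn.com
CDNHTMLRemapURLServer \.gif$ i

# With this in place, a link such as
#   http://static.example.com/path/to/blah.gif
# should be remapped to
#   http://1234.voxcdn.com/path/to/blah.gif
```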
CDNHTMLToServer
CDNHTMLToServer http://1234.voxcdn.com
Set the CDN hostname to which traffic will be redirected. This should be in the form of http[s]://xxxx.voxcdn.com[:port] .
CDNHTMLRemapURLServer
CDNHTMLRemapURLServer regex flags
The main workhorse for CDNification via server remapping. Multiple instances of this directive can be used. If the regex matches a URL in the HTML, the URL is remapped to point to the CDNHTMLToServer. The flags affect the regex matching and the URL rewriting. Flags are individual letters, unseparated by any spaces, and include:
- 'x': use POSIX extended regular expressions
- 'i': ignore case in matching the regex
- 'a': add voxtoken authentication tokens to the query strings of matching URLs, based on CDNAuthKey
- 'q': add query string ignore tokens to the query strings of matching URLs, based on CDNIgnoreTokenName and CDNHTMLIgnoreTokenLocation
For example, to match all files with a .png extension and no query string, and add authentication tokens to them:
CDNHTMLRemapURLServer \.png$ ia
To match .pngs with a query string and also add a query string ignore token:
CDNHTMLRemapURLServer \.png\? iaq
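As a further sketch, the 'x' flag makes it possible to cover several extensions with a single directive, since grouping, alternation, and the ? quantifier are only special in POSIX extended syntax. The particular extension list is just an example:

```apache
# Match common image extensions with no query string, ignoring case.
# The (...|...) alternation and "e?" require the 'x' (POSIX extended) flag.
CDNHTMLRemapURLServer \.(png|jpe?g|gif)$ xi
```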
CDNHTMLDocType
CDNHTMLDocType [HTML|XHTML] [legacy]
Set the DOCTYPE tag that will be inserted at the top of the HTML:
- HTML: <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
- HTML+legacy: <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
- XHTML: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
- XHTML+legacy: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
- other: anything other than "HTML" or "XHTML" (with optional "legacy") will be inserted as the DOCTYPE tag directly, at the top of the document.
If this directive is omitted, no DOCTYPE is inserted, and any DOCTYPE already present in the document is not passed through.
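Two illustrative uses follow, one per line as alternatives. The first selects one of the predefined DOCTYPEs. The second passes a literal DOCTYPE, and assumes the usual Apache convention of double-quoting an argument that contains spaces:

```apache
# Emit the XHTML 1.0 Transitional DOCTYPE at the top of each parsed document.
CDNHTMLDocType XHTML legacy

# Alternative: insert a custom DOCTYPE verbatim (quoting is an assumption).
CDNHTMLDocType "<!DOCTYPE html>"
```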
CDNHTMLDefaultCharset
CDNHTMLDefaultCharset utf-8
Specify the default input character set to be used if one cannot be automatically detected. Currently, mod_cdn always outputs UTF-8.
CDNHTMLStartParse
CDNHTMLStartParse tagname
Specify an HTML tag such that anything in front of the first occurrence of the tag in the document is ignored. For example:
CDNHTMLStartParse html
ignores anything that comes before the <html> tag.
CDNAuthKey
CDNAuthKey string
Specify the authentication key (assigned by Voxel or set via the CDN portal) for generating and verifying authentication tokens. Note that in using mod_cdn's authentication capabilities, caution needs to be taken in changing the authentication key for the CDN host.
CDNAuthAlt
CDNAuthAlt http://example.com/failed-auth.html
Specify a URL to which a user will be redirected if a request for an object fails authentication (i.e., the voxtoken supplied with the request is incorrect).
CDNAuthExpire
CDNAuthExpire 900
Set the default expiration time in seconds for generated authentication tokens.
CDNHTMLAddAuthTokens
CDNHTMLAddAuthTokens on
This is just a shortcut that is the same as setting the 'a' flag for every CDNHTMLRemapURLServer directive.
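Put together, the authentication directives might be combined roughly as follows. The key shown is a placeholder, not a real value:

```apache
# Placeholder key: use the one assigned by Voxel or set via the CDN portal.
CDNAuthKey my-secret-key
# Where to send users whose voxtoken fails verification.
CDNAuthAlt http://example.com/failed-auth.html
# Generated tokens expire after 15 minutes.
CDNAuthExpire 900

CDNHTMLToServer http://1234.voxcdn.com
# Equivalent to setting the 'a' flag on every CDNHTMLRemapURLServer line.
CDNHTMLAddAuthTokens on
CDNHTMLRemapURLServer \.png$ i
```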
CDNIgnoreTokenName
CDNIgnoreTokenName string
Set the name of a query string argument that will be inserted into the query string of remapped URLs when the 'q' flag is set (or for every remapped URL if CDNHTMLAddIgnoreTokens is set). (What will actually be inserted is "string=ignore" since VoxCAST currently expects the ignore token argument to include a value.)
CDNHTMLIgnoreTokenLocation
CDNHTMLIgnoreTokenLocation [before|after] arg-name
Set the relative insertion point of the ignore token into the query string when the ignore token is being added to a remapped URL. The token is inserted either before or after some other query string argument (arg-name). For example, to insert the ignore token before timestamp in a query string like "?id=52&type=png&timestamp=1220435841":
CDNHTMLIgnoreTokenLocation before timestamp
CDNHTMLAddIgnoreTokens
CDNHTMLAddIgnoreTokens on
This is just a shortcut that is the same as setting the 'q' flag for every CDNHTMLRemapURLServer directive.
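A sketch of the ignore token directives working together. The token name "nocache" is hypothetical, and the rewritten query string shown in the comments is what the descriptions above imply rather than captured output:

```apache
# "nocache" is a hypothetical token name chosen for this example.
CDNIgnoreTokenName nocache
CDNHTMLIgnoreTokenLocation before timestamp
# 'q' adds the ignore token to URLs matched by this regex.
CDNHTMLRemapURLServer \.png\? iq

# Expected effect on the query string of a remapped URL:
#   ?id=52&type=png&timestamp=1220435841
# becomes
#   ?id=52&type=png&nocache=ignore&timestamp=1220435841
```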
CDNActAsOrigin
CDNActAsOrigin regex flags [expiration-time]
The main workhorse for "originification", i.e., delivery of requests from the CDN for which the server running mod_cdn is the origin. Responses to requests whose URL matches the regex are modified according to the flags. Flags include:
- 'x': use POSIX extended regular expressions
- 'i': ignore case in matching the regex
- 'e': add an Expires header to the response. If the expiration-time value is set, this will be used to compute the time of expiration; otherwise, the value set with CDNDefaultExpire is used.
- 'a': add a Vox-Authorization: required header to the response (with an alternative URL if specified by CDNAuthAlt), and verify that voxtoken is set and correct in the request's query string; if not, return a 403 response code.
CDNDefaultExpire
CDNDefaultExpire seconds
Set the default expiration time in seconds for content that matches one of the CDNActAsOrigin regexes.
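A minimal sketch of an origin configuration for an /images directory. The path, the key, and the specific times are placeholders, and the per-directive expiration-time is assumed to be in seconds like CDNDefaultExpire:

```apache
<Location /images>
    # Placeholder key, needed for verifying voxtoken on 'a' matches.
    CDNAuthKey my-secret-key

    # Default lifetime for matching content: one day.
    CDNDefaultExpire 86400

    # Images get an Expires header computed from the default above.
    CDNActAsOrigin \.(png|gif)$ xie

    # PDFs get a shorter explicit lifetime and require a valid voxtoken.
    CDNActAsOrigin \.pdf$ iea 300
</Location>
```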
FAQ
Here are the answers to some tricky questions you might have while implementing mod_cdn.
How do I compile mod_cdn?
Currently there are no packages for mod_cdn, although we're in the process of making some for, at least, Debian and CentOS. That means you need to compile mod_cdn from the source. To do this, you need to install APR, the Apache Portable Runtime library. On a Debian system, the package is libapr1-dev; you may also need libaprutil1-dev. You should compile mod_cdn against Apache 2.2.7 or higher. (Our testing has mainly been with 2.2.8.) Then, just run make.
How do I set up mod_cdn on CentOS/Fedora/Redhat?
The installation instructions above are for Debian systems. Everything is mostly the same on a CentOS system, except for the layout of the Apache configuration. Take the lines from cdn.load and put them in /etc/httpd/conf/httpd.conf near the other LoadModule lines. (Don't forget to make sure that libxml2 is installed and available at /usr/lib/libxml2.so.2; if it's available elsewhere, change the LoadFile line.) Put cdn.conf in /etc/httpd/conf.d. Put mod_cdn.so in /usr/lib/httpd/modules. That should do it; put the mod_cdn configuration directives wherever your normal VirtualHost configuration resides.
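On a CentOS system, the lines added to httpd.conf might look roughly like this. The module identifier cdn_module is an assumption; copy the actual LoadModule line from the cdn.load file shipped with mod_cdn:

```apache
# /etc/httpd/conf/httpd.conf, near the other LoadModule lines.

# mod_cdn depends on libxml2; change this path if the library lives elsewhere.
LoadFile /usr/lib/libxml2.so.2

# "cdn_module" is an assumed identifier; use the one given in cdn.load.
LoadModule cdn_module /usr/lib/httpd/modules/mod_cdn.so
```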
Some assets are still being fetched from my origin and not the CDN. Why?
Assuming your regexes are correct, the assets in question are probably referenced in either CSS or Javascript, rather than in your site's HTML. mod_cdn speaks (X)HTML but does not currently parse CSS or Javascript, either embedded or in separate files. This means, for example, that if you use background-image or other URL-referencing attributes in your CSS, mod_cdn won't rewrite those URLs. Remember: mod_cdn is a quick-and-dirty way to take a lot of the load off your origin. More complicated setups will need custom work to truly offload all the intended content delivery to the CDN.
mod_cdn is doing really weird things to my Javascript.
Because mod_cdn is based on libxml2, it's subject to some of the limitations of that library with respect to XML parsing. One problem that's been reported occurs with HTML looking something like this:
<div>
<script type="text/javascript">
...
document.write('<div>blah</div>');
...
</script>
</div>
Here, the </div> within the Javascript is erroneously treated as non-CDATA by libxml2, which tries to fix it by moving the </script> earlier. Other XML libraries seem to have the same problem. A simple workaround is to use the old Javascript/HTML comment trick so mod_cdn ignores Javascript with HTML embedded in it:
<div>
<script type="text/javascript">
// <!--
...
document.write('<div>blah</div>');
...
// -->
</script>
</div>
I'm seeing gibberish or weird encoding errors in HTML parsed by mod_cdn.
This might be due to interactions between mod_cdn and another Apache module. Some users have reported problems with mod_deflate in some versions of Apache, although we've used mod_cdn and mod_deflate together successfully with Apache 2.2.8. Try temporarily disabling unnecessary modules. If that fixes your problem, please let us know the details of your situation and we'll try to address the problem permanently.


