mod_replace: Documentation

What is mod_replace?

mod_replace is a simple Apache 2.0.x filter module that was originally developed based on mod_ext_filter. Its initial purpose was to support an Apache-based reverse proxy with mod_rewrite. Absolute URLs contained in the HTTP body could not be handled with mod_rewrite, so a slow mod_perl solution had been used to rewrite the body content.

The C-based mod_replace in its original version provided a much faster approach to this problem. Since then, HTTP header replacement (e.g. for cookie adjustments) has been added.

So far, mod_replace has only been used alongside mod_proxy to provide an improved reverse proxy experience. It greatly helps to sanitize the output of ill-behaved web servers and applications. Examples: absolute links within web pages, and absolute links in HTTP headers that aren't controlled by mod_proxy (e.g. Set-Cookie).

How does it work?

There are currently three distinct mechanisms for pattern replacement within mod_replace: the HTTP body filter, the HTTP response header filter, and the HTTP request header filter.

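When an HTTP response is routed through an HTTP body filter, the patterns of that filter definition are matched sequentially against the data, and each replacement takes place before the next pattern is processed.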

This means that you can define multiple patterns for a single filter definition. You simply create a new ReplacePattern with the same name as the previous one (see examples below).
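
A minimal sketch of such a definition, using the directives described under Configuration. The filter name "chain" and the /old/ and /new/ paths are hypothetical. The sketch also assumes that patterns are applied in the order in which they are defined, so that the second pattern sees the output of the first:

```apache
# One filter definition with two patterns sharing the name "chain"
ReplaceFilterDefine chain intype=text/html
# First pattern: rewrite absolute links to the origin server
ReplacePattern chain "http://origin[.]server/" "http://revproxy/"
# Second pattern (assumed to run on the result of the first)
ReplacePattern chain "http://revproxy/old/" "http://revproxy/new/"
SetOutputFilter chain
```

If the ordering assumption holds, a link to http://origin.server/old/ would end up as http://revproxy/new/.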

Using an HTTP response header filter, the process is almost the same as above. The patterns are sequentially matched against the data and the necessary replacements take place before the next pattern is processed.

One special feature of this filter is that it doesn't stop looking for matching headers once it has found one (stopping would make sense, since according to the HTTP standard there is only one occurrence of an HTTP header per response). There is, however, one common situation in which the same HTTP header occurs multiple times, each time with different content: Set-Cookie.

The HTTP request header filter is quite different from the other filters because it doesn't use the same mechanism within Apache. If it used the filter mechanism (e.g. as an input filter), any request that is also routed through mod_proxy (using Apache as a reverse proxy) would first be processed by mod_proxy and only then by mod_replace. mod_proxy would completely ignore any modifications applied to the HTTP header, since it would already have created the request to the origin server, and the modifications made by mod_replace would simply be discarded.

The mechanism used by mod_replace for modifying the request header allows you to alter the HTTP header before mod_proxy processes the request. The same rules apply to the patterns: multiple patterns are chained in a linked list and are processed sequentially. Note: There is only one "filter" for all patterns. You don't need to create a named definition and you don't have to set the output filter, but you won't be able to specify additional parameters.

Configuration

Configuring an HTTP body filter

Syntax

ReplaceFilterDefine <name> [<options> ...]

Option          Description
<name>          The name of the filter definition. Used to distinguish multiple filters (not patterns) and to selectively activate filters.
<options>       Configuration options for this filter definition:

CaseIgnore      Pattern matching is case-insensitive. Don't set this option if you want your patterns to be matched case-sensitively!
intype=<mime>   Narrows the pattern matching to HTTP responses with the specified MIME type (e.g. text/html). Be careful if you use this option with HTTP header patterns.

ReplacePattern <name> <pattern> <string>

Option          Description
<name>          The name of the filter definition to which this pattern is added. Be sure to define the filter with the ReplaceFilterDefine directive.
<pattern>       A PCRE (Perl-compatible regular expression) pattern. This pattern is matched against the HTTP body coming from the server. You may use subpatterns and reference them (up to 9) in the replacement string. See the examples for more information.
<string>        The string that is inserted as a replacement if the pattern matches. You may reference up to 9 subpatterns from the original pattern (\0 - \9). See the examples.
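
As a sketch of subpattern references, the pattern below captures two groups and reuses both in the replacement string. The host prefixes www and static are made up for illustration, and [.] is used to match a literal dot:

```apache
# The filter definition the pattern is attached to
ReplaceFilterDefine revproxy intype=text/html
# \1 refers to the scheme (http or https), \2 to the host prefix
ReplacePattern revproxy \
  "(http|https)://(www|static)[.]origin[.]server/" \
  "\1://\2.revproxy/"
```

With this pattern, https://static.origin.server/ would be expected to become https://static.revproxy/.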

SetOutputFilter <name>[;<name>]

Option          Description
<name>          The name of a filter definition to activate. To activate multiple definitions, separate their names with semicolons.

Examples

  ReplaceFilterDefine revproxy CaseIgnore intype=text/html
  ReplacePattern revproxy "(http|https)://origin.server/" "\1://revproxy/"
  SetOutputFilter revproxy

  ReplaceFilterDefine multiple CaseIgnore intype=text/html
  ReplacePattern multiple "(http|https)://origin.server/" "\1://revproxy/"
  ReplacePattern multiple "ftp://origin.server" "ftp://public.server/pub"
  SetOutputFilter multiple
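
The semicolon form of SetOutputFilter can be sketched as follows. The filter names htmlfix and cssfix and the text/css MIME type are illustrative assumptions; each definition carries its own options and patterns, and both are activated together:

```apache
# Case-insensitive definition for HTML responses
ReplaceFilterDefine htmlfix CaseIgnore intype=text/html
ReplacePattern htmlfix "http://origin[.]server/" "http://revproxy/"

# Case-sensitive definition (no CaseIgnore) for CSS responses
ReplaceFilterDefine cssfix intype=text/css
ReplacePattern cssfix "http://origin[.]server/" "http://revproxy/"

# Activate both definitions; names are separated by semicolons
SetOutputFilter htmlfix;cssfix
```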

Configuring an HTTP header filter

Syntax

ReplaceFilterDefine <name> [<options> ...]

Option          Description
<name>          The name of the filter definition. Used to distinguish multiple filters (not patterns) and to selectively activate filters.
<options>       Configuration options for this filter definition:

CaseIgnore      Pattern matching is case-insensitive. Don't set this option if you want your patterns to be matched case-sensitively!
intype=<mime>   Narrows the pattern matching to HTTP responses with the specified MIME type (e.g. text/html). Be careful if you use this option with HTTP header patterns.

HeaderReplacePattern <name> <header> <pattern> <string>

Option          Description
<name>          The name of the filter definition. Used to distinguish multiple filters (not patterns) and to selectively activate filters.
<header>        The HTTP header that is to be altered. Note: you cannot alter the header field name, only its content. E.g. you can alter the domain name in a Set-Cookie header, but you cannot change an (obviously wrong) "SetKookie" to "Set-Cookie".
<pattern>       A PCRE (Perl-compatible regular expression) pattern. This pattern is matched against the content of the specified HTTP header coming from the server. You can use subpatterns here, but you cannot reference them in the replacement string (not implemented).
<string>        The string that is inserted as a replacement if the pattern matches.

SetOutputFilter <name>[;<name>]

Option          Description
<name>          The name of a filter definition to activate. To activate multiple definitions, separate their names with semicolons.

Examples

  ReplaceFilterDefine revproxy CaseIgnore
  HeaderReplacePattern revproxy Set-Cookie \
    " domain=[.]?server.com" \
    " domain=revproxy.com"
  SetOutputFilter revproxy

HTTP header before OutputFilter:

    Date: Wed, 07 Apr 2004 13:08:01 GMT
    Server: Apache/1.3.29
    Vary: Accept-Encoding,User-agent
    Set-Cookie: UID=0815; domain=server.com; path=/
    Connection: close
    Content-Type: text/html; charset=iso-8859-1

HTTP header after OutputFilter:

    Date: Wed, 07 Apr 2004 13:08:01 GMT
    Server: Apache/1.3.29
    Vary: Accept-Encoding,User-agent
    Set-Cookie: UID=0815; domain=revproxy.com; path=/
    Connection: close
    Content-Type: text/html; charset=iso-8859-1
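
Because the response header filter keeps matching after the first hit, the same definition should also cover responses that carry several Set-Cookie headers. The sketch below reuses the definition from the example above. The second cookie (LANG) and its leading-dot domain are made up for illustration, and the expected result is derived from the documented behaviour, not from a captured trace:

```apache
# Same definition as in the example above
ReplaceFilterDefine revproxy CaseIgnore
HeaderReplacePattern revproxy Set-Cookie \
  " domain=[.]?server.com" \
  " domain=revproxy.com"
SetOutputFilter revproxy

# Hypothetical origin response headers:
#   Set-Cookie: UID=0815; domain=server.com; path=/
#   Set-Cookie: LANG=en; domain=.server.com; path=/
#
# Expected headers after the filter (both occurrences rewritten):
#   Set-Cookie: UID=0815; domain=revproxy.com; path=/
#   Set-Cookie: LANG=en; domain=revproxy.com; path=/
```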
   

Configuring an HTTP request header filter

Syntax

RequestHeaderPattern <header> <pattern> <string>

Option          Description
<header>        The HTTP header that is to be altered. Note: you cannot alter the header field name, only its content. E.g. you can alter the domain name in a Set-Cookie header, but you cannot change an (obviously wrong) "SetKookie" to "Set-Cookie".
<pattern>       A PCRE (Perl-compatible regular expression) pattern. This pattern is matched against the content of the specified HTTP request header. You can use subpatterns here, but you cannot reference them in the replacement string (not implemented).
<string>        The string that is inserted as a replacement if the pattern matches.

Examples

  RequestHeaderPattern Cookie " UID=0815" " UID=007"
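
Several request header patterns can be listed one after another, since they are kept in a single list and processed sequentially. In the sketch below the Referer pattern is a hypothetical addition that maps the public host name back to the origin server. Whether this is useful depends on the application behind the proxy:

```apache
# No ReplaceFilterDefine or SetOutputFilter is needed for request headers
RequestHeaderPattern Cookie " UID=0815" " UID=007"
# Hypothetical second pattern: map the public host name back to the origin
RequestHeaderPattern Referer "http://revproxy/" "http://origin[.]server/"
```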