A list of open-source HTTP proxies written in Python

This page last updated: Sun Jan 14 21:00:00 2007

What: An HTTP proxy is a piece of software that acts as an intermediary between HTTP client software (e.g. a browser) and HTTP server software. The proxy receives all requests from the client and relays them (possibly modified) on to the server. Likewise, it receives all responses from the server and relays them (possibly modified) to the client. HTTP proxies can be used for a wide variety of tasks, including filtering, logging and caching. At some stage, most web programmers will make use of some form of proxy.
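
To make the relay idea concrete, here is a minimal sketch of a forwarding proxy built only on the standard library. It is not taken from any of the proxies listed below; the names RelayHandler and start_proxy are my own, it is written for a current Python 3 rather than the versions listed on this page, and it handles plain GET requests only.

```python
import http.server
import threading
import urllib.error
import urllib.request

# Hop-by-hop headers describe one connection and must not be relayed.
HOP_BY_HOP = {"connection", "keep-alive", "proxy-authenticate",
              "proxy-authorization", "proxy-connection", "te", "trailers",
              "transfer-encoding", "upgrade"}
# Headers that this sketch regenerates itself when answering the client.
REGENERATED = {"content-length", "server", "date"}

# An empty ProxyHandler makes urllib ignore any system proxy settings,
# so the origin server is always contacted directly.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class RelayHandler(http.server.BaseHTTPRequestHandler):
    """Relays GET requests. A client sends a proxy the absolute URL."""

    def do_GET(self):
        forwarded = {name: value for name, value in self.headers.items()
                     if name.lower() not in HOP_BY_HOP}
        try:
            request = urllib.request.Request(self.path, headers=forwarded)
            # Note: urllib follows redirects itself; a real proxy would not.
            with _opener.open(request, timeout=30) as upstream:
                status = upstream.status
                headers = list(upstream.headers.items())
                body = upstream.read()
        except urllib.error.HTTPError as err:
            # 4xx and 5xx answers are still answers: relay them.
            status, headers, body = err.code, list(err.headers.items()), err.read()
        except (OSError, ValueError):
            # Unreachable server, or a request line without an absolute URL.
            self.send_error(502, "Bad Gateway")
            return
        self.send_response(status)
        for name, value in headers:
            if name.lower() not in HOP_BY_HOP | REGENERATED:
                self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # stay quiet; a logging proxy would record the request here


def start_proxy(host="127.0.0.1", port=0):
    """Start the proxy in a background thread and return the server object."""
    server = http.server.ThreadingHTTPServer((host, port), RelayHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling start_proxy(port=8080) and pointing a browser's HTTP proxy setting at 127.0.0.1:8080 should be enough to try it. The two places where a filtering, logging or caching proxy would hook in are just before the request is opened and just before the body is written back.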

Why: It is not always easy to find a proxy that implements exactly the function required. There may be one or more proxies that implement something close, but not exactly what is needed. In these cases, it is often necessary to take the existing proxy that comes closest and modify its source code to meet the requirements.

The function of this page is to make this searching and comparison process easier for Python programmers. I have listed here all of the open-source Python HTTP proxies I could find. Listing them together should make the job of comparing features a little easier, at least at a high level.

Please feel free to email me if

  1. You know of or find a proxy that is not listed here
  2. Any of the details presented here are wrong
  3. You think there is information missing from the page which would be useful
  4. You have other suggestions to improve the usefulness of this page

The proxies

Name | Primary features | Minimum Python version | Design | License | Last updated (YYYY-MM-DD) | HTTP | HTTPS | Windows NTLM
NTLM Authorization Proxy Server | Windows NTLM authentication, HTTPS | 1.5.2 | Threaded | GPL | 2006-06-16 | 1.1 | yes | yes
WebCleaner | Range of filtering/blocking techniques, html parsing, javascript engine (SpiderMonkey), recognizes known browser attacks, XML DTD for describing filters and rewrite rules | 2.4 | Asyncore | GPL | 2006-12-16 | 1.1 | yes | yes
SPIKE Proxy | SPIKE Proxy is a tool for looking for application-level vulnerabilities in web applications. | 2.3 (requires pyOpenSSL) | Threaded | GPL (soon to be LGPL) | 2004-11-05 | 1.1 | yes | yes
Willow | Bayesian content filtering, browser-based interface, caching, Windows NTLM authentication | 2.2.2 or greater | Asyncore | LGPL | 2003-06-06 | 1.1 | yes | yes
Amit's Web Proxy Project | Filtering/blocking, compression, experimental architectural approaches, range of loadable modules | all | Asyncore | MIT | 2003-11-16 | 1.1 | no | no
HTTP Replicator | Caching. Designed for caching Debian packages. Good asyncore design pattern. | 2.3 | Asyncore | GPL | 2004-11-29 | 1.1 | no | no
MindRetrieve | Indexing proxy | 2.4 | Threaded | BSD | 2005-03-28 | 1.0 | no | no
Personal Proxy Server | Indexing proxy | 2.3 | Threaded | LGPL | 2005-05-24 | 1.0 | no | no
Twisted Proxy | Twisted Proxy | 2.3 | Twisted | LGPL | 2005-05-24 | 1.0 | no | no
WebDebug | Request and Resource logging, Browser interface, Statistics | 2.0 (older versions support 1.5.x) | Threaded | GPL | 2002 | 1.1 | no | no
httpMonitor | Logging, header transformation | 1.5 | Threaded | GPL | 2002-03-15 | 1.1 | no | no
AdZapper | Elimination of Banner Advertisements. Programmable through XML "zaplets": a wizard is provided. Controllable through a web/browser interface. | 1.5.2 | Medusa | Python | 2001-09-02 | 1.1 | no | no
Archiver Proxy | "An http proxy server which archives all your HTTP traffic." | 2.0 | Asyncore | "This program is free software." | 2001-08-29 | 1.0 | yes | no
Cut the Crap | Content filtering, filters extensible through python "plug-ins" and "zapplets language", ACL (Access Control Lists) | 2.0 | Asyncore | GPL | 2001-08-20 | 1.1 | no | no
Alfajor | Cookie filtering, basic content filtering, optional GUI | 1.5.2 | Threaded | GPL | 2000-02-28 | 1.0 | no | no
Tiny HTTP Proxy | Solid basic design pattern | 2.1 and greater | Threaded | Unknown | 2003-06-17 | 1.1 | yes | no
TCPWatch | TCP and HTTP protocol debugging, optional GUI Interface | 2.1 | Unknown | Zope ZPL | 2003-06-17 | 1.1 | yes | no
Mediator | Mediator support, content filtering | ?? | Threaded | GPL | Unknown | 1.1 | no | no
Munchy | Design pattern for basic content filtering and blocking | ?? | Threaded | Unknown | 2000-04-25 | 1.0 | no | no
HTTP Debugging Proxy | Basic debugging proxy | all | Threaded | Unknown | Unknown | 1.1 | yes | no

NTLM Authorization Proxy Server

Name NTLM Authorization Proxy Server
Last updated YYYY-MM-DD 2006-06-16
HTTP version 1.1
HTTPS Connect yes
Windows NTLM support yes
Minimum python version 1.5.2
Platform Win32
Author Darryl Dixon, who succeeded Dmitry Rozmanov
Home page http://ntlmaps.sourceforge.net/
Primary features Windows NTLM authentication, HTTPS
Features

From the product page

'NTLM Authorization Proxy Server' (APS) is a proxy software that allows you to authenticate via an MS Proxy Server using the proprietary NTLM protocol. Since version 0.9.5 APS has an ability to behave as a standalone proxy server and authenticate http clients at web servers using NTLM method. It can change arbitrary values in your client's request header so that those requests will look like they were created by MS IE. Main features:

  1. supports NTLM authentication via parent proxy server (Error 407 Proxy Authentication Required);
  2. supports NTLM authentication at web servers (Error 401 Access Denied/Unauthorized);
  3. supports translation of NTLM scheme to standard "Basic" authentication scheme;
  4. supports the HTTPS 'CONNECT' method for transparent tunnelling through parent proxy server;
  5. has ability to change arbitrary values in client's request headers;
  6. supports unlimited number of client connections;
  7. supports connections from external hosts;
  8. supports HTTP 1.1 persistent connections;
  9. stores user's credentials in config file or requests password from a console during the start time;
License GPL
Design Architecture Threaded
Notes Dmitry built his own HTTP server/client from the ground up, rather than using the HTTP support in the Python standard library.
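
Feature 5 above (changing arbitrary values in the client's request headers) can be sketched in a few lines. This is my own illustration, not code from APS: the function name rewrite_headers is hypothetical, and the User-Agent value is just one string that IE 6 is known to have sent.

```python
def rewrite_headers(headers, overrides):
    """Return a new header list with the given fields replaced or added.

    headers   -- list of (name, value) pairs as received from the client
    overrides -- dict mapping header names to replacement values
    """
    wanted = {name.lower(): (name, value) for name, value in overrides.items()}
    result = []
    for name, value in headers:
        key = name.lower()
        if key in wanted:
            # Header names are case-insensitive, so match on lower case.
            result.append(wanted.pop(key))
        else:
            result.append((name, value))
    # Overrides that matched nothing are appended as new headers.
    result.extend(wanted.values())
    return result


# Illustrative value only (an assumption, not taken from APS itself).
IE_LIKE = {"User-Agent": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"}

client_headers = [("Host", "example.com"), ("user-agent", "Python-urllib/2.5")]
print(rewrite_headers(client_headers, IE_LIKE))
```

Keeping the headers as an ordered list of pairs, rather than a dict, preserves the order the client sent them in, which matters if the aim is to look like a particular browser.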

WebCleaner

Name WebCleaner
Last updated YYYY-MM-DD 2006-12-16
HTTP version 1.1
HTTPS Connect yes
Windows NTLM support yes
Minimum python version 2.4
Platform all
Author Calvin (aka Bastian Kleineidam)
Home page http://webcleaner.sourceforge.net/
Primary features Range of filtering/blocking techniques, html parsing, javascript engine (SpiderMonkey), recognizes known browser attacks, XML DTD for describing filters and rewrite rules
Features

From the project page

  1. remove unwanted HTML (adverts, flash, etc.)
  2. popup blocker
  3. disable animated GIFs
  4. filter images by size, remove banner adverts
  5. compress documents on-the-fly (with gzip)
  6. reduce images to low-bandwidth JPEGs
  7. remove/add/modify arbitrary HTTP headers
  8. configurable over web interface
  9. usage of SquidGuard blacklists
  10. antivirus filter module
  11. detection and correction of known HTML security flaws
  12. Basic, Digest and (untested) NTLM proxy authentication support
  13. per-host access control
  14. HTTP/1.1 support (persistent connections, pipelining)
  15. HTTPS proxy CONNECT and optional SSL gateway support
License GPL
Design Architecture Asyncore
Notes Derived from Amit's Proxy 4.
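
As an illustration of feature 5 (compressing documents on-the-fly with gzip), here is a minimal sketch using the standard gzip module. It is not WebCleaner's code: compress_response is a hypothetical name, headers are assumed to be plain dicts with lower-case names, and q-values in Accept-Encoding are deliberately ignored.

```python
import gzip


def compress_response(body, request_headers, response_headers):
    """Gzip a response body when the client says it accepts gzip.

    Both header arguments are dicts with lower-case names.
    Returns the (possibly compressed) body and the response headers to send.
    """
    accepted = request_headers.get("accept-encoding", "")
    # Simplification: "gzip;q=0" (an explicit refusal) is not honoured here.
    tokens = [token.split(";")[0].strip().lower()
              for token in accepted.split(",")]
    already_encoded = "content-encoding" in response_headers
    if "gzip" not in tokens or already_encoded:
        return body, response_headers

    compressed = gzip.compress(body)
    headers = dict(response_headers)
    headers["content-encoding"] = "gzip"
    # The length changes with the body, so it has to be rewritten too.
    headers["content-length"] = str(len(compressed))
    return compressed, headers
```

A proxy would call this after any content filters have run, since filters need to see the uncompressed HTML.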

SPIKE Proxy

Name SPIKE Proxy
Last updated YYYY-MM-DD 2004-11-05
HTTP version 1.1
HTTPS Connect yes
Windows NTLM support yes
Minimum python version 2.3 (requires pyOpenSSL)
Platform Linux/Windows
Author Dave Aitel
Home page http://www.immunitysec.com/resources-freesoftware.shtml
Primary features SPIKE Proxy is a tool for looking for application-level vulnerabilities in web applications.
Features

From the product web-site:

SPIKE Proxy is a professional-grade tool for looking for application-level vulnerabilities in web applications. SPIKE Proxy covers the basics, such as SQL Injection and cross-site-scripting, but its completely open Python infrastructure allows advanced users to customize it for web applications that other tools fall apart on.

Features include

  1. HTTPS Man in the Middle support.
  2. Archives all requests so you can edit and resend them.
  3. Powerful fuzzer, crawler, and other web application auditing features
  4. Can apply a regular expression to either side of the connection (useful for removing client side javascript authentication or browser checks)
  5. NTLM support
  6. Robust and tested with many browsers and applications
  7. Fully open source
  8. "Recent Requests" window
  9. Supports chunked encoding
  10. And much more.
License GPL (soon to be LGPL)
Design Architecture Threaded
Notes HTTPS is not tunneled - it's MITMed.
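
Feature 4 above (applying a regular expression to either side of the connection) might look roughly like the following. This is a sketch of the general technique, not SPIKE Proxy's code; the rule lists and the apply_rules function are hypothetical.

```python
import re

# Hypothetical rule sets: (compiled pattern, replacement) pairs.
REQUEST_RULES = [
    # Present a different browser identity to the server.
    (re.compile(rb"^User-Agent: [^\r\n]*", re.MULTILINE),
     b"User-Agent: Mozilla/4.0"),
]
RESPONSE_RULES = [
    # Drop inline onsubmit handlers, a common home for client-side checks.
    (re.compile(rb'\s+onsubmit="[^"]*"', re.IGNORECASE), b""),
]


def apply_rules(data, rules):
    """Run every rule over the raw bytes of one side of the connection."""
    for pattern, replacement in rules:
        data = pattern.sub(replacement, data)
    return data


request = b"GET / HTTP/1.1\r\nUser-Agent: curl/7.0\r\nHost: x\r\n\r\n"
page = b'<form action="/login" onsubmit="return check()">'
print(apply_rules(request, REQUEST_RULES))
print(apply_rules(page, RESPONSE_RULES))
```

Working on bytes avoids having to guess the character encoding. One caveat: if a rule changes the length of a message body, the proxy also has to correct the Content-Length header before relaying it.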

Willow

Name Willow
Last updated YYYY-MM-DD 2003-06-06
HTTP version 1.1
HTTPS Connect yes
Windows NTLM support yes
Minimum python version 2.2.2 or greater
Platform Linux only. From the product page: "There is no windows suport at this time. It is being worked on and an should be release shortly."
Author Digital Lumber
Home page http://www.digitallumber.com/willow
Primary features Bayesian content filtering, browser-based interface, caching, Windows NTLM authentication
Features

From the product page

Willow is a content-filtering proxy server. It bears one similarity to the many other pieces of software available for web filtering in that it is designed to filter web content. That, however, is where the similarities end. The differences between Willow and other solutions are significant, and these differences make Willow the first really usable internet filter.

In addition to being the first web filter to really work, Willow was also designed to make life easy on network administrators. To this end Willow supports the following:

  • HTTPS tunneling
  • response caching
  • filtering based on any part of the request or response (domain, url, headers, etc.)
  • through-the-web management
  • authentication to a Windows NT/2000 domain
  • authentication through unix password files
License LGPL
Design Architecture Asyncore
Notes Willow is interesting because it uses Bayesian filtering to recognise the content that you don't want to see. Its Bayesian classifier has to be trained on both good and bad content before it can tell the difference between them, so you need some bad content to train it on: a problem which the authors have solved by including a corpus of pornography in the download!
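
To show what "trained on good and bad content" means in practice, here is a small naive Bayes text classifier. I have not checked how Willow implements its filter, so treat this as a generic sketch of the technique; the BayesFilter class and the training sentences are made up for the example.

```python
import math
import re
from collections import Counter


class BayesFilter:
    """A tiny naive Bayes text classifier with two classes: good and bad."""

    def __init__(self):
        self.words = {"good": Counter(), "bad": Counter()}
        self.docs = Counter()

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        self.words[label].update(self.tokenize(text))
        self.docs[label] += 1

    def score(self, text, label):
        """Log of P(label) times the product of P(word | label)."""
        vocabulary = set(self.words["good"]) | set(self.words["bad"])
        total = sum(self.words[label].values())
        result = math.log(self.docs[label] / sum(self.docs.values()))
        for word in self.tokenize(text):
            # Add-one smoothing, so unseen words do not zero the product.
            count = self.words[label][word] + 1
            result += math.log(count / (total + len(vocabulary) + 1))
        return result

    def classify(self, text):
        # Both classes must have been trained at least once.
        return max(("good", "bad"), key=lambda label: self.score(text, label))


spam_filter = BayesFilter()
spam_filter.train("python proxy server source code", "good")
spam_filter.train("open source http library release", "good")
spam_filter.train("cheap pills buy now", "bad")
spam_filter.train("buy cheap watches now", "bad")
print(spam_filter.classify("buy cheap pills"))
```

The classifier knows nothing until it has seen examples of both classes, which is exactly why a filter of this kind has to ship with, or be given, a corpus of the unwanted content.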

Amit's Web Proxy Project

Name Amit's Web Proxy Project
Last updated YYYY-MM-DD 2003-11-16
HTTP version 1.1
HTTPS Connect no
Windows NTLM support no
Minimum python version all
Platform Mostly *nix
Author Amit Patel
Home page http://theory.stanford.edu/~amitp/proxy.html
Primary features Filtering/blocking, compression, experimental architectural approaches, range of loadable modules
Features

Amit has written a variety of proxies, with differing features. Among the features of the varying proxies are

  1. Content filtering
  2. Content filtering on streamed content
  3. External configuration files
  4. Gzip encoding
  5. Chunked encoding
  6. Persistent Connections
  7. DNS Lookups
  8. Loadable modules
  9. A java applet which acts as a user interface

Among the loadable modules that come with Proxy 3 are

  1. mod_proxy: provide magic URLs that display the proxy's internal state.
  2. mod_stdio: listen for proxy events (HTTP connections, errors, timeouts, ad removal, etc.) and display them on stdout.
  3. mod_curses: listen for proxy events and display them in a curses UI.
  4. mod_gtk: listen for proxy events and display them in a GTK UI; also allow changing settings.
  5. mod_ui: listen for proxy events and display them in a Java applet (not included); also allow changing settings.
  6. mod_stats: listen for proxy events and display statistics when the proxy exits.
  7. mod_timing: display slow DNS lookups and slow proxy filters.
  8. mod_cookies: listen for cookie events (sent by server, sent by browser) and display them on stdout.
  9. mod_headers: display HTTP headers (sent by server, sent by browser) on stdout.
  10. mod_html: modify HTML -- change Slashdot color scheme from green to blue; rearrange My Excite portal layout; change Microsoft quotes to standard ASCII quotes; remove popup ads; remove banner ads.
  11. mod_geocities: modify HTML -- remove Geocities popups.
  12. mod_java: modify Java bytecode (wrap audio, thread, frame, socket objects).
  13. mod_slashdot: modify images -- change Slashdot color scheme from green to blue by altering the GIF files.
  14. mod_block: block clear GIFs.
  15. mod_cache: cache documents forever.
  16. mod_dnsprefetch: parse HTML documents, find hostnames, prefetch the DNS lookups for them so when you click on a link, the (often slow) DNS lookup is already performed.
  17. mod_formdata: display form upload data.
  18. mod_ignorecache: remove headers which tell the browser not to cache certain sites.
  19. mod_nocookie: block servers from setting cookies.
License MIT
Design Architecture Asyncore
Notes Amit has some good notes on the various implementations he has created, including discussion of techniques. Amit put a lot of work into transforming content as it passed through the proxy: e.g. eliminating JavaScript that popped up windows, eliminating requests to ad-servers, etc.
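
The idea behind mod_dnsprefetch (parse the HTML, find the hostnames, resolve them ahead of time) can be sketched with the standard library as follows. This is not Amit's code; the names HostnameCollector, find_hostnames and prefetch_dns are mine, and the lookups here are done synchronously for simplicity, whereas a real proxy would do them in the background.

```python
import socket
from html.parser import HTMLParser
from urllib.parse import urlsplit


class HostnameCollector(HTMLParser):
    """Collects the hostnames that appear in href and src attributes."""

    def __init__(self):
        super().__init__()
        self.hostnames = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                try:
                    host = urlsplit(value).hostname
                except ValueError:
                    continue  # malformed URL: nothing to prefetch
                if host:
                    self.hostnames.add(host)


def find_hostnames(html):
    collector = HostnameCollector()
    collector.feed(html)
    return collector.hostnames


def prefetch_dns(html, resolve=socket.gethostbyname):
    """Resolve every hostname in the page; failed lookups are ignored."""
    resolved = {}
    for host in sorted(find_hostnames(html)):
        try:
            resolved[host] = resolve(host)
        except OSError:
            pass  # an unresolvable name should not break the page
    return resolved
```

Relative links such as /local have no hostname and are skipped, since the browser will reuse the host it is already talking to.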

HTTP Replicator

Name HTTP Replicator
Last updated YYYY-MM-DD 2004-11-29
HTTP version 1.1
HTTPS Connect no
Windows NTLM support no
Minimum python version 2.3
Platform Linux
Author Gertjan
Home page http://freshmeat.net/projects/http-replicator
Primary features Caching. Designed for caching Debian packages. Good asyncore design pattern.
Features

Replicator is a replicating HTTP proxy server. Files that are downloaded through the proxy are transparently stored in a private cache, so an exact copy of accessed remote files is created on the local machine. It is in essence a general purpose proxy server, but especially suited for maintaining a cache of Debian packages.

License GPL
Design Architecture Asyncore
Notes A Debian package is available. The Python code is discussed on the project homepage. The code is clear and concise enough to make HTTP Replicator a good model for writing an asynchronous (asyncore) proxy.
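
The replicating cache described above (downloaded files are stored so that an exact copy exists locally) can be sketched like this. It is an illustration of the idea rather than HTTP Replicator's actual layout or code: cache_path and fetch_cached are hypothetical names, query strings are ignored, and nothing ever expires.

```python
import os
from urllib.parse import urlsplit


def cache_path(root, url):
    """Map a URL to a file below root, mirroring host and path."""
    parts = urlsplit(url)
    path = parts.path.lstrip("/") or "index.html"
    if path.endswith("/"):
        path += "index.html"
    target = os.path.normpath(os.path.join(root, parts.hostname or "_", path))
    # Refuse paths such as /../../etc/passwd that would escape the cache.
    if not target.startswith(os.path.join(os.path.normpath(root), "")):
        raise ValueError("URL escapes the cache directory: %r" % url)
    return target


def fetch_cached(url, root, fetch):
    """Return the body for url, calling fetch(url) only on a cache miss."""
    target = cache_path(root, url)
    if os.path.isfile(target):
        with open(target, "rb") as cached:
            return cached.read()
    body = fetch(url)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "wb") as out:
        out.write(body)
    return body
```

Package files suit this scheme because a given package URL always refers to the same bytes; for ordinary web pages the cache would also need to honour expiry and validation headers.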

MindRetrieve

Name MindRetrieve
Last updated YYYY-MM-DD 2005-03-28
HTTP version 1.0
HTTPS Connect no
Windows NTLM support no
Minimum python version 2.4
Platform Windows and Linux
Author Wai Yip Tung
Home page http://www.mindretrieve.net/
Primary features Indexing proxy
Features From the product page: "MindRetrieve is a personal desktop search engine. Unlike general purpose search engines that intent to index and search every web page in the world, MindRetrieve focus on a small but very special area. That is the web that you have seen. "
License BSD
Design Architecture Threaded
Notes This proxy depends on PyLucene, a Python extension that wraps Jakarta Lucene.

Personal Proxy Server

Name Personal Proxy Server
Last updated YYYY-MM-DD 2005-05-24
HTTP version 1.0
HTTPS Connect no
Windows NTLM support no
Minimum python version 2.3
Platform all
Author Duncan Gough
Home page http://www.suttree.com/code/pps/
Primary features Indexing proxy
Features PPS is a local web proxy that indexes the URL, page title, date and text of every page you visit. It feeds that data into a database that can be queried later, providing you with a searchable browser history.
License LGPL
Design Architecture Threaded
Notes This proxy is based on Tiny HTTP Proxy and depends on Lupy, a Python port of Jakarta Lucene 1.2.
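
To illustrate the indexing step described under Features (URL, page title, date and text go into a database that can be queried later), here is a sketch that uses sqlite3 and a plain LIKE query as a stand-in for a real full-text index such as Lupy. None of these names (PageText, open_index, record_page, search) come from PPS itself.

```python
import sqlite3
from datetime import datetime, timezone
from html.parser import HTMLParser


class PageText(HTMLParser):
    """Extracts the <title> and the visible text of an HTML page."""

    def __init__(self):
        super().__init__()
        self.title, self.chunks, self._inside = "", [], None

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "script", "style"):
            self._inside = tag

    def handle_endtag(self, tag):
        if tag == self._inside:
            self._inside = None

    def handle_data(self, data):
        if self._inside == "title":
            self.title += data
        elif self._inside is None and data.strip():
            self.chunks.append(data.strip())  # script/style text is skipped


def open_index(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS pages "
               "(url TEXT PRIMARY KEY, title TEXT, visited TEXT, body TEXT)")
    return db


def record_page(db, url, html):
    """Called by the proxy for every HTML response it relays."""
    parser = PageText()
    parser.feed(html)
    visited = datetime.now(timezone.utc).isoformat()
    db.execute("INSERT OR REPLACE INTO pages VALUES (?, ?, ?, ?)",
               (url, parser.title.strip(), visited, " ".join(parser.chunks)))
    db.commit()


def search(db, word):
    pattern = "%" + word + "%"
    rows = db.execute("SELECT url, title FROM pages "
                      "WHERE title LIKE ? OR body LIKE ? "
                      "ORDER BY visited DESC", (pattern, pattern))
    return rows.fetchall()
```

A substring match is far cruder than what a Lucene-style index offers (no ranking, no stemming), but it is enough to show where the proxy hands each page over to the indexer.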

Twisted Proxy

Name Twisted Proxy
Last updated YYYY-MM-DD 2005-05-24
HTTP version 1.0
HTTPS Connect no
Windows NTLM support no
Minimum python version 2.3
Platform all
Author Duncan Gough
Home page http://www.suttree.com/code/proxy/
Primary features Twisted Proxy
Features "This is an extremely simple proxy server that I created since there were no examples of how to use the Twisted proxy code that I could find."
License LGPL
Design Architecture Twisted
Notes This proxy depends on Twisted

WebDebug

Name WebDebug
Last updated YYYY-MM-DD 2002
HTTP version 1.1
HTTPS Connect no
Windows NTLM support no
Minimum python version 2.0 (older versions support 1.5.x)
Platform all
Author Paul Clip
Home page http://www.cyberclip.com/webdebug/index.html
Primary features Request and Resource logging, Browser interface, Statistics
Features

From the product page: Here is a quick summary of WebDebug's features:

  1. Written in Python, tested on Windows NT, 2000 and Solaris. Should run on most (all?) environments that Python supports. Runs best with thread support but you can turn that off
  2. Records all HTTP requests and responses by acting as a browser's proxy server. Supports Keep-Alive
  3. Displays these HTTP messages (including headers, content, etc.) through a simple browser interface
  4. Generates statistics such as total message size, total size of each mime-type, throughput, etc.
  5. Allows the saving and loading of captured HTTP requests/responses and also statistics in tab-delimited file format, suitable for importing into spreadsheets
License GPL
Design Architecture Threaded
Notes  
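
The statistics in items 4 and 5 (total size, size per mime-type, throughput, tab-delimited export) are easy to sketch. The following is my own illustration rather than WebDebug's code; summarize and to_tab_delimited are hypothetical names, and throughput is computed on the simplifying assumption that the messages were transferred one after another.

```python
from collections import defaultdict


def summarize(messages):
    """Summarize captured responses.

    messages -- iterable of (content_type, size_in_bytes, seconds) tuples
    """
    by_type = defaultdict(int)
    total_bytes = 0
    total_seconds = 0.0
    for content_type, size, seconds in messages:
        # "text/html; charset=utf-8" and "text/html" count as one type.
        mime = content_type.split(";")[0].strip().lower()
        by_type[mime] += size
        total_bytes += size
        total_seconds += seconds
    throughput = total_bytes / total_seconds if total_seconds else 0.0
    return {"total_bytes": total_bytes,
            "bytes_by_type": dict(by_type),
            "bytes_per_second": throughput}


def to_tab_delimited(stats):
    """Render the per-type totals for import into a spreadsheet."""
    lines = ["mime-type\tbytes"]
    for mime, size in sorted(stats["bytes_by_type"].items()):
        lines.append("%s\t%d" % (mime, size))
    return "\n".join(lines)


captured = [("text/html; charset=utf-8", 2000, 0.5),
            ("image/gif", 500, 0.25),
            ("text/html", 1500, 0.25)]
print(to_tab_delimited(summarize(captured)))
```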

httpMonitor

Name httpMonitor
Last updated YYYY-MM-DD 2002-03-15
HTTP version 1.1
HTTPS Connect no
Windows NTLM support no
Minimum python version 1.5
Platform all
Author Volker Stampa
Home page http://home.wtal.de/stampa/httpMonitor/
Primary features Logging, header transformation
Features

From the product page

In a XML-configuration file you can specify python-functions as processors for the HTTP-messages you are interested in (requests or responses, which match certain criteria). You may have as many processors as you want, without modifying the actual program. The only thing you need to do, is configure the httpMonitor and write your processors.

The package comes with three sample processors. One for logging header-information of the HTTP messages (httpMonitor.filters.logger), one for modifying header fields (httpMonitor.filters.headermodifier) and a third one for accessing (and modifying) parsed html-text. The processors can be configured to tell them exactly what to log or what to modify.

License GPL
Design Architecture Threaded
Notes  
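
The processor idea (plain Python functions that are applied to the messages matching certain criteria) can be sketched as below. The real httpMonitor reads its configuration from an XML file and I have not reproduced its schema or its processor signature; the Monitor class, the function(url, headers) signature and the example URLs are all assumptions made for illustration.

```python
import re


class Monitor:
    """Routes HTTP messages to the processors whose criteria they match."""

    def __init__(self):
        self._processors = []

    def add_processor(self, kind, url_regex, function):
        """kind is "request" or "response"; function(url, headers) -> headers."""
        self._processors.append((kind, re.compile(url_regex), function))

    def handle(self, kind, url, headers):
        for wanted_kind, pattern, function in self._processors:
            if wanted_kind == kind and pattern.search(url):
                headers = function(url, headers)
        return headers


log = []


def log_headers(url, headers):
    """Sample processor: record which headers were seen, change nothing."""
    log.append((url, sorted(headers)))
    return headers


def drop_referer(url, headers):
    """Sample processor: return a copy of the headers without Referer."""
    return {k: v for k, v in headers.items() if k.lower() != "referer"}


monitor = Monitor()
monitor.add_processor("request", r".", log_headers)
monitor.add_processor("request", r"^http://tracker\.example\.com/", drop_referer)
```

Because processors are registered rather than hard-coded, new ones can be added without touching the proxy itself, which is the point the product page makes.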

AdZapper

Name AdZapper
Last updated YYYY-MM-DD 2001-09-02
HTTP version 1.1
HTTPS Connect no
Windows NTLM support no
Minimum python version 1.5.2
Platform all
Author Adam Feuer
Home page http://www.zaplet.org/adzapper/
Primary features Elimination of Banner Advertisements. Programmable through XML "zaplets": a wizard is provided. Controllable through a web/browser interface.
Features

From the product page:

adzapper is a filtering proxy that can block ads from being displayed on your web browser. instead of ad banners, you see blank spaces: adzapper transforms the ads into transparent gifs.

The rules that describe what are ads and what are not ads are called "zaplets", and are configurable on a per-website basis. This way, when websites change their ads or their graphic design, it is easy to build and share new zaplets that block the ads.

adzapper is based on Sam Rushing's Medusa, a very fast asynchronous-sockets web-server written in Python. Medusa is single threaded, but this doesn't mean it is slow! in my experience it is one of the fastest, lightest webservers out there.

License Python
Design Architecture Medusa
Notes From the product's About Page: "adzapper was inspired by the Muffin filtering proxy written by Mark Boyns, and additional inspiration and the impetus to start coding was provided by Constantinos Kotsokalis and his CTC filtering proxy".
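
The behaviour described above (per-website rules decide what is an ad, and ads are answered with a transparent GIF) can be sketched as follows. Real zaplets are XML documents whose format I have not reproduced; the RULES table, the zap function and the example hostnames are hypothetical.

```python
import base64
import re
from urllib.parse import urlsplit

# A 1x1 transparent GIF, served in place of every blocked image.
TRANSPARENT_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7")

# Hypothetical per-site rules: site name -> patterns matching ad paths.
RULES = {
    "example.com": [re.compile(r"/banners/"), re.compile(r"/ads?/")],
}


def rules_for(host):
    """Collect the rules of every site that host belongs to."""
    matched = []
    for site, patterns in RULES.items():
        if host == site or host.endswith("." + site):
            matched.extend(patterns)
    return matched


def zap(url):
    """Return a replacement (content_type, body) for ad URLs, else None."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if any(pattern.search(parts.path) for pattern in rules_for(host)):
        return "image/gif", TRANSPARENT_GIF
    return None
```

When zap returns None the proxy relays the request as usual; otherwise it answers straight away with the tiny GIF and never contacts the ad server.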

Archiver Proxy

Name Archiver Proxy
Last updated YYYY-MM-DD 2001-08-29
HTTP version 1.0
HTTPS Connect yes
Windows NTLM support no
Minimum python version 2.0
Platform all
Author Aaron Swartz
Home page http://logicerror.com/archiverProxy
Primary features "An http proxy server which archives all your HTTP traffic."
Features

 

License "This program is free software."
Design Architecture Asyncore
Notes Based on code from The Medusa Tutorial and Neil Schemenauer's Munchy.

Cut the Crap

Name Cut the Crap
Last updated YYYY-MM-DD 2001-08-20
HTTP version 1.1
HTTPS Connect no
Windows NTLM support no
Minimum python version 2.0
Platform all
Author Constantinos A. Kotsokalis
Home page http://www.softlab.ece.ntua.gr/~ckotso/CTC/
Primary features Content filtering, filters extensible through python "plug-ins" and "zapplets language", ACL (Access Control Lists)
Features From the product page: "Cut The Crap" is a proxy-like server, that will keep advertisement banners out of your sight
License GPL
Design Architecture Asyncore
Notes  

Alfajor

Name Alfajor
Last updated YYYY-MM-DD 2000-02-28
HTTP version 1.0
HTTPS Connect no
Windows NTLM support no
Minimum python version 1.5.2
Platform all
Author Andrew Cooke
Home page http://www.acooke.org/jara/alfajor/index.html
Primary features Cookie filtering, basic content filtering, optional GUI
Features

From the product page

Alfajor is an http cookie filter, written in Python with an optional GUI. It acts as an http proxy (you must configure your browser to use it) and can either contact sites directly or work with a second proxy (eg. a cache).

The program allows cookies to be sent from your browser to any address which matches a list of regular expressions. The GUI allows the filter to be turned on and off, sites to be added or removed as they are accessed, and the list of regular expressions to be edited.

From version 1.3, Alfajor can block all data from named sites. This allows filtering of adverts. By running two instances of the program in series, cookies can be controlled and adverts blocked. For more information, see here.

Please note that Alfajor does not fully conform to any HTTP version. However, in practice, it works with the vast majority of sites.

License GPL
Design Architecture Threaded
Notes  
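
The cookie rule described above (cookies may only be sent to addresses that match a list of regular expressions) comes down to a few lines. This is a sketch of the rule, not Alfajor's code; filter_cookies and the example patterns are hypothetical.

```python
import re


def filter_cookies(host, headers, allowed_patterns):
    """Drop the Cookie header unless host matches an allowed pattern.

    headers -- list of (name, value) pairs from the browser's request
    """
    if any(re.search(pattern, host) for pattern in allowed_patterns):
        return list(headers)
    return [(name, value) for name, value in headers
            if name.lower() != "cookie"]


allowed = [r"(^|\.)example\.com$"]
request_headers = [("Host", "shop.example.com"), ("Cookie", "session=abc")]
print(filter_cookies("shop.example.com", request_headers, allowed))
print(filter_cookies("ads.tracker.net", request_headers, allowed))
```

Anchoring the pattern with $ matters: without it, a host such as example.com.evil.net would match and receive the cookie.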

Tiny HTTP Proxy

Name Tiny HTTP Proxy
Last updated YYYY-MM-DD 2003-06-17
HTTP version 1.1
HTTPS Connect yes
Windows NTLM support no
Minimum python version 2.1 and greater
Platform Beos/Windows
Author Suzuki Hisao
Home page http://mail.python.org/pipermail/python-list/2003-June/210343.html
Primary features Solid basic design pattern
Features  
License Unknown
Design Architecture Threaded
Notes This proxy is so small, there is no home page for it. The link given above points to the archive of Suzuki's post of the code to comp.lang.python.

TCPWatch

Name TCPWatch
Last updated YYYY-MM-DD 2003-06-17
HTTP version 1.1
HTTPS Connect yes
Windows NTLM support no
Minimum python version 2.1
Platform all
Author Shane Hathaway
Home page http://hathaway.freezope.org/Software/TCPWatch
Primary features TCP and HTTP protocol debugging, optional GUI Interface
Features

From the product page: TCPWatch is a utility written in Python that lets you monitor forwarded TCP connections or HTTP proxy connections. It displays the sessions in a window with a history of past connections. It is useful for developing and debugging protocol implementations and web services. The latest version, 1.2, adds support for recording sessions to a directory.

License Zope ZPL
Design Architecture Unknown
Notes  

Mediator

Name Mediator
Last updated YYYY-MM-DD Unknown
HTTP version 1.1
HTTPS Connect no
Windows NTLM support no
Minimum python version ??
Platform all
Author Itamar Shtull-Trauring
Home page http://www.itamarst.org/software/
Primary features Mediator support, content filtering
Features Apart from its ability to filter content, stripping ads, javascript, etc, this proxy has one unique feature: It can forward all requests through a given anonymiser site or other "mediator" site. Best to visit the page and see it in action.
License GPL
Design Architecture Threaded
Notes This is derived from Neil Schemenauer's Munchy proxy. The proxy is a single Python source file.

Munchy

Name Munchy
Last updated YYYY-MM-DD 2000-04-25
HTTP version 1.0
HTTPS Connect no
Windows NTLM support no
Minimum python version ??
Platform all
Author Neil Schemenauer
Home page http://arctrix.com/nas/python/munchy.py
Primary features Design pattern for basic content filtering and blocking
Features Munchy is designed to strip ads and other time/bandwidth consuming and privacy invading stuff.
License Unknown
Design Architecture Threaded
Notes The content filtering database is well out-of-date.

HTTP Debugging Proxy

Name HTTP Debugging Proxy
Last updated YYYY-MM-DD Unknown
HTTP version 1.1
HTTPS Connect yes
Windows NTLM support no
Minimum python version all
Platform all
Author Xavier Defrang
Home page http://defrang.com/python/
Primary features Basic debugging proxy
Features This small HTTP proxy prints the headers of all HTTP requests and responses
License Unknown
Design Architecture Threaded
Notes This proxy is based on Tiny HTTP Proxy. The proxy is a single Python source file.


Source: http://www.xhaus.com/alan/python/proxies.html (page by Alan Kennedy)