Setting Up a Free Proxy Server with Google App Engine

Tags: Python, Google App Engine

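I suspect most netizens in China have been blocked by the GFW at some point, and users on the education network have it even worse, since they cannot reach overseas websites at all. For a while, hunting for proxy servers was a required course for outlaws like us.
But the GFW's days are finally numbered. Going with the tide, I have set up a free proxy server on Google App Engine, and I no longer have to fear the GFW.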

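Perhaps a handful of law-abiding citizens still don't know what the GFW is. I won't explain the term; just click the 3 sites below and see whether you can open them, and you will know what the GFW is.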
http://www.baidu.jp/
http://www.antigfw.com/
http://chinagfw.blogspot.com/

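First, a word about the proxy server itself.
The program is called GAppProxy and can be downloaded from Google Code (project home: http://gappproxy.googlecode.com).
The project page has its own instructions, so if you would rather not sit through my rambling, read those instead.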

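Next, how to use it.
1. If you are a Python user, first make sure Python 2.5 or later is installed (a 2.x release, since the script uses Python 2 syntax). Then save the code at the end of this post as proxy.py and double-click it to run; the client is now set up.
Next, open your browser's proxy settings, set the proxy server to 127.0.0.1:8000, and try visiting the 3 sites above. Users on the education network should now be able to reach overseas sites as well (as long as you can reach http://keakon.appspot.com).
Notes:
(1) Do not close the command-line window, or you will lose the connection to the proxy server.
(2) To use a different port, edit DEF_LISTEN_PORT = 8000 in proxy.py and change 8000 to any positive integer no greater than 65535 that is not already in use.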

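2. If you are not a Python user, download the Windows client from the download link above (the file is named GAppProxy.r55.exe), double-click it to extract the files, run gui.exe, tick Use FetchServer, enter http://keakon.appspot.com/fetch.py in the field to its right, and click the save button. (You can also leave the URL empty, in which case a proxy server set up by someone else is used.)
Then configure your browser as described above, and remember not to close gui.exe.
For more detailed instructions, see "GAppProxy Windows Client Usage Instructions".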

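I strongly recommend the first method, because it starts quickly and uses few resources. The second has to load a lot of things and takes some patience.
That said, the latest version seems to start faster, and it already runs quickly for me; see the updates at the end of this post.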
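
Whichever client you run, it can be handy to confirm that something is actually listening on the port you configured before blaming the browser. The following is a minimal sketch, not part of GAppProxy: the helper name is_listening is made up for illustration, it assumes the default address 127.0.0.1:8000 from this post, and it is written for Python 3 (unlike the Python 2 listing at the end).

```python
import socket


def is_listening(host="127.0.0.1", port=8000, timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper for illustration; the defaults match the
    address used in this post.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns 0 on success and an errno value otherwise
        return sock.connect_ex((host, port)) == 0
    finally:
        sock.close()


if __name__ == "__main__":
    print("local proxy listening:", is_listening())
```

A True result only shows that the port is open; it does not prove that the fetch server behind it is reachable.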

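Now for setting up the server.
First download the source code from the download link above, extract it, and copy the fetchserver folder into the Google App Engine SDK folder.
Edit app.yaml inside fetchserver and change application: your_application_name to your own application name. Mine, for example, is application: keakon
Then run this command from the Google App Engine SDK folder:
appcfg.py update fetchserver/
Enter your user name and password when prompted, and the files are uploaded. After that the server is up.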

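To actually use this proxy server, however, you also have to change the client configuration, as follows:
Go into the localproxy folder and delete everything except proxy.py and common.py.
Edit common.py and change DEF_FETCH_SERVER = '' to your own server. Mine, for example, is DEF_FETCH_SERVER = 'http://keakon.appspot.com/fetch.py'
After that, double-click proxy.py to run it.
If you want a single file like mine, copy the contents of common.py into proxy.py and remove the common module from the import statement in proxy.py.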
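
If you redeploy under different application names, the one-line edit to common.py can also be scripted. This is only a sketch under assumptions: the function name set_fetch_server is hypothetical, the code is written for Python 3, and it assumes common.py contains exactly one line beginning with DEF_FETCH_SERVER, as described above.

```python
import re


def set_fetch_server(source, url):
    """Return `source` with its DEF_FETCH_SERVER assignment pointing at `url`.

    Hypothetical helper; assumes exactly one such assignment exists.
    """
    pattern = re.compile(r"^DEF_FETCH_SERVER\s*=.*$", re.MULTILINE)
    new_line = "DEF_FETCH_SERVER = %r" % url
    # A function replacement avoids escape processing of the URL text
    updated, count = pattern.subn(lambda match: new_line, source)
    if count != 1:
        raise ValueError(
            "expected exactly one DEF_FETCH_SERVER line, found %d" % count)
    return updated
```

To apply it, read common.py into a string, pass it through set_fetch_server together with your own fetch.py URL, and write the result back.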

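One last point: Python 2.5 does not support HTTPS, because the ssl module that proxy.py relies on only arrived in Python 2.6 (the GAppProxy Windows client is built on Python 2.5). With Python 2.6 or later HTTPS should work, provided the LocalProxyServer.cert and LocalProxyServer.key files that proxy.py loads are in the directory you run it from. Security is not guaranteed, though: proxy.py decrypts the HTTPS request locally and relays it through the fetch server.
Also, because of Google App Engine's limits, files that are too large cannot be fetched through this proxy, so if you were hoping to use it for downloads, forget it.
Finally, the server's IP is usually in the United States. Some websites only allow visitors with domestic IPs, and that is not the fault of Google, the GFW, or the education network...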
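
The HTTPS requirement above can be checked before starting the proxy. The sketch below is an illustration only: https_ready is a made-up name, the code targets Python 3, and the two default file names are the ones proxy.py passes to ssl.SSLSocket in the listing at the end of this post.

```python
import os


def https_ready(cert="./LocalProxyServer.cert", key="./LocalProxyServer.key"):
    """Return (ok, reason) describing whether the HTTPS path could work.

    Hypothetical helper; it only checks preconditions and never
    opens a connection.
    """
    try:
        import ssl  # noqa: F401  (missing on Python 2.5)
    except ImportError:
        return False, "ssl module not available"
    missing = [path for path in (cert, key) if not os.path.isfile(path)]
    if missing:
        return False, "missing file(s): " + ", ".join(missing)
    return True, "ok"
```

A passing check says nothing about whether the browser will trust the local certificate; it only catches the two failures that can be detected up front.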

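2010/01/09 update:
A version for Windows that supports HTTPS and does not require installing Python:
http://gs.keakon.net/proxy.7z
Note that it automatically generates a proxy.exe.txt file; if it takes up too much space, just delete it from time to time.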

2010/01/19 update:
See "How to Use a Proxy Server".

proxy.py source code:
#! /usr/bin/env python
#############################################################################
#                                                                           #
#   File: proxy.py                                                          #
#                                                                           #
#   Copyright (C) 2008 Du XiaoGang <dugang@188.com>                         #
#                                                                           #
#   Home: http://gappproxy.googlecode.com                                   #
#                                                                           #
#   This file is part of GAppProxy.                                         #
#                                                                           #
#   GAppProxy is free software: you can redistribute it and/or modify       #
#   it under the terms of the GNU General Public License as                 #
#   published by the Free Software Foundation, either version 3 of the      #
#   License, or (at your option) any later version.                         #
#                                                                           #
#   GAppProxy is distributed in the hope that it will be useful,            #
#   but WITHOUT ANY WARRANTY; without even the implied warranty of          #
#   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the           #
#   GNU General Public License for more details.                            #
#                                                                           #
#   You should have received a copy of the GNU General Public License       #
#   along with GAppProxy.  If not, see <http://www.gnu.org/licenses/>.      #
#                                                                           #
#############################################################################

import BaseHTTPServer, SocketServer, urllib, urllib2, urlparse, zlib, \
       socket, os, sys
try:
    import ssl
    SSLEnable = True
except ImportError:
    SSLEnable = False

# global variables
LOAD_BALANCE = 'http://gappproxy-center.appspot.com/available_fetchserver.py'
GOOGLE_PROXY = 'google.cn:80'
DEF_FETCH_SERVER = 'http://keakon.appspot.com/fetch.py'
DEF_LISTEN_PORT = 8000
localProxy = ''
fetchServer = DEF_FETCH_SERVER

class GAppProxyError(Exception):
    def __init__(self, reason):
        self.reason = reason

    def __str__(self):
        return '<GAppProxy Error: %s>' % self.reason

class LocalProxyHandler(BaseHTTPServer.BaseHTTPRequestHandler):
    PostDataLimit = 0x100000

    def do_CONNECT(self):
        if not SSLEnable:
            # Not Implemented
            print 'HTTPS is not enabled: HTTPS needs Python 2.6 or later.'
            self.wfile.write('HTTP/1.1 501 Not Implemented\r\n')
            self.wfile.write('\r\n')
            self.connection.close()
            return

        # for ssl proxy
        (httpsHost, _, httpsPort) = self.path.partition(':')
        if httpsPort != '' and httpsPort != '443':
            # unsupported port: only 443 is allowed
            self.wfile.write('HTTP/1.1 403 Forbidden\r\n')
            self.wfile.write('\r\n')
            self.connection.close()
            return

        # continue
        self.wfile.write('HTTP/1.1 200 OK\r\n')
        self.wfile.write('\r\n')
        sslSock = ssl.SSLSocket(self.connection,
                                server_side=True,
                                certfile='./LocalProxyServer.cert',
                                keyfile='./LocalProxyServer.key')

        # rewrite request line, url to abs
        firstLine = ''
        while True:
            chr = sslSock.read(1)
            # EOF?
            if chr == '':
                # bad request
                sslSock.close()
                self.connection.close()
                return
            # newline(\r\n)?
            if chr == '\r':
                chr = sslSock.read(1)
                if chr == '\n':
                    # got
                    break
                else:
                    # bad request
                    sslSock.close()
                    self.connection.close()
                    return
            # newline(\n)?
            if chr == '\n':
                # got
                break
            firstLine += chr

        # get path
        (method, path, ver) = firstLine.split()
        if path.startswith('/'):
            path = 'https://%s' % httpsHost + path

        # connect to local proxy server
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.connect(('127.0.0.1', DEF_LISTEN_PORT))
        sock.send('%s %s %s\r\n' % (method, path, ver))

        # forward https request
        sslSock.settimeout(1)
        while True:
            try:
                data = sslSock.read(8192)
            except ssl.SSLError, e:
                if str(e).lower().find('timed out') == -1:
                    # error
                    sslSock.close()
                    self.connection.close()
                    sock.close()
                    return
                # timeout
                break
            if data != '':
                sock.send(data)
            else:
                # EOF
                break
        sslSock.setblocking(True)

        # simply forward response
        while True:
            data = sock.recv(8192)
            if data != '':
                sslSock.write(data)
            else:
                # EOF
                break

        # clean
        sock.close()
        sslSock.shutdown(socket.SHUT_WR)
        sslSock.close()
        self.connection.close()

    def do_METHOD(self):
        # check http method and post data
        method = self.command
        if method == 'GET' or method == 'HEAD':
            # no post data
            postDataLen = 0
        elif method == 'POST':
            # get length of post data
            postDataLen = 0
            if self.headers.has_key('Content-Length'):
                postDataLen = int(self.headers['Content-Length'])
            # exceed limit?
            if postDataLen > self.PostDataLimit:
                self.send_error(403)
                self.connection.close()
                return
        else:
            # unsupported method
            self.send_error(501)
            self.connection.close()
            return

        # get post data
        postData = ''
        if postDataLen > 0:
            postData = self.rfile.read(postDataLen)
            if len(postData) != postDataLen:
                # bad request
                self.send_error(400)
                self.connection.close()
                return

        # do path check
        (scm, netloc, path, params, query, _) = urlparse.urlparse(self.path)
        if (scm.lower() != 'http' and scm.lower() != 'https') or not netloc:
            self.send_error(400)
            self.connection.close()
            return
        # create new path
        path = urlparse.urlunparse((scm, netloc, path, params, query, ''))

        # create request for GAppProxy
        params = urllib.urlencode({'method': method,
                                   'path': path,
                                   'headers': self.headers,
                                   'encodeResponse': 'compress',
                                   'postdata': postData,
                                   'version': 'r55'})
        # accept-encoding: identity, *;q=0
        # connection: close
        request = urllib2.Request(fetchServer)
        request.add_header('Accept-Encoding', 'identity, *;q=0')
        request.add_header('Connection', 'close')
        # create new opener
        if localProxy != '':
            proxy_handler = urllib2.ProxyHandler({'http': localProxy, \
                                                  'https': localProxy})
        else:
            proxy_handler = urllib2.ProxyHandler({'http': GOOGLE_PROXY, \
                                                  'https': GOOGLE_PROXY})
        opener = urllib2.build_opener(proxy_handler)
        # set the opener as the default opener
        urllib2.install_opener(opener)
        resp = urllib2.urlopen(request, params)

        # parse resp
        textContent = True
        # for status line
        line = resp.readline()
        status = int(line.split()[1])
        self.send_response(status)
        # for headers
        while True:
            line = resp.readline()
            line = line.strip()
            # end header?
            if line == '':
                break
            # header
            (name, _, value) = line.partition(':')
            name = name.strip()
            value = value.strip()
            self.send_header(name, value)
            # check Content-Type
            if name.lower() == 'content-type':
                if value.lower().find('text') == -1:
                    # not text
                    textContent = False
        self.end_headers()
        # for page
        if textContent:
            dat = resp.read()
            if len(dat) > 0:
                self.wfile.write(zlib.decompress(dat))
        else:
            self.wfile.write(resp.read())
        self.connection.close()

    do_GET = do_METHOD
    do_HEAD = do_METHOD
    do_POST = do_METHOD

class ThreadingHTTPServer(SocketServer.ThreadingMixIn,
                          BaseHTTPServer.HTTPServer):
    pass

def getAvailableFetchServer():
    request = urllib2.Request(LOAD_BALANCE)
    if localProxy != '':
        proxy_handler = urllib2.ProxyHandler({'http': localProxy})
    else:
        proxy_handler = urllib2.ProxyHandler({'http': GOOGLE_PROXY})
    opener = urllib2.build_opener(proxy_handler)
    # set the opener as the default opener
    urllib2.install_opener(opener)
    resp = urllib2.urlopen(request)
    return resp.read().strip()

def parseConf(confFile):
    global localProxy, fetchServer

    # read config file
    try:
        fp = open(confFile, 'r')
    except IOError:
        # use default parameters
        return
    # parse user defined parameters
    while True:
        line = fp.readline()
        if line == '':
            # end
            break
        # parse line
        line = line.strip()
        if line == '':
            # empty line
            continue
        if line.startswith('#'):
            # comments
            continue
        (name, sep, value) = line.partition('=')
        if sep == '=':
            name = name.strip().lower()
            value = value.strip()
            if name == 'local_proxy':
                localProxy = value
            elif name == 'fetch_server':
                fetchServer = value

if __name__ == '__main__':
    print '--------------------------------------------'
    if SSLEnable:
        print 'HTTP Enabled : YES'
        print 'HTTPS Enabled: YES'
    else:
        print 'HTTP Enabled : YES'
        print 'HTTPS Enabled: NO'

    if fetchServer == '':
        fetchServer = getAvailableFetchServer()
    if fetchServer == '':
        raise GAppProxyError('Invalid response from load balance server.')
    print 'Local Proxy  : %s' % localProxy
    print 'Fetch Server : %s' % fetchServer
    print '--------------------------------------------'
    httpd = ThreadingHTTPServer(('', DEF_LISTEN_PORT),
                                LocalProxyHandler)
    httpd.serve_forever()
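
To make the exchange with the fetch server easier to follow, here is a small Python 3 sketch of the two steps in do_METHOD above: encoding the outgoing request as form fields, and decoding the body that comes back. The field names and the 'r55' version string are copied from the listing; the function names build_fetch_params and decode_body are made up for illustration, and nothing here describes what the server side does with the fields.

```python
import zlib
from urllib.parse import urlencode, parse_qs


def build_fetch_params(method, path, headers, post_data=""):
    """Encode one proxied request as the form body do_METHOD posts."""
    return urlencode({
        "method": method,
        "path": path,
        "headers": headers,              # raw header text of the request
        "encodeResponse": "compress",    # ask for a compressed reply
        "postdata": post_data,
        "version": "r55",
    }).encode("ascii")


def decode_body(content_type, body):
    """Mirror the listing: a body counts as text unless a Content-Type
    header without 'text' was seen, and text bodies are zlib-compressed."""
    is_text = content_type is None or "text" in content_type.lower()
    if is_text and body:
        return zlib.decompress(body)
    return body


if __name__ == "__main__":
    params = build_fetch_params("GET", "http://example.com/",
                                "Host: example.com\r\n")
    print(sorted(parse_qs(params.decode("ascii"), keep_blank_values=True)))
```

Running it prints the names of the six form fields, which is a quick way to see what one proxied request carries.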

2011/10/20 update:
I recommend using goagent instead.
