Reading the GAE Source Once in a While Isn't Bad

Tags: Google App Engine, Python

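With some free time on my hands, I read through the GAE SDK 1.2.2 source. Sure enough, it uses two-space indentation everywhere, just like I do~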

First, webapp (google\appengine\ext\webapp\):

Starting with __init__.py:
This is the part that finds the charset:
_CHARSET_RE = re.compile(r';\s*charset=([^;\s]*)', re.I)
match = _CHARSET_RE.search(environ.get('CONTENT_TYPE', ''))
Here environ is the WSGI-compliant environment dictionary that gets passed in, and this is how the Request object works out the charset.
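To see what that pattern actually captures, here is a small self-contained sketch. Only the regular expression comes from the SDK source quoted above; the helper name get_charset, the lowercasing, and the sample environ dictionaries are my own assumptions for illustration.

```python
import re

# Same pattern as quoted from webapp/__init__.py above.
_CHARSET_RE = re.compile(r';\s*charset=([^;\s]*)', re.I)


def get_charset(environ, default=None):
  """Hypothetical helper: return the charset named in CONTENT_TYPE, if any."""
  match = _CHARSET_RE.search(environ.get('CONTENT_TYPE', ''))
  if match:
    # Lowercasing is a choice made for this sketch.
    return match.group(1).lower()
  return default


# Made-up sample environments.
utf8_env = {'CONTENT_TYPE': 'text/html; charset=UTF-8'}
plain_env = {'CONTENT_TYPE': 'application/x-www-form-urlencoded'}
```

With these samples, get_charset(utf8_env) yields the declared charset, while plain_env falls through to the default because no charset parameter is present.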

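The Response object turns caching off by default. If you wanted to use the development server as the final production system, you would definitely have to change this: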
self.headers['Content-Type'] = 'text/html; charset=utf-8'
self.headers['Cache-Control'] = 'no-cache'
301 and 302 redirects work by setting the Location header:
self.response.headers['Location'] = str(absolute_url)
The request method lives in the environment dictionary:
method = environ['REQUEST_METHOD']
if method == 'GET':
  handler.get(*groups)
elif method == 'POST':
  handler.post(*groups)
# ...
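The dispatch idea can be tried out on its own. The following is only a sketch: it replaces the SDK's if/elif chain with a getattr lookup, and the handler class, the dispatch function, and the list of allowed verbs are all hypothetical names chosen for this example.

```python
class HelloHandler(object):
  """Hypothetical handler with one method per HTTP verb."""

  def get(self, name):
    return 'GET ' + name

  def post(self, name):
    return 'POST ' + name


def dispatch(handler, environ, groups):
  """Call the handler method named after REQUEST_METHOD.

  Returns None for verbs that are not allowed or not implemented.
  """
  method = environ['REQUEST_METHOD']
  if method not in ('GET', 'POST', 'PUT', 'DELETE', 'HEAD'):
    return None
  func = getattr(handler, method.lower(), None)
  if func is None:
    return None
  # groups holds the captures from the URL pattern, as in the quoted code.
  return func(*groups)
```

The explicit whitelist matters: without it, a request could name any attribute of the handler as its "method".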
Next, util.py:

Right at the start, run_wsgi_app contains a very eye-catching dictionary:
env = dict(os.environ)
env["wsgi.input"] = sys.stdin
env["wsgi.errors"] = sys.stderr
env["wsgi.version"] = (1, 0)
env["wsgi.run_once"] = True
env["wsgi.url_scheme"] = wsgiref.util.guess_scheme(env)
env["wsgi.multithread"] = False
env["wsgi.multiprocess"] = False
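Could it be that changing False to True is all it takes to get multithreading and multiprocessing? Probably not: in the WSGI spec (PEP 333) these two flags only describe how the server invokes the application; they don't switch anything on.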

As for _start_response, as expected it just prints the headers directly, then returns sys.stdout.write so it can be used to output the body.
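Put together, the environment dictionary and _start_response amount to a run-once, CGI-style WSGI gateway. Below is a minimal sketch of that idea, not the SDK's code. It is written for Python 3, the names run_cgi_style and hello_app are hypothetical, the output stream is a parameter so the result can be inspected, and the body is written as str for simplicity (a real Python 3 WSGI app would yield bytes).

```python
import io
import os
import sys
import wsgiref.util


def run_cgi_style(app, environ=None, out=None):
  """Run a WSGI app once, writing headers and body to a text stream."""
  env = dict(os.environ if environ is None else environ)
  env['wsgi.input'] = sys.stdin
  env['wsgi.errors'] = sys.stderr
  env['wsgi.version'] = (1, 0)
  env['wsgi.run_once'] = True
  env['wsgi.url_scheme'] = wsgiref.util.guess_scheme(env)
  env['wsgi.multithread'] = False
  env['wsgi.multiprocess'] = False
  out = sys.stdout if out is None else out

  def start_response(status, headers, exc_info=None):
    # Emit the status line and headers right away, then hand back
    # a write callable that the application may use for the body.
    out.write('Status: %s\r\n' % status)
    for name, value in headers:
      out.write('%s: %s\r\n' % (name, value))
    out.write('\r\n')
    return out.write

  for chunk in app(env, start_response):
    out.write(chunk)


def hello_app(environ, start_response):
  """Hypothetical application used only to exercise the gateway."""
  start_response('200 OK', [('Content-Type', 'text/plain')])
  return ['hello']


buf = io.StringIO()
run_cgi_style(hello_app, environ={}, out=buf)
```

After the call, buf holds the status line, the header, a blank line, and then the body, which is the layout a CGI host expects on stdout.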


The other file is the enormous google\appengine\tools\dev_appserver.py:
These are the headers that get ignored, which could be put to mischievous use:
_IGNORE_REQUEST_HEADERS = frozenset(['content-type', 'content-length', 'accept-encoding', 'transfer-encoding'])
_IGNORE_RESPONSE_HEADERS = frozenset([
    'content-encoding', 'accept-encoding', 'transfer-encoding',
    'server', 'date',
    ])
Then come several very useful environment variables:
env = DEFAULT_ENV.copy()

script_name, query_string = split_url(relative_url)

env['SCRIPT_NAME'] = ''
env['QUERY_STRING'] = query_string
env['PATH_INFO'] = urllib.unquote(script_name)
env['PATH_TRANSLATED'] = cgi_path
env['CONTENT_TYPE'] = headers.getheader('content-type',
                                        'application/x-www-form-urlencoded')
env['CONTENT_LENGTH'] = headers.getheader('content-length', '')

cookies = ', '.join(headers.getheaders('cookie'))
email, admin, user_id = get_user_info(cookies)
env['USER_EMAIL'] = email
env['USER_ID'] = user_id
if admin:
  env['USER_IS_ADMIN'] = '1'

for key in headers:
  if key in _IGNORE_REQUEST_HEADERS:
    continue
  adjusted_name = key.replace('-', '_').upper()
  env['HTTP_' + adjusted_name] = ', '.join(headers.getheaders(key))
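The last loop is the standard CGI mapping from request headers to HTTP_* variables. Here is a small sketch of just that step, assuming the headers arrive as a list of (name, value) pairs rather than the message object the SDK uses; headers_to_cgi_env, IGNORED, and the sample data are hypothetical names for this example.

```python
# Same header names as _IGNORE_REQUEST_HEADERS in the quoted source.
IGNORED = frozenset(['content-type', 'content-length',
                     'accept-encoding', 'transfer-encoding'])


def headers_to_cgi_env(header_pairs):
  """Map (name, value) header pairs to HTTP_* CGI variables.

  Repeated headers are joined with ', ' and headers in IGNORED are skipped.
  """
  grouped = {}
  for name, value in header_pairs:
    grouped.setdefault(name.lower(), []).append(value)

  env = {}
  for key, values in grouped.items():
    if key in IGNORED:
      continue
    adjusted_name = key.replace('-', '_').upper()
    env['HTTP_' + adjusted_name] = ', '.join(values)
  return env


# Made-up request headers.
sample = [
    ('User-Agent', 'demo/1.0'),
    ('Accept-Encoding', 'gzip'),
    ('X-Forwarded-For', '10.0.0.1'),
    ('X-Forwarded-For', '10.0.0.2'),
]
```

In this sample, Accept-Encoding is dropped, and the two X-Forwarded-For values are merged into a single HTTP_X_FORWARDED_FOR entry.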
Incidentally, I just came across a recipe in the GAE Cookbook for reading GET parameters:
import os
params = dict([part.split('=') for part in str(os.environ['QUERY_STRING']).split('&')])
In other words, the parameters live in os.environ['QUERY_STRING'], and the code above is where that value actually gets set.
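One caveat about the Cookbook one-liner: it raises ValueError when the query string is empty or when a value itself contains '=', and it does not URL-decode anything. A more forgiving sketch, assuming Python 3 and its standard urllib.parse.parse_qs (the function name get_params is made up here):

```python
from urllib.parse import parse_qs


def get_params(query_string):
  """Return a dict mapping each parameter name to its last value."""
  # keep_blank_values=True keeps parameters such as 'flag' or 'flag='.
  parsed = parse_qs(query_string, keep_blank_values=True)
  return {key: values[-1] for key, values in parsed.items()}
```

Inside a CGI-style handler this would be called as get_params(os.environ.get('QUERY_STRING', '')). Keeping only the last value per key is a simplification; parse_qs itself returns every value in a list.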

The module cache is kept in a dictionary:
output_dict = {}
for module_name, module in module_dict.iteritems():
  if module is None:
    continue

  if IsEncodingsModule(module_name):
    output_dict[module_name] = module
    continue

  shared_prefix = ModuleNameHasPrefix(module_name, SHARED_MODULE_PREFIXES)
  banned_prefix = ModuleNameHasPrefix(module_name, NOT_SHARED_MODULE_PREFIXES)

  if shared_prefix and not banned_prefix:
    output_dict[module_name] = module

return output_dict
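The filtering rule reads as: keep a module only if it falls under a shared prefix and not under a banned one. The sketch below illustrates that rule with made-up prefix sets; the real contents of SHARED_MODULE_PREFIXES and NOT_SHARED_MODULE_PREFIXES are not shown in the excerpt, the prefix-matching semantics in has_prefix are my assumption, and the encodings special case is left out.

```python
SHARED_PREFIXES = frozenset(['shared_pkg'])              # hypothetical
NOT_SHARED_PREFIXES = frozenset(['shared_pkg.private'])  # hypothetical


def has_prefix(module_name, prefixes):
  """True if module_name equals a prefix or lives inside that package."""
  return any(module_name == p or module_name.startswith(p + '.')
             for p in prefixes)


def filter_shared(module_dict):
  """Keep modules under a shared prefix that are not under a banned one."""
  output = {}
  for name, module in module_dict.items():
    if module is None:
      continue
    if (has_prefix(name, SHARED_PREFIXES) and
        not has_prefix(name, NOT_SHARED_PREFIXES)):
      output[name] = module
  return output


# Stand-in for sys.modules, with placeholder objects as modules.
modules = {
    'shared_pkg': object(),
    'shared_pkg.api': object(),
    'shared_pkg.private.secret': object(),
    'shared_pkgx': object(),
    'app_code': object(),
    'missing': None,
}
kept = sorted(filter_shared(modules))
```

Matching on the prefix plus a dot, rather than a bare startswith, is what keeps a lookalike name such as shared_pkgx out of the result.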
File access is cached too (more precisely, the result of the accessibility check):
logical_filename = normcase(os.path.abspath(filename))

result = FakeFile._availability_cache.get(logical_filename)
if result is None:
  result = FakeFile._IsFileAccessibleNoCache(logical_filename,
                                             normcase=normcase)
  FakeFile._availability_cache[logical_filename] = result
return result
It looks like this part would have to go as well:
if not 'Cache-Control' in headers:
  headers['Cache-Control'] = 'no-cache'
