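Some friends have asked me how frameworks such as Flask and Express manage to run inside a cloud function, and what mechanism makes that possible. Others have asked how to build a Component. I took a look at the frameworks the Tencent Cloud Serverless architecture currently supports: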

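Quite a few are supported, but Django, the one I am fondest of, does not seem to be among them. Since that ties in with the questions above, I will briefly walk through how I built a Django Component.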

Analyzing an existing Component (Flask as the example)

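The first step is to understand how other frameworks, Flask for example, are made to run. We start with Tencent Cloud's Flask-Component and deploy it by following its instructions: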

Deployment is quick and painless. Next, in the function console, we download the deployed package to study it:

After downloading and unzipping it, we see a directory structure like this:

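The items boxed in blue are dependency packages, and app.py, marked in yellow, is the code we wrote ourselves. So what are the two files circled in red, and where did they come from?
Contents of api_server.py: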

import app  # Replace with your actual application
import severless_wsgi

# If you need to send additional content types as text, add them directly
# to the whitelist:
#
# severless_wsgi.TEXT_MIME_TYPES.append("application/custom+json")

def handler(event, context):
    return severless_wsgi.handle_request(app.app, event, context)

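As you can see, this file imports the app.py we created, takes the app object from it, and passes it together with event and context to the handle_request method in severless_wsgi.py. So the question is: what does this method do?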

The method is long and a little dizzying to read, but one piece of it stands out right away:

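What does this piece do? It converts the parameters we received (event and context) and gathers the result into environ. Then, through the werkzeug dependency, environ is passed to the from_app method together with the app object mentioned above, which runs the application and gives us its response: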

The result is then returned in the format required by the API Gateway integrated response.
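As a rough illustration of that response format, here is a minimal sketch using only the standard library. The field names (statusCode, headers, body, isBase64Encoded) are the ones used by the index.py listing later in this article; the helper name to_apigw_response and the shortened MIME list are my own assumptions for the example, not part of the Flask-Component.

```python
import base64

# MIME types treated as text in addition to anything under "text/".
TEXT_MIME_TYPES = {"application/json", "application/javascript", "application/xml"}


def to_apigw_response(status_code, headers, data, mimetype="text/plain"):
    """Wrap raw response bytes in an API-Gateway-style dict (hypothetical helper)."""
    result = {"statusCode": status_code, "headers": dict(headers)}
    is_text = mimetype.startswith("text/") or mimetype in TEXT_MIME_TYPES
    if is_text:
        result["body"] = data.decode("utf-8")
        result["isBase64Encoded"] = False
    else:
        # Binary payloads cannot travel as plain JSON text, so base64-encode them.
        result["body"] = base64.b64encode(data).decode("ascii")
        result["isBase64Encoded"] = True
    return result


print(to_apigw_response(200, {"Content-Type": "text/html"}, b"<h1>hi</h1>", "text/html"))
```

Text responses pass through as strings, while anything else is base64-encoded and flagged so the gateway can decode it again.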
At this point you may be getting a flash of inspiration, so let's look at how frameworks such as Flask and Django work under the hood:

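From this simplified diagram and what I described above, we can see that in normal use a request passes through the web server before reaching the next stage. A cloud function, however, is just a function and has no need to start a web server, so we can call the wsgi_app method directly. Here environ is the object we built by processing event and context, and start_response is the callback through which the application reports its status and headers, so we can use it to capture the shape of our response. If we wanted to implement this ourselves without Tencent Cloud's flask-component, we could do it like this: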

import sys

try:
    from urllib import urlencode
except ImportError:
    from urllib.parse import urlencode

from flask import Flask

try:
    from cStringIO import StringIO
except ImportError:
    try:
        from StringIO import StringIO
    except ImportError:
        from io import StringIO

from werkzeug.wrappers import BaseRequest

__version__ = '0.0.4'


def make_environ(event):
    environ = {}
    for hdr_name, hdr_value in event['headers'].items():
        hdr_name = hdr_name.replace('-', '_').upper()
        if hdr_name in ['CONTENT_TYPE', 'CONTENT_LENGTH']:
            environ[hdr_name] = hdr_value
            continue

        http_hdr_name = 'HTTP_%s' % hdr_name
        environ[http_hdr_name] = hdr_value

    apigateway_qs = event['queryStringParameters']
    request_qs = event['queryString']
    qs = apigateway_qs.copy()
    qs.update(request_qs)

    body = ''
    if 'body' in event:
        body = event['body']

    environ['REQUEST_METHOD'] = event['httpMethod']
    environ['PATH_INFO'] = event['path']
    environ['QUERY_STRING'] = urlencode(qs) if qs else ''
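    # REMOTE_ADDR is the client IP, not a port; reading 'sourceIp' from
    # requestContext is an assumption about the API Gateway event shape.
    environ['REMOTE_ADDR'] = event.get('requestContext', {}).get('sourceIp', '')
    environ['HOST'] = event['headers']['host']
    environ['SCRIPT_NAME'] = ''
    environ['SERVER_PORT'] = '80'  # WSGI expects CGI-style values to be strings
    environ['SERVER_PROTOCOL'] = 'HTTP/1.1'
    environ['CONTENT_LENGTH'] = str(len(body))
    environ['wsgi.url_scheme'] = 'http'  # must be 'http' or 'https', never empty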
    environ['wsgi.input'] = StringIO(body)
    environ['wsgi.version'] = (1, 0)
    environ['wsgi.errors'] = sys.stderr
    environ['wsgi.multithread'] = False
    environ['wsgi.run_once'] = True
    environ['wsgi.multiprocess'] = False

    BaseRequest(environ)

    return environ


class LambdaResponse(object):
    def __init__(self):
        self.status = None
        self.response_headers = None

    def start_response(self, status, response_headers, exc_info=None):
        self.status = int(status[:3])
        self.response_headers = dict(response_headers)


class FlaskLambda(Flask):
    def __call__(self, event, context):
        if 'httpMethod' not in event:
            print('httpMethod not in event')
            return super(FlaskLambda, self).__call__(event, context)

        response = LambdaResponse()

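        # Join every chunk the WSGI app yields (next() keeps only the first)
        # and decode to text so the result can be serialized as JSON.
        body = b''.join(self.wsgi_app(
            make_environ(event),
            response.start_response
        )).decode('utf-8')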

        return {
            'statusCode': response.status,
            'headers': response.response_headers,
            'body': body
        }

With this, the flow becomes simpler and clearer. The whole implementation can be seen as "truncating" or "replacing" the web server part.
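To make the "replace the web server" idea concrete without any framework, here is a minimal sketch that calls a plain WSGI callable in-process. demo_app and call_wsgi are hypothetical names introduced only for this illustration; a Flask or Django application object could be passed in the same way, since both expose the same (environ, start_response) interface.

```python
import io
import sys


def demo_app(environ, start_response):
    """A tiny WSGI application standing in for a Flask or Django app."""
    body = "{} {}".format(environ["REQUEST_METHOD"], environ["PATH_INFO"]).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_wsgi(app, method, path, query="", body=b""):
    """Call a WSGI app directly, with no web server in between."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        # The app reports its status line and headers here; we just record them.
        captured["status"] = int(status.split(" ", 1)[0])
        captured["headers"] = dict(headers)

    environ = {
        "REQUEST_METHOD": method,
        "SCRIPT_NAME": "",
        "PATH_INFO": path,
        "QUERY_STRING": query,
        "SERVER_NAME": "localhost",
        "SERVER_PORT": "80",
        "SERVER_PROTOCOL": "HTTP/1.1",
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.version": (1, 0),
        "wsgi.url_scheme": "http",
        "wsgi.input": io.BytesIO(body),
        "wsgi.errors": sys.stderr,
        "wsgi.multithread": False,
        "wsgi.multiprocess": False,
        "wsgi.run_once": False,
    }
    chunks = app(environ, start_response)
    try:
        payload = b"".join(chunks)
    finally:
        # WSGI iterables may expose close(); call it if present.
        close = getattr(chunks, "close", None)
        if close is not None:
            close()
    return {
        "statusCode": captured["status"],
        "headers": captured["headers"],
        "body": payload.decode("utf-8"),
    }


print(call_wsgi(demo_app, "GET", "/hello"))
```

The function plays the role the web server normally plays: it builds environ, supplies start_response, and collects the body, which is all a cloud function handler needs to do.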

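That is the basic line of analysis for the Flask-Component. Following it, can we deploy the Django framework on a Serverless architecture too? And what is the difference between Flask and Django? By difference I mean specifically the difference in how they start up and run.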

Extending the idea: implementing a Django Component

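On reflection, there does not seem to be any difference. So can we reuse the Flask conversion logic as is and simply replace the Flask app with the Django app?
That is, replace: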

from flask import Flask
app = Flask(__name__)

with:

import os
from django.core.wsgi import get_wsgi_application
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mydjango.settings')
application = get_wsgi_application()

Would that solve the problem?
Let's give it a try:

Create a Django project and add index.py to it directly:

# -*- coding: utf-8 -*-

import os
import sys
import base64
from werkzeug.datastructures import Headers, MultiDict
from werkzeug.wrappers import Response
from werkzeug.urls import url_encode, url_unquote
from werkzeug.http import HTTP_STATUS_CODES
from werkzeug._compat import BytesIO, string_types, to_bytes, wsgi_encoding_dance
import mydjango.wsgi

TEXT_MIME_TYPES = [
    "application/json",
    "application/javascript",
    "application/xml",
    "application/vnd.api+json",
    "image/svg+xml",
]


def all_casings(input_string):
    if not input_string:
        yield ""
    else:
        first = input_string[:1]
        if first.lower() == first.upper():
            for sub_casing in all_casings(input_string[1:]):
                yield first + sub_casing
        else:
            for sub_casing in all_casings(input_string[1:]):
                yield first.lower() + sub_casing
                yield first.upper() + sub_casing


def split_headers(headers):
    """
    If there are multiple occurrences of headers, create case-mutated variations
    in order to pass them through APIGW. This is a hack that's currently
    needed. See: https://github.com/logandk/serverless-wsgi/issues/11
    Source: https://github.com/Miserlou/Zappa/blob/master/zappa/middleware.py
    """
    new_headers = {}

    for key in headers.keys():
        values = headers.get_all(key)
        if len(values) > 1:
            for value, casing in zip(values, all_casings(key)):
                new_headers[casing] = value
        elif len(values) == 1:
            new_headers[key] = values[0]

    return new_headers


def group_headers(headers):
    new_headers = {}

    for key in headers.keys():
        new_headers[key] = headers.get_all(key)

    return new_headers


def encode_query_string(event):
    multi = event.get(u"multiValueQueryStringParameters")
    if multi:
        return url_encode(MultiDict((i, j) for i in multi for j in multi[i]))
    else:
        return url_encode(event.get(u"queryString") or {})


def handle_request(application, event, context):

    if u"multiValueHeaders" in event:
        headers = Headers(event["multiValueHeaders"])
    else:
        headers = Headers(event["headers"])

    strip_stage_path = os.environ.get("STRIP_STAGE_PATH", "").lower().strip() in [
        "yes",
        "y",
        "true",
        "t",
        "1",
    ]
    if u"apigw.tencentcs.com" in headers.get(u"Host", u"") and not strip_stage_path:
        script_name = "/{}".format(event["requestContext"].get(u"stage", ""))
    else:
        script_name = ""

    path_info = event["path"]
    base_path = os.environ.get("API_GATEWAY_BASE_PATH")
    if base_path:
        script_name = "/" + base_path

        if path_info.startswith(script_name):
            path_info = path_info[len(script_name) :] or "/"

    if u"body" in event:
        body = event[u"body"] or ""
    else:
        body = ""

    if event.get("isBase64Encoded", False):
        body = base64.b64decode(body)
    if isinstance(body, string_types):
        body = to_bytes(body, charset="utf-8")

    environ = {
        "CONTENT_LENGTH": str(len(body)),
        "CONTENT_TYPE": headers.get(u"Content-Type", ""),
        "PATH_INFO": url_unquote(path_info),
        "QUERY_STRING": encode_query_string(event),
        "REMOTE_ADDR": event["requestContext"]
        .get(u"identity", {})
        .get(u"sourceIp", ""),
        "REMOTE_USER": event["requestContext"]
        .get(u"authorizer", {})
        .get(u"principalId", ""),
        "REQUEST_METHOD": event["httpMethod"],
        "SCRIPT_NAME": script_name,
        "SERVER_NAME": headers.get(u"Host", "lambda"),
        "SERVER_PORT": headers.get(u"X-Forwarded-Port", "80"),
        "SERVER_PROTOCOL": "HTTP/1.1",
        "wsgi.errors": sys.stderr,
        "wsgi.input": BytesIO(body),
        "wsgi.multiprocess": False,
        "wsgi.multithread": False,
        "wsgi.run_once": False,
        "wsgi.url_scheme": headers.get(u"X-Forwarded-Proto", "http"),
        "wsgi.version": (1, 0),
        "serverless.authorizer": event["requestContext"].get(u"authorizer"),
        "serverless.event": event,
        "serverless.context": context,
        # TODO: Deprecate the following entries, as they do not comply with the WSGI
        # spec. For custom variables, the spec says:
        #
        #   Finally, the environ dictionary may also contain server-defined variables.
        #   These variables should be named using only lower-case letters, numbers, dots,
        #   and underscores, and should be prefixed with a name that is unique to the
        #   defining server or gateway.
        "API_GATEWAY_AUTHORIZER": event["requestContext"].get(u"authorizer"),
        "event": event,
        "context": context,
    }

    for key, value in environ.items():
        if isinstance(value, string_types):
            environ[key] = wsgi_encoding_dance(value)

    for key, value in headers.items():
        key = "HTTP_" + key.upper().replace("-", "_")
        if key not in ("HTTP_CONTENT_TYPE", "HTTP_CONTENT_LENGTH"):
            environ[key] = value

    response = Response.from_app(application, environ)

    returndict = {u"statusCode": response.status_code}

    if u"multiValueHeaders" in event:
        returndict["multiValueHeaders"] = group_headers(response.headers)
    else:
        returndict["headers"] = split_headers(response.headers)

    if event.get("requestContext").get("elb"):
        # If the request comes from ALB we need to add a status description
        returndict["statusDescription"] = u"%d %s" % (
            response.status_code,
            HTTP_STATUS_CODES[response.status_code],
        )

    if response.data:
        mimetype = response.mimetype or "text/plain"
        if (
            mimetype.startswith("text/") or mimetype in TEXT_MIME_TYPES
        ) and not response.headers.get("Content-Encoding", ""):
            returndict["body"] = response.get_data(as_text=True)
            returndict["isBase64Encoded"] = False
        else:
            returndict["body"] = base64.b64encode(response.data).decode("utf-8")
            returndict["isBase64Encoded"] = True

    return returndict



def main_handler(event, context):
    return handle_request(mydjango.wsgi.application, event, context)
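Before deploying, it can help to see which event fields handle_request actually reads. The following is a minimal sketch of a hand-built event for a local smoke test. make_apigw_event is a hypothetical helper, the "release" stage value is only a placeholder, and a real API Gateway event carries more fields than the ones shown.

```python
import base64
import json


def make_apigw_event(method, path, query=None, body="", headers=None, binary=False):
    """Build a dict with the event fields that handle_request reads.

    Hypothetical helper for local testing only. When binary is True,
    body must be bytes and is base64-encoded as the gateway would do.
    """
    merged_headers = {"Host": "localhost"}
    merged_headers.update(headers or {})
    if binary:
        body = base64.b64encode(body).decode("ascii")
    return {
        "httpMethod": method,
        "path": path,
        "headers": merged_headers,
        "queryString": dict(query or {}),
        "body": body,
        "isBase64Encoded": binary,
        "requestContext": {"stage": "release"},
    }


event = make_apigw_event("GET", "/hello", query={"name": "serverless"})
print(json.dumps(event, indent=2))
```

Inside a project where mydjango is importable, such an event could be passed as main_handler(event, None) to exercise the handler without a gateway.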

Then we deploy it to a function and look at the result:
Function information:

from django.shortcuts import render
from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt

# Create your views here.
@csrf_exempt
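def hello(request):
    # Default to "" so a missing "name" parameter does not raise a TypeError.
    if request.method == "POST":
        return HttpResponse("Hello world ! " + request.POST.get("name", ""))
    if request.method == "GET":
        return HttpResponse("Hello world ! " + request.GET.get("name", ""))
    # Any other method would otherwise fall through and return None.
    return HttpResponse(status=405)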

After deployment finishes and an API Gateway (apigw) trigger is bound, we test in Postman:
GET:

POST:

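As you can see, through a basic analysis of how things run and a small adaptation of Django, we got Django onto Serverless by adding just one file and the related dependencies.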

Next, let's see how to turn this code into a Component:
First, clone the Flask-Component code:

Then we modify it to fit Django:

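The first part is a dependency package we may rely on, plus the index.py file we just added. When a user invokes this Component, we place these two items into the user's code and upload everything together.
The second part is serverless.js, whose basic skeleton looks like this: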

const { Component } = require('@serverless/core')
class TencentDjango extends Component {
  async default(inputs = {}) {
  }
  async remove(inputs = {}) {
  }
}
module.exports = TencentDjango

When a user runs sls, the default method is called; when a user runs sls remove, the remove method is called. So default can be regarded as deployment and remove as removal.

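The deployment flow is fairly simple too. First the files are copied and processed. Then the cloud function component is called directly, with the function's include parameter used to add these extra files. Finally the apigw component is called to manage the gateway. Whatever the user writes under inputs in the yaml arrives in the inputs argument, and our job is to pass the right parts to the right components: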

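Of course, besides passing these two parts through, the region and other information shown above must be handled accordingly. Calling the underlying component is also simple: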

const tencentCloudFunction = await this.load('@serverless/tencent-scf')
const tencentCloudFunctionOutputs = await tencentCloudFunction(inputs)

Once this is done, all that remains is to update package.json and the readme.

I have already open-sourced it: https://github.com/gosls/tenc...

It is also published on NPM: https://www.npmjs.com/package...

To use it, just reference this Component:

DjangoTest:
  component: '@serverless/tencent-django'
  inputs:
    region: ap-guangzhou
    functionName: DjangoFunctionTest
    djangoProjectName: mydjango
    code: ./
    functionConf:
      timeout: 10
      memorySize: 256
      environment:
        variables:
          TEST: vale
      vpcConfig:
        subnetId: ''
        vpcId: ''
    apigatewayConf:
      protocols:
        - http
      environment: release

That completes the development and testing of the Django Component.



anycodes

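Holds a master's degree in software engineering from Zhejiang University and currently works at Tencent as a backend developer for Serverless architecture. One of the Chinese developers of Serverless Framework, has developed and maintained a Plugin and several Components, and is the author of the book 《Serverless架构》 (Serverless Architecture). A loyal fan and "pioneer" of Serverless.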