Global Lua Plugins

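Users can write Lua plugins to run custom logic. A plugin can be triggered on a timer or by an event.

For example, a plugin can query the database every minute for nodes with high CPU usage and report them to the user's own monitoring system over HTTP.

The following walkthrough shows how to create a Lua plugin.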

The following code creates a timer-triggered plugin:


local sql = [[
select node_id, avg("system_CPU_percent") from monitor
where created > now() - INTERVAL '1 hour' group by node_id limit 1
]]
local res = sql_query(sql, 120, 2000, "log_server")

local params = {
    body = res
}
res = http_query('POST', "http://receive-metrics.openresty.com", params)

output(res)

Click the Run button to see the result immediately.

The result of each plugin run is available on the execution history page.

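Lua plugins can also be triggered by events. When the specified event occurs, the plugin runs and the event is passed to it in the Lua variable trigger_event.

The following example prints the triggering event.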

output(trigger_event)

Clicking the Run button triggers a test event that simulates a node going offline.

Events

The following events are currently supported:

Nodes Heartbeat Offline

The nodes_heartbeat_offline event is triggered when a node stops sending heartbeats to Admin and is marked offline.

{
    "type": "nodes_heartbeat_offline",
    "from": "log-server",
    "message": "Gateway nodes [59] offline",
    "level": "ERROR"
}

Nodes Heartbeat Online

The nodes_heartbeat_online event is triggered when a node sends a heartbeat to Admin and is marked online.

{
    "type": "nodes_heartbeat_online",
    "from": "log-server",
    "message": "Gateway nodes [59] online",
    "level": "WARNING"
}

Nodes Offline

The node_offline event is triggered when a node fails its health check and is marked offline.

{
    "from": "admin",
    "level": "ERROR",
    "message": "gateway node [78] is offline since failed to connect to 120.24.93.4:81: connection refused, time: 1634629224;;failed to connect to 120.24.93.4:81: connection refused, time: 1634629254;;failed to connect to 120.24.93.4:81: connection refused, time: 1634629284",
    "type": "node_offline"
}

Nodes Online

The node_online event is triggered when a node passes its health check and is marked online.

{
    "from": "admin",
    "level": "WARNING",
    "message": "gateway node [78] is online since health check success",
    "type": "node_online"
}
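
The sample messages above carry the node id in square brackets. As a minimal sketch, assuming the message text keeps that shape (it is sample output, not a documented contract), a plugin could extract the id as shown below. The helper name node_id_from and the fallback event table are hypothetical and exist only so the sketch also runs outside the plugin runtime.

```lua
-- Stand-ins (assumption): inside a plugin, `output` and `trigger_event`
-- are provided by the platform; here we fall back to print and a sample event.
local output = output or print
local event = trigger_event or {
    from = "admin",
    level = "WARNING",
    message = "gateway node [78] is online since health check success",
    type = "node_online",
}

-- Hypothetical helper: return the first bracketed number in the message,
-- or nil when the message has no "[<digits>]" part.
local function node_id_from(message)
    local id = string.match(message or "", "%[(%d+)%]")
    if id then
        return tonumber(id)
    end
    return nil
end

local id = node_id_from(event.message)
if id then
    output("event " .. tostring(event.type) .. " for node " .. tostring(id))
else
    output("no node id found in message: " .. tostring(event.message))
end
```

Matching on free-form message text is fragile, so a real plugin should tolerate a nil result, as this sketch does.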

Release

The release event is triggered when an application is released.

{
    "type": "release",
    "uid": 2,
    "http_app_id": 786
}

WAF Event

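The waf_event event is triggered when a request hits a WAF rule and its score reaches the threshold.

Requests from the same client IP trigger at most one event per second.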

{
    "app_id": "1033",
    "type": "waf_event",
    "score": "3",
    "threshold": "3",
    "action": "block",
    "matches": [
        {
            "matches": [
                "0",
                "/hit"
            ],
            "begin_line": 1,
            "version": "d04fb751526fc85a172475a71f19cf53",
            "rule_id": "0",
            "rule_set_id": 10025,
            "msg": "test",
            "group": "test",
            "end_line": 2
        }
    ],
    "header": "User-Agent: curl/7.29.0\r\nHost: waf-filter.com\r\nAccept: */*\r\nProxy-Connection: Keep-Alive\r\n\r\n",
    "timestamp": "1634629481",
    "request_id": "0000270010243927eb48000d",
    "client_country": "",
    "client_province": "unknown",
    "client_city": "unknown",
    "from": "log_server",
    "client_isp": "",
    "body": "",
    "remote_addr": "172.17.0.1",
    "host": "waf-filter.com",
    "request": "GET HTTP://waf-filter.com/hit HTTP/1.1"
}
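
Because every event carries a type field, a single plugin can branch on it. The following is a minimal sketch that builds a one-line summary from the fields shown in the sample payloads above. The helper name summarize and the fallback event are hypothetical, and the fallbacks are there only so the sketch also runs outside the plugin runtime.

```lua
-- Stand-ins (assumption): inside a plugin, `output` and `trigger_event`
-- are provided by the platform.
local output = output or print
local event = trigger_event or {
    type = "nodes_heartbeat_offline",
    from = "log-server",
    message = "Gateway nodes [59] offline",
    level = "ERROR",
}

-- Hypothetical helper: turn an event table into a one-line summary.
local function summarize(ev)
    if ev.type == "release" then
        return string.format("app %s released by user %s",
                             tostring(ev.http_app_id), tostring(ev.uid))
    end

    if ev.type == "waf_event" then
        return string.format("WAF %s for %s (score %s, threshold %s)",
                             tostring(ev.action), tostring(ev.remote_addr),
                             tostring(ev.score), tostring(ev.threshold))
    end

    -- Node and heartbeat events carry a level and a message.
    return string.format("[%s] %s", tostring(ev.level), tostring(ev.message))
end

output(summarize(event))
```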

Built-in API

Most Lua code can run inside a plugin. A few built-in APIs are also provided.

output

syntax: output(msg)

Writes a message to the execution history.

http_query

syntax: res = http_query(method, url, params, retries)

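Sends an HTTP request.

  • method: HTTP request method, such as GET, POST, or PUT
  • url: HTTP request URL string
  • retries: number of retries when the request fails; the default is 1

params supports the following fields:

  • timeout: request timeout in milliseconds; the default is 300 seconds (300000 ms)
  • headers: request headers
  • body: request body
  • ssl_verify: enables SSL certificate verification; disabled by default

On success, the call returns a Lua table with the following fields:

  • status: response status code
  • headers: response headers
  • body: response body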
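
The following minimal sketch shows one way to call http_query with a params table and check the result. The failure behavior is not described above, so the sketch assumes a missing result or a non-2xx status means failure. The URL is a placeholder, the helper is_success is hypothetical, and the stub functions exist only so the sketch runs outside the plugin runtime.

```lua
-- Stand-ins (assumption): inside a plugin, `http_query` and `output`
-- are provided by the platform. The stub never touches the network.
local http_query = http_query or function(method, url, params, retries)
    return { status = 200, headers = {}, body = "ok" }
end
local output = output or print

local params = {
    timeout = 5000,  -- milliseconds
    headers = { ["Content-Type"] = "application/json" },
    body = '{"hello":"world"}',
    ssl_verify = true,
}

-- Hypothetical helper: treat a missing result or a non-2xx status as failure.
local function is_success(res)
    local status = res and tonumber(res.status)
    return status ~= nil and status >= 200 and status < 300
end

-- Placeholder URL; replace it with your own endpoint.
local res = http_query("POST", "https://example.com/metrics", params, 3)
if is_success(res) then
    output("request succeeded with status " .. tostring(res.status))
else
    output("request failed")
end
```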

sql_query

Queries the Admin or Log Server database.

syntax: res = sql_query(sql, timeout, limit, destination, retries)

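  • sql: the SQL statement to run; only select statements are supported
  • timeout: query timeout in seconds; the default is 120 seconds
  • limit: maximum number of rows to return; the default is 20000
  • destination: query target, either admin or log-server
  • retries: number of retries when the query fails; the default is 1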
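
As a minimal sketch of consuming a query result, the code below assumes sql_query returns an array of rows, each a table keyed by column name, which is how the examples in this document read their results. The helper busy_nodes, the column alias cpu_avg, and the 80 percent cutoff are hypothetical. The stub stands in for the built-in so the sketch runs outside the plugin runtime.

```lua
-- Stand-ins (assumption): inside a plugin, `sql_query` and `output`
-- are provided by the platform. The stub returns two made-up rows.
local sql_query = sql_query or function(sql, timeout, limit, destination, retries)
    return {
        { node_id = 59, cpu_avg = 91.5 },
        { node_id = 78, cpu_avg = 12.0 },
    }
end
local output = output or print

local sql = [[
select node_id, avg("system_CPU_percent") as cpu_avg from monitor
where created > now() - INTERVAL '1 hour' group by node_id
]]

-- Hypothetical helper: collect the ids of nodes whose average CPU
-- usage exceeds limit_pct. A nil result yields an empty list.
local function busy_nodes(rows, limit_pct)
    local ids = {}
    for _, row in ipairs(rows or {}) do
        local avg = tonumber(row.cpu_avg)
        if avg and avg > limit_pct then
            ids[#ids + 1] = tostring(row.node_id)
        end
    end
    return ids
end

-- 120 second timeout, at most 100 rows, 2 retries.
local rows = sql_query(sql, 120, 100, "log_server", 2)
output("busy nodes: " .. table.concat(busy_nodes(rows, 80), ", "))
```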

send_alarm_event

Sends a custom alarm event.

syntax: res = send_alarm_event(alarm_type, alarm_level, alarm_message)

  • alarm_type: custom alarm type
  • alarm_level: alarm level, one of CRITICAL, ERROR, or WARNING
  • alarm_message: alarm message text
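
The following minimal sketch picks one of the three levels from a measured value and sends an alarm. The return value of send_alarm_event is not described above, so the sketch assumes a truthy result means success. The alarm type high_cpu, the helper level_for, and its cutoffs are hypothetical.

```lua
-- Stand-ins (assumption): inside a plugin, `send_alarm_event` and `output`
-- are provided by the platform.
local send_alarm_event = send_alarm_event or function(alarm_type, alarm_level, alarm_message)
    return true
end
local output = output or print

-- Hypothetical mapping from a CPU percentage to one of the three levels.
local function level_for(cpu_pct)
    if cpu_pct >= 95 then
        return "CRITICAL"
    elseif cpu_pct >= 85 then
        return "ERROR"
    end
    return "WARNING"
end

local cpu_pct = 97
local res = send_alarm_event("high_cpu", level_for(cpu_pct),
                             string.format("node CPU usage is %d%%", cpu_pct))
output(res and "alarm sent" or "failed to send alarm")
```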

More Examples

Add client IPs blocked by WAF rules to an application IP list

This example requires that WAF and the IP list are already in use in the application.

-- In this example we assume that the application id and IP list id are both 1.
local app_id = "1"
local ip_list_id = "1"

local str_fmt = string.format
local api_put = require "Lua.SchemaDB" .update

-- Get the IP addresses blocked by WAF in the last 24 hours
local sql = [[
SELECT DISTINCT remote_addr as ip
FROM waf_request_tsdb
WHERE action='block'
        AND score >= threshold
        AND created >= now() - interval '24 hours'
        AND app_id='%s'
]]

sql = str_fmt(sql, app_id)
local res = sql_query(sql, 120, 2000, "log_server")
local ip_list = { items = res }

local uri = { "applications", app_id, "ip_list", ip_list_id }
local err  -- keep err local instead of leaking a global
res, err = api_put(uri, ip_list)

if res then
    output("updated ip blacklist successfully!")
else
    output("failed to update ip blacklist: " .. tostring(err))
end

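Verify the HTTPS certificates of the application with a given APP_ID

-- Update this line manually
local app_id = nil

-- Uncomment the following line to take the application id from the trigger event
-- local app_id = trigger_event.http_app_id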

if not app_id then
    return output("WARN: app_id is required")
end

local ngx = ngx
local substr = string.sub
local str_fmt = string.format
local re_find = ngx.re.find
local httpc = require "resty.http".new()

local function ssl_handshake(ip, port, domain)
    local c, err = httpc:connect(ip, port)
    if not c then
        return nil, err
    end

    return httpc:ssl_handshake(nil, domain, true)
end

local function is_wildcard_domain(domain)
    if string.sub(domain, 1, 2) == '*.' then
        return true
    end

    return false
end

local app_domains_sql = [[
select applications_domains.domain "domain",
    applications_domains.is_wildcard is_wildcard,
    https_ports,
    offline_enabled
    from applications
left join applications_domains on applications.id = applications_domains._applications_id
where applications.id = %d
]]

local cert_domains_sql = [[
select applications_phases_ssl_cert_certs_acme_host.item acme_host
    from applications_phases_ssl_cert_certs
join applications_phases_ssl_cert_certs_acme_host on
    applications_phases_ssl_cert_certs_acme_host._applications_phases_ssl_cert_certs_id
    = applications_phases_ssl_cert_certs.id
where global_cert is null and applications_phases_ssl_cert_certs._applications_id = %d
union
select global_certs.acme_host acme_host
    from applications_phases_ssl_cert_certs
join global_certs on applications_phases_ssl_cert_certs.global_cert = global_certs.id
where global_cert is not null and applications_phases_ssl_cert_certs._applications_id = %d
]]

local gateway_nodes_sql = [[
select gateway_nodes.external_ip external_ip,
    gateway_nodes.external_ipv6 external_ipv6
    from applications
left join applications_partitions on applications.id = applications_partitions._applications_id
left join gateway on applications_partitions.item = gateway.partition
left join gateway_nodes on gateway.id = gateway_nodes._gateway_id
where offline_enabled is not true
    and (gateway_nodes.external_ip is not null or gateway_nodes.external_ipv6 is not null)
    and applications.id = %d
]]

local ok_tbl = {}
local err_tbl = {}
local check_list = {}
local app_domains_hash = {}

local app_domains, err = sql_query(str_fmt(app_domains_sql, tonumber(app_id)))
if not app_domains then
    -- stop here: the loops below cannot run without the domain list
    return output(str_fmt("failed to verify TLS certificate for app, app_id: %s, err: %s",
        tostring(app_id), tostring(err)))
end

local domain_list = {}
for _, app_domain in ipairs(app_domains) do
    domain_list[#domain_list + 1] = app_domain.domain

    app_domains_hash[app_domain.domain] = app_domain
end

local domain_str = table.concat(domain_list, ', ')

local cert_domains, err = sql_query(str_fmt(cert_domains_sql, tonumber(app_id), tonumber(app_id)))
if not cert_domains then
    -- output() takes a single message, so format it first and stop here
    return output(str_fmt("failed to verify TLS certificate for app, domain: %s, err: no certificate found",
        domain_str))
end

local gateway_nodes, err = sql_query(str_fmt(gateway_nodes_sql, tonumber(app_id)))
if not gateway_nodes then
    return output(str_fmt("failed to verify TLS certificate for app, domain: %s, err: no gateway nodes found",
        domain_str))
end

for _, cert_domain in ipairs(cert_domains) do
    local acme_host = cert_domain.acme_host

    if is_wildcard_domain(acme_host) and app_domains_hash[acme_host] then
        err_tbl[#err_tbl + 1] ="wildcard app with wildcard cert is not supported yet:" .. tostring(acme_host)
        goto _next_
    end

    if is_wildcard_domain(acme_host) then
        for _, app_domain in ipairs(app_domains) do
            local domain = app_domain.domain
            local base_acme_host = substr(acme_host, 2)

            if re_find(domain, [[\A(?:\Q\E.*?\Q]] .. base_acme_host .. [[\E)]], 'josm') then
                -- the hash is keyed by domain name, not by the row table
                check_list[#check_list + 1] = app_domains_hash[domain]
            end
        end

        goto _next_
    end

    local hit = false

    for _, app_domain in ipairs(app_domains) do
        local domain = app_domain.domain

        if domain == acme_host then
            check_list[#check_list + 1] = app_domains_hash[acme_host]

            goto _next_
        end

        if is_wildcard_domain(domain) then
            local base_domain = substr(domain, 2)

            if re_find(acme_host, [[\A(?:\Q\E.*?\Q]] .. base_domain .. [[\E)]], 'josm') then
                local app_obj = app_domains_hash[domain]

                check_list[#check_list + 1] = {
                    domain = acme_host,
                    is_wildcard = app_obj.is_wildcard,
                    https_ports = app_obj.https_ports
                }
                hit = true
            end
        end
    end

    if hit then
        goto _next_
    end

    err_tbl[#err_tbl + 1] = str_fmt(
        "certificate with Common Name '%s' not found matched host in current application '%s'",
        acme_host, domain_str)

    ::_next_::
end

local check_domain_list = {}

for _, check_obj in pairs(check_list) do
    local check_domain = check_obj.domain
    local https_ports = check_obj.https_ports

    check_domain_list[#check_domain_list + 1] = check_domain

    for _, gateway_node in ipairs(gateway_nodes) do
        local ip = gateway_node.external_ip or gateway_node.external_ipv6
        if not ip then
            goto _next_node_
        end

        for _, https_port in ipairs(https_ports) do
            local ok, err = ssl_handshake(ip, https_port, check_domain)
            if not ok then
                err_tbl[#err_tbl + 1] = str_fmt("domain '%s' on ip '%s': '%s'",
                    check_domain, ip, err)
                goto _next_node_
            end

            ok_tbl[#ok_tbl + 1] = str_fmt("domain '%s' on ip '%s'", check_domain, ip)
        end

        ::_next_node_::
    end
end

if #err_tbl == 0 then
    output("OK: all TLS certificates are verified:" .. table.concat(check_domain_list, ','))

else
    output("ERR: following TLS certificates are failed:" .. table.concat(err_tbl, "\n === \n"))

    if #ok_tbl > 0 then
        output("INFO: following TLS certificates are verified:" .. table.concat(ok_tbl, "\n === \n"))

    else
        output("ERR: no TLS certificate is verified successfully")
    end
end