feat: kubernetes discovery readiness check #12852
…hcheck # Conflicts: # apisix/discovery/kubernetes/init.lua
| if not endpoint_dict then | ||
| core.log.error("failed to get lua_shared_dict:", get_endpoint_dict_name(id), | ||
| ", please check your APISIX version") | ||
| return false, "failed to get lua_shared_dict: ", get_endpoint_dict_name(id), |
Please use .. for string concatenation.
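A minimal plain-Lua sketch of why this matters: with commas, the message is split across several return values, so a caller that reads only `ok, err` sees a truncated error. The body of `get_endpoint_dict_name` and the `"kubernetes-"` prefix are hypothetical stand-ins, not the PR's implementation.

```lua
-- Hypothetical stand-in for the helper referenced in the diff.
local function get_endpoint_dict_name(id)
    return "kubernetes-" .. id
end

-- Comma-separated: the message is spread over several return values.
local function with_commas(id)
    return false, "failed to get lua_shared_dict: ", get_endpoint_dict_name(id)
end

-- Concatenated: the caller receives one complete error string.
local function with_concat(id)
    return false, "failed to get lua_shared_dict: " .. get_endpoint_dict_name(id)
                  .. ", please check your APISIX version"
end

local _, err_commas = with_commas("first")
local _, err_concat = with_concat("first")
print(err_commas)
print(err_concat)
```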
| local function post_list(handle) | ||
| handle.endpoint_dict:safe_set("discovery_ready",true) |
safe_set lacks error handling
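One possible shape for the error handling, sketched in plain Lua. `ngx.shared.DICT:safe_set` returns `ok, err` and fails with `"no memory"` when the dict is full; the fake dict below only imitates that so the sketch runs outside OpenResty, and returning the error (rather than logging it) is an assumption about what the caller wants.

```lua
-- Minimal fake of a shared dict (assumption: only safe_set/get are needed).
local function new_fake_dict(full)
    local store = {}
    return {
        safe_set = function(self, key, value)
            if full then
                return false, "no memory"
            end
            store[key] = value
            return true, nil
        end,
        get = function(self, key)
            return store[key]
        end,
    }
end

local function post_list(handle)
    local ok, err = handle.endpoint_dict:safe_set("discovery_ready", true)
    if not ok then
        -- Surface the failure instead of silently dropping it.
        return false, "failed to set discovery_ready flag: " .. err
    end
    return true
end

local healthy = { endpoint_dict = new_fake_dict(false) }
local exhausted = { endpoint_dict = new_fake_dict(true) }
print(post_list(healthy))
print(post_list(exhausted))
```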
| end | ||
| end | ||
| end | ||
| return true; |
Please do not use semicolons.
| ngx.exit(200) | ||
| local http = require("resty.http") | ||
| local healthcheck_uri = "http://127.0.0.1:7085" .. "/status/ready" | ||
| for i = 1, 4 do |
Add a comment explaining the rationale for selecting the number of retries.
APISIX takes time to load data from the Kubernetes API server. In scenarios with small amounts of data, it may complete in less than 1 second. The maximum wait time I set, 5 seconds, is solely to account for network fluctuations.
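A sketch of how the retry loop could carry that rationale in the code itself. The attempt count of 4 comes from the diff; the `probe` and `sleep` callbacks and the 1-second interval are hypothetical, injected so the example runs in plain Lua without `resty.http` or `ngx.sleep`.

```lua
-- Discovery data is loaded from the Kubernetes API server asynchronously,
-- so the first probe may run before it has arrived. A few retries absorb
-- that delay and short network fluctuations.
local MAX_ATTEMPTS = 4

local function wait_until_ready(probe, sleep, interval)
    for attempt = 1, MAX_ATTEMPTS do
        if probe() then
            return true, attempt
        end
        if attempt < MAX_ATTEMPTS then
            sleep(interval)
        end
    end
    return false, MAX_ATTEMPTS
end

-- Usage with fakes: readiness succeeds on the third probe.
local calls, slept = 0, 0
local ok, attempts = wait_until_ready(
    function()
        calls = calls + 1
        return calls >= 3
    end,
    function(seconds)
        slept = slept + seconds
    end,
    1)
print(ok, attempts, slept)
```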
| local function post_list(handle) | ||
| local _, err = handle.endpoint_dict:safe_set("discovery_ready",true) |
| local _, err = handle.endpoint_dict:safe_set("discovery_ready",true) | |
| local _, err = handle.endpoint_dict:safe_set("discovery_ready", true) |
550e240 Please help review the code again.
| for _, key in ipairs(handle.existing_keys) do | ||
| if not handle.current_keys_hash[key] then | ||
| core.log.info("kubernetes discovery module found dirty data in shared dict, key: ", | ||
| key) |
Several indentation issues have been fixed.
| local endpoint_dict = get_endpoint_dict(id) | ||
| if not endpoint_dict then | ||
| core.log.error("failed to get lua_shared_dict:", get_endpoint_dict_name(id), | ||
| ", please check your APISIX version") |
| return | ||
| local function config_ready_check() | ||
| local role = core.table.try_read_attr(local_conf, "deployment", "role") | ||
| local provider = core.table.try_read_attr(local_conf, "deployment", "role_" .. |
local provider = core.table.try_read_attr(local_conf, "deployment",
"role_" .. role, "config_provider")
| return false, "unknown config provider: " .. tostring(provider) | ||
| end | ||
| local status_shdict = ngx.shared["status-report"] |
we may get nil, need to check it
| local worker_count = ngx.worker.count() | ||
| if #ids ~= worker_count then | ||
| local error = "worker count: " .. worker_count .. " but status report count: " .. #ids | ||
| core.log.warn(error) |
should we use error log level?
| local ready = status_shdict:get(id) | ||
| if not ready then | ||
| local error = "worker id: " .. id .. " has not received configuration" | ||
| core.log.warn(error) |
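Putting the review points on this hunk together, the worker check might look roughly like the plain-Lua sketch below. It adds the nil check on the shared dict and returns errors rather than logging them. Obtaining `ids` via `get_keys` is an assumption (the diff does not show where `ids` comes from), and the fake dict stands in for `ngx.shared["status-report"]`.

```lua
-- Fake shared dict so the sketch runs outside OpenResty.
local function new_fake_dict(entries)
    return {
        get = function(self, key)
            return entries[key]
        end,
        get_keys = function(self)
            local keys = {}
            for key in pairs(entries) do
                keys[#keys + 1] = key
            end
            return keys
        end,
    }
end

local function workers_ready(status_shdict, worker_count)
    -- The dict lookup can return nil if the shared dict is not configured.
    if not status_shdict then
        return false, "lua_shared_dict status-report is not defined"
    end
    local ids = status_shdict:get_keys()
    if #ids ~= worker_count then
        return false, "worker count: " .. worker_count
                      .. " but status report count: " .. #ids
    end
    for _, id in ipairs(ids) do
        if not status_shdict:get(id) then
            return false, "worker id: " .. id .. " has not received configuration"
        end
    end
    return true
end

print(workers_ready(new_fake_dict({ ["0"] = true, ["1"] = true }), 2))
print(workers_ready(new_fake_dict({ ["0"] = true }), 2))
```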

Description
This pull request adds a Kubernetes service discovery readiness check to the /status/ready interface. Specifically, the readiness check calls the check_discovery_ready method of the service discovery module. Other types of service discovery, such as Consul, can also implement this method to take part in the readiness check.
Related discussion: #12635
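As a rough illustration of that design, the plain-Lua sketch below shows how a readiness handler could call `check_discovery_ready` only on modules that define it. The `discovery_modules` table, the `discovery_ready_check` helper, and the error strings are hypothetical, not the PR's actual code.

```lua
-- Hypothetical registry of loaded discovery modules.
local discovery_modules = {
    kubernetes = {
        check_discovery_ready = function()
            return false, "kubernetes discovery not ready"
        end,
    },
    -- A module that has not adopted the method yet is simply skipped.
    consul = {},
}

local function discovery_ready_check(modules)
    for name, mod in pairs(modules) do
        if type(mod.check_discovery_ready) == "function" then
            local ok, err = mod.check_discovery_ready()
            if not ok then
                return false, name .. ": " .. tostring(err)
            end
        end
    end
    return true
end

print(discovery_ready_check(discovery_modules))
```

Checking for the method before calling it lets each discovery type opt in independently.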
Which issue(s) this PR fixes:
Fixes #
Checklist