Commit de88a73

kaxil authored and Ratasa143 committed
Add gunicorn support for API server with rolling worker restarts (apache#60940)
Add optional gunicorn server type for the API server that provides:

- Memory sharing via preload + fork copy-on-write
- Rolling worker restarts through GunicornMonitor
- Correct FIFO signal handling (SIGTTOU kills oldest worker)

New configuration options in [api] section:

- server_type: uvicorn (default) or gunicorn
- worker_refresh_interval: seconds between refresh cycles (0=disabled)
- worker_refresh_batch_size: workers to refresh per cycle
- master_timeout: gunicorn master timeout
- reload_on_plugin_change: reload on plugin file changes

Requires apache-airflow-core[gunicorn] extra for gunicorn mode.

* Run GunicornMonitor in main thread instead of daemon thread

Matches Airflow 2's webserver pattern: monitor runs in main thread, so if it crashes, the whole process exits (fail-fast). No silent degradation where gunicorn keeps running without worker recycling.

Also triggers monitor when reload_on_plugin_change is enabled, even if worker_refresh_interval is 0.

* Refactor gunicorn support to use custom Arbiter instead of external monitor

This refactor changes the gunicorn worker monitoring architecture from an external thread-based approach to using a custom Arbiter subclass, which is gunicorn's recommended extension pattern.

Changes:

- New gunicorn_app.py with AirflowArbiter and AirflowGunicornApp
- AirflowArbiter integrates worker refresh into manage_workers() loop
- Removed gunicorn_monitor.py (no longer needed)
- Simplified api_server_command.py (no subprocess, direct gunicorn API)
- Updated tests for new architecture

Benefits:

- Simpler architecture (no separate thread or subprocess)
- Direct access to worker state via self.WORKERS
- Uses gunicorn's internal spawn_worker/kill_worker methods
- Follows gunicorn's documented extension pattern
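The [api] options named in the commit message could be resolved from AIRFLOW__API__* environment variables roughly as in the Python sketch below. This is an illustration, not Airflow's actual configuration code: ApiServerSettings, load_settings and periodic_refresh_enabled are hypothetical names, the accepted boolean spellings are an assumption, and master_timeout is left out because its default is not stated here.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiServerSettings:
    """Hypothetical container for the [api] options listed in the commit message."""

    server_type: str = "uvicorn"
    worker_refresh_interval: int = 0  # seconds; 0 means disabled
    worker_refresh_batch_size: int = 1
    reload_on_plugin_change: bool = False


def load_settings(environ=None):
    """Build settings from AIRFLOW__API__* variables, falling back to the defaults."""
    env = os.environ if environ is None else environ

    def get(key, default):
        return env.get(f"AIRFLOW__API__{key.upper()}", default)

    server_type = get("server_type", "uvicorn").strip().lower()
    if server_type not in ("uvicorn", "gunicorn"):
        raise ValueError(f"unsupported server_type: {server_type!r}")

    interval = int(get("worker_refresh_interval", "0"))
    batch_size = int(get("worker_refresh_batch_size", "1"))
    if interval < 0 or batch_size < 1:
        raise ValueError("interval must be >= 0 and batch size must be >= 1")

    # Assumption: these spellings count as "true"; the real parser may differ.
    raw_flag = get("reload_on_plugin_change", "false").strip().lower()
    reload_flag = raw_flag in ("true", "1", "yes")

    return ApiServerSettings(server_type, interval, batch_size, reload_flag)


def periodic_refresh_enabled(settings):
    """Periodic refresh only applies in gunicorn mode with a positive interval."""
    return settings.server_type == "gunicorn" and settings.worker_refresh_interval > 0
```

Passing a plain dict instead of os.environ keeps the function easy to exercise in isolation, for example load_settings({"AIRFLOW__API__SERVER_TYPE": "gunicorn"}).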
1 parent e38e73e commit de88a73

13 files changed

Lines changed: 1055 additions & 11 deletions

.pre-commit-config.yaml

Lines changed: 4 additions & 0 deletions
@@ -524,7 +524,11 @@ repos:
 ^airflow-core/newsfragments/41368\.significant\.rst$|
 ^airflow-core/newsfragments/41761.significant\.rst$|
 ^airflow-core/newsfragments/43349\.significant\.rst$|
+^airflow-core/newsfragments/60921\.significant\.rst$|
 ^airflow-core/src/airflow/api_fastapi/auth/managers/simple/ui/pnpm-lock\.yaml$|
+^airflow-core/src/airflow/api_fastapi/gunicorn_config\.py$|
+^airflow-core/src/airflow/cli/commands/api_server_command\.py$|
+^airflow-core/src/airflow/api_fastapi/gunicorn_monitor\.py$|
 ^airflow-core/src/airflow/cli/commands/local_commands/fastapi_api_command\.py$|
 ^airflow-core/src/airflow/config_templates/|
 ^airflow-core/src/airflow/models/baseoperator\.py$|

airflow-core/docs/administration-and-deployment/web-stack.rst

Lines changed: 81 additions & 0 deletions
@@ -57,3 +57,84 @@ separately. This might be useful for scaling them independently or for deploying
     airflow api-server --apps core
     # serve only the Execution API Server
     airflow api-server --apps execution
+
+Server Types
+------------
+
+The API server supports two server types: ``uvicorn`` (default) and ``gunicorn``.
+
+Uvicorn (Default)
+~~~~~~~~~~~~~~~~~
+
+Uvicorn is the default server type. It's simple to set up and works on all platforms including Windows.
+
+.. code-block:: bash
+
+    airflow api-server
+
+Gunicorn
+~~~~~~~~
+
+Gunicorn provides additional features for production deployments:
+
+- **Memory sharing**: Workers share memory via copy-on-write after fork, reducing total memory usage
+- **Rolling worker restarts**: Zero-downtime worker recycling to prevent memory accumulation
+- **Proper signal handling**: SIGTTOU kills the oldest worker (FIFO), enabling true rolling restarts
+
+.. note::
+
+    Gunicorn requires the ``gunicorn`` extra: ``pip install 'apache-airflow-core[gunicorn]'``
+
+    Gunicorn is Unix-only and does not work on Windows.
+
+To enable gunicorn mode:
+
+.. code-block:: bash
+
+    export AIRFLOW__API__SERVER_TYPE=gunicorn
+    airflow api-server
+
+Rolling Worker Restarts
+^^^^^^^^^^^^^^^^^^^^^^^
+
+To enable periodic worker recycling (useful for long-running processes to prevent memory accumulation):
+
+.. code-block:: bash
+
+    export AIRFLOW__API__SERVER_TYPE=gunicorn
+    export AIRFLOW__API__WORKER_REFRESH_INTERVAL=43200  # Restart workers every 12 hours
+    export AIRFLOW__API__WORKER_REFRESH_BATCH_SIZE=1    # Restart one worker at a time
+    airflow api-server
+
+The rolling restart process:
+
+1. Spawns new workers before killing old ones (zero downtime)
+2. Waits for new workers to be ready (process title check)
+3. Performs HTTP health check to verify workers can serve requests
+4. Kills old workers (oldest first)
+5. Repeats until all original workers are replaced
+
+Configuration Options
+^^^^^^^^^^^^^^^^^^^^^
+
+The following configuration options are available in the ``[api]`` section:
+
+- ``server_type``: ``uvicorn`` (default) or ``gunicorn``
+- ``worker_refresh_interval``: Seconds between worker refresh cycles (0 = disabled, default)
+- ``worker_refresh_batch_size``: Number of workers to refresh per cycle (default: 1)
+- ``reload_on_plugin_change``: Reload when plugin files change (default: False)
+
+When to Use Gunicorn
+^^^^^^^^^^^^^^^^^^^^
+
+Use gunicorn when you need:
+
+- Long-running API server processes where memory accumulation is a concern
+- Multi-worker deployments where memory sharing matters
+- Production environments requiring zero-downtime worker recycling
+
+Use the default uvicorn when:
+
+- Running on Windows
+- Running in development or testing environments
+- Running short-lived containers (e.g., Kubernetes pods that get recycled)
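
The five-step rolling restart process documented in web-stack.rst above can be modelled with a small, self-contained Python simulation. It is a sketch under stated assumptions, not the AirflowArbiter code: workers are plain integer ids, the readiness and HTTP health checks are collapsed into one hypothetical is_healthy callback, and aborting on an unhealthy replacement is an assumed policy.

```python
from collections import deque
from itertools import count


def rolling_restart(initial_workers, batch_size, is_healthy=lambda worker_id: True):
    """Replace every worker in `initial_workers`, `batch_size` at a time.

    Returns the surviving worker ids and a log of (event, worker_id) tuples.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")

    workers = deque(initial_workers)  # oldest worker sits at the left end
    remaining = len(workers)          # original workers still to be replaced
    next_id = count(max(initial_workers, default=0) + 1)
    events = []

    while remaining > 0:
        batch = min(batch_size, remaining)

        # Steps 1-3: spawn replacements first, then confirm each one is ready.
        fresh = [next(next_id) for _ in range(batch)]
        for worker_id in fresh:
            events.append(("spawn", worker_id))
        unhealthy = [worker_id for worker_id in fresh if not is_healthy(worker_id)]
        if unhealthy:
            # Assumed policy: stop here so the old workers keep serving traffic.
            raise RuntimeError(f"replacement workers not healthy: {unhealthy}")
        workers.extend(fresh)

        # Step 4: only now retire the same number of workers, oldest first.
        for _ in range(batch):
            events.append(("kill", workers.popleft()))

        # Step 5: repeat until no original worker is left.
        remaining -= batch

    return list(workers), events
```

Because replacements are appended before any old worker is removed, the pool never drops below its original size, which is the property the documentation calls zero downtime.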

airflow-core/docs/extra-packages-ref.rst

Lines changed: 2 additions & 0 deletions
@@ -126,6 +126,8 @@ other packages that can be used by airflow or some of its providers.
 +---------------------+-----------------------------------------------------+----------------------------------------------------------------------------+
 | graphviz            | ``pip install 'apache-airflow[graphviz]'``          | Graphviz renderer for converting Dag to graphical output                   |
 +---------------------+-----------------------------------------------------+----------------------------------------------------------------------------+
+| gunicorn            | ``pip install 'apache-airflow[gunicorn]'``          | Gunicorn server with rolling worker restarts for the API server            |
++---------------------+-----------------------------------------------------+----------------------------------------------------------------------------+
 | ldap                | ``pip install 'apache-airflow[ldap]'``              | LDAP authentication for users                                              |
 +---------------------+-----------------------------------------------------+----------------------------------------------------------------------------+
 | leveldb             | ``pip install 'apache-airflow[leveldb]'``           | Required for use leveldb extra in google provider                          |
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+Add gunicorn support for API server with zero-downtime worker recycling
+
+The API server now supports gunicorn as an alternative server with rolling worker restarts
+to prevent memory accumulation in long-running processes.
+
+**Key Benefits:**
+
+* **Rolling worker restarts**: New workers spawn and pass health checks before old workers
+  are killed, ensuring zero downtime during worker recycling.
+
+* **Memory sharing**: Gunicorn uses preload + fork, so workers share memory via
+  copy-on-write. This significantly reduces total memory usage compared to uvicorn's
+  multiprocess mode where each worker loads everything independently.
+
+* **Correct FIFO signal handling**: Gunicorn's SIGTTOU kills the oldest worker (FIFO),
+  not the newest (LIFO), which is correct for rolling restarts.
+
+**Configuration:**
+
+.. code-block:: ini
+
+    [api]
+    # Use gunicorn instead of uvicorn
+    server_type = gunicorn
+
+    # Enable rolling worker restarts every 12 hours
+    worker_refresh_interval = 43200
+
+    # Restart workers one at a time
+    worker_refresh_batch_size = 1
+
+Or via environment variables:
+
+.. code-block:: bash
+
+    export AIRFLOW__API__SERVER_TYPE=gunicorn
+    export AIRFLOW__API__WORKER_REFRESH_INTERVAL=43200
+
+**Requirements:**
+
+Install the gunicorn extra: ``pip install 'apache-airflow-core[gunicorn]'``
+
+**Note on uvicorn (default):**
+
+The default uvicorn mode does not support rolling worker restarts because:
+
+1. With workers=1, there is no master process to send signals to
+2. uvicorn's SIGTTOU kills the newest worker (LIFO), defeating rolling restart purposes
+3. Each uvicorn worker loads everything independently with no memory sharing
+
+If you need worker recycling or memory-efficient multi-worker deployment, use gunicorn.
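
To see why the newsfragment treats the FIFO versus LIFO distinction as decisive, the following Python sketch simulates repeated "add one worker, then drop one" rounds under both policies. It models only the selection order described above, using worker ages in arbitrary units; the recycle function is hypothetical and does not call gunicorn or uvicorn.

```python
def recycle(worker_ages, rounds, kill_oldest):
    """Simulate `rounds` of 'spawn one fresh worker, then remove one worker'.

    `worker_ages` holds each worker's age in arbitrary time units. In every round
    all workers age by one unit, a fresh worker (age 0) joins, and one is removed:
    the oldest when `kill_oldest` is True (FIFO), otherwise the newest (LIFO).
    """
    ages = list(worker_ages)
    for _ in range(rounds):
        ages = [age + 1 for age in ages]
        ages.append(0)  # the freshly spawned worker
        victim = max(ages) if kill_oldest else min(ages)
        ages.remove(victim)
    return sorted(ages, reverse=True)


# FIFO: after three rounds every long-lived worker has been replaced.
fifo_result = recycle([10, 10, 10], rounds=3, kill_oldest=True)

# LIFO: each round removes the worker that was just spawned, so the
# original workers are never recycled and simply keep ageing.
lifo_result = recycle([10, 10, 10], rounds=3, kill_oldest=False)
```

Under the FIFO policy the pool ends with ages [2, 1, 0], while under LIFO it ends with [13, 13, 13], so any memory accumulated by the original workers is never released.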

airflow-core/pyproject.toml

Lines changed: 4 additions & 1 deletion
@@ -188,14 +188,17 @@ dependencies = [
 "memray" = [
     "memray>=1.19.0",
 ]
+"gunicorn" = [
+    "gunicorn>=23.0.0",
+]
 "otel" = [
     "opentelemetry-exporter-prometheus>=0.47b0",
 ]
 "statsd" = [
     "statsd>=3.3.0",
 ]
 "all" = [
-    "apache-airflow-core[graphviz,kerberos,otel,statsd]"
+    "apache-airflow-core[graphviz,gunicorn,kerberos,otel,statsd]"
 ]
 
 [project.scripts]
