
fix(RuntimeConfigurationPlugin): avoid Jetty/webconsole starts on slave when using RuntimeConfigurationBroker plugin#1921

Closed
jbonofre wants to merge 2 commits into apache:main from jbonofre:gh-1920

Conversation

@jbonofre
Member

@jbonofre jbonofre commented Apr 14, 2026

Move the RuntimeConfigurationBroker plugin init logic (config loading, file monitoring, MBean registration) from start() to nowMasterBroker(). This callback is invoked by BrokerService.doStartBroker(), after startAllConnectors() has set slave=true. It's the proper lifecycle hook for "this broker is now a master".

This avoids a potential race condition where the slave can start Jetty/webconsole.
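The lifecycle change described above can be sketched as a toy model. This is not ActiveMQ code: the `RuntimeConfigPlugin` class and its fields are invented stand-ins that only mirror the names used in this PR, to illustrate init work moving from `start()` to the `nowMasterBroker()` callback.

```java
import java.util.ArrayList;
import java.util.List;

public class LifecycleSketch {
    // Hypothetical stand-in for the RuntimeConfigurationBroker plugin.
    static class RuntimeConfigPlugin {
        final List<String> log = new ArrayList<>();
        private boolean initialized = false;

        void start() {
            // Before this PR, config loading, file monitoring and MBean
            // registration ran here, even on a slave still waiting for
            // the persistence-store lock.
            log.add("start: init deferred");
        }

        void nowMasterBroker() {
            // After this PR, init runs only once the broker has
            // actually become master.
            if (!initialized) {
                initialized = true;
                log.add("nowMasterBroker: load config, watch file, register MBean");
            }
        }

        boolean isInitialized() { return initialized; }
    }

    public static void main(String[] args) {
        RuntimeConfigPlugin plugin = new RuntimeConfigPlugin();
        plugin.start();            // slave starts: nothing happens yet
        System.out.println("after start: initialized=" + plugin.isInitialized());
        plugin.nowMasterBroker();  // lock acquired: init runs
        System.out.println("after nowMasterBroker: initialized=" + plugin.isInitialized());
    }
}
```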


Verify that RuntimeConfigurationBroker defers init (config loading, file
monitoring, MBean registration) to nowMasterBroker() instead of start(),
preventing race conditions in master/slave topologies.
@cshannon cshannon requested a review from mattrpav April 14, 2026 12:29
Contributor

@cshannon cshannon left a comment


This seems like it's okay, but I don't have much experience with a failover setup or the RuntimeConfiguration plugin. @mattrpav - any concerns here?

@mattrpav
Contributor

mattrpav commented Apr 14, 2026

I'm not sure there is really a problem with having the RuntimeConfigPlugin running while the broker is active or inactive.

The Broker does not have a lifecycle hook to indicate when the persistent store is not available (aka 'slave'). So this change only solves for the 'first time' becoming primary/master.

If the broker flips back to failover/slave, the RuntimeConfigPlugin will still be active and processing configuration changes.

Since the plugin can process transportConnectors, I can see a scenario where the original set of transport connectors comes online even though the user has already removed (or changed) them via config; the plugin hasn't processed the change yet, so ports come online that were not intended.

edit: this config race condition could also lead to incorrect security, startup destinations, policies, and really anything else that is updatable via the plugin.

@jbonofre
Member Author

@mattrpav yes, the purpose here is specifically to avoid the race condition.

@mattrpav
Contributor

@jbonofre this adds a race condition for the configuration update, and the plugin then remains active even if the broker goes back to being a failover/slave broker, which leads to inconsistent behavior. The Broker interface used for lifecycle has a gap: there is no hook/method called when a broker goes back to being a slave/failover node.

Described:

  1. BrokerA starts up as primary
  2. BrokerB starts up as failover
  3. User applies config change
  4. BrokerA applies configuration (new policy entries and authz entries)
  5. BrokerB does not apply change (plugin not active)
  6. BrokerA goes offline
  7. BrokerB becomes active with the old configuration
  8. RuntimeConfig polling period completes; new configuration applied <-- config race condition
  9. BrokerB goes offline
  10. BrokerA comes online
  11. BrokerB RuntimeConfig continues to run <-- inconsistent behavior
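The eleven steps above can be condensed into a toy simulation. Every class here is an invented stand-in, not ActiveMQ code; config changes are modeled as a version number, and the missing "now slave" lifecycle hook is what lets the plugin keep applying config after demotion.

```java
public class FailoverRaceSketch {
    // Hypothetical model of a broker with a RuntimeConfig-style plugin.
    static class Broker {
        final String name;
        boolean master = false;
        boolean pluginActive = false;    // plugin is off until first promotion
        int appliedConfigVersion = 1;    // static XML config loaded at startup

        Broker(String name) { this.name = name; }

        void nowMasterBroker() {
            master = true;
            pluginActive = true;         // activated here, never deactivated
        }

        // Gap in the Broker lifecycle interface: nothing fires on
        // demotion, so the plugin cannot be switched off here.
        void demote() { master = false; }

        void pollConfig(int latestVersion) {
            if (pluginActive) {
                appliedConfigVersion = latestVersion; // applies even when demoted
            }
        }
    }

    public static void main(String[] args) {
        Broker a = new Broker("BrokerA");
        Broker b = new Broker("BrokerB");
        a.nowMasterBroker();   // 1-2: A primary, B failover
        a.pollConfig(2);       // 3-4: user changes config; A applies v2
        b.pollConfig(2);       // 5: no-op, B's plugin is not active
        a.demote();            // 6: A goes offline
        b.nowMasterBroker();   // 7: B active, but still on v1 (old config)
        System.out.println(b.name + " on promotion: v" + b.appliedConfigVersion);
        b.pollConfig(2);       // 8: poll completes, v2 applied late <-- race window
        b.demote();            // 9: B goes offline
        a.nowMasterBroker();   // 10: A back online
        b.pollConfig(3);       // 11: B's plugin still runs <-- inconsistent behavior
        System.out.println(b.name + " after demotion: pluginActive=" + b.pluginActive
                + " v" + b.appliedConfigVersion);
    }
}
```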

Questions:
Q-1: What problem exists with having a failover/slave broker process config changes?

@jbonofre
Member Author

@mattrpav that's a good call. Let me revisit the fix, focusing on the race condition on the slave: the problem I'm observing is that, because of the RuntimeConfigurationPlugin, a slave can start Jetty/WebConsole even while the broker is still waiting for the lock.
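One possible shape of the "more global fix" hinted at here is to guard the web console startup on the broker's slave state. `BrokerService.isSlave()` is a real ActiveMQ method, but everything else in this sketch (the stub `BrokerService` and the `WebConsoleStarter` class) is invented for illustration.

```java
public class WebConsoleGuardSketch {
    // Minimal stand-in for org.apache.activemq.broker.BrokerService.
    static class BrokerService {
        private volatile boolean slave = true;  // true until the store lock is acquired
        boolean isSlave() { return slave; }
        void lockAcquired() { slave = false; }
    }

    // Hypothetical console bootstrap that checks the slave state
    // before bringing up Jetty.
    static class WebConsoleStarter {
        boolean jettyStarted = false;

        void maybeStart(BrokerService broker) {
            if (broker.isSlave()) {
                // A slave should not expose the console; serve a
                // "slave page" or simply skip Jetty startup.
                return;
            }
            jettyStarted = true;  // start Jetty only on the master
        }
    }

    public static void main(String[] args) {
        BrokerService broker = new BrokerService();
        WebConsoleStarter console = new WebConsoleStarter();
        console.maybeStart(broker);  // still a slave: Jetty stays down
        System.out.println("on slave: jettyStarted=" + console.jettyStarted);
        broker.lockAcquired();
        console.maybeStart(broker);  // now master: Jetty starts
        System.out.println("on master: jettyStarted=" + console.jettyStarted);
    }
}
```

This also matches the linked issue at the bottom of the page, which asks for a "slave page" rather than an error when the console is hit on a slave.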

@jbonofre
Member Author

Closing this one to implement a more global fix.

@jbonofre jbonofre closed this Apr 14, 2026
@jbonofre jbonofre deleted the gh-1920 branch April 15, 2026 15:52


Development

Successfully merging this pull request may close these issues.

WebConsole should redirect to "slave page" instead of error when started on slave

3 participants