Skip to content

Comments

Add support for the simple "sigs-based auth" VSS scheme#755

Open
TheBlueMatt wants to merge 6 commits intolightningdevkit:mainfrom
TheBlueMatt:2026-01-vss-sigs-auth
Open

Add support for the simple "sigs-based auth" VSS scheme#755
TheBlueMatt wants to merge 6 commits intolightningdevkit:mainfrom
TheBlueMatt:2026-01-vss-sigs-auth

Conversation

@TheBlueMatt
Copy link
Contributor

At https://github.com/lightningdevkit/vss-server/pull/79 we added
a new, trivial, VSS authentication scheme that ensures client
isolation without much else. This is great for testing, and we
expect some to do new-account-rate-limiting via other means, so
might well become a common default.

Here we add support to it in ldk-node.

@ldk-reviews-bot
Copy link

ldk-reviews-bot commented Jan 15, 2026

👋 I see @tnull was un-assigned.
If you'd like another reviewer assignment, please click here.

@TheBlueMatt
Copy link
Contributor Author

Depends on lightningdevkit/vss-client#54

@ldk-reviews-bot ldk-reviews-bot requested a review from tnull January 15, 2026 02:14
Copy link
Collaborator

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good, some comments. Mind adding an intermittent commit that bumps the dependency to your branch of vss-client so we can see if CI passes?

.build_with_vss_store(
config_a.node_entropy,
vss_base_url.clone(),
"node_1_store".to_string(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we changing the store_id here and below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To demonstrate/test client isolation in VSS by default.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, not sure it's worth the changes, but if you think it somehow preferable, sure..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Entirely up to you. I did it cause coverage is coverage, even if the coverage is in totally the wrong repo. If you don't want coverage in the wrong repo, that's also totally fine :).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, feel free to leave as-is.

@TheBlueMatt
Copy link
Contributor Author

Oops, fixed an issue but also pushed a new commit to drop the store_id parameter from build_with_vss_store entirely I don't see any reason to care about it now that data is keyed on the node's entropy for builds via this method (of course its still retained for other vss builders).

Copy link
Collaborator

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good, would still be good to see a vss-client dependency bump (preferably before the next vss-client release so we get a chance to amend bugs).

I don't mind the test changes, but leaning NACK on the fixating the store_id suddenly, see below.

.build_with_vss_store(
config_a.node_entropy,
vss_base_url.clone(),
"node_1_store".to_string(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, not sure it's worth the changes, but if you think it somehow preferable, sure..

src/builder.rs Outdated
&self, node_entropy: NodeEntropy, vss_url: String, fixed_headers: HashMap<String, String>,
) -> Result<Node, BuildError> {
let logger = setup_logger(&self.log_writer_config, &self.config)?;
let store_id = "ldk-node".to_owned();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should do this. Adding pubkey-auth is just another authentication mechanism, which is otherwise unrelated to the VSS API contract. Note that we had users pick store_ids freely already and they are running in production. Suddenly fixating the store_id breaks the VSS API contract but also disallows them to switch to pubkey-auth from whatever they are running right now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure I understand the argument around the API contract - there's an API contract defined in vss-client and between the VSS client and server, but I'm not really sure that applies to new instances of ldk-node using VSS. Do we really think they're going to re-derive the VSS auth key from their entropy in order to instantiate another instance of the VSS client with the same auth to store other stuff there and need to put it in the same store_id? If not, making them pass another argument just seems like yet more cognitive burden on devs to figure out what a store_id is.

disallows them to switch to pubkey-auth from whatever they are running right now.

Fwiw they can manually instantiate the vss-client VssHeaderProvider instance and call build_with_vss_store_and_header_provider, though we could also add a separate method if we anticipate users switching authentication mechanisms but keeping their existing VSS instance. It wasn't clear that we anticipate that kind of migration.

Copy link
Collaborator

@tnull tnull Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure I understand the argument around the API contract - there's an API contract defined in vss-client and between the VSS client and server, but I'm not really sure that applies to new instances of ldk-node using VSS.

Well, but we're re-exposing that API in LDK Node. In particular, we recently introduced VssStoreBuilder explicitly so that other projects (such as Orange) can build a VssStore instance outside of/prior to LDK Node initialization to be able to store some extra data. We also considered that a pre-factor to then finally upstream VssStore and VssStoreBuilder to lightning-persister soon, which has been the goal since the beginning, but we never go around to do it so far. That's all to say that I'm not quite seeing the need to fixate the store_id at all, and in fact we'll need to revert that (and then re-apply for backwards compat if we'd go that way) once we upstream the code.

Fwiw they can manually instantiate the vss-client VssHeaderProvider instance and call build_with_vss_store_and_header_provider, though we could also add a separate method if we anticipate users switching authentication mechanisms but keeping their existing VSS instance. It wasn't clear that we anticipate that kind of migration.

I mean, we should anticipate such a migration, as we have a bunch of users on JWT right now, that might opt to choose the new auth mechanism which seems to be the new default over time? At least I don't think we should make it impossible or have them some custom route just to do that? FWIW, we also have MigratableKVStore, why not allow users to build two independent VssStores with different store_id (but both using same auth) and then migrate between them, e.g., for backup purposes.

Copy link
Collaborator

@tnull tnull Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now opened a draft for this lightningdevkit/rust-lightning#4323 (i.e., draft until this lands so we upstream the latest version).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, but we're re-exposing that API in LDK Node. In particular, we recently introduced VssStoreBuilder explicitly so that other projects (such as Orange) can build a VssStore instance outside of/prior to LDK Node initialization to be able to store some extra data. We also considered that a pre-factor to then finally upstream VssStore and VssStoreBuilder to lightning-persister soon, which has been the goal since the beginning, but we never go around to do it so far. That's all to say that I'm not quite seeing the need to fixate the store_id at all, and in fact we'll need to revert that (and then re-apply for backwards compat if we'd go that way) once we upstream the code.

Right, the change here did not remove the store-id parameter from the VssStoreBuilder, but only from the NodeBuilder::build_with_vss_store method that uses an underlying VssStoreBuilder. My assumption here was that we wanted as simple an API as possible under the new sigs-based auth scheme, and for those using NodeBuilder::build_with_vss_store we assume that they are just using LDK Node, as they would otherwise use a VssStoreBuilder and pass a reference to the VSS Store to NodeBuilder::build_with_store so that they can also reuse the VSS Store elsewhere in their code.

I guess a related question here is whether something like orange can/should/will use a separate store-id from ldk node. I assume that it should, given we don't want it storing stuff somewhere LDK Node might want to in the future? In that case it would need multiple VSS Store objects. Sadly I believe the store-ids aren't obfuscated so the service would trivially see which objects are for which submodule which does kinda suck.

I mean, we should anticipate such a migration, as we have a bunch of users on JWT right now, that might opt to choose the new auth mechanism which seems to be the new default over time? At least I don't think we should make it impossible or have them some custom route just to do that? FWIW, we also have MigratableKVStore, why not allow users to build two independent VssStores with different store_id (but both using same auth) and then migrate between them, e.g., for backup purposes.

Right, my question was only how prevalent it would be. Of course the current API supports it just fine (either with a manual header provider or via the VssStoreBuilder), just a question of if we expect it to be universal so much that we should make the API more complicated for new uses.

Copy link
Contributor

@tankyleo tankyleo Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tnull As matt mentions, what do you think of the VssStoreBuilder -> VssStore -> Node::build_with_store route that people can take today to build a sig-auth based VSS backend, with a custom store_id ? Seems reasonable enough to ask people to take this route if they want to customize the store_id IMHO.

Tangentially: as to the goal of minimizing confusion / cognitive load, I would delete all the existing Builder::build_with_vss_store* methods, and make all VSS users go through VssStoreBuilder. We currently have two ways of achieving the same thing, which is confusing to me (see for example Builder::build_with_vss_store_and_lnurl_auth and VssStoreBuilder::build_with_lnurl).

Then with store_id, I think it would be more helpful to add one or two documentation sentences explaining to people how they should think about store_id at a high-level, rather than the current choice of either 1) fixing the store_id to ldk-node under the hood, or 2) letting people customize it, with no guidance on how they should think about selecting the store_id string, and no explanation on the purpose of store_id in the VSS design.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, the change here did not remove the store-id parameter from the VssStoreBuilder, but only from the NodeBuilder::build_with_vss_store method that uses an underlying VssStoreBuilder. My assumption here was that we wanted as simple an API as possible under the new sigs-based auth scheme, and for those using NodeBuilder::build_with_vss_store we assume that they are just using LDK Node, as they would otherwise use a VssStoreBuilder and pass a reference to the VSS Store to NodeBuilder::build_with_store so that they can also reuse the VSS Store elsewhere in their code.

No, note that VssStoreBuilder is not (and given we're going to upstream it, never will) be exposed in bindings, so we still need a way for bindings users to set their store id.

I guess a related question here is whether something like orange can/should/will use a separate store-id from ldk node. I assume that it should, given we don't want it storing stuff somewhere LDK Node might want to in the future? In that case it would need multiple VSS Store objects. Sadly I believe the store-ids aren't obfuscated so the service would trivially see which objects are for which submodule which does kinda suck.

Yeah, maybe they should build two separate stores, though not sure what's @benthecarman's thoughts here. Any case, choosing store ids is left to the user, if they obfuscate it the service won't learn anything.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tnull As matt mentions, what do you think of the VssStoreBuilder -> VssStore -> Node::build_with_store route that people can take today to build a sig-auth based VSS backend, with a custom store_id ? Seems reasonable enough to ask people to take this route if they want to customize the store_id IMHO.

Unfortunately, I'm still not the biggest fan of throwing in that semantic API change with just adding another auth mechanism. It's really an entirely orthogonal choice, that now unnecessarily complicates the discussion on this PR, IMO. As noted above, bindings users won't have access to the VssStoreBuilder, so we need to provide a way to set the (potentially legacy) store id via Builder.

Tangentially: as to the goal of minimizing confusion / cognitive load, I would delete all the existing Builder::build_with_vss_store* methods, and make all VSS users go through VssStoreBuilder. We currently have two ways of achieving the same thing, which is confusing to me (see for example Builder::build_with_vss_store_and_lnurl_auth and VssStoreBuilder::build_with_lnurl).

See above, we can't do that as we don't expose VssStoreBuilder in bindings, and likely won't ever given we're going to upstream it to lightning-persister, where we don't have the notion of uniffi bindings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmm, that's a good point about bindings. I'll re-add the argument but try to add more context in the documentation.

@ldk-reviews-bot
Copy link

🔔 1st Reminder

Hey @tankyleo! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot
Copy link

🔔 2nd Reminder

Hey @tankyleo! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot
Copy link

🔔 3rd Reminder

Hey @tankyleo! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

Copy link
Collaborator

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, we now shipped vss-client-ng v0.5.0 and bumped the dependency in #796. Please rebase!

@TheBlueMatt TheBlueMatt force-pushed the 2026-01-vss-sigs-auth branch 4 times, most recently from 205e2eb to 96434f8 Compare February 20, 2026 12:36
@TheBlueMatt
Copy link
Contributor Author

Done!

Copy link
Collaborator

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rustfmt is unhappy

Copy link
Collaborator

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, seems bindings builds are still failing unfortunately.

@TheBlueMatt
Copy link
Contributor Author

No, rustfmt I had fixed, but now bindings aren't building. Its very annoying that there isn't an easy way to run CI tests locally, both because whenever I try to run cargo test locally I get flaky test failures and because there isn't a simple script I can run which tests things like bindings builds so I have to wait for CI to run.

@TheBlueMatt TheBlueMatt force-pushed the 2026-01-vss-sigs-auth branch 2 times, most recently from cb16271 to 1756e6a Compare February 20, 2026 12:51
At lightningdevkit/vss-server#79 we added
a new, trivial, VSS authentication scheme that ensures client
isolation without much else. This is great for testing, and we
expect some to do new-account-rate-limiting via other means, so
might well become a common default.

Here we add support to it in ldk-node.
@TheBlueMatt TheBlueMatt force-pushed the 2026-01-vss-sigs-auth branch from 1756e6a to c4a6307 Compare February 20, 2026 12:54
@tnull
Copy link
Collaborator

tnull commented Feb 20, 2026

both because whenever I try to run cargo test locally I get flaky test failures

Ah, anything in particular that stands out?

because there isn't a simple script I can run which tests things like bindings builds so I have to wait for CI to run.

Hmm, well, for the Rust part the cargo test --features uniffi basically does the trick as it covers bindings builds which are just a more restrictive layer on top of all rust tests. Everything beyond that (i.e., the other integration tests) requires docker containers to run, so I don't think we would want to include them in a 'catchall' script anyways.

That said, given you changed the API, you'll also need to make changes in integration_tests_vss.rs now. See RUSTFLAGS="--cfg vss_test" cargo check --tests.

@tnull
Copy link
Collaborator

tnull commented Feb 20, 2026

The other CI failure is #798

@TheBlueMatt
Copy link
Contributor Author

TheBlueMatt commented Feb 20, 2026

Ah, anything in particular that stands out?

I've never debugged it honestly. I often see port-bind errors as well.

 Hmm, well, for the Rust part the cargo test --features uniffi basically does the trick as it covers bindings builds which are just a more restrictive layer on top of all rust tests.

Okay, I'll update my local scripts to just always run that instead.

That said, given you changed the API, you'll also need to make changes in integration_tests_vss.rs now. See RUSTFLAGS="--cfg vss_test" cargo check --tests.

No, the compilation failure there was introduced in 76194c7

@tnull
Copy link
Collaborator

tnull commented Feb 20, 2026

No, the compilation failure there was introduced in 76194c7

Hmm, very strange, as this pre-dates lightningdevkit/vss-server#79 considerably, and before that CI was passing, in particular in #683?

When we added the trivial sigs-based authentication scheme in VSS,
we made it the default if no other authentication scheme was
configured and default features are enabled.

This broke our integration tests as we were expecting no
authentication to be required in such a case.

Here we fix this by switching to the new sigs-based auth scheme,
removing `store_id`s to demonstrate client isolation while we're at
it.

Sadly, because we don't currently have a test framework for
LNURL-auth-based VSS, and because VSS no longer defaults to
no-auth, we can't practically test the upgrade-from-0.6 logic
anymore, so opt to simply drop the test.
@TheBlueMatt TheBlueMatt force-pushed the 2026-01-vss-sigs-auth branch from 739f16c to 490bfda Compare February 20, 2026 17:44
@TheBlueMatt
Copy link
Contributor Author

Ah, apologies, that was actually a rebase error, the code that was broken by that commit was mine, I just didn't notice.

Copy link
Collaborator

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately CI is still failing.

@TheBlueMatt TheBlueMatt force-pushed the 2026-01-vss-sigs-auth branch from 490bfda to 85aec2d Compare February 20, 2026 18:16
@TheBlueMatt TheBlueMatt force-pushed the 2026-01-vss-sigs-auth branch from 85aec2d to 344eb8d Compare February 20, 2026 18:21
@TheBlueMatt
Copy link
Contributor Author

The tests are now hitting "Cannot drop a runtime in a context where blocking is not allowed. This happens when a runtime is dropped from within an asynchronous context." Any idea why?

Copy link
Collaborator

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests are now hitting "Cannot drop a runtime in a context where blocking is not allowed. This happens when a runtime is dropped from within an asynchronous context." Any idea why?

Yes, this happens when some of the tests exit early due to a previous panic/error. In this case, your changes to the store ID generation don't work, resulting in

Err(Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: Some(Error), code: SqlState(E23514), message: "new row for relation \"vss_db\" violates check constraint \"vss_db_store_id_check\"", detail: Some("Failing row contains (02c5703a8c0e840b3ed08e53985b2c4a506ca3dd3c04c86bd86a2e8c899eda6e..., , 1RtRhcZwBa85/GA7LTcrwHXRNZvxHioEFOH6P27+GBMV5lLO/2zfIGp/q03H#jTX..., \\x0a0f466c23eef591505b6d2ad20091681212320a104368614368613230506f..., 1, 2026-02-23 10:40:37.512281+01, 2026-02-23 10:40:37.512281+01)."), hint: None, position: None, where_: None, schema: Some("public"), table: Some("vss_db"), column: None, datatype: None, constraint: Some("vss_db_store_id_check"), file: Some("execMain.c"), line: Some(2074), routine: Some("ExecConstraints") }) })
[2026-02-23T09:40:37.515Z DEBUG vss_server::vss_service:100] PutObjectRequest 5972812125750132542 failed: InternalServerError: Database operation failed. db error

on the server side for any PutObjectRequest.

}

#[tokio::test(flavor = "multi_thread", worker_threads = 1)]
async fn vss_v0_schema_backwards_compatibility() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the commit message but its not really practical to support anymore - we don't have LNURL-auth support in the old version of LDK Node being tested here and VSS wants auth now, so getting the data in the table isn't trivial. We'd have to use a custom VSS server authenticator or disable the auth...given VSS is labeled "alpha" still it didn't seem worth the complexity.

Copy link
Collaborator

@tnull tnull Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't have LNURL-auth support in the old version of LDK Node being tested here

Yes we did? https://docs.rs/ldk-node/0.6.2/ldk_node/struct.Builder.html#method.build_with_vss_store

and VSS wants auth now

No, test setups like this are exactly the reason why we kept NoopAuthorizer around behind the cfg(noop_authorizer) flag: lightningdevkit/vss-server@52fd919

so getting the data in the table isn't trivial

It should be relatively easy, given one of the two paths mentioned above?

Note that we did the schema upgrade just last release and existing users aren't actually migrated to the new schema. So IMO it would be good to keep the backwards compatibility around to ensure we don't break anything for existing users.

let mut rng = rng();
let rand_store_id: String = (0..7).map(|_| rng.sample(Alphanumeric) as char).collect();
let mut vss_seed = [0u8; 32];
let mut vss_seed = [0u8; 64];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the vss_seed is 32 bytes, this is not the node entropy seed. Renaming and mixing the concepts here is confusing. Please revert.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code had to change to use a NodeEntropy rather than the previous version because VSS server no longer accepts fixed empty headers so we have to use the builder. It can't be reverted, but I renamed the field.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I mostly meant the renaming, though we will need to revisit this as part of lightningdevkit/rust-lightning#4323 anyways, as NodeEntropy is ofc an LDK Node concept, so the builder can't take it as an argument then.

store_id.clone(),
HashMap::new(),
)
.build_with_vss_store(node_entropy, vss_base_url.clone(), "".to_owned(), HashMap::new())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The empty store ID here is invalid, we expect it to be character varying(120) NOT NULL (see https://github.com/lightningdevkit/vss-server/blob/main/rust/impls/src/postgres/sql/v0_create_vss_db.sql#L3). Please revert this to use the alphanumeric store ids we were using before.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, shouldn't we update vss-server to accept an empty store-id? I imagine the most common use will in fact by an empty store-id (because there's no reason to set a store id) and it doesn't seem worth forcing people to set it.

@TheBlueMatt
Copy link
Contributor Author

In this case, your changes to the store ID generation don't work, resulting in

Ah! Sorry I missed that, thought I looked for earlier errors...

@tnull
Copy link
Collaborator

tnull commented Feb 23, 2026

CI unhappy:

error[E0425]: cannot find value `vss_seed` in this scope
    --> src/io/vss_store.rs:1016:46
     |
1016 |         let entropy = NodeEntropy::from_seed_bytes(vss_seed);
     |                                                    ^^^^^^^^ not found in this scope

error[E0425]: cannot find value `vss_seed` in this scope
    --> src/io/vss_store.rs:1031:46
     |
1031 |         let entropy = NodeEntropy::from_seed_bytes(vss_seed);
     |                                                    ^^^^^^^^ not found in this scope

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants