Add support for the simple "sigs-based auth" VSS scheme#755
Add support for the simple "sigs-based auth" VSS scheme#755TheBlueMatt wants to merge 6 commits intolightningdevkit:mainfrom
Conversation
TheBlueMatt
commented
Jan 15, 2026
|
👋 I see @tnull was un-assigned. |
|
Depends on lightningdevkit/vss-client#54 |
tnull
left a comment
There was a problem hiding this comment.
Generally looks good, some comments. Mind adding an intermittent commit that bumps the dependency to your branch of vss-client so we can see if CI passes?
| .build_with_vss_store( | ||
| config_a.node_entropy, | ||
| vss_base_url.clone(), | ||
| "node_1_store".to_string(), |
There was a problem hiding this comment.
Why are we changing the store_id here and below?
There was a problem hiding this comment.
To demonstrate/test client isolation in VSS by default.
There was a problem hiding this comment.
Hmm, not sure it's worth the changes, but if you think it somehow preferable, sure..
There was a problem hiding this comment.
Entirely up to you. I did it cause coverage is coverage, even if the coverage is in totally the wrong repo. If you don't want coverage in the wrong repo, that's also totally fine :).
There was a problem hiding this comment.
Yeah, feel free to leave as-is.
|
Oops, fixed an issue but also pushed a new commit to drop the |
tnull
left a comment
There was a problem hiding this comment.
Generally looks good, would still be good to see a vss-client dependency bump (preferably before the next vss-client release so we get a chance to amend bugs).
I don't mind the test changes, but leaning NACK on the fixating the store_id suddenly, see below.
| .build_with_vss_store( | ||
| config_a.node_entropy, | ||
| vss_base_url.clone(), | ||
| "node_1_store".to_string(), |
There was a problem hiding this comment.
Hmm, not sure it's worth the changes, but if you think it somehow preferable, sure..
src/builder.rs
Outdated
| &self, node_entropy: NodeEntropy, vss_url: String, fixed_headers: HashMap<String, String>, | ||
| ) -> Result<Node, BuildError> { | ||
| let logger = setup_logger(&self.log_writer_config, &self.config)?; | ||
| let store_id = "ldk-node".to_owned(); |
There was a problem hiding this comment.
I don't think we should do this. Adding pubkey-auth is just another authentication mechanism, which is otherwise unrelated to the VSS API contract. Note that we had users pick store_ids freely already and they are running in production. Suddenly fixating the store_id breaks the VSS API contract but also disallows them to switch to pubkey-auth from whatever they are running right now.
There was a problem hiding this comment.
I'm not quite sure I understand the argument around the API contract - there's an API contract defined in vss-client and between the VSS client and server, but I'm not really sure that applies to new instances of ldk-node using VSS. Do we really think they're going to re-derive the VSS auth key from their entropy in order to instantiate another instance of the VSS client with the same auth to store other stuff there and need to put it in the same store_id? If not, making them pass another argument just seems like yet more cognitive burden on devs to figure out what a store_id is.
disallows them to switch to pubkey-auth from whatever they are running right now.
Fwiw they can manually instantiate the vss-client VssHeaderProvider instance and call build_with_vss_store_and_header_provider, though we could also add a separate method if we anticipate users switching authentication mechanisms but keeping their existing VSS instance. It wasn't clear that we anticipate that kind of migration.
There was a problem hiding this comment.
I'm not quite sure I understand the argument around the API contract - there's an API contract defined in vss-client and between the VSS client and server, but I'm not really sure that applies to new instances of ldk-node using VSS.
Well, but we're re-exposing that API in LDK Node. In particular, we recently introduced VssStoreBuilder explicitly so that other projects (such as Orange) can build a VssStore instance outside of/prior to LDK Node initialization to be able to store some extra data. We also considered that a pre-factor to then finally upstream VssStore and VssStoreBuilder to lightning-persister soon, which has been the goal since the beginning, but we never go around to do it so far. That's all to say that I'm not quite seeing the need to fixate the store_id at all, and in fact we'll need to revert that (and then re-apply for backwards compat if we'd go that way) once we upstream the code.
Fwiw they can manually instantiate the vss-client VssHeaderProvider instance and call build_with_vss_store_and_header_provider, though we could also add a separate method if we anticipate users switching authentication mechanisms but keeping their existing VSS instance. It wasn't clear that we anticipate that kind of migration.
I mean, we should anticipate such a migration, as we have a bunch of users on JWT right now, that might opt to choose the new auth mechanism which seems to be the new default over time? At least I don't think we should make it impossible or have them some custom route just to do that? FWIW, we also have MigratableKVStore, why not allow users to build two independent VssStores with different store_id (but both using same auth) and then migrate between them, e.g., for backup purposes.
There was a problem hiding this comment.
Now opened a draft for this lightningdevkit/rust-lightning#4323 (i.e., draft until this lands so we upstream the latest version).
There was a problem hiding this comment.
Well, but we're re-exposing that API in LDK Node. In particular, we recently introduced VssStoreBuilder explicitly so that other projects (such as Orange) can build a VssStore instance outside of/prior to LDK Node initialization to be able to store some extra data. We also considered that a pre-factor to then finally upstream VssStore and VssStoreBuilder to lightning-persister soon, which has been the goal since the beginning, but we never go around to do it so far. That's all to say that I'm not quite seeing the need to fixate the store_id at all, and in fact we'll need to revert that (and then re-apply for backwards compat if we'd go that way) once we upstream the code.
Right, the change here did not remove the store-id parameter from the VssStoreBuilder, but only from the NodeBuilder::build_with_vss_store method that uses an underlying VssStoreBuilder. My assumption here was that we wanted as simple an API as possible under the new sigs-based auth scheme, and for those using NodeBuilder::build_with_vss_store we assume that they are just using LDK Node, as they would otherwise use a VssStoreBuilder and pass a reference to the VSS Store to NodeBuilder::build_with_store so that they can also reuse the VSS Store elsewhere in their code.
I guess a related question here is whether something like orange can/should/will use a separate store-id from ldk node. I assume that it should, given we don't want it storing stuff somewhere LDK Node might want to in the future? In that case it would need multiple VSS Store objects. Sadly I believe the store-ids aren't obfuscated so the service would trivially see which objects are for which submodule which does kinda suck.
I mean, we should anticipate such a migration, as we have a bunch of users on JWT right now, that might opt to choose the new auth mechanism which seems to be the new default over time? At least I don't think we should make it impossible or have them some custom route just to do that? FWIW, we also have MigratableKVStore, why not allow users to build two independent VssStores with different store_id (but both using same auth) and then migrate between them, e.g., for backup purposes.
Right, my question was only how prevalent it would be. Of course the current API supports it just fine (either with a manual header provider or via the VssStoreBuilder), just a question of if we expect it to be universal so much that we should make the API more complicated for new uses.
There was a problem hiding this comment.
@tnull As matt mentions, what do you think of the VssStoreBuilder -> VssStore -> Node::build_with_store route that people can take today to build a sig-auth based VSS backend, with a custom store_id ? Seems reasonable enough to ask people to take this route if they want to customize the store_id IMHO.
Tangentially: as to the goal of minimizing confusion / cognitive load, I would delete all the existing Builder::build_with_vss_store* methods, and make all VSS users go through VssStoreBuilder. We currently have two ways of achieving the same thing, which is confusing to me (see for example Builder::build_with_vss_store_and_lnurl_auth and VssStoreBuilder::build_with_lnurl).
Then with store_id, I think it would be more helpful to add one or two documentation sentences explaining to people how they should think about store_id at a high-level, rather than the current choice of either 1) fixing the store_id to ldk-node under the hood, or 2) letting people customize it, with no guidance on how they should think about selecting the store_id string, and no explanation on the purpose of store_id in the VSS design.
There was a problem hiding this comment.
Right, the change here did not remove the store-id parameter from the
VssStoreBuilder, but only from theNodeBuilder::build_with_vss_storemethod that uses an underlyingVssStoreBuilder. My assumption here was that we wanted as simple an API as possible under the new sigs-based auth scheme, and for those usingNodeBuilder::build_with_vss_storewe assume that they are just using LDK Node, as they would otherwise use aVssStoreBuilderand pass a reference to the VSS Store toNodeBuilder::build_with_storeso that they can also reuse the VSS Store elsewhere in their code.
No, note that VssStoreBuilder is not (and given we're going to upstream it, never will) be exposed in bindings, so we still need a way for bindings users to set their store id.
I guess a related question here is whether something like orange can/should/will use a separate store-id from ldk node. I assume that it should, given we don't want it storing stuff somewhere LDK Node might want to in the future? In that case it would need multiple VSS Store objects. Sadly I believe the store-ids aren't obfuscated so the service would trivially see which objects are for which submodule which does kinda suck.
Yeah, maybe they should build two separate stores, though not sure what's @benthecarman's thoughts here. Any case, choosing store ids is left to the user, if they obfuscate it the service won't learn anything.
There was a problem hiding this comment.
@tnull As matt mentions, what do you think of the
VssStoreBuilder->VssStore->Node::build_with_storeroute that people can take today to build a sig-auth based VSS backend, with a custom store_id ? Seems reasonable enough to ask people to take this route if they want to customize thestore_idIMHO.
Unfortunately, I'm still not the biggest fan of throwing in that semantic API change with just adding another auth mechanism. It's really an entirely orthogonal choice, that now unnecessarily complicates the discussion on this PR, IMO. As noted above, bindings users won't have access to the VssStoreBuilder, so we need to provide a way to set the (potentially legacy) store id via Builder.
Tangentially: as to the goal of minimizing confusion / cognitive load, I would delete all the existing
Builder::build_with_vss_store*methods, and make all VSS users go throughVssStoreBuilder. We currently have two ways of achieving the same thing, which is confusing to me (see for exampleBuilder::build_with_vss_store_and_lnurl_authandVssStoreBuilder::build_with_lnurl).
See above, we can't do that as we don't expose VssStoreBuilder in bindings, and likely won't ever given we're going to upstream it to lightning-persister, where we don't have the notion of uniffi bindings.
There was a problem hiding this comment.
Mmm, that's a good point about bindings. I'll re-add the argument but try to add more context in the documentation.
|
🔔 1st Reminder Hey @tankyleo! This PR has been waiting for your review. |
|
🔔 2nd Reminder Hey @tankyleo! This PR has been waiting for your review. |
|
🔔 3rd Reminder Hey @tankyleo! This PR has been waiting for your review. |
1c51a36 to
4e3f7b9
Compare
205e2eb to
96434f8
Compare
|
Done! |
tnull
left a comment
There was a problem hiding this comment.
Ah, seems bindings builds are still failing unfortunately.
|
No, rustfmt I had fixed, but now bindings aren't building. Its very annoying that there isn't an easy way to run CI tests locally, both because whenever I try to run |
cb16271 to
1756e6a
Compare
At lightningdevkit/vss-server#79 we added a new, trivial, VSS authentication scheme that ensures client isolation without much else. This is great for testing, and we expect some to do new-account-rate-limiting via other means, so might well become a common default. Here we add support to it in ldk-node.
1756e6a to
c4a6307
Compare
Ah, anything in particular that stands out?
Hmm, well, for the Rust part the That said, given you changed the API, you'll also need to make changes in |
|
The other CI failure is #798 |
I've never debugged it honestly. I often see port-bind errors as well.
Okay, I'll update my local scripts to just always run that instead.
No, the compilation failure there was introduced in 76194c7 |
Hmm, very strange, as this pre-dates lightningdevkit/vss-server#79 considerably, and before that CI was passing, in particular in #683? |
When we added the trivial sigs-based authentication scheme in VSS, we made it the default if no other authentication scheme was configured and default features are enabled. This broke our integration tests as we were expecting no authentication to be required in such a case. Here we fix this by switching to the new sigs-based auth scheme, removing `store_id`s to demonstrate client isolation while we're at it. Sadly, because we don't currently have a test framework for LNURL-auth-based VSS, and because VSS no longer defaults to no-auth, we can't practically test the upgrade-from-0.6 logic anymore, so opt to simply drop the test.
739f16c to
490bfda
Compare
|
Ah, apologies, that was actually a rebase error, the code that was broken by that commit was mine, I just didn't notice. |
tnull
left a comment
There was a problem hiding this comment.
Unfortunately CI is still failing.
490bfda to
85aec2d
Compare
85aec2d to
344eb8d
Compare
|
The tests are now hitting "Cannot drop a runtime in a context where blocking is not allowed. This happens when a runtime is dropped from within an asynchronous context." Any idea why? |
tnull
left a comment
There was a problem hiding this comment.
The tests are now hitting "Cannot drop a runtime in a context where blocking is not allowed. This happens when a runtime is dropped from within an asynchronous context." Any idea why?
Yes, this happens when some of the tests exit early due to a previous panic/error. In this case, your changes to the store ID generation don't work, resulting in
Err(Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: Some(Error), code: SqlState(E23514), message: "new row for relation \"vss_db\" violates check constraint \"vss_db_store_id_check\"", detail: Some("Failing row contains (02c5703a8c0e840b3ed08e53985b2c4a506ca3dd3c04c86bd86a2e8c899eda6e..., , 1RtRhcZwBa85/GA7LTcrwHXRNZvxHioEFOH6P27+GBMV5lLO/2zfIGp/q03H#jTX..., \\x0a0f466c23eef591505b6d2ad20091681212320a104368614368613230506f..., 1, 2026-02-23 10:40:37.512281+01, 2026-02-23 10:40:37.512281+01)."), hint: None, position: None, where_: None, schema: Some("public"), table: Some("vss_db"), column: None, datatype: None, constraint: Some("vss_db_store_id_check"), file: Some("execMain.c"), line: Some(2074), routine: Some("ExecConstraints") }) })
[2026-02-23T09:40:37.515Z DEBUG vss_server::vss_service:100] PutObjectRequest 5972812125750132542 failed: InternalServerError: Database operation failed. db error
on the server side for any PutObjectRequest.
| } | ||
|
|
||
| #[tokio::test(flavor = "multi_thread", worker_threads = 1)] | ||
| async fn vss_v0_schema_backwards_compatibility() { |
There was a problem hiding this comment.
I updated the commit message but its not really practical to support anymore - we don't have LNURL-auth support in the old version of LDK Node being tested here and VSS wants auth now, so getting the data in the table isn't trivial. We'd have to use a custom VSS server authenticator or disable the auth...given VSS is labeled "alpha" still it didn't seem worth the complexity.
There was a problem hiding this comment.
we don't have LNURL-auth support in the old version of LDK Node being tested here
Yes we did? https://docs.rs/ldk-node/0.6.2/ldk_node/struct.Builder.html#method.build_with_vss_store
and VSS wants auth now
No, test setups like this are exactly the reason why we kept NoopAuthorizer around behind the cfg(noop_authorizer) flag: lightningdevkit/vss-server@52fd919
so getting the data in the table isn't trivial
It should be relatively easy, given one of the two paths mentioned above?
Note that we did the schema upgrade just last release and existing users aren't actually migrated to the new schema. So IMO it would be good to keep the backwards compatibility around to ensure we don't break anything for existing users.
src/io/vss_store.rs
Outdated
| let mut rng = rng(); | ||
| let rand_store_id: String = (0..7).map(|_| rng.sample(Alphanumeric) as char).collect(); | ||
| let mut vss_seed = [0u8; 32]; | ||
| let mut vss_seed = [0u8; 64]; |
There was a problem hiding this comment.
No, the vss_seed is 32 bytes, this is not the node entropy seed. Renaming and mixing the concepts here is confusing. Please revert.
There was a problem hiding this comment.
The code had to change to use a NodeEntropy rather than the previous version because VSS server no longer accepts fixed empty headers so we have to use the builder. It can't be reverted, but I renamed the field.
There was a problem hiding this comment.
Right, I mostly meant the renaming, though we will need to revisit this as part of lightningdevkit/rust-lightning#4323 anyways, as NodeEntropy is ofc an LDK Node concept, so the builder can't take it as an argument then.
| store_id.clone(), | ||
| HashMap::new(), | ||
| ) | ||
| .build_with_vss_store(node_entropy, vss_base_url.clone(), "".to_owned(), HashMap::new()) |
There was a problem hiding this comment.
The empty store ID here is invalid, we expect it to be character varying(120) NOT NULL (see https://github.com/lightningdevkit/vss-server/blob/main/rust/impls/src/postgres/sql/v0_create_vss_db.sql#L3). Please revert this to use the alphanumeric store ids we were using before.
There was a problem hiding this comment.
Hmm, shouldn't we update vss-server to accept an empty store-id? I imagine the most common use will in fact by an empty store-id (because there's no reason to set a store id) and it doesn't seem worth forcing people to set it.
Ah! Sorry I missed that, thought I looked for earlier errors... |
|
CI unhappy: |