fix(linux/xdgportal): Properly support multiple screens by exposing pipewire streams as separate displays#4931
Conversation
1f4fa32 to
58efcab
Compare
|
@psyke83 Sorry to ping you here but can you think of a reason why the keyboard shortcut to switch displays would not work with portalgrab? I've been trying to do so after enumerating them properly for display_names (current state of this PR) but the switch_display_event from video.cpp does not seem to be firing or handled when using portalgrab and I can't figure what I'm missing. |
If you add some debug to video.cpp you'll see that your current code is hitting the |
|
Look: Sunshine/src/platform/linux/kmsgrab.cpp Lines 1221 to 1230 in b172a98 It seems that portalgrab is not checking the push_captured_image_cb callback the same way in the capture loop. The same logic may need to be added. I will look at this further later when I have time, if you don't figure it out. |
That is exactly what was missing to get the keyboard shortcut to work. Thanks! |
5fa6e6f to
6d7942e
Compare
|
I've got the basic functionality working now and can switch multiple displays using keyboard shortcuts. The main issue right now is when the same display currently actively streaming is requested to be switched to again (e.g. currently streaming display 0 and trying to switch to it via shortcut) then the stream will disconnect due to https://github.com/Kishi85/Sunshine/blob/6d7942e0c82aeedfca16b77735f0fc6cf716a4ec/src/platform/linux/portalgrab.cpp#L1294-L1309 which is causing the stream to disconnect with Problem here is that this is necessary to detect if the user stopped the stream via the XDG notification (according to @psyke83 notes in #4914 (comment)). Possible solutions that need more research:
|
6d7942e to
3a22dff
Compare
auto switch_display_event = mail::man->event<int>(mail::switch_display);
if (previous_width.load() == width && previous_height.load() == height && previous_pos_x.load() == pos_x && previous_pos_y.load() == pos_y && switch_display_event->peek()) {Next issue is when changing displays via shortcut too quickly there seems to be a possible race condition which can happen that freezes Sunshine hard until it is restarted and then it is exiting with: |
e097695 to
79773f8
Compare
Scratch that solution as I've had a typo (missing inversion) there and didn't realize that the event is already popped when the display gets reset by video.cpp and the portal re-initialzies. Back to square one on that.
Sunshine hanging when switching too quickly is still an issue. UPDATE: This could be related to chosen encoder, having tried vulkan and vaapi yields different results (hang vs. SEGV). UPDATE 2: More or less consistently reproducible when switching screens back and forth without waiting for the stream to update to the new screen. Does not seem to trigger if waiting for the stream to update to the new screen after switching plus a second or two more. |
36c8067 to
073e73d
Compare
|
The crashing (whether hang or segfault) may be due to the pipewire loop still being active during fake reinit. See this block here: Sunshine/src/platform/linux/portalgrab.cpp Lines 1363 to 1370 in b172a98 And here: Sunshine/src/platform/linux/portalgrab.cpp Lines 1393 to 1399 in b172a98 You may also need to set these same variables when you intend to signal an artificial reinit (i.e., when push_captured_image_cb returns false) to ensure the encoder fully stops. Regarding the race condition that causes a hang, one of the changes in #4875 resolves a hang after disconnect that happens when there are no screen updates happening. Since the on_process callback doesn't fire, the capture loop gets stuck. I worked around that by checking the pull_free_image_cb result during timeout, so you may want to see if you need to do a similar check elsewhere (or perhaps find a better fix). |
|
I'll stop force-pushing most things im doing right now and will be squashing the whole thing in the end after we've got all the kinks ironed out. A few things I've done and learned now:
With all these changes I seemingly have managed to get a stable reproducer for the crash (SEGV) by switching to the same display that is currently streaming. This is working on a single monitor setup as well, allowed by the implementation in video.cpp and should just fully reset the display (+pipewire stream) without segfaulting. The crash only occurs when switching to the same display. Switching to another display and then back to the first one does not seem to crash (as easily) unless you're switching very quickly but that might just be resulting in a switch to the same display internally in the end.
UPDATE: Testing this again and again leads to the same result: Segfault is only happening when trying to switch to the same screen that is already streaming even though the display reset/initialization looks the same in the log as when switching from one screen to another. It is properly reusing the session cache now and connecting to the same pipewire_node for each display. Could this be some quirk when repeatedly streaming the same node from pipewire? Switching to a different node and then back to the first one does not seem to trigger this issue. |
ed38479 to
12c20b7
Compare
|
After a lot of trial and error I've managed to cut the hangs when switching display down to only when very quickly, repeatedly switching (to the same? haven't tried others) display. You have to basically mash the shortcut to get a segfault. My theory for that is that when switching really quickly something in the capture logic combined with pipewire just cannot keep up (some thread sync issue?). This is still an issue but I'm wondering if it can be mitigated in video.cpp as it seems unreasonable to allow queuing another switch_display_event while the previous one's reset_display has not finished: Lines 1478 to 1484 in b172a98 All other issues I've had with this are basically ironed out now:
Things still todo before I'm likely to remove the draft status:
|
aa2ff1f to
b346dd1
Compare
|
Squashed, rebased and description+title updated. This is ready for review. There's still one known issue (that was there before this PR just masked, see description for details) but maybe someone reviewing this has an idea on how to fix or work around that. |
b346dd1 to
1165c03
Compare
Bundle ReportBundle size has no change ✅ |
1165c03 to
8edc0ba
Compare
|
Linter issues fixed as reported |
38ae04f to
62dd906
Compare
…ipewire streams as separate displays
Enables display switching on the client-side with Sunshine's existing
shortcuts (CTRL+ALT+Fxx) when selecting multiple screens on the
screencast source selection dialog (or automatically all availabe screens
when using a combined remotedesktop+screencast session, tested with KDE).
Each pipewire stream given by the portal will be available as
a separate display (consistently ordered by position).
62dd906 to
d3eb98a
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #4931 +/- ##
=========================================
Coverage ? 18.16%
=========================================
Files ? 108
Lines ? 23405
Branches ? 10330
=========================================
Hits ? 4252
Misses ? 14146
Partials ? 5007
Flags with carried forward coverage won't be shown. Click here to find out more.
|
|
Apologies, I won't have time to look at the code right now, but I did a quick test on Plasma 6.6.3 and found some issues.
When testing resolution changes, I used a simple script like the following for a single monitor: This doesn't account for multi-monitors, but it can still trigger the issue. |
I can reproduce this and know where it is breaking but without further looking into it this does not make much sense to me but maybe you have an idea. Due to the resolution change the display will be seen as a new screen which in term invalidates the session_cache and thereby restarts the portal (also fetching a new pipewire_fd and pipewire_streams vector): https://github.com/Kishi85/Sunshine/blob/d3eb98a2cc4ae095547d6b86447f3468e295397a/src/platform/linux/portalgrab.cpp#L1308-L1314 I've seen similar greenscreen behaviour when disabling the session_cache and trying to use dbus->connect_to_portal directly but couldn't make any sense of it at the time. Since I needed the session_cache anyway for detecting user ending via XDG notification this seems to have slipped during my testing.
This one could be related to the display_naming scheme being based on both position and resolution. Does the log note falling back to the first stream? |
UPDATE: Quick test shows that the pipewire stream does connect like it should do after session_cache invalidation/new cached portal session (so basically the same state as my previous problem when using dbus->connect_to_portal directly) but only gets to stream state paused and does not start streaming hence the green screen. Any quick ideas to why this would happen?
UPDATE: This one is a bit tricky as changing resolution is currently seen as removing the screen with its old resolution and adding a new one with the new resolution. This is detected by video.cpp and handled by picking the first display from the displaynames list for re-init: Lines 1241 to 1251 in d3eb98a Unless we find a way to generate persistent display_names here the display switch on resolution change might have to be categorised as expected behaviour. Persistently naming displays is hard when using portals as the streams are sorted by pipewire_node which is literally random so you have to do some sorting to get a consistent list. I'm using screen position in this PR but even position might change here on resolution change if you're on a stretched layout, so leaving the resolution out of the display_name won't help as much (unless its the display at 0,0 but that is picked as the first one anyway). Also since we're using the screencast portal we might not even have all available screens (as they user can select them in a screencast-only session) to map them to real displays and do the display naming/matching based on whatever else which always would be a lot more complicated. |
Is the PR reinitializing to an existing stream? If the pipewire state is set to paused and reconnected with an active stream, ensure_stream will be a no-op. Perhaps you need to add an else clause to the end that resumes the stream via setting
The description for |
Unfortunately that won't help as the display gets full reset so we're working with a newly initializing portal_display when the stream state gets stuck in paused. There's also no logging for pipewire parameters negotiated (like Video Format and so on) anymore for any capture re-init after the session_cache gets invalidated for the new display.
I'll have a look at this but the descriptors might be initialised in a random order that would shuffle displays around again. I'm also curious what will happen when a screen gets added or removed. It might be consistent and persistent per portal token but might just be that and also not guaranteed to stay that way on resetting the token. Using position to sort the stream list was the most logical choice so far. |
Could the resolution change be handled without a capture re-init? Because that's what is causing the display switch in the first place. |
KWIN isn't providing this id value at all in it's Session.Start response (as seen from simple capture with bustle): It's just providing I'll continue trying to figure out the pipewire stream paused issue over the weekend when I have time for it and post further updates on my findings. |
pipewire_fd is set by dbus_t::open_pipewire_remote and therefore needs to be closed on the destruction of associated dbus_t because it a sessions pipewire_streams are only available on that fd. Not closing the fd with the Portal.Session can have sideeffects (like pipewire streamstate getting stuck in paused) if dbus_t is called multiple times concurrently.
|
Looks like I've found the root of the greenscreen problem with the session invalidation by sheer luck and it's because the pipewire_fd is closed at the wrong point (in session_cache_t instead of dbus_t where it's essentially created by calling open_pipewire_remote). This was not a problem before but since this PR is using a separate dbus_t Portal for the display names enumeration the pipewire_fd's from that enumeration call never get closed and basically block the pipewire_fd that gets created after session cache invalidation. TODO: Change the Portal call for display enumeration so that it does not use open_pipewire_remote (which is totally unnecessary for it as we just need the streamlist from Session.Start). |
This change by itself does not seem to be catching the full issue. Additionally invalidating the cache on every portal_display init so new streams are directly available does fix it however (f01ed84). Doing it this way does not change behaviour compared to current master as there the session cache gets invalidated on every pipewire::cleanup_stream resulting in recreating the session on every portal_display init as well. I'm still curious as to why the issue is occurring if we just invalidate on demand when a display cannot be matched to a stream. EDIT: Since video.cpp is already ensuring that we only have one display active at the same time (hence never having multiple tray icons and it even destroys the current display for reenumeration of display_names) I'm wondering if we need the session_cache for dbus_t at all. It's currently also tracking if maxFramerate failed and the portal session has been closed by the user and for that it will still be necessary.
I've tried this by adding a parameter to connect_portal but doing it that way breaks mouse input? I wonder why this is happening. No issue leaving it the way it is now though as the pipewire_fd will just get closed unused. |
f01ed84 to
8528881
Compare
|
Checking if the session cache and display names from a new portal match and doing the invalidation should they not during the display_names enumeration leads to a working portal_display init with a working cache that is only invalidated when the screen configuration changes. |
…iffer from dbus during display names retrieval This is a follow up to fix capture issues (black- or greenscreen) when changing resolution due to session cache invalidation during portal display init. Doing this will get the pipewire stream stuck in paused state. Pipewire stream is working properly when session cache check and invalidation are done during portal_display_names while discovering available displays.
8528881 to
087725d
Compare
pipewire_screenstream_t -> pipewire_streaminfo_t as it is only containing the metadata necessary to identify the stream streams_to_display_names -> portal_streams_to_display_names so it can be easily identified as portal related
|
50e5291 to
eeef927
Compare
…hat as display_name If correlation fails, fall back to position/resolution matching for a specific stream. To distinguish display_names between modes prefix with 'n' for xdg_name and 'p' for position/resolution
eeef927 to
edce2dc
Compare




Description
This PR adds proper multi-monitor support to Sunshine's XDG portal grab by exposing all streams of a capture portal as separate screens (utilizing the multiple capture mode of the screencast portal).
Things this PR does:
Known issues:
Can segfault when trying to switch displays really quickly (by mashing the screen switch shortcut multiple times within a second) and even then it's not fully reproducible. This would have been happening even before this PR when just having a single stream but was masked due to the shortcut handling not being properly implemented in the capture logic (missing return on !push_captured_image_cb).This is a separate issue concerning other capturemethods as well (tested with KMS), see Coredump with SIGSEGV when switching displays too quickly #4943Screenshot
Issues Fixed or Closed
Roadmap Issues
Type of Change
Checklist
AI Usage