Jan Grulich

WebRTC: journey to make wayland screen sharing enabled by default

While we have pretty good support for screen sharing on Wayland in WebRTC, which is included in browsers like Chromium or Firefox, it is still not enabled by default in Chromium and it is kept behind a flag. Not only you have to remember to always enable it for new configurations, but for many users it is not even something they are aware of. This has been my main focus recently and I would like to share with you steps that has been done and what are the plans for the future.

What are the changes to expect in Chromium soon?

DMA-BUF improvements/fixes:

Last year I landed proper DMA-BUF support in WebRTC, which made things way faster. It was working, but it was not perfect and there were some corner cases where it might not be working at all. Here are changes I made recently:

  • Advertise DMA-BUF support when it is really supported. Older versions of PipeWire don’t handle the new way of DMA-BUF negotiation and therefore it shouldn’t be used in such cases. Also using DMA-BUF modifiers requires some recent versions of PipeWire on both sides.
  • Implemented stream renegotiation. In situation when we fail to import a DMA-BUF with given modifier, we will drop this modifier and try to renegotiate stream parameters and go with a different modifier or fallback to shared memory buffers in case we fail completely.
  • Make sure to import DMA-BUF with correct render node. In case of multi-gpu setups, we always picked the first render node to import DMA-BUFs, but it can happen that they were actually produced by a different render node and for that reason we might fail to import them. We now try to get default EGLDisplay, which should be the same one used by the wayland compositor and we should be using same render node.

Better mouse cursor support:

Until now we had mouse cursor as part of the screen content. This means that everytime you moved with your mouse cursor, we had to update whole image and that is very inefficient. The API in WebRTC allows you to implement MouseCursorMonitor which can be used to track mouse changes only and each platform can have both MouseCursorMonitor and DesktopCapturer implementations combined in DesktopAndCursorComposer to get complete image and this all works automatically like a magic. Unlike X11 implementation, our only option is to get everything from one PipeWire stream we connect to and there was no way how to make it shared from DesktopCapturer implementation so it can be used by MouseCursorMonitor implementation. I had to split DesktopCapturer to have xdg-desktop-portal and PipeWire separate implementations. Code for PipeWire is now a SharedScreenCastStream class which is being shared through DesktopCaptureOptions. This is set of parameters associated with each capturer instance and luckily this is also passed to MouseCursorMonitor so we can have access to already initialized PipeWire stream and get the cursor data from there. Implement MouseCursorMonitor with SharedScreenCastStream was then piece of cake.

List of merge requests:

This should again significantly improve performance of screen sharing, because moving with a mouse over a static screen content doesn’t need full screen content update.

Misc:

Last but not least, I’m now in touch with Google developers who help me to review all my changes and discuss with me the current state, issues I have, etc. on monthly meetings we have. The plan is to make this finally enabled by default, hopefully in the first half of this year. There are still some things that need to be solved before this is enabled and there is lot of work ahead, but things look promising.

Plans for the future:

  • Implement stream restoration
    • this will allow us to skip the second portal dialog and I already have plan in my head how to do this in WebRTC. This is currently only supported by xdg-desktop-portal-gnome and xdg-desktop-portal-kde lacks this functionality.
  • Improve UX of the Chromium screen sharing dialog
  • Write tests for all PipeWire/portal code in WebRTC

Even though WebRTC is used in Firefox, I mostly talk about Chromium, because Firefox doesn’t use most recent WebRTC and will need to pick all the changes I did or rebase to newer WebRTC in order to have them. Firefox also has PipeWire/Wayland screen sharing enabled by default and doesn’t have UX issues as there is no internal screen sharing dialog like in Chromium.

I hope all these changes will make your experience better and next time when you read a new blog post I will be informing you about end of this journey.

How to use libportal/libportal-qt

There was a blog post from Peter Hutterer about Flatpak portals posted few months back. Peter explained what are portals and how do they work. Portals are used mostly because of security and sandbox/Wayland restrictions. Many times your only way to get access outside (opening a file, sending a notification, sharing a screen, etc.) is to use a portal. For most use-cases applications or developers don’t need to care about them as their support is usually implemented in libraries they use. For example Qt and GTK use portals internally so apps can use still the same APIs as before and they don’t need to worry about their apps not working in sandboxed environments. BUT there are still scenarios where libraries have unsufficient or none portal support, or a different options are desired so what are the options in this case if you still need to use portals?

  1. Do everything yourself, which means you will implement all the DBus calls and handling yourself.
  2. Use a library. Most logic choice would be libportal, but there is also a project called ASHPD for Rust users.

What is libportal and libportal-qt?

The libportal library provides GIO-style async APIs for Flatpak portals. It hides all the DBus complexity users would face in case of using portals directly and provides a user-friendly library instead. You might think that the libportal-qt is the same thing, just with Qt-style APIs, but the idea behind it is that each toolkit (Gtk3, Gtk4, Qt5, Qt6) has a different way to get a window handle which is needed to associate portal dialogs with the app that invoked them. So libportal-qt just provides a way to get a XdpParent object from a QWindow. As a C++/Qt developer I don’t mind using C/Glib APIs and I used it many times, but there is still one speciality I fail to use everytime, my friend GVariant. Some of the portal APIs in libportal expects a GVariant for all the complex structures, for example to specify a filter option for OpenFile() call from the fillechooser portal, you have to build a very complex GVariant based on the DBus specification.

Remember I told you libportal-qt doesn’t offer Qt-style APIs? This is not necessarily true, because I implemented all the complex structures you will have to pass in most of the portals and implemented functions that will return them as GVariants so you don’t need to get in touch with GVariants at all.

How to use libportal-qt?

First of all, all libportal flavours have pkgconfig file installed so it’s easy to use them from any build system and you just need to search for libportal-qt5 (we don’t have -qt6 version yet).

And how does the code look like? For example let’s say you want to open an image:

// Creates a filter rule, this can be a Mimetype or Pattern.
XdpQt::FileChooserFilterRule rule;
rule.type = XdpQt::FileChooserFilterRuleType::Mimetype;
rule.rule = QStringLiteral("image/jpeg");

// Create a filter with our rules, we will then pass it to OpenFile() call as GVariant.
XdpQt::FileChooserFilter filter;
filter.label = QStringLiteral("Images");
filter.rules << rule;

// Create a GVariant from our filter. This will result into variant in form of:
// "[('Images', [(1, 'image/jpeg')])]"
g_autoptr(GVariant) filterVariant = XdpQt::filechooserFiltersToGVariant({filter});

// Get XdpParent to associate this call (portal dialog) with our window.
XdpParent *parent = xdp_parent_new_qt(m_mainWindow->windowHandle());

// Finally open a file. XdpQt::globalPortalObject() is another convenient function 
// that creates a global instance of XdpPortal object so you don't need to take care
// of creating it yourself. For some of the arguments we just pass a nullptr to don't 
// specify them.
xdp_portal_open_file(XdpQt::globalPortalObject() /*XdpPortal object*/,
                                  parent /*XdpParent object*/, "Title", filterVariant /*filters*/,
                                  nullptr /*current_filter*/, nullptr /*choices*/, 
                                  XDP_OPEN_FILE_FLAG_NONE /*flags*/, nullptr /*cancellable*/, 
                                  openedFile /*callback*/, this /*data*/);
xdp_parent_free(parent);

// Then the callback would look like this, eg.
static void openedFile(GObject *object, GAsyncResult *result, gpointer data) {
    g_autoptr(GError) error;
    g_autoptr(GVariant) ret = 
        xdp_portal_open_file_finish(XdpQt::globalPortalObject(), result, &error);

    if (ret) {
        // Another convenient function that will get you uris and choices from 
        // GVariant returned by xdp_portal_open_file() call.
        XdpQt::FileChooserResult result = filechooserResultFromGVariant(ret);
        
        // Do whatever you want to do with the result. Here we just print opened selected files.
        qDebug() << result.uris;
    }
}

As you can see, no GVariant got hurt and you can easily open a file without any GVariant knowledge. Besides FileChooser portal helpers, we also have Notification portal helpers, because serializing icons and buttons is also something that is not trivial. For the rest of the portals you either don’t need to use complex GVariants so you can use them easily without helper functions same way as shown above, or some portals like ScreenCast or RemoteDesktop are not used that often and we don’t have helper functions for those just yet.

I hope you can find this helpful in case you want to join this world. The libportal project is hosted on GitHub in case you want to try it just now, because this is still not part of any stable release (will be in libportal 0.6), or report a bug or just look at my GVariant helpers to see what I spare you of.

DMA-BUF support in WebRTC

It will be almost three years since we landed initial support for screensharing on Wayland with the use of PipeWire in the WebRTC project. This enabled screensharing support in both major Linux browsers. Last year I implemented support for window sharing, added support for PipeWire 0.3 and added support for DMA-BUF and MemFD buffer types. Problem was, as it turned out, the DMA-BUF support was not implemented in a correct way.

The original implementation was using mmap() to get the buffer content. This worked correctly for current Intel GPUs, but was terrifically slow on e.g. AMD GPUs. Proper solution is to use OpenGL context to get the content from buffer. However, there were many implementations using mmap() already, including WebRTC and we needed a way how to properly communicate between the server and the client that when the client advertises DMA-BUF support, it means it doesn’t use mmap() and goes through OpenGL context instead.

Here are some issues if you want to read about the details:

This all resulted into a completely different way how the communication between the consumer and the producer should happen in order to use DMA buffers for way faster and smoother screensharing support. Both sides are now required to query the list of all supported modifiers and add this as a new stream parameter, including flags that the modifiers are mandatory parameter, rest of stream parameters are kept as before so we can keep using other types in case DMA-BUFs are not supported by the producer. Once both sides matches their expectations, we can query whether the stream includes modifiers, based on that we know we can use DMA buffers, which we now properly open using OpenGL context, while we kept mmap() for MemFd buffer types as fallback. This will result into faster screensharing support in your web browsers.

Last but not least, I made screensharing even faster, regardless of buffer type we use. Originally when we received buffer from PipeWire, we copied it to a local variable so we can apply cropping and adjust the position and only after that we copied this adjusted content into a DesktopFrame, which each DesktopCapturer (a class representing screensharing implementation) is supposed to return and let it be displayed by the browser. That means we performed two copy operations for each frame. I improved this implementation and now we copy the PW buffer content directly to a desktop frame which we can return directly so one copy operation less than before. I didn’t do some exact measures, but simply running htop and comparing usage of top 5 processes when sharing a 4k screen I got:

  • Original result: 66%, 64%, 26% 23%, 10%
  • Updated result: 41%, 39%, 19%, 17%, 12%

I also have some other improvements on my TODO list, all of them should bring some additional optimizations and improvements. I will keep you informed once I have news to share with you.

Both changes have been merged into WebRTC, that means it should be in Chrome/Chromium 96 (released during November 2021).

WebRTC/Chromium updates in 2020

In 2019, I started with my first contribution to WebRTC. This was all about screen sharing support on Linux Wayland sessions, using xdg-desktop-portal and PipeWire. Back then, it was quite simple, we only had PipeWire 0.2 and all portal backends supported only screen sharing (no window sharing). While this was relatively easy, it was not ideal as each screen sharing request involved two portal dialogs to get the screen content on the web page itself. For me it was a big success, because I made quite a significant contribution to such a big project, which is used by many people, and a project which is used by all modern web browsers.

At the beginning of 2020, the year everyone would like to erase from their memories, we got PipeWire 0.3 (with slightly different API) and later with xdg-desktop-portal-gtk and xdg-desktop-portal-kde (later this year) people were finally able to share application windows. Support for all of this was lacking in WebRTC, because back then those were not available. I wanted to tackle all issues at once, bring support for window sharing and get rid of the “dialog hell” with portals, which was even worse with the new window sharing capabilities in portal backends.

This is what the situation looks like. With each request to share a screen, you got the preview dialog from Chromium. This dialog consists from three pages. One is for screen sharing making one portal request, second one is for window sharing, which is another portal request, and the last one is just to allow you to share a web page you have opened. You had to confirm both portal dialogs, then confirm the Chromium dialog and finally you got one more portal dialog (ouch) to get the screen content on the web page itself.

I had a solution. I made all portal calls identified with an ID and shared this ID (portal call) in Chromium between both pages in the Chromium preview dialog and with the request made for the web page itself. With this solution we only had ONE portal dialog. This was a perfect solution (at least seemed to be). I started working on this at the beginning of this year, we exchanged many emails with people from Chromium UX team, because I wanted to do also some minor UI changes in the preview dialog. Unfortunately, those were rejected for consistency with all platforms. It was not a big deal and I submitted my changes for review, keeping UI as it was, just adding all necessary bits into Chromium and WebRTC to make it all work.

I wish to say things went smoothly since then, but the opposite is true. It took a while to get everything reviewed, but this is probably no surprise with this year being weird and many people working from home with less than ideal conditions. Anyway, few months passed away, I ended up rewriting my changes many times, not even counting hours I spent on it. This all resulted into me being obsessed with this change, it mattered to me so much to get it merged. I was constantly thinking about how to make it better, I was many times fixing issues in the evening (as reviewers were mostly US based), instead spending time with my family. It would be even better to waste my time with my beloved Playstation. This had really negative impact on my mental health and I realized this has to stop and I simply gave up, because I couldn’t continue this way and needed a break. I abandoned both changes (WebRTC and Chromium) and decided to just pick changes I will be able to successfuly upstream. I probably made my change too ambitious and complicated or maybe it’s just Chromium not being ready for this kind of change, because some tweaks were specific for my use-case. It’s also hard to say I wish upstream devs had helped me more, because there is so much to understand around Wayland, portals and PipeWire and way how it all works together.

Anyway, with a new start, without pressure after gaving up on the change, I picked the most important changes and submitted them separately. I was surprised now how smoothly this went and how fast those changes were upstreamed. Simply those changes were simple, understandable and easy to review. I didn’t gave up on fixing the “dialog hell” completely, I have some other ideas, but next time I will try to submit them step by step and will keep some distance and my free time.

And what are the changes you can expect in upcoming Chromium release in 2021?

Support for PipeWire 0.3

You can now build Chromium/WebRTC with both PipeWire 0.2 and Pipewire 0.3. There is a new “rtc_pipewire_version” option you can pass to your build configs.

Window sharing support

There is probably no description needed. You will be able to share application windows in case you don’t want to share whole screen.

Suppport for DmaBuf and MemFd buffer types

This should allow faster transfer of your screen content from your Wayland compositor, through PipeWire to your browser.

Less portal dialogs involved

If you look back into the screenshot I posted above, you can see there are two portal dialogs opened just for the Chromium preview dialog. I at least tried to reduce this to just one portal dialog. This was done by removing the page for window sharing, because the screen share request will already handle both screen and windows.

I think you can expect above mentioned changes in Chromium 89 and I hope you will at least appreciate some of these improvements even though I didn’t deliver everything I wanted to. Also, thanks to Martin Stránský from our Firefox team, you can expect all these changes to be also part of Firefox.

Happy holidays and see you in a better year.

Tutorial: Screen Sharing and Remote Desktop on Fedora Workstation 30 (Wayland)

I recently got an email from a user asking me how to make all this work on Fedora. Problem is that unlike in old XServer sessions, there are certain things which need to be enabled first. There are also dependencies which need to be installed and services which need to be running. While most of the dependencies are automatically installed and services automatically activated, there still might be situations when this is not true, for example when switching from another desktop so it’s better to cover it all. This tutorial targets Fedora, but it can be probably used by any other distribution.

Dependencies

Both screen sharing and remote desktop work almost identically on Wayland, they both use portals as a communication tool between applications and compositor (in this case Mutter) to start the process of sharing and setup PipeWire stream (see below). While portals were primarily meant to be used by sandboxed applications (e.g. Flatpak) to get access to system (like files or printing) outside the sandbox, their design perfectly fits for Wayland usage too. In Fedora you should have portals automatically installed, they are represented by two separate packages, first is xdg-desktop-portal, which is the portal service communicating with sandboxed applications and with a backend implementation of portals, and the second package is the backend implementation, in our case xdg-desktop-portal-gtk. Both are DBus activatable, which means they are automatically started whenever application calls them. The reason why portals consist from two services is that there can be multiple backends, each one providing native dialogs for your desktop. For example you don’t get a gtk dialog to open a file in KDE Plasma session or you want a backend communicating with specific compositor (like in our case with Mutter).

The second important dependency is PipeWire. PipeWire is the core technology used for screen content delivery from the compositor to applications. This is done throught a PipeWire stream shared between the compositor and application. PipeWire should be automatically installed on your system, the package name is pipewire and it provides socket-based activation so you shouldn’t need to worry if it’s running or not.

Enabling screen sharing and remote desktop in Gnome

You don’t seem to do any additional step in order to make screen sharing work. However, you need to enable remote desktop (if you want to). Go and open gnome-control-center (Settings) and there go to Sharing section. There you should see this window when you click on Screen Sharing:

If you don’t have such option, make sure you have installed gnome-remote-desktop, because I’m not sure whether it’s installed by default. Allowing screen sharing will start a server instance which you can connect to from another computer, using vnc://linux.local or your_computer_ip:5900. You will most likely need to open a hole to your firewall, but the same you need to do for any other VNC server. I tried to connect with Vinagre and Krdc (KDE VNC viewer) and both worked for me, but I was unable to connect with Tigervnc (vncviewer) due to not maching security type.

Screen sharing in Firefox

Firefox in Fedora already comes with PipeWire support enabled and you don’t need to do anything special. You can test it for example with this testing page. The PipeWire support in Firefox unfortunately needs to be enabled during build by handmade changes, which is most likely not happening in other distributions, but from what I now there is an ongoing effort to make this configurable with a build option.

Screen sharing in Chromium/Chrome

Similar situation is with Chromium, where PipeWire needs to be also enabled during build, but it’s already configurable via a build option. In Fedora we have this enabled by default. Official Chrome builds are build with PipeWire support enabled as well. I should maybe mention that PipeWire support is in Chromium starting with version 74.

Unlike with Firefox, this support needs to be also enabled in runtime. You can do that with following chrome flag:

chrome://flags/#enable-webrtc-pipewire-capturer

Then you should be all set to be able to share a screen or a window from your Chromium or Chrome.

Issues

Don’t get scared by higher number of dialogs for screen/windows selection you will get when sharing screen in your browser. We are aware of this annoying usability issue and hopefully we will manage to solve it one day. The reason why this happens is that every browser provide their dialog for screen/window selection and in both browsers these dialogs show previews of your selection. You need to select screen/window first in the portal dialog for the preview dialog and once you accept the preview dialog in your browser, you again need to select screen/window in the portal dialog to get the content into the web page itself.

Support in other applications

There is a KDE application called Krfb, which in the next release (19.08) will have similar support for remote desktop on Wayland as you have in Gnome. Otherwise there are probably no other applications which would allow you to share a screen or control your desktop remotely on Gnome Wayland sessions. You will not be able to use TeamViewer, Tigervnc or any other application you are used to use. If you want to use these applications, you will have to switch to X session for now.

How to enable and use screen sharing on Wayland

Two days ago I wrote about our work on screen sharing in web browsers. While there was a lot of work done recently on this area, it’s not still in the state where everything would just work out of the box. There are few steps necessary to make this work for you and here is a brief summary what you need. This is not a distro specific how to, but given I use Fedora 28 and I know that everything you need is there, it’s most likely you will need to figure out the differences for your distribution or build it yourself.

PipeWire

PipeWire is the core technology used behind all of this. In Fedora you just need to install it, it’s available for Fedora 27 and newer. Once PipeWire is installed, you can just start it using “pipewire” command. If you want to see what’s going on, you can use “PIPEWIRE_DEBUG=4 pipewire”  to start PipeWire with debug information. For Fedora 29, there is a feature planned for PipeWire which should make it to start automatically.

Xdg-desktop-portal and xdg-desktop-portal-[kde,gtk]

We use xdg-desktop-portal (+ backend implementation) for communication between the app requesting to share a screen and between desktop (Plasma or Gnome). You need xdg-desktop-portal, which is the middle man between the app and backend implementation, compiled with screencast portal. This portal will be build automatically when PipeWire is present during the build. In Fedora you should be already covered when you install it. For backend implementation, if you are using Plasma, you need xdg-desktop-portal-kde from Plasma 5.13.x, again compiled with screencast portal, which is build when PipeWire is present. For Fedora 28+, you can use this COPR repository and you are ready to go. I highly recommend using Plasma 5.13.2, where I have some minor fixes and if you have a chance, try to compile upcoming 5.13.3 version from git (Plasma/5.13 branch), as I rewrote how we connect to PipeWire. Previously our portal implementation worked only when PipeWire was started first, now it shouldn’t matter. If you use Gnome, you can just install xdg-desktop-portal-gtk from Fedora repository or build it yourself. You again need to build screencast portal.

Enabling screen sharing in your desktop

Both Plasma and Gnome need some adjustments to enable screen sharing, as in both cases it’s an experimental feature. For Gnome you can follow this guide, just enable screen-cast feature using gsettings. For Plasma, you need to get KWin from Plasma 5.13.x, which is available for Fedora in the COPR repository mentioned above. Then you need to set and export KWIN_REMOTE=1 env variable before KWin starts. There is also one more thing needed for Gnome at this moment, you need to backport this patch to Mutter, otherwise it won’t be able to match PipeWire stream configuration with the app using different framerate, e.g. when using Firefox.

Edit: It seems that exporting KWIN_REMOTE=1 is not necessary, it probably was only during the time when this feature was not merged yet. Now it should work without it. You still need KWin from Plasma 5.13.

Start with screen sharing

Now you should be all set and ready to share a screen on Gnome/Plasma Wayland session. You can now try Firefox for Fedora 28 or Rawhide from this COPR repository. For Firefox there is a WebRTC test page, where you can test this screen share functionality. Another option is to use my  test application for Flatpak portals or use gnome-remote-desktop app.

Edit: I didn’t realize that not everyone knows about xdg-desktop-portal or PipeWire, below are some links where you can get an idea what is everything about. I should also mention that while xdg-desktop-portals is primarily designed for flatpak, its usage has been expanded over time as it perfectly makes sense to use it for e.g. Wayland, where like in sandbox, where apps don’t have access to your system, on Wayland apps don’t know about other apps or windows and communication can by done only through compositor.

 

Screen sharing in Plasma wayland session

One of the important missing features in Plasma wayland session is without a doubt possibility to share your screen or record you screen. To support this you need help of the compositor and somehow deliver all needed information to the client (application), in ideal way something what can be used by all DEs, such as Gnome. Luckily, this has been one of the primary goals of Pipewire, together with support for Flatpak. If you haven’t heard about Pipewire, it’s a new project that wants to improve audio and video handling in Linux, supporting all the usecases handled by PulseAudio and providing same level of handling for video input and output. With Pipewire supporting this, there was recently a new API added to xdg-desktop-portal for screen cast support and also for remote desktop. Using this API, applications can now have access to your screen content on Wayland sessions or in case they are running in sandbox. With various backend implementation, like xdg-desktop-portal-kde or xdg-desktop-portal-gtk, they just need to support one API to target all desktops. Screen cast portal works the way, that the client first needs to create a session between him and xdp (xdg-desktop-portal) backend implementation, user then gets a dialog with a screen he would like to share and starts screen sharing. Once he does that, xdp backend implementation creates a Pipewire stream, sends back response to the client with stream id and then client can connect to that stream and get its content. Once he no longer requests content of the selected stream, xdp backend implementation gets information that nobody is longer connected to the created Pipewire stream and can stop sharing screen information and xdp backend implementation is again ready to accept next requests for screen sharing. This is all happening in the background so there is really no cool picture I can show, at least this dialog which you get when you request to share a screen.

I finished support for screen cast portal in xdg-desktop-portal-kde last week and currently waiting for it to pass review and be merged to master. This is also currently blocked by two not merged reviews, one adding support for sending GBM buffers from KWin and one with new Remote Access Manager interface in KWayland, both authored by Oleg Chernovskiy, for which I’m really greatful. This all will hopefully land soon enough for Plasma 5.13. Testing this is currently a bit complicated as you need everything compiled yourself and besides my testing application there is really no app using this, except maybe Gnome remote desktop, but there should be support in future for this in Krfb, Chrome or in Firefox. Hopefully soon enough.

Last thing I would like to mention is for GSoC students. We also need remote desktop portal support to have full remote desktop experience so I decided to propose this as a GSoC idea so students can choose this interesting stuff as their GSoC work.

Scroll To Top