Jan Grulich

PipeWire camera support in Firefox

New year, new challenges.

We finally reached a major milestone with Chromium 110, which was a release where we finally got screen sharing enabled by default on Wayland, and since then you no longer have to go into the preferences and enable the flag you need. That doesn’t mean my work there is over, but I’ve shifted my focus to something related but slightly different and that is PipeWire camera support.

Work on PipeWire camera support started in 2021 and was done by Michael Olbrich (Pengutronix). He submitted a huge change to Chromium to add this support and had trouble finding a reviewer because there was actually no one who knew anything about PipeWire in the Chromium project. I actually saw his change request by accident, but we got in touch and decided to move this to WebRTC instead, because having it lower in the stack means we would get it automatically in other browsers, like Firefox. Michael attended a meeting we used to have regularly for screen sharing support in WebRTC and we discussed how to implement PipeWire camera support in WebRTC instead and how to reuse some of the code we already had for screen sharing to avoid code duplication. After a few submitted and reverted reviews (usually when things break Chromium parts that are not covered by CI, happened to me many times), we ended up with PipeWire camera support in WebRTC (talking about the beginning of this year).

Journey to PipeWire camera support in Firefox

Up to recently, my work has mostly been 95% WebRTC and 5% Chromium, but I have not been familiar with Firefox at all (not counting WebRTC backports). I actually started fixing screen sharing support in there first before moving to camera, because I noticed a few issues after Firefox (finally) did a WebRTC rebase to some of the newer versions. They’ve actually started doing monthly WebRTC rebases, which is really a good thing and I’m glad to see that happening. Anyway, even though Firefox has more recent WebRTC these days, when I started in February, there was still no PipeWire camera support at all because WebRTC was still a few months behind, so I had to backport all the patches and make them work with Firefox. Only then I could finally start working on the actual PipeWire camera support from WebRTC. Working on the backports, I was still working in the WebRTC space, so everything was somewhat familiar. Implementing the actual PipeWire support was a different story and took me some extra time to understand how everything works. This includes camera API on the WebRTC side, camera support on the Firefox side, and I also had to learn all the APIs specifically used in Firefox, but admittedly, learning about new things is fun too. After some tries and errors it started to work and I was able to share my camera using the PipeWire camera backend from WebRTC. You have to trust me that the picture below is not using the V4L2 backend.

I went ahead and submitted my WebRTC backports and the PipeWire camera backend implementation for review to Firefox. Unfortunately, I was told that the code where I placed my implementation could also be used by the WebRTC Javascript API, which is used by bots to check for camera presence on the client side, which I didn’t know as someone who just recently started working on camera support. This was a problem because we get PipeWire access through xdg-desktop-portal and this involves showing a dialog to the user asking for camera access. Showing a camera request dialog randomly to the user would not be a good experience. Going back to the drawing board, I talked to Andreas Pehrson (Mozilla/WebRTC). Andreas was a great helper and we came up with a solution on how to implement it properly in Firefox and avoid things like I’ve mentioned before. This time it involved some re-org changes in WebRTC, where I split the xdg-desktop-portal and PipeWire implementations for PipeWire video capture, so we can request camera access in Firefox only when appropriate and only do the PipeWire stuff in the backend assuming the access was granted. So I did implement it again, this time according to what we agreed on with Andreas and it worked.

This is now submitted for review again and hopefully this time it will only need some minor fixes and not a complete rewrite like before, and you will be able to try/use it sooner rather than later. The main change is submitted here, but it is accompanied by other changes with WebRTC backports or changes that make the backports buildable with Firefox. With the first version of the change, I had a Fedora COPR repository, but I had to discontinue it, because it was too hard to maintain it in a buildable state on top of a stable Firefox. But you can be sure that Fedora will be the very first consumer of these changes once they are merged.

Why do we need this?

For many reasons. I would recommend you to read a blog post from Christian Schaller, where everything is explained into the details and gives you more information about the camera stack. Main reasons are:

  • Security
    • Access to the camera must be granted by the user, so you can be sure that no one is using your camera behind your back.
  • Flexibility
    • Your camera can be accessed by multiple clients simultaneously.
  • Libcamera support
    • Needed for ARM devices or devices using ChromeOS

Chromium support

While support in WebRTC has been done already a few months ago, Chromium originally didn’t use WebRTC video capture API for camera support and for that reason it had to be added. Michael implemented it and it is still currently pending on review so currently both Chromium and Firefox are both implemented, but waiting for approval.

Future plans

Most importantly, I want to get everything merged and working seamlessly, but I’m already aware of some issues and missing functionality in the PipeWire backend in WebRTC. And we also have the same problem we used to have with screen sharing, which is that it’s not enabled by default, unit tested, and feature complete and these things take time fix.

WebRTC (Chromium): Year end report

Although Wayland screen sharing is still not yet enabled by default in Chromium, which is what I hoped to achieve this year, I think I can say we are almost there and you can expect it sooner than later. Let’s summarize what we have accomplished this year to make this change happen:

Stream restoration support

You probably remember that you had to go through two portal (xdg-desktop-portal) dialogs all the time you wanted to share your screen. We had first portal dialog to have your selected screen visible in Chromium preview dialog and yet another portal dialog to make your screen shared with the web page itself. This was quite annoying as users had to make the same selection twice. Thanks to a new addition into portal API I was able to implement stream restoration support in WebRTC to bypass the second portal dialog and have your selection instantly shared with the web page itself once you confirm it in the Chromium dialog. This was released in Chromium 105.

Tests for PipeWire (streaming) code

This is not a feature that is visible to users, but it makes an important part of the whole process. It was a long effort to make this happen as it’s something that is not trivial to test and needs some dependencies for the tests itself to run. As a first step we had to bring PipeWire and some of its dependencies into the infrastructure. As easy as it sounds, in order to add a new dependency there you have to add it in form of a CIPD (Chrome Infrastructure Package Deployment) package. This means you write recipes with information how to build and get your package. This all on a CentOS 7 based distribution where you have to work with older libraries or missing libraries and tools (e.g. Meson). This makes your packages later available in third_party directory that is available to both Chromium and WebRTC. Next step was to write the tests itself. The only way how I could test our PipeWire code, code that is all about receiving frames over PipeWire stream, was to write another “testing” stream that will be sending us frames with parameters where we will know what to expect and can verify we received what we were supposed to receive. For the tests itself I used GTest framework which is very well documented and quite comprehensive. Sadly, we were still just in the middle of the process and one would hope that it cannot get more complicated, but the opposite is true. There is a whole runtime setup we need for our tests to run. We need PipeWire and PipeWire session manager to run, otherwise we would never activate and connect our streams. To create the setup I had to write a Python wrapper script that sets all environment variables for PipeWire to find everything in the third_party directory (plugins, libspa, etc.) and run both PipeWire and PipeWire session manager in order to run our test. As a last step, because we had a script that runs the test, we had to create a mapping in the infrastructure making the script itself a launcher of our test + limit this only to x86_64 architecture as that’s the only one where we have PipeWire available as CIPD package.

Here are upstream changes that implement all above mentioned:

UX improvements in Chromium preview dialog

All these improvements were made by Alexander Cooper, Alex is Google engineer working on screen sharing stack in Chromium and WebRTC. I have been intensively working with Alex for the past year and he is my go to person when it comes to code reviews or anything related to screen sharing in Chromium. Alex made a great set of UX improvements in their preview dialog. My original implementation always automatically invoked portal dialog when screen sharing was initiated. This led into one dialog overlaping the other. In most recent Chromium version (I think starting with 107) users will be presented with Chromium preview dialog first asking to share a web tab and only once they pick to share a screen they will get presented with the portal dialog. You can also now re-request the portal dialog in case you pick a wrong screen to share.

Web Engines Hackfest in A Coruña

I travelled to A Coruña in May to attend the Web Engines hackfest and meet Alex Cooper from Google there. This was a perfect opportunity for us to meet in person and I’m really grateful for that. Even though that for me it was a bit stepping outside of my comfort zone as I went somewhere where I didn’t know anyone (besides Alex), but as I later found out, all the people there were super friendly and I’m happy that I met some new faces. We had very productive conversations with Alex and having enough time to talk in person was beneficial for both sides as we could explain to each other technical details of our backgrounds and talk about future plans and current issues.

Firefox and WebRTC rebase

Firefox upstream has been behind with our WebRTC changes for more than ~2 years. This has changed recently with Firefox 106, where they finally rebased their WebRTC version to M103 (Chromium 103). I also provided them list of additional backports they should pick up in order to have fixes for some crashes and issues we have fixed since then. Sadly, as I later found out, even though they did the rebase and all the backports, the new codebase (for screen sharing) is not in use and instead they still keep the old code and use it instead. This is because they don’t have all the dependencies in place and it’s been blocked on this issue since then, hopefully it will get figured out soon and also Firefox users can benefit from all the improvements we did over the past two years. This is not an issue for Fedora users where I created a patch for our Firefox package that enables the new code and use it instead.

Plans for the future

  • Extend PipeWire code test coverage. Currently we test only the essential part of the code, but I would like to further extend the tests with tests for all kinds of metadata we can use (damage regions, crops, mouse cursor etc.)
  • Use portal dialog as the default one. This has some requirements that need to be fulfilled before Chromium can rely on it. We need to show screen/window previous, as currently it can easily happen you pick a wrong screen (especially when you have two identical monitors) and we need a way how to invoke Chromium dialog so users can share a web tab.
  • Fix bugs. I’m currently not aware of any issue and we already fixed a bunch of them this year as people use it more often, but I expect more issues to appear as Wayland becomes more and more dominant and we finally make it enabled by default.

WebRTC: journey to make wayland screen sharing enabled by default

While we have pretty good support for screen sharing on Wayland in WebRTC, which is included in browsers like Chromium or Firefox, it is still not enabled by default in Chromium and it is kept behind a flag. Not only you have to remember to always enable it for new configurations, but for many users it is not even something they are aware of. This has been my main focus recently and I would like to share with you steps that has been done and what are the plans for the future.

What are the changes to expect in Chromium soon?

DMA-BUF improvements/fixes:

Last year I landed proper DMA-BUF support in WebRTC, which made things way faster. It was working, but it was not perfect and there were some corner cases where it might not be working at all. Here are changes I made recently:

  • Advertise DMA-BUF support when it is really supported. Older versions of PipeWire don’t handle the new way of DMA-BUF negotiation and therefore it shouldn’t be used in such cases. Also using DMA-BUF modifiers requires some recent versions of PipeWire on both sides.
  • Implemented stream renegotiation. In situation when we fail to import a DMA-BUF with given modifier, we will drop this modifier and try to renegotiate stream parameters and go with a different modifier or fallback to shared memory buffers in case we fail completely.
  • Make sure to import DMA-BUF with correct render node. In case of multi-gpu setups, we always picked the first render node to import DMA-BUFs, but it can happen that they were actually produced by a different render node and for that reason we might fail to import them. We now try to get default EGLDisplay, which should be the same one used by the wayland compositor and we should be using same render node.

Better mouse cursor support:

Until now we had mouse cursor as part of the screen content. This means that everytime you moved with your mouse cursor, we had to update whole image and that is very inefficient. The API in WebRTC allows you to implement MouseCursorMonitor which can be used to track mouse changes only and each platform can have both MouseCursorMonitor and DesktopCapturer implementations combined in DesktopAndCursorComposer to get complete image and this all works automatically like a magic. Unlike X11 implementation, our only option is to get everything from one PipeWire stream we connect to and there was no way how to make it shared from DesktopCapturer implementation so it can be used by MouseCursorMonitor implementation. I had to split DesktopCapturer to have xdg-desktop-portal and PipeWire separate implementations. Code for PipeWire is now a SharedScreenCastStream class which is being shared through DesktopCaptureOptions. This is set of parameters associated with each capturer instance and luckily this is also passed to MouseCursorMonitor so we can have access to already initialized PipeWire stream and get the cursor data from there. Implement MouseCursorMonitor with SharedScreenCastStream was then piece of cake.

List of merge requests:

This should again significantly improve performance of screen sharing, because moving with a mouse over a static screen content doesn’t need full screen content update.

Misc:

Last but not least, I’m now in touch with Google developers who help me to review all my changes and discuss with me the current state, issues I have, etc. on monthly meetings we have. The plan is to make this finally enabled by default, hopefully in the first half of this year. There are still some things that need to be solved before this is enabled and there is lot of work ahead, but things look promising.

Plans for the future:

  • Implement stream restoration
    • this will allow us to skip the second portal dialog and I already have plan in my head how to do this in WebRTC. This is currently only supported by xdg-desktop-portal-gnome and xdg-desktop-portal-kde lacks this functionality.
  • Improve UX of the Chromium screen sharing dialog
  • Write tests for all PipeWire/portal code in WebRTC

Even though WebRTC is used in Firefox, I mostly talk about Chromium, because Firefox doesn’t use most recent WebRTC and will need to pick all the changes I did or rebase to newer WebRTC in order to have them. Firefox also has PipeWire/Wayland screen sharing enabled by default and doesn’t have UX issues as there is no internal screen sharing dialog like in Chromium.

I hope all these changes will make your experience better and next time when you read a new blog post I will be informing you about end of this journey.

DMA-BUF support in WebRTC

It will be almost three years since we landed initial support for screensharing on Wayland with the use of PipeWire in the WebRTC project. This enabled screensharing support in both major Linux browsers. Last year I implemented support for window sharing, added support for PipeWire 0.3 and added support for DMA-BUF and MemFD buffer types. Problem was, as it turned out, the DMA-BUF support was not implemented in a correct way.

The original implementation was using mmap() to get the buffer content. This worked correctly for current Intel GPUs, but was terrifically slow on e.g. AMD GPUs. Proper solution is to use OpenGL context to get the content from buffer. However, there were many implementations using mmap() already, including WebRTC and we needed a way how to properly communicate between the server and the client that when the client advertises DMA-BUF support, it means it doesn’t use mmap() and goes through OpenGL context instead.

Here are some issues if you want to read about the details:

This all resulted into a completely different way how the communication between the consumer and the producer should happen in order to use DMA buffers for way faster and smoother screensharing support. Both sides are now required to query the list of all supported modifiers and add this as a new stream parameter, including flags that the modifiers are mandatory parameter, rest of stream parameters are kept as before so we can keep using other types in case DMA-BUFs are not supported by the producer. Once both sides matches their expectations, we can query whether the stream includes modifiers, based on that we know we can use DMA buffers, which we now properly open using OpenGL context, while we kept mmap() for MemFd buffer types as fallback. This will result into faster screensharing support in your web browsers.

Last but not least, I made screensharing even faster, regardless of buffer type we use. Originally when we received buffer from PipeWire, we copied it to a local variable so we can apply cropping and adjust the position and only after that we copied this adjusted content into a DesktopFrame, which each DesktopCapturer (a class representing screensharing implementation) is supposed to return and let it be displayed by the browser. That means we performed two copy operations for each frame. I improved this implementation and now we copy the PW buffer content directly to a desktop frame which we can return directly so one copy operation less than before. I didn’t do some exact measures, but simply running htop and comparing usage of top 5 processes when sharing a 4k screen I got:

  • Original result: 66%, 64%, 26% 23%, 10%
  • Updated result: 41%, 39%, 19%, 17%, 12%

I also have some other improvements on my TODO list, all of them should bring some additional optimizations and improvements. I will keep you informed once I have news to share with you.

Both changes have been merged into WebRTC, that means it should be in Chrome/Chromium 96 (released during November 2021).

Tutorial: Screen Sharing and Remote Desktop on Fedora Workstation 30 (Wayland)

I recently got an email from a user asking me how to make all this work on Fedora. Problem is that unlike in old XServer sessions, there are certain things which need to be enabled first. There are also dependencies which need to be installed and services which need to be running. While most of the dependencies are automatically installed and services automatically activated, there still might be situations when this is not true, for example when switching from another desktop so it’s better to cover it all. This tutorial targets Fedora, but it can be probably used by any other distribution.

Dependencies

Both screen sharing and remote desktop work almost identically on Wayland, they both use portals as a communication tool between applications and compositor (in this case Mutter) to start the process of sharing and setup PipeWire stream (see below). While portals were primarily meant to be used by sandboxed applications (e.g. Flatpak) to get access to system (like files or printing) outside the sandbox, their design perfectly fits for Wayland usage too. In Fedora you should have portals automatically installed, they are represented by two separate packages, first is xdg-desktop-portal, which is the portal service communicating with sandboxed applications and with a backend implementation of portals, and the second package is the backend implementation, in our case xdg-desktop-portal-gtk. Both are DBus activatable, which means they are automatically started whenever application calls them. The reason why portals consist from two services is that there can be multiple backends, each one providing native dialogs for your desktop. For example you don’t get a gtk dialog to open a file in KDE Plasma session or you want a backend communicating with specific compositor (like in our case with Mutter).

The second important dependency is PipeWire. PipeWire is the core technology used for screen content delivery from the compositor to applications. This is done throught a PipeWire stream shared between the compositor and application. PipeWire should be automatically installed on your system, the package name is pipewire and it provides socket-based activation so you shouldn’t need to worry if it’s running or not.

Enabling screen sharing and remote desktop in Gnome

You don’t seem to do any additional step in order to make screen sharing work. However, you need to enable remote desktop (if you want to). Go and open gnome-control-center (Settings) and there go to Sharing section. There you should see this window when you click on Screen Sharing:

If you don’t have such option, make sure you have installed gnome-remote-desktop, because I’m not sure whether it’s installed by default. Allowing screen sharing will start a server instance which you can connect to from another computer, using vnc://linux.local or your_computer_ip:5900. You will most likely need to open a hole to your firewall, but the same you need to do for any other VNC server. I tried to connect with Vinagre and Krdc (KDE VNC viewer) and both worked for me, but I was unable to connect with Tigervnc (vncviewer) due to not maching security type.

Screen sharing in Firefox

Firefox in Fedora already comes with PipeWire support enabled and you don’t need to do anything special. You can test it for example with this testing page. The PipeWire support in Firefox unfortunately needs to be enabled during build by handmade changes, which is most likely not happening in other distributions, but from what I now there is an ongoing effort to make this configurable with a build option.

Screen sharing in Chromium/Chrome

Similar situation is with Chromium, where PipeWire needs to be also enabled during build, but it’s already configurable via a build option. In Fedora we have this enabled by default. Official Chrome builds are build with PipeWire support enabled as well. I should maybe mention that PipeWire support is in Chromium starting with version 74.

Unlike with Firefox, this support needs to be also enabled in runtime. You can do that with following chrome flag:

chrome://flags/#enable-webrtc-pipewire-capturer

Then you should be all set to be able to share a screen or a window from your Chromium or Chrome.

Issues

Don’t get scared by higher number of dialogs for screen/windows selection you will get when sharing screen in your browser. We are aware of this annoying usability issue and hopefully we will manage to solve it one day. The reason why this happens is that every browser provide their dialog for screen/window selection and in both browsers these dialogs show previews of your selection. You need to select screen/window first in the portal dialog for the preview dialog and once you accept the preview dialog in your browser, you again need to select screen/window in the portal dialog to get the content into the web page itself.

Support in other applications

There is a KDE application called Krfb, which in the next release (19.08) will have similar support for remote desktop on Wayland as you have in Gnome. Otherwise there are probably no other applications which would allow you to share a screen or control your desktop remotely on Gnome Wayland sessions. You will not be able to use TeamViewer, Tigervnc or any other application you are used to use. If you want to use these applications, you will have to switch to X session for now.

Scroll To Top