Published: Thu. 10 October 2019
After explaining how to
Debug Hybrid Graphics issues on Linux, here is the story of four graphics bugs
that I had in GNOME and Firefox on my Fedora 30 between May 2018 and September
2019: bugs in gnome-shell, Gtk, Firefox and mutter.
In May 2018, six months after I got my Lenovo P50 laptop, gnome-shell was
"sometimes" freezing for between 1 and 5 seconds. It was annoying because
keystrokes were repeated during a freeze, writing "helloooooooooooooooooooooo"
instead of "hello" for example.
My colleagues led me to #fedora-desktop on the GIMP IRC server, where I met
my colleague Jonas Ådahl (jadahl), who almost immediately identified my
issue! Extract of the IRC chat:
15:03 <vstinner> hello. i upgraded from F27 to F28, and it seems like I
switched from Xorg to Wayland. sometimes, the desktop hangs a few
milliseconds (less than 2 secondes)
15:03 <vstinner> bentiss told me that "libinput error: client bug: timer
event7 keyboard: offset negative (-39ms)" can occur when shell is too
15:04 <vstinner> journalctl shows me frenquently the bug
Shell.GenericContainer (0x559e6bfddc60), has been already finalized.
Impossible to get any property from it."
15:04 <vstinner> i also get "Window manager warning: last_user_time
(3093467) is greater than comparison timestamp (3093466). This most
likely represents a buggy client sending inaccurate timestamps in
messages such as _NET_ACTIVE_WINDOW. Trying to work around..." errors
in logs (from shell)
15:05 <vstinner> bentiss: ah, i also get "libinput error: client bug: timer
event7 trackpoint: offset negative (-352ms)" errors
15:06 <vstinner> it's a recent laptop, Lenovo P50: 32 GB of RAM, 4 physical
CPUs (8 threads) Intel(R) Core(TM) i7-6820HQ CPU @ 2.70GHz
15:06 <vstinner> so. what can i do to debug such performance issue? may it
come from shell? what does it mean if shell is slow? can it be a GPU
15:13 <jadahl> vstinner: whats your hardware? Do you have a hybrid gpu
15:13 <jadahl> ah, yes P50
15:14 <jadahl> vstinner: there is a branch on mutter upstream that fixes
that issue. want to compile it to test?
Ten minutes after I asked my question, Jonas asked the right question: "Do you
have a hybrid gpu system?"
I was able to work around the issue by connecting my laptop to my TV using the
HDMI port:
15:22 < jadahl> for example, IIRC if you have a monitor connected to the
HDMI, the issue will go away since the secondary GPU is always awake
15:31 < vstinner> jadahl: i plugged a HDMI cable to my TV and it seems like
the issue is gone
15:31 < vstinner> jadahl: impressive
When an external monitor is used (like a TV plugged into the HDMI port), my
NVIDIA GPU is always active, which works around the bug I had in gnome-shell.
Jonas provided me with an RPM package for Fedora including his work-in-progress
fix: Upload HW cursor sprite on-demand. I confirmed that this change fixed my
bug. His mutter change has since been merged upstream.
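The libinput errors quoted in the IRC extract above include the delay in milliseconds, which gives a rough measure of how long the shell was stalled. Here is a minimal sketch for pulling those delays out of log lines. The pattern is inferred only from the two messages quoted above, and the names `LIBINPUT_DELAY` and `find_input_delays` are hypothetical helpers, not part of libinput or journalctl.

```python
import re

# Assumption: the message layout is inferred from the two log lines quoted
# in the chat; other libinput versions may word the error differently.
LIBINPUT_DELAY = re.compile(
    r"libinput error: client bug: timer (?P<source>.+?): "
    r"offset negative \(-(?P<delay_ms>\d+)ms\)"
)


def find_input_delays(lines):
    """Return a list of (source, delay in ms) for each matching log line."""
    delays = []
    for line in lines:
        match = LIBINPUT_DELAY.search(line)
        if match:
            delays.append((match.group("source"), int(match.group("delay_ms"))))
    return delays


sample = [
    "libinput error: client bug: timer event7 keyboard: offset negative (-39ms)",
    "libinput error: client bug: timer event7 trackpoint: offset negative (-352ms)",
    "Window manager warning: last_user_time (3093467) is greater than "
    "comparison timestamp (3093466).",
]
print(find_input_delays(sample))
```

Feeding it the saved output of journalctl would show whether the delays cluster around the moments when the desktop froze.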
Firefox crash when selecting text
In March 2019, Firefox with Wayland crashed on
wl_abort() when selecting
more than 4000 characters in a <textarea>. I found the bug in Gmail when
selecting the whole email text to remove it. Pressing CTRL + A or
Right-click + Select All crashed the whole Firefox process!
I reported the bug to Firefox:
Firefox with Wayland crash on wl_abort() when
selecting more than 4000 characters in a <textarea>.
Running Firefox in gdb gave me some trouble since it's a very large binary with
many libraries. I also read the
Wayland protocol specifications.
I managed to analyze the bug, so I reported it to Gtk as well: On
Wayland, notify_surrounding_text() crash on wl_abort() if text is longer than
According to gdb, the crash is a call to
wl_abort() because the buffer is too short. It seems like
wl_buffer_put() fails with E2BIG.
I quickly found out that
my Gtk bug had already been fixed 3 months earlier
by Carlos Garnacho (imwayland: Respect maximum length of 4000 Bytes on
strings being sent)
and that the fix is part of gtk-3.24.3 ("wayland: Respect length limits in text
protocol" says "Overview of Changes in GTK+ 3.24.3").
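To illustrate what respecting a byte limit on a string involves, here is a minimal sketch that truncates text to at most 4000 bytes of UTF-8 without cutting a multi-byte character in half. This is not the Gtk fix itself, which is written in C and may choose which part of the text to keep differently. `MAX_TEXT_BYTES` and `truncate_utf8` are hypothetical names used only for this example.

```python
# Assumption: the 4000-byte limit is taken from the title of the Gtk fix.
MAX_TEXT_BYTES = 4000


def truncate_utf8(text, max_bytes=MAX_TEXT_BYTES):
    """Return text cut so that its UTF-8 encoding fits in max_bytes."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # Cutting at a byte offset can split a multi-byte character;
    # errors="ignore" drops the incomplete trailing bytes.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


ascii_text = "a" * 5000
accented_text = "é" * 3000  # 2 bytes per character in UTF-8
print(len(truncate_utf8(ascii_text).encode("utf-8")))
print(len(truncate_utf8(accented_text).encode("utf-8")))
```

The limit is counted in bytes, not characters, so text with accented or non-Latin characters reaches it sooner than ASCII text of the same length.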
I asked for Gtk to be upgraded in Fedora, but that was not possible since the
newer version changed the theme. I was asked to cherry-pick the fix and that's
what I did:
imwayland: Respect maximum length of 4000 Bytes on strings.
My PR was merged and a new package was built. I tested it and confirmed that it
fixed the crash:
FEDORA-2019-d67ec97b0b. Soon, the
package was pushed to the public Fedora package repository.
That's the cool part about open source: if you have the skills to hack the
code, you can fix an annoying bug that affects you!
Firefox: [Wayland] Window partially or not updated when switching between two tabs
Analyze the bug
In September 2019, after a large system upgrade (install 6 packages, upgrade
234 packages, remove 5 packages), Firefox sometimes stopped updating the window
content when I switched from one tab to another.
It took me a few hours to analyze the bug to be able to produce a useful bug
report. I followed the advice in Fedora's guide,
How to debug Firefox problems.
First, I tried to understand which GPU driver was used. I ended up
blacklisting the nouveau driver in the Linux kernel, to ensure that Firefox was
using my Intel IGP: bug reproduced.
I disabled all Firefox extensions: bug reproduced.
Then I created a new Firefox profile and started Firefox in safe mode: bug
reproduced.
I tested the latest Firefox binary from mozilla.org (Firefox 69.0): bug
reproduced.
I tested Firefox Nightly from mozilla.org (Firefox 71.0a1): bug reproduced.
Ok, it was enough data to produce an interesting bug report. I reported
[Wayland] Window partially or not updated when switching between two tabs to Firefox.
Identify the regression using Fedora packages
Then I looked at
/var/log/dnf.log and I tried to identify which package
update could explain the regression.
I downgraded gtk3-3.24.11-1.fc30.x86_64 to gtk3.x86_64 3.24.10-1.fc30: bug
reproduced.
I rebooted on the oldest available
Linux kernel, version 5.2.8-200.fc30.x86_64:
bug reproduced. I checked the journalctl logs to see which Linux version I was
running when the bug was first seen: Linux 5.2.9-200.fc30.x86_64.
I don't know why, but
downgrading Firefox was only my 3rd test.
I downgraded firefox-69.0-2.fc30.x86_64 to firefox-68.0.2-1.fc30.x86_64: the
bug is gone! Ok, so
the regression comes from the Firefox package, and it
was introduced between package versions 68.0.2-1.fc30 and 69.0-2.fc30.
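When comparing package updates like these, it helps to split each identifier into its parts. The sketch below assumes the usual name-version-release.arch layout of the package names quoted above and ignores optional epochs. `parse_nevra` is a hypothetical helper, not a dnf or rpm API.

```python
def parse_nevra(package):
    """Split 'name-version-release.arch' into its four parts.

    Assumption: no epoch is present and the architecture is the text after
    the last dot, as in the package names quoted in this article.
    """
    rest, arch = package.rsplit(".", 1)
    # The name itself may contain dashes, so split from the right.
    name, version, release = rest.rsplit("-", 2)
    return name, version, release, arch


for package in (
    "firefox-68.0.2-1.fc30.x86_64",
    "firefox-69.0-2.fc30.x86_64",
    "gtk3-3.24.11-1.fc30.x86_64",
):
    print(parse_nevra(package))
```

Grouping log entries by package name then makes it easier to list, for each package, the version before and after the upgrade.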
On IRC, I met my colleague
Martin Stránský, who packages Firefox for Fedora.
He told me that he was aware of my bug and might have a fix for it. Great!
Only 9 days later,
Martin Stránský's fix was merged in Firefox upstream,
released in Firefox Nightly, and a new package was shipped in Fedora 30!
Thanks Martin for your efficiency!
The final Firefox change is quite large and intrusive:
[Wayland] Fix rendering
glitches on wayland
Xwayland crash in xwl_glamor_gbm_create_pixmap()
In September 2019, while I was debugging the previous Firefox bug, I started my
IRC client hexchat. Suddenly,
Xwayland crashed, which closed my whole GNOME
session! I was testing various GPU configurations to analyze the Firefox bug.
ABRT managed to rebuild a useful traceback and identified an existing bug
report. It added my comment to the
OsLookupColor(): Segmentation fault at address 0x28 report.
On July 26, 2019 (1 month before I got the bug),
Olivier Fourdan added a comment:
glamor_get_modifiers+0x767 is xwl_glamor_gbm_create_pixmap() so this
is the same as bug 1729925 fixed upstream with
xwayland: Do not free a NULL GBM bo.
So in fact, my bug had already been fixed by
Olivier Fourdan in Xwayland
upstream, but the fix had not landed in Fedora yet.
I would like to thank the following developers who fixed my Fedora 30. What a
coincidence, all four are my colleagues! It seems like Red Hat is investing in
the Linux desktop :-)
Carlos Garnacho (Red Hat).
Jonas Ådahl (Red Hat).
Martin Stránský (Red Hat).
Olivier Fourdan (Red Hat).