In the previous post, we ran into some Dogtail behaviors that were not obvious to interpret.

OK, that was some time ago, and I never got around to finishing that next article. Since then I finally decided that it was more useful to switch Omaha to non-blocking dialogs than to dive into that part of Gtk3’s accessibility (which got completely overhauled for Gtk4). This resulted in some progress in UI testing under Gtk3, but then I hit a couple of other issues there. First, GtkFileChooser is not particularly accessible, which did not look too hard to improve by adding a couple of names and actions. Then something annoying became visible with the GtkStack embedded in the complex widget that GtkFileChooser is.

Exploring GtkFileChooserWidget

The GtkFileChooserDialog is based on GtkFileChooserWidget, so this is the one we’ll focus on here for a start (at this point anyway the accessibility improvements I worked on are only available for the widget, not for the dialog).

Let’s use a very simple test app with a file chooser, plus a useless button to help us navigate the accessible nodes, since the accessible tree of such an app is much more complicated than we’d expect.

#!/usr/bin/python3

import gi
gi.require_version('Gtk', '3.0')
from gi.repository import Gtk

window = Gtk.Window()
window.connect("destroy", lambda w: Gtk.main_quit())
vbox = Gtk.VBox()
window.add(vbox)

chooser = Gtk.FileChooserWidget(
    action=Gtk.FileChooserAction.OPEN)
vbox.pack_start(chooser, expand=False, fill=True, padding=0)

btn = Gtk.Button(label="Press me")
vbox.pack_start(btn, expand=False, fill=False, padding=0)

window.show_all()
Gtk.main()

In Accerciser with Gtk 3.24.23 we can now find our GtkFileChooserWidget as the button’s sibling, but finding our way inside is a bit of a challenge:

tree in gtk 3.24.23

With accessibility improvements we find our way a bit more easily to the GtkStack that can show one of a pathbar, a search textfield, or a "location" textfield to enter a path:

tree with MR#2721

Now if we randomly hit Ctrl-L (to toggle the "location" layer in the stack) and Ctrl-F (to toggle the "search" layer), we see Accerciser having problems:

  1. Ctrl-L: the pathbar layer disappears and a location layer appears

    testseq1.1
  2. Ctrl-F: a search layer appears, but the location layer stays

    testseq1.2
  3. Ctrl-F: the pathbar layer is back, but the other two layers stay

    testseq1.3
  4. Ctrl-L: the pathbar layer disappears and a location layer is back, but the search layer stays

    testseq1.4

Asking Accerciser to reload all information from the registry shows the same information, but quitting and restarting Accerciser does not necessarily do so.

I even witnessed cases where the widget was showing the location layer, and Accerciser on startup shows only a pathbar child and no location. Accerciser would not be able to make up such information without the widget exposing it. OTOH, exploring the same app state using Dogtail’s sniff does show the location layer instead.

If we keep playing with the two shortcuts, we quite inevitably reach a point where Accerciser aborts after some event, saying:

double free or corruption (fasttop)

Now what? Is the a11y layer reporting wrong information? Are the tools interpreting that information incorrectly? We’ll have to see for ourselves…

Exploring GtkStack

Since the problem in the file chooser seems to be with the stack widget it embeds, maybe we can cook a simpler example, that should help understand where the bad smell comes from.

Here is one, it just features:

  • a GtkStack with two children labels, labeled "on" and "off"

  • a toggle button that switches which of the stack children is visible

#!/usr/bin/python3

import gi
gi.require_version('Gtk', '3.0')
from gi.repository import Gtk

window = Gtk.Window()
window.connect("destroy", lambda w: Gtk.main_quit())

vbox = Gtk.VBox()
window.add(vbox)

stack = Gtk.Stack()
vbox.add(stack)

toggle = Gtk.ToggleButton(label="Switch")
vbox.add(toggle)

label_off = Gtk.Label(label="off")
stack.add_named(label_off, "off")
label_on = Gtk.Label(label="on")
stack.add_named(label_on, "on")

toggle.connect("toggled",
               lambda w: stack.set_visible_child_name("on" if w.get_active() else "off"))

window.show_all()
Gtk.main()

what Accerciser 3.38.0 shows

On startup, we see that the stack has a single child labeled "off". After that I could observe two kinds of behavior:

  • no automatic updates

    • on first toggle the Accerciser tree is not updated automatically, and when we refresh it we see two children labeled "on"; both allow interacting with the widget

    • on second toggle and after another manual update we see one child labeled "off" (properly linked with the widget) and one labeled "on"

    • on third toggle we’re back with two children labeled "on"

  • automatic updates

    • on first toggle the Accerciser tree shows a single child labeled "on"

    • on second toggle we see one child labeled "off" and one labeled "on"

    • on third toggle we’re back with one child labeled "on"

So far I haven’t been able to tell what triggers one buggy behavior rather than the other.

A fresh Accerciser launched against this app state does insist that the two labels are present, though the extra one is reported as not being "showing" (which is at least a relief, whereas sniff does not seem aware of its existence).

what Dogtail 0.9.11’s sniff shows

Sniff does not auto-refresh by default, and each manual refresh correctly shows the proper label. Activating auto-refresh apparently triggers a re-walk of the modified tree, and gives the expected information each time.
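Since the two tools disagree, a third opinion can help. Here is a minimal sketch of a tree dump that relies only on what the scripts in this post use on pyatspi accessibles: iterating over children, .name and .getRoleName(). The FakeNode class is a hypothetical stand-in, not part of pyatspi, so that the sketch runs without an AT-SPI bus. Skipping None children is an assumption about what a real tree may return.

```python
def dump_tree(node, depth=0, out=None):
    """Return one indented "[role | name]" line per accessible under node."""
    if out is None:
        out = []
    out.append("  " * depth + f"[{node.getRoleName()} | {node.name}]")
    for child in node:
        # Assumption: a child that cannot be resolved shows up as None.
        if child is not None:
            dump_tree(child, depth + 1, out)
    return out


class FakeNode:
    """Hypothetical stand-in for an accessible, for illustration only."""

    def __init__(self, role, name, children=()):
        self._role = role
        self.name = name
        self._children = list(children)

    def getRoleName(self):
        return self._role

    def __iter__(self):
        return iter(self._children)


# A stack (exposed as a panel) holding only the "off" label.
stack_node = FakeNode("panel", "", [FakeNode("label", "off")])
lines = dump_tree(stack_node)
print("\n".join(lines))
```

Against a real app, the starting node would presumably be picked from the children of pyatspi.Registry.getDesktop(0), for instance the one whose name matches the test script.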

what next?

The characterization of that GtkStack issue seems not that obvious, with Gtk devs apparently not trusting any of the existing tools. As they point out, the code destroys (as far as a11y is concerned) any child it hides, so Accerciser should not be able to see it.

It seems that these new events bring me back to what I originally envisioned: let’s dive now into the AT-SPI world, and now not only to discover what Dogtail itself is looking at :).

Looking at AT-SPI events

the basics

At first sight, to write an AT-SPI client in Python, the obvious choice would be pyatspi2.

As exemplified here, its entry point is pyatspi.Registry, which lets us:

  • register a handler to receive events, using pyatspi.Registry.registerEventListener()

  • run the main loop with pyatspi.Registry.start()

  • possibly stop it with pyatspi.Registry.stop()

Such a script to monitor AT-SPI events is included in the dogtail source tree (though the comments inside threaten to get rid of it). Don’t forget to run it explicitly with python3 test-events.py, since for some reason the shebang lines were removed but the scripts were kept executable. Also note that a simple Ctrl-C won’t terminate it, but you can use Ctrl-\ to send a SIGQUIT.

The test-events.py script already shows several drawbacks:

  • we see events from all accessible apps, so there’s a lot of noise when looking at a specific issue

  • the output information is too generic to identify particular widgets

  • only a predefined list of events is logged

a first a11y-monitoring tool

So what do we need? Let’s make a first naive list:

  • wait for a particular app to start up, so we can catch all a11y events it sends. We could also check whether it has been already started, but that:

    • would require probing the existing a11y tree first, which takes time, so we would have to defer interpretation of incoming events, making things more complicated

    • would not work easily when several instances of the app under test are running on the same machine

  • just print whatever we know about every a11y event we receive from this application

Here is a first simple script for those requirements. This version simply:

  • listens to events from the desktop to detect startup and termination of the requested app

  • then uses pyatspi’s event formatting for printout

#!/usr/bin/python3 -u
from gi.repository import GLib
import pyatspi
import sys

def _registryChildChanged(event):
    global the_app
    if event.sender is desktop:
        if event.type.major != "children-changed":
            return
        if (event.type.minor == "add"
            and event.any_data.getRoleName() == "application"
            and event.any_data.name == appname
        ):
            if the_app is not None:
                print("Ignoring extra '%s' app" % appname)
                return
            # record the app first, so that the id we print is the app's
            the_app = event.any_data
            print(f"App '{appname}' 0x{id(the_app):x} intercepted")
            print("App windows: {}".format([str(w) for w in the_app]))
        elif event.type.minor == "remove":
            # WTF cannot make a link with recorded app ?
            if the_app not in list(desktop):
                print("App is gone")
                pyatspi.Registry.stop()
        else:
            print("DESKTOP EVENT: %s" % event)
    elif event.sender is the_app:
        if event.type.major != "children-changed":
            print(event)
            return
        assert event.sender is the_app
        print("APP CHANGED")
        print(event)
    elif event.sender in old_apps:
        pass # just ignore unrelated apps
    else:
        print(event)

appname = sys.argv[1]

desktop = pyatspi.Registry.getDesktop(0)
old_apps = list(desktop)
the_app = None

#GLib.timeout_add(200, pyatspi.Registry.pumpQueuedEvents)
pyatspi.Registry.registerEventListener(_registryChildChanged,
                                       'object') # :children-changed

try:
    pyatspi.Registry.start(**{'async': True, 'gil': False})
except KeyboardInterrupt:
    pyatspi.Registry.stop()

Just pass it, as argument, the name your app exposes to AT-SPI (which defaults to argv[0]), arrange to save the output to a file for easier reading, e.g. ./atspitest_v0.py test-stack.py |& tee out.log, and then run the test-stack.py script.

What do we notice here?

  • we still cannot distinguish one a11y node from the other

  • while some object:state-changed:defunct events look quite complete, others are quite uninformative:

    object:state-changed:defunct(1, 0, 0)
            source: [DEAD]
            host_application: [DEAD]
            sender: None

Proper event details

Using print(event) exposes a couple of attributes, e.g.:

object:children-changed:remove(0, 0, [frame | ])
        source: [application | test-stack.py]
        host_application: [application | test-stack.py]
        sender: [application | test-stack.py]

We see those event attributes in order:

  • type

  • 3 unlabeled values (actually detail1, detail2 and any_data, whose use depends on the event type)

  • source

  • host_application

  • sender

In our case the host_application field is always the same so we’ll omit it to make some room.

OTOH the nodes mentioned (as source, sender, etc.) do not show much detail, and we would want to add enough to each node description to establish its identity:

  • python object identity would be great, so we could use the python is operator to check if two events are talking about the same object. Unfortunately we can rapidly see that the python object identity, while stable as long as the accessible node lives, seems to change when it is destroyed. This feels strange and may be a bug, but we just can’t use id().

  • AT-SPI path, the unique identifier for a node at protocol level, seems to be available at all time, and stable, so we’ll use it. However such a path lives in a per-application namespace, so we have to complement it with its parent app’s identity (which shows issues too when the app is dead, but since we’re mostly interested in the app’s life it won’t hurt us too much).
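To illustrate why the path alone is not enough, here is a minimal sketch of such a composite identity. The NodeKey type and the "app-1"/"app-2" identifiers are hypothetical names made up for this example; in a real client the app part would be whatever stable per-application identifier is available.

```python
from collections import Counter
from typing import NamedTuple


class NodeKey(NamedTuple):
    """Identity of an accessible node across events (illustrative only)."""
    app: str   # hypothetical per-application identifier
    path: str  # AT-SPI object path, unique only within one application


# Two apps can both own ".../accessible/4"; the app part keeps them apart.
a = NodeKey("app-1", "/org/a11y/atspi/accessible/4")
b = NodeKey("app-2", "/org/a11y/atspi/accessible/4")
a_again = NodeKey("app-1", "/org/a11y/atspi/accessible/4")

# Tuples compare and hash by value, so keys built from separate events
# that mention the same node end up in the same bucket.
events_per_node = Counter([a, b, a_again])
print(events_per_node[a], events_per_node[b])
```

A value-based key like this sidesteps the unstable id() of the Python wrapper objects, at the cost of having to build the key for every event.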

Thus we can go with a node pretty-printing like:

def pretty_accessible(acc):
    if acc is None:
        return "NONE"
    if acc is desktop:
        return "DSK"

    app_str = pretty_app(acc.get_application())

    states = acc.get_state_set()
    if states.contains(pyatspi.STATE_DEFUNCT):
        return (f"[(DEAD){app_str}:{acc.path}]")

    # cannot even check role when it is dead <sigh>
    if acc.get_role() == pyatspi.ROLE_APPLICATION:
        return app_str

    role = acc.get_role()
    role_str = role.value_nick if role is not None else "?"
    name = acc.get_name()
    name_str = name if name is not None else "?"
    return (f"[{role_str}|{name_str}|{app_str}:{acc.path}]")

def pretty_app(app):
    if app is None:
        return "NONE"

    states = app.get_state_set()
    if states.contains(pyatspi.STATE_DEFUNCT):
        # we cannot use get_id() if it is dead, and then its id() seems
        # semi-random, but sometimes we can see its role+name
        return f"DEADAPP(0x{id(app):x}, {app})"

    assert app.get_role() == pyatspi.ROLE_APPLICATION
    if app is the_app:
        return "APP"

    return f"APP(0x{id(app):x}, {app.get_name()}, id={app.get_id()})"

For event formatting we have a couple of special cases to handle, and can go with something like:

import datetime

def log_event(event):
    sender_str = pretty_accessible(event.sender)
    source_str = pretty_accessible(event.source)

    if isinstance(event.any_data, pyatspi.Accessible):
        any_data_str = pretty_accessible(event.any_data)
    else:
        any_data_str = str(event.any_data)

    ts = datetime.datetime.now().strftime('%H:%M:%S')
    print(f"{ts} {sender_str}: {event.type}({event.detail1}, {event.detail2}, {any_data_str}, src={source_str})")

application to the analysis of test-stack.py

When we launch test-stack.py we now see the creation of nodes on app startup, and their removal at app shutdown (defunct(0) appears to mean "alive"):

20:40:45 APP: object:state-changed:defunct(0, 0, 0, src=[panel||APP:/org/a11y/atspi/accessible/2])
20:40:45 APP: object:property-change:widget(0, 0, 0, src=[panel||APP:/org/a11y/atspi/accessible/2])
...
20:40:59 APP: object:children-changed:remove(0, 0, [panel||APP:/org/a11y/atspi/accessible/2], src=[filler||APP:/org/a11y/atspi/accessible/1])
...
20:40:59 APP: object:state-changed:defunct(1, 0, 0, src=[(DEAD)APP:/org/a11y/atspi/accessible/2])

We also see the initialization of the first GtkStack child, and how the children change on each toggle event:

20:40:45 APP: object:state-changed:defunct(0, 0, 0, src=[label|off|APP:/org/a11y/atspi/accessible/4])
20:40:45 APP: object:property-change:widget(0, 0, 0, src=[label|off|APP:/org/a11y/atspi/accessible/4])
...
20:40:49 APP: object:property-change:accessible-parent(0, 0, [panel||APP:/org/a11y/atspi/accessible/2], src=[label|off|APP:/org/a11y/atspi/accessible/4])
20:40:49 APP: object:children-changed:remove(0, 0, [label|off|APP:/org/a11y/atspi/accessible/4], src=[panel||APP:/org/a11y/atspi/accessible/2])
20:40:49 APP: object:state-changed:defunct(0, 0, 0, src=[label|on|APP:/org/a11y/atspi/accessible/6])
20:40:49 APP: object:property-change:widget(0, 0, 0, src=[label|on|APP:/org/a11y/atspi/accessible/6])
20:40:49 APP: object:property-change:accessible-parent(0, 0, [panel||APP:/org/a11y/atspi/accessible/2], src=[label|on|APP:/org/a11y/atspi/accessible/6])
20:40:49 APP: object:children-changed:add(0, 0, [label|on|APP:/org/a11y/atspi/accessible/6], src=[panel||APP:/org/a11y/atspi/accessible/2])
20:40:49 APP: object:state-changed:showing(1, 0, 0, src=[label|on|APP:/org/a11y/atspi/accessible/6])
20:40:49 APP: object:state-changed:showing(0, 0, 0, src=[label|off|APP:/org/a11y/atspi/accessible/4])
...
20:40:52 APP: object:property-change:accessible-parent(0, 0, [panel||APP:/org/a11y/atspi/accessible/2], src=[label|on|APP:/org/a11y/atspi/accessible/6])
20:40:52 APP: object:children-changed:remove(0, 0, [label|on|APP:/org/a11y/atspi/accessible/6], src=[panel||APP:/org/a11y/atspi/accessible/2])
20:40:52 APP: object:property-change:accessible-parent(0, 0, [panel||APP:/org/a11y/atspi/accessible/2], src=[label|off|APP:/org/a11y/atspi/accessible/4])
20:40:52 APP: object:children-changed:add(0, 0, [label|off|APP:/org/a11y/atspi/accessible/4], src=[panel||APP:/org/a11y/atspi/accessible/2])
20:40:52 APP: object:state-changed:showing(1, 0, 0, src=[label|off|APP:/org/a11y/atspi/accessible/4])
20:40:52 APP: object:state-changed:showing(0, 0, 0, src=[label|on|APP:/org/a11y/atspi/accessible/6])
...
20:40:54 APP: object:property-change:accessible-parent(0, 0, [panel||APP:/org/a11y/atspi/accessible/2], src=[label|off|APP:/org/a11y/atspi/accessible/4])
20:40:54 APP: object:children-changed:remove(0, 0, [label|off|APP:/org/a11y/atspi/accessible/4], src=[panel||APP:/org/a11y/atspi/accessible/2])
20:40:54 APP: object:property-change:accessible-parent(0, 0, [panel||APP:/org/a11y/atspi/accessible/2], src=[label|on|APP:/org/a11y/atspi/accessible/6])
20:40:54 APP: object:children-changed:add(0, 0, [label|on|APP:/org/a11y/atspi/accessible/6], src=[panel||APP:/org/a11y/atspi/accessible/2])
20:40:54 APP: object:state-changed:showing(1, 0, 0, src=[label|on|APP:/org/a11y/atspi/accessible/6])
20:40:54 APP: object:state-changed:showing(0, 0, 0, src=[label|off|APP:/org/a11y/atspi/accessible/4])

We can notice a number of peculiar things:

  • the property-change:accessible-parent events sent on removal and on addition of a child to the stack cannot be distinguished from one another, so they are not even saying "this is the new parent", and none is emitted for the initial child either. Too bad: that means we would have to make an extra AT-SPI request to find out what happened. We still lack some context to be sure, but this makes me suspect a bug of some sort here, even if it is just a useless parameter.

  • the children are declared to the AT-SPI stack the first time they are added; other than that, all toggle events look pretty much the same
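One way to cross-check a log like the one above is to replay its children-changed events and see which children a client that trusts them would end up with. The following is only a sketch: the regular expression assumes the exact line format printed by log_event, replay_children is a hypothetical helper, and the initial state (the panel already holding the "off" label) is an assumption, since the startup events are elided from the excerpt.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   ... object:children-changed:add(0, 0, [child], src=[parent])
EVENT_RE = re.compile(
    r"object:children-changed:(add|remove)"
    r"\(\d+, \d+, \[(.*?)\], src=\[(.*?)\]\)"
)


def replay_children(log_lines, initial=None):
    """Apply add/remove events in order; return {parent: [children]}."""
    children = defaultdict(list)
    for parent, kids in (initial or {}).items():
        children[parent].extend(kids)
    for line in log_lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue  # not a children-changed event
        kind, child, parent = match.groups()
        if kind == "add":
            if child not in children[parent]:
                children[parent].append(child)
        elif child in children[parent]:
            children[parent].remove(child)
    return dict(children)


PANEL = "panel||APP:/org/a11y/atspi/accessible/2"
OFF = "label|off|APP:/org/a11y/atspi/accessible/4"
ON = "label|on|APP:/org/a11y/atspi/accessible/6"

log = [
    f"20:40:49 APP: object:children-changed:remove(0, 0, [{OFF}], src=[{PANEL}])",
    f"20:40:49 APP: object:children-changed:add(0, 0, [{ON}], src=[{PANEL}])",
    f"20:40:49 APP: object:state-changed:showing(1, 0, 0, src=[{ON}])",
    f"20:40:52 APP: object:children-changed:remove(0, 0, [{ON}], src=[{PANEL}])",
    f"20:40:52 APP: object:children-changed:add(0, 0, [{OFF}], src=[{PANEL}])",
]

# Assumed starting point: the stack already exposes the "off" label.
state = replay_children(log, initial={PANEL: [OFF]})
print(state)
```

Under these assumptions the panel holds exactly one child after each toggle, which suggests that the extra children seen in Accerciser do not come from the children-changed events alone.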

Temporary wrap up and takeaway bug snacks

Up to now, things under the hood look… not that bad. We still have