> Ironically, I want interoperability on this to help with use cases relating to accessibility.
> I work at the BBC and, on our UK website, our navigation bar menu button behaves slightly differently depending on if it is opened with a pointer or keyboard. The click event will always open the menu, but:
> - when opening with a pointer, the focus moves to the menu container.
> - when opening with a keyboard, there is no animation to open the menu and the focus moves to the first link in the menu.
> Often when opening a menu, we don't want a slightly different behaviour around focus and animations depending on if the user 'clicks' with a pointer or keyboard.
> The 'click' event is great when creating user experiences for keyboard users because it is device independent. On keyboards, it is only invoked by Space or Enter key presses. If we were to use the keydown event, we would have to check whether only the Space or Enter keys were pressed.
const isInvokedByMouse = event => event.screenX > 0 || event.screenY > 0;
const isInvokedByKeyboard = event => isEnterKey(event) || isSpaceKey(event);
Ignoring the actual conditions entirely, this code seems to be trying to categorize the event into one of two categories: mouse or keyboard. But what it actually does is to categorize into one of four categories: (mouse and not keyboard), (keyboard and not mouse), (keyboard and mouse), and (neither keyboard nor mouse). And, as the original bug shows, (neither keyboard nor mouse) is handled inappropriately. One might wonder whether (keyboard and mouse) works well.
Either the code should be deliberate about the fact that (is it a keyboard) and (is it a mouse) are separate booleans, or the code should be structured so that the actual categories are mutually exclusive. For example:
const isInvokedByMouse = ...
and use !isInvokedByMouse to check for keyboardiness, or:
const eventSource = ... (returns "keyboard" or "mouse")
or, perhaps even better:
const eventSource = ... (returns "keyboard", "mouse", or "not sure")
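A minimal sketch of that last option, assuming the two existing heuristics are kept as inputs. The name `classifyEventSource` and the three labels are illustrative and do not come from the BBC code; the point is only that the four boolean combinations collapse into one value that can never be two things at once.

```javascript
// Hypothetical helper: collapses two independent booleans into a single,
// mutually exclusive label. "Both" and "neither" are treated as "not sure".
function classifyEventSource(looksLikeMouse, looksLikeKeyboard) {
  if (looksLikeMouse && !looksLikeKeyboard) return "mouse";
  if (looksLikeKeyboard && !looksLikeMouse) return "keyboard";
  return "not sure";
}

// Callers then switch on one value, and the fallback branch is explicit.
function describeMenuBehaviour(source) {
  switch (source) {
    case "mouse":
      return "animate, focus the menu container";
    case "keyboard":
      return "no animation, focus the first link";
    default:
      // Assumption: the keyboard behaviour is the safer fallback.
      return "no animation, focus the first link";
  }
}
```

With this shape, the (neither keyboard nor mouse) case that caused the original bug has to be handled somewhere, because it is a value like any other.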
Funnily enough, I have a physical door & lock that OFTEN gets into this state - it's exactly as irritating as it sounds, and it has a very meaningful impact on its state (it then can't be closed without first unlocking the lock!)
[0] Business, etc
[1] System design, security, database management, cost vs speed trade-offs, SCM, etc etc etc
Or, as pointed out in the post where multiple booleans are merged into a single enum, encode the constraints into the data itself, i.e. use the constraints of the host programming language.
But this won't be possible in general - for instance, if your language doesn't have sets/dictionaries, how would you encode uniqueness of values directly using arrays and lists? It would have to be done using interface functions.
This is called an algebraic data type (https://en.wikipedia.org/wiki/Algebraic_data_type), and it is the best way, imho, to reduce bugs in code.
By making it easy to pattern match, it reduces the possibility of producing an invalid state, because at the time of definition, you have to figure out how to get that type (and this is checked by the compiler).
The first mistake the developer made, was that he wanted to create a different user experience between keyboard and mouse. Stick to what you get by default and design your components so they work for both usecases. Don't try to be smart when it comes to accessibility.
What he ended up doing is what I would have considered a hack. A solution that inevitably breaks or has side effects.
The reason there rarely are good handles to do things differently in accessibility context, is because it's not something that's meant to be handled differently.
On mobile it's not perfect either, but in general you do have features to change things like focus, grouping of elements, how the keyboard navigates the view stack, how to access a button through custom actions and, like you mention, changing the tab index programmatically.
Even so, not everything can be fixed or handled through standard accessibility means and as such hacks will inevitably make it into the products.
I get what you're saying but I still think that making things accessible and designing with common accessibility in mind should be default and as such it has to be thought about when designing and developing from the get go. Having to create custom interfaces to fulfill a specific need might be a good fit for some things but not when developing apps and websites unless you're targeting that user-group specifically.
> The first mistake the developer made, was that he wanted to create a different user experience between keyboard and mouse. Stick to what you get by default and design your components so they work for both usecases.
We have. The behaviour is mostly the same whether you're using the keyboard or a pointer (mouse/touch/pen). The only difference is that, for keyboard users, we want to turn off the animation and move the focus to the first link in the menu instead of focussing on the menu's parent <ul>.
The problem was that, as various devs have iterated on the menu over the years, it's broken the fallback behaviour. For my colleague on the funny multi-monitor set up, it should have fallen back to the keyboard no-animation behaviour with no real major difference to the UX, but instead it fell back to the no-JS experience.
So yes, generally don't try to be smart with accessibility, avoid ARIA attributes except where necessary, etc, but click events are the universal input event and work on any kind of input device and have perfect browser support. It's far better for accessibility using them instead of a mix of keydown and mousedown or pointerdown, and potentially missing other kinds of input events.
As I stated in another comment, if it was a scenario where there needs to be a major difference in behaviour between keyboard and pointers, then I would rather use separate keydown and pointerdown events.
Maybe the former could have been solved using ARIA tags or maybe it would require bigger changes to the component itself. Accessibility is a roller-coaster for all these reasons alone.
Why not just always turn off the animations? Why not just always move the focus to the link?
What is the benefit of the animation to the user? What is the benefit of focusing on the menu’s parent to the user?
One rule of thumb with accessibility is that accessible products are usually better for everyone.
Animations enhance experience by drawing attention to state changes and providing intuitive feedback to user actions.
If you don't find them engaging or useful, that's fine - and you can set prefers-reduced-motion to true on your client - , but many people do.
> What is the benefit of focusing on the menu’s parent to the user?
The first item was not interacted with nor navigated to, therefore it shouldn't be focused under normal circumstances. It would be unexpected behavior.
Focusing the first item in keyboard interactions is an accessibility hack recommended by W3C:
> If you don't find them engaging or useful, that's fine - and you can set prefers-reduced-motion to true on your client - , but many people do.
The question here is not "does an animation have worth", but how is that worth tied to whether an onclick event originated from the mouse or the keyboard? Your reasoning applies equally to both, and thus leaves us still confused: why are we varying the animation by input device?
---
> why are we varying the animation by input device?
Another user explains it here: https://news.ycombinator.com/item?id=42176540
I don't actually agree, I think you can keep the animation and still make the content available immediately for screen readers. (And of course, keyboard navigation is not just for screen reader users!) Maybe someone else knows of some issue I don't.
No, they wanted to make them the same. It's just that giving a blind person the same experience as a sighted person requires different things, because they operate differently for obvious reasons. For example, a blind person can't see when an animation has finished. They expect the menu to be available once they've triggered it. Sighted people, however, see the dropdown appearing and then go to use it once it's ready.
> Don't try to be smart when it comes to accessibility.
In all seriousness, considering the state of accessibility as is, I think going outside the box isn't trying to be smart. It's actually being smart. The BBC frontend team is probably at the forefront of making high-traffic websites extremely usable.
A blind person can and should get cues from their assistive technologies that an item is being loaded and is shown, either using announcements or aria tags that provide this information to the user.
While it's fine to expect that something is available immediately, that's rarely a realistic expectation, whether you're blind or not.
For my two-cents, the BBC was simply trying too much to be "cutesy". Don't animate anything, because the silly animation on mouse click simply makes the website feel slower overall. Just open the menu as fast as the user's browser will open it.
Animation helps to correlate screen elements. Without animation it actually takes longer to establish the mental relationship between the action and the result.
But it's very easy to create cases where the UX sucks because things happen instantly especially as inherent complexity of the app increases.
“Don't try to be smart” alone is good advice in general and everywhere. Also in UI “don’t try to be original”
Sometimes complexity is simply the right tool for the job. Complexity is essential and valuable in all sorts of places - like fuzzers, LLMs, compilers, rendering engines, kernel schedulers and so on. But projects only have so much complexity budget to spend. I think I've spent my whole career trying to figure out how to spend complexity wisely. And I could spend decades more on it.
However, the BBC's intent seems quite sound to me from an a11y point of view, and their commitment to a11y is commendable. Though it's likely that for some browsers their attempts at defining their own a11y experience will result in a bad UX anyways.
My understanding from this is that BBC want slightly different behaviour depending on whether it's a mouse or keyboard "click" (keyboard shouldn't show the animation and should focus the first link in the menu).
However, they also want the ease of binding to a single event and while binding to "click" can do this, they have no way to tell whether it was a mouse click or keyboard press which triggered the event.
To solve this they're using an unreliable heuristic after realising in Chrome if the mouse position is screenX=0, screenY=0 it means the event was either triggered by a mouse click at screenX=0, screenY=0 or a keyboard.
As someone who's worked on accessibility projects in the past, this is a really stupid idea imo, and had I reviewed a PR with something like this I would have asked for it to be reworked. While I agree browsers should ideally do the same thing, the real issue here seems to me that screenX and screenY make little sense on a "click" triggered by a keyboard.
The solution ideally would be a new event ("trigger" or something) which doesn't emit a "MouseEvent", but something more generic which could apply to both a keyboard and mouse event and provide information about the "trigger" source. Imo keyboard "clicks" are weird to begin with and would ideally be fixed with a more appropriate event.
That said, I understand this doesn't currently exist in the spec and a solution is needed now. Therefore I don't see why they couldn't also bind to a "keydown" event and then, if the click is triggered alongside the "keydown" on the same element, assume it was a keyboard press. That would be far more reliable and far less hacky than what they're doing, and would allow them to trigger from the single event with a bit of extra code to detect if it was a keyboard or mouse.
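A rough sketch of that keydown-plus-click idea, under stated assumptions: the helper name `createKeyboardClickDetector` is hypothetical, and the 500 ms window is an arbitrary guess rather than anything taken from a spec or from the BBC code. It treats a click as keyboard-initiated when a keydown was seen on the same element shortly before it.

```javascript
// Hypothetical helper, not the BBC's code. It remembers when the element
// last received a keydown and compares that with the click's timestamp.
function createKeyboardClickDetector(windowMs = 500) {
  let lastKeydownAt = -Infinity; // no keydown seen yet

  return {
    // Register this as the element's "keydown" listener.
    onKeydown(event) {
      lastKeydownAt = event.timeStamp;
    },
    // Call this from the element's "click" listener.
    isKeyboardClick(event) {
      const elapsed = event.timeStamp - lastKeydownAt;
      return elapsed >= 0 && elapsed <= windowMs;
    },
  };
}

// Browser wiring (assumes a `button` element and a `toggleMenu` function
// that accepts an options object, both hypothetical here):
// const detector = createKeyboardClickDetector();
// button.addEventListener("keydown", detector.onKeydown);
// button.addEventListener("click", e =>
//   toggleMenu({ viaKeyboard: detector.isKeyboardClick(e) }));
```

The weak spot is any input method that sends a click without a preceding keydown, such as the screen reader behaviour reported elsewhere in this thread; those activations would be classified as pointer clicks.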
To use the keydown event means we have to assume that the 'Enter' and 'Space' are the only keys we need to check for. Using 'click' is far safer from an accessibility point of view because it will always respect what your device considers to be some kind of input trigger.
As stated in the UI Events spec:
> For maximum accessibility, content authors are encouraged to use the click event type when defining activation behavior for custom controls, rather than other pointing-device event types such as mousedown or mouseup, which are more device-specific. Though the click event type has its origins in pointer devices (e.g., a mouse), subsequent implementation enhancements have extended it beyond that association, and it can be considered a device-independent event type for element activation.
And to be clear, I would not want to do it this way if it was for some really critical difference in behaviour between pointer or keyboard interactions. I'm OK with this strange mechanism here because the fallback behaviour is not that different. If you're on Safari, for example, which can't check for `screenX === 0`, then all that happens is that there will be an animation when you open the menu.
However, sadly, because of the ways various developers have added to this code over the years, it's broken that fallback behaviour and stopped it working entirely. So I've just finished a refactor to sort that out and it will hopefully be going live soon.
I currently have an open semi-related bug, also in a menu dropdown component (where we also want to focus the first item when triggered via keyboard). My issue is that when Windows Narrator is used, the space/enter triggers a mocked click event instead of the keydown. We could check for the position like you do.
Unfortunately, accessibility is often hacky, both on the content side and on the browser/screen reader side.
Is the word “don’t” a mistake which gives the sentence the opposite of the intended meaning?
It's obviously extremely unlikely but what if the mouse is actually at 0,0 when the user clicks? I'm not very familiar with JS, is checking for != 0 really the best/only way to do this?
EDIT: actually upon going back, I realized I didn't fully process this sentence originally but it seems to address this:
> We should probably do further refactoring of the event handler function, since it's complicated by the fact that it also handles keydown events. For now, though, this fix will do just fine.
My question is why they're relying on those heuristics. My guess is that toggleMenu is being used by multiple event handlers. Or maybe there's something else going on that is specific to their codebase.
It's hard to judge without knowing the full picture.
EDIT: Aha, there's an answer here: https://news.ycombinator.com/item?id=42174177
Also, stack overflow suggests that exact code to "differentiate between mouse and keyboard triggering onclick": https://stackoverflow.com/questions/7465006/differentiate-be...
They care, because focus for keyboard-controlled screen readers sending "click" should behave differently: an element inside the menu should receive focus, even though it's not the element that has been clicked. Otherwise, if focus stayed on the top-level menu bar, screen reader users would be lost and would have to navigate to the menu's content themselves.
I used it to select which layout to show in the past.
If you want to listen to input on touch only then you can do that and call preventDefault on the event so that the browser does not then cause a click event. Or you can just save yourself the trouble and write a click handler.
As an industry, why haven't we figured out how to make drop downs that consistently open for all users? Is accessibility just that hard? Are there web frameworks/web components BBC should be using that already handle this?
I've been wary (as a backend-focused full-stack developer) about tweaking the browser's components. There's so much nuance to how they work and the implementations are battle tested. The idea of creating a custom text box (for example) without doing extensive research of text box behavior across platforms seems ripe for failure. I notice broken copy/paste and dropped characters often enough (on major corporate sites too). Why are text boxes broken in 2024? React feels arrogant to me now.
Personally, I've tried to handle this with server-side templates, CSS frameworks like Bulma, minimal JS. It's not viable for sites demanding slick custom branding (vanity?) but my text boxes work and my site doesn't cost a fortune to develop. Is it accessible to BBC standards? I'm not sure.
I don't know the answers to all the questions, but "is accessibility just that hard" is a firm, concrete, YES.
Here's a real-world example: modals. If you are not a vision-impaired user, you can see what's going on when you're presented with a white box containing UI components swimming in a sea of "don't touch this bit" grey.
If you're using a screen reader there's no guarantee that you'll receive any of that information. When your screen reader controls tab through the UI elements and you land back at the top of the box, will your particular screen reader report that to you? Will it list the available interactable elements? Will it list them in the same order as the other screen readers? How about on phone? How about on Mac? Will your screen reader and browser report the inputs right, or will it silently allow the user to fall out of the modal and back into the rest of the site?
When it comes to accessibility you can't trust that the OS, browser and the screen reader are cooperative, or even that they'll do something sane in the right situation.
In 2019 I had to log a bug with VoiceOver + Safari because a negative CSS margin could cause screen readers to read RTL text blocks out of order. Users with unimpaired vision would see "9/10/2019" and on the screen reader you would hear "ten slash nine slash two-thousand-and-nineteen". As a stopgap measure we had to set the text aria-hidden and put in an invisible p tag there with the correctly ordered text so screen readers wouldn't choke on it.
All this to say, sometimes when you see some jank code relating to accessibility there really isn't a better way to do it. Even if you dumped everything, turned the codebase upside down and focused on accessibility first you'd see stuff inexplicably break the moment JAWS or VoiceOver updates.
I'm just trying to reason on their decision here.
Instead of checking the more appropriate property that other commenters have suggested (pointerType), I'm a bit surprised that the solution given by the author is to patch up the shaky heuristics even more:
> We could deduce from our final two clues the solution: we need to check for negative numbers as well as positive numbers when checking the screenX and screenY coordinates.
At the time this code was originally written four years ago or whenever it was, not all browsers used PointerEvent for click.
Has anyone seen good use-cases for that feature? I'm thinking about dual window applications that interact with each other (I think I saw a demo of something like this a while ago on HN but I wasn't able to find it again), or sites where behavior depends on their location on the virtual screen.
For example:
Why they're checking for coordinates instead of checking for event.type is beyond me. Still, I appreciate the write-up; it is a good puzzle, and it is relatable to come across code you didn't write and ask: why is it important that the click coordinate is nonzero? Why can't we just check that event.target is the button we want to activate? Why are we using JavaScript at all when a details/summary tag would do the same job?
I'm with you on the second point - as unlikely as it is for the click to occur at the origin, it's still a legitimate value being abused as an indicator of something that might not actually be true - quite frankly the code was bad to begin with, and it was still bad after the fix.
Why would you just send a document when you can generate a heat map of where the user is on your website. And then complain about the performance and wonder why it costs so much to run a modern website.
layerX[1] while non-standard is supported and returns a position relative to the top of the page or the top of the parent element. This makes coordinates positive only and 50,50 is the same for all users. For screenX, 3000,1567 is the same coordinate as 15,37 depending on where the window is located.
[1] https://developer.mozilla.org/en-US/docs/Web/API/MouseEvent/...
FTA:
> The isInvokedByMouse was checking whether the click event was invoked by a mouse or touch pointer – rather than a keyboard – by checking if the screenX or screenY coordinates were a positive number.
They were trying to detect whether it was keyboard or mouse activation, and whoever wrote it assumed that screen coordinates of mouse events would always be positive.
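To make that assumption concrete, here is a small sketch that runs the positive-only check quoted in this thread, and the `!== 0` version quoted further down, against hand-written event-like objects. The coordinate values are made up for illustration; negative screen coordinates stand in for a click on a monitor positioned to the left of or above the primary one.

```javascript
// The original check as quoted in the thread: only positive coordinates
// count as a pointer click.
const isInvokedByMouseOriginal = event =>
  event.screenX > 0 || event.screenY > 0;

// The patched check as quoted in the thread: any non-zero coordinate counts.
const isInvokedByMouseFixed = event =>
  event.type === "click" && (event.screenX !== 0 || event.screenY !== 0);

// Illustrative plain objects standing in for real click events.
const clickOnPrimaryMonitor = { type: "click", screenX: 640, screenY: 120 };
const clickOnOffsetMonitor = { type: "click", screenX: -800, screenY: -300 };
const keyboardActivation = { type: "click", screenX: 0, screenY: 0 };

console.log(isInvokedByMouseOriginal(clickOnPrimaryMonitor)); // true
console.log(isInvokedByMouseOriginal(clickOnOffsetMonitor));  // false (the bug)
console.log(isInvokedByMouseOriginal(keyboardActivation));    // false

console.log(isInvokedByMouseFixed(clickOnPrimaryMonitor)); // true
console.log(isInvokedByMouseFixed(clickOnOffsetMonitor));  // true
console.log(isInvokedByMouseFixed(keyboardActivation));    // false
```

Both versions still report a genuine pointer click at exactly (0, 0) as a non-pointer activation, which is the edge case other comments here point out.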
But the code shown doesn't do different stuff for Keyboard vs Mouse, it just checks if it is either one of them. Why would you do that? Which other click event types are there that you want to filter?
Also, if you really want to determine whether a MouseEvent is "real" or "synthetic", and you don't want to worry about when mouse events are triggered relative to keyboard events in the event loop (although it doesn't seem very hard to keep track of), it seems like you can use the current click count (i.e., event.detail). This works on both Chrome and Safari—it's 1 for mouse clicks, and 0 for keyboard "clicks", but the spec text is also a little contradictory and under-specified: the "click" event handler says that "the attribute value MUST be 1 when the user begins this action and increments by 1 for each click" (https://w3c.github.io/uievents/#event-type-click) but it also says "This MUST be a non-negative integer indicating the number of consecutive clicks of a pointing device button within a specific time" (https://w3c.github.io/uievents/#current-click-count), and the definition of "pointing device button" seems to exclude synthetic keyboard events (since those are handled separately)
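A minimal sketch of the click-count approach described above. It assumes the behaviour reported in that comment (a `detail` of 0 for keyboard "clicks" and 1 or more for pointer clicks in Chrome and Safari); as the comment notes, the spec text is under-specified here, so this is a heuristic, not a guarantee. The function name is hypothetical.

```javascript
// Hypothetical helper: treats a click with a click count of 0 as
// keyboard-initiated. Relies on observed browser behaviour, not on a
// firm spec guarantee.
const isKeyboardClickByDetail = event =>
  event.type === "click" && event.detail === 0;

// Plain objects standing in for real events:
console.log(isKeyboardClickByDetail({ type: "click", detail: 0 })); // true
console.log(isKeyboardClickByDetail({ type: "click", detail: 1 })); // false
console.log(isKeyboardClickByDetail({ type: "click", detail: 2 })); // false
```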
https://www.joshtumath.uk/posts/2024-11-18-how-i-refactored-...
Then we wouldn't need the isInvokedByMouse and isInvokedByKeyboard functions.
Is there a better way? Relying on screen coordinates for this is highly dubious and I would argue a hack.
1: https://developer.mozilla.org/en-US/docs/Web/API/UIEvent/det...
https://developer.mozilla.org/en-US/docs/Web/CSS/CSSOM_view/...
Story of my life: finding out that the details that apparently matter when I am debugging have not actually been written in the spec (any spec).
const isInvokedByMouse = event =>
  event.type === 'click' && (event.screenX !== 0 || event.screenY !== 0);
Why do you even have to check if screenX and screenY are non-zero (as opposed to just checking typeof event.screenX == "number")? Wouldn't that mean (and this is a wild edge-case) that if someone positioned their browser window so that the menu was in the top left corner (at position 0,0) the event handler would break again? Is this to block synthetic click events like (<div />).click()? Keyboard events don't have a screenX or screenY from what I remember as well.
Remember that this is on 'click' events. The 'click' event type is a bit of a misnomer: it’s arguably more “activate” than “click”, because (depending a little on platform conventions) it also triggers on Space/Enter if the element is focused. But importantly it’s still a 'click' event: so it’s still a PointerEvent, not a KeyboardEvent. Since various of the fields aren’t appropriate, they get zeroed. So, screenX == 0 && screenY == 0 means either that the pointer is at the top left of the screen, or that the event was not generated by a pointer (that is, by a keyboard).
Try it out yourself, if you like, by loading a URL like this and activating by keyboard and by mouse and comparing the events.
data:text/html,<button onclick=console.log(event)>
In reality, if you used such a check more generally, you’d find it wasn’t such a rare edge case: if the page is fullscreen, corner clicking is actually pretty common, and if you have buttons that are supposed to be in the corner, they should certainly activate on exact-corner clicks. (See also Fitts’s law <https://en.wikipedia.org/wiki/Fitts's_law#Implications_for_U...>.)
Fortunately, there’s a proper fix: `event.pointerId === -1` indicates non-pointer (viz. keyboard) activation.
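A short sketch of that `pointerId` check, assuming the browser dispatches click as a PointerEvent. The function name and the `null` "can't tell" return value are hypothetical additions for browsers where click is still a plain MouseEvent and `pointerId` is absent.

```javascript
// Hypothetical helper. Returns true or false when pointerId is available,
// and null when the event does not carry one (so the caller can fall back
// to some other behaviour instead of guessing).
function isKeyboardActivation(event) {
  if (typeof event.pointerId !== "number") return null;
  return event.pointerId === -1;
}

// Plain objects standing in for real events:
console.log(isKeyboardActivation({ type: "click", pointerId: -1 })); // true
console.log(isKeyboardActivation({ type: "click", pointerId: 1 }));  // false
console.log(isKeyboardActivation({ type: "click" }));                // null
```

Unlike the coordinate check, this does not misread a real click in the exact corner of the screen.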
In the article the author says that the issue is that the same function is handling both events, and they will work on refactoring it to something better.
The normal approach is just to have different functions answering to different events, or to use more precise information about the event [1] instead of a heuristic.
[1] A suggestion was made by this poster: https://news.ycombinator.com/item?id=42174436
Who knows, they probably broke the menu for keyboard navigation, voice navigation, eye tracking or something like that. This is one of those cases where you could really "make it make sense" by just using something CSS based.
Actually not just 101, it's basically with all of us at all levels and for life. So they're in good company having made a mistake everyone makes all the time, but it was a mistake on their part, not a bug in WebKit, nor even an "interoperability issue" in WebKit or any browser.
They say they weren't aware that negative values were possible and that different browsers produce different values.
Ok, but neither of those matters.
If the function is even allowed to contain or express a negative value (i.e. right at the lowest basic level, is the returned data type actually a uint, or is it anything else? a regular int? a string?) then negative values were always a possibility even if you personally never saw one before. Saying "I didn't expect a number below 0" is barely any different from saying "I didn't expect a number above 10000".
The discrepancy between browsers doesn't matter and it isn't the browser's fault that it tripped you up. You just made a standard boring unsafe assumption like every other programmer ever.
The entire problem is that you cared about something you don't actually care about.
You assumed that there was meaning in the absolute position of the window or the mouse pointer, when there never was, and you don't actually care about those anyway. The absolute position is like the actual internal-only row number in a db. Every row has a unique one, but it's none of your business what it is. There is only meaning in its position relative to something else like a button, or relative to its previous position to track movement.
Similarly checking for 0,0 and assuming that means keyboard is just another false heuristic that doesn't actually prove any such thing. The specs may or may not promise that the value will be 0,0 in the event of a keyboard initiated click, but no way it says that it can't be 0,0 any other way.
Don't be ashamed of this error because it's common, but don't be proud of calling these WebKit or browser interoperability bugs.
Do write up and publish the experience though as a warning and lesson to new developers about assumptions and heuristics and relying on side effects that just happen to work on the developers laptop when they tried it once and shipped it.
Also "it's for accessibility" doesn't change anything. Trying to be too smart just makes it worse. Actually that's true just generally for everything.
Were they using those values in their code?
Very interesting article but I'm missing the step where it would impact their code ...
(() => {
  const logEvent = event => console.log({
    coords: [event.screenX, event.screenY],
    type: event.type
  });
  const input = document.querySelector("textarea");
  // use "keydown" instead of "keypress" to detect all keyboard input instead of just character producing input
  input.addEventListener("keydown", logEvent);
  input.addEventListener("click", logEvent);
})();
Type in or click on the reply text input and you'll see that the coords array holds undefined values for all keyboard events. I haven't tried the equivalent on a touch device, however, so I'm not sure how it's handled there.
Bear in mind the following:
> Also, thank you to Patrick H. Lauke from TetraLogical (editor of the Pointer Events Level 2 spec) for his comment on Mastodon that suggested improving the offending code by checking for pointerType in the PointerEvent interface instead of screenX and screenY.
The code provided in the second post[0] uses pointerType if available, else falls back to checking for (0,0). Also:
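For readers who don't follow the link, the general shape of "use pointerType if available, else fall back to (0,0)" might look like the sketch below. This is not the code from the linked post; the function name is hypothetical, and it assumes that browsers dispatching click as a PointerEvent report an empty `pointerType` for non-pointer activation.

```javascript
// Hypothetical sketch, not the code from the linked post.
function isInvokedByPointer(event) {
  if (typeof event.pointerType === "string") {
    // Assumption: "" means the click did not come from a pointing device;
    // "mouse", "touch" and "pen" mean it did.
    return event.pointerType !== "";
  }
  // Fallback heuristic for browsers where click is a plain MouseEvent:
  // a (0, 0) position is taken to mean keyboard activation.
  return event.screenX !== 0 || event.screenY !== 0;
}

// Plain objects standing in for real events:
console.log(isInvokedByPointer({ pointerType: "mouse", screenX: 0, screenY: 0 })); // true
console.log(isInvokedByPointer({ pointerType: "", screenX: 0, screenY: 0 }));      // false
console.log(isInvokedByPointer({ screenX: -800, screenY: -300 }));                  // true
console.log(isInvokedByPointer({ screenX: 0, screenY: 0 }));                        // false
```

The fallback branch keeps the known weakness of the coordinate check, but only on browsers that offer nothing better.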
> At the time this code was originally written four years ago, not all browsers treated click events as PointerEvents. They used the MouseEvent interface, so pointerId or pointerType wasn't an option.
[0]: https://www.joshtumath.uk/posts/2024-11-18-how-i-refactored-...
Can you believe that every app has a team of people who just maintain the app's code?
You don't need any javascript to provide a usable web experience. In fact, you are more likely to break usability with it.