Multi-touch API

Pavel Safrata pavel.safrata at oracle.com
Thu May 24 00:48:22 PDT 2012


Hello,
we are at the end of a long journey to introduce initial multi-touch 
support. The API has been sitting in the 2.2 repository for some time 
already, but we haven't received much feedback, so we've decided to make 
a last call for concerns that might result in API changes before it goes 
public. We are already past feature freeze, though, so the window is 
going to be quite small. A summary of the API follows.

Touch actions produce three types of events: MouseEvents, 
GestureEvents and TouchEvents. All the events are delivered 
simultaneously and independently.


* Mouse events *

Single touches are translated to normal MouseEvents, so existing 
applications should work fine on a touch screen. Sometimes you may need 
to identify the mouse events synthesized from the touch screen and 
handle them differently; for that purpose they have a new isSynthesized 
flag.
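
For illustration, a handler might branch on the flag roughly like this 
(a sketch only; "node" stands for any Node, and Java 8 lambda syntax is 
used for brevity):

node.setOnMousePressed(event -> {
    if (event.isSynthesized()) {
        // synthesized from a touch screen - let the touch/gesture
        // handlers take care of it
        return;
    }
    // handle a "real" mouse press here
});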


* Gesture events *

Gesture events are: ScrollEvent, RotateEvent, ZoomEvent, SwipeEvent. 
They are generated by both touch screen and trackpad. Basic common 
characteristics are:
  * each event has coordinates; for trackpad events the mouse 
coordinates are used, for touch screen events the center point of all 
the touches is used
  * each event has modifiers information
  * each event has an isDirect flag that distinguishes between direct 
events produced by touching directly on the touch screen and indirect 
events produced by a trackpad or mouse.

The ScrollEvent, RotateEvent and ZoomEvent are continuous. They have 
event types "started", "performed" and "finished". The "started" event 
comes when the gesture is detected; then the "performed" events are 
delivered, containing delta values (change since the previous event) and 
total delta values (change since the gesture start). The "finished" 
event is delivered when the gesture finishes (the touches are released). 
After that, further "performed" events may be delivered with the 
isInertia flag set to true. The whole gesture is delivered to a single 
node, the one picked at the gesture coordinates at the time of the 
gesture start.
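
A sketch of the three-phase pattern, using scrolling as an example (the 
opacity change is just an illustration of per-gesture feedback):

node.setOnScrollStarted(event -> node.setOpacity(0.7));   // gesture detected
node.setOnScroll(event -> {
    if (event.isInertia()) {
        return;   // deltas still arriving after the touches were released
    }
    node.setTranslateX(node.getTranslateX() + event.getDeltaX());
    node.setTranslateY(node.getTranslateY() + event.getDeltaY());
});
node.setOnScrollFinished(event -> node.setOpacity(1.0));  // touches released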

The SwipeEvent is a one-time event. When all the touch points involved 
in a gesture are pressed, moved in the same direction and released, we 
recognize the gesture and deliver it as a single SwipeEvent (containing 
the swipe direction). Note that the described action also produces 
ScrollEvents; they are not exclusive with swipe.

The gestures specifically:

ScrollEvent has types SCROLL_STARTED, SCROLL, SCROLL_FINISHED. The 
started/finished notifications are generated only by touch gestures; the 
mouse wheel still generates only the one-time SCROLL event. In addition 
to the existing deltaX and deltaY fields it has totalDeltaX and 
totalDeltaY (those contain zeros for mouse wheel scrolling). There is 
also a new field touchCount that specifies how many touch points are 
used for the gesture (a new gesture is started each time the touch count 
changes).
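
A minimal sketch that just reads the new fields:

node.setOnScroll(event -> {
    // deltaX/deltaY: change since the previous event
    // totalDeltaX/totalDeltaY: change since SCROLL_STARTED (zeros for the mouse wheel)
    System.out.println(event.getTouchCount() + " touch point(s), total scroll so far: "
            + event.getTotalDeltaX() + ", " + event.getTotalDeltaY());
});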

ZoomEvent has types ZOOM_STARTED, ZOOM, ZOOM_FINISHED. It contains 
zoomFactor and totalZoomFactor. The values are to be multiplied with the 
node's scale - values greater than one mean zooming in, values between 
zero and one mean zooming out.
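
Applying the factor is typically a one-liner per axis, for example:

node.setOnZoom(event -> {
    // the factor multiplies the current scale; > 1 zooms in, < 1 zooms out
    node.setScaleX(node.getScaleX() * event.getZoomFactor());
    node.setScaleY(node.getScaleY() * event.getZoomFactor());
});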

RotateEvent has types ROTATION_STARTED, ROTATE, ROTATION_FINISHED. It 
contains angle and totalAngle. The angles are in degrees and are meant 
to be added to the node's rotation - positive values mean clockwise 
rotation.
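
For example:

node.setOnRotate(event -> {
    // the angle is in degrees and is added to the current rotation (positive = clockwise)
    node.setRotate(node.getRotate() + event.getAngle());
});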

SwipeEvent has types SWIPE_LEFT, SWIPE_RIGHT, SWIPE_UP, SWIPE_DOWN and 
contains a touchCount field. On a touch screen it is delivered to the 
node picked at the center point of the entire gesture.
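
A typical use is page flipping; nextPage() and previousPage() below are 
hypothetical application methods:

node.setOnSwipeLeft(event -> nextPage());       // hypothetical helper
node.setOnSwipeRight(event -> previousPage());  // hypothetical helper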


* Touch events *

TouchEvents can be used to track all the individual touch points. They 
are delivered for touch screen actions only; a trackpad doesn't produce 
them.

Each event carries a touch point (representing one pressed finger) and 
references to all the other touch points. This design allows handling 
and consuming each finger separately while making it possible to 
encapsulate handling of more complex multi-touch gestures in which not 
all touch points need to be over the handling node. At any moment of a 
multi-touch action we have a set of touch points; for each of those 
touch points we create one touch event. This group of events is called 
an "event set" and is marked by a common eventSetId number. All touch 
events from the set carry the same list of touch points; each of them 
carries a different one as its "main" touch point.
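
A rough sketch of walking through an event set from a handler (TouchPoint 
is javafx.scene.input.TouchPoint):

node.setOnTouchMoved(event -> {
    TouchPoint main = event.getTouchPoint();   // the point this particular event is "about"
    System.out.println("set " + event.getEventSetId() + ": point " + main.getId()
            + " at " + main.getX() + ", " + main.getY());
    for (TouchPoint tp : event.getTouchPoints()) {
        // all touch points of the multi-touch action, including the main one
        System.out.println("  point " + tp.getId() + " is " + tp.getState());
    }
});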

Each touch point has a state (PRESSED, MOVED, STATIONARY, RELEASED), 
coordinates, an id (unique within the scope of a gesture) and a target 
(the node to which this touch point's event is delivered). A 
belongsTo(node) method makes it possible to test whether another touch 
point is delivered to the given target node (including bubbling).
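
For example, counting the fingers currently delivered to a node could 
look roughly like this:

node.setOnTouchPressed(event -> {
    int onNode = 0;
    for (TouchPoint tp : event.getTouchPoints()) {
        if (tp.belongsTo(node)) {   // delivered to this node, directly or through bubbling
            onNode++;
        }
    }
    System.out.println(onNode + " touch point(s) delivered to this node");
});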

The event has types TOUCH_PRESSED, TOUCH_MOVED, TOUCH_STATIONARY, 
TOUCH_RELEASED, corresponding to the state of its touch point. The 
events also contain touchCount and modifiers.

By default, each touch point is delivered to a single node during its 
whole trajectory, similarly to mouse dragging and gestures. This 
behavior is great for dragging nodes and is consistent with the rest of 
our events, but sometimes you want it a different way - in that case it 
can be altered using the grabbing API. The touch point provides the 
methods ungrab() (from this call on the touch point will always be 
delivered to the node picked under it), grab(node) (from this call on 
the touch point will always be delivered to the given node) and grab() 
(from this call on the touch point will be grabbed by the current 
source - the node whose handler calls it). In other words, 
grabbing/ungrabbing determines where to deliver each touch point, and by 
default each newly pressed touch point is automatically grabbed by the 
node picked for it.
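
A sketch of the three options (otherNode is a hypothetical placeholder):

node.setOnTouchPressed(event -> {
    TouchPoint tp = event.getTouchPoint();
    tp.ungrab();   // from now on deliver this point to whatever node is picked under it
    // or: tp.grab(otherNode);   // always deliver it to otherNode (hypothetical)
    // or: tp.grab();            // grab it for the current source, i.e. this node
});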


* Notes *

There is one thing that makes existing apps behave incorrectly. When you 
drag one finger over the touch screen, it generates both mouse dragging 
and scrolling. We have to deliver both of them for the majority of 
applications to work correctly. The few nodes that handle both of those 
events (ScrollBar is a typical example) usually don't work well and need 
to be updated (to ignore synthesized mouse events, for instance).
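
A node handling both events might resolve the conflict roughly like this:

node.setOnMouseDragged(event -> {
    if (event.isSynthesized()) {
        return;   // the same finger movement is already delivered as ScrollEvents
    }
    // handle real mouse dragging here
});
node.setOnScroll(event -> {
    // handle scrolling here (touch gesture or mouse wheel)
});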

For a classic application it should be enough to use mouse events and 
gestures. For a touch-screen-only application that wants to do some 
complex multi-touch logic, TouchEvents provide all the power. Developing 
an application that uses all kinds of events requires considerable care 
to avoid conflicting handling of user actions.

We tried to stick with the native behavior as much as possible so that 
users get what they are used to. Where the underlying platform knows a 
gesture, we use the native recognition (elsewhere we have our own). We 
produce inertia for gestures based on the native support on each 
platform (for instance, if a platform doesn't support zooming inertia at 
all, we don't generate it there). The goal is to make the same 
application behave as native-like as possible on all platforms while 
minimizing the developer's need to consider the differences.

The gestures can generally be used without considering the differences 
between touch screen and trackpad. The most significant exception is 
touchCount on SwipeEvent. On a touch screen you can generate a swipe 
with any number of touch points. On a trackpad one-finger swipes make no 
sense (they are used just to move the mouse cursor). On Mac in 
particular we use the native swipe recognition, which generates only 
swipes with three or more fingers. So an application that uses 
touchCount on SwipeEvent will probably need different handling for 
direct and indirect events (which such complex applications will likely 
do anyway).
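
Such an application might branch on isDirect, for example:

node.setOnSwipeLeft(event -> {
    if (event.isDirect()) {
        // touch screen: any touch count is possible
        System.out.println("direct swipe with " + event.getTouchCount() + " finger(s)");
    } else {
        // trackpad: the touch count depends on the platform's recognizer
        System.out.println("indirect swipe");
    }
});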


This is basically what we have in 2.2. Are there any objections?
Thanks,
Pavel

