Mixing 2D and 3D

Richard Bair richard.bair at oracle.com
Mon Jul 22 17:19:05 PDT 2013


Hi August,

I will attempt some kind of answer, although what we end up doing has a lot to do with where the community wants to take things. Some of these things we've talked about, some we haven't, and the conversation is needed and good.

> Whatever the use case will be JavaFX 3D will be measured against the business needs and certainly against the capabilities of the latest releases of the native 3D APIs Direct3D (11.1) and OpenGL (4.3). This might be unfair or not helpful, but it is legitimate because they are the upper limits.
> 
> So, even an object-oriented scene-graph-based 3D implementation should in principle be able to benefit from almost all native 3D features and should provide access to them. If the JavaFX architecture enables this approach then a currently minor implementation state will be rather accepted.
> Potential features are currently determined and limited by JavaFX' underlying native APIs Direct3D 9 and OpenGL ES 2. Correct?

Java 8 doesn't support Windows XP, so in theory we can start taking advantage of DirectX 10+. At this time we are limited to OpenGL ES 2 for the sake of mobile and embedded. Also the cost of supporting multiple pipelines is significant. I think the first thing we would need to work through, before we could support DX 10, 11+, is to make it as easy as we can to write & maintain multiple pipelines.

So we have been looking at moving off of DX 9, but not off of ES 2 (until ES 3 is widespread).

> - core, e.g.: primitives (points, lines, line-strip, triangle-strip, patches, corresponding adjacency types), vertex attributes, double-precision, shaders (vertex, tessellation, geometry, fragment, compute), 3D-/cubemap-texture, multitextures, compressed texture, texture properties, multipass rendering, occlusion queries, stencil test, offscreen rendering into image, multiple active cameras, stereo?

I think some of these are relatively easy -- a line strip or triangle strip, for example, could be Mesh subclasses?
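
For what it's worth, until a dedicated strip type exists a triangle strip can be unrolled into an ordinary TriangleMesh. A minimal sketch against the final JavaFX 8 observable-array API (the point values are arbitrary):

    import javafx.scene.shape.MeshView;
    import javafx.scene.shape.TriangleMesh;

    // Strip vertices (v0, v1, v2, v3) unrolled into the triangles
    // (v0, v1, v2) and (v2, v1, v3), which is what a strip means.
    TriangleMesh mesh = new TriangleMesh();
    mesh.getPoints().addAll(
            0,   0,   0,    // v0
            0,   100, 0,    // v1
            100, 0,   0,    // v2
            100, 100, 0);   // v3
    mesh.getTexCoords().addAll(0, 0);    // a single dummy texCoord
    mesh.getFaces().addAll(
            0, 0,  1, 0,  2, 0,          // face (v0, v1, v2)
            2, 0,  1, 0,  3, 0);         // face (v2, v1, v3), winding kept consistent
    MeshView view = new MeshView(mesh);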

We need to talk about how to handle shaders. I agree this is something that is required (people must have some way of supplying a custom shader). This will require work in Prism to support, but conceptually seems rather fine to me.
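
To make the design question concrete, here is a purely hypothetical sketch of what handing Prism a custom shader might look like -- none of these classes or methods exist, and the names are invented:

    // HYPOTHETICAL -- nothing like this exists in JavaFX today; names are made up.
    // The application supplies per-pipeline shader source and Prism compiles
    // whichever variant matches the active pipeline (HLSL for D3D, GLSL for ES 2).
    public final class ShaderMaterial {
        private final String glslSource;   // used by an OpenGL / ES 2 pipeline
        private final String hlslSource;   // used by a D3D pipeline

        public ShaderMaterial(String glslSource, String hlslSource) {
            this.glslSource = glslSource;
            this.hlslSource = hlslSource;
        }

        public String getGlslSource() { return glslSource; }
        public String getHlslSource() { return hlslSource; }
    }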

I think different texture types could be supported by extending the API in Image. The worst case would be to introduce a Texture class and, if possible, set up some kind of relationship between an Image and a Texture (since images are just textures, after all).
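
For reference, today an Image already flows in as texture data through PhongMaterial, which is presumably where a richer Texture type would slot in (the file names below are placeholders resolved from the classpath):

    import javafx.scene.image.Image;
    import javafx.scene.paint.PhongMaterial;
    import javafx.scene.shape.Box;

    // Today an Image is already consumed as a texture by the material.
    PhongMaterial material = new PhongMaterial();
    material.setDiffuseMap(new Image("brick.png"));       // image used as the diffuse texture
    material.setBumpMap(new Image("brick_normal.png"));   // the normal map is also just an Image

    Box wall = new Box(200, 100, 10);
    wall.setMaterial(material);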

The harder cases are things like multi-pass rendering. Basically, features where Prism is in control and picks a rendering strategy for you are relatively straightforward to think about. But giving a hook that allows the developer to pick the rendering strategy is quite a bit more complicated. I was reading up on order-independent transparency algorithms and thinking "how would this be exposed in API?". I haven't had any good brain-waves on that yet.

I think we already do multiple active cameras?
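
At least in the sense that each SubScene carries its own camera, so several can be active at once -- something like:

    import javafx.scene.Group;
    import javafx.scene.PerspectiveCamera;
    import javafx.scene.SceneAntialiasing;
    import javafx.scene.SubScene;
    import javafx.scene.layout.HBox;
    import javafx.scene.shape.Box;
    import javafx.scene.shape.Sphere;

    // Two sub scenes, each with its own active camera, shown side by side.
    SubScene left = new SubScene(new Group(new Box(50, 50, 50)),
            400, 300, true, SceneAntialiasing.BALANCED);
    left.setCamera(new PerspectiveCamera(true));

    SubScene right = new SubScene(new Group(new Sphere(30)),
            400, 300, true, SceneAntialiasing.BALANCED);
    right.setCamera(new PerspectiveCamera(true));

    HBox sideBySide = new HBox(left, right);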

> - APIs, e.g.: user input for scene navigation and model interaction (keyboard, mouse, touchpad/screen), geometry utilities, skinning, physics engine interface (kinematics), shadows

Key / mouse / touch / etc. should be there already?

Skinning and physics are both interesting. Skinning and boning is interesting because there are different ways to go about it. At last JavaOne we did this with special shaders, all in hardware. Ideally the custom shader support mentioned above (which doesn't exist yet) would give a way to do this. There is a balance, I think, between what we want to provide built-in and what 3rd parties such as yourself could layer above the basic support with additional libraries.

For example, physics. I don't think we're going to add a physics engine of our own (probably ever), but it should be relatively easy for a framework to be built on top of JavaFX that did so. I was reading over the Unity docs, for example, to get a taste for how they set up their system, and was thinking that a GameObject is basically a type of Parent which has the ability to take multiple "Components", including a MeshView. So a "Unity-like" API could be built on top of FX Nodes, including "Unity-like" physics engine integration. The core thing for our 3D support, I think, is to expose enough of the drawing system so that additional libraries can be built on top. Obviously that will include needing a way to let developers set shaders and manipulate various state.
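
To sketch what I mean (purely illustrative; GameObject, Component and SpinComponent are made-up names for such a third-party layer, not anything we'd ship):

    import java.util.ArrayList;
    import java.util.List;
    import javafx.scene.Group;
    import javafx.scene.shape.MeshView;

    // Hypothetical third-party layer, not JavaFX API: a Unity-style GameObject
    // modeled as a Group (a Parent) that aggregates "components".
    interface Component {
        void update(double deltaSeconds);   // called once per frame by the library
    }

    class GameObject extends Group {
        private final List<Component> components = new ArrayList<>();

        GameObject(MeshView view) {
            getChildren().add(view);        // the renderable part is just a child Node
        }

        void addComponent(Component c) { components.add(c); }

        void update(double deltaSeconds) {  // driven by e.g. an AnimationTimer
            for (Component c : components) c.update(deltaSeconds);
        }
    }

    // Example component: a trivial spin behavior.
    class SpinComponent implements Component {
        private final GameObject target;
        SpinComponent(GameObject target) { this.target = target; }
        public void update(double deltaSeconds) {
            target.setRotate(target.getRotate() + 90 * deltaSeconds);
        }
    }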

Right now we just expose "depthTest", but we don't differentiate testing from depth-writing. The more GL / D3D specific functionality we expose in API the more treacherous the waters become. I think this is the main design problem to be navigated.

> Will JavaOne 2013 highlight the JavaFX 3D strategy extensively?

I don't expect so. We'll show 3D but you're getting the cutting edge information now :-).

> II. Mixing 2D and 3D
> - "to tease apart the scene graph into Node, Node3D, and NodeBase ... doesn't work"
> - "we keep the integrated scene graph as we have it",
> 
> So, all current and future 3D leaf and branch nodes are/will be derived from Node, which is primarily designed for use in a 2D/2.5D scene graph.

Yes.

> Alea iacta est!

I guess so; Google Translate failed me, though, so I'm not sure :-D

> Various Node's properties and methods seem not to be relevant or applicable for 3D nodes (e.g. Shape3D). Are you going to communicate which of these several hundreds (see below) should be used preferably, have a different semantic, should not be used or aren't supported for 3D nodes?

Sure, though actually I think it is not so dire. Most of the APIs are applicable to both; there are only a few that are *strictly* 2D in nature (clip, effect, blend mode, cache). Cache, at least, we can ignore in a 3D world, and blend mode might be applicable even in 3D (though with different semantics).

> - "So the idea is that we can have different pipelines optimized for 2D or 3D rendering"
> Would be great!

The more I think of it, the more I think this is necessary. If done right, we can make the pipeline abstraction simple and straightforward enough that we could also have a DX 11 pipeline, an ES 3 pipeline, an NVIDIA-specific DX 11 pipeline, etc. I'm no doubt wildly optimistic, but if possible, it would mean we could optimize for specific configurations. Testing those configurations will be a nightmare, though, so even if we could write them all, supporting them all is a monumental effort unless we nail down a sensible testing strategy.

> - Scene - Scene2D/SubScene2D - Scene3D/SubScene3D
> - "if you put a 2D rectangle in a Scene3D/SubScene3D"
> 
> What is the use case for rendering 2D pixel-coordinates-based shapes (or even controls) within a 3D scene consisting of meshes constructed on 3D coordinates of arbitrary unit?

I think if we had this split, I would treat 2D shapes as if they were 3D shapes but without depth. So a Rectangle is just a box with depth=0 (assuming a box with depth=0 would even be drawn; I don't know what we do now). So I would expect 2D shapes going through the 3D pipeline to be tessellated (if it is a path or rounded rect) and to interact just like any other geometry. Maybe it wouldn't be used very much, but unless we completely split the scene graphs (such that Node3D existed and did not extend from Node at all), we have to make sense of them somehow.
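
You can get a feel for this already today, since a Rectangle dropped next to 3D geometry transforms like a flat piece of that geometry:

    import javafx.scene.Group;
    import javafx.scene.shape.Rectangle;
    import javafx.scene.shape.Sphere;
    import javafx.scene.transform.Rotate;

    // A 2D Rectangle behaving as depth-0 geometry next to a real 3D shape.
    Rectangle panel = new Rectangle(200, 100);
    panel.setRotationAxis(Rotate.Y_AXIS);   // rotate the flat shape out of the XY plane
    panel.setRotate(45);
    panel.setTranslateZ(50);

    Sphere ball = new Sphere(40);
    Group mixed = new Group(panel, ball);   // both live in the same scene graph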

ImageView and MediaView are both good examples of "2D" nodes which are also 3D nodes.

Controls *could* make sense in 3D if we had normals and bump-mapping. The 3D pipeline could use that additional information to give some 3D aspect to the controls. I could imagine a 3D radio button or button or checkbox being wanted for a game's settings page.

I suspect most of the time, people will use a SubScene, put their 2D controls in this, and then put the SubScene into the 3D world.
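
Something along these lines (sizes and positions are arbitrary, and this assumes it runs inside an Application on the FX application thread):

    import javafx.scene.Group;
    import javafx.scene.PerspectiveCamera;
    import javafx.scene.SceneAntialiasing;
    import javafx.scene.SubScene;
    import javafx.scene.control.Button;
    import javafx.scene.layout.VBox;
    import javafx.scene.shape.Box;

    // The 2D controls live in an ordinary SubScene...
    VBox controls = new VBox(new Button("Start"), new Button("Options"));
    SubScene hud = new SubScene(controls, 200, 100);

    // ...and that SubScene is then just another node in the 3D world.
    hud.setTranslateZ(-200);
    Group worldRoot = new Group(new Box(100, 100, 100), hud);
    SubScene world = new SubScene(worldRoot, 800, 600,
            true, SceneAntialiasing.BALANCED);
    world.setCamera(new PerspectiveCamera(true));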

> Couldn't the pipelines be even better optimized if neither 2D nodes are rendered in SubScene3D nor 3D nodes are rendered in SubScene2D?

I don't think so, if you treat a 2D node as just a really thin 3D node in terms of how the pipeline sees things. If we try to AA stroke rectangles etc as we do today, then yes, I think it would be quite a pain for a 3D pipeline to deal with 2D nodes. But the only practical way to avoid it would be to have completely different graphs, and I think most of the state in Node would also be wanted in 3D, so that seems like a waste.

> blendMode

Potentially 2D only, although you can set blend modes in GL, so if its semantics were redefined for a 3D pipe, it could make sense there as well?

> boundsInLocal, boundsInParent,

Bounds objects are 3D bounding volumes, and the API is designed such that we could produce BoundingRectangles or BoundingSpheres or whatever else we wanted to.
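
For example, with what's there today:

    import javafx.geometry.Bounds;
    import javafx.scene.shape.Box;

    Box box = new Box(100, 50, 25);
    Bounds b = box.getBoundsInLocal();   // already a 3D axis-aligned volume
    double depth = b.getDepth();         // 25.0 -- the Z extent is right there
    double minZ  = b.getMinZ();          // -12.5
    double maxZ  = b.getMaxZ();          //  12.5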

> cacheHint, cache

A 3D pipeline could just ignore this. Otherwise it must cause flattening in 3D which is kind of nasty.

> clip

Bad 2D only property :-(

> cursor, depthTest

Both great API for 3D as well!

> disabled, disable

Not as commonly used with 3D apps, perhaps, but it doesn't hurt. These states don't actually do anything, except change CSS pseudo-class states, which in the case of controls end up with different rendering. And it might turn off mouse picking of the nodes involved as well (but I don't think so).

> effectiveNodeOrientation

This (and some related properties / methods) is useful for blended 2D / 3D applications. Some state you set on a parent and it affects the parent and all children. If those children are nested 2D / 3D scenes, you'd like this state to fall down into those as well. So in a world where you are doing 3D but want to embed a 2D sub scene, being able to define this property even in a 3D scene makes sense.

> effect

2D only (I think).

> eventDispatcher, focused, focusTraversable, hover, id, inputMethodRequests, layoutBounds

All fine or necessary for 3D. Focused doesn't hurt.

> layoutX, layoutY

We might add a layoutZ. We talked about layout management for 3D, and I could imagine times when you might want such a thing. If the time ever came, we could add layoutZ. But it doesn't hurt to have these additional transform properties (other than the memory weight per node, but I'm sorry to say we've crossed the line on that one and can't go back).
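
In the meantime, Z positioning just goes through the transform properties:

    import javafx.scene.shape.Box;

    Box node = new Box(40, 40, 40);
    node.setLayoutX(100);       // 2D layout position, harmless for a 3D node
    node.setLayoutY(50);
    node.setTranslateZ(-300);   // there is no layoutZ today, so Z goes through translateZ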

> localToParentTransform, localToSceneTransform, managed, mouseTransparent, nodeOrientation

All fine.

> onContextMenuRequested, onDragDetected, onDragDone, onDragDropped, onDragEntered, onDragExited, onDragOver, onInputMethodTextChanged, onKeyPressed, onKeyReleased, onKeyTyped, onMouseClicked, onMouseDragEntered, onMouseDragExited, onMouseDragged, onMouseDragOver, onMouseDragReleased, onMouseEntered, onMouseExited, onMouseMoved, onMousePressed, onMouseReleased, onRotate, onRotationFinished, onRotationStarted, onScrollFinished, onScroll, onScrollStarted, onSwipeDown, onSwipeLeft, onSwipeRight, onSwipeUp, onTouchMoved, onTouchPressed, onTouchReleased, onTouchStationary, onZoomFinished, onZoom, onZoomStarted

All of these you want for 3D too.

> opacity

If redefined in 3D, then this doesn't have to be a 2D only operation.

> parent, pickOnBounds, pressed, rotate, rotationAxis, scaleX, scaleY, scaleZ, scene, style, translateX, translateY, translateZ, visible

All good.

I trimmed this next list down so it doesn't have the getter / setter / property methods for the properties already discussed above.

> addEventFilter, addEventHandler, buildEventDispatchChain, fireEvent, removeEventFilter, removeEventHandler

All good (more event handling infrastructure)

> autosize, maxHeight, maxWidth, minHeight, minWidth, getBaselineOffset, prefHeight, prefWidth, relocate, resize, resizeRelocate

These are layout methods and while not particularly useful for 3D, they aren't dangerous or hard to define.

> getClassCssMetaData, getCssMetaData, getPseudoClassStates, pseudoClassStateChanged

CSS for 3D is … unorthodox. It might end up being useful (setting the font for 3D extruded text from CSS?), or not. But like the "component orientation" stuff, you really want CSS to be for the whole tree in cases of embedded 2D in 3D in 2D etc.

> computeAreaInScreen, contains, contains, intersects, intersects, localToParent, localToParent, localToParent, localToParent, localToParent, localToScene, localToScene, localToScene, localToScene, localToScene, localToScreen, localToScreen, localToScreen, localToScreen, localToScreen, parentToLocal, parentToLocal, parentToLocal, parentToLocal, parentToLocal, sceneToLocal, sceneToLocal, sceneToLocal, sceneToLocal, sceneToLocal, screenToLocal, screenToLocal, screenToLocal

All coordinate space transformation methods. All good for 2D / 3D.
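
They already have the 3D overloads, e.g.:

    import javafx.geometry.Point3D;
    import javafx.scene.shape.Sphere;

    Sphere sphere = new Sphere(50);
    sphere.setTranslateZ(300);

    Point3D inScene = sphere.localToScene(0, 0, 0);   // local origin in scene coordinates
    Point3D back    = sphere.sceneToLocal(inScene);   // round-trips to (0, 0, 0)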

> getProperties, getUserData, hasProperties

Random goodness, not 2D specific.

> lookup, lookupAll

Basic graph methods (which also require some CSS support), and highly useful in 3D as well as 2D I suspect.

> requestFocus

OK (actually needed to give key events to a specific 3D node)
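
For example, to get key events onto a 3D node today:

    import javafx.scene.shape.Box;

    Box box = new Box(60, 60, 60);
    box.setFocusTraversable(true);
    box.setOnKeyPressed(e -> box.setRotate(box.getRotate() + 5));
    box.requestFocus();   // without focus the key events never reach the 3D node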

> snapshot, snapshot

Should work in 3D just fine

> startDragAndDrop, startFullDrag

Needed for drag and drop gestures in 3D

> toBack, toFront

OK

> usesMirroring

OK.

So really I think there are only 2-4 properties that are problematic for 3D; otherwise everything is pretty clearly applicable, so having a fully integrated 2D / 3D scene graph seems OK. The only problem is defining how to interpret some properties and how to draw everything. The way to proceed differs between a 2D app and a 3D app, and that's what draws me towards having a Scene and Scene3D, a SubScene and SubScene3D. This makes it explicit which kind of rendering pipeline you want (2D or 3D), and that allows us to tailor our rendering strategy accordingly. I suspect we could do just as well with only Scene and SubScene, either using the depthBuffer flag as the key or adding another property by which we know what kind of pipeline to use. But I like the idea of something more explicit than just the depth buffer (in fact, the Web now has the ability, with the preserve-3d CSS attribute, to have a 'depth buffer' in a 2.5D scene in a way that makes sense), so I think we should not use the depthBuffer flag as the way to decide whether to render the 2D way or the 3D way.
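
For reference, today that choice is buried in the Scene / SubScene constructor (a Scene3D would make the intent explicit; it doesn't exist yet):

    import javafx.scene.Group;
    import javafx.scene.PerspectiveCamera;
    import javafx.scene.Scene;
    import javafx.scene.SceneAntialiasing;
    import javafx.scene.shape.Sphere;

    // Today: the depthBuffer flag plus the camera are the only signals that
    // this scene is meant to be "3D".
    Group root = new Group(new Sphere(75));
    Scene scene = new Scene(root, 800, 600,
            true,                          // depthBuffer
            SceneAntialiasing.BALANCED);   // antialiasing for the 3D content
    scene.setCamera(new PerspectiveCamera(true));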

Richard


