JavaFX 3D
Richard Bair
richard.bair at oracle.com
Fri May 25 13:10:27 PDT 2012
I've done a lot more thinking on this and am really coming back to thinking that it is just best to have a single unified scene graph. Either way (whether or not we have basically a 2D/2.5D scene graph and a separate 3D scene graph) we have to deal with the fact that in the 2.5D world we flatten under certain circumstances. Also, by having two scene graphs we complicate matters by needing adapters for putting 3D into 2.5D and vice versa. We also have to deal with API drift, etc. Overall I think that would end up making it all more complicated.
Instead, I'm thinking we would benefit from just fitting 3D into the existing scene graph, and carefully documenting that "opacity", "blendMode", etc. will end up "flattening" the subtree (they only matter on a Parent; on leaves there is no weirdness). If we make this very clear, then I think it will be better to just have a unified scene graph.
I have some ideas and some questions. These thoughts are all my own (having bounced them off Jasper), so I'm not saying this is the "way it is"; rather, here's what I'm thinking now, and of course we'll refine, throw some things out, and include other things as we go through the discussion here.
Making it 3D:
Can we just turn on depthTest by default and set the camera to a perspective camera by default? My main concern is that presently, when anybody does a 3D transform or drops a 3D node into the scene graph, it simply doesn't work. Instead they have to enable the depth buffer at the Scene level, then turn depthTest off on the root node, and then turn it on again wherever it is needed. And they have to set the perspective camera.
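For reference, here's roughly the boilerplate a developer has to write today just to get a 3D transform rendering correctly (a minimal sketch using the current API; the node names and sizes are made up):

    import javafx.scene.DepthTest;
    import javafx.scene.Group;
    import javafx.scene.PerspectiveCamera;
    import javafx.scene.Scene;

    Group content3D = new Group();            // the subtree that actually needs 3D
    Group root = new Group(content3D);
    // the depth buffer has to be requested when the Scene is constructed
    Scene scene = new Scene(root, 800, 600, true);
    // a perspective camera is not the default and must be set explicitly
    scene.setCamera(new PerspectiveCamera());
    // depth testing is inherited, so it is turned off for the 2D root...
    root.setDepthTest(DepthTest.DISABLE);
    // ...and turned back on only for the subtree that needs it
    content3D.setDepthTest(DepthTest.ENABLE);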
I don't understand the intricacies well enough, but I would much prefer that when I do a 3D transform or drop a 3D node into the scene graph, that it "just works" :-)
If we cannot change these defaults for some reason, this needs to be clearly documented and we need to add a Scene3D which extends Scene and adds this stuff by default. Then you can pick a Scene or Scene3D depending on what type of app you're building. It still isn't perfect but might be OK.
Cameras, Lights, Audio:
In a 3D scene graph we need to support multiple cameras, lights, and directed audio sources. Having these be part of the scene graph, where they can be transformed and moved within a Group or other Parent, or animated along a path using a PathTransition, would be very convenient. However, there is some state on Node which just doesn't make sense for a DirectedAudio node: opacity, for example.
So right now, you can set a single camera on the Scene. This camera is the one used for drawing the scene, and it has a default position such that it is centered in the window. However, you could have multiple cameras looking at different parts of the scene graph, and then switch which camera is used for the Scene, for example.
In this case, having "opacity", "effect", etc. might actually make sense for a Camera, and thus maybe Camera could extend Node. So the idea is that Camera extends Node and can be placed in the scene graph. It doesn't draw anything itself, it just lives in the scene graph. However, if you set the Camera on the Scene, then it draws the scene graph from its perspective, and applies any effect, clip, etc. as specified on the Camera node.
We could also add a CameraView node, which would take a Camera. For example, your game might have a car driving through a course. There is a camera looking from the driver's viewpoint, one from above and behind the car looking forward, and one looking backwards from the rear-view mirror. The user can toggle the Scene's camera between the driver's view and the above-and-behind view. The rear-view mirror would be a CameraView with a clip applied (to make it look like a rear-view mirror), and it would use the rear-view-mirror camera for its input.
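To make that concrete, here is a rough sketch of how the car example might be written if Camera were a Node and we had a CameraView. None of this API exists yet; the class names and properties are purely illustrative:

    // Hypothetical sketch -- Camera-as-Node and CameraView are proposals, not existing API.
    Group car = new Group(/* car geometry */);
    Camera driverCam = new Camera();              // at the driver's eye point
    Camera chaseCam  = new Camera();              // above and behind the car, looking forward
    Camera rearCam   = new Camera();              // looking backwards, for the mirror
    car.getChildren().addAll(driverCam, chaseCam, rearCam);  // the cameras move with the car

    scene.setCamera(driverCam);                   // the user toggles this to chaseCam at runtime

    CameraView mirror = new CameraView(rearCam);  // paints whatever rearCam sees
    mirror.setClip(new Rectangle(200, 60));       // clipped to a rear-view-mirror shape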
Camera could also have a lookAt property of type Node added to it. If specified, the camera will automatically be transformed to look at a specific node, regardless of where the node goes or where the camera goes.
Another way to go is to have Camera not be a Node, but to have a "follow" property of type Node. Essentially the Camera will use the world transform of that Node to define its own position.
Lights and directed audio also conceptually want to have transforms and be part of the scene graph, although opacity, effect, and other such properties really do not affect the final output. I'm not totally sure on lighting, though, and how it fits in. If we go with the "follow" property on Camera instead of making it a Node, we could do the same here. Typically you'd then use a Group with no children as the node that is positioned in the scene graph, so that it ends up doing no rendering, but you could actually affix the camera / light / audio source to anything.
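A sketch of what that "follow" variant could look like, reusing the car Group from the sketch above and anchoring a camera to an empty Group (the follow property is, again, purely hypothetical):

    // Hypothetical sketch -- the follow property is a proposal, not existing API.
    Group camAnchor = new Group();        // empty Group: positioned in the graph, renders nothing
    camAnchor.setTranslateY(-150);        // above...
    camAnchor.setTranslateZ(-400);        // ...and behind the car
    car.getChildren().add(camAnchor);     // the anchor rides along with the car

    PerspectiveCamera chaseCam = new PerspectiveCamera();
    chaseCam.setFollow(camAnchor);        // the camera adopts the anchor's world transform
    scene.setCamera(chaseCam);

A light or directed audio source could be affixed to the same kind of anchor in exactly the same way.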
2.5D Transforms:
Right now, it is kind of a pain to do 2.5D transforms. For example, if I want to rotate a 2D node around the Y axis, then I have to set the camera to be a perspective camera. But the camera is centered over the scene, so the perspective being applied is not what a 2D app would expect. Instead you want the camera to be centered over the node being transformed. We need to handle this use case better by providing some way to indicate that a perspective transform is being used on this node and that the camera needs to be centered over the node. I don't know that this can be done implicitly; we probably need some flag or setting to indicate it.
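For example, with today's API a simple Y-axis rotation of a 2D node ends up looking roughly like this, and the vanishing point sits at the scene's center rather than over the node (the image path is made up):

    import javafx.scene.PerspectiveCamera;
    import javafx.scene.image.ImageView;
    import javafx.scene.transform.Rotate;

    ImageView card = new ImageView("card.png");
    card.setRotationAxis(Rotate.Y_AXIS);
    card.setRotate(45);

    // Required today, even though only this one node uses perspective,
    // and the perspective is computed relative to the scene's center, not the card.
    scene.setCamera(new PerspectiveCamera());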
3D Primitives:
I would like to have basic primitives such as Box, Sphere, Cylinder, etc. We would design it so that if you apply opacity to a Box, for example, then we will handle the vertex sorting so that it looks "right" (i.e., the back-most sides of the Box are drawn first and blended correctly with the later-drawn top-most sides).
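A hypothetical sketch of what using such primitives might feel like (the class names and constructor arguments here are placeholders, not settled API):

    // Hypothetical sketch -- primitive classes and their constructors are placeholders.
    Box crate = new Box(100, 100, 100);   // width, height, depth
    crate.setOpacity(0.5);                // back faces get sorted and drawn first so blending looks right
    Sphere ball = new Sphere(50);         // radius
    content3D.getChildren().addAll(crate, ball);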
I would like to have a Mesh (and maybe it is Mesh + MeshView so the same Mesh can be reused). We might have a couple Mesh classes -- one that has a simple API for basic stuff or people coming fresh to the world of 3D, and a more complicated API that is way more efficient. I'm not sure, maybe we just go with hard-core.
A Mesh would allow you to define your own vertices, normals, texture coords, etc. Basically with Mesh you have full control over the geometry. We could provide some mechanism for the developer to sort their mesh based on the scene transform, so that transparency can be made to work correctly. The idea is that we could have a callback such that during sync (before grabbing the render lock), we'd tell the developer it is time to sort anything, and let them have at it.
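To illustrate, one possible shape for such an API, with the sort callback wired in (Mesh, MeshView, and the callback interface are all hypothetical; only Transform is existing API):

    // Hypothetical sketch -- Mesh, MeshView, and MeshSortCallback are proposals only.
    Mesh mesh = new Mesh();
    mesh.setVertices(new float[] { /* x, y, z, ... */ });
    mesh.setTexCoords(new float[] { /* u, v, ... */ });
    mesh.setFaces(new int[] { /* vertex / texCoord indices */ });

    MeshView view = new MeshView(mesh);   // the same Mesh could be shared by several MeshViews

    mesh.setOnSort(new MeshSortCallback() {
        @Override public void sort(javafx.scene.transform.Transform sceneTransform) {
            // called during sync, before the render lock is taken;
            // re-order the faces back-to-front against sceneTransform so transparency composes correctly
        }
    });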
I'd also like us to have bones and joints support, so that the stuff we showed at JavaOne last year is possible using real API.
Scene Graph Semantics:
Every node in the scene graph has many properties, most of which are perfectly suited to the semantics of a 3D scene graph, but some of which just aren't. For example, Opacity, Effect, Blend Mode, Cache, and Clip are 2D-specific use cases that require (in many cases) flattening the subgraph into a texture and then applying the property to that texture.
This we just have to document clearly, and in normal 3D applications you wouldn't be using these properties anyway. Opacity is a real obvious difference. The problem here is that with 2D semantics, applying transparency to a group renders everything into a texture and causes the entire thing to go transparent equally. In a 3D scene graph, opacity is instead a property (or attribute) assigned to each of the leaf nodes individually.
So we could add a property to Node called "opacityMultiplier", for example, and that would be the semantic that a 3D scene graph would typically use. Or we could have a special Group subclass which defines opacity, clip, etc. as properties which apply to its children instead of to itself, and then when constructing a 3D scene graph you would use this Group3D class instead of Group or whatnot. Or we could add the idea of an "attribute", which is just some special type that you can set on a Parent and it will apply to all of its children instead of to itself, and then have an OpacityAttribute type, etc.
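To illustrate the difference, a hypothetical sketch of the Group3D option (the class and its "apply to children" semantics are entirely made up here):

    // Hypothetical sketch -- Group3D does not exist; the names are illustrative only.
    Group3D carParts = new Group3D(wheels, body, windshield);
    // On a plain Group this would flatten the subtree to a texture and fade it as one image;
    // on Group3D it would instead push the 0.5 down into each leaf's own opacity.
    carParts.setOpacity(0.5);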
I'm not sure which way I prefer, but we'd have to sort this out somehow.
Shaders:
Right now, all of our effects cause render-to-texture flattening, but I could imagine that some (like color adjust, sepia, etc.) might actually not require it. Basically, if we don't have to do read-back of the destination, then we could avoid render-to-texture (I think; maybe that is wrong and the real condition is no read-back and no reading of source pixels other than the current one, which is probably more accurate). Such effects we could duly mark as requiring no flattening.
We could then perhaps add a new class, Shader, an Effect that requires no flattening. The Shader would take a GLSL / HLSL script as its implementation. This would be a mechanism for custom effects, basically. We could then use them on any node, 2D or 3D, and they wouldn't require flattening, so we'd get the right semantics in the 3D part of the scene graph.
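A very rough sketch of how that might be used (the Shader class and the way the script is supplied are purely speculative):

    // Hypothetical sketch -- a Shader effect class is speculative at this point.
    Shader sepia = new Shader("sepia.frag");  // per-pixel only: no destination read-back, no neighboring pixels
    view.setEffect(sepia);                    // applies without render-to-texture flattening, in 2D or 3D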
Other Stuff:
At present, we tessellate all 2D primitives when they go into 3D space. However, this leads to unsightly jaggies. Rather, I think we should semantically treat any 2D leaf node transformed into 3D as a texture. 3D nodes would always be tessellated, and we'd provide some API (on Scene? Scene3D?) to set the MSAA level to use. Groups and such transformed in 3D would not be render-to-texture. I'm not sure how this will interact with z-fighting and the like.
We also need some level-of-detail support. Not sure what this would look like.
Speaking of which, when it comes to ImageView in 3D, there is probably a bunch of new API that would be needed. At least, we should have some kind of MipMapping support so that we get level of detail for images. Maybe level of detail support for primitives would just be a simple dial, and on Meshes there would be a callback that you'd have to implement to deal with LOD.
We also have an ImagePaint which could be used for simple texture mapping on 3D primitives, but if you're going to do anything serious we'd need to add a lot more control over the UV coords. Or maybe on the primitives we don't give that level of control, and you have to use a Mesh if you want it. Most such things are going to be produced in some 3D content authoring tool anyway, and then exported and imported into FX, and so will be using Mesh rather than primitives, I would guess.
We could add a Canvas3D which would basically be OpenGL ES 2.0 (or some subset thereof). We'd have to map this onto D3D on Windows. Although maybe instead we just provide a handle to the surface, such that you could use JOGL, Java3D, or native code for drawing your own canvas, and then we just composite this into the scene graph. From a practical perspective this is probably a heck of a lot easier to do and enables existing code to run in the scene graph, so from an interop perspective it is probably the first thing to do before adding a new Canvas3D.
Probably need to add some kind of Billboard node.
Conclusion:
OK, that's about as far as I got. I wanted to get it out to this list so you could chime in. We're hoping to have some internal discussion next week in a face-to-face, to whiteboard things and work through problems, so maybe it will all be completely discredited by next week, but at least in my current thinking I'm kind of liking the above proposal / thoughts. What do you think?
Richard