Mixing 2D and 3D

Richard Bair richard.bair at oracle.com
Thu Aug 1 13:33:47 PDT 2013


> - What is the minimum space taken for "new Node()" these days?  Is that too heavyweight for a 3D scene with hundreds or thousands of "things", whatever granularity we have, or come to have, for our Nodes?

I think each Node is going to come in somewhere between 1.5K and 2K in size (I haven't measured recently, so it could be worse or better). In any case, it is fairly significant. The "StretchyGrid" toy in apps/toys had 300K nodes and the performance limitation appeared to be on the rendering side rather than the scene graph side (other than sync overhead, which was 16ms, but I was just looking at the pulse logger and didn't do a full analysis with profiling tools). So although a node isn't cheap, I think it is cheap enough for a hundred thousand nodes or so on reasonably beefy hardware. On a small device with a crappy CPU, we're looking at a thousand or so nodes max.

> - How often do those attributes get used on a 3D object?  If one is modeling an engine, does one really need every mesh to be pickable, or are they likely to be multi-mesh groups that are pickable?  In other words, you might want to pick on the piston, but is that a single mesh?  And is the chain that connects it to the alternator a single mesh or a dozen meshes per link in the chain with 100 links?  (Yes, I know that alternators use belts, but I'm trying to come up with meaningful examples.)

I think this is a good question, and it applies to 2D as well. For example, I might have a bunch of nodes in a TableView where most of them don't need to be individually pickable. We do optimize for this in memory, though, by "bucketing" the properties. So if you don't use any of the mouse listeners, you only pay for a single null field reference on the Node. As soon as you use a single mouse event property (add a listener, set a value), we inflate it into an object that holds the implementation for those listeners. So there are games we can play here so that just having this API on a node is actually very inexpensive.
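Roughly, the bucketing pattern looks like this (a minimal sketch with made-up class and field names, not the actual javafx.scene.Node internals):

    import javafx.event.EventHandler;
    import javafx.scene.input.MouseEvent;

    // Hypothetical sketch of the "bucketing" idea above; these names are invented
    // for illustration and are not the real javafx.scene.Node internals.
    public class BucketedNode {

        // A single null field is the only per-node cost until mouse handling is used.
        private MouseHandlers mouseHandlers;

        private static final class MouseHandlers {
            EventHandler<MouseEvent> onClicked;
            EventHandler<MouseEvent> onEntered;
            // ...the rest of the mouse-related properties live here
        }

        public void setOnMouseClicked(EventHandler<MouseEvent> handler) {
            if (mouseHandlers == null) {
                mouseHandlers = new MouseHandlers(); // inflate on first use
            }
            mouseHandlers.onClicked = handler;
        }
    }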

Most of the cost of a Node (the last time I measured) is actually in the bounds & transforms, which are duplicated on Node and NGNode, and that is state we really can't get away without.

One thing we talked about in the past was having a way to "compile" a portion of a scene graph. So you could describe some nodes and then pass it to a method which would "compile" it down to an opaque Node type that you could put state on. Maybe there is still a big NG node graph behind the scenes, but once you've compiled it, it is static content (so no listeners etc). Anyway, I'm not sure if it is useful or not.
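Purely to illustrate the idea (nothing like this exists today; SceneGraphCompiler and compile() are invented names), usage might look something like:

    import javafx.scene.Group;
    import javafx.scene.Node;
    import javafx.scene.shape.Rectangle;

    // Hypothetical sketch only: a dynamic subtree gets baked down into a single
    // opaque, static Node with no listeners or per-child state.
    Group detail = new Group(new Rectangle(0, 0, 10, 10), new Rectangle(20, 0, 10, 10));
    Node compiled = SceneGraphCompiler.compile(detail); // invented API, does not exist

    Group someParent = new Group();
    someParent.getChildren().add(compiled);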

> - How does picking work for 3D apps?  Is the ability to add listeners to individual objects good or bad?

I assume it is basically the same as 2D. There will be some elements in the scene you want to be pickable / get listeners (key listeners, mouse listeners, etc.), but most things won't be.
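For what it's worth, the listener API is literally the same on 3D nodes; a minimal sketch, assuming the PickResult API in the 8.0 stream:

    import javafx.scene.input.PickResult;
    import javafx.scene.shape.MeshView;
    import javafx.scene.shape.TriangleMesh;

    // Picking on a 3D node, using the same mouse listener API as 2D.
    TriangleMesh mesh = new TriangleMesh();
    // ...fill in points, texCoords, and faces...
    MeshView pistonView = new MeshView(mesh);
    pistonView.setOnMouseClicked(event -> {
        PickResult pick = event.getPickResult();
        System.out.println("picked node: " + pick.getIntersectedNode());
        System.out.println("local point: " + pick.getIntersectedPoint());
        System.out.println("face index:  " + pick.getIntersectedFace()); // FACE_UNDEFINED for non-mesh shapes
    });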

> - How does picking interact with meshes that are tessellated on the fly?  Does one ever want to know which tessellated triangle was picked?  

I could imagine a design application (like Cheetah 3D or something) being interested in this, but with on-card geometry shaders doing LOD-based dynamic tessellation, I think it is something we would not expose in the API. Somebody would have to actually create a mesh that complicated (and I guess we'd have to allow picking a particular part of the mesh).

So in real 3D we'd not be able to guarantee picking of any geometry, but we could potentially allow picking of any geometry known about at the FX level (in a mesh). Right now they are one and the same, but when we start supporting on-card tessellation, the FX-known geometry will be a subset of all the real geometry on the card.

I imagine it is similar to 2D. In 2D we say "here is a rounded rectangle", but in reality we might represent it with any kind of geometry on the card (right now it is just a quad, but we've talked about using a solid opaque quad for the interior and transparent quads for the edges to do the AA, so that on-card occlusion culling of fragments could be used).

> How does that fit in with the 2D-ish picking events we deliver now?  If a cylinder is picked, how do we describe which part of the cylinder was picked?

Pavel or Chien will have to pipe in on this part of the question, I don't know.

> - Are umpteen isolated 2D-ish transform attributes a convenient or useful way to manipulate 3D objects?  Or do we really want occasional transforms inserted in only a few places that are basically a full Affine3D because when you want a transform, you are most likely going to want to change several attributes at once?  2D loves isolated just-translations like candy.  It also tends to like simple 2D scales and rotates around Z here and there.  But doesn't 3D love quaternion-based 3-axis rotations and scales that very quickly fill most of the slots of a 3x3 or 4x3 matrix?

I think that in 3D you definitely use the bigger transforms more, but the simple transforms are also useful, particularly since we have the various Transition classes for animating things around. I think we want Transition classes for doing 3D transitions as well.
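For example, both styles already coexist on a node: a full Affine for the "real" placement plus a simple animated rotate driven by a Transition (a minimal sketch, assuming the 8.0 transform and animation API):

    import javafx.animation.Animation;
    import javafx.animation.RotateTransition;
    import javafx.scene.shape.Box;
    import javafx.scene.transform.Affine;
    import javafx.scene.transform.Rotate;
    import javafx.util.Duration;

    Box part = new Box(100, 20, 20);

    // The "big" transform: most of the 4x3 slots filled in one go.
    Affine placement = new Affine();
    placement.appendTranslation(50, 0, 200);
    placement.appendRotation(30, 0, 0, 0, Rotate.Y_AXIS);
    part.getTransforms().add(placement);

    // The "small" transform: a simple transition spinning around one axis.
    RotateTransition spin = new RotateTransition(Duration.seconds(4), part);
    spin.setAxis(Rotate.Z_AXIS);
    spin.setByAngle(360);
    spin.setCycleCount(Animation.INDEFINITE);
    spin.play();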

> - Right now our Blend modes are pretty sparse, representing some of the more common equations that we were aware of, but I'm not sure how that may hold up in the future.  I can implement any blending equation that someone feeds me, and optimize the heck out of the math of it - but I'm pretty unfamiliar with which equations are useful to content creators in the 2D or 3D world or how they may differ now or in our evolution.

This is a good question. There are some blend modes GL supports natively, but for the ones that GL doesn't, how would we implement those? That is, we could say "in 3D we don't flatten, we just change the blend mode used on the card", but I'm not sure that works.
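For reference, this is what we expose on the 2D side today: a node-level blend mode chosen from a fixed set of equations (a minimal sketch):

    import javafx.scene.effect.BlendMode;
    import javafx.scene.paint.Color;
    import javafx.scene.shape.Circle;

    Circle glow = new Circle(50, Color.rgb(255, 128, 0, 0.6));
    glow.setBlendMode(BlendMode.ADD); // ADD maps directly onto common GPU blend equations
    // Modes like OVERLAY or SOFT_LIGHT have no single fixed-function GL blend equation,
    // which is exactly the flattening question above.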

> - How will the nodes, and the granularity they impose, enable or prevent getting to the point of an optimized bundle of vector and texture information stored on the card that we tweak and re-trigger?  I should probably start reading up on 3D hw utilization techniques.  8(

I think having a scene graph helps us here and we're in good shape. This is one of the things I want to do after positioning prism for multi-threading, because the ideas are related. In order to do multi-threading, we have to be able to compute per-node what the drawing commands are (all on background threads) and then combine them all together into a sequence of draw commands that get sent to the card in one go at the end. If we had such a system, it would be easy to persist those commands from frame to frame on the Java side (allowing us to skip a bunch of work on each frame for a node that has to be drawn but hasn't changed). It isn't a far leap from there to storing the drawing commands on the card instead of in Java (or at least the vertex information on the card).
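Speaking very loosely, the shape of it would be something like this (completely hypothetical types, just to illustrate caching per-node command lists between frames):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical sketch only: none of these types exist in Prism. The idea is that
    // per-node draw commands are built on background threads, kept from frame to frame,
    // and only rebuilt for nodes whose state actually changed.
    final class CommandCache {
        interface DrawCommand { }
        interface NodeRenderer { List<DrawCommand> buildCommands(long nodeId); }

        private final Map<Long, List<DrawCommand>> byNodeId = new ConcurrentHashMap<>();

        List<DrawCommand> commandsFor(long nodeId, boolean dirty, NodeRenderer renderer) {
            if (dirty) {
                byNodeId.put(nodeId, renderer.buildCommands(nodeId)); // rebuild off the render thread
            }
            return byNodeId.getOrDefault(nodeId, new ArrayList<>()); // otherwise reuse last frame's list
        }
    }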

> - Looking at the multi-camera issues I keep coming back to my original pre-mis-conceptions of what 3D would add wherein I was under the novice impression that we should have 3D models that live outside the SG, but then have a 3DView that lives in the SG.  Multiple views would simply be multiple 3DView objects with shared models similar to multiple ImageViews vs. a small number of Image objects.  I'm not a 3D person so that was simply my amateur pre-conception of how 3D would be integrated, but I trust the expertise that went into what we have now.  In this pre-concept, though, there were fewer interactions of "3Disms" and "2Disms" - and much lighter weight players in the 3D models.

We actually aren't that far off from this view, in that most of the really complicated things people will do will be with a big complicated Mesh displayed in a MeshView (analogous to Image in ImageView). Cameras and such are defined externally so that you can have multiple separate models loaded together. But I imagine that in *most* use cases, people will be off building a 2D app and then create a SubScene configured for 3D and construct their 3D scene graph in there. It's just that they'll be building it using the same node types and API that they are already using, except with mostly 3D variants.
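That use case looks roughly like this with the API we have: a normal 2D scene with a depth-buffered SubScene holding the 3D content (a minimal sketch, assuming the 8.0 SubScene API):

    import javafx.scene.Group;
    import javafx.scene.PerspectiveCamera;
    import javafx.scene.Scene;
    import javafx.scene.SceneAntialiasing;
    import javafx.scene.SubScene;
    import javafx.scene.layout.BorderPane;
    import javafx.scene.shape.Box;

    // 3D content built from the same Node API, embedded in ordinary 2D layout.
    Group threeDRoot = new Group(new Box(100, 100, 100));
    SubScene threeD = new SubScene(threeDRoot, 600, 400, true, SceneAntialiasing.BALANCED);
    threeD.setCamera(new PerspectiveCamera(true));

    BorderPane appRoot = new BorderPane();
    appRoot.setCenter(threeD); // the 3D SubScene is just another node in the 2D scene
    Scene scene = new Scene(appRoot, 800, 600);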

I was thinking yesterday that there is even a place for 2D rectangles and such in a 3D world, particularly if we defined a Paint type that includes bump-map information (normals). For example, I could have a gradient where the stops have not only rgba values but also normals. I seem to remember talking with Burkey about this in the past: we could then have a CSS style sheet to style UI controls such that they would appear to have a 3D surface to them. In that way, being able to drop UI controls into a 3D app makes sense.

> - In looking briefly at some 3D-lite demos it looks like there are attempts to do higher quality AA combined with depth sorting, possibly with breaking primitives up so that they depth sort more cleanly.  Some docs on the CSS3 3D attributes indicate particular algorithms that they recommend for slicing up the 2D objects to allow for back to front ordering that allows alpha to mix better with Z while not necessarily targeting the kinds of performance one might want for pure 3D.  Such techniques would also allow us to do the algorithmic AA that runs into trouble in the "circles" demo that Richard showed, but those techniques don't scale well.  On the other hand, it allows for things like the CSS3 demo that has 4 images on rotating fan blades with alpha - very pretty, but probably not done in a way that would facilitate a model of a factory with 10K parts to be tracked (in particular, you can't do that with just a Z-buffer alone due to the constantly re-sorted alpha):

Right, I think the CSS 3D techniques make sense for a 2D pipeline with 2.5D-style 3D rendering, but not for a real 3D pipeline. So if you're just rotating a few planes around in 3D, then it works well, but if you're modeling Duke and Tux in a lightsaber duel, then a completely different rendering pipeline is really what you want (with more traditional 3D techniques like BSP trees or whatnot to make sure the transparent triangles are sorted correctly, or something even more radical like depth peeling).

The more I think on it the more it makes sense to me that we ought to have a "2D" pipeline wherein you can do 3D stuff but it all fits according to 2D semantics, and a 3D pipeline for doing real 3D work. As time goes on, we could expose ways for developers to provide their own shaders for the 3D pipeline, which might not make as much sense on the 2D side of the world.

Cheers
Richard

