Physically based transparency #974
Conversation
Force-pushed from 689518a to ee5baf5
@almarklein In physics-based transparency rendering, it's necessary to render all opaque objects first to sample the transmitted light. To achieve this, we need to distinguish between transparent and opaque objects within the renderer.

Initially (before 6013660), I used a separate render pass to generate the transmitted-light sampling texture (rendering only opaque objects). This approach reused the object's RenderPipeline but required rendering to a specific framebuffer. However, the coupling between RenderPipeline and Blender, and Blender's coupling with the render target, made this logic complex. I invested some effort in handling this (constructing a dedicated Blender), but I wasn't entirely satisfied.

After some consideration, the current approach no longer uses a separate pass for the transmitted-light sampling texture. Instead, it's generated during the rendering process. To achieve this, we render all opaque objects first, followed by transparent objects. This also represents a major change in the renderer's behavior, as we previously may have intentionally maintained the order independence of rendered objects within the renderer.
I think in general these changes make sense to support transmissive objects. I made a few comments. Maybe @Korijn can have a look too, especially at the changes in renderer.py.
@property
def transmission(self):
    """How much the material is transparent. Default is 0.0."""
Perhaps explain difference with opacity a bit?
Opacity is primarily used for alpha blending, while the transmission property is used for physically-based transparency. When transmission is not 0, opacity generally should be set to 1. I will update the relevant documentation later.
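To make the distinction concrete, here is a minimal usage sketch. The material class name and defaults are assumptions based on this PR, not settled API:

import pygfx as gfx

# Hypothetical usage sketch; MeshPhysicalMaterial and the exact property
# names are assumptions based on this PR.
material = gfx.MeshPhysicalMaterial()
material.opacity = 1.0        # no alpha blending; the surface itself stays "solid"
material.transmission = 1.0   # light passes through physically, like clear glass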
elif ob.material.is_transparent:
    flat.wobjects["transparent"].append(ob)
else:
    flat.wobjects["opaque"].append(ob)
What of objects that have opacity 1 (i.e. is_transparent is False), but are still transparent, or partially transparent?
This issue is a bit complicated, and I originally intended to raise a separate issue for it. In fact, it’s quite difficult for the engine to automatically determine whether an object is transparent (and it probably shouldn’t attempt to do so). Ideally, there should be a boolean property on the material that indicates whether the object is transparent, allowing the user to explicitly set it.
This is because the rendering logic for transparent and non-transparent objects is quite different. When rendering transparent (or potentially transparent) objects, the user should set this flag manually to intentionally opt in to a different rendering process (drawn after all opaque objects, sorted from far to near, etc.), which may be more performance-intensive. In the glTF specification, materials have an "alpha_mode" property, which can be set to "opaque", "alpha_blend", or "alpha_test". Generally, when it's set to "alpha_blend", it indicates that the "transparent object" rendering process should be used.
Ideally, there should be a boolean property on the material that indicates whether the object is transparent, allowing the user to explicitly set it.
I tend to agree with this sentiment.
It's easy to see the scientific user counter argument though. Someone's rendering a Points object to create e.g. a scatter plot, they set opacity to 0.8, and nothing happens, because they forgot to also set transparent=True.
🤷 Personally I don't mind making transparent=True an explicit required configuration on the user's part.
You can also have objects where some fragments are opaque and others are transparent, e.g. when rendering markers with a solid edge but a semi-transparent face. And it already happens with anti-aliasing.
Partially transparent mesh objects should usually be treated as transparent objects.
Generally, if an object requires color blending (e.g., glass, fire), it must be handled as a transparent object (sorted from back to front + alpha blending).
However, if the transparency is used primarily for anti-aliasing or edge softening (e.g., grass, leaves), it is typically treated as an opaque object, using alpha-to-coverage (in conjunction with MSAA) or alpha testing.
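As an illustration of this classification, a hypothetical helper (not the PR's code) mapping a glTF-style alpha mode onto the two rendering paths:

def needs_transparent_pass(material) -> bool:
    # "opaque" and "alpha_test" fragments write depth and can be drawn with
    # the opaque objects (via alpha test or alpha-to-coverage); only
    # "alpha_blend" needs the sorted back-to-front transparent pass.
    mode = getattr(material, "alpha_mode", "opaque")  # attribute name assumed
    return mode == "alpha_blend"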
if isinstance(ob.parent, Stats):  # special case for Stats
    flat.wobjects["front"].append(ob)
I don't think this special case is a good idea. With the refactoring of the renderer and blender that I'm working on, overlays will be taken into account in some form. So maybe this will do for the time being ... But let's at least move the import out of the loop.
Yes, I believe that such UI elements should not be added to the scene, as they are not part of the 3D scene and should not participate in the sorting of renderable objects or in any interactions with scene objects. For example, they should not be involved in generating transmission light sampling maps or environment maps for the scene. Perhaps we should have a dedicated UI layer to store these objects and handle them separately.
flat.wobjects.extend(wobject_dict[render_order])
# for render_order in sorted(wobject_dict.keys()):
#     flat.wobjects.extend(wobject_dict[render_order])
depth_sort_func = _get_sort_function(camera)
Suggested change:
- depth_sort_func = _get_sort_function(camera)
+ renderorder_sort_func = _get_sort_function(camera)
I wonder what the implications are when there are a lot of objects in the scene. The previous code avoided sorting by grouping by render-order during flattening. I'm ok with using this approach with a call to sorted, but some questions must be answered: if there are 10000 objects with the same render_order, will this sort call be very fast? What if there's a couple of objects with a different render order?
This is a topic worth discussing.
My suggestion is that, in most cases, Z-sorting should be the default behavior of the renderer.
First, in most cases, after performing frustum culling (which we haven’t implemented yet, but will eventually), there shouldn’t be too many world objects requiring sorting during rendering. For advanced rendering, sorting can improve performance. Modern GPUs typically support Early-Z functionality.
- For non-transparent objects, sorting by Z value from near to far maximizes the benefit of Early-Z. By rendering objects closer to the camera first, all subsequently occluded fragments skip their shading computations. If we instead render distant objects first and then closer ones, any distant objects occluded by closer ones result in unnecessary overdraw.
- For transparent objects using alpha blending, sorting from far to near is necessary to achieve the correct blend result. (By the way, the default "ordered2 blender" method we currently use actually doesn’t sort, which is not correct for transparent rendering.)
In 3D game development, when creating scene assets, it's generally advisable to avoid having a large number of small mesh objects in the scene. Typically, geometries with the same material are merged where possible. If there are many mesh objects, it usually indicates poor optimization, and we should consider using instancing or other methods to optimize rendering.
Lastly, if there really are a large number of simple objects to render, we could provide an API to allow users to disable sorting (with the user ensuring the correctness of the rendering results). But as I mentioned earlier, the default behavior should be "enable Z-sorting".
Returning to the previous issue: if there really are 10,000 objects that require sorting, it's clearly a poorly optimized scene, and we should avoid this situation. There are various techniques to address this (such as instanced rendering, particle/sprite systems, etc.). However, if it's absolutely necessary, sorting can be disabled manually.
For cases where multiple objects have the same render order, this isn't a major concern, since in most situations Z-sorting should be applied. However, to ensure stable and consistent sorting results, we can additionally compare the object's ID beyond the render order and Z value.
When sorting is disabled, the render order will also be ineffective. Additionally, we can provide an API that allows users to define their own sorting methods to accommodate specific logic.
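To make the proposed ordering concrete, a minimal sketch (make_sort_key is hypothetical; it assumes view_matrix is the camera's 4x4 view matrix as a numpy array, while get_world_bounding_box() and render_order appear in this PR's diffs):

import numpy as np

def make_sort_key(view_matrix, front_to_back):
    # front_to_back=True for opaque objects (exploit Early-Z),
    # False for transparent objects (correct alpha blending).
    sign = -1.0 if front_to_back else 1.0

    def key(wobject):
        aabb = wobject.get_world_bounding_box()  # (2, 3) array, or None
        center = aabb.mean(axis=0) if aabb is not None else np.zeros(3)
        # View-space z is negative in front of the camera, so ascending z
        # orders far-to-near; negating it gives near-to-far.
        z = (view_matrix @ np.append(center, 1.0))[2]
        # id(wobject) is the stable tie-breaker mentioned above.
        return (wobject.render_order, sign * z, id(wobject))

    return key

# opaque.sort(key=make_sort_key(view_matrix, front_to_back=True))
# transparent.sort(key=make_sort_key(view_matrix, front_to_back=False))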
- By the way, the default "ordered2 blender" method we currently use actually doesn’t sort, which is not correct for transparent rendering
I want to make sure that it is clear to @panxinmiao that we do not necessarily consider this to be "incorrect". In pygfx you can independently pick a blending method, and toggle scene sorting. If you enable scene sorting and select the ordered2 blender, you'll get the expected result. What I can see though is that perhaps sort_objects should be True by default, since ordered2 is the default blender.
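For reference, enabling the existing opt-in looks roughly like this (the canvas import path depends on the pygfx version in use):

import pygfx as gfx
from rendercanvas.auto import RenderCanvas  # older versions: wgpu.gui.auto

canvas = RenderCanvas()
renderer = gfx.renderers.WgpuRenderer(canvas, sort_objects=True)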
I have to agree with @almarklein though, there will commonly be cases where there is no transparency and no use of render order in the scene at all, and the new code will then still run a full sorting algorithm over all world objects. The old code would not perform any sorting at all in that case, and support very high amounts of world objects just fine.
Well, I'm convinced :) I'm okay with your change
Ok, sorting is faster than I anticipated. I did find that using a sort function makes it more than 10x slower.

print(timeit.timeit('sorted(a, key=lambda i: i)', globals=globals(), number=1000))

Still fast enough not to affect performance, in the case of sorting by render_order.
Sorting by z is a completely different story, because it will do matrix multiplications. Rendering 10k objects is IMO a perfectly viable use-case. Sure, game devs should probably do some optimizations if they have such scenes, but for scientific use-cases such optimizations are out of scope.
I like Pygfx to be fast by default, so my preference is to keep the z-sorting opt-in (i.e. off by default).
Sorting also does not guarantee correct results, because (transparent) objects may cross each-other.
Sorting by z is a completely different story, because it will do matrix multiplications.
Yes, that's correct. However, transforming objects into camera (view) space is a necessary step anyway. The difference is that we currently do this matrix multiplication in the shader, whereas many engines perform this step before rendering and only pass the product (view_matrix @ model_matrix) to the shader (meaning the shader works in camera space, with the camera as the origin).
This has several benefits:
- First, it makes Z-sorting easier.
- Second, each object only needs to execute the matrix multiplication once per frame, avoiding the need to perform it for every vertex in the shader.
- Additionally, using camera space in the shader helps improve precision. This is because when the camera is far from the origin of the world coordinate system, the world matrix values become large, which can lead to precision issues. Using camera space mitigates this, as objects that pass frustum culling are always close to the camera (the origin of the coordinate system).
Of course, this is beyond the scope of this PR. Back to the question: although Z-sorting currently requires an additional matrix multiplication (a 4x4 matrix multiplication is actually quite fast), considering the impact of Early-Z, I believe overall performance will still improve, especially for advanced rendering that requires complex shading calculations.
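A small sketch of the engine pattern described in the bullets above (the attribute names camera.view_matrix and wobject.world.matrix are assumptions; uniform upload is elided):

def precompute_model_view(camera, wobject):
    # One 4x4 multiply per object per frame, instead of one per vertex
    # in the shader; the shader then works in camera space.
    model_view = camera.view_matrix @ wobject.world.matrix
    # The view-space depth used for Z-sorting falls out for free as the
    # translation component of the product.
    z_view = model_view[2, 3]
    return model_view, z_view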
Sorting also does not guarantee correct results, because (transparent) objects may cross each-other.
Yes. For complex scenes, achieving correct transparent rendering remains challenging. However, rendering transparent objects in intricate scenes is inherently difficult. We should consider the performance and feasibility, aiming to meet the needs of most use cases.
In typical scenarios, we avoid intersections of transparent objects (e.g., by performing mesh geometry segmentation to minimize overlaps). Z-sorting and alpha blending suffice for most situations.
For truly complex scenes containing transparent objects, advanced techniques are necessary. Methods such as weighted blending (approximate effects), depth peeling algorithms with multiple render passes, or ray tracing can be employed. These scenes often require custom renderers and algorithms tailored to specific rendering needs.
render_pipeline_containers.extend(container_group.render_containers)
# render_pipeline_containers.extend(container_group.render_containers)
? How does stuff still render?
In the following _render_objects() method.
I have to admit, I don't understand the internals of the renderer and blender well enough to comment on the changes here. I'm just worried that existing functionality may be broken or damaged. Best I can do is ask some questions to check:
- Do all the existing transparency blend modes still work?
- Are we making more roundtrips to the GPU than before (to get a single frame drawn)?
- Is pipeline state management still robust (reconstructing wgpu pipelines and recompiling shaders reactively)?
if double_sided_objects:
    # draw back side of double sided objects
    self._render_objects(
        double_sided_objects, renderstate, physical_viewport, command_encoder
    )
    command_encoder.copy_texture_to_texture(
        {
            "texture": renderstate.blender.color_tex,
            "origin": (0, 0, 0),
        },
        {
            "texture": ensure_wgpu_object(
                self._shared.transmission_framebuffer
            ),
        },
        copy_size=self.physical_size,
    )
    generate_texture_mipmaps(
        self._shared.transmission_framebuffer, command_encoder
    )
Can you explain this special treatment of double_sided transmissive objects? Why need to copy the color texture after each one here, but not for single-sided transmissive objects?
Well, it's a bit complex, let me try to explain.
Rendering complex transmissive objects correctly is not an easy task. We need to balance performance, feasibility, and visual accuracy, often resorting to approximations for the effect.
First, we render all opaque objects and generate a transmission light sampling texture, which will later be used by the transmissive objects.
You might quickly realize that when there are multiple transmissive objects in the scene, the transmission light sampling texture of a transmissive object located closer to the camera should include the effects of the transmissive objects behind it; this is the physically correct result. However, for real-time rendering, it is too costly to generate a separate transmission light sampling texture for each transmissive object. So, we let "alpha blending" handle the overlap between them.
But for transmissive objects with volume, the transmitted light is not only affected by the surface where light exits. The entry surface should affect the transmitted-light map too, but it doesn't, because it is the back side and is obscured. For a more realistic effect, we can set such an object to double-sided rendering. We first draw the back face and update the transmission light sampling texture using the frame that now contains the back face. Then, when we render the front face, it gets a more physically accurate transmission light sampling texture, resulting in a more realistic rendering effect.
Let's give an example. Imagine a glass sphere where light enters from the rear (the side opposite the camera) and exits from the front into the camera. The following processes occur:
- Refraction at the Entry Surface: Light enters the object from the rear (incident surface) and undergoes refraction (from air to glass). This corresponds to the first step, where the back surface of the transparent glass is rendered, and the result is updated in the transmitted light map.
- Internal Propagation: Light propagates inside the object, potentially being absorbed or scattered.
- Refraction at the Exit Surface: Light exits the object from the front (exit surface), refracting again (from glass to air) and carrying previous transmission information (color, brightness). This corresponds to the second step, where the front surface is rendered using the transmitted light map that includes the results from the first step.
This approach yields more accurate and realistic results, but it is also more costly.
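Summarized as a pass-order sketch (a simplification of this PR's renderer flow; render and update_transmission_tex stand in for the actual wgpu encoder work):

def draw_frame(opaque, transmissive_back_to_front, render, update_transmission_tex):
    for ob in opaque:
        render(ob)
    update_transmission_tex()          # opaque scene -> transmitted-light map (+ mipmaps)
    for ob in transmissive_back_to_front:
        if ob.material.double_sided:   # flag name assumed
            render(ob, side="back")    # light entering the volume
            update_transmission_tex()  # the front face now "sees" the back face
            render(ob, side="front")   # light exiting toward the camera
        else:
            render(ob)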
Thanks for the explanation! So IIUC this copying of the transmission_framebuffer should in theory be done after rendering each object, but to balance performance with quality, you only do it for double-sided geometry, to make sure that this use-case works correctly. Is that more or less correct? 😄
Yes. To be precise, this should occur each time light passes through the surface of a transmissive object (both entering and exiting). However, this process is computationally expensive. Therefore, we've simplified it.
But if users wish to achieve high-quality rendering of a transmissive object and set it to be double-sided, we can provide more accurate results. This requires coordination with the modeling process, as using single-sided or double-sided rendering may necessitate adjustments to material properties such as transmission and IOR. If you model with double-sided rendering in mind, it will be more physically accurate, though it will also be more computationally demanding.
@panxinmiao I hope you will appreciate my code review comments, I'm trying to help work out what is necessary to get your amazing work merged at some point in the future. I think it's very brave and impressive that you decided to untangle the engine's core mechanisms like this! It's great to have you with us.
Thank you for your kind words and constructive feedback. Your encouragement means a lot to me. :)
def sort_func(wobject: WorldObject):
    return wobject.render_order
This sort function can be much faster luckily! :)
Suggested change:
- def sort_func(wobject: WorldObject):
-     return wobject.render_order
+ sort_func = attrgetter("render_order")
Import it like this:
from operator import attrgetter
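A self-contained micro-benchmark of the difference (illustrative; absolute numbers vary by machine):

import timeit
from operator import attrgetter

class Obj:
    __slots__ = ("render_order",)

    def __init__(self, i):
        self.render_order = i

objs = [Obj(i % 7) for i in range(10_000)]
print(timeit.timeit(lambda: sorted(objs, key=lambda o: o.render_order), number=100))
print(timeit.timeit(lambda: sorted(objs, key=attrgetter("render_order")), number=100))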
The new blending of #1002 should make it relatively easy to fit this in, although it needs big refactoring or a fresh start. You may need to add a flag here: pygfx/pygfx/renderers/wgpu/engine/renderer.py, lines 80 to 84 at 571b996.
And some logic here to set the flag (for sorting) and the […]: pygfx/pygfx/renderers/wgpu/engine/renderer.py, lines 115 to 134 at 571b996.
See for example here, where the resolve pass is applied for weighted blending: pygfx/pygfx/renderers/wgpu/engine/renderer.py, lines 703 to 726 at 571b996.
The code has been updated.
Regardless of how we design the APIs related to transparency and blending, the […].
Currently, we divide objects into three categories: opaque, transparent (primarily determined by the […]).
In addition, double-sided transmissive objects are treated specially: they are rendered in two passes, one for the front face and one for the back face.
This PR is now mostly ready on my side. I made some adjustments to the behavior of […].
In addition, this PR involves a significant refactor of the renderer's behavior. The rendering process is now explicitly divided into three stages/passes: opaque, transparent, and weighted blend. I plan to write a separate post to explain the reasoning behind this design. There are also various other changes and optimizations, such as using the center of the bounding box instead of the object's […].
PS: During the implementation of this PR, I tried to ensure that the behaviors for blending and […]
Thanks for all the effort you put into this! This would be a real cool addition 🚀 And I appreciate the efforts to keep things compatible. I can see how adjusting to the alpha-mode stuff was not always fun...
This PR is not super-big (thanks!) but still covers quite a lot of ground and hot paths in the internals 😄 I made some comments in the code.
The trickiest topic is I think still the render stages vs render_queue 🫣 .
edit: let's just continue this discussion in #1124
@@ -722,6 +754,7 @@ def render(
    # Get renderstate object
    renderstate = get_renderstate(flat.lights, self._blender)
    self._renderstates_per_flush[0].append(renderstate)
    self._shared.ensure_transmission_framebuffer_size(self.physical_size)
The transmission framebuffer is now placed on the shared object, i.e. it's a global texture. And it's re-created if the size does not match. This means that if there are two renderers with a different internal size, it gets re-created on each draw.
I think it makes sense to put it in the blender. There is already logic in place to only create the texture if it's needed; if there are no objects that need picking there is no picking texture. We could do the same for the transmission texture. We already have flat.has_transmissive_objects 👍
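A sketch of that lazy creation, following the picking-texture pattern (the blender attribute and the Texture arguments are assumptions, not this PR's code):

import pygfx as gfx

def ensure_transmission_tex(blender, size):
    tex = getattr(blender, "_transmission_tex", None)  # attribute name assumed
    if tex is None or tex.size[:2] != tuple(size):
        tex = gfx.Texture(
            dim=2,
            size=(size[0], size[1], 1),
            format="rgba16float",
            generate_mipmaps=True,  # mip levels approximate rough transmission
        )
        blender._transmission_tex = tex
    return tex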
if (
    material.render_queue <= 2500
    and material.alpha_config["method"] == "opaque"
):
    self["OPAQUE"] = True
This is now handled by the blender, so we can remove this.
$$ if OPAQUE is defined
    opacity = 1.0;
$$ endif

$$ if USE_TRANSMISSION is defined
    opacity *= material.transmission_alpha;
$$ endif
Is the transmission_alpha intentionally multiplied after setting opacity=1 for opaque objects here?
if self._view_matrix is not None and sort_sign:
    # stack the centers of the objects for batch processing
    bbox = np.array(
        [
            b
            if (b := item.wobject.get_world_bounding_box()) is not None
            else np.zeros((2, 3))
            for item in render_items
        ]
    )

    # bbox is ndarray of shape (N, 2, 3), where N is the number of items
    centers = (bbox[:, 0] + bbox[:, 1]) / 2

    dist_flags = (
        la.vec_transform(centers, self._view_matrix, projection=False)
        * sort_sign
    )
This code looks scary to me, when I think of it being applied to, say, 5000 objects on each draw...
Using the bounding box is a nice idea, and likely more precise in some cases, but I would like to know more about how this affects performance. I don't think we've so far considered that the bounding box of all objects is queried at each draw. There are potential matrix multiplications to get the world bounding box. We really want to make sure we have the caching correct etc.
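One possible shape for that caching, as a hypothetical sketch (last_modified is an assumed change counter on the transform; geometry changes would need to invalidate too):

class WorldAABBCache:
    def __init__(self, wobject):
        self._wobject = wobject
        self._version = None
        self._aabb = None

    def get(self):
        # Recompute the world bounding box only when the transform changed.
        version = self._wobject.world.last_modified  # assumed change counter
        if version != self._version:
            self._version = version
            self._aabb = self._wobject.get_world_bounding_box()
        return self._aabb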
command_encoder.copy_texture_to_texture(
    {
        "texture": self._blender._textures.get("color"),
        "origin": (0, 0, 0),
    },
    {
        "texture": ensure_wgpu_object(
            self._shared.transmission_framebuffer
        ),
    },
    copy_size=self.physical_size,
)
generate_texture_mipmaps(
    self._shared.transmission_framebuffer, command_encoder
)
Maybe turn this into a method for re-use?
if isinstance(wobject, Group):
    group_order = wobject.render_order
This changes the behavior of render_order so that there's an extra level of sorting based on the render_order of any group ancestor. I'm fine with changing this behavior, but let's discuss what makes the most sense. And we should update the docs.
In current main, the render_order (an int) is summed with that of all its ancestors. The idea is that you usually want the child objects to be rendered "together" with the parent.
With the current change (if I read it correctly), there are now 4 sort keys: (render_queue, group_order, render_order, z), where the group_order is the render_order of the first ancestor that is a Group.
I think this is the same as what ThreeJS does? In any case it still allows changing the order of the objects within that group. I can see how it can give some more control.
One could argue about what should happen if there are multiple groups in the ancestors.
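Spelled out as code, the four-level key could look like this (an illustrative sketch of my reading of the change; render_queue is introduced by this PR):

import pygfx as gfx

def group_order(wobject):
    # render_order of the first ancestor that is a Group, else 0.
    parent = wobject.parent
    while parent is not None:
        if isinstance(parent, gfx.Group):
            return parent.render_order
        parent = parent.parent
    return 0

def sort_key(wobject, z_view):
    # The sign of z_view depends on the stage (near-to-far for opaque,
    # far-to-near for transparent).
    return (wobject.material.render_queue, group_order(wobject),
            wobject.render_order, z_view)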
Actually, let's continue the discussion about render_queue vs render-stages in #1124, where we already talked about it a lot.
This PR tries to introduce a physically based transparency (.transmission) property.
It provides a more realistic option for thin, transparent surfaces like glass.
Related glTF extensions:
KHR_materials_transmission
KHR_materials_volume
KHR_materials_dispersion
Demo video: 2.10.2.mp4
Note:
To implement the physically based transparency rendering logic, I have done some refactoring of the renderer, but there is still some cleanup needed.
One of the difficulties I encountered during this process was the Blender-related code. It is tightly coupled with the rendering logic, the RenderPipeline, and even the shaders, which makes adding certain render passes more challenging.