Hello! As part of an investigation into potentially improving performance, I've collected some stats and identified two target areas that appear to occupy the largest portion of the meshing budget:
Reconstructed 11790 vertices (indices=64464) from 1000 particlces in 43.492334ms and pushed in 43.679657ms
reconstruct_surface: 100.00%, 43.49ms/call @ 22.99Hz
  compute minimum enclosing aabb: 0.01%, 0.01ms/call @ 22.99Hz
  neighborhood_search: 11.67%, 5.07ms/call @ 22.99Hz
    parallel_generate_cell_to_particle_map: 26.25%, 1.33ms/call @ 22.99Hz
    get_cell_neighborhoods_par: 5.06%, 0.26ms/call @ 22.99Hz
    calculate_particle_neighbors_par: 64.24%, 3.26ms/call @ 22.99Hz
  parallel_compute_particle_densities: 0.47%, 0.21ms/call @ 22.99Hz
  parallel_generate_sparse_density_map: 41.18%, 17.91ms/call @ 22.99Hz
  triangulate_density_map: 46.62%, 20.28ms/call @ 22.99Hz
    interpolate_points_to_cell_data: 91.94%, 18.64ms/call @ 22.99Hz
      generate_iso_surface_vertices: 84.61%, 15.77ms/call @ 22.99Hz
      relative_to_threshold_postprocessing: 15.36%, 2.86ms/call @ 22.99Hz
    triangulate: 8.04%, 1.63ms/call @ 22.99Hz
Meshing the 1k particles every frame takes 30-50ms. Ideally we can get this down to somewhere close to 16ms, so that a realtime sim running at 60fps would have only a one-frame latency on mesh generation.
As such, generate_iso_surface_vertices (15.77ms) and parallel_generate_sparse_density_map (17.91ms) look like good candidates.
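For a rough sense of how much those two stages would have to shrink, here is a back-of-the-envelope sketch in Rust. The function names are hypothetical (they are not part of the library), and it assumes every other stage keeps its measured cost.

```rust
/// Time available per frame, in milliseconds, at a given frame rate.
pub fn frame_budget_ms(fps: f64) -> f64 {
    1000.0 / fps
}

/// Speedup the targeted stages need so that the whole call fits in `budget_ms`,
/// assuming all other stages keep their current cost (an Amdahl-style estimate).
/// Returns `None` if the untouched stages alone already exceed the budget.
pub fn required_stage_speedup(total_ms: f64, targeted_ms: f64, budget_ms: f64) -> Option<f64> {
    let untouched_ms = total_ms - targeted_ms; // time we are not optimizing
    let allowed_ms = budget_ms - untouched_ms; // what is left for the targeted stages
    if allowed_ms <= 0.0 {
        None
    } else {
        Some(targeted_ms / allowed_ms)
    }
}
```

With the numbers from the profile above, the two stages together take 33.68ms of the 43.49ms total (about 77%). Calling `required_stage_speedup(43.49, 15.77 + 17.91, frame_budget_ms(60.0))` suggests they would need to become roughly 4.9x faster to reach a 60fps budget if nothing else changes.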
I don't know much about fluid simulations, so I'll defer to you on matters here, but I have done a lot of work in perf and optimization. Do you think there's any place to attack here, and if so, would you mind giving me a pointer so I can start taking a look? :)
I'm also wondering whether there are any data structures we don't have to recompute every frame. Perhaps the density map? Or, similar to #4, could we reuse container structures to reduce allocation strain?
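To make the container-reuse idea concrete, here is a minimal sketch, not the library's actual API. `MeshScratch` and `mesh_frame` are hypothetical names, and the "meshing" is a stand-in that emits one triangle per particle. The point is only that `Vec::clear` drops the contents but keeps the allocation, so a caller-owned scratch struct avoids reallocating every frame.

```rust
/// Hypothetical scratch storage that the caller keeps alive across frames.
/// None of these names come from the library; they only illustrate the idea.
#[derive(Default)]
pub struct MeshScratch {
    pub vertices: Vec<[f32; 3]>,
    pub indices: Vec<u32>,
}

impl MeshScratch {
    /// Empties both buffers but keeps their heap allocations for reuse.
    pub fn reset(&mut self) {
        self.vertices.clear();
        self.indices.clear();
    }
}

/// Stand-in for a per-frame meshing step: writes one triangle per particle
/// into the caller-provided scratch buffers instead of returning fresh Vecs.
pub fn mesh_frame(particles: &[[f32; 3]], scratch: &mut MeshScratch) {
    scratch.reset();
    for p in particles {
        let base = scratch.vertices.len() as u32;
        scratch.vertices.push(*p);
        scratch.vertices.push([p[0] + 1.0, p[1], p[2]]);
        scratch.vertices.push([p[0], p[1] + 1.0, p[2]]);
        scratch.indices.extend_from_slice(&[base, base + 1, base + 2]);
    }
}
```

After the first frame has grown the buffers, later frames of similar size should run without touching the allocator for these two vectors. Whether the same pattern fits the density map depends on how it is stored internally, which I can't tell from the outside.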
Thanks, and looking forward to your insights here :)