[meta] Dota performance issues on Metal #2161

Analyzing the Instruments profiles reveals quite a few things we could do better.

### Native API calls
  - [x] Inefficient Objective-C selector lookups - #2145
  - [x] ~~Too many Objective-C messages. Our profile shows roughly 5-7% of the time spent in `objc_msgSend`, versus 0.3-0.5% in MoltenVK. The difference in retain/release calls is of a similar scale.~~
  - [x] Too many `retain`/`release` calls - #2175
  - [ ] Retains & releases could be faster if not done via messaging - https://github.com/gfx-rs/metal-rs/issues/58
  - [x] ~~Attribute access goes through messages - #2167~~
  - [x] `Class::get` calls are super slow - https://github.com/SSheldon/rust-objc/issues/65. We spend roughly 3% of our library's time in those calls.
  - [x] `NSAutoreleasePool` could be created faster - #2267, https://github.com/SSheldon/rust-objc/pull/67
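A minimal sketch of the general idea behind avoiding repeated by-name class lookups: resolve the class once and hand out the cached handle afterwards. `ClassHandle`, `slow_class_lookup`, and `autorelease_pool_class` are hypothetical stand-ins written for this illustration; they are not the `objc` crate's API, and this is not necessarily how the linked issue was resolved.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for an Objective-C class handle.
#[derive(Debug, PartialEq, Eq)]
pub struct ClassHandle(pub usize);

/// Counts how often the slow path runs (for demonstration only).
pub static SLOW_LOOKUPS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for an expensive by-name runtime lookup.
fn slow_class_lookup(name: &str) -> ClassHandle {
    SLOW_LOOKUPS.fetch_add(1, Ordering::SeqCst);
    ClassHandle(name.len())
}

/// Returns the cached handle, performing the slow lookup only on first use.
pub fn autorelease_pool_class() -> &'static ClassHandle {
    static CACHE: OnceLock<ClassHandle> = OnceLock::new();
    CACHE.get_or_init(|| slow_class_lookup("NSAutoreleasePool"))
}
```

Every call after the first returns the same `&'static` reference, so the per-call cost drops to an atomic load and a branch.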

### Metal backend issues
  - [x] Too much heap (re)allocation - #2185. We should avoid any allocations at run-time by re-using the storage and/or using the iterators more aggressively.
  - [x] Too many command buffers - #2180. The instrumented memory profile doesn't show command buffers being re-used. I suppose the queue just creates a new one every time until it reaches the capacity, and then starts trying to re-use the completed ones. This is undesired, and we may attempt to address it (temporarily) by playing with the capacity limit.
  - [x] Dynamic creation of a render pass - #2178. We create one every time a render pass is started by copying one from the framebuffer and filling out all the clear values/operations. Apparently, this is super heavy, and we should probably avoid creating any Metal objects at normal run time.
  - [x] Dynamic creation of depth-stencil states - #2195.
  - [x] Binding descriptor sets is visibly slow - #2183. We should avoid doing work for repeated bindings, and need to explore ways to share heap-allocated data between sets of the same group.
  - [x] Non-existent semaphore frame synchronization - #2143
  - [x] Too much locking going on - #2229
  - [x] Too many command buffer callbacks - #2224
  - [x] Copying the render pass descriptor (which we do at each start of an RP) takes 7.4% of the time in our library
  - [ ] Binding descriptor sets still involves a few hot loops that could be faster. In particular, we set resources in batches and provide a closure that checks the current pre-render status. All of that can be simplified if we are outside of a pass.
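To illustrate "avoid doing work for repeated bindings", here is a minimal sketch of a per-encoder cache that reports whether a binding actually changed. `BindingCache` and its `u64` resource ids are hypothetical names for this example and do not mirror the backend's real data structures.

```rust
/// Hypothetical cache of the resource currently bound at each slot.
pub struct BindingCache {
    slots: Vec<Option<u64>>,
}

impl BindingCache {
    /// Creates a cache with `slot_count` empty slots.
    pub fn new(slot_count: usize) -> Self {
        BindingCache { slots: vec![None; slot_count] }
    }

    /// Records `resource` at `slot` and returns `true` only if the binding
    /// changed, i.e. only when the native call is actually needed.
    /// Panics if `slot` is out of range.
    pub fn bind(&mut self, slot: usize, resource: u64) -> bool {
        let entry = &mut self.slots[slot];
        if *entry == Some(resource) {
            false
        } else {
            *entry = Some(resource);
            true
        }
    }

    /// Forgets all bindings, e.g. when a new encoder starts.
    pub fn reset(&mut self) {
        for slot in &mut self.slots {
            *slot = None;
        }
    }
}
```

A caller would issue the native bind only when `bind` returns `true`, and call `reset` whenever the cached state can no longer be trusted.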

### Portability issues
  - [x] Leaking descriptor sets - https://github.com/gfx-rs/portability/issues/99
  - [x] Too much data moving and undesired heap allocation (during descriptor updates in particular)
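One common way to avoid heap allocation on a hot path such as descriptor updates is to keep a scratch buffer alive and refill it from an iterator. The sketch below assumes a hypothetical `DescriptorWriter` with `u64` descriptor ids; it shows the pattern only and is not the portability layer's actual code.

```rust
/// Hypothetical writer that keeps its scratch storage between updates.
pub struct DescriptorWriter {
    scratch: Vec<u64>,
}

impl DescriptorWriter {
    /// Creates a writer with empty scratch storage.
    pub fn new() -> Self {
        DescriptorWriter { scratch: Vec::new() }
    }

    /// Stages the given ids in the reused buffer. `clear` keeps the existing
    /// capacity, so no allocation happens once the buffer is large enough.
    pub fn write<I: IntoIterator<Item = u64>>(&mut self, ids: I) -> &[u64] {
        self.scratch.clear();
        self.scratch.extend(ids);
        &self.scratch
    }

    /// Current capacity of the scratch buffer, exposed for inspection.
    pub fn capacity(&self) -> usize {
        self.scratch.capacity()
    }
}
```

After the first few updates the buffer reaches its steady-state size, and later calls to `write` reuse the same allocation.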

### Application concerns
  - [x] The engine operates under the assumption that command buffer recording is cheap, while submission is expensive. It moves submission onto a separate thread, which doesn't appear to be saturated enough.
    - the chosen threading model (of having 2 threads for job execution and a dedicated submission thread) allows MoltenVK to effectively run on 3 threads, while we are limited to 2. ~~This is something Dota should fix or expose for us to test.~~
    - passing "-threads 3" technically solves this
  - [x] Submissions are too late - #2232, #2260 (more work is possible)

### Things to investigate
  - [ ] Actual memory types/heaps used by an application. Maybe we could tweak the queries and/or ask Valve to fix those in order to use our exposed memory more efficiently. This is certainly a difference with Molten.
