Skip to content

Small Performance Improvements#27

Open
bkin wants to merge 2 commits intopmachata:masterfrom
bkin:perf
Open

Small Performance Improvements#27
bkin wants to merge 2 commits intopmachata:masterfrom
bkin:perf

Conversation

@bkin
Copy link

@bkin bkin commented Sep 15, 2016

Hi Petr,

thanks for writing dwgrep! I'm currently experimenting a bit with it. The lib I'm testing with has 24 million DIEs which takes 9.3s to count.
My PR improves this to about 6.5s, mostly by doing fewer allocations.

Benjamin King added 2 commits September 15, 2016 23:32
Profiling did show allocations below all_dies_iterator::operator++().
This was due to the value being returned by copy, copying the m_stack
vector along the way.

Changing m_stack into a shared_ptr gave a ~10% performance win for my
use case (large lib with ~24 million DIEs)
Destruction releases memory for zw_stack::m_values which takes time. It
is a little faster to keep a few stacks around and just call
m_values.clear(), which destroys the contained object without releasing
the memory.
@pmachata
Copy link
Owner

pmachata commented Oct 4, 2016

Hi. Sorry for not being able to get around to this patchset. I'm not against merging this in principle, but:

  • die_it_producer::m_stack shouldn't be a std::shared_ptr—it wasn't shared before, and it's not shared now either. A given stack is either exclusively owned by a die_it_producer, or in the stack cache. It should be either std::unique_ptr, or possibly even a plain vector like before, which you would emplace to the cache on destruction (so that the vector data is moved to a new cache element).
  • should zw_stack_init just call stack_cache::acquire as well?
  • what was the reason that all_dies_iterator::m_stack went behind a std::shared_ptr as well? I guess end() iterator construction is a bit cheaper now, is that actually useful? Or am I missing something?

There are also smaller points:

  • coding style is inconsistent with the rest of the project (mainly the whitespace)
  • aquire should be spelled acquire

Thanks for the contribution!

@pmachata
Copy link
Owner

pmachata commented Oct 4, 2016

And I see you address my third point in your commit message, so never mind. The trouble is that sharing the stack breaks the semantics, because the two iterators end up being entangled, and that breaks expectations. We'll need to figure out something else. Maybe returning the iterator as const, but that's also not how things should be. Would moving some of the methods to .hh (making them inline—the ctors, the operators) help things? The compiler might be able to see that the result is thrown away and simply elide the copy ctor altogether.

@bkin
Copy link
Author

bkin commented Oct 9, 2016

Hi Petr,

sorry, I was on vacation and am only seeing your comments now. Frankly, I was just picking some low hanging fruits performancewise while dabbeling a bit with dwgrep. I did not try to completely understand the semantics. If I can find some time in the next days, I'll evaluate your ideas of exchanging shared_ptr with unique_ptr and do a bit of inlining.

Thanks for looking into my PR anyway, it's my first on github ;-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants