Skip to content

memcached: use own small's arena #102

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Apr 6, 2022
Merged

Conversation

ligurio
Copy link
Member

@ligurio ligurio commented Feb 1, 2022

Fixes #96

@ligurio ligurio marked this pull request as draft February 1, 2022 16:04
@ligurio ligurio force-pushed the ligurio/gh-96-own-small-alloc branch 3 times, most recently from 8ee97da to 8900822 Compare February 4, 2022 14:46
@ligurio ligurio force-pushed the ligurio/gh-96-own-small-alloc branch 4 times, most recently from 358ea7d to 3e9c300 Compare February 10, 2022 16:42
@ligurio ligurio marked this pull request as ready for review February 10, 2022 16:42
@ligurio ligurio force-pushed the ligurio/gh-96-own-small-alloc branch 4 times, most recently from 8f728d4 to 7d7fbf2 Compare February 11, 2022 13:06
@ligurio ligurio requested a review from Totktonada February 11, 2022 13:38
Copy link
Member

@Totktonada Totktonada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I'm thinking on the patchset and collecting review comments to discuss, I'll left several trivial comments, which are easy to fix.

@ligurio ligurio force-pushed the ligurio/gh-96-own-small-alloc branch 2 times, most recently from c25064c to 225d2e8 Compare February 16, 2022 05:40
@ligurio ligurio force-pushed the ligurio/gh-96-own-small-alloc branch 7 times, most recently from 415356b to 3c5a39f Compare February 21, 2022 14:05
@ligurio ligurio requested a review from Totktonada February 21, 2022 14:41
@Totktonada
Copy link
Member

Brief feedback:

  1. I can't find a reason to spawn slab cache per fiber. In tarantool we have one per cord (IOW, one per thread). Let's initialize it at the module initialization.

  2. I would like to see allocator (de)initialization code in memcached/internal/alloc.c. I guess three new functions: initialization, deinitialization, get slab cache pointer.

  3. Since we have only one arena limited by given quota, we'll always have quota_used == arena_size. If we'll keep just quota stats, it'll looks easier to understand. I would also make memcached.slab a table of functions just like box.slab, because this way we can add memcached.info.stats() later. So I would like to see the following:

    tarantool> memcached.slab.info()
    ---
    - quota_size: <...>
      quota_used: <...>
      quota_used_ratio: <...>%
    ...

    Here we don't track internal fragmentation (within slabs), but it is enough now.

@ligurio
Copy link
Member Author

ligurio commented Mar 1, 2022

I can't find a reason to spawn slab cache per fiber. In tarantool we have one per cord (IOW, one per thread). Let's initialize it at the module initialization.

Done.

I would like to see allocator (de)initialization code in memcached/internal/alloc.c. I guess three new functions: initialization, deinitialization, get slab cache pointer.

Moved functions for allocations to alloc.c. There is no deinitialization function in memcached and I have no idea how to call it if it would exist, so currently memcached creates arena and shared slab cache and never destroy them.
"get slab cache pointer." - done.

Since we have only one arena limited by given quota, we'll always have quota_used == arena_size. If we'll keep just quota stats, it'll looks easier to understand. I would also make memcached.slab a table of functions just like box.slab, because this way we can add memcached.info.stats() later. So I would like to see the following:

Here we don't track internal fragmentation (within slabs), but it is enough now.

Done.

@Totktonada Totktonada self-requested a review March 1, 2022 14:03
Copy link
Member

@CuriousGeorgiy CuriousGeorgiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see allocator (de)initialization code in memcached/internal/alloc.c. I guess three new functions: initialization, deinitialization, get slab cache pointer.

Me too. At least, I would like to see a clarification from Lango team that this is not required or possible: the potential memory leak that could be caused by module unload in runtime boggles me.

@ligurio ligurio force-pushed the ligurio/gh-96-own-small-alloc branch 3 times, most recently from e96c82c to 232ed25 Compare April 5, 2022 12:34
@ligurio ligurio assigned CuriousGeorgiy and unassigned ligurio Apr 5, 2022
@CuriousGeorgiy
Copy link
Member

@Totktonada FYI, for some reason I cannot resolve conversations in this PR and my status near Reviewers section is displayed grey.

@Totktonada
Copy link
Member

@Totktonada FYI, for some reason I cannot resolve conversations in this PR and my status near Reviewers section is displayed grey.

Should be resolved now. I re-requested the review to test it.

@CuriousGeorgiy
Copy link
Member

@Totktonada It works now, thanks!

@ligurio ligurio force-pushed the ligurio/gh-96-own-small-alloc branch from 232ed25 to 9a28693 Compare April 6, 2022 08:39
ligurio added 8 commits April 6, 2022 12:39
Name of internal function for creating a new memcached instance is
memcached_init(). It may be confusing - name can be mixed up with
name of function for module initialization.

In the following commit we will need an explicit function
for module initialization, memcached_init() was renamed to
memcached_create_instance() for clarity.

This commit changes internal function's name, no impact on public API.

Needed for #96
Historically memcached module uses slab cache created by Tarantool.
Fibers get access to it via call of function cord_slab_cache() that is a
part of Tarantool C API. The problem is happen when memcached and
Tarantool use different versions of Small library where ABI is broken.
Issues #59 and #96 are examples of such problems.

We decided use independent slab arena in memcached module to avoid such
problems in future and this patch implements it.

Below I'll describe what is important in switching to own arena.

memcached module uses different types of Small's allocators (ibuf and
obuf used for every network connection, one slab cache used per fiber).
There are two important things with used Small's allocators:

1. some allocators depends on other and sequence of allocations should
   be correct. README in small source code repository has a clear
   description of relationships between allocators [1].
2. some allocators are thread-safe and others not. slab arena is shared
   between all memcached instances, when slab cache allocated on top of
   slab arena is not thread-safe and must be used in scope of fiber.

memcached performs memory allocations and deallocations on different
stages and I would list all these stages explicitly:

- memcached module initialization
- memcached module deinitialization
- memcached instance gc
- memcached instance initialization
- memcached instance start
- memcached instance stop
- memcached instance deinitialization

Most part of these cases covered in tests
test/common/instance_api.test.lua and test/common/module_api.test.lua
that were added in previous commits. The exception here is gc that was
checked manually.

In tests function is_port_open() has been replaced to memcached
minimalistic client that do set value, get value with the same key and
compare both values. This check is more rigorous than check port
openess.

1. https://github.com/tarantool/small

Fixes #96
Before switching to our own slab arena user can control usage and
information about slab arena using Tarantool Lua API (specifically
box.slab.{check,info,stat}() functions), see [1]. Example of output
box.slab.info() is below:

tarantool> box.slab.info()
---
- items_size: 146896
  items_used_ratio: 26.05%
  quota_size: 268435456
  quota_used_ratio: 12.50%
  arena_used_ratio: 3.7%
  items_used: 38264
  quota_used: 33554432
  arena_size: 33554432
  arena_used: 1234296
...

With our own slab arena we need a way to obtain information about arena.
It's a reason why new module function memcached.slab.info() was
introduced. Output of memcached.slab.info() is similar to output
produced by `box.slab.info()`, but without items_size, items_used,
items_used_ratio and arena_used.

tarantool> mc = require('memcached')
creating arena with 4294967295 bytes...
allocate slab cache with slab size 4194304
---
...

tarantool> mc.slab.info()
---
- quota_used: 0
  quota_size: 4294967296
  quota_used_ratio: 0%
...

1. https://www.tarantool.io/en/doc/latest/reference/reference_lua/box_slab/slab_info/

Follows up #96
Testing for Tarantool 2.10 has been disabled in commit "github-ci: skip running
tests for tarantool 2.10" (763c73d) due to
issue #96. This patch reverts these changes because #96 is fixed.

Follows up #96
@ligurio ligurio force-pushed the ligurio/gh-96-own-small-alloc branch from 9a28693 to 17959b4 Compare April 6, 2022 09:42
@ligurio ligurio merged commit cdb28b0 into master Apr 6, 2022
@Totktonada Totktonada deleted the ligurio/gh-96-own-small-alloc branch May 12, 2022 17:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

All memcached tests fail with tarantool 2.10.0-beta1-376-gd2a012455 (master branch)
4 participants