Skip to content

Implement configurable bitfield allocation strategy #56737

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
vladris mannequin opened this issue Jul 10, 2011 · 10 comments
Closed

Implement configurable bitfield allocation strategy #56737

vladris mannequin opened this issue Jul 10, 2011 · 10 comments
Labels
topic-ctypes type-feature A feature request or enhancement

Comments

@vladris
Copy link
Mannequin

vladris mannequin commented Jul 10, 2011

BPO 12528
Nosy @terryjreedy, @amauryfa, @meadori
Files
  • configurable_bitfield_allocation.diff: Patch to support configurable bitfield allocation strategies
  • configurable_bitfield_allocation_v2.diff: Updated patch
  • ctypes-packed-bitfields.tar.bz2
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2011-07-10.19:52:36.799>
    labels = ['ctypes', 'type-feature']
    title = 'Implement configurable bitfield allocation strategy'
    updated_at = <Date 2017-07-23.00:20:21.166>
    user = 'https://bugs.python.org/vladris'

    bugs.python.org fields:

    activity = <Date 2017-07-23.00:20:21.166>
    actor = 'Charles Machalow'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['ctypes']
    creation = <Date 2011-07-10.19:52:36.799>
    creator = 'vladris'
    dependencies = []
    files = ['22618', '22728', '23084']
    hgrepos = []
    issue_num = 12528
    keywords = ['patch']
    message_count = 9.0
    messages = ['140088', '140092', '140842', '140989', '143339', '143342', '143351', '143368', '298881']
    nosy_count = 6.0
    nosy_names = ['terry.reedy', 'amaury.forgeotdarc', 'meador.inge', 'santoso.wijaya', 'vladris', 'Charles Machalow']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue12528'
    versions = ['Python 3.3']

    @vladris
    Copy link
    Mannequin Author

    vladris mannequin commented Jul 10, 2011

    Opened this issue to track configurable bitfield allocation strategy. This will address issues like http://bugs.python.org/issue6069, http://bugs.python.org/issue11920.

    Summary: the way bitfields are allocated is up to the compiler not defined by standard. MSVC and GCC have different strategies to perform the allocation so the size of bitfield structures can be different depending on compiler. Currently we hardcode allocation strategy to be GCC-way on non-Windows and MSVC-way on Windows which raises issues when trying to interop on Windows with GCC binaries.

    Short term this solution will enable interop between MSVC compiled Python with GCC compiled binaries under Windows. It will also enable addressing other possible compiler interop issues in the future, for compilers that don't use GCC strategy.

    Following is copied from thread discussing this:

    On 6/25/2011 12:33 PM, Vlad Riscutia wrote:
    I recently started looking at some ctypes issues. I dug a bit into
    http://bugs.python.org/issue6069 and then I found
    http://bugs.python.org/issue11920. They both boil down to the fact that
    bitfield allocation is up to the compiler, which is different in GCC and
    MSVC. Currently we have hard-coded allocation strategy based on paltform
    in cfields.c:

    if (bitsize /* this is a bitfield request */

           &&  *pfield_size /* we have a bitfield open */
     #ifdef MS_WIN32
           /* MSVC, GCC with -mms-bitfields */
           &&  dict->size * 8 == *pfield_size
     #else
           /* GCC */
           &&  dict->size * 8<= *pfield_size
     #endif
           &&  (*pbitofs + bitsize)<= *pfield_size) {
           /* continue bit field */
           fieldtype = CONT_BITFIELD;
     #ifndef MS_WIN32
       } else if (bitsize /* this is a bitfield request */
           &&  *pfield_size /* we have a bitfield open */
           &&  dict->size * 8>= *pfield_size
           &&  (*pbitofs + bitsize)<= dict->size * 8) {
           /* expand bit field */
           fieldtype = EXPAND_BITFIELD;
     #endif

    So when creating a bitfield structure, it's size can be different on
    Linux vs Windows.

    class MyStructure(ctypes.BigEndianStructure):
        _pack_      = 1    # aligned to 8 bits, not ctypes default of 32
        _fields_    = [
                       ("Data0",   ctypes.c_uint32, 32),
                       ("Data1",   ctypes.c_uint8, 3),
                       ("Data2",   ctypes.c_uint16, 12),
                      ]

    sizeof for above structure is 6 on GCC build and 7 on MSVC build. This
    leads to some confusion and issues, because we can't always interop
    correctly with code compiled with different compiler than the one Python
    is compiled with on the platform.

    Just curious, are you saying that this is the 'cause' of the two bug reports, or 'just' something you discovered while investigating them?

    Short term solution is to add a warning in the documentation about this.

    For 2.7/3.2, yes.

    Longer term though, I think it
    would be better to add a property on the Structure class for
    configurable allocation strategy, for example Native (default), GCC,
    MSVC and when allocating the bitfield, use given strategy. Native would
    behave similar to what happens now, but we would also allow GCC-style
    allocation on Windows for example and the ability to extend this if we
    ever run into similar issues with other compilers. This shouldn't be too
    difficult to implement, will be backwards compatible and it would
    improve interop. I would like to hear some opinions on this.

    If this would allow the MSVC-compilied Python to better access dlls compiled with gcc (cygwin) on Windows, definitely -- in 3.3.
    If the new feature is (currently) only useful on Windows, doc should say so.

    --
    Terry Jan Reedy

    /copy

    Attached is patch with initial refactoring of cfield.c to enable configurable allocation. Next step is to provide a way to configure this through Python library. I will also look at updating documentation to point out the known issue.

    @vladris vladris mannequin added topic-ctypes type-feature A feature request or enhancement labels Jul 10, 2011
    @vladris
    Copy link
    Mannequin Author

    vladris mannequin commented Jul 10, 2011

    Removed previously attached partial patch, this is complete patch.

    Summary:
    Added following 3 constants in ctypes:
    ctypes.BITFIELD_ALLOCATION_NATIVE
    ctypes.BITFIELD_ALLOCATION_GCC
    ctypes.BITFIELD_ALLOCATION_MSVC

    Setting _bitfield_allocation_ attribute to one of these on a class declaration inheriting from Structure will force specified allocation of the bitfield.

    NATIVE is equivalent to not specifying anything.
    GCC will do GCC-style allocation (what Python does now on non-Windows)
    MSVC will do MSVC-style allocation (what Python does now on Windows)

    I added unittests to cover these and ran full suit on both Windows and Linux.

    Still have to update documentation to mention this. Will submit diff for that after this gets reviewed.

    @amauryfa
    Copy link
    Contributor

    My review of the patch: http://bugs.python.org/review/12528/show

    @vladris
    Copy link
    Mannequin Author

    vladris mannequin commented Jul 23, 2011

    Updated patch to reflect review feedback. Allocation strategy is now specified as string in Python code.

    I kept asserts in CanContinueField/CanExpandField because, as I said, default case should never be hit. Input is validated in stgdict and this should make it clear to whoever reads the code that something is very wrong if execution gets to default case.

    @meadori
    Copy link
    Member

    meadori commented Sep 1, 2011

    As stated, how a particular compiler allocates bitfields is *extremely* implementation specific. There can be differences in implementations between different compilers, different *versions* of the same compiler, and different invocations of the same compiler where the options are varied.

    I am wondering whether adding this feature will open up a can of worms that we don't want to deal with. There are other options beyond MSVC and GCC that seem reasonable. For example, GCC packs bitfields together on structures defined with '__attribute__((packed))'. Do we need a GCCPACKED option now?

    Also, GCC 4.4 fixed a bug that can lead to differences in structure layout from previous versions. See -Wpacked-bitfield-compat option [1].
    Do we need a GCC44 option now?

    Finally, structure layout is architecture specific. GCC for x86, for example, has the 'ms_struct' attribute extensions for x86 [2]. Does this mean that for a GCC compiled Python that the MSVC option will only work for an x86 host?

    My point is that there are many, many variations on how a *single* compiler can allocate bitfields. So just saying "GCC allocation"
    strategy is not adequate.

    So, lets take a step back. What exact problem is this feature trying
    to solve? Is one of the use cases that 'ctypes' for a Windows hosted Python built with MSVC++ can interop with C code compiled with a Windows hosted GCC (e.g. cygwin)? If so, then is that realistic? ISTM, that there are other ABI differences that would prevent that from working.

    It seem perfectly reasonable to me that ctypes will only interact with bits that were constructed with the exact same compiler (and options) as the interpreter itself. If it is not already, then we should document this.

    [1] http://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html
    [2] http://gcc.gnu.org/onlinedocs/gcc/Variable-Attributes.html

    @vladris
    Copy link
    Mannequin Author

    vladris mannequin commented Sep 1, 2011

    Well currently we pack bitfields with an algorithm that uses #ifdefs for GCC and MSVC builds. This feature tries to remove the hardcoded behavior and implement it as a runtime option. This should improve interop with other compilers. Currently I provided these for MSVC-style and GCC-style bitfield allocations. These, of course, can be extended with other strategies.

    I am not sure that the fact that GCC has different types of bitfield allocations in different versions is a point against this feature. Consider that in our current code we don't use compiler bitfield allocation, we create the structure layout using our own algorithm, interop might be broken even if Python gets built with same version of GCC as the binary we want to interop with as long as algorithm is out of date. This patch should provide some flexibility in this matter.

    Wouldn't a GCC44 constant provided at API level be better than saying "you can't interop with anything build with GCC 4.4 and up"? Or vice-versa, anything built with GCC < 4.4.

    @terryjreedy
    Copy link
    Member

    As a (Windows) user, I would like to be able to download any working pre-compiled shared library (.dll) and access it via ctypes. The particular compiler used to compile cpythonx.y.z should not determine whether a Pythonx.y program works. The use of VSC2008 is not part of the Python3.2 definition. So I am in favor of features than makes ctypes more likely to work.

    I understand that this should be easy if the datatypes sent and received are standard ints, floats, and arrays thereof, since the bit patterns are then knowable. I gather that the problem with custom bitfields is that the resulting bit pattern format is not only not documented in the .dll, but is also not determined by the external documentation (the .h files).

    Does anyone know how Cython, for instance, handles this problem? Stephen Behnel recommends it as an alternative to ctypes. Does it even try to deal with bitfields?

    @meadori
    Copy link
    Member

    meadori commented Sep 2, 2011

    Hi Vlad,

    On Thu, Sep 1, 2011 at 1:30 PM, Vlad Riscutia <[email protected]> wrote:

    Vlad Riscutia <[email protected]> added the comment:

    Well currently we pack bitfields with an algorithm that uses #ifdefs for GCC and MSVC builds. This feature tries to remove the
    hardcoded behavior and implement it as a runtime option. This should improve interop with other compilers. Currently I provided these
    for MSVC-style and GCC-style bitfield allocations. These, of course, can be extended with other strategies.

    Yup, I understand what the feature is doing. I just wanted to
    reiterate (as I am sure you already know) that typically the rules for
    implementing bit fields in a particular compiler are pretty complex.
    I just wanted to make sure that the use cases are out there to justify
    the complexity. bpo-11920 really does seem like a use case where
    someone would otherwise be stuck without a runtime configurable
    allocation strategy.

    BTW, out of curiosity I explored the packed case that I mentioned
    earlier with GCC more (reproduction case attached):

    [meadori@motherbrain ctypes]$ make clean; make
    rm -f foo.o libfoo.so.1.0
    gcc -Wall -fPIC -c foo.c
    gcc -shared -Wl,-soname,libfoo.so.1 -o libfoo.so.1.0 foo.o
    python repro.py
    In Python: (85, 85)
    From C: (85, 85)
    From Python: (85, 85)
    [meadori@motherbrain ctypes]$ make clean; CFLAGS=-DPACK=1 make
    rm -f foo.o libfoo.so.1.0
    gcc -Wall -fPIC -DPACK=1 -c foo.c
    gcc -shared -Wl,-soname,libfoo.so.1 -o libfoo.so.1.0 foo.o
    python repro.py
    In Python: (85, 85)
    From C: (85, 85)
    From Python: (85, 170)

    This shows that there are already cases that can't be handled with
    packed bit fields and GCC. This runtime configuration feature, could
    fix this case as well. However, it is probably better to wait for a
    real world use case before implementing some pathological case that I
    cooked up ;-)

    Wouldn't a GCC44 constant provided at API level be better than saying "you can't interop with anything build with GCC 4.4 and up"? Or > vice-versa, anything built with GCC < 4.4.

    Yeah, probably so. I think the compiler constraint I stated before is
    probably a bit too strong. I need to think about it more. But,
    again, my point is that there are a lot of different variations when
    it comes to bit field allocations. I don't think we want to end up
    implementing all of them.

    @CharlesMachalow
    Copy link
    Mannequin

    CharlesMachalow mannequin commented Jul 23, 2017

    Was browsing and found this.

    This option would be very useful as it could help avoid a current bug I've had to deal with : https://bugs.python.org/issue29753. My use case works with data/structs from another device all-together, so I can't control it's packing. However since GCC/Linux builds currently have the specified bug, I can't get around it. This option would have allowed me to just specify the MSVC strategy, which in general is what I want to not have to have OS-specific behavior. With the bug in mind, this could have almost be a bug-workaround for me.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @encukou
    Copy link
    Member

    encukou commented May 13, 2025

    #97702 added a _layout_ attribute that allows specifying the layout, along with fixing GCC layout bugs and moving the layout logic to Python.
    Apologies for overlooking this patch.

    @encukou encukou closed this as completed May 13, 2025
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    topic-ctypes type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants