Skip to content

Optimize pack() #18524

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Optimize pack() #18524

wants to merge 2 commits into from

Conversation

nielsdos
Copy link
Member

@nielsdos nielsdos commented May 8, 2025

Instead of using lookup tables, we can use a combination of shifts and
byte swapping to achieve the same thing in less cycles and with less
code.

Benchmark files

pack1.php:

for ($i = 0; $i < 10_000_000; ++$i) {
    pack("J", 0x7FFFFFFFFFFFFFFF);
}

pack2.php:

for ($i = 0; $i < 4000000; ++$i) {
    pack("nvc*", 0x1234, 0x5678, 65, 66);
}

Results

On an i7-4790:

Benchmark 1: ./sapi/cli/php pack1.php
  Time (mean ± σ):     408.8 ms ±   3.4 ms    [User: 406.1 ms, System: 1.6 ms]
  Range (min … max):   403.6 ms … 413.6 ms    10 runs

Benchmark 2: ./sapi/cli/php_old pack1.php
  Time (mean ± σ):     451.7 ms ±   7.7 ms    [User: 448.5 ms, System: 2.0 ms]
  Range (min … max):   442.8 ms … 461.2 ms    10 runs

Summary
  ./sapi/cli/php pack1.php ran
    1.11 ± 0.02 times faster than ./sapi/cli/php_old pack1.php

Benchmark 1: ./sapi/cli/php pack2.php
  Time (mean ± σ):     239.3 ms ±   6.0 ms    [User: 236.2 ms, System: 2.3 ms]
  Range (min … max):   233.2 ms … 256.8 ms    12 runs

Benchmark 2: ./sapi/cli/php_old pack2.php
  Time (mean ± σ):     271.9 ms ±   3.3 ms    [User: 269.7 ms, System: 1.3 ms]
  Range (min … max):   267.4 ms … 279.0 ms    11 runs

Summary
  ./sapi/cli/php pack2.php ran
    1.14 ± 0.03 times faster than ./sapi/cli/php_old pack2.php

On an i7-1185G7:

Benchmark 1: ./sapi/cli/php pack1.php
  Time (mean ± σ):     263.7 ms ±   1.8 ms    [User: 262.6 ms, System: 0.9 ms]
  Range (min … max):   261.5 ms … 268.2 ms    11 runs

Benchmark 2: ./sapi/cli/php_old pack1.php
  Time (mean ± σ):     303.3 ms ±   6.5 ms    [User: 300.7 ms, System: 2.3 ms]
  Range (min … max):   297.4 ms … 318.1 ms    10 runs

Summary
  ./sapi/cli/php pack1.php ran
    1.15 ± 0.03 times faster than ./sapi/cli/php_old pack1.php

Benchmark 1: ./sapi/cli/php pack2.php
  Time (mean ± σ):     156.7 ms ±   2.9 ms    [User: 154.7 ms, System: 1.7 ms]
  Range (min … max):   151.6 ms … 164.7 ms    19 runs

Benchmark 2: ./sapi/cli/php_old pack2.php
  Time (mean ± σ):     174.6 ms ±   3.3 ms    [User: 171.9 ms, System: 2.3 ms]
  Range (min … max):   170.7 ms … 180.4 ms    17 runs

Summary
  ./sapi/cli/php pack2.php ran
    1.11 ± 0.03 times faster than ./sapi/cli/php_old pack2.php

@staabm
Copy link
Contributor

staabm commented May 21, 2025

maybe a interessting benchmark/test-case for unpack: stomp-php/stomp-php#184

<?php
$file = file_get_contents('FILE');
echo count(unpack('C*', $file)) . "\n";

vs.

<?php
$file = file_get_contents('FILE');
echo strlen($file) . "\n";

using truncate -s 80M FILE.

the strlen() variant is a lot faster

@nielsdos
Copy link
Member Author

nielsdos commented Jun 8, 2025

maybe a interessting benchmark/test-case for unpack: stomp-php/stomp-php#184

<?php
$file = file_get_contents('FILE');
echo count(unpack('C*', $file)) . "\n";

vs.

<?php
$file = file_get_contents('FILE');
echo strlen($file) . "\n";

using truncate -s 80M FILE.

the strlen() variant is a lot faster

Unpack will always be slower than just strlen. However, your code revealed that repetitions were handled in a slow way where lots of temporary strings were created and then parsed. I opened a PR to fix that particular issue: #18803

@nielsdos nielsdos mentioned this pull request Jun 10, 2025
@divinity76
Copy link
Contributor

divinity76 commented Jun 10, 2025

could benchmark

static inline void php_pack(const zval *val, size_t size,
                            php_pack_endianness enc, char *out)
{
    zend_long z = zval_get_long(val);

    if ((enc == PHP_LITTLE_ENDIAN) != MACHINE_LITTLE_ENDIAN) {
        z = PHP_LONG_BSWAP(z);
    }
    memcpy(out, (char*)&z + sizeof(z) - size, size);
}

might be faster

@nielsdos
Copy link
Member Author

Very strangely, my original code with zend_never_inline is slightly faster than master, but your code without zend_never_inline seems to beat that in my test with the 'J' specifier. Testing some more stuff...

@divinity76
Copy link
Contributor

if the performance difference insignificant/marginal, as in hardly even benchmark-able, i would recommend just ignoring it.

I like how this makes pack the code much simpler (assuming it actually works on BE)

@nielsdos
Copy link
Member Author

I think I managed to make the compiler happy and let it make good inlining decisions while keeping the code simple.

For example for this:

for ($i = 0; $i < 10_000_000; ++$i) {
  pack("J", 0x7FFFFFFFFFFFFFFF);
}

On an i7-4790:

Benchmark 1: ./sapi/cli/php pack.php
  Time (mean ± σ):     408.8 ms ±   3.4 ms    [User: 406.1 ms, System: 1.6 ms]
  Range (min … max):   403.6 ms … 413.6 ms    10 runs
 
Benchmark 2: ./sapi/cli/php_old pack.php
  Time (mean ± σ):     451.7 ms ±   7.7 ms    [User: 448.5 ms, System: 2.0 ms]
  Range (min … max):   442.8 ms … 461.2 ms    10 runs
 
Summary
  ./sapi/cli/php pack.php ran
    1.11 ± 0.02 times faster than ./sapi/cli/php_old pack.php

And for this:

for ($i=0;$i<4000000;$i++)
pack("nvc*", 0x1234, 0x5678, 65, 66);

On the same machine:

Benchmark 1: ./sapi/cli/php pack.php
  Time (mean ± σ):     239.3 ms ±   6.0 ms    [User: 236.2 ms, System: 2.3 ms]
  Range (min … max):   233.2 ms … 256.8 ms    12 runs
 
Benchmark 2: ./sapi/cli/php_old pack.php
  Time (mean ± σ):     271.9 ms ±   3.3 ms    [User: 269.7 ms, System: 1.3 ms]
  Range (min … max):   267.4 ms … 279.0 ms    11 runs
 
Summary
  ./sapi/cli/php pack.php ran
    1.14 ± 0.03 times faster than ./sapi/cli/php_old pack.php

Let's hope it's reproducible

@nielsdos nielsdos force-pushed the opt-pack branch 2 times, most recently from 47a5320 to 3b1918b Compare June 11, 2025 19:01
@nielsdos nielsdos changed the title [WIP] Optimize pack() Optimize pack() Jun 11, 2025
@nielsdos
Copy link
Member Author

It's consistent on my laptop. Squashed the commits and updated the descriptions.

@nielsdos nielsdos marked this pull request as ready for review June 11, 2025 19:29
@nielsdos nielsdos requested a review from bukka as a code owner June 11, 2025 19:29
nielsdos added 2 commits June 12, 2025 19:24
Instead of using lookup tables, we can use a combination of shifts and
byte swapping to achieve the same thing in less cycles and with less
code.

Benchmark files
---------------

pack1.php:
```php
for ($i = 0; $i < 10_000_000; ++$i) {
    pack("J", 0x7FFFFFFFFFFFFFFF);
}
```

pack2.php:
```php
for ($i = 0; $i < 4000000; ++$i) {
    pack("nvc*", 0x1234, 0x5678, 65, 66);
}
```

On an i7-4790:
```
Benchmark 1: ./sapi/cli/php pack1.php
  Time (mean ± σ):     408.8 ms ±   3.4 ms    [User: 406.1 ms, System: 1.6 ms]
  Range (min … max):   403.6 ms … 413.6 ms    10 runs

Benchmark 2: ./sapi/cli/php_old pack1.php
  Time (mean ± σ):     451.7 ms ±   7.7 ms    [User: 448.5 ms, System: 2.0 ms]
  Range (min … max):   442.8 ms … 461.2 ms    10 runs

Summary
  ./sapi/cli/php pack1.php ran
    1.11 ± 0.02 times faster than ./sapi/cli/php_old pack1.php

Benchmark 1: ./sapi/cli/php pack2.php
  Time (mean ± σ):     239.3 ms ±   6.0 ms    [User: 236.2 ms, System: 2.3 ms]
  Range (min … max):   233.2 ms … 256.8 ms    12 runs

Benchmark 2: ./sapi/cli/php_old pack2.php
  Time (mean ± σ):     271.9 ms ±   3.3 ms    [User: 269.7 ms, System: 1.3 ms]
  Range (min … max):   267.4 ms … 279.0 ms    11 runs

Summary
  ./sapi/cli/php pack2.php ran
    1.14 ± 0.03 times faster than ./sapi/cli/php_old pack2.php
```

On an i7-1185G7:
```
Benchmark 1: ./sapi/cli/php pack1.php
  Time (mean ± σ):     263.7 ms ±   1.8 ms    [User: 262.6 ms, System: 0.9 ms]
  Range (min … max):   261.5 ms … 268.2 ms    11 runs

Benchmark 2: ./sapi/cli/php_old pack1.php
  Time (mean ± σ):     303.3 ms ±   6.5 ms    [User: 300.7 ms, System: 2.3 ms]
  Range (min … max):   297.4 ms … 318.1 ms    10 runs

Summary
  ./sapi/cli/php pack1.php ran
    1.15 ± 0.03 times faster than ./sapi/cli/php_old pack1.php

Benchmark 1: ./sapi/cli/php pack2.php
  Time (mean ± σ):     156.7 ms ±   2.9 ms    [User: 154.7 ms, System: 1.7 ms]
  Range (min … max):   151.6 ms … 164.7 ms    19 runs

Benchmark 2: ./sapi/cli/php_old pack2.php
  Time (mean ± σ):     174.6 ms ±   3.3 ms    [User: 171.9 ms, System: 2.3 ms]
  Range (min … max):   170.7 ms … 180.4 ms    17 runs

Summary
  ./sapi/cli/php pack2.php ran
    1.11 ± 0.03 times faster than ./sapi/cli/php_old pack2.php
```

Co-authored-by: [email protected]
@nielsdos
Copy link
Member Author

nielsdos commented Jun 12, 2025

Rebased to solve conflict

Copy link
Member

@bukka bukka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I verified the logic and it looks good. Nice work with reducing all that stuff. Approving with an assumption that you managed to somehow test it on BE? If not, maybe @NattyNarwhal might be able to test it.

@NattyNarwhal
Copy link
Member

I quickly tested on the CI box, lot of pack stuff broken:

iconv() test 2 (UCS4BE to ASCII) [ext/iconv/tests/iconv002.phpt]
Phar: opendir test - no dir specified at all [ext/phar/tests/017.phpt]
Phar: opendir test, root directory [ext/phar/tests/018.phpt]
Phar: opendir test, subdirectory [ext/phar/tests/019.phpt]
Phar: opendir test, recurse into [ext/phar/tests/019b.phpt]
Phar: opendir test, recurse into [ext/phar/tests/019c.phpt]
GHSA-h35g-vwh6-m678 (mysqlnd leaks partial content of the heap - auth message buffer over-read) [ext/mysqli/tests/ghsa-h35g-vwh6-m678-auth-message.phpt]
Phar object: unset file [ext/phar/tests/phar_oo_012b.phpt]
GH-13833 (Applying zero offset to null pointer in zend_hash.c) [ext/phar/tests/gh13833.phpt]
Phar: url stat [ext/phar/tests/020.phpt]
GHSA-h35g-vwh6-m678 (mysqlnd leaks partial content of the heap - tabular default) [ext/mysqli/tests/ghsa-h35g-vwh6-m678-def.phpt]
Phar: stream stat [ext/phar/tests/021.phpt]
GHSA-h35g-vwh6-m678 (mysqlnd leaks partial content of the heap - upsert filename buffer over-read) [ext/mysqli/tests/ghsa-h35g-vwh6-m678-filename.phpt]
Phar: stream stat [ext/phar/tests/022.phpt]
Phar::isWriteable [ext/phar/tests/phar_oo_iswriteable.phpt]
Phar: test that refcounting avoids problems with deleting a file [ext/phar/tests/refcount1.phpt]
Phar::getSignature() no signature [ext/phar/tests/phar_oo_nosig.phpt]
Phar: phar:// file_get_contents [ext/phar/tests/023.phpt]
GHSA-h35g-vwh6-m678 (mysqlnd leaks partial content of the heap - stmt row no space for the field) [ext/mysqli/tests/ghsa-h35g-vwh6-m678-query-len-overflow.phpt]
GH-18642 (Signed integer overflow in ext/phar fseek) [ext/phar/tests/gh18642.phpt]
Phar: rename test [ext/phar/tests/rename.phpt]
GHSA-h35g-vwh6-m678 (mysqlnd leaks partial content of the heap - stmt row bit buffer over-read) [ext/mysqli/tests/ghsa-h35g-vwh6-m678-stmt-row-bit.phpt]
Phar: rename_dir test [ext/phar/tests/rename_dir.phpt]
Phar: phar:// include [ext/phar/tests/024.phpt]
Phar with metadata (read) [ext/phar/tests/phar_metadata_read.phpt]
Phar::setAlias() [ext/phar/tests/phar_setalias.phpt]
Phar: phar:// include (repeated names) [ext/phar/tests/025.phpt]
Phar: rename_dir and mount test [ext/phar/tests/rename_dir_and_mount.phpt]
GHSA-h35g-vwh6-m678 (mysqlnd leaks partial content of the heap - stmt row date buffer over-read) [ext/mysqli/tests/ghsa-h35g-vwh6-m678-stmt-row-date.phpt]
Phar::setAlias() error [ext/phar/tests/phar_setalias2.phpt]
Phar with metadata (write) [ext/phar/tests/phar_metadata_write.phpt]
Phar: phar:// require from within [ext/phar/tests/026.phpt]
Phar: rmdir test [ext/phar/tests/rmdir.phpt]
Phar with object in metadata [ext/phar/tests/phar_metadata_write2.phpt]
Phar::mapPhar too many manifest entries [ext/phar/tests/009.phpt]
GHSA-h35g-vwh6-m678 (mysqlnd leaks partial content of the heap - stmt row datetime buffer over-read) [ext/mysqli/tests/ghsa-h35g-vwh6-m678-stmt-row-datetime.phpt]
Phar: phar:// opendir [ext/phar/tests/027.phpt]
Phar with meta-data (read) [ext/phar/tests/metadata_read.phpt]
Phar::mapPhar buffer overrun [ext/phar/tests/010.phpt]
Phar with unsafe object in metadata does not unserialize on reading a file. [ext/phar/tests/phar_metadata_write3.phpt]
Phar::loadPhar [ext/phar/tests/028.phpt]
GHSA-h35g-vwh6-m678 (mysqlnd leaks partial content of the heap - stmt row double buffer over-read) [ext/mysqli/tests/ghsa-h35g-vwh6-m678-stmt-row-double.phpt]
Phar with meta-data (write) [ext/phar/tests/metadata_write.phpt]
Phar::mapPhar filesize too small in manifest [ext/phar/tests/011.phpt]
Phar with object in metadata [ext/phar/tests/phar_metadata_write4.phpt]
Phar::loadPhar overloading alias names [ext/phar/tests/029.phpt]
Phar::mapPhar valid file [ext/phar/tests/012.phpt]
Phar with meta-data (write) [ext/phar/tests/metadata_write_commit.phpt]
GHSA-h35g-vwh6-m678 (mysqlnd leaks partial content of the heap - stmt row int buffer over-read) [ext/mysqli/tests/ghsa-h35g-vwh6-m678-stmt-row-float.phpt]
Phar::convertToZip|Tar|Phar() repeated (phar_based archives) [ext/phar/tests/phar_convert_repeated.phpt]
Phar::loadPhar ignoring alias [ext/phar/tests/030.phpt]
Phar::mapPhar filesize mismatch [ext/phar/tests/013.phpt]
Phar::convertToTar() [ext/phar/tests/phar_convert_tar.phpt]
Phar: include and parser error [ext/phar/tests/031.phpt]
Phar::mapPhar filesize mismatch [ext/phar/tests/014.phpt]
GHSA-h35g-vwh6-m678 (mysqlnd leaks partial content of the heap - stmt row int buffer over-read) [ext/mysqli/tests/ghsa-h35g-vwh6-m678-stmt-row-int.phpt]
Phar::convertToZip() [ext/phar/tests/phar_convert_zip.phpt]
Phar: require hash [ext/phar/tests/032.phpt]
GHSA-h35g-vwh6-m678 (mysqlnd leaks partial content of the heap - stmt row no space for the field) [ext/mysqli/tests/ghsa-h35g-vwh6-m678-stmt-row-no-space.phpt]
Phar::setStub() [ext/phar/tests/phar_stub.phpt]
Phar: delete test [ext/phar/tests/delete.phpt]
Phar::chmod [ext/phar/tests/033.phpt]
Phar object: basics [ext/phar/tests/phar_oo_001.phpt]
Phar: fopen a .phar for writing (existing file) [ext/phar/tests/open_for_write_existing.phpt]
Phar::setStub()/getStub() [ext/phar/tests/phar_stub_error.phpt]
GHSA-h35g-vwh6-m678 (mysqlnd leaks partial content of the heap - stmt row string buffer over-read) [ext/mysqli/tests/ghsa-h35g-vwh6-m678-stmt-row-string.phpt]
Phar: delete a file within a .phar [ext/phar/tests/delete_in_phar.phpt]
Phar::chmod [ext/phar/tests/033a.phpt]
Phar: fopen a .phar for writing (existing file) [ext/phar/tests/open_for_write_existing_b.phpt]
Phar::setStub()/getStub() [ext/phar/tests/phar_stub_write.phpt]
Phar: delete a file within a .phar [ext/phar/tests/delete_in_phar_b.phpt]
Phar object: iterator & entries [ext/phar/tests/phar_oo_002.phpt]
GHSA-h35g-vwh6-m678 (mysqlnd leaks partial content of the heap - stmt row time buffer over-read) [ext/mysqli/tests/ghsa-h35g-vwh6-m678-stmt-row-time.phpt]
Phar: create new Phar with broken.dirname in path [ext/phar/tests/phar_dotted_path.phpt]
Phar: fopen a .phar for writing (existing file) [ext/phar/tests/open_for_write_existing_c.phpt]
Phar::setStub()/getStub() from file [ext/phar/tests/phar_stub_write_file.phpt]
Phar: delete a file within a .phar (confirm disk file is changed) [ext/phar/tests/delete_in_phar_confirm.phpt]
Phar object: entry & openFile() [ext/phar/tests/phar_oo_003.phpt]
Phar: fopen a .phar for writing (new file) [ext/phar/tests/open_for_write_newfile.phpt]
Phar::unlinkArchive() [ext/phar/tests/phar_unlinkarchive.phpt]
Phar and DirectoryIterator [ext/phar/tests/phar_oo_004.phpt]
Phar: fopen a .phar for writing (new file) [ext/phar/tests/open_for_write_newfile_b.phpt]
Phar and RecursiveDirectoryIterator [ext/phar/tests/phar_oo_005.phpt]
Phar: fopen a .phar for writing (new file) [ext/phar/tests/open_for_write_newfile_c.phpt]
Phar object: array access [ext/phar/tests/phar_oo_006.phpt]
Phar object: access through SplFileObject [ext/phar/tests/phar_oo_007.phpt]
Phar: PharFileInfo::getCRC32 [ext/phar/tests/pharfileinfo_getcrc32.phpt]
Phar object: iterating via SplFileObject [ext/phar/tests/phar_oo_008.phpt]
Phar object: iterating via SplFileObject and reading csv [ext/phar/tests/phar_oo_009.phpt]
Phar object: ArrayAccess and isset [ext/phar/tests/phar_oo_010.phpt]
Phar object: add file [ext/phar/tests/phar_oo_011.phpt]
Phar object: add file [ext/phar/tests/phar_oo_011b.phpt]
Phar object: unset file [ext/phar/tests/phar_oo_012.phpt]
Phar object: unset file (confirm disk file is changed) [ext/phar/tests/phar_oo_012_confirm.phpt]
Random: Randomizer: getFloat(): Returned floats have equal distance. [ext/random/tests/03_randomizer/methods/getFloat_equal_steps.phpt]
Random: Randomizer: getInt(): Returned values with insufficient bits are correctly expanded [ext/random/tests/03_randomizer/methods/getInt_expansion_32.ph
Bug #38770 (unpack() broken with longs on 64 bit machines) [ext/standard/tests/strings/bug38770.phpt]
Bug #61764: 'I' unpacks n as signed if n > 2^31-1 on LP64 [ext/standard/tests/strings/bug61764.phpt]
htmlentities() / htmlspecialchars() ENT_DISALLOWED [ext/standard/tests/strings/htmlentities20.phpt]
htmlentities() conformance check (HTML 4) [ext/standard/tests/strings/htmlentities_html4.phpt]
htmlentities() conformance check (HTML 5) [ext/standard/tests/strings/htmlentities_html5.phpt]
64bit pack()/unpack() tests [ext/standard/tests/strings/pack64.phpt]
test unpack() to array with named keys [ext/standard/tests/strings/pack_arrays.phpt]
Bug #68225 unpack and X format code [ext/standard/tests/strings/unpack_bug68225.phpt]
unpack() with offset [ext/standard/tests/strings/unpack_offset.phpt]
Test base64_encode() function : basic functionality [ext/standard/tests/url/base64_encode_basic_001.phpt]
MySQL protocol - statement row data fetch) [ext/mysqli/tests/protocol_query_row_fetch_data.phpt]
MySQL protocol - statement row data fetch) [ext/mysqli/tests/protocol_stmt_row_fetch_data.phpt]

(ignoring failed tests that are already failing... I'll have to look at those monday)

@nielsdos
Copy link
Member Author

Probably something subtle with shifting around, I can take a look thursday

@divinity76
Copy link
Contributor

If someone has a big endian server running ssh they'd be willing to share, that would be helpful.
Fwiw nielsdos ssh keys are available at https://github.com/nielsdos.keys

Likewise my keys at https://github.com/divinity76.keys

@bukka
Copy link
Member

bukka commented Jun 15, 2025

You should be able to emulate it in qemu but haven't tried it myself so not sure if it's any good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants