Skip to content

Fix Small String Optimization misplaced 0 terminator #5890

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 3 additions & 2 deletions cores/esp8266/WString.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -129,6 +129,7 @@ String::~String() {
// /*********************************************/

inline void String::init(void) {
memset(sso_buf, 0, sizeof(sso_buf));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see why the initial zeroing in ::init is needed, honestly. Can you explain? Once you apply a string of >12 chars to the String, a pointer and 2 length fields will be written over the 0s you wrote in ::init. At some later point you assign a 1-char string the SSO better write 0 to sso_buff[1] or it'll go poorly...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did test it without this memset and then the garbled texts re-appeared in the use case I showed in the issue.
The main reason I placed this memset in the init, is because that's the only point where I knew for sure that any data present would not be worth keeping.

Immediately after setting this buffer to 0's, the mode is switched to non-SSO.
So then all calls made in init are only on the _ptr struct.
N.B. this struct is then also set to contain 0's, but it does not overlap all the space occupied by sso_buf

In the constructor where I had the issues, only the copy is called after init.
This calls reserve, which calls changeBuffer.

In changeBuffer we enter this part, since the ptr.buff is set to 0 in init:

unsigned char String::changeBuffer(unsigned int maxStrLen) {
// Can we use SSO here to avoid allocation?
if (maxStrLen < sizeof(sso_buf)) {
if (sso() || !buffer()) {
// Already using SSO, nothing to do
setSSO(true);
return 1;

So only SSO is set and nothing else.
Since nothing has changed to the SSO buffer, but SSO is now active, a call to len() may return something other than 0 (it is calling strlen(sso_buf) which may return anything)

So the first byte of the buffer is not set to 0 and thus the String object may contain garbage and it is up to the compiler how this union is aligned.
If the pointer of _ptr is at the first byte of the sso_buf, then it should already be a zero, but apparently the Windows build is not locating the _prt struct at the same address as sso_buf.

By wiping as much as possible to 0 in this union struct, it is guaranteed it will contain zeroes at the start.
Also I guess it could differ between builds where this union is located and I don't know what all possible offsets are between the start of _ptr and sso_buf.

So that's why I think it is needed to have the sso_buf initialized to 0.
Does that make any sense?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AMD64 compiles are going to align things somewhat differently than X86_32 ones do (the pointer, for example, is gonna be 2x the size...). I don't know if Windows supports it w/the mingw(?) toolchain, but there is a 32-bit compile mode in the makefile which should help get closer to on-board alignment.

I'm not sure I fully follow, but my point is in the general case the sequence you've got is not guaranteed to happen, and there's possibility for the early bytes of the union to contain non-0s after you've called ::init and when switching. 0-ing out at ::init is safe, but I think it's unneeded.

I think the main problem is you found that when a SSO string was copied into the struct it did not get explicitly terminated w/a \0. Your patch does fix that, later on in the SSO assignment changeBuffer. My gut says that dropping this memset but leaving everything else untouched won't change the behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, leaving out this memset did not fix the issue (tested it).
I'm also not 100% convinced that whatever this init does fix will cover all situations, but at least it does something.
Maybe the suggestion I made yesterday does cover all and then this part in the init can be removed?

I don't understand what you tried to explain with the difference on 64-bit and 32 bit platforms.
Do the compilers generate different code even though they compile for the same 32 bit platform?
I know the size_t, ptrdiff_t and pointers have different sizes when compiling for 64 bit or 32 bit platforms, but when they compile for the same 32 bit platform they should align structures in memory the same, right?
Otherwise we might have a problem when serializing structs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was talking about the host testing. It can run either 32-bit or 64-bit on your system, depending on the compilers available and make options. The gcc-xtensa binaries should produce the exact same code under either host OS, of course.

setSSO(false);
setCapacity(0);
setLen(0);
Expand Down Expand Up @@ -161,11 +162,11 @@ unsigned char String::changeBuffer(unsigned int maxStrLen) {
return 1;
} else { // if bufptr && !sso()
// Using bufptr, need to shrink into sso_buff
char temp[sizeof(sso_buf)];
char temp[sizeof(sso_buf)] = {0};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about the following, which would be more efficient:

            char temp[sizeof(sso_buf)];
            memcpy(temp, buffer(), maxStrLen);
            free(wbuffer());
            setSSO(true);
            memcpy(wbuffer(), temp, maxStrLen);
            wbuffer()[maxStrLen] = 0;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess copying less may improve it a bit theoretically, but all other calls to memcpy regarding the sso_buf do copy the entire buffer.
So for the sake of consistency, it may be wise to have them the same.
The reason it wasn't , did trigger me to have a look at that line.

I don't know how much impact the zeroing has at the time of construction (on the stack).

But since we're also setting it to all zeros in the init(), it may be a bit more than needed.
N.B. That zero'ing in the init is really needed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems only setting a string to "" would have any different behavior in this patch as-is, vs. @devyte's which would explicitly write a 0 at maxstrlen. I think his is correct (but again, I'm having trouble seeing where this actually changes behavior...)

memcpy(temp, buffer(), maxStrLen);
free(wbuffer());
setSSO(true);
memcpy(wbuffer(), temp, maxStrLen);
memcpy(wbuffer(), temp, sizeof(sso_buf));
return 1;
}
}
Expand Down
10 changes: 10 additions & 0 deletions tests/host/core/test_string.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -83,6 +83,12 @@ TEST_CASE("String constructors", "[core][String]")
REQUIRE(ssh == "3.14159_abcd");
String flash = (F("hello from flash"));
REQUIRE(flash == "hello from flash");
const char textarray[6] = {'h', 'e', 'l', 'l', 'o', 0};
String hello(textarray);
REQUIRE(hello == "hello");
String hello2;
hello2 = textarray;
REQUIRE(hello2 == "hello");
}

TEST_CASE("String concantenation", "[core][String]")
Expand Down Expand Up @@ -360,4 +366,8 @@ TEST_CASE("String SSO works", "[core][String]")
REQUIRE(s == "0123456789abcdefghijklmnopqrstu");
REQUIRE(s.length() == 31);
}
s = "0123456789abcde";
s = s.substring(s.indexOf('a'));
REQUIRE(s == "abcde");
REQUIRE(s.length() == 5);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These and the prior added test pass on the unpatched core. Were they supposed to fail?

I was hoping there would be something that fails like your app code did, but passes once the patch is applied.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not test the tests myself. These were merely added to see how often they fail, but I guess it also depends on how the data on the stack is being allocated.
Does the test environment have some debug compile options which may set some allocated arrays to 0 first? (would perhaps be not the best options for a test environment I guess)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about this, but need @devyte's C++ internals help. Basically you want to get a RAM buffer of 16 bytes, fill it with 0xff, and then coerce a pointer to that buffer into a (String*). The bugger is you need to call the constructor on it, and I've no idea if that's possible explicitly in C++.

-edit-
StackOverflow seems to believe it's supported:
https://stackoverflow.com/questions/2494471/c-is-it-possible-to-call-a-constructor-directly-without-new

}