Skip to content

Allow attaching to existing macOS process #190

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
May 12, 2024
Merged

Allow attaching to existing macOS process #190

merged 5 commits into from
May 12, 2024

Conversation

vvuk
Copy link
Collaborator

@vvuk vvuk commented May 11, 2024

Implement attaching to existing mac processes. This needs samply to be signed with the debugger entitlement:

  1. Create a file ent.xml:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>com.apple.security.cs.debugger</key>
	<true/>
</dict>
</plist>
  1. cargo build
  2. codesign --force --options runtime --sign - --entitlements ent.xml target/debug/samply

The basic functionality works, but there is still work to do:

  1. This doesn't capture jitdump or markers. It's possible to just watch /tmp/ for appropriately-named jitdump files, but a real solution would need cooperation with the existing process to notify it to dump all existing jit compiled info to jitdump files.
  2. This won't capture child processes. That relies on the dyld preload to send the task of every new process back, and there's no preload here because there's no env vars set.

For 2, this might be possible to sort out if can set the dyld preload env var in the target process. I don't know if there's a simpler way to do this, but one idea is to load the preload shared library into the target (I think this is doable?) and then create a new thread in that process that calls an init function from the preload lib which will set the process env.

@vvuk vvuk requested a review from mstange May 11, 2024 19:35
@vvuk
Copy link
Collaborator Author

vvuk commented May 11, 2024

Hrm. Being able to attach to newly created children seems very complicated, borderline not doable. In order to write to the environment, the magic seems to be the _NSGetEnviron symbol which returns a char*** (pointer to the location where the array of char* that form the environment is stored). There are many problems with this path:

  • We need to figure out the address of _NSGetEnviron from libSystem. This is maybe possible, since we have the library info for the target task, and we can load debug info... we should be able to just forward-resolve the symbol ourselves.
  • Once we got the address of this symbol, in theory we should be able to read task memory of the relocated instructions and.. maybe.. compute the actual address of the environment pointer. I'm not sure if we'll have enough info.
  • Assuming we get the environment pointer, we can allocate and write task memory to create a new env block
  • ... but that will blow up the next time the process tries to setenv or similar because setenv and friends use realloc, so the memory needs to come from malloc

Spawning a thread in the target and executing code seems like maybe more possible, especially if that code can be dlopen()/dlsym()/call. Reading through this stuff it pointed me to this threadexec library for which the source is pretty damn complex. But one interesting bit is that it takes the address of a function in the local process, and uses that same address in the target process... so maybe the dyld cache is not mapped at different locations in different processes? I can't believe that wouldn't be the case for security, maybe it wasn't 6 years ago.

Given all this, I'm inclined to just not support attaching to new children when profiling a target process. It should be possible to attach to existing children, though.

@vvuk vvuk marked this pull request as ready for review May 11, 2024 22:57
@mstange
Copy link
Owner

mstange commented May 11, 2024

For profiling child processes, we could poll proc_listchildpids, if it doesn't have too much overhead. There's a Rust wrapper in remoteprocess::Process::child_processes. We'd miss the very beginning of new processes, but it's probably good enough for most use cases.

@mstange
Copy link
Owner

mstange commented May 11, 2024

And for profiling system-wide, if it's even possible with acceptable overhead, we could list all processes using the KERN_PROC_ALL sysctl, like lldb does in Host::FindProcessesImpl.

@mstange
Copy link
Owner

mstange commented May 11, 2024

Thanks for investigating this! I was imagining using a sudo subprocess for this, but self-signing samply is a good idea too. We could even have both. And we could have a samply setup command that does the self-signing for the user, with some kind of interactive wizard.

Comment on lines 244 to 246
pub fn take_task(&mut self) -> mach_port_t {
mem::replace(&mut self.task, MACH_PORT_NULL)
// TODO: I think it's safe to not do this mem::replace; we may need to resume the task
// in start_execution()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then this method should be renamed to just task().

I think it's fine to not mem::replace this for now because we never call mach_deallocate to drop our reference to these task ports. It might be cleaner to do that; once we do, we could store an Arc<OwnedPort> here.

Comment on lines +40 to +41
let mut root_child = self.launch_child();
let mut exit_status = root_child.wait().expect("couldn't wait for child");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If samply has debugger entitlements, can we call task_for_pid for root_child.id() if we're launching system binaries or shell scripts? Or will those still be denied?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still denied. (Whether as root or entitled.) Anything that's protected by SIP is off the table completely I think.

@mstange
Copy link
Owner

mstange commented May 12, 2024

Spawning a thread in the target and executing code seems like maybe more possible, especially if that code can be dlopen()/dlsym()/call. Reading through this stuff it pointed me to this threadexec library for which the source is pretty damn complex.

I wanted to point you at Listing 12-9 from the *OS Internals Book 1, but then I found this post which has further improved on it, and one of the comments on the gist with the full code links to this implementation which is arm64 compatible.

@vvuk
Copy link
Collaborator Author

vvuk commented May 12, 2024

Thanks for investigating this! I was imagining using a sudo subprocess for this, but self-signing samply is a good idea too. We could even have both. And we could have a samply setup command that does the self-signing for the user, with some kind of interactive wizard.

Yep, I was thinking samply setup too. Followup PR though. For root -- I think self-signing via setup and/or just calling via sudo directly should be sufficient, I don't think we need a separate subprocess. I swear I read somewhere that at some point Apple will disallow task_for_pid even for root processes if they don't have the entitlement, but I can't find it. It works right now; same code, just running sudo samply without code signing.

I wanted to point you at Listing 12-9 from the *OS Internals Book 1, but then I found this post which has further improved on it, and one of the comments on the gist with the full code links to this implementation which is arm64 compatible.

Awesome, thanks! Good to know, though I probably won't go down this route any time soon.

@mstange
Copy link
Owner

mstange commented May 12, 2024

and/or just calling via sudo directly should be sufficient

I want to discourage sudo samply record because this would also run the webserver as root. I think it would also leave a profile.json file that's not writable without sudo.

@vvuk vvuk enabled auto-merge (squash) May 12, 2024 19:36
@vvuk vvuk merged commit 3049f62 into main May 12, 2024
15 checks passed
@vvuk vvuk deleted the mac-attach branch May 12, 2024 19:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants