[lldb] Have disassembler show load addresses when using a core file #115453
We got a bug report that the disassembler output was not relocated (i.e., it did not use load addresses) for a core file, unlike for a live process. It turns out this behavior depends on whether the instructions were read from an executable file or from process memory (a core file will not typically contain the memory image for segments backed by an executable file).

It's unclear whether this behavior is intentional, or whether it was just trying to handle the case where we're disassembling a module without a process, but I think it's undesirable. What makes it particularly confusing is that the instruction addresses are relocated in this case (unlike when we don't have a process), so with large files and addresses it gets very hard to see whether the relocation has been applied or not.

This patch removes the data_from_file check so that the instruction is relocated regardless of where it was read from. It will still not be relocated in the raw module use case, because such modules have no load address and so cannot be relocated anywhere.
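The effect of the removed check can be illustrated with a small standalone sketch. This is not LLDB code: `SketchAddress`, `ChoosePcBefore`, `ChoosePcAfter`, and `ShortJumpTarget` are hypothetical names introduced here, and the sketch assumes only what the description states, namely that the PC handed to the disassembler used to be the load address only when the bytes came from process memory. A PC-relative short jump is used because its printed target depends on that PC.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for an address: a file address plus an optional
// load address (absent when the module is not loaded into any process).
struct SketchAddress {
  std::uint64_t file_addr;
  std::optional<std::uint64_t> load_addr;
};

// Assumed behavior before the patch: the load address was used only when
// the instruction bytes were read from process memory.
std::uint64_t ChoosePcBefore(const SketchAddress &addr, bool data_from_file) {
  if (!data_from_file && addr.load_addr)
    return *addr.load_addr;
  return addr.file_addr;
}

// Assumed behavior after the patch: the load address is used whenever one
// exists, regardless of where the bytes were read from.
std::uint64_t ChoosePcAfter(const SketchAddress &addr) {
  return addr.load_addr.value_or(addr.file_addr);
}

// Target of a two-byte x86 short jump (opcode EB, signed 8-bit displacement).
// The displacement is relative to the address of the next instruction.
std::uint64_t ShortJumpTarget(std::uint64_t pc, std::int8_t rel8) {
  return pc + 2 + static_cast<std::int64_t>(rel8);
}
```

With a file address of 0x2 and a load address of 0x4002 (the values used for `main` in the updated command-disassemble-process.yaml test), the bytes `EB 00` resolve to 0x4 under the old rule when the data came from a file, and to 0x4004 under the new rule. With no load address at all, both rules fall back to the file address, which corresponds to the `jmp 0x2` expectation in command-disassemble.s.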
@llvm/pr-subscribers-lldb
Author: Pavel Labath (labath)
Full diff: https://github.com/llvm/llvm-project/pull/115453.diff
3 Files Affected:
diff --git a/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp b/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp
index 31edd8d46c444e..08264d837f9c23 100644
--- a/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp
+++ b/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp
@@ -583,7 +583,6 @@ class InstructionLLVMC : public lldb_private::Instruction {
lldb::addr_t pc = m_address.GetFileAddress();
m_using_file_addr = true;
- const bool data_from_file = disasm->m_data_from_file;
bool use_hex_immediates = true;
Disassembler::HexImmediateStyle hex_style = Disassembler::eHexStyleC;
@@ -593,12 +592,10 @@ class InstructionLLVMC : public lldb_private::Instruction {
use_hex_immediates = target->GetUseHexImmediates();
hex_style = target->GetHexImmediateStyle();
- if (!data_from_file) {
- const lldb::addr_t load_addr = m_address.GetLoadAddress(target);
- if (load_addr != LLDB_INVALID_ADDRESS) {
- pc = load_addr;
- m_using_file_addr = false;
- }
+ const lldb::addr_t load_addr = m_address.GetLoadAddress(target);
+ if (load_addr != LLDB_INVALID_ADDRESS) {
+ pc = load_addr;
+ m_using_file_addr = false;
}
}
}
diff --git a/lldb/test/Shell/Commands/command-disassemble-process.yaml b/lldb/test/Shell/Commands/command-disassemble-process.yaml
index 75be1a42fb196d..ce1b37bc8aea7a 100644
--- a/lldb/test/Shell/Commands/command-disassemble-process.yaml
+++ b/lldb/test/Shell/Commands/command-disassemble-process.yaml
@@ -20,7 +20,7 @@
# CHECK: (lldb) disassemble
# CHECK-NEXT: command-disassemble-process.exe`main:
-# CHECK-NEXT: 0x4002 <+0>: addb %al, (%rcx)
+# CHECK-NEXT: 0x4002 <+0>: jmp 0x4004 ; <+2>
# CHECK-NEXT: -> 0x4004 <+2>: addb %al, (%rdx)
# CHECK-NEXT: 0x4006 <+4>: addb %al, (%rbx)
# CHECK-NEXT: 0x4008 <+6>: addb %al, (%rsi)
@@ -32,7 +32,7 @@
# CHECK-NEXT: 0x400a: addb %al, (%rdi)
# CHECK-NEXT: (lldb) disassemble --frame
# CHECK-NEXT: command-disassemble-process.exe`main:
-# CHECK-NEXT: 0x4002 <+0>: addb %al, (%rcx)
+# CHECK-NEXT: 0x4002 <+0>: jmp 0x4004 ; <+2>
# CHECK-NEXT: -> 0x4004 <+2>: addb %al, (%rdx)
# CHECK-NEXT: 0x4006 <+4>: addb %al, (%rbx)
# CHECK-NEXT: 0x4008 <+6>: addb %al, (%rsi)
@@ -44,13 +44,13 @@
# CHECK-NEXT: 0x400a: addb %al, (%rdi)
# CHECK-NEXT: (lldb) disassemble --address 0x4004
# CHECK-NEXT: command-disassemble-process.exe`main:
-# CHECK-NEXT: 0x4002 <+0>: addb %al, (%rcx)
+# CHECK-NEXT: 0x4002 <+0>: jmp 0x4004 ; <+2>
# CHECK-NEXT: -> 0x4004 <+2>: addb %al, (%rdx)
# CHECK-NEXT: 0x4006 <+4>: addb %al, (%rbx)
# CHECK-NEXT: 0x4008 <+6>: addb %al, (%rsi)
# CHECK-NEXT: (lldb) disassemble --count 7
# CHECK-NEXT: command-disassemble-process.exe`main:
-# CHECK-NEXT: 0x4002 <+0>: addb %al, (%rcx)
+# CHECK-NEXT: 0x4002 <+0>: jmp 0x4004 ; <+2>
# CHECK-NEXT: -> 0x4004 <+2>: addb %al, (%rdx)
# CHECK-NEXT: 0x4006 <+4>: addb %al, (%rbx)
# CHECK-NEXT: 0x4008 <+6>: addb %al, (%rsi)
@@ -81,32 +81,32 @@ Sections:
- Name: .text
Type: SHT_PROGBITS
Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
- Address: 0x0000000000004000
+ Address: 0x0000000000000000
AddressAlign: 0x0000000000001000
- Content: 00000001000200030006000700080009000A000B000E000F00100011001200130016001700180019001A001B001E001F00200021002200230026002700280029002A002B002E002F
+ Content: 0000EB00000200030006000700080009000A000B000E000F00100011001200130016001700180019001A001B001E001F00200021002200230026002700280029002A002B002E002F
Size: 0x10000
- Name: .note.gnu.build-id
Type: SHT_NOTE
Flags: [ SHF_ALLOC ]
- Address: 0x0000000000005000
+ Address: 0x0000000000001000
AddressAlign: 0x0000000000001000
Content: 040000000800000003000000474E5500DEADBEEFBAADF00D
Symbols:
- Name: main
Type: STT_FUNC
Section: .text
- Value: 0x0000000000004002
+ Value: 0x0000000000000002
Size: [[MAIN_SIZE]]
ProgramHeaders:
- Type: PT_LOAD
Flags: [ PF_X, PF_R ]
- VAddr: 0x4000
+ VAddr: 0x0000
Align: 0x1000
FirstSec: .text
LastSec: .text
- Type: PT_LOAD
Flags: [ PF_W, PF_R ]
- VAddr: 0x5000
+ VAddr: 0x1000
Align: 0x1000
FirstSec: .note.gnu.build-id
LastSec: .note.gnu.build-id
diff --git a/lldb/test/Shell/Commands/command-disassemble.s b/lldb/test/Shell/Commands/command-disassemble.s
index 10ce8354025ac5..1625f80468eb17 100644
--- a/lldb/test/Shell/Commands/command-disassemble.s
+++ b/lldb/test/Shell/Commands/command-disassemble.s
@@ -15,7 +15,7 @@
# CHECK-NEXT: error: Cannot disassemble around the current PC without a selected frame: no currently running process.
# CHECK-NEXT: (lldb) disassemble --start-address 0x0
# CHECK-NEXT: command-disassemble.s.tmp`foo:
-# CHECK-NEXT: command-disassemble.s.tmp[0x0] <+0>: int $0x10
+# CHECK-NEXT: command-disassemble.s.tmp[0x0] <+0>: jmp 0x2 ; <+2>
# CHECK-NEXT: command-disassemble.s.tmp[0x2] <+2>: int $0x11
# CHECK-NEXT: command-disassemble.s.tmp[0x4] <+4>: int $0x12
# CHECK-NEXT: command-disassemble.s.tmp[0x6] <+6>: int $0x13
@@ -41,7 +41,7 @@
# CHECK-NEXT: error: End address before start address.
# CHECK-NEXT: (lldb) disassemble --address 0x0
# CHECK-NEXT: command-disassemble.s.tmp`foo:
-# CHECK-NEXT: command-disassemble.s.tmp[0x0] <+0>: int $0x10
+# CHECK-NEXT: command-disassemble.s.tmp[0x0] <+0>: jmp 0x2 ; <+2>
# CHECK-NEXT: command-disassemble.s.tmp[0x2] <+2>: int $0x11
# CHECK-NEXT: command-disassemble.s.tmp[0x4] <+4>: int $0x12
# CHECK-NEXT: command-disassemble.s.tmp[0x6] <+6>: int $0x13
@@ -63,7 +63,7 @@
# CHECK: command-disassemble.s.tmp[0x203e] <+8190>: int $0x2a
# CHECK-NEXT: (lldb) disassemble --start-address 0x0 --count 7
# CHECK-NEXT: command-disassemble.s.tmp`foo:
-# CHECK-NEXT: command-disassemble.s.tmp[0x0] <+0>: int $0x10
+# CHECK-NEXT: command-disassemble.s.tmp[0x0] <+0>: jmp 0x2 ; <+2>
# CHECK-NEXT: command-disassemble.s.tmp[0x2] <+2>: int $0x11
# CHECK-NEXT: command-disassemble.s.tmp[0x4] <+4>: int $0x12
# CHECK-NEXT: command-disassemble.s.tmp[0x6] <+6>: int $0x13
@@ -101,8 +101,8 @@
.text
foo:
- int $0x10
- int $0x11
+ jmp 1f
+1: int $0x11
int $0x12
int $0x13
int $0x14
Unless there's a reason (that Jason remembers) to do it differently, the new behavior makes more sense to me.
From a quick review of the method being modified, this looks fine to me.
.. by changing the signal stop reason format 🤦

The reason this did not work is that the code in `StopInfo::GetCrashingDereference` was looking for the string "address=" to extract the address of the crash. macOS stop reason strings have the form
```
EXC_BAD_ACCESS (code=1, address=0xdead)
```
while on Linux they look like:
```
signal SIGSEGV: address not mapped to object (fault address: 0xdead)
```
Extracting the address from a string sounds like a bad idea, but I suppose there's some value in using a consistent format across platforms, so this patch changes the signal format to use the equals sign as well.

All of the diagnose tests pass except one, which appears to fail due to something similar to #115453 (disassembler reports unrelocated call targets).

I've left the tests disabled on Windows, as the stop reason reporting code works very differently there, and I suspect it won't work out of the box. If I'm wrong -- the XFAIL will let us know.