[lldb] Have disassembler show load addresses when using a core file #115453

labath · 2024-11-08T10:14:54Z

We got a bug report that the disassembler output was not relocated (i.e., it did not show load addresses) for a core file, as it is for a live process. It turns out this behavior depends on whether the instructions were read from an executable file or from process memory (a core file will not typically contain the memory image for segments backed by an executable file).

It's unclear whether this behavior is intentional or whether it was just trying to handle the case where we're disassembling a module without a process, but I think it's undesirable. What makes it particularly confusing is that the instruction addresses are relocated in this case (unlike when we don't have a process), so with large files and addresses it gets very hard to see whether the relocation has been applied or not.

This patch removes the data_from_file check so that the instruction is relocated regardless of where it was read from. It will still not get relocated in the raw module use case: such modules have no load address, so there is nowhere to relocate them to.
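
The address selection after the patch can be sketched in isolation. This is a minimal illustration, not the LLDB code: `Address`, `kInvalidAddress`, `PCChoice`, and `ChoosePC` are hypothetical stand-ins for `lldb_private::Address`, `LLDB_INVALID_ADDRESS`, and the logic inside `InstructionLLVMC`. It assumes only what the diff shows, namely that the load address is used whenever it is valid and the file address is used otherwise.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for LLDB_INVALID_ADDRESS.
constexpr std::uint64_t kInvalidAddress = UINT64_MAX;

// Hypothetical, simplified address: a file address plus an optional load
// address (absent when the module is not loaded into any process).
struct Address {
  std::uint64_t file_addr;
  std::optional<std::uint64_t> load_addr;

  std::uint64_t GetFileAddress() const { return file_addr; }
  std::uint64_t GetLoadAddress() const {
    return load_addr.value_or(kInvalidAddress);
  }
};

// The pc handed to the disassembler and whether it is a file address.
struct PCChoice {
  std::uint64_t pc;
  bool using_file_addr;
};

// Post-patch behavior: start from the file address and switch to the load
// address whenever a target exists and the load address is valid. Where the
// instruction bytes were read from no longer matters.
PCChoice ChoosePC(const Address &addr, bool have_target) {
  PCChoice choice{addr.GetFileAddress(), /*using_file_addr=*/true};
  if (have_target) {
    const std::uint64_t load_addr = addr.GetLoadAddress();
    if (load_addr != kInvalidAddress) {
      choice.pc = load_addr;
      choice.using_file_addr = false;
    }
  }
  return choice;
}
```

In this sketch, an address with file address `0x2` and load address `0x4002` yields `0x4002` when a target is present, while an address without a load address (the raw module case) keeps the file address `0x2`.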

llvmbot · 2024-11-08T10:15:32Z

@llvm/pr-subscribers-lldb

Author: Pavel Labath (labath)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/115453.diff

3 Files Affected:

(modified) lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp (+4-7)
(modified) lldb/test/Shell/Commands/command-disassemble-process.yaml (+10-10)
(modified) lldb/test/Shell/Commands/command-disassemble.s (+5-5)

diff --git a/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp b/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp
index 31edd8d46c444e..08264d837f9c23 100644
--- a/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp
+++ b/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp
@@ -583,7 +583,6 @@ class InstructionLLVMC : public lldb_private::Instruction {
         lldb::addr_t pc = m_address.GetFileAddress();
         m_using_file_addr = true;
 
-        const bool data_from_file = disasm->m_data_from_file;
         bool use_hex_immediates = true;
         Disassembler::HexImmediateStyle hex_style = Disassembler::eHexStyleC;
 
@@ -593,12 +592,10 @@ class InstructionLLVMC : public lldb_private::Instruction {
             use_hex_immediates = target->GetUseHexImmediates();
             hex_style = target->GetHexImmediateStyle();
 
-            if (!data_from_file) {
-              const lldb::addr_t load_addr = m_address.GetLoadAddress(target);
-              if (load_addr != LLDB_INVALID_ADDRESS) {
-                pc = load_addr;
-                m_using_file_addr = false;
-              }
+            const lldb::addr_t load_addr = m_address.GetLoadAddress(target);
+            if (load_addr != LLDB_INVALID_ADDRESS) {
+              pc = load_addr;
+              m_using_file_addr = false;
             }
           }
         }
diff --git a/lldb/test/Shell/Commands/command-disassemble-process.yaml b/lldb/test/Shell/Commands/command-disassemble-process.yaml
index 75be1a42fb196d..ce1b37bc8aea7a 100644
--- a/lldb/test/Shell/Commands/command-disassemble-process.yaml
+++ b/lldb/test/Shell/Commands/command-disassemble-process.yaml
@@ -20,7 +20,7 @@
 
 # CHECK:       (lldb) disassemble
 # CHECK-NEXT: command-disassemble-process.exe`main:
-# CHECK-NEXT:     0x4002 <+0>: addb   %al, (%rcx)
+# CHECK-NEXT:     0x4002 <+0>: jmp    0x4004 ; <+2>
 # CHECK-NEXT: ->  0x4004 <+2>: addb   %al, (%rdx)
 # CHECK-NEXT:     0x4006 <+4>: addb   %al, (%rbx)
 # CHECK-NEXT:     0x4008 <+6>: addb   %al, (%rsi)
@@ -32,7 +32,7 @@
 # CHECK-NEXT:     0x400a:      addb   %al, (%rdi)
 # CHECK-NEXT: (lldb) disassemble --frame
 # CHECK-NEXT: command-disassemble-process.exe`main:
-# CHECK-NEXT:     0x4002 <+0>: addb   %al, (%rcx)
+# CHECK-NEXT:     0x4002 <+0>: jmp    0x4004 ; <+2>
 # CHECK-NEXT: ->  0x4004 <+2>: addb   %al, (%rdx)
 # CHECK-NEXT:     0x4006 <+4>: addb   %al, (%rbx)
 # CHECK-NEXT:     0x4008 <+6>: addb   %al, (%rsi)
@@ -44,13 +44,13 @@
 # CHECK-NEXT:     0x400a:      addb   %al, (%rdi)
 # CHECK-NEXT: (lldb) disassemble --address 0x4004
 # CHECK-NEXT: command-disassemble-process.exe`main:
-# CHECK-NEXT:     0x4002 <+0>: addb   %al, (%rcx)
+# CHECK-NEXT:     0x4002 <+0>: jmp    0x4004 ; <+2>
 # CHECK-NEXT: ->  0x4004 <+2>: addb   %al, (%rdx)
 # CHECK-NEXT:     0x4006 <+4>: addb   %al, (%rbx)
 # CHECK-NEXT:     0x4008 <+6>: addb   %al, (%rsi)
 # CHECK-NEXT: (lldb) disassemble --count 7
 # CHECK-NEXT: command-disassemble-process.exe`main:
-# CHECK-NEXT:     0x4002 <+0>: addb   %al, (%rcx)
+# CHECK-NEXT:     0x4002 <+0>: jmp    0x4004 ; <+2>
 # CHECK-NEXT: ->  0x4004 <+2>: addb   %al, (%rdx)
 # CHECK-NEXT:     0x4006 <+4>: addb   %al, (%rbx)
 # CHECK-NEXT:     0x4008 <+6>: addb   %al, (%rsi)
@@ -81,32 +81,32 @@ Sections:
   - Name:            .text
     Type:            SHT_PROGBITS
     Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
-    Address:         0x0000000000004000
+    Address:         0x0000000000000000
     AddressAlign:    0x0000000000001000
-    Content:         00000001000200030006000700080009000A000B000E000F00100011001200130016001700180019001A001B001E001F00200021002200230026002700280029002A002B002E002F
+    Content:         0000EB00000200030006000700080009000A000B000E000F00100011001200130016001700180019001A001B001E001F00200021002200230026002700280029002A002B002E002F
     Size:            0x10000
   - Name:            .note.gnu.build-id
     Type:            SHT_NOTE
     Flags:           [ SHF_ALLOC ]
-    Address:         0x0000000000005000
+    Address:         0x0000000000001000
     AddressAlign:    0x0000000000001000
     Content:         040000000800000003000000474E5500DEADBEEFBAADF00D
 Symbols:
   - Name:            main
     Type:            STT_FUNC
     Section:         .text
-    Value:           0x0000000000004002
+    Value:           0x0000000000000002
     Size:            [[MAIN_SIZE]]
 ProgramHeaders:
   - Type: PT_LOAD
     Flags: [ PF_X, PF_R ]
-    VAddr: 0x4000
+    VAddr: 0x0000
     Align: 0x1000
     FirstSec: .text
     LastSec:  .text
   - Type: PT_LOAD
     Flags: [ PF_W, PF_R ]
-    VAddr: 0x5000
+    VAddr: 0x1000
     Align: 0x1000
     FirstSec: .note.gnu.build-id
     LastSec: .note.gnu.build-id
diff --git a/lldb/test/Shell/Commands/command-disassemble.s b/lldb/test/Shell/Commands/command-disassemble.s
index 10ce8354025ac5..1625f80468eb17 100644
--- a/lldb/test/Shell/Commands/command-disassemble.s
+++ b/lldb/test/Shell/Commands/command-disassemble.s
@@ -15,7 +15,7 @@
 # CHECK-NEXT: error: Cannot disassemble around the current PC without a selected frame: no currently running process.
 # CHECK-NEXT: (lldb) disassemble --start-address 0x0
 # CHECK-NEXT: command-disassemble.s.tmp`foo:
-# CHECK-NEXT: command-disassemble.s.tmp[0x0] <+0>:   int    $0x10
+# CHECK-NEXT: command-disassemble.s.tmp[0x0] <+0>:   jmp    0x2 ; <+2>
 # CHECK-NEXT: command-disassemble.s.tmp[0x2] <+2>:   int    $0x11
 # CHECK-NEXT: command-disassemble.s.tmp[0x4] <+4>:   int    $0x12
 # CHECK-NEXT: command-disassemble.s.tmp[0x6] <+6>:   int    $0x13
@@ -41,7 +41,7 @@
 # CHECK-NEXT: error: End address before start address.
 # CHECK-NEXT: (lldb) disassemble --address 0x0
 # CHECK-NEXT: command-disassemble.s.tmp`foo:
-# CHECK-NEXT: command-disassemble.s.tmp[0x0] <+0>:  int    $0x10
+# CHECK-NEXT: command-disassemble.s.tmp[0x0] <+0>:  jmp    0x2 ; <+2>
 # CHECK-NEXT: command-disassemble.s.tmp[0x2] <+2>:  int    $0x11
 # CHECK-NEXT: command-disassemble.s.tmp[0x4] <+4>:  int    $0x12
 # CHECK-NEXT: command-disassemble.s.tmp[0x6] <+6>:  int    $0x13
@@ -63,7 +63,7 @@
 # CHECK:      command-disassemble.s.tmp[0x203e] <+8190>: int    $0x2a
 # CHECK-NEXT: (lldb) disassemble --start-address 0x0 --count 7
 # CHECK-NEXT: command-disassemble.s.tmp`foo:
-# CHECK-NEXT: command-disassemble.s.tmp[0x0] <+0>:  int    $0x10
+# CHECK-NEXT: command-disassemble.s.tmp[0x0] <+0>:  jmp    0x2 ; <+2>
 # CHECK-NEXT: command-disassemble.s.tmp[0x2] <+2>:  int    $0x11
 # CHECK-NEXT: command-disassemble.s.tmp[0x4] <+4>:  int    $0x12
 # CHECK-NEXT: command-disassemble.s.tmp[0x6] <+6>:  int    $0x13
@@ -101,8 +101,8 @@
 
         .text
 foo:
-        int $0x10
-        int $0x11
+        jmp 1f
+1:      int $0x11
         int $0x12
         int $0x13
         int $0x14
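
The updated tests replace the first instruction with a jump because a PC-relative operand is printed as an absolute target, so its value depends on the pc given to the disassembler. A rough sketch of that arithmetic for the two-byte short jump (`EB 00`) used in the YAML test follows. `ShortJumpTarget` is a hypothetical helper written for this illustration and is not part of LLDB or LLVM.

```cpp
#include <cstdint>

// Target of an x86 short jump (opcode 0xEB followed by a signed one-byte
// displacement). The displacement is relative to the address of the next
// instruction, which is pc + 2 for this two-byte encoding.
constexpr std::uint64_t ShortJumpTarget(std::uint64_t pc, std::int8_t rel8) {
  return pc + 2 + static_cast<std::int64_t>(rel8);
}

// File address of `main` in the YAML test (0x2): the unrelocated target.
static_assert(ShortJumpTarget(0x2, 0) == 0x4, "file-address target");
// Address of `main` shown in the CHECK lines (0x4002): the relocated target,
// matching the expected "jmp 0x4004".
static_assert(ShortJumpTarget(0x4002, 0) == 0x4004, "load-address target");
// `foo` at 0x0 in command-disassemble.s has no process, so the file address
// is used, matching the expected "jmp 0x2".
static_assert(ShortJumpTarget(0x0, 0) == 0x2, "raw module target");
```

With an instruction such as `addb %al, (%rcx)` the printed operands are the same for either pc, which is why the old test content could not detect a missing relocation.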

JDevlieghere

Unless there's a reason (that Jason remembers) to do it differently, the new behavior makes more sense to me.

jasonmolenda

Quick review of the method being modified, this looks fine to me.

.. by changing the signal stop reason format 🤦

The reason this did not work is that the code in `StopInfo::GetCrashingDereference` was looking for the string "address=" to extract the address of the crash. macOS stop reason strings have the form

```
EXC_BAD_ACCESS (code=1, address=0xdead)
```

while on Linux they look like:

```
signal SIGSEGV: address not mapped to object (fault address: 0xdead)
```

Extracting the address from a string sounds like a bad idea, but I suppose there's some value in using a consistent format across platforms, so this patch changes the signal format to use the equals sign as well.

All of the diagnose tests pass except one, which appears to fail due to something similar to #115453 (disassembler reports unrelocated call targets).

I've left the tests disabled on Windows, as the stop reason reporting code works very differently there, and I suspect it won't work out of the box. If I'm wrong -- the XFAIL will let us know.
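
The format mismatch described above can be reproduced with a small sketch. `ExtractAddress` is a hypothetical helper written for this illustration. It is not the implementation of `StopInfo::GetCrashingDereference`, and it assumes only that the lookup searches for the literal key "address=".

```cpp
#include <cstddef>
#include <cstdint>
#include <exception>
#include <optional>
#include <string>

// Hypothetical helper: find "address=" in a stop reason string and parse
// the number that follows it. Returns std::nullopt if the key is missing
// or is not followed by a parsable number.
std::optional<std::uint64_t> ExtractAddress(const std::string &stop_reason) {
  const std::string key = "address=";
  const std::size_t pos = stop_reason.find(key);
  if (pos == std::string::npos)
    return std::nullopt;
  try {
    // Base 0 lets std::stoull accept the "0x" prefix; parsing stops at the
    // first character that is not part of the number, such as ')'.
    return std::stoull(stop_reason.substr(pos + key.size()), nullptr, 0);
  } catch (const std::exception &) {
    return std::nullopt;
  }
}
```

Given the two strings quoted above, this helper finds `0xdead` in the macOS form and nothing in the old Linux form, because "fault address: 0xdead" uses a colon and a space rather than an equals sign.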

labath requested a review from JDevlieghere as a code owner November 8, 2024 10:14

labath requested a review from jasonmolenda November 8, 2024 10:15

llvmbot added the lldb label Nov 8, 2024

JDevlieghere approved these changes Nov 8, 2024

jasonmolenda approved these changes Nov 8, 2024

labath merged commit d8ebb08 into llvm:main Nov 11, 2024
9 checks passed

labath deleted the disas branch November 11, 2024 14:51

labath mentioned this pull request Jan 16, 2025

[lldb] Enable "frame diagnose" on linux #123217

Merged
