[lldb] Have disassembler show load addresses when using a core file #115453

Merged
merged 1 commit into from
Nov 11, 2024

labath
Collaborator

@labath labath commented Nov 8, 2024

We got a bug report that the disassembler output was not relocated (i.e. did not show load addresses) for a core file, unlike for a live process. It turns out this behavior depends on whether the instructions were read from an executable file or from process memory (a core file will not typically contain the memory image for segments backed by an executable file).

It's unclear whether this behavior is intentional, or if it was just trying to handle the case where we're disassembling a module without a process, but I think it's undesirable. What makes it particularly confusing is that the instruction addresses are relocated in this case (unlike when we don't have a process), so with large files and addresses it gets very hard to see whether the relocation has been applied or not.

This patch removes the data_from_file check so that the instruction is relocated regardless of where it was read from. It will still not get relocated for the raw module use case, as those modules don't have a load address and so can't be relocated anywhere.
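
For illustration, here is a minimal standalone sketch of the address selection after this patch. It does not use the real LLDB classes: `ModelAddress`, `PcChoice`, `ChoosePc`, `ShortJmpTarget` and `kInvalidAddress` are hypothetical stand-ins, and the load bias of 0x4000 is an assumption read off the updated CHECK lines in the test. The idea is that the file address is the default, and the load address replaces it whenever a target is present and the address resolves, no matter where the instruction bytes came from.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for LLDB_INVALID_ADDRESS; not the real definition.
constexpr std::uint64_t kInvalidAddress = UINT64_MAX;

// Hypothetical model of an address inside a module.
struct ModelAddress {
  std::uint64_t file_addr;
  // Set only when the module is loaded in a process or core file.
  std::optional<std::uint64_t> load_bias;

  std::uint64_t GetFileAddress() const { return file_addr; }
  std::uint64_t GetLoadAddress() const {
    return load_bias ? file_addr + *load_bias : kInvalidAddress;
  }
};

struct PcChoice {
  std::uint64_t pc;
  bool using_file_addr;
};

// Mirrors the shape of the patched logic: there is no data_from_file check,
// only "do we have a target" and "does the address resolve to a load address".
PcChoice ChoosePc(const ModelAddress &addr, bool have_target) {
  PcChoice choice{addr.GetFileAddress(), true};
  if (have_target) {
    const std::uint64_t load_addr = addr.GetLoadAddress();
    if (load_addr != kInvalidAddress) {
      choice.pc = load_addr;
      choice.using_file_addr = false;
    }
  }
  return choice;
}

// Target of an x86 short jump (opcode EB rel8): the instruction is two bytes
// long and the displacement is relative to the next instruction.
std::uint64_t ShortJmpTarget(std::uint64_t pc, std::int8_t rel8) {
  return pc + 2 + static_cast<std::int64_t>(rel8);
}
```

With the test's `EB 00` bytes at file address 0x2, `ChoosePc` yields 0x4002 under the assumed bias, and `ShortJmpTarget(0x4002, 0)` gives 0x4004, which matches the `jmp 0x4004` CHECK line. Without a load address the same instruction would print as `jmp 0x4`, which is why the tests now include a PC-relative jump.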

@labath labath requested a review from JDevlieghere as a code owner November 8, 2024 10:14
@labath labath requested a review from jasonmolenda November 8, 2024 10:15
@llvmbot llvmbot added the lldb label Nov 8, 2024
@llvmbot
Member

llvmbot commented Nov 8, 2024

@llvm/pr-subscribers-lldb

Author: Pavel Labath (labath)


Full diff: https://github.com/llvm/llvm-project/pull/115453.diff

3 Files Affected:

  • (modified) lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp (+4-7)
  • (modified) lldb/test/Shell/Commands/command-disassemble-process.yaml (+10-10)
  • (modified) lldb/test/Shell/Commands/command-disassemble.s (+5-5)
diff --git a/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp b/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp
index 31edd8d46c444e..08264d837f9c23 100644
--- a/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp
+++ b/lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp
@@ -583,7 +583,6 @@ class InstructionLLVMC : public lldb_private::Instruction {
         lldb::addr_t pc = m_address.GetFileAddress();
         m_using_file_addr = true;
 
-        const bool data_from_file = disasm->m_data_from_file;
         bool use_hex_immediates = true;
         Disassembler::HexImmediateStyle hex_style = Disassembler::eHexStyleC;
 
@@ -593,12 +592,10 @@ class InstructionLLVMC : public lldb_private::Instruction {
             use_hex_immediates = target->GetUseHexImmediates();
             hex_style = target->GetHexImmediateStyle();
 
-            if (!data_from_file) {
-              const lldb::addr_t load_addr = m_address.GetLoadAddress(target);
-              if (load_addr != LLDB_INVALID_ADDRESS) {
-                pc = load_addr;
-                m_using_file_addr = false;
-              }
+            const lldb::addr_t load_addr = m_address.GetLoadAddress(target);
+            if (load_addr != LLDB_INVALID_ADDRESS) {
+              pc = load_addr;
+              m_using_file_addr = false;
             }
           }
         }
diff --git a/lldb/test/Shell/Commands/command-disassemble-process.yaml b/lldb/test/Shell/Commands/command-disassemble-process.yaml
index 75be1a42fb196d..ce1b37bc8aea7a 100644
--- a/lldb/test/Shell/Commands/command-disassemble-process.yaml
+++ b/lldb/test/Shell/Commands/command-disassemble-process.yaml
@@ -20,7 +20,7 @@
 
 # CHECK:       (lldb) disassemble
 # CHECK-NEXT: command-disassemble-process.exe`main:
-# CHECK-NEXT:     0x4002 <+0>: addb   %al, (%rcx)
+# CHECK-NEXT:     0x4002 <+0>: jmp    0x4004 ; <+2>
 # CHECK-NEXT: ->  0x4004 <+2>: addb   %al, (%rdx)
 # CHECK-NEXT:     0x4006 <+4>: addb   %al, (%rbx)
 # CHECK-NEXT:     0x4008 <+6>: addb   %al, (%rsi)
@@ -32,7 +32,7 @@
 # CHECK-NEXT:     0x400a:      addb   %al, (%rdi)
 # CHECK-NEXT: (lldb) disassemble --frame
 # CHECK-NEXT: command-disassemble-process.exe`main:
-# CHECK-NEXT:     0x4002 <+0>: addb   %al, (%rcx)
+# CHECK-NEXT:     0x4002 <+0>: jmp    0x4004 ; <+2>
 # CHECK-NEXT: ->  0x4004 <+2>: addb   %al, (%rdx)
 # CHECK-NEXT:     0x4006 <+4>: addb   %al, (%rbx)
 # CHECK-NEXT:     0x4008 <+6>: addb   %al, (%rsi)
@@ -44,13 +44,13 @@
 # CHECK-NEXT:     0x400a:      addb   %al, (%rdi)
 # CHECK-NEXT: (lldb) disassemble --address 0x4004
 # CHECK-NEXT: command-disassemble-process.exe`main:
-# CHECK-NEXT:     0x4002 <+0>: addb   %al, (%rcx)
+# CHECK-NEXT:     0x4002 <+0>: jmp    0x4004 ; <+2>
 # CHECK-NEXT: ->  0x4004 <+2>: addb   %al, (%rdx)
 # CHECK-NEXT:     0x4006 <+4>: addb   %al, (%rbx)
 # CHECK-NEXT:     0x4008 <+6>: addb   %al, (%rsi)
 # CHECK-NEXT: (lldb) disassemble --count 7
 # CHECK-NEXT: command-disassemble-process.exe`main:
-# CHECK-NEXT:     0x4002 <+0>: addb   %al, (%rcx)
+# CHECK-NEXT:     0x4002 <+0>: jmp    0x4004 ; <+2>
 # CHECK-NEXT: ->  0x4004 <+2>: addb   %al, (%rdx)
 # CHECK-NEXT:     0x4006 <+4>: addb   %al, (%rbx)
 # CHECK-NEXT:     0x4008 <+6>: addb   %al, (%rsi)
@@ -81,32 +81,32 @@ Sections:
   - Name:            .text
     Type:            SHT_PROGBITS
     Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
-    Address:         0x0000000000004000
+    Address:         0x0000000000000000
     AddressAlign:    0x0000000000001000
-    Content:         00000001000200030006000700080009000A000B000E000F00100011001200130016001700180019001A001B001E001F00200021002200230026002700280029002A002B002E002F
+    Content:         0000EB00000200030006000700080009000A000B000E000F00100011001200130016001700180019001A001B001E001F00200021002200230026002700280029002A002B002E002F
     Size:            0x10000
   - Name:            .note.gnu.build-id
     Type:            SHT_NOTE
     Flags:           [ SHF_ALLOC ]
-    Address:         0x0000000000005000
+    Address:         0x0000000000001000
     AddressAlign:    0x0000000000001000
     Content:         040000000800000003000000474E5500DEADBEEFBAADF00D
 Symbols:
   - Name:            main
     Type:            STT_FUNC
     Section:         .text
-    Value:           0x0000000000004002
+    Value:           0x0000000000000002
     Size:            [[MAIN_SIZE]]
 ProgramHeaders:
   - Type: PT_LOAD
     Flags: [ PF_X, PF_R ]
-    VAddr: 0x4000
+    VAddr: 0x0000
     Align: 0x1000
     FirstSec: .text
     LastSec:  .text
   - Type: PT_LOAD
     Flags: [ PF_W, PF_R ]
-    VAddr: 0x5000
+    VAddr: 0x1000
     Align: 0x1000
     FirstSec: .note.gnu.build-id
     LastSec: .note.gnu.build-id
diff --git a/lldb/test/Shell/Commands/command-disassemble.s b/lldb/test/Shell/Commands/command-disassemble.s
index 10ce8354025ac5..1625f80468eb17 100644
--- a/lldb/test/Shell/Commands/command-disassemble.s
+++ b/lldb/test/Shell/Commands/command-disassemble.s
@@ -15,7 +15,7 @@
 # CHECK-NEXT: error: Cannot disassemble around the current PC without a selected frame: no currently running process.
 # CHECK-NEXT: (lldb) disassemble --start-address 0x0
 # CHECK-NEXT: command-disassemble.s.tmp`foo:
-# CHECK-NEXT: command-disassemble.s.tmp[0x0] <+0>:   int    $0x10
+# CHECK-NEXT: command-disassemble.s.tmp[0x0] <+0>:   jmp    0x2 ; <+2>
 # CHECK-NEXT: command-disassemble.s.tmp[0x2] <+2>:   int    $0x11
 # CHECK-NEXT: command-disassemble.s.tmp[0x4] <+4>:   int    $0x12
 # CHECK-NEXT: command-disassemble.s.tmp[0x6] <+6>:   int    $0x13
@@ -41,7 +41,7 @@
 # CHECK-NEXT: error: End address before start address.
 # CHECK-NEXT: (lldb) disassemble --address 0x0
 # CHECK-NEXT: command-disassemble.s.tmp`foo:
-# CHECK-NEXT: command-disassemble.s.tmp[0x0] <+0>:  int    $0x10
+# CHECK-NEXT: command-disassemble.s.tmp[0x0] <+0>:  jmp    0x2 ; <+2>
 # CHECK-NEXT: command-disassemble.s.tmp[0x2] <+2>:  int    $0x11
 # CHECK-NEXT: command-disassemble.s.tmp[0x4] <+4>:  int    $0x12
 # CHECK-NEXT: command-disassemble.s.tmp[0x6] <+6>:  int    $0x13
@@ -63,7 +63,7 @@
 # CHECK:      command-disassemble.s.tmp[0x203e] <+8190>: int    $0x2a
 # CHECK-NEXT: (lldb) disassemble --start-address 0x0 --count 7
 # CHECK-NEXT: command-disassemble.s.tmp`foo:
-# CHECK-NEXT: command-disassemble.s.tmp[0x0] <+0>:  int    $0x10
+# CHECK-NEXT: command-disassemble.s.tmp[0x0] <+0>:  jmp    0x2 ; <+2>
 # CHECK-NEXT: command-disassemble.s.tmp[0x2] <+2>:  int    $0x11
 # CHECK-NEXT: command-disassemble.s.tmp[0x4] <+4>:  int    $0x12
 # CHECK-NEXT: command-disassemble.s.tmp[0x6] <+6>:  int    $0x13
@@ -101,8 +101,8 @@
 
         .text
 foo:
-        int $0x10
-        int $0x11
+        jmp 1f
+1:      int $0x11
         int $0x12
         int $0x13
         int $0x14

Member

@JDevlieghere JDevlieghere left a comment

Unless there's a reason (that Jason remembers) to do it differently, the new behavior makes more sense to me.

Collaborator

@jasonmolenda jasonmolenda left a comment

Quick review of the method being modified, this looks fine to me.

@labath labath merged commit d8ebb08 into llvm:main Nov 11, 2024
9 checks passed
@labath labath deleted the disas branch November 11, 2024 14:51
Groverkss pushed a commit to iree-org/llvm-project that referenced this pull request Nov 15, 2024
…lvm#115453)

labath added a commit to labath/llvm-project that referenced this pull request Jan 16, 2025
labath added a commit that referenced this pull request Jan 23, 2025
.. by changing the signal stop reason format 🤦

This did not work because the code in
`StopInfo::GetCrashingDereference` was looking for the string "address="
to extract the address of the crash. macOS stop reason strings have the
form
```
  EXC_BAD_ACCESS (code=1, address=0xdead)
```
while on Linux they look like:
```
  signal SIGSEGV: address not mapped to object (fault address: 0xdead)
```

Extracting the address from a string sounds like a bad idea, but I
suppose there's some value in using a consistent format across
platforms, so this patch changes the signal format to use the equals
sign as well. All of the diagnose tests pass except one, which appears
to fail due to something similar to #115453 (disassembler reports
unrelocated call targets).

I've left the tests disabled on Windows, as the stop reason reporting
code works very differently there, and I suspect it won't work out of
the box. If I'm wrong, the XFAIL will let us know.
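
As a rough sketch of why the format matters, the following standalone function looks for the "address=" key and parses the hex value after it. This is not the actual body of `StopInfo::GetCrashingDereference`; the name `ExtractFaultAddress` and its signature are hypothetical and only model the string lookup described above.

```cpp
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: returns the value following "address=" in a stop
// reason description, or nullopt if the key is absent or not a hex number.
std::optional<std::uint64_t> ExtractFaultAddress(std::string_view description) {
  constexpr std::string_view key = "address=";
  const std::size_t pos = description.find(key);
  if (pos == std::string_view::npos)
    return std::nullopt;

  std::string_view rest = description.substr(pos + key.size());
  // std::from_chars does not accept a "0x" prefix, so strip it first.
  if (rest.substr(0, 2) == "0x")
    rest.remove_prefix(2);

  std::uint64_t value = 0;
  const auto result =
      std::from_chars(rest.data(), rest.data() + rest.size(), value, 16);
  if (result.ec != std::errc())
    return std::nullopt;
  return value;
}
```

Under this model the macOS-style string yields 0xdead, while the old Linux string yields nothing, because it contains "address not" and "address: " but never "address=".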
4 participants