Skip to content

WIP: Add semantic hightlighting to assembly inside asm! and global_asm! #8738

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

repnop
Copy link
Contributor

@repnop repnop commented May 6, 2021

Attempt at #6031, very early WIP proof of concept PR. (see #6031 (comment) for an example screenshot)

It currently relies on some common conventions between assembly dialects, however this is definitely not ideal as, for instance, # indicates a comment in RISC-V assembly, but an integer literal in ARM assembly. If the asm! macro ends up namespaced into the specific core::arch::<name> modules, this should be more straightforward to figure out which architecture the assembly is for. The alternatives are:

  1. Trying to sneak in target information somehow (if possible)
    1a. I assume this could be extremely awkward considering that the highlighting is done fairly high up in the stack, from my understanding.
  2. Taking a heuristics based approach
    2a. This has the downside of needing a lot more information about the instructions available to each architecture, and can easily be inaccurate for small, single-to-few line asm! snippets.

Assuming we can do either reliably, per-architecture highlighters are the way forward here I believe. It will likely still not be perfect as we'll only have the textual knowledge of the syntax and miss some details, e.g. a symbol defined somewhere else, but I believe even somewhat incomplete/inaccurate highlighting would be a massive improvement for working on low-level code that needs to make liberal use of asm!/global_asm!.

Additional discussion on deciding which semantic tokens types should map to assembly constructs is also a pretty complex problem to solve, I think. Likely will be a decent bit of bikeshedding, but I think I came to a somewhat sensible default for the initial proof of concept.

Feedback and suggestions welcomed :)

@bjorn3
Copy link
Member

bjorn3 commented May 6, 2021

You could limit comment highlighting to // at least for now. LLVM supports it on all platforms.

@repnop
Copy link
Contributor Author

repnop commented May 6, 2021

oh interesting! I wasn't aware of that, thanks :)

}
}

fn lex_asm_string<F>(string: &ast::String, mut callback: F)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you like to publish & maintain a crate for syntax analysis of rust-flavored asm? ^^ It seems like this is a generally re-usable bit of functionality.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait, right, the syntax is arch-specific... Hm, I guess, you know more about this than I do, but do consider if we could salvage some re-usability here?

Copy link
Contributor Author

@repnop repnop May 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't be opposed to extracting it out into its own crate(s) :) I think it could probably be a single crate since the formats aren't usually particularly complicated so each arch would probably just be a module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, I meant to ask: what did you have in mind with "syntax analysis of rust-flavored asm"? I'm assuming that includes tracking labels and such between different asm! blocks to try and give more complete information to the user? is there anything else? I suppose it may also be possible to give some information about the "type" of labels and such as well 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t know what is this exactly, but I imagine you need to pars asm to do things like:

  • completion of instructions
  • instruction explanations in hover
  • Highlighting targets of jumps
  • Highlighting syntax errors in asm blocks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooohhh, that wasn't something I was thinking of, but that sounds awesome :o the "instruction explanations in hover" is something I actually really like about Godbolt's asm explorer and that'd be fantastic to have inline here! in this case, is this something that would be fairly coupled with r-a's crate ecosystem?

@matklad
Copy link
Member

matklad commented May 7, 2021

Trying to sneak in target information somehow (if possible)

it actually is fairly easy, you can get target_arch and frields like this (#8753 ):

diff --git a/crates/ide/src/syntax_highlighting.rs b/crates/ide/src/syntax_highlighting.rs
index 9df8d21af..519ad6219 100644
--- a/crates/ide/src/syntax_highlighting.rs
+++ b/crates/ide/src/syntax_highlighting.rs
@@ -59,6 +59,10 @@ pub(crate) fn highlight(
 ) -> Vec<HlRange> {
     let _p = profile::span("highlight");
     let sema = Semantics::new(db);
+    if let Some(m) = sema.to_module_def(file_id) {
+        let cfg = m.krate().cfg(db);
+        eprintln!("cfg = {:?}", cfg);
+    }
 
     // Determine the root based on the given range.
     let (root, range_to_highlight) = {

@matklad
Copy link
Member

matklad commented May 17, 2021

@repnop any updates here? :)

@repnop
Copy link
Contributor Author

repnop commented May 17, 2021

oh, somehow I missed that you posted something! I need to get better about checking my notifications 😅 I can take a look at those soon and reply :)

@repnop
Copy link
Contributor Author

repnop commented May 18, 2021

Trying to sneak in target information somehow (if possible)

it actually is fairly easy, you can get target_arch and frields like this (#8753 ):

diff --git a/crates/ide/src/syntax_highlighting.rs b/crates/ide/src/syntax_highlighting.rs
index 9df8d21af..519ad6219 100644
--- a/crates/ide/src/syntax_highlighting.rs
+++ b/crates/ide/src/syntax_highlighting.rs
@@ -59,6 +59,10 @@ pub(crate) fn highlight(
 ) -> Vec<HlRange> {
     let _p = profile::span("highlight");
     let sema = Semantics::new(db);
+    if let Some(m) = sema.to_module_def(file_id) {
+        let cfg = m.krate().cfg(db);
+        eprintln!("cfg = {:?}", cfg);
+    }
 
     // Determine the root based on the given range.
     let (root, range_to_highlight) = {

oh, that's quite convenient! does this include being able to check cfgs for functions etc as well then? because that would make supporting multiple arches at once possible, which would be fantastic :)

@matklad
Copy link
Member

matklad commented May 25, 2021 via email

@repnop
Copy link
Contributor Author

repnop commented May 25, 2021

gotcha! I'll find some time soon to start sketching out a crate for this and post some updates here after that :)

@matklad
Copy link
Member

matklad commented Aug 30, 2021

Let's close this to keep the queue manageble, feel free to re-open when there are updates!

@matklad matklad closed this Aug 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants