Improve v6 release guide #2706

Open
@Rot127
Collaborator

The v6 release guide should be improved based on feedback from users who already use v6.


Feedback by @OBarronCS (pwndbg)

  • The v6 guide would benefit from concrete examples of alias instructions. E.g. what is the difference between:
    • AARCH64_INS_LSL vs. AARCH64_INS_ALIAS_LSL
    • AARCH64_INS_MOV vs. AARCH64_INS_ALIAS_MOV
    • SPARC_INS_CALL vs. SPARC_INS_ALIAS_CALL
  • Edge cases like:
    • "ARM unconditional branches (B, BL), used to use the condition code ARM_CC_AL to indicate unconditionality"
  • Add more explanation of how groups changed. The changes are extensive enough that they can introduce bugs.

Full feedback:

Hi! Thank you for all your work on Capstone! I've been following the recent releases, and appreciate all the very quick responses to my GitHub issues!

Regarding updating to V6 - the release guide was very helpful for updating from v5 to v6 - https://github.com/capstone-engine/capstone/blob/next/docs/cs_v6_release_guide.md. It gave a great idea of the API changes to expect.

For context, Pwndbg uses Capstone for disassembling instructions which we print to the context view. We also rely on the instruction metadata (operands, groups) that Capstone provides in order to display other information, like displaying the action that an instruction takes (such as what mathematical operation an instruction does, or values that are moved/stored/loaded). The groups help us determine jumps and syscalls, so we can do special logic to display function arguments/step through the program until we get to a syscall.

I started the update during V6 Alpha3, and I was able to make somewhat quick progress on updating because we have a bunch of tests which run short snippets of assembly for various architectures to make sure the disassembly output is as expected (files in https://github.com/pwndbg/pwndbg/tree/dev/tests/qemu-tests/tests/user). Some of these tests caught changes that we ended up reporting to the Capstone GitHub (thank you for fixing those so quickly so we could use Alpha4!).

While updating, I would also often compile a program and just step through it with Pwndbg while printing the underlying Capstone metadata of each instruction - this is a nice way to encounter a wide range of instructions and to catch unexpected Capstone behavior, such as seeing instructions that should have certain groups but are missing them, or seeing an unexpected makeup of operands.
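
The check described above could be partly automated. The following is a minimal sketch, assuming each instruction object exposes `mnemonic` and `groups` attributes in the way Capstone's Python `CsInsn` does when detail mode is on. The `EXPECTED_GROUPS` table, the `find_missing_groups` helper, and the numeric group IDs are hypothetical placeholders for illustration, not taken from Pwndbg or Capstone.

```python
from types import SimpleNamespace

# Placeholder group IDs (assumption); real code should use the constants
# exported by the capstone module instead of hard-coded numbers.
GRP_JUMP = 1
GRP_INT = 4

# Hypothetical expectations: mnemonic -> groups the instruction should carry.
EXPECTED_GROUPS = {
    "b": {GRP_JUMP},
    "svc": {GRP_INT},
}


def find_missing_groups(insns, expected=EXPECTED_GROUPS):
    """Return (mnemonic, sorted missing group IDs) for every instruction
    that lacks at least one group the table expects it to have."""
    report = []
    for insn in insns:
        missing = expected.get(insn.mnemonic, set()) - set(insn.groups)
        if missing:
            report.append((insn.mnemonic, sorted(missing)))
    return report


# Stand-in objects so the sketch runs without Capstone installed.
sample = [
    SimpleNamespace(mnemonic="b", groups=[GRP_JUMP]),
    SimpleNamespace(mnemonic="svc", groups=[]),
]
print(find_missing_groups(sample))
```

With real disassembly, `sample` would be replaced by the instructions yielded by the disassembler, and the table would be filled in per architecture.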

It took a little bit to get used to the aliases, but I like the change as it makes it more clear what is actually an instruction, and what is not. In the Capstone update documentation, it might be worth adding a couple of concrete examples of these aliases and explaining when you would see one or the other, especially ones like AARCH64_INS_MOV vs. AARCH64_INS_ALIAS_MOV, or AARCH64_INS_LSL vs. AARCH64_INS_ALIAS_LSL.

Projects that rely on Capstone might run into bugs in these cases if they are not made aware of this. In the two examples above, if you handle some logic on MOV or LSL instructions, you now need to handle both the alias and the real version. When disassembling real-world binaries, sometimes the ALIAS versions are the ones that actually appear.
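
One way to avoid missing either variant is to register each handler under both IDs. The following is a minimal sketch of that idea. The numeric values are placeholders standing in for the real Capstone constants, and `register`, `annotate`, and the two `describe_*` handlers are hypothetical names. How the ID is read from a decoded instruction is deliberately left out, since that depends on the binding in use.

```python
# Placeholder values (assumption); real code would import the constants
# of the same names from the capstone AArch64 module.
AARCH64_INS_MOV = 1001
AARCH64_INS_ALIAS_MOV = 2001
AARCH64_INS_LSL = 1002
AARCH64_INS_ALIAS_LSL = 2002

HANDLERS = {}


def register(handler, *insn_ids):
    """Map every given instruction or alias ID to the same handler."""
    for insn_id in insn_ids:
        HANDLERS[insn_id] = handler


def describe_move():
    return "move"


def describe_shift():
    return "shift left"


# Register the real and the alias ID together so neither is forgotten.
register(describe_move, AARCH64_INS_MOV, AARCH64_INS_ALIAS_MOV)
register(describe_shift, AARCH64_INS_LSL, AARCH64_INS_ALIAS_LSL)


def annotate(insn_id):
    """Return the handler's description, or None for unhandled IDs."""
    handler = HANDLERS.get(insn_id)
    return handler() if handler else None


print(annotate(AARCH64_INS_ALIAS_MOV))
```

Keeping the pairing in one `register` call makes it harder to add logic for the real instruction and forget its alias.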

I figured the large overhaul of the Capstone backend would result in some minor changes here and there. There are a handful of very subtle changes that our tests caught. To name one off the top of my head, ARM unconditional branches (B, BL) used to use the condition code ARM_CC_AL to indicate unconditionality, and now use ARMCC_UNDEF iirc. Another one was that AARCH64 now has a memory operand variant which has only a constant value associated with it https://github.com/pwndbg/pwndbg/blob/3e25f170019d8025339594da5fd7bd8739c17d3b/pwndbg/aglib/disasm/aarch64.py#L448-L451, which used to be encoded as an immediate operand. There are some others I might be able to dig up from my notes, such as some branch/syscall instructions changing what groups they have. However, by and large, things went smoothly and just worked.
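
A defensive way to handle the condition-code change mentioned above is to treat both values as "unconditional". The following is a minimal sketch under that assumption. The `Cond` enum is a hypothetical stand-in for the ARM condition-code constants, and whether v6 really reports the undefined value for B and BL should be verified against the actual output, as the report above is itself hedged.

```python
from enum import Enum


class Cond(Enum):
    """Hypothetical stand-in for the ARM condition codes (assumption)."""
    AL = "al"        # "always", the value reported before v6
    UNDEF = "undef"  # value reportedly seen for B/BL in v6
    EQ = "eq"
    NE = "ne"


# Accept both encodings so the check works before and after the change.
UNCONDITIONAL = {Cond.AL, Cond.UNDEF}


def is_unconditional(cc):
    """True if the condition code means the branch is always taken."""
    return cc in UNCONDITIONAL


print(is_unconditional(Cond.UNDEF), is_unconditional(Cond.EQ))
```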

For pwndbg, being able to use real instruction details wouldn't help, and it works better having the alias operands. As an example, when handling something like a RISC-V nop, which is an alias of addi x0, x0, 0, we want the alias details which have no operands, as opposed to having three operands. This is particularly relevant for something like *_ALIAS_MOV, where the "real" instruction is a shift or something. We use the operands to guide the display logic, and we want to display information that lines up with the visual mnemonic/operands that show up in the disassembly.
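
The nop case can be illustrated with a small sketch. The `Decoded` class below is hypothetical and holds both operand lists side by side only to show the difference; it is an assumption that a real setup would receive one list or the other depending on how the disassembler is configured.

```python
from dataclasses import dataclass, field


@dataclass
class Decoded:
    """Hypothetical container for illustration, not a Capstone type."""
    mnemonic: str                                    # text shown to the user
    alias_operands: list = field(default_factory=list)
    real_operands: list = field(default_factory=list)


def display_operands(insn, prefer_alias=True):
    """Pick the operand list that matches the printed mnemonic."""
    return insn.alias_operands if prefer_alias else insn.real_operands


# RISC-V nop: no operands as an alias, three as the underlying addi.
nop = Decoded("nop", alias_operands=[], real_operands=["x0", "x0", "0"])

print(len(display_operands(nop)))
print(len(display_operands(nop, prefer_alias=False)))
```

Display logic driven by the alias list stays consistent with what the user sees in the disassembly text.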


Feedback by kiedystobylo (radare2)

  • ARM64 compatibility header: AArch64CC_CondCode should be mapped to arm64_cc instead of ARM64CC_CondCode. Same for arm64_vas. To clarify this, highlight the change in enum values.

Full feedback: No public channel.
