I want to play with the new ARM SVE instructions using open source tools.
As a start, I would like to assemble the minimal example present at: https://developer.arm.com/docs/dui0965/latest/getting-started-with-the-sve-compiler/assembling-sve-code
// example1.s
.global main
main:
mov x0, 0x90000000
mov x8, xzr
ptrue p0.s //SVE instruction
fcpy z0.s, p0/m, #5.00000000 //SVE instruction
orr w10, wzr, #0x400
loop:
st1w z0.s, p0, [x0, x8, lsl #2] //SVE instruction
incw x8 //SVE instruction
whilelt p0.s, x8, x10 //SVE instruction
b.any loop //SVE instruction
mov w0, wzr
ret
However, when I try that on my Ubuntu 16.04:
sudo apt-get install binutils-aarch64-linux-gnu
aarch64-linux-gnu-as example1.S
it does not recognize any of the SVE assembly instructions, e.g.:
example1.S:6: Error: unknown mnemonic `ptrue' -- `ptrue p0.s'
I think this is because my GNU AS 2.26.1 is too old and does not have SVE support yet.
I'm also fine using LLVM or any other open source assembler.
Once I manage to assemble it, I then want to run it in QEMU user mode, since QEMU 3.0.0 has SVE support.
Automated example with an assertion
Below I describe how that example was achieved.
Assembly
The aarch64-linux-gnu-as 2.30 in Ubuntu 18.04 is already new enough for SVE, as can be seen from: https://sourceware.org/binutils/docs-2.30/as/AArch64-Extensions.html#AArch64-Extensions
Otherwise, compiling Binutils from source is easy on Ubuntu 16.04, just do:
I didn't check out to a tag because the last tag is a few months old, and I don't feel like grepping log messages for when SVE was introduced ;-)
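A from-source Binutils build of the kind mentioned above could look roughly like the sketch below. This is not the exact command sequence from this answer: the install prefix, the target triple and the `--disable-gdb` switch are assumptions chosen for illustration, and the build only runs when `RUN_BUILD=1` is set, since it needs network access and the usual build dependencies.

```shell
#!/bin/sh
# Sketch: build a recent GNU as for AArch64 from the Binutils git master.
# PREFIX and TARGET are illustrative assumptions, not values from the answer.
set -eu

PREFIX="$PWD/binutils-install"
TARGET="aarch64-linux-gnu"

build_binutils() {
  # Shallow clone of master, since no tag is checked out.
  git clone --depth 1 git://sourceware.org/git/binutils-gdb.git
  cd binutils-gdb
  # --disable-gdb skips GDB, which lives in the same repository.
  ./configure --target "$TARGET" --prefix "$PREFIX" --disable-gdb
  make -j "$(nproc)"
  make install
}

# The build is slow, so it is opt-in.
if [ "${RUN_BUILD:-0}" = 1 ]; then
  build_binutils
fi

echo "assembler path after install: $PREFIX/bin/$TARGET-as"
```

Installing under a local prefix keeps the new assembler separate from the packaged one, so the distribution's toolchain stays untouched.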
Then use the compiled as and link with the packaged GCC on Ubuntu 16.04:
On Ubuntu 16.04, aarch64-linux-gnu-gcc 5.4 does not have -march=armv8.5-a, so just use -march=armv8-a and it should be fine. In any case, neither Ubuntu 16.04 nor 18.04 has -march=armv8-a+sve, which will be the best option when it arrives.
Alternatively, instead of passing -march=armv8.5-a+sve, you can also add the following to the start of the .S source code:
On Ubuntu 19.04 Binutils 2.32, I also learnt about and tested:
which also works for SVE. I think I'll be using more of that in the future, as it seems to just enable all features in one go, not just SVE!
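To illustrate the in-source alternative to -march described above, here is a minimal sketch that writes a tiny test file starting with a `.arch` directive and assembles it if a cross assembler is installed. The file name `sve_arch.S`, the `armv8-a+sve` architecture string and the bare `_start` entry point are assumptions for this example, not the exact contents used in the answer.

```shell
#!/bin/sh
# Sketch: enable SVE with a .arch directive instead of a -march option.
set -eu

cat > sve_arch.S <<'EOF'
// Enable SVE from inside the source file.
.arch armv8-a+sve
.global _start
_start:
    ptrue p0.s          // SVE instruction: make every lane of p0 active
    mov x0, #0          // exit status
    mov x8, #93         // exit syscall number on AArch64 Linux
    svc #0
EOF

# Assemble and link only when the cross tools are present.
if command -v aarch64-linux-gnu-as >/dev/null 2>&1; then
  if aarch64-linux-gnu-as -o sve_arch.o sve_arch.S; then
    # No libc is involved, so plain ld is enough.
    aarch64-linux-gnu-ld -o sve_arch sve_arch.o || echo "link step failed"
  else
    echo "this assembler does not accept SVE, it is probably too old"
  fi
else
  echo "aarch64-linux-gnu-as not found, only wrote sve_arch.S"
fi
```

No -march option is passed to the assembler here, so if `ptrue` is accepted, it is the directive that enabled it.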
QEMU simulation
The procedure to step debug it on QEMU is explained at: How to single step ARM assembly in GDB on QEMU?
First I made the example into a minimal self contained Linux executable:
You can run it with:
then it exits nicely.
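As a rough sketch of the run step, assuming a statically linked executable at the hypothetical path `./example1` and a QEMU user mode binary named `qemu-aarch64`, something like the following should work. The `-cpu max` option is an assumption on my side to request the CPU model with the most features; a dynamically linked program would also need `-L` pointing at the AArch64 sysroot.

```shell
#!/bin/sh
# Sketch: run an AArch64 executable under QEMU user mode.
set -eu

PROG=./example1   # hypothetical path to the linked executable

run_under_qemu() {
  # -cpu max selects the model with every feature QEMU can emulate.
  qemu-aarch64 -cpu max "$1"
}

if command -v qemu-aarch64 >/dev/null 2>&1 && [ -x "$PROG" ]; then
  status=0
  run_under_qemu "$PROG" || status=$?
  echo "exit status: $status"
else
  echo "skipping: need qemu-aarch64 and an executable $PROG"
fi
```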
Next, we can step debug to confirm that the sum was actually made:
and:
Now, step up to right after bl daxpy, and run:
which confirms that the sum was actually done as expected.
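The general shape of such a debug session can be sketched as below; these are not the exact commands from this answer. QEMU user mode is started with `-g` so that it waits for a debugger, and GDB attaches from a second terminal. The program path, the port number and the `gdb-multiarch` binary name are assumptions for this example.

```shell
#!/bin/sh
# Sketch: step debug a QEMU user mode program through the GDB remote protocol.
# Usage: ./debug.sh server   (terminal 1)
#        ./debug.sh gdb      (terminal 2)
set -eu

PROG=./example1   # hypothetical path to the executable
PORT=1234         # hypothetical port for the GDB stub

start_qemu_gdbserver() {
  # -g makes QEMU wait for a GDB connection before running the program.
  qemu-aarch64 -cpu max -g "$PORT" "$PROG"
}

attach_gdb() {
  # Load symbols from the executable and connect to the waiting QEMU.
  gdb-multiarch -q -ex "target remote localhost:$PORT" "$PROG"
}

case "${1:-}" in
  server) start_qemu_gdbserver ;;
  gdb)    attach_gdb ;;
  *)      echo "usage: $0 server|gdb" ;;
esac
```

Once attached, the usual GDB commands such as `break`, `continue` and `stepi` apply.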
Observing SVE registers seems unimplemented as I can't find anything under: https://github.com/qemu/qemu/tree/v3.0.0/gdb-xml but it should not be too hard to implement by copying other FP registers? Asked at: http://lists.nongnu.org/archive/html/qemu-discuss/2018-10/msg00020.html
You can currently already observe it partially and indirectly by doing:
because the first entry of SVE register zX is shared with the older vX FP registers, but we can't see p at all.