mirror of
https://github.com/ioacademy-jikim/debugging
synced 2025-06-10 09:26:15 +00:00
Status
~~~~~~

As of Jan 2014 the trunk contains a port to AArch64 ARMv8 -- loosely,
the 64-bit ARM architecture.  Currently it supports integer and FP
instructions and can run anything generated by gcc-4.8.2 -O3.  The
port is under active development.

Current limitations, as of mid-May 2014.

* limited support of vector (SIMD) instructions.  Initial target is
  support for instructions created by gcc-4.8.2 -O3
  (via autovectorisation).  This is complete.

* Integration with the built in GDB server:
   - works ok (breakpoint, attach to a process blocked in a syscall, ...)
   - still to do:
      arm64 xml register description files (allowing shadow registers
      to be looked at).
      cpsr transfer to/from gdb to be looked at (see also arm equivalent code)

* limited syscall support

There has been extensive testing of the baseline simulation of integer
and FP instructions.  Memcheck is also believed to work, at least for
small examples.  Other tools appear to at least not crash when running
/bin/date.

Enough syscalls and instructions are supported for substantial
programs to work.  Firefox 26 is able to start up and quit.  The noise
level from Memcheck is low enough to make it practical to use for real
debugging.


Building
~~~~~~~~

You could probably build it directly on a target OS, using the normal
non-cross scheme

  ./autogen.sh ; ./configure --prefix=.. ; make ; make install

Development so far was however done by cross compiling, viz:

  export CC=aarch64-linux-gnu-gcc
  export LD=aarch64-linux-gnu-ld
  export AR=aarch64-linux-gnu-ar

  ./autogen.sh
  ./configure --prefix=`pwd`/Inst --host=aarch64-unknown-linux \
              --enable-only64bit
  make -j4
  make -j4 install

Doing this assumes that the install path (`pwd`/Inst) is valid on
both host and target, which isn't normally the case.  To avoid
this limitation, do instead:

  ./configure --prefix=/install/path/on/target \
              --host=aarch64-unknown-linux \
              --enable-only64bit
  make -j4
  make -j4 install DESTDIR=/a/temp/dir/on/host
  # and then copy the contents of DESTDIR to the target.

See README.android for more examples of cross-compile building.


Implementation tidying-up/TODO notes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

UnwindStartRegs -- what should that contain?


vki-arm64-linux.h: vki_sigaction_base
I really don't think that __vki_sigrestore_t sa_restorer
should be present.  Adding it surely puts sa_mask at a wrong
offset compared to (kernel) reality.  But not having it causes
compilation of m_signals.c to fail in hard to understand ways,
so adding it temporarily.
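
To make the offset concern concrete, here is a small C sketch.  The
demo_* types are hypothetical stand-ins, not the real vki_ definitions,
and the sketch does not settle which layout the arm64 kernel actually
uses.  It only shows that an extra pointer-sized field placed ahead of
the mask moves the mask by the size of that pointer.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the handler, restorer and sigset types. */
typedef void (*demo_handler_t)(int);
typedef void (*demo_restorer_t)(void);
typedef struct { unsigned long sig[1]; } demo_sigset_t;

/* Layout without a restorer field. */
struct demo_sigaction_no_restorer {
    demo_handler_t handler;
    unsigned long  flags;
    demo_sigset_t  mask;
};

/* Layout with a restorer field placed ahead of the mask. */
struct demo_sigaction_with_restorer {
    demo_handler_t  handler;
    unsigned long   flags;
    demo_restorer_t restorer;
    demo_sigset_t   mask;
};

/* How far the mask moves when the restorer field is present. */
size_t demo_mask_shift(void)
{
    return offsetof(struct demo_sigaction_with_restorer, mask)
         - offsetof(struct demo_sigaction_no_restorer, mask);
}
```

If the two sides of the syscall boundary disagree about which of these
layouts is in use, the mask is read from the wrong place, which is the
worry stated above.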


m_trampoline.S: what's the unexecutable-insn value? 0xFFFFFFFF
is there at the moment, but 0x00000000 is probably what it should be.
Also, fix indentation/tab-vs-space stuff


./include/vki/vki-arm64-linux.h: uses __uint128_t.  Should change
it to __vki_uint128_t, but what's the defn of that?


m_debuginfo/priv_storage.h: need proper defn of DiCfSI


readdwarf.c: is this correct?
#elif defined(VGP_arm64_linux)
#  define FP_REG  29    //???
#  define SP_REG  31    //???
#  define RA_REG_DEFAULT 30    //???


vki-arm64-linux.h:
re linux-3.10.5/include/uapi/asm-generic/sembuf.h
I'd say the amd64 version has padding it shouldn't have.  Check?


syswrap-linux.c run_a_thread_NORETURN assembly sections
seems like tst->os_state.exitcode has word type
in which case the ppc64_linux use of lwz to read it, is wrong


syswrap-linux.c ML_(do_fork_clone)
assuming that VGP_arm64_linux is the same as VGP_arm_linux here


dispatch-arm64-linux.S: FIXME: set up FP control state before
entering generated code.  Also fix screwy indentation.


dispatcher-ery general: what's a good (predictor-friendly) way to
branch to a register?


in vki-arm64-scnums.h
//#if __BITS_PER_LONG == 64 && !defined(__SYSCALL_COMPAT)
Probably want to reenable that and clean up accordingly


putIRegXXorZR: figure out a way that the computed value is actually
used, so as to keep any memory reads that might generate it, alive.
(else the simulation can lose exceptions).  At least, for writes to
the zero register generated by loads .. or .. can any other
integer instructions, that write to a register, cause exceptions?


loads/stores: generate stack alignment checks as necessary


fix barrier insns: ISB, DMB


fix atomic loads/stores


FMADD/FMSUB/FNMADD/FNMSUB: generate and use the relevant fused
IROps so as to avoid double rounding


ARM64Instr_Call getRegUsage: re-check relative to what
getAllocableRegs_ARM64 makes available


Make dispatch-arm64-linux.S save any callee-saved Q regs
I think what is required is to save D8-D15 and nothing more than that.


wrapper for __NR3264_fstat -- correct?


PRE(sys_clone): get rid of references to vki_modify_ldt_t and the
definition of it in vki-arm64-linux.h.  Ditto for 32 bit arm.


sigframe-arm64-linux.c: build_sigframe: references to nonexistent
siguc->uc_mcontext.trap_no, siguc->uc_mcontext.error_code have been
replaced by zero.  Also in synth_ucontext.


m_debugger.c:
uregs.pstate   = LibVEX_GuestARM64_get_nzcv(vex); /* is this correct? */
Is that remotely correct?


host_arm64_defs.c: emit_ARM64Instr:
ARM64in_VDfromX and ARM64in_VQfromXX: use simple top-half zeroing
MOVs to vector registers instead of INS Vd.D[0], Xreg, to avoid false
dependencies on the top half of the register.  (Or at least check
the semantics of INS Vd.D[0] to see if it zeroes out the top.)


preferredVectorSubTypeFromSize: review perf effects and decide
on a types-for-subparts policy


fold_IRExpr_Unop: add a reduction rule for this
1Sto64(CmpNEZ64( Or64(GET:I64(1192),GET:I64(1184)) ))
viz 1Sto64(CmpNEZ64(x)) --> CmpwNEZ64(x)
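
A quick way to convince oneself that the proposed reduction is sound
is to model the three primops on plain integers.  The demo_* functions
below are illustrative models written for this note, not VEX code:
CmpNEZ64 yields a 1-bit "is nonzero" result, 1Sto64 sign-extends that
bit to 64 bits, and CmpwNEZ64 maps zero to all-zeroes and anything else
to all-ones.

```c
#include <stdint.h>

/* Model of CmpNEZ64: 1 if x is nonzero, else 0. */
int demo_CmpNEZ64(uint64_t x)
{
    return x != 0;
}

/* Model of 1Sto64: sign-extend a single bit to 64 bits. */
uint64_t demo_1Sto64(int bit)
{
    return bit ? ~UINT64_C(0) : UINT64_C(0);
}

/* Model of CmpwNEZ64: all-zeroes for zero, all-ones otherwise. */
uint64_t demo_CmpwNEZ64(uint64_t x)
{
    return x == 0 ? UINT64_C(0) : ~UINT64_C(0);
}

/* The unreduced form that the rule would rewrite. */
uint64_t demo_unreduced(uint64_t x)
{
    return demo_1Sto64(demo_CmpNEZ64(x));
}
```

Under these models demo_unreduced(x) and demo_CmpwNEZ64(x) agree for
every x, so the rewrite replaces two primops with one.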


check insn selection for memcheck-only primops:
Left64 CmpwNEZ64 V128to64 V128HIto64 1Sto64 CmpNEZ64 CmpNEZ32
widen_z_8_to_64 1Sto32 Left32 32HLto64 CmpwNEZ32 CmpNEZ8


isel: get rid of various cases where zero is put into a register
and just use xzr instead.  Especially for CmpNEZ64/32.  And for
writing zeroes into the CC thunk fields.


/* Keep this list in sync with that in iselNext below */
/* Keep this list in sync with that for Ist_Exit above */
uh .. they are not in sync


very stupid:
imm64 x23, 0xFFFFFFFFFFFFFFA0
17 F4 9F D2  F7 FF BF F2  F7 FF DF F2  F7 FF FF F2
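
The sequence above is a MOVZ followed by three MOVKs.  A value whose
16-bit chunks are mostly 0xFFFF can instead start from MOVN.  The
sketch below is one possible way to pick the cheaper starting point.
The demo_* names are made up for this note, and it deliberately ignores
other options such as logical-immediate ORR forms.

```c
#include <stdint.h>

/* Count the 16-bit chunks of v that differ from fill (0x0000 or 0xFFFF). */
unsigned demo_chunks_differing(uint64_t v, uint16_t fill)
{
    unsigned n = 0;
    for (int i = 0; i < 4; i++) {
        if (((v >> (16 * i)) & 0xFFFF) != fill)
            n++;
    }
    return n;
}

/* Instructions needed when starting from MOVZ (fill 0x0000) or MOVN
   (fill 0xFFFF) and patching each remaining chunk with one MOVK.
   At least one instruction is always needed. */
unsigned demo_imm64_insn_count(uint64_t v)
{
    unsigned via_movz = demo_chunks_differing(v, 0x0000);
    unsigned via_movn = demo_chunks_differing(v, 0xFFFF);
    unsigned best = via_movz < via_movn ? via_movz : via_movn;
    return best == 0 ? 1 : best;
}

/* Encode the 64-bit form MOVN Xd, #imm16, LSL #(16*hw). */
uint32_t demo_encode_movn_x(unsigned rd, uint16_t imm16, unsigned hw)
{
    return UINT32_C(0x92800000)
         | ((uint32_t)(hw & 3) << 21)
         | ((uint32_t)imm16 << 5)
         | (uint32_t)(rd & 31);
}
```

For 0xFFFFFFFFFFFFFFA0 the count is 1, and
demo_encode_movn_x(23, 0x5F, 0) gives 0x92800BF7, i.e. MOVN x23, #0x5F.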


valgrind.h: fix VALGRIND_ALIGN_STACK/VALGRIND_RESTORE_STACK,
also add CFI annotations


could possibly bring r29 into use, which would be useful as it is
callee saved


ubfm/sbfm etc: special case cases that are simple shifts, as iropt
can't always simplify the general-case IR to a shift in such cases.
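
As a sketch of what such special-casing could look like for the 64-bit
UBFM form: the architecture defines LSR Xd, Xn, #s as UBFM with
immr = s, imms = 63, and LSL Xd, Xn, #s as UBFM with
immr = (64 - s) mod 64, imms = 63 - s.  The classifier below recognises
those two patterns.  The demo_* names are hypothetical and this is not
the front end's actual code.

```c
#include <stdint.h>

typedef enum {
    DEMO_BFM_GENERAL,   /* needs the general-case IR */
    DEMO_BFM_LSL,       /* plain logical shift left  */
    DEMO_BFM_LSR        /* plain logical shift right */
} demo_bfm_kind;

/* Classify a 64-bit UBFM Xd, Xn, #immr, #imms.  For the two shift
   cases, *shift receives the shift amount. */
demo_bfm_kind demo_classify_ubfm64(unsigned immr, unsigned imms,
                                   unsigned *shift)
{
    if (immr > 63 || imms > 63)
        return DEMO_BFM_GENERAL;      /* not a valid 64-bit encoding */
    if (imms == 63) {
        *shift = immr;                /* LSR Xd, Xn, #immr */
        return DEMO_BFM_LSR;
    }
    if (imms + 1 == immr) {
        *shift = 63 - imms;           /* LSL Xd, Xn, #(63 - imms) */
        return DEMO_BFM_LSL;
    }
    return DEMO_BFM_GENERAL;          /* e.g. UBFX, UBFIZ */
}
```

A front end using something like this could emit a single Shl64 or
Shr64 for the two shift cases and fall back to the general sequence
otherwise.  SBFM would need an analogous check for ASR.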


LDP,STP (immediate, simm7) (FP&VEC)
should zero out hi parts of dst registers in the LDP case


DUP insns: use Iop_Dup8x16, Iop_Dup16x8, Iop_Dup32x4
rather than doing it "by hand"


Any place where ZeroHI64ofV128 is used in conjunction with
FP vector IROps: find a way to make sure that arithmetic on
the upper half of the values is "harmless."


math_MINMAXV: use real Iop_Cat{Odd,Even}Lanes ops rather than
inline scalar code


chainXDirect_ARM64: use direct jump forms when possible