instr_t, which represents a single IA-32 instruction, and the instrlist_t, which is a linked list of instructions. The header files dr_ir_instrlist.h and dr_ir_instr.h list a number of functions that operate on these data structures, including:
instrlist_t
.instr_t
from an instrlist_t
.
As we will see in the Events section that follows, a client usually interacts with instrlist_t structures in the form of basic blocks or traces. A basic block is a sequence of instructions that terminates with a control transfer operation. Traces are frequently-executed sequences of basic blocks that DynamoRIO forms dynamically as the application executes, i.e., hot code. Collectively, we refer to basic blocks and traces as fragments. Both basic blocks and traces present a linear view of control flow. In other words, instruction sequences have a single entrance and one or more exits. This representation greatly simplifies analysis and is a primary contributor to DynamoRIO's efficiency.
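To make the definition of a basic block concrete, here is a minimal self-contained sketch. It is not DynamoRIO code: toy_instr_t and toy_block_length() are hypothetical names introduced only for illustration. It assumes each instruction carries a flag saying whether it is a control transfer. The block that begins at a given index then extends up to and including the first control transfer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy instruction record; NOT DynamoRIO's instr_t. */
typedef struct {
    bool is_cti; /* true if this is a control transfer (branch, call, return) */
} toy_instr_t;

/* Returns the number of instructions in the basic block that begins at
 * index start: everything up to and including the first control transfer.
 * If no control transfer is found, the block runs to the end of the array. */
size_t
toy_block_length(const toy_instr_t *code, size_t count, size_t start)
{
    assert(code != NULL && start < count);
    size_t i = start;
    while (i < count) {
        if (code[i].is_cti)
            return i - start + 1;
        i++;
    }
    return count - start;
}
```

In this model a block is defined by where execution enters it, so the blocks starting at two different indices can cover some of the same instructions while each still has a single entrance.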
DynamoRIO provides two events related to fragment creation: one for basic blocks and one for traces (see dr_register_bb_event() and dr_register_trace_event()). Through these fragment-creation hooks, the client has the ability to inspect and modify any piece of code that DynamoRIO emits before it executes. Using the basic block hook, a client sees all application code. The trace-creation hook provides a mechanism for clients to instrument only frequently-executed code paths.
The code manipulation API is highly efficient in that fragment creation comprises a small part of DynamoRIO's overhead. A client's analysis time rarely impacts execution time; only instrumentation code inserted by the client affects efficiency.
dr_emit_flags_t new_block(void *drcontext, void *tag, instrlist_t *bb, bool for_trace, bool translating);
drcontext
is a pointer to the input program's machine context. Clients should not inspect or modify the context; it is provided as an opaque pointer (i.e., void *) to be passed to API routines that require access to this internal data.
tag
is a unique identifier for the basic block fragment.
bb
is a pointer to the list of instructions that comprise the basic block. Clients can examine, manipulate, or completely replace the instructions in the list.
for_trace
indicates whether this callback is for a new basic block (false) or for adding a basic block to a trace being created (true). The client has the opportunity to either include the same modifications made to the standalone basic block, or to use different modifications, for the code in the trace.
translating
indicates whether this callback is for basic block creation (false) or is for address translation (true). This is further explained in State Translation.
The return value of the basic block callback should generally be DR_EMIT_DEFAULT; however, time-varying instrumentation or complex code transformations may need to return DR_EMIT_STORE_TRANSLATIONS. See State Translation for further details.
To iterate over instructions in an instrlist_t, use the instrlist_first(), instrlist_last(), and instr_get_next() routines. For example:
    dr_emit_flags_t
    new_block(void *drcontext, void *tag, instrlist_t *bb,
              bool for_trace, bool translating)
    {
        instr_t *instr, *next;
        /* Stop at NULL so that the last instruction is processed too. */
        for (instr = instrlist_first(bb); instr != NULL; instr = next) {
            /* Fetch the successor first in case instr is modified or removed. */
            next = instr_get_next(instr);
            /* do some processing on instr */
        }
        return DR_EMIT_DEFAULT;
    }
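Reading the next pointer before processing the current instruction, as the loop above does, keeps the traversal valid even if processing removes the current instruction from the list. The following self-contained sketch shows the same pattern on a plain doubly linked list. The toy_node_t and toy_list_t types and all toy_ functions are hypothetical stand-ins, not DynamoRIO's instr_t, instrlist_t, or API routines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT DynamoRIO's instr_t/instrlist_t. */
typedef struct toy_node {
    int value;
    struct toy_node *prev, *next;
} toy_node_t;

typedef struct {
    toy_node_t *first, *last;
} toy_list_t;

/* Appends a caller-owned node to the end of the list. */
void
toy_append(toy_list_t *list, toy_node_t *node)
{
    assert(list != NULL && node != NULL);
    node->next = NULL;
    node->prev = list->last;
    if (list->last != NULL)
        list->last->next = node;
    else
        list->first = node;
    list->last = node;
}

/* Unlinks a node and clears its links, as a remove routine typically would. */
void
toy_remove(toy_list_t *list, toy_node_t *node)
{
    if (node->prev != NULL)
        node->prev->next = node->next;
    else
        list->first = node->next;
    if (node->next != NULL)
        node->next->prev = node->prev;
    else
        list->last = node->prev;
    node->prev = node->next = NULL;
}

/* Removes every node with a negative value; returns the number visited.
 * next is read before the node is touched, so unlinking it is safe. */
int
toy_remove_negatives(toy_list_t *list)
{
    int visited = 0;
    toy_node_t *node, *next;
    for (node = list->first; node != NULL; node = next) {
        next = node->next; /* cache before a possible removal */
        visited++;
        if (node->value < 0)
            toy_remove(list, node);
    }
    return visited;
}
```

Had the loop advanced with node->next after the removal, it would have stopped early, because toy_remove() clears the links of the node it unlinks.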
instr_set_ok_to_mangle()
instrlist_meta_preinsert()
instrlist_meta_postinsert()
instrlist_meta_append()
Through meta instructions, a client can add its own internal control flow or make a call to a native routine that will not be brought into the code cache by DynamoRIO. However, such native calls need to be careful to remain transparent (see Clean Calls).
Meta-instructions are considered observational and should not fault. See State Translation for more information on fault handling.
While DynamoRIO attempts to support arbitrary code transformations, its internal operation requires that we impose the following limitations:
for_trace
parameter is true). The trace-creation event provides a mechanism for clients to instrument hot code separately.
The parameters to the trace-creation event hook are nearly identical to those of the basic block hook:
dr_emit_flags_t new_trace(void *drcontext, void *tag, instrlist_t *trace, bool translating);
drcontext
is a pointer to the input program's machine context. Clients should not inspect or modify the context; it is provided as an opaque pointer (i.e., void *) to be passed to API routines that require access to this internal data.
tag
is a unique identifier for the trace fragment.
trace
is a pointer to the list of instructions that comprise the trace. Clients can examine, manipulate, or completely replace the instructions in the list.
translating
indicates whether this callback is for trace creation (false) or is for address translation (true). This is further explained in State Translation.
The return value of the trace callback should generally be DR_EMIT_DEFAULT; however, time-varying instrumentation or complex code transformations may need to return DR_EMIT_STORE_TRANSLATIONS. See State Translation for further details.
DynamoRIO calls the client-supplied event hook each time a trace is created, just before the trace is emitted into the code cache. Additionally, as each constituent basic block is added to the trace, DynamoRIO calls the basic block creation hook with the for_trace parameter set to true. In order to preserve basic block instrumentation inside of traces, a client need only act identically with respect to the for_trace parameter; it can ignore the trace event if its goal is to place instrumentation on all code.
The constituent basic blocks will be stitched together prior to insertion in the code cache: conditional branches will be realigned so that their fall-through target remains on the trace, and inlined indirect branches will be preceded by a comparison against the on-trace target.
If the basic block callback behaves differently based on the for_trace parameter, different instrumentation will exist in the trace as opposed to the standalone basic block. If the basic block corresponds to the application code at the start of the trace (i.e., it is a trace head), the trace will shadow the basic block and the trace will be executed preferentially. If dr_delete_fragment() is called, it will also delete the trace first and may leave the basic block in place. The flush routines (dr_flush_region(), dr_delay_flush_region(), dr_unlink_flush_region()), however, will delete traces and basic blocks alike.
void restore_state(void *drcontext, void *tag, dr_mcontext_t *mcontext, bool restore_memory, bool app_code_consistent);
void restore_state_ex(void *drcontext, bool restore_memory, dr_restore_state_info_t *info);
See State Translation for further details.
void fragment_deleted(void *drcontext, void *tag);
DynamoRIO calls this event hook each time it deletes a fragment from the code cache. Such information may be needed if the client maintains its own data structures about emitted fragment code that must be consistent across fragment deletions.
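As an illustration of keeping client data consistent across fragment deletions, the sketch below tracks one record per fragment tag and drops it when the fragment is deleted. Everything here is hypothetical: the toy_ names, the fixed-size table, and the integer payload are assumptions made for the example, and no DynamoRIO routine is called. It also assumes a single tracked fragment per tag and no concurrent access; a real client would likely need synchronization and a proper hash table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_FRAGMENTS 128 /* hypothetical capacity */

/* Hypothetical per-fragment client record keyed by the fragment tag. */
typedef struct {
    void *tag;   /* NULL marks a free slot */
    int payload; /* example per-fragment client data */
} toy_entry_t;

typedef struct {
    toy_entry_t slots[TOY_MAX_FRAGMENTS];
} toy_table_t;

/* Linear scan; passing NULL finds the first free slot. */
static toy_entry_t *
toy_find(toy_table_t *table, void *tag)
{
    for (size_t i = 0; i < TOY_MAX_FRAGMENTS; i++) {
        if (table->slots[i].tag == tag)
            return &table->slots[i];
    }
    return NULL;
}

/* Intended for the fragment-creation path: records data for a tag.
 * Re-creating a known tag overwrites its payload. Returns false if full. */
bool
toy_on_fragment_created(toy_table_t *table, void *tag, int payload)
{
    assert(table != NULL && tag != NULL);
    toy_entry_t *entry = toy_find(table, tag);
    if (entry == NULL)
        entry = toy_find(table, NULL);
    if (entry == NULL)
        return false;
    entry->tag = tag;
    entry->payload = payload;
    return true;
}

/* Intended for the fragment-deleted hook: drops the record so it cannot go stale. */
void
toy_on_fragment_deleted(toy_table_t *table, void *tag)
{
    assert(table != NULL && tag != NULL);
    toy_entry_t *entry = toy_find(table, tag);
    if (entry != NULL) {
        entry->tag = NULL;
        entry->payload = 0;
    }
}

/* Returns true and stores the payload if the tag is currently tracked. */
bool
toy_lookup(toy_table_t *table, void *tag, int *payload_out)
{
    assert(table != NULL && tag != NULL && payload_out != NULL);
    toy_entry_t *entry = toy_find(table, tag);
    if (entry == NULL)
        return false;
    *payload_out = entry->payload;
    return true;
}
```

In a client, the body of toy_on_fragment_deleted() is the kind of work that would run inside the fragment_deleted hook shown above.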
instr_t data structure with all information about the instruction (e.g., opcode and operand information).
When decoding instructions, clients must explicitly manage the instr_t data structure. For example, the following code shows how to use the instr_init(), instr_reset(), and instr_free() routines to decode a sequence of arbitrary instructions:
    instr_t instr;
    instr_init(dcontext, &instr);
    do {
        /* Reuse the same instr_t for each decoded instruction. */
        instr_reset(dcontext, &instr);
        pc = decode(dcontext, pc, &instr);
        /* check for invalid instr */
        if (pc == NULL)
            break;
        if (instr_writes_memory(&instr)) {
            /* do some processing */
        }
    } while (pc < stop_pc);
    instr_free(dcontext, &instr);
The 64-bit build of DynamoRIO is capable of decoding 32-bit code as well. See 64-bit Versus 32-bit Instructions for full details.
instr_t *instr = INSTR_CREATE_dec(dcontext, opnd_create_reg(REG_EDX));
    instr_t *instr = instr_create(dcontext);
    instr_set_opcode(instr, OP_dec);
    instr_set_num_opnds(dcontext, instr, 1, 1);
    instr_set_dst(instr, 0, opnd_create_reg(REG_EDX));
    instr_set_src(instr, 0, opnd_create_reg(REG_EDX));
When using the second method, the exact order of operands and their sizes must match the templates that DynamoRIO uses. The INSTR_CREATE_ macros in dr_ir_macros.h should be consulted to determine the order.
When encoding a control transfer instruction that targets another instruction, two encoding passes are performed: one to find the offset of the target instruction, and the other to link the control transfer to the proper target offset.
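The two-pass idea can be sketched independently of DynamoRIO. The model below is an assumption-laden illustration, not the library's encoder: toy_enc_instr_t and toy_encode_two_pass() are hypothetical names, every instruction's encoded length is assumed to be known up front, and the displacement is assumed to be measured from the end of the branch instruction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy instruction; NOT DynamoRIO's instr_t. */
typedef struct {
    int length;   /* encoded size in bytes (assumed known in advance) */
    int target;   /* index of the target instruction, or -1 if not a branch */
    int offset;   /* filled by pass 1: byte offset from the start of the list */
    int rel_disp; /* filled by pass 2: displacement from the end of this instr */
} toy_enc_instr_t;

/* Lays out the list and resolves instruction-to-instruction branches.
 * Returns the total encoded size in bytes. */
int
toy_encode_two_pass(toy_enc_instr_t *instrs, int count)
{
    assert(instrs != NULL && count >= 0);

    /* Pass 1: find the offset of every instruction, targets included. */
    int offset = 0;
    for (int i = 0; i < count; i++) {
        instrs[i].offset = offset;
        offset += instrs[i].length;
    }

    /* Pass 2: link each branch to its target now that offsets are known. */
    for (int i = 0; i < count; i++) {
        int t = instrs[i].target;
        if (t < 0) {
            instrs[i].rel_disp = 0;
            continue;
        }
        assert(t < count);
        instrs[i].rel_disp =
            instrs[t].offset - (instrs[i].offset + instrs[i].length);
    }
    return offset;
}
```

A single pass cannot resolve a forward branch, because the target's offset is not yet known when the branch is reached; that is the reason for separating layout from linking.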
The 64-bit build of DynamoRIO is capable of encoding 32-bit code as well. See 64-bit Versus 32-bit Instructions for details.
Each instruction contains a flag indicating whether it should be encoded in 32-bit or 64-bit mode. When an instruction is created or decoded, the instruction's flag is set to the thread's current mode. It can be queried with instr_get_x86_mode() and changed with instr_set_x86_mode().
The following subsections describe these routines in more detail.
Here is an example of inserting a clean call to the at_mbr function:
    if (instr_is_mbr(instr)) {
        app_pc address = instr_get_app_pc(instr);
        uint opcode = instr_get_opcode(instr);
        instr_t *nxt = instr_get_next(instr);
        dr_insert_clean_call(drcontext, ilist, nxt, (void *) at_mbr,
                             false /* don't need to save fp state */,
                             2 /* 2 parameters */,
                             /* opcode is 1st parameter */
                             OPND_CREATE_INT32(opcode),
                             /* address is 2nd parameter */
                             OPND_CREATE_INTPTR(address));
    }
Through this mechanism, clients can write analysis code in C or other high-level languages and easily insert calls to these routines in the instruction stream. Note, however, that saving and restoring machine state is an expensive operation. Performance-critical operations should be inlined for maximum efficiency.
The stack that DynamoRIO switches to for clean calls is relatively small: only 20KB by default. Clients can increase the size of the stack with the -stack_size runtime option. Clients should also avoid keeping persistent state on the clean call stack, as it is wiped clean at the start of each clean call.
The saved interrupted application state can be accessed using dr_get_mcontext() and modified using dr_set_mcontext().
For performance reasons, clean calls do not save or restore floating point, MMX, or SSE state by default. If the clean callee is using floating point or multimedia operations, it should request that the clean call mechanism preserve the floating point state through the appropriate parameter to dr_insert_clean_call(). See also Floating Point State, MMX, and SSE Transparency.
If more detailed control over the call sequence is desired, it can be broken down into its constituent pieces:
dr_save_reg(), dr_restore_reg(), and dr_reg_spill_slot_opnd()
The values stored in these spill slots remain valid until the next non-meta (i.e. application) instruction and as such can be accessed from clean calls using:
For longer-term persistence, DynamoRIO also provides a generic dedicated thread-local storage field for use by clients, making it easy to write thread-aware clients. From C code, use:
To access this thread-local field from the code cache, use the following routines to generate the necessary code:
Since saving and restoring the eflags register is required for almost all code transformations, and since it is difficult to do so efficiently, we export routines that use our efficient method of arithmetic flag preservation:
As just discussed in Clean Calls, we also export convenience routines for making clean (i.e., transparent) native calls from the code cache, as well as floating point and multimedia state preservation.
dr_insert_call_instrumentation()
dr_insert_ubr_instrumentation()
dr_insert_mbr_instrumentation()
dr_insert_cbr_instrumentation()
In order to directly modify the instrumentation on a particular fragment (as opposed to replacing instrumentation on all copies of fragments corresponding to particular application code), DynamoRIO also supports directly replacing an existing fragment with a new instrlist_t:
However, this routine is only supported when running with the -thread_private runtime option, and it replaces the fragment for the current thread only. A client can call this routine even while inside the to-be-replaced fragment (e.g., in a clean call from inside the fragment). In this scenario, the old fragment is executed to completion and the new code is inserted before the next execution.
For example usage, see the client sample Modifying Existing Instrumentation.
DynamoRIO allows a client to build custom traces by marking its own trace heads (in addition to DynamoRIO's normal trace heads) and deciding when to end traces. If a client registers for the following event, DynamoRIO will call its hook before extending a trace (with tag trace_tag) with a new basic block (with tag next_tag):
int query_end_trace(void *drcontext, void *trace_tag, void *next_tag);
The client hook returns one of these values:
If using standard termination criteria, DynamoRIO ends the trace if it reaches a trace head or another trace (or certain corner-case basic blocks that cannot be part of a trace).
The client can also mark any basic block as a trace head with
For example usage, see the callee-inlining client sample Custom Tracing.
If a client is only adding observational instrumentation (i.e., Meta-Instructions, which should not fault) and is not modifying, reordering, or removing application instructions, these details can be ignored. In that case the client's basic block and trace callbacks should return DR_EMIT_DEFAULT and must also be deterministic and idempotent (i.e., DynamoRIO should be able to repeatedly call the callback and receive back the same resulting instruction list, with no net state changes to the client).
If a client is performing modifications, then in order for DynamoRIO to properly translate a code cache address the client must use instr_set_translation() (chainable via INSTR_XL8()) in the basic block and trace creation callbacks to set the corresponding application address for each modified instruction and each added non-meta instruction. The translation value is the application address that should be presented to the application as the faulting address, or the application address that should be restarted after a suspend.
There are two methods for using the translated addresses:
tag will not be called with translating set to true. Note that unless DR_EMIT_STORE_TRANSLATIONS is also returned for for_trace calls (or DR_EMIT_STORE_TRANSLATIONS is returned in the trace callback), each constituent block comprising the trace will need to be re-created with both for_trace and translating set to true. Storing translations uses additional memory that can be significant: up to 20% in some cases, as it prevents DR from using its simple data structures and forces it to fall back to its complex, corner-case design. This is why DR does not store all translations by default.
translating set to true. All modifications to the instruction list that were performed on the creation callback must be repeated on the translating callback. This option is only possible when basic block modifications are deterministic and idempotent, but it saves memory. Naturally, global state changes triggered by block creation should be wrapped in checks for translating being false. Even in this case, instr_set_translation() should be called for non-meta instructions even when translating is false, as DynamoRIO may decide to store the translations at creation time for reasons of its own.
Furthermore, if the client's modifications change any part of the machine state besides the program counter, the client should use dr_register_restore_state_event() or dr_register_restore_state_ex_event() (see State Restoration) to restore the registers to their original application values.
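The requirement that a replayed callback reproduce the same instruction list while leaving client state untouched can be sketched as follows. This is a self-contained model, not a DynamoRIO callback: instructions are plain integers, TOY_META and the toy_ names are hypothetical, and the rule "insert a marker before every even-valued instruction" merely stands in for some deterministic instrumentation decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_META 0 /* hypothetical marker for an inserted meta instruction */

typedef struct {
    int blocks_created; /* global client state; must not change on replays */
} toy_client_state_t;

/* Model of a block callback. Writes the instrumented list to out, which
 * must have room for 2 * count entries, and returns its length. The output
 * depends only on the input, so a replay produces an identical list. */
size_t
toy_block_callback(toy_client_state_t *state, const int *app, size_t count,
                   int *out, bool translating)
{
    assert(state != NULL && app != NULL && out != NULL);
    size_t len = 0;
    for (size_t i = 0; i < count; i++) {
        if (app[i] % 2 == 0)
            out[len++] = TOY_META; /* same decision on every call */
        out[len++] = app[i];
    }
    /* Global side effects happen only on genuine creation. */
    if (!translating)
        state->blocks_created++;
    return len;
}
```

Calling the model once with translating false and once with translating true yields the same list both times, while the counter advances only once.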
For meta instructions that do not reference application memory (i.e., they should not fault), leave the translation field as NULL. A NULL value instructs DynamoRIO to use the subsequent non-meta instruction's translation as the application address, and to fail when translating the full state. Since the full state will only be needed when relocating a thread (as stated, there will not be a fault here), failure indicates that this is not a valid relocation point, and DynamoRIO's thread synchronization scheme will use another spot. If the translation field is set to a non-NULL value, the client should be willing to also restore the rest of the machine state at that point (restore spilled registers, etc.) via dr_register_restore_state_event() or dr_register_restore_state_ex_event(). This is necessary for meta instructions that reference application memory. DynamoRIO takes care of such potentially-faulting instructions added by its own API routines (dr_insert_clean_call() arguments that reference application data, dr_insert_mbr_instrumentation()'s read of application indirect branch data, etc.)
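The rule for NULL translation fields described above can be modeled in a few lines. This sketch only mirrors the behavior as stated in the text and is not DynamoRIO's implementation. The toy_xl8_ types and toy_translate() are hypothetical, a translation of 0 stands in for NULL, and the full_state_ok flag is an assumed way of representing "translating the full state fails".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical code cache instruction record; NOT DynamoRIO's instr_t. */
typedef struct {
    bool is_meta;
    uintptr_t translation; /* 0 models a NULL translation field */
} toy_xl8_instr_t;

typedef struct {
    uintptr_t app_pc;   /* application address to report, or 0 if none */
    bool full_state_ok; /* false: not a valid point for full-state translation */
} toy_xl8_result_t;

/* Translates the instruction at idx. A non-NULL translation is used as is.
 * A NULL translation borrows the address of the subsequent non-meta
 * instruction and marks the full-state translation as failed. */
toy_xl8_result_t
toy_translate(const toy_xl8_instr_t *ilist, size_t count, size_t idx)
{
    assert(ilist != NULL && idx < count);
    toy_xl8_result_t result = { 0, false };
    if (ilist[idx].translation != 0) {
        result.app_pc = ilist[idx].translation;
        result.full_state_ok = true;
        return result;
    }
    for (size_t i = idx + 1; i < count; i++) {
        if (!ilist[i].is_meta) {
            result.app_pc = ilist[i].translation;
            break;
        }
    }
    return result;
}
```

In the model, a meta instruction with a non-NULL translation reports full_state_ok as true, which corresponds to the client having taken responsibility for restoring the rest of the machine state at that point.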
For meta-instructions that can fault but only in deliberate ways (typically by accessing client memory) and never access application memory, the "meta-instruction that can fault" property can be set via instr_set_meta_may_fault() to avoid incurring the potential cost of added sandboxing checks that look for changes to application code. The instrlist_meta_fault_preinsert() and related routines make this a simpler process. Here is an example:
    #define PREXL8M instrlist_meta_fault_preinsert
    app_pc xl8 = instr_get_app_pc(inst);
    PREXL8M(bb, inst, INSTR_XL8(INSTR_CREATE_mov_st(drcontext, dst, src), xl8));
When building an executable that uses DynamoRIO's Application Interface, follow the steps for Building a Client to include the header files and link with the DynamoRIO library, but omit the linker flags requesting no standard libraries or startup files. DynamoRIO's CMake support does this automatically, as the linker flags for shared libraries are separate from those for executables.