Interaction
Tap, swipe, type text, and press keys on the device.
These tools send input directly to the device screen. Every interaction tool requires a reason parameter: a first-person description of the action and its intent, which is captured for audit logging.
All interaction tools return a diff of what changed on screen as a result of the action.
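As a rough illustration of the reason requirement, the sketch below assembles a call payload and rejects a blank reason before anything is sent. The helper name `build_call` and the `{"name": ..., "arguments": ...}` payload shape are assumptions made for this example. The page itself does not specify how calls are transported.

```python
from typing import Any


def build_call(tool: str, reason: str, **params: Any) -> dict:
    """Assemble a tool-call payload, rejecting a missing or blank reason."""
    if not reason or not reason.strip():
        raise ValueError(f"{tool}: a non-empty reason is required")
    # Assumed payload shape (hypothetical): tool name plus an arguments object.
    return {"name": tool, "arguments": {**params, "reason": reason.strip()}}


call = build_call(
    "tap",
    reason="I am tapping the Email field so I can enter the address.",
    x=540,
    y=750,
)
print(call)
```

Validating the reason on the client side keeps a malformed call from ever reaching the device, which may make audit logs easier to read.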
tap
Tap at specific screen coordinates. Use get_textual_state to find element positions.
| Parameter | Type | Required | Description |
|---|---|---|---|
| x | number | Yes | X coordinate |
| y | number | Yes | Y coordinate |
| reason | string | Yes | Intent description |
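One way to turn an element position into tap coordinates is to take the center of its bounds. The sketch below assumes positions are reported as `[left,top][right,bottom]`, the format that appears in the diff output on this page. Whether get_textual_state uses the same format is an assumption, and the helper names and the sample bounds are illustrative only.

```python
import re

# Matches a bounds string such as "[120,700][960,800]" (assumed format).
BOUNDS = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")


def center_of(line: str) -> tuple:
    """Return the integer center (x, y) of the first bounds found in line."""
    match = BOUNDS.search(line)
    if match is None:
        raise ValueError(f"no bounds found in {line!r}")
    left, top, right, bottom = map(int, match.groups())
    return (left + right) // 2, (top + bottom) // 2


def tap_args(line: str, reason: str) -> dict:
    """Build the tap arguments (x, y, reason) for an element line."""
    x, y = center_of(line)
    return {"x": x, "y": y, "reason": reason}


# Illustrative element line; the bounds are made up for this example.
args = tap_args(
    'EditText "Email" [120,700][960,800]',
    "I am tapping the Email field to focus it.",
)
print(args["x"], args["y"])  # 540 750
```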
Tapped at (540, 750).
Elements changed:
EditText "Email" focused=false -> focused=true
Keyboard appeared
long_press
Long press at a coordinate. Useful for context menus, drag initiation, or selection.
| Parameter | Type | Required | Description |
|---|---|---|---|
| x | number | Yes | X coordinate |
| y | number | Yes | Y coordinate |
| duration | number | No | Hold duration in milliseconds. Default: 1000 |
| reason | string | Yes | Intent description |
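Because duration is optional, a caller can leave it out and rely on the documented default of 1000 ms. The following sketch builds long_press arguments that way. The helper name `long_press_args` is hypothetical, and rejecting non-positive durations is a choice made for this example and not documented behavior.

```python
from typing import Optional


def long_press_args(
    x: int, y: int, reason: str, duration_ms: Optional[int] = None
) -> dict:
    """Build long_press arguments, omitting duration to use the default."""
    args = {"x": x, "y": y, "reason": reason}
    if duration_ms is not None:
        if duration_ms <= 0:
            raise ValueError("duration must be a positive number of milliseconds")
        args["duration"] = duration_ms
    return args


# Relies on the documented default hold time (1000 ms).
default_hold = long_press_args(540, 750, "I am long-pressing to open the context menu.")

# Holds for two seconds, for example to start a drag.
long_hold = long_press_args(540, 750, "I am holding the item to start a drag.", 2000)
```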
swipe
Swipe between two points. Use for scrolling, dismissing cards, pull-to-refresh, and drawer gestures.
| Parameter | Type | Required | Description |
|---|---|---|---|
| startX | number | Yes | Start X |
| startY | number | Yes | Start Y |
| endX | number | Yes | End X |
| endY | number | Yes | End Y |
| duration | number | No | Swipe duration in milliseconds. Default: 300 |
| reason | string | Yes | Intent description |
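For scrolling, a swipe usually runs along a vertical line: dragging from a larger Y to a smaller Y moves the content up and reveals items further down the list. The sketch below builds such a call along the horizontal center of the screen. The helper name `vertical_swipe_args` and the 1080-pixel screen width are assumptions for illustration.

```python
from typing import Optional


def vertical_swipe_args(
    screen_width: int,
    start_y: int,
    end_y: int,
    reason: str,
    duration_ms: Optional[int] = None,
) -> dict:
    """Build swipe arguments for a vertical swipe along the screen's center."""
    x = screen_width // 2
    args = {
        "startX": x,
        "startY": start_y,
        "endX": x,
        "endY": end_y,
        "reason": reason,
    }
    if duration_ms is not None:
        # Omitted by default so the documented 300 ms applies.
        args["duration"] = duration_ms
    return args


# Assumed screen width of 1080 px; swipes upward to scroll the list down.
args = vertical_swipe_args(
    1080, 1800, 400, "I am scrolling down to reveal more list items."
)
print(args)
```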
Swiped from (540, 1800) to (540, 400).
Elements added:
TextView "Item 5" [120,1600][960,1700]
TextView "Item 6" [120,1720][960,1820]
type_text
Type text into the currently focused input field. The keyboard must be visible — verify with get_general_state before calling.
| Parameter | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | Text to type |
| reason | string | Yes | Intent description |
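The keyboard precondition can be enforced in client code before the call is made. This sketch takes a `keyboard_visible` flag that the caller is assumed to have derived from get_general_state. How that tool reports keyboard state is not described here, so the flag and the helper name `type_text_args` are hypothetical.

```python
def type_text_args(text: str, reason: str, keyboard_visible: bool) -> dict:
    """Build type_text arguments, refusing to proceed without a keyboard."""
    if not keyboard_visible:
        raise RuntimeError("keyboard is not visible; tap an input field first")
    if not text:
        raise ValueError("text must not be empty")
    return {"text": text, "reason": reason}


# keyboard_visible would come from a prior get_general_state check.
args = type_text_args(
    "user@example.com",
    "I am entering the account email address.",
    keyboard_visible=True,
)
print(args)
```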
Typed "user@example.com".
press_key
Press a system key or send a numeric key code.
| Parameter | Type | Required | Description |
|---|---|---|---|
| key | string | Yes | back, home, enter, recents, or a numeric key code |
| reason | string | Yes | Intent description |
Pressed key: back.
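Since key accepts either one of four names or a numeric key code, a small check can catch typos before the call is sent. In the sketch below, the helper name `press_key_args` is hypothetical, and passing the numeric code as a string of digits is an assumption based on the parameter's string type.

```python
NAMED_KEYS = {"back", "home", "enter", "recents"}


def press_key_args(key: str, reason: str) -> dict:
    """Build press_key arguments for a named key or a numeric key code."""
    key = key.strip()
    is_code = key.isascii() and key.isdigit()
    if key not in NAMED_KEYS and not is_code:
        raise ValueError(f"unsupported key: {key!r}")
    return {"key": key, "reason": reason}


back = press_key_args("back", "I am going back to the previous screen.")

# A numeric code passed as a string; the value 66 is only an example.
by_code = press_key_args("66", "I am sending a key code to submit the form.")
```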