-
Notifications
You must be signed in to change notification settings - Fork 32
🤖 feat: unify bash into task_* tools #1308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
7860d42 to
986955f
Compare
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Change-Id: Ic4a03eb4953916443fcc074412498e0853c0ff17 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: If25e133840a3d7cd29c32a0970cabd432a0e2b2d Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I93d9236d1160456b3f4f3c05815b70f84e812012 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: Ifa45aefee1298bd27137a4d1ffc6bf95517689b8 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I002ab1584a1c9e73ebdd06a126d5bf01acdd9756 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I1a869f9aea822126fdf7098161d5f5198fe12ee2 Signed-off-by: Thomas Kosiewski <[email protected]>
Previously, timeout_secs=0 was treated as 'unset' and fell back to the default 10-minute timeout for agent tasks (while bash tasks correctly supported non-blocking 0-second waits). Now timeout_secs=0 is respected for both agent and bash tasks, enabling immediate non-blocking checks when needed. Addresses Codex review feedback. Change-Id: Iad7d72a342c9ae6c86ffcade4f7bc34e846376c5 Signed-off-by: Thomas Kosiewski <[email protected]>
215bd13 to
db76e83
Compare
|
Fixed in db76e83 - |
Change-Id: Id3c00aad060e4f137cbe5e6712be0502fd3a1cd6 Signed-off-by: Thomas Kosiewski <[email protected]>
|
@codex review Follow-up: fixed static-check by aligning RefreshController unit tests with the current (post-revert) RefreshController API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
mux/src/node/services/tools/task_await.ts
Lines 174 to 176 in f84415d
| try { | |
| const report = await taskService.waitForAgentReport(taskId, { | |
| timeoutMs, |
task_await now allows timeout_secs=0 (per schema/docs for non‑blocking bash), but createTaskAwaitTool forwards timeoutMs directly into taskService.waitForAgentReport. When timeout_secs is 0 and any agent task is included (explicitly or via the default “all active tasks”), waitForAgentReport asserts timeoutMs > 0 and throws, so those agent results come back as error instead of queued|running|awaiting_report. Consider treating 0 as “no wait” for agent tasks (return current status) or defaulting to the 10‑minute wait when timeout_secs is 0.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
When timeout_secs=0, task_await should return the current agent task status (queued/running/awaiting_report) rather than erroring due to waitForAgentReport's timeoutMs>0 invariant. Add a regression test. Change-Id: I6114dece5254064298bfe523339b17b1579abe48 Signed-off-by: Thomas Kosiewski <[email protected]>
|
Fixed in cac38ca.
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Change-Id: I8721464eef644e87bd4aebd60c0231b99af432e7 Signed-off-by: Thomas Kosiewski <[email protected]>
|
@codex review Addressed the truncation metadata issue for
|
Change-Id: Ic45bb48c897b0bab9a0460908e7167f07c43bf6f Signed-off-by: Thomas Kosiewski <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Change-Id: I0405e87882b040e05611ed899c1590293d15deb1 Signed-off-by: Thomas Kosiewski <[email protected]>
|
@codex review |
|
Codex Review: Didn't find any major issues. Another round soon, please! ℹ️ About Codex in GitHubCodex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback". |
Change-Id: Iae7e5a1f29002e2ccbc5bf1e5f82c2e72c975400 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: Ia3295b2a2f720c18253f36f129edf853dd0b44c1 Signed-off-by: Thomas Kosiewski <[email protected]>
This PR makes
task_*the single tool surface for both sub-agent tasks and bash execution.Highlights
task(kind="bash")supports foreground bash execution (with output limits/truncation) and background spawn (returnsbash:<processId>).task_awaitcan await both agent reports and bash incremental output (withfilter/filter_exclude+ timeout semantics).task_list/task_terminatenow include bash tasks, with descendant-scope checks.bash_*tools by default (wrappers remain temporarily for compatibility).Follow-ups addressed in this PR
truncatedmetadata and include an explicit warning inreportMarkdown.task_awaitcan still await a completed sub-agent even if its workspace has been removed.task(kind="bash")now renders using the existing Bash tool UI (instead of the task UI wrapper).bash_outputwhen surfaced viatask_await).Validation
make static-checkbun test src/node/services/tools/task.test.ts src/node/services/tools/task.bash.test.tsbun test src/node/services/backgroundProcessManager.test.ts src/node/services/tools/task_await.test.tsbun test src/node/services/taskService.test.ts📋 Implementation Plan
Consolidate
bash_*into thetask_*abstraction (task = agent | bash)Goal
Reduce tool surface overlap by making
task_*the single abstraction for both:Success criteria
bash,bash_output,bash_background_list,bash_background_terminatein the default toolset.task,task_await,task_list,task_terminatecover the full bash workflow:Recommended approach (net product LoC ≈ +450)
Extend the
task_*tools to support a new bash task kind, backed byBackgroundProcessManager, and stop exposingbash_*to models (aligngetToolsForModel()with thegetAvailableTools()allowlist; keep legacy wrappers temporarily).1) API shape (tool schemas)
1.1
taskargs: union of agent + bashUpdate
TaskToolArgsSchemainsrc/common/utils/tools/toolDefinitions.tsfrom a single strict object to a strict union:{ subagent_type, prompt, title, run_in_background? }{ kind: "bash", script, timeout_secs, run_in_background?, display_name }Notes:
kind: "bash"so parsing is unambiguous.1.2
taskresults: keep current discriminant, add optional bash fieldsTo avoid changing the
status-based discriminated unions:status: "queued"|"running"|"completed"exactly as-is.exitCode?: numbernote?: stringtruncated?: { reason: string; totalLines: number }For bash tasks:
status:"completed"withreportMarkdowncontaining a formatted bash summary + output.status:"running"and ataskId.1.3
task_awaitbecomes the unified “await output/report”Update
TaskAwaitToolArgsSchemato add bash-output knobs (all optional):filter?: stringfilter_exclude?: booleantimeout_secs?: number(allow0for non-blocking, aligning withbash_output)Update
TaskAwaitTool*ResultSchemato include optional streaming fields on active/completed results:output?: string(incremental output when kind=bash)elapsed_ms?: numberexitCode?: numbernote?: stringMapping:
agent_report).BackgroundProcessManager.getOutput(); may returnstatus:"running"withoutputpopulated.Timeout semantics for bash tasks (replacing
bash_output):task_await.timeout_secsis forwarded toBackgroundProcessManager.getOutput(...)and means “wait up to N seconds for new output (or process exit)”.status:"running"and whatever incremental output was read since the previous call (outputmay be empty).output.log); incomplete final line fragments remain buffered until newline/exit.1.4
task_listreturns both kindsExtend
TaskListToolTaskSchemaadditively:kind?: "agent"|"bash"(optional; omit for agent tasks to minimize churn)display_name?: string(bash)script?: string(bash)uptime_ms?: number(bash)exitCode?: number(bash)status:"reported"and includeexitCode.TaskListStatusSchemawithexited|killed|failedand plumb through.1.5
task_terminatesupports bothKeep
task_terminateinput shape, but broaden semantics:task_idsmay include agent or bash tasks.For results:
terminatedTaskIdsarray.terminatedTaskIds: [taskId].2) Backend implementation (node/services/tools)
2.1 Add a tiny task-id router (defensive)
Create a helper (new file) e.g.
src/node/services/tools/taskId.ts:isBashTaskId(taskId: string): booleantoBashTaskId(processId: string): string(e.g.bash:${processId})fromBashTaskId(taskId: string): string | nullDefensive checks:
assert(taskId.trim().length > 0)bash:ids early with explicit errors.2.2 Implement
task(kind=bash)insrc/node/services/tools/task.tsDispatch based on args shape:
config.runtime,config.backgroundProcessManager,config.workspaceId.run_in_background=true:backgroundProcessManager.spawn(runtime, workspaceId, script, { cwd, env, displayName, timeoutSecs }).status:"running",taskId: toBashTaskId(processId).run_in_background=false:BashToolResult→TaskToolCompletedResult.2.3 De-duplicate bash execution logic
Refactor
src/node/services/tools/bash.tsso that foreground execution is callable from both tools:executeBashForeground(config, args, ctx): Promise<BashToolResult>.createBashTool()remains as thin wrapper.createTaskTool()uses the helper for bash foreground tasks.This avoids re-implementing:
2.4 Extend
task_awaitto support bash tasksIn
src/node/services/tools/task_await.ts:args.task_idsprovided: use them.taskService.listActiveDescendantAgentTaskIds(workspaceId)backgroundProcessManager.list()filtered to descendant workspaces +status === "running"waitForAgentReport()behavior.workspaceIdor a descendant agent workspacebackgroundProcessManager.getOutput(processId, filter, filter_exclude, timeout_secs, abortSignal, workspaceId)status:"running"withoutputwhen runningstatus:"completed"withreportMarkdownandexitCodewhen exited/killed/failedinterrupted→status:"running"withnote:"interrupted"(or add a new status if you choose to extend the schema)2.5 Extend
task_listto merge bash processesIn
src/node/services/tools/task_list.ts:backgroundProcessManager.list()and include any process whoseworkspaceIdis in that set.TaskListToolTaskSchemaentry.workspaceDepth(process.workspaceId) + 1(or1if owned by current workspace).2.6 Extend
task_terminateto terminate bash tasksIn
src/node/services/tools/task_terminate.ts:terminateDescendantAgentTask()backgroundProcessManager.terminate(processId)Optional improvement:
BackgroundProcessManager.cleanup(workspaceId)already exists, but doing it explicitly makes the tool semantics deterministic.2.7 Tool wiring: ensure
task_*are init-wait wrappedBecause
task_*will now execute bash (runtime-dependent), updatesrc/common/utils/tools/tools.ts#getToolsForModel:task,task_await,task_list,task_terminatewithwrapWithInitWait(...)(same asbashtoday).runtime.exec()/BackgroundProcessManagerbefore workspace init.ask_user_question,propose_plan,todo_*,status_setnon-runtime.3) Tool exposure + deprecation
3.1 Remove
bash_*from the actual model toolset (not just prompt)Today,
getAvailableTools()is used for system prompt + tool-instruction extraction, but the model still receives whatevergetToolsForModel()returns (unless atoolPolicydenies it).To truly eliminate overlap:
getAvailableTools(modelString, mode, options)(plus MCP tools). Best place:src/common/utils/tools/tools.ts#getToolsForModeljust before returning.getAvailableTools()(src/common/utils/tools/toolDefinitions.ts):bashbash_outputbash_background_listbash_background_terminatebash_*tools at all (or gate behind a temporary env/feature flag).3.2 Keep legacy
bash_*as wrappers (short-lived)To reduce risk, keep the tools around during rollout:
taskwithkind:"bash".”bash_output/list/terminateby delegating to the newtask_*internals.Why keep wrappers temporarily?
bash_*tool names.bash_*as bridgeable tools.4) Frontend/UI updates (tool message rendering)
Because the model will now call
task_*for bash operations, update task-tool renderers to handle the new union types:src/browser/components/tools/TaskToolCall.tsxTaskToolCall: render bash variant (script,display_name, background vs foreground)TaskAwaitResult: ifoutputpresent, render it (even whenstatus !== completed)TaskListItem: displaydisplay_name/scriptwhenagentType/titleabsentTaskStatusBadge: optionally style bash terminal states if you extend the enumsrc/browser/stories/App.task.stories.tsx+src/browser/stories/mockFactory.tstask_awaitoutput streaming.5) Tests + validation
5.1 Backend tool tests
Add/extend tests under
src/node/services/tools/:task.bash.test.ts(new):task_awaitreturns incremental output + runningtask_terminatekills ittask_listincludes bash processes for:5.2 UI rendering tests (lightweight)
If there are existing component tests, add minimal assertions for bash-variant rendering. Otherwise rely on storybook + typecheck.
5.3 Manual validation checklist
make typecheckbun test src/node/services/tools/task_await.test.tsbun test src/node/services/tools/task_list.test.ts(or add if missing)bun test src/node/services/tools/bash_output.test.ts(ensure wrappers still work if kept)6) Rollout plan
bash_*fromgetAvailableTools()so models stop seeing them.bash_*from PTC bridgeables if desired.Generated with
mux• Model:openai:gpt-5.2• Thinking:xhigh