What I (and many others, it seems) want is a way to truly put LLMs to work: to outsource higher volumes of work without explicit instruction or live oversight.
I think this is a fascinating problem, and in many ways how I hope to use gptme, but it's not really there yet.
Ideas:
- ask it to try implementing a feature/fix/refactor X, then typecheck/test/lint the result, and make a PR if successful
  - if unsuccessful, attempt up to n retry strategies, possibly in a branching manner where we try to detect the best branch, since any step could go wrong and leave it in an unusable state
  - started working on this in refactor: work on programmatic interface, self-reviewing agent #199
  - similar goals to the SWE-bench work: feat: started working on SWE-bench evals #142
- various automation workflows, putting it in a cron job, etc.
- return to gptme's roots and give it your documents/data/ActivityWatch history to inspect and review, condensing it into a daily report or similar
- browsing the web for content, giving the user recommendations
- write docs for how to do all these things, and give those docs to gptme so you can ask it to write scripts that do it
  - minimal docs added to https://gptme.org/docs/automation.html in docs: added minimal automation docs #144
- more subagent examples/evals
- feedback loops, like: make edit to page -> view web page -> make edit
  - used this for generating https://github.com/ErikBjare/gptme/blob/master/site/style.css with much success
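The try/check/retry idea above could be sketched roughly as follows. Here `run_attempt` is a hypothetical callback wrapping one full implement -> typecheck/test/lint cycle; it is an assumption for illustration, not gptme's actual API.

```python
from typing import Callable

def attempt_with_retries(
    task: str,
    run_attempt: Callable[[str, int], bool],
    max_retries: int = 3,
) -> bool:
    """Try a task up to max_retries times; return True once an attempt
    passes all checks (i.e. the branch is ready to be turned into a PR)."""
    for attempt in range(1, max_retries + 1):
        # Each attempt would ideally run on a fresh branch, so a broken
        # state from one try never leaks into the next.
        if run_attempt(task, attempt):
            return True
    return False
```

In practice each attempt would start from a clean git branch, and failures would feed back into the next attempt's prompt.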
What we should focus on:
- Making gptme work on gptme while I sleep
What we'd need:
- Tool customization
  - It is likely that performance would improve by limiting each step to a restricted set of tools (would at least save on tokens)
  - Implemented the --tools option in 48d559b
- Reliable message passing between gptme runs/agents
  - The current subagent mechanism of asking it to output JSON isn't the most reliable.
  - Maybe making it a prompt chain would fix it? (done in 0dd6583)
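As a rough illustration of the per-step tool restriction point: only the `--tools` flag itself comes from this issue (48d559b); the step names and tool sets below are made-up examples, not a real gptme configuration.

```python
# Illustrative only: step names and tool lists are assumptions.
# Only the --tools flag itself is from the issue (48d559b).
STEP_TOOLS = {
    "implement": ["save", "patch", "shell"],
    "review": ["read", "shell"],
}

def tools_arg(step: str) -> list[str]:
    # Build a gptme invocation restricted to the step's toolset,
    # so each step only spends tokens on the tools it needs.
    return ["gptme", "--tools", ",".join(STEP_TOOLS[step])]
```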
Issues:
- How do we avoid burning lots of non-value-generating tokens while I sleep?
  - Make it reliable, do an intelligent tree-search of branches, and limit retries
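One way to sketch the "intelligent tree-search of branches, limit retries" point is a best-first search over branch states with a hard attempt budget. `expand` and `score` are hypothetical callbacks (e.g. expanding a branch into candidate fixes, and scoring it by how many checks pass); nothing here is gptme's actual API.

```python
import heapq
from itertools import count

def best_branch_search(root, expand, score, budget=10):
    """Best-first search: always expand the highest-scoring branch first,
    within a fixed attempt budget so failing branches can't burn tokens
    indefinitely. Returns the best state found."""
    tiebreak = count()  # keeps heap entries comparable when scores tie
    frontier = [(-score(root), next(tiebreak), root)]
    best, best_score = root, score(root)
    attempts = 0
    while frontier and attempts < budget:
        _, _, state = heapq.heappop(frontier)
        for child in expand(state):
            attempts += 1
            s = score(child)
            if s > best_score:
                best, best_score = child, s
            heapq.heappush(frontier, (-s, next(tiebreak), child))
            if attempts >= budget:
                break
    return best
```

The budget is the key part for the token question: a bounded number of expansions gives a hard ceiling on spend, while the best-first ordering spends those attempts on the most promising branches.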