| 1 | <div align="center">
|
| 2 | <a href="https://mini-swe-agent.com/latest/"><img src="https://github.com/SWE-agent/mini-swe-agent/raw/main/docs/assets/mini-swe-agent-banner.svg" alt="mini-swe-agent banner" style="height: 7em"/></a>
|
| 3 | </div>
|
| 4 |
|
| 5 | # The minimal AI software engineering agent
|
| 6 |
|
| 7 | š£ [New tutorial on building minimal AI agents](https://minimal-agent.com/)<br/>
|
| 8 | š£ [Gemini 3 Pro reaches 74% on SWE-bench verified with mini-swe-agent!](https://x.com/KLieret/status/1991164693839270372)<br/>
|
| 9 | š£ [New blogpost: Randomly switching between GPT-5 and Sonnet 4 boosts performance](https://www.swebench.com/SWE-bench/blog/2025/08/19/mini-roulette/)
|
| 10 |
|
| 11 | [](https://mini-swe-agent.com/latest/)
|
| 12 | [](https://join.slack.com/t/swe-bench/shared_invite/zt-36pj9bu5s-o3_yXPZbaH2wVnxnss1EkQ)
|
| 13 | [](https://pypi.org/project/mini-swe-agent/)
|
| 14 |
|
| 15 | > [!WARNING]
|
| 16 | > This is **mini-swe-agent v2**. Read the [migration guide](https://mini-swe-agent.com/latest/advanced/v2_migration/). For the previous version, check out the [v1 branch](https://github.com/SWE-agent/mini-swe-agent/tree/v1).
|
| 17 |
|
| 18 | In 2024, we built [SWE-bench](https://github.com/swe-bench/SWE-bench) & [SWE-agent](https://github.com/swe-agent/swe-agent) and helped kickstart the coding agent revolution.
|
| 19 |
|
| 20 | We now ask: **What if our agent was 100x smaller, and still worked nearly as well?**
|
| 21 |
|
| 22 | The `mini` agent is for
|
| 23 |
|
| 24 | - **Researchers** who want to **[benchmark](https://swe-bench.com), [fine-tune](https://swesmith.com/) or RL** without assumptions, bloat, or surprises
|
| 25 | - **Developers** who like to **own, understand, and modify** their tools
|
| 26 | - **Engineers** who want something **trivial to sandbox & to deploy anywhere**
|
| 27 |
|
| 28 | Here's some details:
|
| 29 |
|
| 30 | - **Minimal**: Just some 100 lines of python for the [agent class](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/agents/default.py) (and a bit more for the [environment](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/environments/local.py),
|
| 31 | [model](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/models/litellm_model.py), and [run script](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/run/hello_world.py)) ā no fancy dependencies!
|
| 32 | - **Performant:** Scores >74% on the [SWE-bench verified benchmark](https://www.swebench.com/) benchmark; starts much faster than Claude Code
|
| 33 | - **Deployable:** In addition to local envs, you can use **docker**, **podman**, **singularity**, **apptainer**, and more
|
| 34 | - Built by the Princeton & Stanford team behind [SWE-bench](https://swebench.com), [SWE-agent](https://swe-agent.com), and more (see below)
|
| 35 | - **Widely adopted:** In use by Meta, NVIDIA, Essential AI, Anyscale, and others
|
| 36 | - **Tested:** [](https://codecov.io/gh/SWE-agent/mini-swe-agent)
|
| 37 |
|
| 38 | <details>
|
| 39 |
|
| 40 | <summary>More motivation (for research)</summary>
|
| 41 |
|
| 42 | [SWE-agent](https://swe-agent.com/latest/) jump-started the development of AI agents in 2024. Back then, we placed a lot of emphasis on tools and special interfaces for the agent.
|
| 43 | However, one year later, as LMs have become more capable, a lot of this is not needed at all to build a useful agent!
|
| 44 | In fact, the `mini` agent
|
| 45 |
|
| 46 | - **Does not have any tools other than bash** ā it doesn't even need to use the tool-calling interface of the LMs.
|
| 47 | This means that you can run it with literally any model. When running in sandboxed environments you also don't need to take care
|
| 48 | of installing a single package ā all it needs is bash.
|
| 49 | - **Has a completely linear history** ā every step of the agent just appends to the messages and that's it.
|
| 50 | So there's no difference between the trajectory and the messages that you pass on to the LM.
|
| 51 | Great for debugging & fine-tuning.
|
| 52 | - **Executes actions with `subprocess.run`** ā every action is completely independent (as opposed to keeping a stateful shell session running).
|
| 53 | This makes it trivial to execute the actions in sandboxes (literally just switch out `subprocess.run` with `docker exec`) and to
|
| 54 | scale up effortlessly. Seriously, this is [a big deal](https://mini-swe-agent.com/latest/faq/#why-no-shell-session), trust me.
|
| 55 |
|
| 56 | This makes it perfect as a baseline system and for a system that puts the language model (rather than
|
| 57 | the agent scaffold) in the middle of our attention.
|
| 58 | You can see the result on the [SWE-bench (bash only)](https://www.swebench.com/) leaderboard, that evaluates the performance of different LMs with `mini`.
|
| 59 |
|
| 60 | </details>
|
| 61 |
|
| 62 | <details>
|
| 63 | <summary>More motivation (as a tool)</summary>
|
| 64 |
|
| 65 | Some agents are overfitted research artifacts. Others are UI-heavy frontend monsters.
|
| 66 |
|
| 67 | The `mini` agent wants to be a hackable tool, not a black box.
|
| 68 |
|
| 69 | - **Simple** enough to understand at a glance
|
| 70 | - **Convenient** enough to use in daily workflows
|
| 71 | - **Flexible** to extend
|
| 72 |
|
| 73 | Unlike other agents (including our own [swe-agent](https://swe-agent.com/latest/)), it is radically simpler, because it:
|
| 74 |
|
| 75 | - **Does not have any tools other than bash** ā it doesn't even need to use the tool-calling interface of the LMs.
|
| 76 | Instead of implementing custom tools for every specific thing the agent might want to do, the focus is fully on the LM utilizing the shell to its full potential.
|
| 77 | Want it to do something specific like opening a PR?
|
| 78 | Just tell the LM to figure it out rather than spending time to implement it in the agent.
|
| 79 | - **Executes actions with `subprocess.run`** ā every action is completely independent (as opposed to keeping a stateful shell session running).
|
| 80 | This is [a big deal](https://mini-swe-agent.com/latest/faq/#why-no-shell-session) for the stability of the agent, trust me.
|
| 81 | - **Has a completely linear history** ā every step of the agent just appends to the messages that are passed to the LM in the next step and that's it.
|
| 82 | This is great for debugging and understanding what the LM is prompted with.
|
| 83 |
|
| 84 | </details>
|
| 85 |
|
| 86 | <details>
|
| 87 | <summary>Should I use SWE-agent or mini-SWE-agent?</summary>
|
| 88 |
|
| 89 | You should use `mini-swe-agent` if
|
| 90 |
|
| 91 | - You want a quick command line tool that works locally
|
| 92 | - You want an agent with a very simple control flow
|
| 93 | - You want even faster, simpler & more stable sandboxing & benchmark evaluations
|
| 94 | - You are doing FT or RL and don't want to overfit to a specific agent scaffold
|
| 95 |
|
| 96 | You should use `swe-agent` if
|
| 97 |
|
| 98 | - You need specific tools or want to experiment with different tools
|
| 99 | - You want to experiment with different history processors
|
| 100 | - You want very powerful yaml configuration without touching code
|
| 101 |
|
| 102 | What you get with both
|
| 103 |
|
| 104 | - Excellent performance on SWE-Bench
|
| 105 | - A trajectory browser
|
| 106 |
|
| 107 | </details>
|
| 108 |
|
| 109 | <table>
|
| 110 | <tr>
|
| 111 | <td width="50%">
|
| 112 | <a href="https://mini-swe-agent.com/latest/usage/mini/"><strong>CLI</strong></a> (<code>mini</code>)
|
| 113 | </td>
|
| 114 | <td>
|
| 115 | <a href="https://mini-swe-agent.com/latest/usage/swebench/"><strong>Batch inference</strong></a>
|
| 116 | </td>
|
| 117 | </tr>
|
| 118 | <tr>
|
| 119 | <td width="50%">
|
| 120 |
|
| 121 | 
|
| 122 |
|
| 123 | </td>
|
| 124 | <td>
|
| 125 |
|
| 126 | 
|
| 127 |
|
| 128 | </td>
|
| 129 | </tr>
|
| 130 | <tr>
|
| 131 | <td>
|
| 132 | <a href="https://mini-swe-agent.com/latest/usage/inspector/"><strong>Trajectory browser</strong></a>
|
| 133 | </td>
|
| 134 | <td>
|
| 135 | <a href="https://mini-swe-agent.com/latest/advanced/cookbook/"><strong>Python bindings</strong></a>
|
| 136 | </td>
|
| 137 | </tr>
|
| 138 | <tr>
|
| 139 | <td>
|
| 140 |
|
| 141 | 
|
| 142 |
|
| 143 | </td>
|
| 144 | <td>
|
| 145 |
|
| 146 | ```python
|
| 147 | agent = DefaultAgent(
|
| 148 | LitellmModel(model_name=...),
|
| 149 | LocalEnvironment(),
|
| 150 | )
|
| 151 | agent.run("Write a sudoku game")
|
| 152 | ```
|
| 153 |
|
| 154 | </td>
|
| 155 | </tr>
|
| 156 | </table>
|
| 157 |
|
| 158 | ## Let's get started!
|
| 159 |
|
| 160 | **Option 1:** If you just want to try out the CLI (package installed in anonymous virtual environment)
|
| 161 |
|
| 162 | ```bash
|
| 163 | pip install uv && uvx mini-swe-agent
|
| 164 | # or
|
| 165 | pip install pipx && pipx ensurepath && pipx run mini-swe-agent
|
| 166 | ```
|
| 167 |
|
| 168 | **Option 2:** Install CLI & python bindings in current environment
|
| 169 |
|
| 170 | ```bash
|
| 171 | pip install mini-swe-agent
|
| 172 | mini # run the CLI
|
| 173 | ```
|
| 174 |
|
| 175 | **Option 3:** Install from source (developer setup)
|
| 176 |
|
| 177 | ```bash
|
| 178 | git clone https://github.com/SWE-agent/mini-swe-agent.git
|
| 179 | cd mini-swe-agent && pip install -e .
|
| 180 | mini # run the CLI
|
| 181 | ```
|
| 182 |
|
| 183 | Read more in our [documentation](https://mini-swe-agent.com/latest/):
|
| 184 |
|
| 185 | * [Quick start guide](https://mini-swe-agent.com/latest/quickstart/)
|
| 186 | * [Using the `mini` CLI](https://mini-swe-agent.com/latest/usage/mini/)
|
| 187 | * [Global configuration](https://mini-swe-agent.com/latest/advanced/global_configuration/)
|
| 188 | * [Yaml configuration files](https://mini-swe-agent.com/latest/advanced/yaml_configuration/)
|
| 189 | * [Power up with the cookbook](https://mini-swe-agent.com/latest/advanced/cookbook/)
|
| 190 | * [FAQ](https://mini-swe-agent.com/latest/faq/)
|
| 191 | * [Contribute!](https://mini-swe-agent.com/latest/contributing/)
|
| 192 |
|
| 193 | ## Attribution
|
| 194 |
|
| 195 | If you found this work helpful, please consider citing the [SWE-agent paper](https://arxiv.org/abs/2405.15793) in your work:
|
| 196 |
|
| 197 | ```bibtex
|
| 198 | @inproceedings{yang2024sweagent,
|
| 199 | title={{SWE}-agent: Agent-Computer Interfaces Enable Automated Software Engineering},
|
| 200 | author={John Yang and Carlos E Jimenez and Alexander Wettig and Kilian Lieret and Shunyu Yao and Karthik R Narasimhan and Ofir Press},
|
| 201 | booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
|
| 202 | year={2024},
|
| 203 | url={https://arxiv.org/abs/2405.15793}
|
| 204 | }
|
| 205 | ```
|
| 206 |
|
| 207 | Our other projects:
|
| 208 |
|
| 209 | <div align="center">
|
| 210 | <a href="https://github.com/SWE-agent/SWE-agent"><img src="https://raw.githubusercontent.com/SWE-agent/swe-agent-media/refs/heads/main/media/logos_banners/sweagent_logo_text_below.svg" alt="SWE-agent" height="120px"></a>
|
| 211 |
|
| 212 | <a href="https://github.com/SWE-agent/SWE-ReX"><img src="https://raw.githubusercontent.com/SWE-agent/swe-agent-media/refs/heads/main/media/logos_banners/swerex_logo_text_below.svg" alt="SWE-ReX" height="120px"></a>
|
| 213 |
|
| 214 | <a href="https://github.com/SWE-bench/SWE-bench"><img src="https://raw.githubusercontent.com/SWE-agent/swe-agent-media/refs/heads/main/media/logos_banners/swebench_logo_text_below.svg" alt="SWE-bench" height="120px"></a>
|
| 215 |
|
| 216 | <a href="https://github.com/SWE-bench/SWE-smith"><img src="https://raw.githubusercontent.com/SWE-agent/swe-agent-media/refs/heads/main/media/logos_banners/swesmith_logo_text_below.svg" alt="SWE-smith" height="120px"></a>
|
| 217 |
|
| 218 | <a href="https://github.com/codeclash-ai/codeclash"><img src="https://raw.githubusercontent.com/SWE-agent/swe-agent-media/refs/heads/main/media/logos_banners/codeclash_logo_text_below.svg" alt="CodeClash" height="120px"></a>
|
| 219 |
|
| 220 | <a href="https://github.com/SWE-bench/sb-cli"><img src="https://raw.githubusercontent.com/SWE-agent/swe-agent-media/refs/heads/main/media/logos_banners/sbcli_logo_text_below.svg" alt="sb-cli" height="120px"></a>
|
| 221 | </div>
|
| 222 |
|