| 1 | <div align="center">
|
| 2 | <img src="assets/mini-swe-agent-banner.svg" alt="mini-swe-agent banner" style="height: 7em"/>
|
| 3 |
|
| 4 | <h1 style="margin-bottom: 1ex;">The 100 line AI agent that's actually useful</h1>
|
| 5 |
|
| 6 | </div>
|
| 7 |
|
| 8 | <div align="center">
|
| 9 |
|
| 10 | <a href="https://join.slack.com/t/swe-bench/shared_invite/zt-36pj9bu5s-o3_yXPZbaH2wVnxnss1EkQ">
|
| 11 | <img src="https://img.shields.io/badge/Slack-4A154B?style=for-the-badge&logo=slack&logoColor=white" alt="Slack">
|
| 12 | </a>
|
| 13 | <a href="https://github.com/SWE-agent/mini-swe-agent">
|
| 14 | <img alt="GitHub Release" src="https://img.shields.io/github/v/release/swe-agent/mini-swe-agent?style=for-the-badge&logo=github&label=GitHub&labelColor=black&color=green" alt="GitHub Release">
|
| 15 | </a>
|
| 16 | <a href="https://pypi.org/project/mini-swe-agent/">
|
| 17 | <img src="https://img.shields.io/pypi/v/mini-swe-agent?style=for-the-badge&logo=python&logoColor=white&labelColor=black&color=deeppink" alt="PyPI - Version">
|
| 18 | </a>
|
| 19 |
|
| 20 | </div>
|
| 21 |
|
| 22 | !!! warning "This is mini-swe-agent v2"
|
| 23 |
|
| 24 | Read the [migration guide](https://mini-swe-agent.com/latest/advanced/v2_migration/). For the previous version, check out the [v1 documentation](https://mini-swe-agent.com/v1/) or the [v1 branch](https://github.com/SWE-agent/mini-swe-agent/tree/v1).
|
| 25 |
|
| 26 | In 2024, [SWE-bench](https://swebench.com) & [SWE-agent](https://swe-agent.com) helped kickstart the coding agent revolution.
|
| 27 |
|
| 28 | We now ask: **What if the agent was 100x smaller, and still worked nearly as well?**
|
| 29 |
|
| 30 | The `mini` agent is for
|
| 31 |
|
| 32 | - **Researchers** who want to **[benchmark](https://swe-bench.com), [fine-tune](https://swesmith.com/) or RL** without assumptions, bloat, or surprises
|
| 33 | - **Developers** who like to **own, understand, and modify** their tools
|
| 34 | - **Engineers** who want something **trivial to sandbox & to deploy anywhere**
|
| 35 |
|
| 36 | Here's some details:
|
| 37 |
|
| 38 | - **Minimal**: Just [100 lines of python](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/agents/default.py) (+100 total for [env](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/environments/local.py),
|
| 39 | [model](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/models/litellm_model.py), [script](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/run/hello_world.py)) — no fancy dependencies!
|
| 40 | - **Performant:** Scores >74% on the [SWE-bench verified benchmark](https://www.swebench.com/) benchmark; starts faster than Claude Code
|
| 41 | - **Deployable:** In addition to local envs, you can use **docker**, **podman**, **singularity**, **apptainer**, and more
|
| 42 | - **Cutting edge:** Built by the Princeton & Stanford team behind [SWE-bench](https://swebench.com) and [SWE-agent](https://swe-agent.com).
|
| 43 | - **Widely adopted:** In use by Meta, NVIDIA, Essential AI, Anyscale, and others
|
| 44 | - **Tested:** [](https://codecov.io/gh/SWE-agent/mini-swe-agent)
|
| 45 |
|
| 46 | ??? note "Why use mini-SWE-agent for research?"
|
| 47 |
|
| 48 | [SWE-agent](https://swe-agent.com/latest/) jump-started the development of AI agents in 2024. Back then, we placed a lot of emphasis on tools and special interfaces for the agent. However, one year later, a lot of this is not needed at all to build a useful agent!
|
| 49 |
|
| 50 | In fact, the `mini` agent:
|
| 51 |
|
| 52 | - **Does not have any tools other than bash** — it doesn't even use the tool-calling interface of the LMs.
|
| 53 | This means that you can run it with literally any model.
|
| 54 | When running in sandboxed environments you also don't need to take care of installing a single package — all it needs is bash.
|
| 55 | - **Has a completely linear history** — every step of the agent just appends to the messages and that's it.
|
| 56 | So there's no difference between the trajectory and the messages that you pass on to the LM.
|
| 57 | Great for debugging & fine-tuning.
|
| 58 | - **Executes actions with `subprocess.run`** — every action is completely independent (as opposed to keeping a stateful shell session running). This makes it trivial to execute the actions in sandboxes (literally just switch out `subprocess.run` with `docker exec`) and to scale up effortlessly.
|
| 59 | Seriously, this is [a big deal](faq.md#why-no-shell-session), trust me.
|
| 60 |
|
| 61 | This makes it perfect as a baseline system and for a system that puts the language model (rather than the agent scaffold) in the middle of our attention.
|
| 62 | You can see the result on the [SWE-bench (bash only)](https://www.swebench.com/) leaderboard, that evaluates the performance of different LMs with `mini`.
|
| 63 |
|
| 64 | ??? note "Why use mini-SWE-agent as a tool?"
|
| 65 |
|
| 66 | Some agents are overfitted research artifacts. Others are UI-heavy frontend monsters.
|
| 67 |
|
| 68 | The `mini` agent wants to be a hackable tool, not a black box.
|
| 69 |
|
| 70 | - **Simple** enough to understand at a glance
|
| 71 | - **Convenient** enough to use in daily workflows
|
| 72 | - **Flexible** to extend
|
| 73 |
|
| 74 | Unlike other agents (including our own [swe-agent](https://swe-agent.com/latest/)), it is radically simpler, because it:
|
| 75 |
|
| 76 | - **Does not have any tools other than bash** — it doesn't even use the tool-calling interface of the LMs.
|
| 77 | Instead of implementing custom tools for every specific thing the agent might want to do, the focus is fully on the LM utilizing the shell to its full potential.
|
| 78 | Want it to do something specific like opening a PR?
|
| 79 | Just tell the LM to figure it out rather than spending time to implement it in the agent.
|
| 80 | - **Executes actions with `subprocess.run`** — every action is completely independent (as opposed to keeping a stateful shell session running).
|
| 81 | This is [a big deal](https://mini-swe-agent.com/latest/faq/#why-no-shell-session) for the stability of the agent, trust me.
|
| 82 | - **Has a completely linear history** — every step of the agent just appends to the messages that are passed to the LM in the next step and that's it.
|
| 83 | This is great for debugging and understanding what the LM is prompted with.
|
| 84 |
|
| 85 | ??? note "Should I use mini-SWE-agent or swe-agent?"
|
| 86 |
|
| 87 | You should use `mini-swe-agent` if
|
| 88 |
|
| 89 | - You want a quick command line tool that works locally
|
| 90 | - You want an agent with a very simple control flow
|
| 91 | - You want even faster, simpler & more stable sandboxing & benchmark evaluations
|
| 92 | - You are doing FT or RL and don't want to overfit to a specific agent scaffold
|
| 93 |
|
| 94 | You should use `swe-agent` if
|
| 95 |
|
| 96 | - You need specific tools or want to experiment with different tools
|
| 97 | - You want to experiment with different history processors
|
| 98 | - You want very powerful yaml configuration without touching code
|
| 99 |
|
| 100 | What you get with both
|
| 101 |
|
| 102 | - Excellent performance on SWE-Bench
|
| 103 | - A trajectory browser
|
| 104 |
|
| 105 | </details>
|
| 106 | <table>
|
| 107 | <tr>
|
| 108 | <td width="50%">
|
| 109 | <a href="usage/mini"><strong>CLI</strong></a> (<code>mini</code>)
|
| 110 | </td>
|
| 111 | <td>
|
| 112 | <a href="usage/swebench/"><strong>Batch inference</strong></a>
|
| 113 | </td>
|
| 114 | </tr>
|
| 115 | <tr>
|
| 116 | <td width="50%">
|
| 117 | <div class="gif-container" data-glightbox-disabled>
|
| 118 | <img src="https://github.com/SWE-agent/swe-agent-media/blob/main/media/mini/png/mini.png?raw=true"
|
| 119 | data-gif="https://github.com/SWE-agent/swe-agent-media/blob/main/media/mini/gif/mini.gif?raw=true"
|
| 120 | alt="mini" data-glightbox="false" />
|
| 121 | </div>
|
| 122 | </td>
|
| 123 | <td>
|
| 124 | <div class="gif-container" data-glightbox-disabled>
|
| 125 | <img src="https://github.com/SWE-agent/swe-agent-media/blob/main/media/mini/png/swebench.png?raw=true"
|
| 126 | data-gif="https://github.com/SWE-agent/swe-agent-media/blob/main/media/mini/gif/swebench.gif?raw=true"
|
| 127 | alt="swebench" data-glightbox="false" />
|
| 128 | </div>
|
| 129 | </td>
|
| 130 | </tr>
|
| 131 | <tr>
|
| 132 | <td>
|
| 133 | <a href="usage/inspector/"><strong>Trajectory browser</strong></a>
|
| 134 | </td>
|
| 135 | <td>
|
| 136 | <a href="advanced/cookbook/"><strong>Python bindings</strong></a>
|
| 137 | </td>
|
| 138 | </tr>
|
| 139 | <tr>
|
| 140 | <td>
|
| 141 | <div class="gif-container" data-glightbox-disabled>
|
| 142 | <img src="https://github.com/SWE-agent/swe-agent-media/blob/main/media/mini/png/inspector.png?raw=true"
|
| 143 | data-gif="https://github.com/SWE-agent/swe-agent-media/blob/main/media/mini/gif/inspector.gif?raw=true"
|
| 144 | alt="inspector" data-glightbox="false" />
|
| 145 | </div>
|
| 146 | </td>
|
| 147 | <td>
|
| 148 | <pre><code class="language-python">agent = DefaultAgent(
|
| 149 | LitellmModel(model_name=...),
|
| 150 | LocalEnvironment(),
|
| 151 | )
|
| 152 | agent.run("Write a sudoku game")</code></pre>
|
| 153 | </td>
|
| 154 | </tr>
|
| 155 | </table>
|
| 156 |
|
| 157 |
|
| 158 | !!! info "Upgrading to v2?"
|
| 159 |
|
| 160 | Check out our [v2 migration guide](advanced/v2_migration.md) for all the changes and how to update your code.
|
| 161 |
|
| 162 | ## Continue reading:
|
| 163 |
|
| 164 | <div class="grid cards">
|
| 165 | <a href="quickstart/" class="nav-card-link">
|
| 166 | <div class="nav-card">
|
| 167 | <div class="nav-card-header">
|
| 168 | <span class="material-icons nav-card-icon">launch</span>
|
| 169 | <span class="nav-card-title">Installation & Quick Start</span>
|
| 170 | </div>
|
| 171 | <p class="nav-card-description">Get started with mini-SWE-agent</p>
|
| 172 | </div>
|
| 173 | </a>
|
| 174 |
|
| 175 | <a href="usage/mini/" class="nav-card-link">
|
| 176 | <div class="nav-card">
|
| 177 | <div class="nav-card-header">
|
| 178 | <span class="material-icons nav-card-icon">flash_on</span>
|
| 179 | <span class="nav-card-title">Usage: Simple UI</span>
|
| 180 | </div>
|
| 181 | <p class="nav-card-description">Learn to use the <code>mini</code> command</p>
|
| 182 | </div>
|
| 183 | </a>
|
| 184 |
|
| 185 | <a href="faq/" class="nav-card-link">
|
| 186 | <div class="nav-card">
|
| 187 | <div class="nav-card-header">
|
| 188 | <span class="material-icons nav-card-icon">help</span>
|
| 189 | <span class="nav-card-title">FAQ</span>
|
| 190 | </div>
|
| 191 | <p class="nav-card-description">Common questions and answers</p>
|
| 192 | </div>
|
| 193 | </a>
|
| 194 |
|
| 195 | <a href="advanced/yaml_configuration/" class="nav-card-link">
|
| 196 | <div class="nav-card">
|
| 197 | <div class="nav-card-header">
|
| 198 | <span class="material-icons nav-card-icon">settings</span>
|
| 199 | <span class="nav-card-title">Configuration</span>
|
| 200 | </div>
|
| 201 | <p class="nav-card-description">Setup and customize your agent</p>
|
| 202 | </div>
|
| 203 | </a>
|
| 204 |
|
| 205 | <a href="advanced/cookbook/" class="nav-card-link">
|
| 206 | <div class="nav-card">
|
| 207 | <div class="nav-card-header">
|
| 208 | <span class="material-icons nav-card-icon">fitness_center</span>
|
| 209 | <span class="nav-card-title">Power up</span>
|
| 210 | </div>
|
| 211 | <p class="nav-card-description">Start hacking the agent!</p>
|
| 212 | </div>
|
| 213 | </a>
|
| 214 | </div>
|
| 215 |
|
| 216 | ## 📣 News
|
| 217 |
|
| 218 | * [New tutorial on building minimal AI agents](https://minimal-agent.com/)
|
| 219 | * Nov 19: [Gemini 3 Pro reaches 74% on SWE-bench verified with mini-swe-agent!](https://x.com/KLieret/status/1991164693839270372)
|
| 220 | * Aug 19: [New blogpost: Randomly switching between GPT-5 and Sonnet 4 boosts performance](https://www.swebench.com/SWE-bench/blog/2025/08/19/mini-roulette/)
|
| 221 |
|
| 222 | ## 📣 New features
|
| 223 |
|
| 224 | Please check the [github release notes](https://github.com/SWE-agent/mini-swe-agent/releases) for the latest updates.
|
| 225 |
|
| 226 | ## 📣 Documentation updates
|
| 227 |
|
| 228 | * Jul 27: More notes on [local models](models/local_models.md)
|
| 229 |
|
| 230 | {% include-markdown "_footer.md" %}
|
| 231 |
|