Moebius: 0.2B image inpainting model with 10B-level performance

simonw 21 hours ago |

I got this working with ONNX (thanks, Claude Opus 4.8) and now I have an interactive demo of the model running entirely in the browser here (~1.3GB download): https://simonw.github.io/moebius-web/ - code here: https://github.com/simonw/moebius-web

(Claude Code transcript: https://gisthost.github.io/?58039ba5c1ca3ed177e8659168996ee4)

Wrote this up in more detail on my blog: https://simonwillison.net/2026/Jun/22/porting-moebius/

K0IN 20 hours ago | |

Awesome, I wanted to do the exact same thing (used gpt 5.5 + code) but it didn't get the model to work in onnx...

g58892881 14 hours ago | |

well done!

unet weights are in fp32. did you by any chance try something lower, fp16?
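For a rough sense of what is at stake with fp16, here is a minimal back-of-envelope sketch. It assumes the "0.2B" in the title is the parameter count being stored (it may not cover the VAE), and it uses Python's `struct` half-precision format only to illustrate rounding error; it is not how the ONNX export in the linked repo handles precision.

```python
import struct

def weight_bytes(num_params: int, bytes_per_param: int) -> int:
    """Raw storage needed for the weights alone (no activations, no runtime overhead)."""
    return num_params * bytes_per_param

def roundtrip_fp16(x: float) -> float:
    """Encode a value as IEEE 754 half precision and decode it again."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# Assumption: 0.2B parameters, taken from the thread title.
PARAMS = 200_000_000

fp32_mb = weight_bytes(PARAMS, 4) / 1e6  # 4 bytes per fp32 weight
fp16_mb = weight_bytes(PARAMS, 2) / 1e6  # 2 bytes per fp16 weight

# Hypothetical weight values, chosen only to show the rounding error.
samples = [0.1, -1.5, 3.14159, 1e-4]
max_rel_error = max(abs(x - roundtrip_fp16(x)) / abs(x) for x in samples)

print(f"fp32: {fp32_mb:.0f} MB, fp16: {fp16_mb:.0f} MB")
print(f"worst relative rounding error in samples: {max_rel_error:.2e}")
```

Halving the download is the easy part; whether the output quality survives depends on how sensitive individual layers are to that rounding, which this sketch does not measure.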

da_grift_shift 7 hours ago | | |

The model considered it.

There are 25 or so mentions of fp16 and fp32 weights across the 7500+ words of Markdown text it generated. So the next question might be: Did it make the right calls?

https://github.com/simonw/moebius-web/blob/main/notes.md

https://github.com/simonw/moebius-web/blob/main/plan.md

https://github.com/simonw/moebius-web/blob/main/research.md

https://github.com/simonw/moebius-web/blob/main/understandin...

lifthrasiir 1 day ago |

Tried it a bit, and while it is very impressive for a 0.2B model, it would be very hard to convince me that this matches 10B models. It did work reasonably well with natural images, but inpainted regions were visibly smoother than their surroundings, and it performed very badly on novel objects. It is also limited to 512x512 output, which limits its practical usefulness.

amelius 21 hours ago | |

Do you think the provided examples are representative of its performance, or do you think they were cherry picked?

lifthrasiir 21 hours ago | | |

Given its limited output dimension it's hard to tell. I haven't exactly tested fine-tuned variants but I think they would work well under certain situations. After all, some (possibly cherry-picked) examples still exhibit similar problems when you inspect them in detail.

xrd 23 hours ago |

I did an inpainting project for a client a few years ago. They were trying to inpaint banner ads for concert promoters, and find a way to make it easy to produce a bunch of different sized ads for a variety of placements. I was tasked with inpainting an Xmas-themed ad for a few major singers.

The weirdest thing was when the inpainting tool added strange people to an image. This singer was all decked out in tinsel and red, and the inpainting model added a grumpy old man in a top hat. I don't recall clicking the "Add creepy old man" button.

At the time this was Stable Diffusion on the backend, run by a variety of model hosting services, Amazon being one. They all had different requirements for the input image and that made things really complex. For some the aspect ratio was impossible to meet, and it would fail if the banner was 200x60. For others, you had to resize it before input, which meant you were adding an image with poor resolution to start. Garbage in, garbage out.

All of this to say, there is a lot of preproduction that went into it, and the client never ended up using my attempts.
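To make the aspect-ratio problem concrete, here is a minimal sketch of the arithmetic for fitting a banner into a square model input. The 512 target comes from the output limit mentioned elsewhere in this thread; the `letterbox` helper and its return format are hypothetical, not any hosting service's API.

```python
def letterbox(width: int, height: int, target: int = 512) -> dict:
    """Plan how to scale an image to fit a target x target square, padding the rest."""
    scale = target / max(width, height)       # scale so the longer side fills the square
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    pad_x = target - new_w                    # total horizontal padding
    pad_y = target - new_h                    # total vertical padding
    left, top = pad_x // 2, pad_y // 2        # center the image inside the square
    return {
        "scale": scale,
        "size": (new_w, new_h),
        "padding": (left, top, pad_x - left, pad_y - top),  # left, top, right, bottom
    }

plan = letterbox(200, 60)  # the 200x60 banner from the comment above
print(plan)
```

For the 200x60 banner this means upscaling by about 2.56x and filling roughly 70% of the square with padding, which is the "poor resolution to start" problem in numbers.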

james2doyle 1 day ago |

There are some demo spaces using this. This one seems the best (paint your own mask) but it failed on all the images I tried: https://huggingface.co/spaces/multimodalart/Moebius

hex4def6 21 hours ago | |

I've been playing around and got it to work, although the quality was a bit crappy. I'm still playing around with the settings that get exposed, but you're welcome to take a look: https://huggingface.co/spaces/jonatei/MoebiusDemo

Note that I'm actively messing with it, so it may break for short periods of time :)

It's also running on the free CPU, so it's like 80 seconds per image...

nickandbro 13 hours ago |

Here is a little app I made that runs entirely in your browser and lets you experiment with all of the fine-tuned models:

https://inpaintlab.com/

Zopieux 5 hours ago |

Not great. The inpainted areas are, as usual, very smooth compared to the detailed, "high frequency" look of natural photos.

Barely useful enough to erase things in thumbnails.

vunderba 3 hours ago | |

This. And these are cherry-picked examples. The one removing that high tension wire in the nature photo is especially bad. You can literally see the band where it erased it. Even the standard restore tool in Photoshop from years ago can do a comparable job.

chatmasta 19 hours ago |

What is inpainting? Everyone in the comments seems to be familiar with the term, and I don’t see it described in the linked page.

torgoguys 19 hours ago | |

Click on the visualizations to see it in action. The purple areas are areas a user highlighted to tell the system to inpaint, and when you click on the image you see the results of the inpainting. Basically the model redraws sections of an image (the purple areas) using the context of what's in the non-purple areas to decide what might look best in the purple areas. Often used for removing objects but as you can see in the examples it can do other things too.
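As a toy illustration of "redraw the masked area using the surrounding context", here is a minimal sketch that fills masked pixels with the average of their already-known neighbors, working inward from the edge of the mask. This is a deliberately naive classical fill, not what Moebius or any diffusion model does; the function name and the grayscale list-of-lists image format are assumptions made for the example.

```python
def naive_inpaint(image, mask, iterations=50):
    """Fill masked pixels from known 4-neighbors, one ring per iteration.

    image: 2D list of floats (grayscale). mask: 2D list of bools, True = repaint.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]                       # never modify the input
    known = [[not m for m in row] for row in mask]        # unmasked pixels are trusted
    for _ in range(iterations):
        updates = []
        for y in range(h):
            for x in range(w):
                if known[y][x]:
                    continue
                neighbors = ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                vals = [out[ny][nx] for ny, nx in neighbors
                        if 0 <= ny < h and 0 <= nx < w and known[ny][nx]]
                if vals:
                    updates.append((y, x, sum(vals) / len(vals)))
        if not updates:                                   # nothing left to fill
            break
        for y, x, v in updates:                           # apply the whole ring at once
            out[y][x] = v
            known[y][x] = True
    return out

image = [[10.0, 10.0, 10.0],
         [10.0, 99.0, 10.0],   # 99.0 is the "object" to remove
         [10.0, 10.0, 10.0]]
mask = [[False, False, False],
        [False, True, False],
        [False, False, False]]
result = naive_inpaint(image, mask)
```

Averaging like this can only ever produce smooth fills; a learned model is supposed to do better by synthesizing plausible texture and structure rather than blending neighbors.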

NooneAtAll3 7 hours ago | | |

> and when you click on the image

ah, bad UX

pattilupone 23 hours ago |

I want a version of this for manga (for translation). Right now I think the go-to lightweight inpainting model for anime and manga is LaMa which is several years old now and it feels like there is room for improvement.

matthewfcarlson 23 hours ago | |

I've been working on trying to outpaint an animated program for my son (Leapfrog Letter Factory if you're curious) and then upscale it. Doing so locally has been actually fairly difficult. I wonder if you could retrain or fine tune this model. They mention building an expert, I wonder if that expert could understand more about translating various characters.

delis-thumbs-7e 1 day ago |

This is the useful AI stuff. There are so many use cases this makes possible.

NooneAtAll3 1 day ago |

I don't understand. Is it available somewhere to try or is it just an ad?

owebmaster 1 day ago | |

Yeah it's great but how do I use it?

Edit: I think I found it https://huggingface.co/hustvl/Moebius

K0IN 1 day ago | | |

With this size we could have an interactive web demo.

james2doyle 1 day ago | | |

Like this? https://huggingface.co/spaces/multimodalart/Moebius

IvanK_net 22 hours ago | | |

Were you able to make it work? It never works in my case.

ErrorNoBrain 1 hour ago | | |

https://simonw.github.io/moebius-web/

choose "sample image" or upload something

then mark something with the mouse (and press 'run inpaint') and it'll work a bit and try to hide it, sorta like that "magic eraser" some newer android phones have

teroshan 1 day ago |

Unrelated but when I read inpainting and Moebius I was scared it was related and using the art of the great Jean Giraud [0] a.k.a. Moebius

https://characterdesignreferences.com/artist-of-the-week-3/m...

[0] https://en.wikipedia.org/wiki/Jean_Giraud

coldtea 1 day ago | |

Scared why?

teroshan 1 day ago | | |

Scared for the same reason I found last year's 'Ghibli filter' craze upsetting, I would have personally hated to have seen this artist's legacy used for promoting AI image generation.

TeMPOraL 1 day ago | | |

In case that happened then the rest of the world would probably appreciate the art, and a subset of it, the artist (and even a small subset of ~whole Internet-connected population is a lot of people). Some silver lining, perhaps.

teroshan 22 hours ago | | |

Perhaps.

I like the idea that a piece of art, in addition to ultimately ending up as pixels on my screen, is also a window into a world that has been dreamt up by real human imagination, driven by their hopes and fears.

Semiconductors based generation may give me the first part, but not the second.

I'm speaking for myself here, I agree with your point though.

NooneAtAll3 7 hours ago | | |

> I like the idea that a piece of art, in addition to ultimately ending up as pixels on my screen, is also a window into a world that has been dreamt up by real human imagination, driven by their hopes and fears.

I guess this actually defines the fringe between ai-art enjoyers and haters - some people prefer what art does to their imagination, while others look at what art does to others'

zmgsabst 19 hours ago | | |

You just refuse to see certain people’s hopes and fears because they didn’t express them in a way you personally find pleasing.

The LLMs didn’t prompt themselves.

solid_fuel 1 day ago | | |

> In case that happened then the rest of the world would probably appreciate the art

What art?

We’re talking about generated pictures, aka slop, not art made by a real human.

And I don’t know if you’ve been paying attention but people seem to be pretty tired of the slop. I don’t think it would be appreciated nearly as much as you think.

inigyou 1 day ago | | |

It is possible to use generative AI in nonslop ways btw

TeMPOraL 1 day ago | | |

This definition of "slop" doesn't quite cut reality at the joints.

People are tired of marketing. The AI-generated slop people are annoyed with is garbage produced for marketing reasons, and it's distinctly noticeable precisely because all the bottom-feeder marketing houses switched to using it. But it's not the AI itself that's the problem here. Slop was here before, but it was made with cheap protein-based image generators. Silicon-based generators are just cheaper.

epolanski 1 day ago |

What is the current SOTA for inpainting?

I have a potential project for my e-commerce site where I want to allow users to upload images of their house exteriors and inpaint awnings.

michaelfm1211 1 day ago |

> The core insight of Moebius can be summarized in a single equation: Synergy × (Architecture + Distillation) = Shattering the "Impossible Triangle" of Low Parameters, Fast Inference, and High Quality

Is it just me or is it weird seeing these clickbaity AI-generated taglines in an otherwise scientific work?

kevin_thibedeau 22 hours ago | |

It signals a paradigm shift in vacuous prose.

dormento 23 hours ago | |

It IS weird, but it "converts" (ugh...), that's why they keep coming.

Apart from this, the text details amazing work. Congrats.

soperj 23 hours ago | |

After "In Good Company" I can't hear (or see) the word Synergy without cringing.

Jackson__ 21 hours ago | |

Judging by the performance of the shown examples, the quality is closer to pre-2022 Photoshop content aware fill than actual 10B models.

I think it is safe to say this is pretty far from a "scientific" work.

gspr 1 day ago |

Nitpick: in the showcase on that page, under Comparison of Natural Scenes, Moebius should definitely get a "structural confusion" tag for the back of the surfboard. If other models get deducted for truncating the surfboard, then surely the elongation that Moebius does should count too.

Also, what's going on behind the in-painted corner of the house? We'd need to see higher resolution pictures, but I'm not convinced that it too shouldn't get a flag. Likewise with the beach just behind the surfboard. Not terrible, but what gets flagged in the competitors is similar.

N_Lens 1 day ago |

The gallery of their samples is pretty impressive!

GL26 1 day ago |

Could this run locally on a smartphone?

rasz 1 day ago |

It sure has a thing for chins, jaws and removing weight, looksmaxing built in.

hari1123 1 day ago |

A lot of the photo editors on mobile have this, maybe even some apps?

zb3 1 day ago |

1) What are RAM requirements?

2) If these are reasonable, a WebGPU demo would be great..

lifthrasiir 1 day ago | |

The total model size is about 1.2GB (UNet + SDXL VAE included), so probably ~3GB?