docx 458kb raw 217kb gzipped
pptx 574kb raw 253kb gzipped
xlsx 601kb raw 269kb gzipped
I expected the Wasm bundles to be a lot bigger than that, for some reason.
ChatGPT.com could benefit from using this library (or one like it) to render a preview of the file in a side panel on the right, instead of just giving me a download link to the generated/transformed docx/pptx/xlsx file.
It's 100% hallucinated.
So the tool is growing, and maybe this would be interesting to have as the non-LibreOffice-dependent viewer...
For PPTX and DOCX, this solution is slightly worse than LibreOffice conversion (this does not appear to output highlightable text, while PDF conversion does).
However, the XLSX preview BLEW my mind considering this was AI-coded. Really good, even interactive!
Yeah, it does.
https://ooxml.silurus.dev/storybook/?path=/story/docxviewer-...
I'm not familiar with this application, so perhaps I'm missing a step, or an editing mode.
Does this work in Cloudflare’s workerd environment? Would be nice to have a cheap serverless render -> LLM (GLM-OCR / PaddleOCR) -> Markdown pipeline for the various MS Office formats.
The slightest misalignment of a paragraph means a line on page 27 of 120 has now moved down by 2 pixels, throwing everything after it out of alignment. Yes, plenty of companies pay for Microsoft 365 subscriptions for exactly this reason; it sounds ludicrous when you think they could just pay someone to replicate the formatting in a different suite for a lot less than the subscription costs, but that's not how it works...
If Microsoft can’t get consistent rendering of Word docs between Word for Windows, Word for macOS and Office 365, I don’t like anyone else’s chances.
Bit-identical/pixel-faithful reproductions are easy to verify…
"oh yeah? Show me what you made, you can't, nobody can, it's all just AI psychosis"
"I made a pixel perfect Office document viewer"
"well... I wish you hadn't"
The best developers are lazy.
Still, it looks pretty; if it actually has proper testing, it could close the gap. Code not being the hard part is a major impediment to good software coming out of these things.
Holy cow!!