
29 points by jallenjia 1 year ago

noisy_boy 1 year ago |

From the FAQ:

> Can I use Kokoro TTS offline?

> Kokoro TTS is a cloud-based service that requires an internet connection to access our advanced text to speech technology. This ensures you always have access to the latest improvements and don't need to worry about local hardware requirements or model installations.

I would happily take on the worry of running it offline myself rather than have them worry about my worries.

mvdtnz 1 year ago | |

What's the point of promoting a model as "light weight" or even mentioning the parameter count if I can't run it locally? I don't give a toss how much pressure your remote hardware is under, and promoting a cloud service as small and lightweight only makes me think it's going to be cheap and crappy.

ipsum2 1 year ago |

This looks like a fake website. The creator of the website is claiming credit for the model, which does not appear to have been created by him. The original model can be found here, along with the source code: https://huggingface.co/hexgrad/Kokoro-82M

Every popular machine learning paper has a fake website associated with it, for some reason. Can anyone figure out why? Another example: someone created the website https://imagen3.org, which is NOT Imagen3 by Google. However, it currently ranks #2 for the model name.

padolsey 1 year ago | |

This seems to be a general pattern emerging. Cynical opportunists are wrapping HF endpoints/embeds in dodgy SaaS offerings. A similar one is BetterDictation, which tbf I do use. But I still hate that people are profiting off open-spirited ML engineers and HF's goodwill.

Notice in this case that each testimonial avatar links to an image asset with a different name than the purported person's name. Notice additionally the user in the thread who's pushing this 'product'; their post history makes it obvious they're an LLM slopBot...
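
A minimal sketch of that avatar check, assuming you have already collected (displayed name, avatar URL) pairs from the page by hand. The function names and the sample data are hypothetical and are not taken from the actual site; a mismatch is only a hint, since generic filenames such as `avatar1.jpg` would be flagged too.

```python
import os
import re
from urllib.parse import urlparse


def name_tokens(text):
    """Lowercase alphabetic tokens of length >= 3, e.g. 'Sarah Chen' -> {'sarah', 'chen'}."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if len(t) >= 3}


def avatar_mismatches(testimonials):
    """Return (name, filename) pairs whose avatar filename shares no token with the displayed name."""
    flagged = []
    for name, avatar_url in testimonials:
        # Take only the last path segment of the URL and drop its extension.
        filename = os.path.basename(urlparse(avatar_url).path)
        stem = os.path.splitext(filename)[0]
        if not name_tokens(name) & name_tokens(stem):
            flagged.append((name, filename))
    return flagged


# Hypothetical sample data for illustration only.
sample = [
    ("Sarah Chen", "https://example.com/avatars/michael-brown.jpg"),
    ("David Lee", "https://example.com/avatars/david-lee.png"),
]
print(avatar_mismatches(sample))
```

With the sample above, only the first entry is flagged, because `michael-brown` shares no token with "Sarah Chen".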

dudus 1 year ago | |

You can buy SaaS kits that include a frontend with pricing pages, backend and all code necessary to wrap any API and resell at a profit.

atoav 1 year ago | |

Why? Some people are so convinced they won't make it if they follow the rules and ethical principles that they try to do without them.

qwertox 1 year ago |

> You can find a hosted demo at hf.co/spaces/hexgrad/Kokoro-TTS.

And in the FAQ:

> What's included in the Kokoro TTS free trial?

> New users can try Kokoro TTS's full capabilities with our free trial. This allows you to experience our professional-grade text to speech technology firsthand, including access to all voices and both American and British English options.

So this is the "free trial"? Plus, its being described as a cloud-based service leaves me confused about the situation.

makeitdouble 1 year ago |

Company is based in Singapore, apparently.

From the privacy policy:

> We collect certain personal data, including but not limited to your name, email address, and payment information (if applicable) to enhance the Service and improve user experience.

It's the first time I've seen collecting payment info to improve user experience.

nenaoki 1 year ago |

https://kokorotts.org/ is the proper site.

ipsum2 1 year ago | |

No, that one also appears to be fake.

nicman23 1 year ago |

I just used it with https://github.com/santinic/audiblez/pull/14/files (linking the PR because it has GPU acceleration).

It is very fast and very passable.

jallenjia 1 year ago |

I'm excited to share Kokoro TTS, an open-source text-to-speech model we've been working on. Despite its relatively small size (82M parameters), it achieves impressive results in natural speech synthesis, ranking first in the TTS Spaces Arena benchmark.

The model is Apache 2.0 licensed and trained on less than 100 hours of audio data. It supports both American and British English, offering multiple voice options with natural emotional expression and 24kHz audio output.

We've deployed a demo at kokorotts.online where you can try it out. I'd really appreciate any feedback from the HN community on both the model's performance and potential applications.

Tech stack: StyleTTS 2 architecture, ONNX runtime, Next.js for the web interface.

kissgyorgy 1 year ago | |

It's NOT Open Source.

dontdoxxme 1 year ago | | |

Confusing messaging. A previous version is at https://huggingface.co/hexgrad/Kokoro-82M (it matches the demo if you use the "TTS v0.19" tab; it has some artefacts in the voice[1] and definitely doesn't sound as good as the latest version).

"There currently isn't a release date scheduled for the other voices"

[1]: https://huggingface.co/blog/hexgrad/kokoro-short-burst-upgra...

vanous 1 year ago | | |

And it's not offline.

CGamesPlay 1 year ago | | |

In which sense? https://huggingface.co/hexgrad/Kokoro-82M

- Apache 2.0 weights in this repository

- MIT inference code in spaces/hexgrad/Kokoro-TTS adapted from yl4579/StyleTTS2

- GPLv3 dependency in espeak-ng

dcreater 1 year ago | |

The website is not from the authors. Seems fraudulent.

HF: https://huggingface.co/spaces/hexgrad/Kokoro-TTS

matrixhelix 1 year ago | |

Where is the code?