The Quite OK Audio Format for Fast, Lossy Compression (qoaformat.org)
like, can you reasonably qoa-compress real-time 16ksps audio on a 16 megahertz atmega328?
hmm, https://phoboslab.org/log/2023/04/qoa-specification has some benchmark results, let's see... seems like he encoded 9807 seconds of 44.1ksps stereo in 25.8 seconds and decoded it in 3.00 seconds on an i7-6700k running singlethreaded. what does that imply for other machines?
it seems to be integer code (because reproducibility between the predictor in encoding and decoding is important), and a significant part of it is 16-bit. https://ark.intel.com/content/www/xl/es/ark/products/88195/i... says it's a 4.2 gigahertz skylake. agner says skylake can do 4–6 ipc (well, μops/cycle) https://www.agner.org/optimize/blog/read.php?i=628, coincidentally testing on an i7-6700k himself, but let's assume it's 3 ipc, because it's usually hard to reach even that level of ilp in useful code
so that's about 380 μops per sample if i'm doing my math right; that might be on the order of 400 32-bit integer instructions per sample on an in-order processor. if (handwaving wildly now!) that's 600 8-bit instructions, the atmega328 should be able to encode somewhere in the range of 16–32 kilosamples per second
so, quite plausibly
for decoding the same math gives 43 μops per sample rather than 380
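For anyone who wants to check the arithmetic, here is a rough sketch of the estimate above in Python. The 3 IPC figure and the 600 instructions per sample are the guesses made in the comment, not measurements. The sketch also assumes that "sample" means one channel's sample and that the atmega328 retires about one instruction per cycle.

```python
# Back-of-envelope estimate only. ASSUMED_IPC and AVR_INSTRUCTIONS_PER_SAMPLE
# are the hand-waved guesses from the comment, not measured values.
SECONDS_OF_AUDIO = 9807      # length of the benchmark corpus
SAMPLE_RATE = 44_100         # samples per second per channel
CHANNELS = 2                 # stereo
CLOCK_HZ = 4.2e9             # i7-6700k clock assumed in the comment
ASSUMED_IPC = 3              # assumed sustained μops per cycle

# Count every channel's sample separately.
samples = SECONDS_OF_AUDIO * SAMPLE_RATE * CHANNELS


def uops_per_sample(wall_seconds):
    """Estimated μops spent per sample for a run that took wall_seconds."""
    return wall_seconds * CLOCK_HZ * ASSUMED_IPC / samples


encode_uops = uops_per_sample(25.8)   # encoding run
decode_uops = uops_per_sample(3.00)   # decoding run

# atmega328 side: assumes roughly one instruction per clock cycle.
AVR_CLOCK_HZ = 16e6
AVR_INSTRUCTIONS_PER_SAMPLE = 600     # wild guess from the comment
avr_samples_per_second = AVR_CLOCK_HZ / AVR_INSTRUCTIONS_PER_SAMPLE

print(f"encode: {encode_uops:.0f} uops/sample")
print(f"decode: {decode_uops:.0f} uops/sample")
print(f"atmega328: {avr_samples_per_second / 1000:.1f} ksps")
```

If "sample" instead meant a stereo frame, both per-sample figures would double.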
i'm very interested to hear anyone else's benchmarks or calculations
3 months ago - https://news.ycombinator.com/item?id=35738817
6 months ago - https://news.ycombinator.com/item?id=34625573
these had crucial information for me
Would be nice to see joint stereo support. If you were to take ADPCM or this OK format and try to encode any stereo music with it, you will need 2 channels. However, there is an extremely advantageous optimization that can be made here - most music is largely center panned, so both channels are almost the same. With joint stereo you record one channel (either by picking one or mixing to an average) and then you can store the difference for the other channel which will occupy a lot fewer bits, assuming you are able to quantize away the increased entropy.
For example, instead of using two 4bit ADPCM channels for stereo, which would only be a 50% savings over uncompressed, you could probably use an average of 5 bits per sample.
This was/is available in MP3 since forever, so seems a reasonable request.
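A minimal sketch of the mid/side idea described above, in Python. This is not part of QOA or any existing codec API. The function names are made up for illustration, and it only shows the reversible channel transform, not the quantization step that would actually save bits.

```python
def to_mid_side(left, right):
    """Turn one stereo sample pair into (mid, side) integers."""
    mid = (left + right) >> 1   # floor of the average
    side = left - right         # near zero for center-panned material
    return mid, side


def from_mid_side(mid, side):
    """Exactly invert to_mid_side.

    The bit dropped by the floor average equals the parity of side,
    so it can be recovered without storing anything extra.
    """
    left = mid + ((side + (side & 1)) >> 1)
    right = left - side
    return left, right


# Mostly center-panned signal: the channels differ by only a few units.
left_channel = [1000, 1200, -800, -1500, 300]
right_channel = [998, 1203, -801, -1497, 300]

pairs = [to_mid_side(l, r) for l, r in zip(left_channel, right_channel)]
side_values = [side for _, side in pairs]
restored = [from_mid_side(m, s) for m, s in pairs]

print(side_values)
print(restored == list(zip(left_channel, right_channel)))
```

The side channel here stays within a few units of zero while the mid channel keeps the full range, which is where the bit savings would come from once the side channel is coded with fewer bits.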
https://wiki.hydrogenaud.io/index.php?title=Intensity_stereo
> I have just pushed a workaround to master. [...]
> This still introduces audible artifacts when the weights reset. It prevents the LMS from exploding, but is far from perfect :/
This, combined with the fact that that issue is still open, means that a breaking change is still to be expected.
It should be spelled out explicitly, but I figured out the rest
L - Left, R - Right, C - Center, FL - Front Left, FR - Front Right, SL - Side Left, SR - Side Right, BL - Back Left, BR - Back Right
---
Edit: LFE - Low Frequency Effects... so subwoofer?
https://www.dolby.com/uploadedFiles/Assets/US/Doc/Profession...
Subwoofers come with multichannel audio systems in which the directional speakers usually can't cover the lower range of audio frequencies. They are responsible for the bass content from all channels, and get it from a software or hardware crossover filter that is independent of the specific input format. Placement of the low-frequency speaker does not matter much because human hearing localizes low frequencies poorly.
The LFE track is an additional effects channel for movie theaters and similar amusement rides, in which the audio system plays low frequencies from the other channels just fine. A dedicated LFE emitter then adds rattling and other wub-wub effects without overloading the audio speakers with all that extra energy. Movies that lack car chases and explosions routinely have completely silent LFE tracks.
These days crossover points are very configurable. Most bass shakers are rated for use between 20 Hz and 200 Hz.
Other than that, looks great!
https://en.m.wikipedia.org/wiki/Software_patents_under_the_E...
That's not what they did, apparently.
The document properties call out https://cairographics.org
I personally wouldn't mind a Quite OK Page Description Language. Something that gets you most of PDF/PS/HPGL without all the effort. Could use the Quite OK Image Format for bitmap images. Not sure whether you'd need a Quite OK Vector Format and/or a Quite OK Font Format as prerequisites…
[1]: https://phoboslab.org/log/2019/06/pl-mpeg-single-file-librar...
AKA last Tuesday morning's frustration: I wanted to make interactive plots on a web page to explain math stuff.
https://www.novagraaf.com/en/insights/patentability-software...
> As a result, the widespread belief in the non-patentability of software is simply a misconception, partly as a result of insufficient training of innovators and the lobbying activities of certain interested parties.
https://fsfe.org/activities/swpat/swpat.en.html
> The European Patent Convention states that software is not patentable. But laws are always interpreted by courts, and in this case interpretations of the law differ. So the European Patents Office (EPO) grants software patents by declaring them as "computer implemented inventions".
We have a 3D environment with spatial audio. Audio is encoded server-side, and since it's spatial everyone needs their own mix. We're using Opus, and audio encoding turns out to be the usual limiting factor on small servers.
So this kind of thing is exactly up our alley: an alternate option that uses less CPU than Opus, but consumes less bandwidth than raw audio.
But adding support for FLAC is also on our list. It seems nicely performant when compared to Opus.
I appear to be able to get maybe 30% better performance -- pretty nice, but not nearly big enough especially on low end servers.
in https://phoboslab.org/log/2023/04/qoa-specification he got ffmpeg on one core of an i7-6700k (which is arguably 'modern hardware') to encode a 9807-second file in mp3 in 146.2 seconds, 67× faster than real time. but qoa was 25.75 seconds, 5.7 times faster than that. qoa decoding was 2.5× as fast as dr_mp3
you can imagine situations where reducing the number of audio encoding servers in your audio encoding cluster by a factor of 6 would be a big win, or where you want to encode 100+ audio streams in real time on your laptop (maybe an sdr tuned to every am radio station at once), but i agree with you that battery-constrained devices are a more likely application area: making your audio recorder battery last twice as long is a much bigger win
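A quick sanity check of those ratios, sketched in Python. It uses only the timings quoted above. The "streams per core" figure is an idealized upper bound that assumes each stream is 44.1ksps stereo like the benchmark file and ignores I/O and scheduling overhead.

```python
# Timings quoted from the benchmark post; everything else is derived.
AUDIO_SECONDS = 9807          # length of the encoded file
MP3_ENCODE_SECONDS = 146.2    # ffmpeg mp3 encode, one core
QOA_ENCODE_SECONDS = 25.75    # qoa encode, one core

# How many seconds of audio get encoded per second of wall time.
mp3_realtime_factor = AUDIO_SECONDS / MP3_ENCODE_SECONDS
qoa_realtime_factor = AUDIO_SECONDS / QOA_ENCODE_SECONDS

# Relative encoder speed.
qoa_vs_mp3 = MP3_ENCODE_SECONDS / QOA_ENCODE_SECONDS

# Idealized bound: one core could keep up with this many real-time streams.
max_realtime_streams = int(qoa_realtime_factor)

print(f"mp3: {mp3_realtime_factor:.0f}x real time")
print(f"qoa: {qoa_realtime_factor:.0f}x real time")
print(f"qoa is {qoa_vs_mp3:.1f}x faster than mp3")
print(f"streams per core (upper bound): {max_realtime_streams}")
```

So the "100+ streams on a laptop" scenario has plenty of headroom on paper, at least for the encoding step itself.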
That was a US patent from Fraunhofer, who made quite some cash from mp3 license fees (100 000 000€ according to Wikipedia).
Presumably because it's much easier to get injunctive relief in Germany, I've seen more codec-related litigation there than anywhere else.
(*) Like many pieces of misinformation it has its roots in a seed of truth: Particularly between 1998 (State Street) and 2014 (CLS v Alice) the case law in the US supported software patents.
The real confusion is that "Software patents" is an obscure term of art which refers to patents specifically on software methods without any reference to a physical machine or good.
When non-patent-attorneys say "software patents" they mean something more like "something I could infringe by writing software". But clever drafting allows people to write patents that software causes an infringement of without it technically being a "software patent": The patent's claims language will say something like "A recorded media containing instructions..." or "A microprocessor programmed to...". And this has been true in the US and Europe through the whole span.
Which is why there is an awful lot of patent action impacting software in places where "software patents" don't exist, such as the US (as of right now) and Europe.
https://en.wikipedia.org/wiki/Alice_Corp._v._CLS_Bank_Intern... - this?
>The patents were held to be invalid, because the claims were drawn to an abstract idea, and implementing those claims on a computer was not enough to transform that abstract idea into patentable subject matter.
But that might be an interesting experiment. Right now the low cpu usage/high quality/fairly high bandwidth usage category is something we're looking to have an option for.