Finding the average of two unsigned integers without overflow

Finding the average of two unsigned integers without overflow (devblogs.microsoft.com)

448 points by mlex 4 years ago | 208 comments

Having done computer architecture and bit twiddling x86 in the ye olden days, I immediately, independently converged on the patented solution (code / circuit / Verilog, more or less the same thing). It goes to show how broken the USPTO is because it's obvious to anyone in the field. Patents are supposed to be nonobvious. (35 USC 103)

https://patentdefenses.klarquist.com/obviousness-sec-103/

phkahler 4 years ago | |

Agreed. I spent about a minute before reading it and came up with the first solution, didn't feel like thinking through the puzzle of how not to care which one is larger, and then settled on the one with the 2016 expiration date. All within 1 to 2 minutes. I briefly considered XOR but didn't feel like remembering more about it - the solution was obvious when I saw it. How any of that was ever patentable is a crime.

undecisive 4 years ago | |

This is bizarre. I wonder how many of us saw that title, thought "That's a really simple problem, surely?" came up with a solution and then were shocked when their coffee-lacking brain actually came up with the patented solution?

I mean... ignoring the bitwise arithmetic (which is only obvious to people used to doing binary operations), this is the kind of maths that an 11yo could do.

That said, the patented solution is a little more complex. But not by much.

Which makes me curious: what other patents have we violated in our day-to-day without even knowing it?

mschuster91 4 years ago | | |

> Which makes me curious: what other patents have we violated in our day-to-day without even knowing it?

Patents are like the criminal code - always remember "Three Felonies a Day" [1]. The system is set up so that if you are one of the 99%, the 1% can come in and bust you at will if you become too much of an annoyance/threat. They will find something if they just keep digging deep enough (not to mention that they can have your entire company's activity combed through with a microscope if they find a sympathetic court), and blast you with enough charges and threaten sequential jail time so that you cannot reasonably do anything other than plead guilty and forfeit your right to a fair trial [2].

And for what it's worth, that "play by the rules as we want or we will destroy you" tactic can even hit multi-billion dollar companies like Epic Games. It's one thing if society decides to regulate business practices by the democratic process of lawmaking... but the fact that Apple can get away banning perfectly legal activities such as adult content, vaping [3] or using a non-Apple payment processor from hundreds of millions of people is just insane, not to mention incredibly damaging to the concept of democracy.

[1]: https://kottke.org/13/06/you-commit-three-felonies-a-day

[2]: https://innocenceproject.org/guilty-pleas-on-the-rise-crimin...

[3]: https://www.macrumors.com/2020/06/01/pax-vape-management-web...

scotty79 4 years ago | | |

We should end legal protection of ideas as soon as possible.

Agentlien 4 years ago | |

I just want to second this with my own experience just now:

I looked at the title while still waking up.

At first I thought of the low + (high - low) / 2 method. I then figured maybe it was better to simply predivide both numbers before adding and just correcting for the lowest bit (how was that ever patented?!).

However, I didn't like having to perform two divisions so I thought there was probably something clever one could do with bit operations to avoid it. But, still being tired, I decided I didn't want to actually spend time thinking on the problem and I'd already spent a minute on it.

vbezhenar 4 years ago | | |

x / 2 === x >> 1, it's fast.

Ygg2 4 years ago | |

> Patents are supposed to be nonobvious

Emphasis on supposed.

The granted patents include: laser used to exercise cat, and mobile wood based dog game (log used to play fetch).

https://abovethelaw.com/2017/10/8-of-my-favorite-stupid-pate...

https://patents.google.com/patent/US5443036A/en

https://patents.google.com/patent/US6360693

Apple steals the cake though. By patenting a geometric shape.

nikanj 4 years ago | | |

I bet you broke this patent as a kid https://patents.google.com/patent/US6368227B1/en

zanethomas 4 years ago | |

100% correct

the patented solution immediately came to mind

grishka 4 years ago | | |

It almost did for me. I thought that you should be able to divide each number by 2 (or shift one bit) before adding, but that would lose a 1 if both numbers have 1 in their least significant bit. The part with "a & b & 1" fixes that exact issue and is obvious to me in hindsight.

tapirl 4 years ago | |

And what is the intention to make the patent? The second way is actually more useful, not limited to unsigned ints.

Tuna-Fish 4 years ago | | |

But it requires you to know which one is larger. The patented way is faster if you are working with unsigned.

ChrisLomont 4 years ago | |

The patent is more sophisticated than what the article implies - it's a single clock cycle method, which no compiler I've ever seen will do given the code presented in the article.

And it's from 1996.

bsdetector 4 years ago | | |

This thread is full of people who challenged themselves to solve it and then failed to come up with the 'obvious' 1-cycle solution. It's clearly non-obvious, as this thread shows.

The actual patent system failure here is the patent is not useful -- it's not valuable. If you needed this solution, you could sit down and derive it in less than an hour. That's not because it's obvious, but because the scope is so small.

The only difference between this patent and say a media codec is how long it would take to reinvent it. It might take you 200 years to come up with something as good as h.265, but there's no magic to it. There's a problem, somebody came up with a solution, somebody else could do it again given enough time to work on it. This is true for everything that's ever been patented.

The point of patents is to compensate for value of the work needed to reinvent, and so the real problem here is that value is less than any sane minimum. The value is less than the patent examiner's time to evaluate it! But court rulings have said it doesn't matter how insignificant a patent is, as long as it does anything at all it's "useful", which leads to these kinds of worthless patents.

adrian_b 4 years ago | | |

Sorry, but this argument about the single-cycle implementation is complete BS.

Any logic designer, who is not completely incompetent, when seeing the expression

(a / 2) + (b / 2) + (a & b & 1);

will notice that this is a 1-cycle operation, because it is just an ordinary single addition.

In hardware the divisions are made just by connecting the bits of the operands in the right places. Likewise, the "& 1" is done by connecting the LSB's of the operands to a single AND gate and the resulting bit is connected to the carry of the adder, so no extra hardware devices beyond a single adder are needed. This is really absolutely trivial for any logic designer.
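A software model of the wiring described above, as a rough sketch rather than the patented circuit itself. The names `adder_with_carry_in` and `average_via_adder` are made up for illustration, and the ripple-carry loop only stands in for whatever adder real hardware would use:

```c
#include <stdint.h>

/* Bit-level model of one 32-bit adder with a carry-in input.
   Hypothetical helper, written as a ripple-carry loop for clarity. */
static uint32_t adder_with_carry_in(uint32_t x, uint32_t y, uint32_t carry_in)
{
    uint32_t sum = 0;
    uint32_t carry = carry_in & 1u;
    for (int i = 0; i < 32; i++) {
        uint32_t xi = (x >> i) & 1u;
        uint32_t yi = (y >> i) & 1u;
        sum |= (xi ^ yi ^ carry) << i;            /* sum bit of a full adder */
        carry = (xi & yi) | (carry & (xi ^ yi));  /* carry out of a full adder */
    }
    return sum;
}

/* The "divisions" are only wiring (bit 0 of each operand is dropped);
   the AND of the two dropped bits feeds the adder's carry-in. */
static uint32_t average_via_adder(uint32_t a, uint32_t b)
{
    return adder_with_carry_in(a >> 1, b >> 1, a & b & 1u);
}
```

In this model the only arithmetic element is the single adder; the shifts and the AND correspond to how the inputs are connected.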

The questions at any hiring interview, even for beginners, would be much more complex than how to implement this expression.

It is absolutely certain that such a patent should have never been granted, because both the formula and its implementation are obvious for any professional in the field.

roberthahn 4 years ago | |

What is obvious today might not have been obvious in 1996.

Our experiences and training have changed dramatically over the past 26 years.

scotty79 4 years ago | | |

I can assure you I would have come up with the patented solution just as fast in 1996 when I was a teenager and dabbled in 6502 assembler on an Atari computer. Because I solved it now on the basis of exactly the experience and knowledge I acquired back then.

b112 4 years ago | |

But in as it is non-obvious, as to why it is non-obvious, criteria met.

hackthefender 4 years ago | |

> It goes to show how broken the USPTO is...

The patent issued in 1996 and wasn't revisited since then (because it was never asserted in litigation). The USPTO is a lot different now, a quarter-century later.

Dylan16807 4 years ago | | |

> The USPTO is a lot different now, a quarter-century later.

Please be more specific or link something that explains how they've improved.

nerdponx 4 years ago | | |

Isn't there also a recourse process by which you can get a patent invalidated? You can't expect USPTO to hire an expert in every single possible field.

ridiculous_fish 4 years ago |

The "SWAR" approach `(a & b) + (a ^ b) / 2` looks bizarre but can be understood.

Adding two bits produces a sum and a carry:

    0 + 0 = 0, carry 0
    1 + 0 = 1, carry 0
    0 + 1 = 1, carry 0
    1 + 1 = 0, carry 1

So the sum is XOR, and the carry is bitwise AND. We can rewrite x + y as (x ^ y) + (x & y)*2

Distribute the divide, and you get (x ^ y)/2 + (x & y) which is the mystery expression.

(Note this distribution is safe only because (x & y)*2 is even.)
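The derivation above can be checked directly in C. This is a small sketch with illustrative function names (`add_via_xor_and` and `average_swar` are not from the article), assuming 32-bit unsigned values:

```c
#include <stdint.h>

/* x + y == (x ^ y) + 2 * (x & y):
   XOR is the carry-less sum, AND marks the positions that carry. */
static uint32_t add_via_xor_and(uint32_t x, uint32_t y)
{
    return (x ^ y) + ((x & y) << 1);
}

/* Halving each term gives the average without ever forming x + y. */
static uint32_t average_swar(uint32_t x, uint32_t y)
{
    return (x & y) + ((x ^ y) >> 1);
}
```

For 23 and 21, `x & y` is 21 and `x ^ y` is 2, so the sum comes out as 2 + 42 = 44 and the average as 21 + 1 = 22.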

worewood 4 years ago |

Just by reading the headline, before opening the article, I thought of the patented solution in my head. "Just halve before adding, it can be off by one but some boolean logic might do it"

Software patents are absolutely disgusting.

jws 4 years ago | |

Absolutely the same thing I did. I even had the low bit logic worked out by the time I scrolled the article down and saw the patented line. Clearly we have both had miraculous enlightenment because legally this is “not obvious”.

hackthefender 4 years ago | | |

> Clearly we have both had miraculous enlightenment because legally this is “not obvious”.

To be precise, legally it is "not obvious back in 1996." There is a lot of stuff that is obvious today that wasn't 25 years ago. That said, this one in particular probably would have been invalidated as obvious if it was ever litigated (and it was not). Also, the USPTO has reined in software patents a lot in recent years (though there are always people advocating for more or less).

andi999 4 years ago | | |

Yes. I think though the next solution is not obvious. (a & b)+ (a^b)/2

umeshunni 4 years ago | |

It's not the solution that's patented, but it's the implementation in a single CPU cycle in hardware.

jagger27 4 years ago | | |

That’s a pretty important distinction. If someone in the 1800s invented a mechanical calculator that could do this operation in a single crank, I don’t think anyone would be upset about that patent.

sebazzz 4 years ago | | |

So the compiler developers had to pay fees? Or Intel and AMD?

teaearlgraycold 4 years ago | |

As if anyone would ever get prosecuted for that, though.

Given its simplicity this makes me wonder if a compiler has ever transformed legal original IP code into patented code.

enneff 4 years ago | | |

That’s not really how software patents are (ab)used. Just having the patent and a vaguely credible claim that someone is using the patented technology is enough to encumber them with enough legal issues that many people will settle instead of fight it.

justin66 4 years ago |

See also: "Nearly All Binary Searches and Mergesorts are Broken" by Joshua Bloch. The cluefulness or otherwise with which people often react to Bloch's excellent post is not something to ponder very closely if you want to retain any hope in the future of software engineering.

https://ai.googleblog.com/2006/06/extra-extra-read-all-about...

https://news.ycombinator.com/item?id=3530104

https://news.ycombinator.com/item?id=1130463

https://news.ycombinator.com/item?id=14906429

https://news.ycombinator.com/item?id=6799336

https://news.ycombinator.com/item?id=9857392

https://news.ycombinator.com/item?id=12147703

https://news.ycombinator.com/item?id=621557

https://news.ycombinator.com/item?id=7594625

https://news.ycombinator.com/item?id=9113001

https://news.ycombinator.com/item?id=16890739

If doomscrolling all that isn't enough to make you fear for mankind's future I'm pretty sure there's an Ulrich Drepper glibc bug report rejection related to this topic (or several) that you can google...

On topic: Raymond's post has some other great stuff. SWAR!

johnhenry 4 years ago |

I saw the title and thought to just do "(a / 2) + (b / 2)" and do a little bit of fudging if a or b is odd.

After reading the article, learning that

  unsigned average(unsigned a, unsigned b)
  {
    return (a / 2) + (b / 2) + (a & b & 1);
  }

was once patented actually made me a bit sad for our entire system of patents.

cphoover 4 years ago | |

Why is math patentable? seems crazy to me

bonzini 4 years ago | | |

What is patentable is "this circuit to compute the average" where the circuit is an adder that drops the bottom bit from the addends, instead ANDing the two bottom bits and using the result as a carry-in.

Though actually it shouldn't be patented because it's an obvious implementation of a math formula (and math is not patentable).

nmilo 4 years ago |

    There’s another algorithm that doesn’t depend on knowing which value is larger, the U.S. patent for which expired in 2016:

    unsigned average(unsigned a, unsigned b)
    {
        return (a / 2) + (b / 2) + (a & b & 1);
    }

There's no way that should be patentable.

rustybolt 4 years ago |

> There’s another algorithm that doesn’t depend on knowing which value is larger, the U.S. patent for which expired in 2016.

That's completely retarded; it's literally the first solution I think of when I hear this problem.

kuboble 4 years ago | |

That's not a solid argument on its own. Today if you want to talk to someone then using a phone might be the first solution you can think of. That doesn't indicate the phone was a bad patent in the past.

ghusbands 4 years ago | | |

It was obvious in 1996, too. It is and was the most obvious solution for a programmer fully aware of the problem and wanting to avoid comparisons.

wongarsu 4 years ago | | |

If the average expert in the field immediately comes up with the same or a very similar solution then it obviously isn't non-obvious, which is one of the tests for patentability.

In the case of the phone you already know the patented solution, which obviously makes it impossible for you to judge its obviousness. That presumably wasn't the case with GP and the presented problem.

SkeuomorphicBee 4 years ago | | |

If a phone is the first solution that comes to mind for a person that never saw or heard of a phone in their life, then that indicates phone was a bad patent.

d_tr 4 years ago | |

The fact that patents require time and money makes this even more pathetic and appalling.

dralley 4 years ago |

> I find it amusing that the PowerPC, patron saint of ridiculous instructions, has an instruction whose name almost literally proclaims its ridiculousness: rldicl. (It stands for “rotate left doubleword by immediate and clear left”.)

I suspect the POWER team has a good sense of humor. There's also the EIEIO instruction https://www.ibm.com/docs/en/aix/7.2?topic=set-eieio-enforce-...

benlivengood 4 years ago |

Before reading the article: In x86 assembly, add ax, bx ; rcr ax, 1 works. I guess technically that is with overflow, but using overflow bits as intended.

EDIT: it's included in the collection of methods in the article as expected.
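For readers who don't write x86, here is a portable C sketch of what the add/rcr pair does. It is a model under assumptions, not compiler output: it uses 32-bit operands rather than the 16-bit `ax`/`bx` above, and `average_carry` is an illustrative name.

```c
#include <stdint.h>

static uint32_t average_carry(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;       /* wraps modulo 2^32, like ADD */
    uint32_t carry = sum < a;   /* 1 exactly when the addition wrapped */
    /* Shift right by one and put the lost carry back in the top bit,
       which is what RCR by 1 does with the carry flag. */
    return (sum >> 1) | (carry << 31);
}
```

The carry plus the 32-bit sum together form the full 33-bit result, so shifting the pair right by one yields the exact truncated average.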

jart 4 years ago | |

That's lovely. I missed it when reading the article. It's also the winner on AMD Zen architecture based on MCA analysis.

    unsigned midpoint(unsigned a, unsigned b) {
      /* add sets the carry flag on overflow; rcr by 1 rotates it back in */
      asm("add\t%1,%0\n\t"
          "rcr\t%0"
          : "+r"(a)
          : "r"(b)
          : "cc");
      return a;
    }

Although `(a & b) + (a ^ b) / 2` is probably the more conservative choice.

AnotherGoodName 4 years ago |

I noticed the following is in the middle of the article with no context that no one else is mentioning:

    unsigned average(unsigned a, unsigned b)
    {
        return (a & b) + (a ^ b) / 2;
    }

A quick sanity check of this

23 & 21 = 21

23 ^ 21 = 2

21 + 2 / 2 = 22 (order of operations)

I wonder why this is there. It seems the best solution but no one else is mentioning it. It also has no context near it. Nor is it stated correctly. It's just there on its own.

jlynn 4 years ago | |

The average of 23 and 21 is indeed 22.

AnotherGoodName 4 years ago | | |

Oh right, sorry i'll edit this. It works straight up then. Weird it's there with no context.

829588225 4 years ago | |

23 ^ 21 = 2

AnotherGoodName 4 years ago | | |

Sorry, edited the above. This is straight up right then which is weird. It's just there in the middle of the article with no context. In the middle of the SWAR method.

yalogin 4 years ago |

I cannot believe that solution was allowed to be patented. How crappy is our patent process? Most engineers writing code would come up with that solution first.

stathibus 4 years ago | |

Every patent attorney I've ever worked with has emphasized that engineers are not equipped to determine if an idea is obvious and should let the PTO decide.

They say this because they know the USPTO strategy is to just hand out patents after putting in some bare minimum effort to review, and postpone the real review process to the unlikely day that someone chooses to challenge it in court and can pay private firms to do their job for them.

The winners in this arrangement are the government, the big law firms, and the large corporations that can afford them.

classichasclass 4 years ago |

He hinted at this obliquely, but the PowerPC family of bit rotate instructions (rldicl, rlwinm, rlwimi, etc.), although intimidating in the general case, allows shifting, rotation, insertion, masking and more. There are many alternative mnemonics to try to reduce the cognitive complexity but all of these just assemble to them with particular parameters.

leptoniscool 4 years ago |

This seems fundamental, surprised elementary operations haven't been made a part of every major language/framework.

mzs 4 years ago | |

>Bonus chatter: C++20 adds a std::midpoint function that calculates the average of two values (rounding toward a).

Findecanor 4 years ago |

As an asm geek, I wasn't surprised to read that taking advantage of the carry flag yielded the most efficient code for some processors. I recalled that some ISAs also have special SIMD instructions specifically for unsigned average, so I looked them up:

* x86 SSE/AVX/AVX2 have (V)PAVGB and (V)PAVGW, for 8-bit and 16-bit unsigned integers. These are "rounding" instructions though: adding 1 to the sum before the shift.

* ARM "Neon" has signed and unsigned "Halving Addition". 8,16 or 32 bit integers. Rounding or truncating.

* RISC-V's new Vector Extension has instructions for both signed and unsigned "Averaging Addition". Rounding mode and integer size are modal.

* The on-the-way-out MIPS MSA set has instructions for signed, unsigned, rounded and truncated average, all integer widths.

Some ISAs also have "halving subtraction", but the purpose is not as obvious.
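To make the rounding/truncating distinction in the list above concrete, here is a scalar C sketch of a single lane. The function names are invented for illustration and this is not how you would call the SIMD instructions; it only models the arithmetic "add 1 to the sum before the shift" described for PAVGB.

```c
#include <stdint.h>

/* Truncating (round-down) average of one 8-bit lane. */
static uint8_t avg_trunc_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b) >> 1);      /* operands promote to int, so no overflow */
}

/* Rounding (round-half-up) average: add 1 before the shift. */
static uint8_t avg_round_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* The same rounding result without a wider intermediate,
   using a + b == 2 * (a | b) - (a ^ b). */
static uint32_t avg_round_u32(uint32_t a, uint32_t b)
{
    return (a | b) - ((a ^ b) >> 1);
}
```

The two 8-bit versions differ only when `a + b` is odd, for example 1 and 2 average to 1 when truncating and 2 when rounding.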

unwind 4 years ago |

Very cool.

I was surprised that the article didn't mention the need for this in binary search, and the famous problems [1] that occurred due to naive attempts.

[1]: https://en.m.wikipedia.org/wiki/Binary_search_algorithm
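A minimal sketch of where the midpoint shows up in binary search, assuming a sorted `int` array and `size_t` indices (the function name and the "return n when absent" convention are choices made for this example). The bug Bloch described involved computing `(low + high) / 2` with signed `int` indices; the form below sidesteps the addition entirely.

```c
#include <stddef.h>

/* Returns the index of key in the sorted array, or n if it is absent. */
static size_t binary_search(const int *sorted, size_t n, int key)
{
    size_t lo = 0, hi = n;                  /* search the half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;    /* cannot overflow, since lo <= hi */
        if (sorted[mid] < key)
            lo = mid + 1;
        else if (sorted[mid] > key)
            hi = mid;
        else
            return mid;
    }
    return n;
}
```

Because the bounds are always ordered here, the simple `lo + (hi - lo) / 2` form is enough and none of the cleverer tricks are needed.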

ncmncm 4 years ago |

> gcc doesn’t have a rotation intrinsic, so I couldn’t try it there

Gcc and Clang both recognize the pattern of shifts and OR that reproduce a rotation, and substitute the actual instruction, no intrinsic needed.

I bet MSVC does too.
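The shift-and-OR pattern being referred to looks roughly like this. It is a sketch with an illustrative name (`rotl32`); whether a particular compiler version turns it into a single rotate instruction is something to confirm in its generated code.

```c
#include <stdint.h>

static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31u;                                     /* keep the count in 0..31 */
    /* The second mask avoids an undefined shift by 32 when n == 0. */
    return (x << n) | (x >> ((32u - n) & 31u));
}
```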

adrian_b 4 years ago | |

They recognize how to do a rotation of an unsigned integer value, but they do not recognize how to do the rotation of that value concatenated with the carry bit, which is needed here.

user-the-name 4 years ago |

"unsigned long long"?

It's 2022. stdint.h is old enough to drink, and is probably married with a kid on the way. Just include it already?

MaxBarraclough 4 years ago |

Reminds me of a Stack Overflow thread, Shortest way to calculate difference between two numbers? [0] Multiple answers ignored the possibility of overflow.

[0] https://stackoverflow.com/q/10589559/

favorited 4 years ago |

Marshall Clow gave a pretty excellent CppCon talk covering these exact problems, called "std::midpoint? How Hard Could it Be?" https://www.youtube.com/watch?v=sBtAGxBh-XI

rrss 4 years ago | |

yes, this is linked from the article

Subsentient 4 years ago |

Eh. I just cast both to a bigger integer type where possible, which in practice, is almost always. So if I'm averaging two uint32_ts, I just cast them to uint64_t beforehand. Or in Rust, with its lovely native support for 128-bit integers, I cast a 64-bit integer to 128-bit.
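In C, the widening approach described above might look like the following sketch (the name `average_widened` is illustrative). It relies on a 64-bit type being available, which is the "where possible" caveat.

```c
#include <stdint.h>

static uint32_t average_widened(uint32_t a, uint32_t b)
{
    /* Widen one operand first so the addition happens in 64 bits;
       the halved sum always fits back into 32 bits. */
    return (uint32_t)(((uint64_t)a + b) / 2);
}
```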

everyone 4 years ago |

I find it mind boggling that something as simple as this can actually be patented.

  unsigned average(unsigned a, unsigned b)
  {
      return (a / 2) + (b / 2) + (a & b & 1);
  }

That makes the patent system seem broken to me.

mark-r 4 years ago |

This was a lot more thorough and in-depth than I expected it to be. But that's Raymond Chen for you.

One of the reasons I love Python is that integers never overflow, so this becomes a trivial problem.

erwincoumans 4 years ago | |

Rounding in Python is interesting though:

https://www.askpython.com/python/built-in-methods/python-rou...

"Also, if the number is of the form x.5, then, the values will be rounded up if the roundup value is an even number. Otherwise, it will be rounded down.

For example, 2.5 will be rounded to 2, since 2 is the nearest even number, and 3.5 will be rounded to 4."

scatters 4 years ago | | |

Round half to even is well known outside Python. It's sometimes called bankers' rounding.

mark-r 4 years ago | | |

Yes it is. But nobody said anything about rounding!

nickm12 4 years ago | |

Raymond Chen is a treasure.

Beldin 4 years ago |

Since this discussion is all about patents: my 2 cents on improving the patent system.

Consider a term project of an undergraduate CS course, where the goal is spelled out, but the method is left for discovery.

Methods developed within any such project immediately invalidate patents. They're apparently obvious to folks learning to become "skilled in the art".

Yes, in practice, reaching a legal threshold would be hard (are you sure the students didn't read the patent or any description directly resulting from it?). But I'd definitely run a "patent invalidation course" - if I had confidence that the results would actually affect patents.

hollowturtle 4 years ago |

Wait, what? How can a patent right be applied to one line of code that eventually is compiled down to machine code? Sounds ridiculous to me

bufferoverflow 4 years ago |

Isn't it better to do

    (a>>1) + (b>>1) + (a&b&1)

No division needed.

jws 4 years ago | |

Your compiler will take care of that. Leave the division for the humans to read.

xaduha 4 years ago | | |

I'm in a camp that thinks compilers should also take care of the original

    unsigned average(unsigned a, unsigned b) {
        return (a + b) / 2;
    }

At the end of the day it's all just text. There are plenty of steps before any of it does anything at all.

8jy89hui 4 years ago | |

Not really. It is harder for most programmers to read (a>>1) than the simpler (a/2) and in most modern programming languages the compiler will notice the division by a power of two and compile to bit shift operations in both cases.

marginalia_nu 4 years ago | | |

> most programmers

Really depends on where you're coming from. Anyone who has dipped their toes in embedded programming will immediately know they are equivalent, and many will correct /2 to a bitshift, because that's what you want to happen.

I get that bit twiddling is obscure outside of low level programming, but bit shifts really are kindergarten stuff in this domain.

dpacmittal 4 years ago |

You could also do (a + (b-a)/2) where a is the smaller number.
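When it isn't known in advance which operand is smaller, the same idea needs a comparison first. A small sketch, with `average_ordered` as an illustrative name:

```c
static unsigned average_ordered(unsigned a, unsigned b)
{
    unsigned lo = a < b ? a : b;   /* smaller of the two */
    unsigned hi = a < b ? b : a;   /* larger of the two */
    return lo + (hi - lo) / 2;     /* hi - lo cannot underflow, and the sum stays <= hi */
}
```

The comparison is the cost that the other formulas in this thread try to avoid.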

dathinab 4 years ago |

how is turning (a+b)/2 into a/2 + b/2 + (a&b&1) even patentable?

Turning (a+b)/2 into a/2 + b/2 is basic obvious math.

If you do that and do any basic testing, you will realize you are getting off-by-one errors. Looking at them then makes it obvious when they appear and hence how to fix them.

Sure, a proof is more complex, but you can trivially test it over all possible inputs for smaller bit widths, making a proof unnecessary (for those widths).

This is a solution a not-yet-graduated bachelor's student can find in less than a day.

Granting a patent for this should lead to disciplinary measures against the person who granted it, tbh.
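The exhaustive test over a small bit width mentioned above can be sketched like this for 8-bit values. The function names are made up for the example, and the reference result is computed in a wider type where `a + b` cannot overflow:

```c
#include <stdint.h>

/* The formula under test, restricted to 8-bit operands. */
static uint8_t average_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a / 2) + (b / 2) + (a & b & 1));
}

/* Counts the input pairs where the formula disagrees with the
   straightforward average computed without overflow. */
static unsigned count_mismatches_u8(void)
{
    unsigned mismatches = 0;
    for (unsigned a = 0; a < 256; a++) {
        for (unsigned b = 0; b < 256; b++) {
            unsigned expected = (a + b) / 2;
            if (average_u8((uint8_t)a, (uint8_t)b) != expected)
                mismatches++;
        }
    }
    return mismatches;
}
```

Dropping the `(a & b & 1)` term from `average_u8` makes the count nonzero, which is the off-by-one pattern that basic testing would reveal.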

phs318u 4 years ago |

I got a pang of nostalgia seeing the Alpha AXP instructions.

avmich 4 years ago |

Does this all work with BCD encoding?

blobbers 4 years ago |

This guy hacks.

kingcharles 4 years ago |

Some unreal solutions here that show how amazing mathematics can be. Especially that Google patented method that only just recently expired.

Props for including the assembler breakdown for every major CPU architecture.

readthenotes1 4 years ago | |

It was a Samsung patent. Only the document was hosted by Google

ijidak 4 years ago | |

I had to lol when I saw there was a patent for that.

Divide both operands by 2 was my first idea before loading the page. (I like to try that sometimes before reading the articles.)

I didn't think about the carry bit, but it seems like that would be a logical solution after 5 minutes of extra thinking.

I'm not sure how that's patentable.

That's insane to me.

But maybe there is more to it.

I didn't read the patent itself.

staticassertion 4 years ago | | |

https://patents.google.com/patent/US6007232A/en

The patent is for a circuit design to perform that algorithm in a single cycle. The algorithm was never patented, nor could it be.