Run DeepSeek R1 Dynamic 1.58-bit (unsloth.ai)

767 points by noch 1 year ago | 332 comments

Jasondells 1 year ago |

An 80% size reduction is no joke, and the fact that the 1.58-bit version runs on dual H100s at 140 tokens/s is kind of mind-blowing. That said, I’m still skeptical about how practical this really is for most people. Like, yeah, you can run it on 24GB VRAM or even with just 20GB RAM, but "slow" is an understatement—those speeds would make even the most patient person throw their hands up.

And then there’s the whole repetition issue. Infinite loops with "Pygame’s Pygame’s Pygame’s" kind of defeats the point of quantization if you ask me. Sure, the authors have fixes like adjusting the KV cache or using min_p, but doesn’t that just patch a symptom rather than solve the actual problem? A fried model is still fried, even if it stops repeating itself.

On the flip side, I love that they’re making this accessible on Hugging Face... and the dynamic quantization approach is pretty brilliant. Using 1.58-bit for MoEs and leaving sensitive layers like down_proj at higher precision—super clever. Feels like they’re squeezing every last drop of juice out of the architecture, which is awesome for smaller teams who can’t afford OpenAI-scale hardware.

"accessible" still comes with an asterisk. Like, I get that shared memory architectures like a 192GB Mac Ultra are a big deal, but who’s dropping $6,000+ on that setup? For that price, I’d rather build a rig with used 3090s and get way more bang for my buck (though, yeah, it’d be a power hog). Cool tech—no doubt—but the practicality is still up for debate. Guess we'll see if the next-gen models can address some of these trade-offs.

danielhanchen 1 year ago | |

Oh the repetition issue is only on the non dynamic quants :) If you do dynamic quantization and use the 1.58bit dynamic quantized model the repetition issue fully disappears!

Min_p = 0.05 was a way I found to counteract the 1.58bit model generating singular incorrect tokens which happen around 1 token per 8000!
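For readers unfamiliar with min_p, here is a minimal sketch of the general idea: drop every token whose probability is below `min_p` times the probability of the most likely token, then renormalize and sample. The toy distribution and the function names are made up for illustration; real inference engines apply this to logits over the whole vocabulary and may combine it with other samplers.

```python
import random

def min_p_filter(probs, min_p=0.05):
    """Keep tokens with probability >= min_p * (top probability), then renormalize."""
    threshold = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

def sample(probs, rng=random):
    """Draw one token from a {token: probability} mapping."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

# Hypothetical next-token distribution with a low-probability "junk" tail.
toy = {"display": 0.90, "window": 0.07, "screen": 0.02, "Pygame's": 0.01}

filtered = min_p_filter(toy, min_p=0.05)   # threshold is 0.05 * 0.90 = 0.045
print(filtered)                            # only "display" and "window" survive
print(sample(filtered, random.Random(0)))
```

With a threshold relative to the top token, rare stray tokens get cut when the model is confident, while flat distributions keep most of their candidates.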

smcleod 1 year ago | | |

min_p is great, do you apply a small amount of temperature as well?

huijzer 1 year ago | |

> That said, I’m still skeptical about how practical this really is for most people.

I've been running Open WebUI for months now for me and some friends as a front-end to one of the API providers (deepinfra in my case, but there are many others, see https://artificialanalysis.ai/).

Having 1.58-bit is very practical for me. I'm very much looking forward to the API provider adding this model to their system. They also added a Llama turbo (also quantized) a few months back so I have high hopes.

danielhanchen 1 year ago | | |

Oh I love Open WebUI as well!! But glad to hear the 1.58bit version could be helpful to you!

rafaelmn 1 year ago | |

>Like, I get that shared memory architectures like a 192GB Mac Ultra are a big deal, but who’s dropping $6,000+ on that setup?

AMD's Strix Halo APU will have quad-channel memory and will launch soon, so expect these kinds of setups to be available for much less. Apple is charging an arm and a leg for memory upgrades; hopefully we get competition soon. From what I saw at CES, OEMs are paying attention to this use case as well - hopefully not following suit on RAM markups.

sliken 1 year ago | | |

Keep in mind the strix halo APU has a 256 bit wide memory bus and the Mac Ultra has a 1024 bit wide memory bus.

Here's hoping the Nvidia Digit (GB10 chip) has a 512 bit or 1024 bit wide interface, otherwise the Strix Halo will be the best you can do if you don't get the Mac Ultra.

dagmx 1 year ago | | |

Unfortunately, Apple’s RAM and Storage upgrade prices are very in line with other class comparable OEMs.

I’m sure there’ll be some amount of undercutting but I don’t think it’ll be a huge difference on the RAM side itself.

jairuhme 1 year ago | |

At my work, we self-host some models and have found that for anything remotely similar to RAG or use cases that are very specific, the quantized models have proven to be more than sufficient. This helps us keep them running on smaller infra and generally lower costs

michaelt 1 year ago | | |

Personally I've noticed major changes in performance between different quantisations of the same model.

Mistral's large 123B model works well (but slowly) at 4-bit quantisation, but if I knock it down to 2.5-bit quantisation for speed, performance drops to the point where I'm better off with a 70B 4-bit model.

This makes me reluctant to evaluate new models in heavily quantised forms, as you're measuring the quantisation more than the actual model.
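A rough back-of-the-envelope sketch of why that trade-off is tempting: weight storage scales with parameter count times bits per weight. The helper below is an illustration only; it ignores the KV cache, activations and per-block quantization overhead, and the bit widths are taken at face value from the comment above.

```python
def weight_size_gb(n_params_billion, bits_per_weight):
    """Approximate weight storage in GB (10^9 bytes); ignores KV cache and overhead."""
    return n_params_billion * bits_per_weight / 8

configs = [
    ("123B @ 4-bit", 123, 4.0),
    ("123B @ 2.5-bit", 123, 2.5),
    ("70B @ 4-bit", 70, 4.0),
]
for name, params, bits in configs:
    print(f"{name}: ~{weight_size_gb(params, bits):.1f} GB")
```

On these numbers the 2.5-bit 123B model and the 4-bit 70B model land in a similar memory budget, which is why the comparison between them is the interesting one.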

smcleod 1 year ago | |

I just ran it up on 48gb (2x 3090) + overflow into CPU RAM and it runs at around 4tk/s (only a little 8k context size though) which while absolutely not something I'd personally use daily - it is actually usable.

eurekin 1 year ago | | |

I have a similar set-up - can you help out with running it? Was it in ollama?

EDIT: It seems that original authors provided a nice write-up:

https://unsloth.ai/blog/deepseekr1-dynamic#:~:text=%F0%9F%96...

ryao 1 year ago | | |

Which one did you run?

elorant 1 year ago | |

Not everyone needs the largest model. There are variations of R1 with fewer parameters that can easily run on consumer hardware. With an 80% size reduction you could run 70B on 8-bit on an RTX 3090.

Other than that, if you really need the big one you can get six 3090s and you're good to go. It's not cheap, but you're running a ChatGPT equivalent model from your basement. A year ago this was a wetdream for most enthusiasts.

whimsicalism 1 year ago | | |

There’s a huge difference both in capabilities and in meaning between “variations of r1” and “r1 distill”. ollama is intentionally misleading people on this but the distills are much much worse

Kye 1 year ago | | |

I ran whatever version Ollama downloaded on a 3070ti (laptop version). It's reasonably fast. Generative stuff can get weird if you do prompts like "in the style of" or "a new episode of" because it doesn't seem to have much pop culture in its training data. It knows the Stargate movie, for example, and seems to have the IMDB info for the series, but goes absolutely ham trying to summarize the series.

This line in the stuff inside the <think> section suggests it's also been trained on YouTube clips:

>> "I'm not entirely sure if I got all the details right, but this is what I remember from watching clips and summaries online."

An excerpt from the generated summary:

>> "Set in the 23rd century during a Z-Corp invasion, the series features action sequences, strategic thinking, and humor. It explores themes of international espionage, space warfare, and humanity's role in the cosmos. The show incorporates musical numbers and catchy theme songs for an engaging viewing experience. The plot involves investigating alien warships and their secret base on Kessari planet while addressing personal conflicts and philosophical questions about space."

"It explores themes of international espionage, space warfare, and humanity's role in the cosmos" is the closest to correct line in the whole output.

brookst 1 year ago | | |

Or if you want a large model but don’t need high performance, get a Mac with 128GB UMA.

F7F7F7 1 year ago | |

People would only be 'throwing their hands up' because commercial LLMs have set unreasonable expectations for folks.

Anyone who has a/the need for or understands the value of a local LLM would be OK with this kind of output.

bnchrch 1 year ago | | |

Everyone has the need for on device LLM, if the response rate was fast!

goosejuice 1 year ago | | |

I use commercial LLMs every day. The best of them can still be infuriating at times to the point of being unproductive. So I'm not sure I agree here.

ricardobeat 1 year ago | |

The repetition issue happens on simple quantization, what they are releasing is an approach that fixes that.

danielhanchen 1 year ago | | |

Yes exactly! I edited the blog post to make the wording a bit better!

JKCalhoun 1 year ago | |

Layman here — but I am hopeful for 1.58 bit plus custom silicon to be the Holy Grail. I suppose I am setting high expectations on Apple to integrate said in their next "A" chip.

Wishful thinking.

danielhanchen 1 year ago | | |

Ye a custom chip would be insane! 1.5 bit with a scaling factor seems to be actually usable for MoEs with shared experts!
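As an illustration of "ternary weights plus a scaling factor", here is a minimal sketch of absmean quantization in the style of the BitNet b1.58 paper linked elsewhere in this thread. This is an assumption about the general technique, not a description of how the Unsloth GGUF quants are actually built; the function names and the sample weights are hypothetical.

```python
def ternary_quantize(weights, eps=1e-8):
    """Map weights to {-1, 0, 1} using one shared scale (mean absolute value)."""
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    # Round to the nearest integer, then clip into the ternary range.
    q = [max(-1, min(1, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Reconstruct approximate weights from ternary codes and the scale."""
    return [v * scale for v in q]

weights = [0.9, -0.05, 0.4, -1.2]      # made-up example row
q, scale = ternary_quantize(weights)
print(q, round(scale, 4))
print([round(w, 4) for w in dequantize(q, scale)])
```

Only the codes need about 1.58 bits each; the scale is stored once per group of weights, so its cost is amortized.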

sliken 1 year ago | |

I do want a 192GB Mac Ultra, I'm hoping the Nvidia Digit achieves similar at $3,000. Sadly no specifications or benchmarks, so tokens/sec is just a guess at this point.

yodsanklai 1 year ago | |

> I’d rather build a rig with used 3090s and get way more bang for my buck

I'm curious, what would you use that rig for?

apples_oranges 1 year ago |

Random observation 1: I was running DeepSeek yesterday on my Linux machine with an RTX 4090 and I noticed that the models need to fit into VRAM, which is 24GB; otherwise they are simply slow. So the Apple shared memory architecture has an advantage here. A 192GB Mx Ultra can load and process large models efficiently.

Random observation 2: It's time to cancel the OpenAI subscription.

mtrovo 1 year ago |

Wow, an 80% reduction in size for DeepSeek-R1 is just amazing! It's fantastic to see such large models becoming more accessible to those of us who don't have access to top-tier hardware. This kind of optimization opens up so many possibilities for experimenting at home.

I'm impressed by the 140 tokens per second speed with the 1.58-bit quantization running on dual H100s. That kind of performance makes the model practical for small or mid sized shops to use it for local applications. This is a huge win for people working on agents that require low latency that only local models could support.

paradite 1 year ago | |

Btw completely off topic, but your comment triggered the internal classification in my brain, and it looks AI-generated.

Not accusing you of anything. Could be that you happen to write in a way similar to LLMs. Could be that we are influenced by LLM writing styles and are writing more and more like LLMs. Could be that the difference between LLM generated content and human-generated content is getting smaller and harder to tell.

j_bum 1 year ago | | |

+1 my LLM spidey senses were tingling.

It’s the exclamation point in the first paragraph, the concise and consistent sentence structure, and the lack of colloquial tone.

OP, no worries if you’re real. I often read my own messages or writing and worry that people will think I’m an LLM too.

mtrovo 1 year ago | | |

haha you got me. I'm a real person using an LLM to proofread the stuff I write. English is not my native language and I'm trying to improve my written vocabulary a little bit. Sorry if it reads a little bit off.

ahmeneeroe-v2 1 year ago | | |

Very funny, I didn't mentally jump to LLM, but the language was so lifeless that I stopped reading.

Amazing that OP confirmed you're correct (and good use of LLM @OP).

danielhanchen 1 year ago | |

I was pleasantly surprised by 140 tokens/s as well! I literally thought I did something wrong but it was real!

raghavbali 1 year ago |

> Unfortunately if you naively quantize all layers to 1.58bit, you will get infinite repetitions in seed 3407: “Colours with dark Colours with dark Colours with dark Colours with dark Colours with dark” or in seed 3408: “Set up the Pygame's Pygame display with a Pygame's Pygame's Pygame's Pygame's Pygame's Pygame's Pygame's Pygame's Pygame's”.

This is a really interesting insight (although other works cover this as well). I am particularly amused by the process by which the authors of this blog post arrived at these particular seeds. Good work nonetheless!

brap 1 year ago |

As someone who is out of the loop, what’s the verdict on R1? Was anyone able to reproduce the results yet? Is the claim that it only took $5M to train generally accepted?

It’s a very bold claim which is really shaking up the markets, so I can’t help but wonder if it was even verified at this point.

DogRunner 1 year ago |

>For optimal performance, we recommend the sum of VRAM + RAM to be at least 80GB+.

Oh nice! So I can try it in my local "low power/low cost" server at home.

My home system runs on a Ryzen 5500 + 64GB RAM + 7x RTX 3060 12GB

So 64GB RAM plus 84GB VRAM

I don't want to brag, just to point out solutions for us tinkerers with a small budget and high energy costs.

Such a system can be built for around 1600 euro. The power consumption is around 520 watts.

I started with an AM4 board (B450 chipset) and one used RTX 3060 12GB, which costs around 200 euro used if you are patient.

Every additional GPU is connected with a PCIe riser/extender to give the cards enough space.

After a while I replaced the PCIe cards with a single PCIe x4 to 6x PCIe x1 extender.

It runs pretty nice. Awesome to learn and gain experience

cubefox 1 year ago |

For anyone wondering why "1.58" bits: 2^1.58496... = 3. The weights have one of the three states {-1, 0, 1}.

dist-epoch 1 year ago | |

They say something else:

> We managed to selectively quantize certain layers to higher bits (like 4bit), and leave most MoE layers (like those used in GPT-4) to 1.5bit

cubefox 1 year ago | | |

That was just improper rounding from 1.58 to 1.5. They say 1.58 in other places and explicitly link to https://arxiv.org/abs/2402.17764

tarruda 1 year ago |

Would be great if the next generation of base models was designed to be inferred with 128GB of VRAM while 8bit quantized (which would fit in the consumer hardware class).

For example, I imagine a strong MoE base with 16 billion active parameters and 6 or 7 experts would keep a good performance while being possible to run on 128GB RAM macbooks.

danielhanchen 1 year ago | |

So I remember Deepseek used float8 for training - Character AI also used int8 for training - it is indeed possible, but sometimes training can be unstable - Deepseek to my knowledge is actually the first lab to use float8 at a large scale without causing loss spikes - they used FP8 tensor cores, then every 4th matrix multiply, they accumulated to a FP32 accumulator - it seems like the Hopper Tensor Cores accumulation mechanism might not be actual FP32 accumulation. I wrote more here: https://x.com/danielhanchen/status/1872719599029850391

Davidzheng 1 year ago | |

Would be great, but unfortunately I think intelligence at that compute scale will be limited by the hardware, not the model. Though at the hardware limit I would expect it to be roughly human level, especially if optimized for a particular domain.

tarruda 1 year ago | | |

I remember that Llama 3 was trained on data curated by Llama 2 and it resulted in a model with a significant performance boost (even though it was trained by a previous generation model of the same size).

Maybe by using a strong reasoning model such as R1 for the next generation, even more performance can be extracted from smaller models.

TheTaytay 1 year ago |

Danielhanchen, your work is continually impressive. Unsloth is great, and I’m repeatedly amazed at your ability to get up to speed on a new model within hours of its release, and often fix bugs in the default implementation. At this point, I think serious labs should give you a few hour head start just to iron out their kinks!

danielhanchen 1 year ago | |

Oh thanks a lot! Appreciate it :) We're always open to collaborating with anyone!

afro88 1 year ago |

The size reduction while keeping the model coherent is incredible. But I'm skeptical of how much effectiveness was retained. Flappy bird is well known and the kind of thing a non-reasoning model could get right. A better test would be something off the beaten path that R1 and o1 get right that other models don't.

whimsicalism 1 year ago | |

yeah it is pretty unclear how lobotomized it is without benchmark.

i’ve gotten full fp8 running on 8xh100, probably going to keep doing that

hendersoon 1 year ago |

The size reduction is impressive but unless I missed it, they don't list any standard benchmarks for comparison so we have no way to tell how it compares to the full-size model.

amusingimpala75 1 year ago |

> DeepSeek-R1 has been making waves recently by rivaling OpenAI's O1 reasoning model while being fully open-source.

Do we finally have a model with access to the training architecture and training data set, or are we still calling non-reproducible binary blobs without source form open-source?

stackedinserter 1 year ago | |

It sounds as if they owe you the training architecture and training data set.

chris_pie 1 year ago | | |

It absolutely doesn't. It sounds like further diluting the term "open-source" isn't great.

miohtama 1 year ago |

Flappy Bird in Python is the new Turing test

danielhanchen 1 year ago | |

:) It's my goto test :) I did amp it up by adding 10 conditions and made a scoring card - I found the original R1 to sometimes forget "import os" or miss some lines as well, so I thought it was at least a good check!

I also like to ask the models to create a simple basic Minecraft type game where you can break pieces and store them in your inventory, but disallow building stuff

miohtama 1 year ago | | |

I feel any AI can fix those problems when they can finally act. The problem is that AIs cannot run or debug code, or even book a hotel for me. When that is solved and an AI can interact with the code like a human does, it can fix its problems like a human does.

mclau156 1 year ago | |

hopefully we eventually push them to make more classic games like motherlode

ThePhysicist 1 year ago |

In general, how do you run these big models on cloud hardware? Do you cut them up layer-wise and run slices of layers on individual A100/H100s?

phire 1 year ago | |

My understanding is with MoE (Mixture of Experts), you can and should shard it horizontally. The whole model is 600GB, but only 37GB is active during the evaluation of any single output token.

So you can load a different active subset of the MoE into each 80GB GPU, sharding it across something like 32 different GPUs (or can you get away with fewer? Wouldn't be surprised if they can infer on 8x H800 GPUs). Some parameters are common, others are independent. Queries can be dynamically routed between GPUs, potentially bouncing between GPUs as much as once per output token, depending on which experts they need to activate.

Though, I suspect it's normal to stick on one MoE subset for several output tokens.

This has a secondary benefit that as long as the routing distribution is random, queries should be roughly load balanced across all GPUs.

yorwba 1 year ago | | |

Each MoE layer has its own router, and it activates 8 (out of 256) experts at a time. There's no reason to expect all of them to stay on the same GPU, so you're pretty much guaranteed to have to do all-to-all communication between the GPUs in your cluster after every layer for every token.
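To make the "8 out of 256" point concrete, here is a minimal sketch of generic top-k routing for a single token in a single layer. It is an illustration under assumptions: the logits are random, the gate is a plain softmax over the selected experts (DeepSeek's actual gating differs in its details), and the placement of 32 contiguous experts per GPU on an 8-GPU cluster is hypothetical.

```python
import math
import random

N_EXPERTS = 256   # experts per MoE layer, as stated in the comment above
TOP_K = 8         # experts activated per token
N_GPUS = 8        # hypothetical cluster size, for illustration only

def route_token(router_logits, k=TOP_K):
    """Return [(expert_index, gate_weight)] for the k highest-scoring experts."""
    top = sorted(range(len(router_logits)),
                 key=router_logits.__getitem__, reverse=True)[:k]
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def gpu_of(expert_index, n_experts=N_EXPERTS, n_gpus=N_GPUS):
    """Hypothetical placement: contiguous blocks of experts per GPU."""
    return expert_index // (n_experts // n_gpus)

rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(N_EXPERTS)]  # stand-in router output
chosen = route_token(logits)
gpus_hit = sorted({gpu_of(i) for i, _ in chosen})
print("experts:", [i for i, _ in chosen])
print("GPUs touched by this token in this layer:", gpus_hit)
```

Because every layer runs its own router, the set of GPUs touched generally changes from layer to layer, which is where the all-to-all traffic comes from.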

danielhanchen 1 year ago | |

There are a few ways - the most basic is per layer sharding - DeepSeek uses 3 dense layers, so that can stay on GPU0 (with the embedding layer). There's 58 MoE layers (256 experts, 8 activated) and 1 shared expert per layer. GPU1 would house layers 3 to 9, and so on.

Then by using pipeline parallelism, if a new request comes, we simply stick them in a queue - GPUs 0, 1, 2, ..., 8. Request A is at GPU 2, Request B at GPU 1, Request C at GPU 0 and so on.

The other option is tensor parallelism where we split the weights evenly. You could combine pipeline and tensor parallelism as well!
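The per-layer sharding described above can be sketched as a simple placement table. The layer counts (3 dense, 58 MoE) and the "layers 3 to 9 on GPU1" split come from the comment; the 9-GPU count is read off the "GPUs 0, 1, 2, ..., 8" example, and the rule for where the two leftover layers go is an assumption made here for illustration.

```python
def shard_layers(n_dense=3, n_moe=58, n_gpus=9):
    """GPU 0 holds the dense layers; MoE layers are split across the other GPUs."""
    placement = {0: list(range(n_dense))}
    moe_gpus = n_gpus - 1
    base, extra = divmod(n_moe, moe_gpus)
    start = n_dense
    for gpu in range(1, n_gpus):
        # Assumption: leftover layers go to the last GPUs in the pipeline.
        count = base + (1 if gpu > moe_gpus - extra else 0)
        placement[gpu] = list(range(start, start + count))
        start += count
    return placement

placement = shard_layers()
for gpu, layers in placement.items():
    print(f"GPU{gpu}: layers {layers[0]}-{layers[-1]} ({len(layers)} layers)")
```

With pipeline parallelism, each request then walks through GPU0, GPU1, ... in order, so several requests can be in flight at different stages at once.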

amelius 1 year ago | |

You could do that, and add pipelining to improve speed.

teekert 1 year ago | |

Was wondering the same, but for HPC clusters :)

ggm 1 year ago |

If I invested in a 100x machine because I needed 100 of x to run, and somebody shows how 10x can work, why have I not just become the holder of 10 10x machines, and therefore have already achieved capex to exploit this new market?

I cannot understand why "openai is dead" has legs: repurpose the hardware and data and it can be multiple instances of the more efficient model.

stevenhuang 1 year ago | |

because of discounted cash flow/valuation models.

you invest in a 100x machine expecting a revenue of X, but now you can only charge X/100 because R1 shows that AI inference can be done much more efficiently. see the price decrease of ChatGPT and addition of free O3 etc.

this reduction of future cash flows, ceteris paribus, implies that the present value of these cash flows decrease. this then results in massive repricing to the downside as market participants update their forecasts.

what you are missing is that to assume as you do, you must make the additional assumption that demand for additional compute is infinite. Which may very well be the case, but it is not guaranteed compared to the present realized fact that R1 means lower revenues for AI inference providers -> changes the capex justification for even more hardware -> NVDA receives less revenue.
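The discounting argument can be made concrete with a tiny present-value calculation. All numbers here are hypothetical (five years of cash flows, a 10% discount rate, revenue falling to 1/100); the point is only that present value scales directly with the expected cash flows.

```python
def present_value(cash_flows, discount_rate):
    """Discount yearly cash flows (the first arrives one year from now) to today."""
    return sum(cf / (1.0 + discount_rate) ** year
               for year, cf in enumerate(cash_flows, start=1))

before = present_value([100.0] * 5, 0.10)   # expected revenue X per year
after = present_value([1.0] * 5, 0.10)      # same horizon at X/100 per year
print(f"PV before: {before:.2f}, PV after: {after:.2f}")
```

Whether demand grows enough to offset the lower price per unit is exactly the extra assumption the comment says you would have to make.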

ggm 1 year ago | | |

Thanks. Always a mistake to assume the price of something is bound to your own cost of doing it: the price is bound to the other guy's cheaper price!

xiphias2 1 year ago |

Has it been tried on 128GB M4 MacBook Pro? I'm gonna try it, but I guess it will be too slow to be usable.

I love the original DeepSeek model, but the distilled versions are too dumb usually. I'm excited to try my own queries on it.

rahimnathwani 1 year ago | |

> I love the original DeepSeek model, but the distilled versions are too dumb usually.

Apart from being dumber, they also don't know as much as R1. I can see how fine-tuning can improve reasoning capability (by showing examples of good CoT) but there's no reason that would improve the knowledge of facts (relative to the Qwen or Llama model on which the finetuning was based).

emseetech 1 year ago | |

I'm downloading it now and will report back.

(I've been using the 32B and while it could always be better, I'm not unhappy with it)

TheTaytay 1 year ago | | |

How'd it go, and which client are you using? :)

Pxtl 1 year ago |

Is there any good quick summary of what's special about DeepSeek? I know it's OSS and incredibly efficient, but news laymen are saying it's trained purely on AI info instead of using a corpus of tagged data... which, I assume, means it's somehow extracting weights or metadata or something from other AIs. Is that it?

rahimnathwani 1 year ago | |

> Is there any good quick summary of what's special about DeepSeek?

Yes, section 2.3 of the Deepseek R1 paper summarizes the training part you're asking about, in less than a page.

https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSee...

Dwedit 1 year ago |

Is this actually 1.58 bits? (Log base 2 of 3) I heard of another "1.58 bit" model that actually used 2 bits instead. "1.6 bit" is easy enough, you can pack five 3-state values into a byte by using values 0-242. Then unpacking is easy, you divide and modulo by 3 up to five times (or use a lookup table).
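The packing scheme described above can be sketched in a few lines: treat five ternary values as the digits of a base-3 number, which fits in a byte because 3^5 = 243 <= 256. The function names are made up for illustration, and this is not a claim about how any particular file format stores its weights.

```python
def pack_trits(trits):
    """Pack five values from {-1, 0, 1} into one integer in 0..242 (base 3)."""
    if len(trits) != 5 or any(t not in (-1, 0, 1) for t in trits):
        raise ValueError("expected exactly five values from {-1, 0, 1}")
    value = 0
    for t in reversed(trits):          # first trit becomes the lowest digit
        value = value * 3 + (t + 1)    # shift -1/0/1 to digits 0/1/2
    return value

def unpack_trits(byte):
    """Inverse of pack_trits: repeated divide-and-modulo by 3."""
    trits = []
    for _ in range(5):
        byte, digit = divmod(byte, 3)
        trits.append(digit - 1)
    return trits

example = [1, 0, -1, 1, 0]
packed = pack_trits(example)
print(packed, unpack_trits(packed))
print("bits per weight:", 8 / 5)       # 1.6, versus log2(3) of about 1.585
```

A 243-entry lookup table mapping each byte to its five values would replace the divisions, as the comment suggests.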

danesparza 1 year ago |

Just ask it about Taiwan (not kidding). I'm not sure I can trust a model that has such a focused political agenda.

MyFirstSass 1 year ago |

Is this akin to the quants already being done to various models when you download a GGUF at 4 bits for example, or is this variable layer compression something new that can also make existing smaller models smaller so we can fit more into, say, 12 or 16 GB of VRAM?

beernet 1 year ago |

Big fan of unsloth, they have huge potential, could definitely need some experienced GTM people though, IMO. The pricing page and messages sent there are really not good.

danielhanchen 1 year ago | |

Oh thanks :) Yes agreed we do need better GTM - temporarily it's still me and my brother running Unsloth, so for now we're just prioritizing many more engineering releases :)

slewis 1 year ago |

It would be really useful to see these evaluated across some of the same evals that the original R1 and deepseek's distills were evaluated on.

patleeman 1 year ago |

Incredible work by the Unsloth brothers again. It’s really cool to see bitnet quantization implemented like this.

CHB0403085482 1 year ago |

DeepSeek R1 in a nutshell

youtube.com/watch?v=Nl7aCUsWykg

upghost 1 year ago |

Thanks for the run instructions, unsloth. Deepseek is so new it's been breaking most of my builds.

marcodiego 1 year ago | |

This is an important step. Especially for beginners or people who are not in the loop, being able to easily type some simple commands to download, install dependencies, compile and run everything needed for an LLM feels like sci-fi; it's almost like you can have a helping brain at home.

One thing I've been thinking about doing is to combine one of those LLMs running in llama.cpp, feed it with the output of whisper.cpp and connect its output to some TTS model. I wonder how far that is from Wheels and Roadie from the Pole Position TV series.

danielhanchen 1 year ago | |

Glad they were helpful! :)

indigodaddy 1 year ago |

Is there any small DS or qwen model that could run on say an M4 Mac Mini Standard (16G) ?

techwiz137 1 year ago |

How can you have a bit and a half exactly? It doesn't make sense.

dosinga 1 year ago | |

It's not a bit and a half. It is 1.58 or really log(3) / log(2) since it allows for three values, -1, 0 and 1

mclau156 1 year ago |

Is the new LLM benchmark to create flappy bird in pygame?

CodeCompost 1 year ago |

Can I run this on ollama?

benoitg 1 year ago | |

Yes, the instructions are in the OP.

petesergeant 1 year ago |

It is going to be truly fucking revolutionary if open-source models are and continue to be able to challenge the state of the art. My big philosophical concern is that AI locks Capital into an absolutely supreme and insurmountable lead over Labour, and into the hands of oligarchs, and the possibility of a future where that's not the case feels amazing. It pleases me greatly that this has Trump riled up too, because I think it means he's much less likely to allow existing US model-makers to build moats, as I think he's -- even as a man who I don't think believes in very much -- absolutely unwilling to let the Chinese get the drop on him over this.

fullstackchris 1 year ago | |

I have no doubt open source will catch up (it already has, eh?) at the end of the day, it's just creative / new iterations on what is ultimately the transformer architecture... the amount of "secret" moat-like stuff that OpenAI was doing was bound to be figured out or exceeded eventually, like everything in tech...

Not to make fun of OpenAI and the great work they've done but it's kinda like if I went out in the 90s and said I'm going to found a company to have the best REST APIs. You can always found a successful tech company, but you can't found a successful tech company on a technological architecture or pattern alone.

sylware 1 year ago |

site is javascript walled

80%? On 2 H100 only? To get near chatgpt 4? Seriously? The 671B version??

whimsicalism 1 year ago | |

they have not benchmarked the quantized model.

fsflover 1 year ago | |

> site is javascript walled

I use Qubes OS to protect myself from the JS.

sylware 1 year ago | | |

That site should work with a noscript/basic (x)html browser.

bluesounddirect 1 year ago |

Hi small comment, please remember in china many things are sponsored by or subsidized by the government. "We[china] can do it for less.." , "it's cheaper in china.." only means the government gave us a pile of cash and help to get here .

I 100% expect some downvotes from the ccp.

kccqzy 1 year ago | |

And the United States subsidizes plenty of things too. For example the CHIPS act has $39 billion in subsidies for chip manufacturing on U.S. soil. There's nothing wrong with either country's subsidies. I personally don't believe in maximum free market. Government subsidy is more often than not a good thing and we need more of them both here and in China.

tivert 1 year ago | |

> Hi small comment, please remember in china many things are sponsored by or subsidized by the government. "We[china] can do it for less.." , "it's cheaper in china.." only means the government gave us a pile of cash and help to get here .

And that's a really important strategic advantage China has versus America, which has such an insane fixation on pure(ish) free markets and free trade that it gives away its advantages in strategic industry after strategic industry.

Some people falsely infer from the experience with the Soviet Union that freer markets always win geopolitical competition, but that's false.

syndicatedjelly 1 year ago | | |

It’s false except for every time that it has been true

cynicalpeace 1 year ago | | |

> And that's a really important strategic advantage China has versus America, which has such an insane fixation on pure(ish) free markets and free trade that it gives away its advantages in strategic industry after strategic industry.

> Some people falsely infer from the experience with the Soviet Union that freer markets always win geopolitical competition, but that's false.

The data we have is 500 years of free markets in the western world and the verdict is overwhelmingly: Yes, more freedom means more winning.

Just invite some incompetent bureaucrat over your house to dictate how you should cook and you'll quickly agree.