Cloud, Why So Difficult? (winglang.io)
aws is also inescapably imperative. it’s how the api and everything behind the scenes work.
aws is gonna have a lot of nonsense best practices, both 1st and 3rd party, that you have to aggressively ignore.
if you can come to peace with these three truths, aws is great. i use it like this:
https://github.com/nathants/libaws
for those trying to improve, the best aws docs are currently the go sdk and gopls.
It's just so unnecessary. Greybeards moved to the cloud but never changed.
After spending a couple of days in terraform (I'm no infra expert) creating roles to assume, cross account permissions, modifying my script to assume those roles, figuring out the ECS run-task api syntax and some other things I'd rather forget about I kicked off the jobs to copy the data and left for the weekend. Sunday cost alert email put that thought out of my head: I just spent 8k USD on writing 2 billion rows to two tables because I misunderstood how Dynamo charges for write units. I thought I was going to spend a few hundred (still a lot, but our bill is big anyway) because I'm doing batch write requests of 25 items per call. But the dynamodb pricing doesn't care about API calls, it cares about rows written, or write capacity units, or something. OK, so how do we backfill all the historical data into Dynamo without it costing like a new car for two tables?
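To make the billing surprise concrete, here is a rough sketch of how on-demand write cost scales. A standard write consumes one write request unit per 1 KB of item size, rounded up, and BatchWriteItem consumes the same units as the equivalent individual writes. The price of $1.25 per million write request units and the item sizes are assumptions for illustration; check the current pricing for your region.

```python
import math


def write_request_units(item_size_bytes: int) -> int:
    """Units for one standard write: 1 per started KB of item size."""
    return max(1, math.ceil(item_size_bytes / 1024))


def backfill_cost_usd(rows, item_size_bytes, usd_per_million_wru=1.25):
    """Estimated on-demand cost of writing `rows` items of a given size.

    usd_per_million_wru is an assumed price, not a quoted one.
    """
    units = rows * write_request_units(item_size_bytes)
    return units / 1_000_000 * usd_per_million_wru


# Batching 25 items per call saves round trips, not write units:
# the cost depends only on how many items are written and how big they are.
small_items = backfill_cost_usd(rows=2_000_000_000, item_size_bytes=500)
large_items = backfill_cost_usd(rows=2_000_000_000, item_size_bytes=1500)
print(small_items, large_items)
```

Under these assumed numbers, crossing the 1 KB boundary doubles the bill, which is the kind of detail that is easy to miss when thinking in API calls.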
Apparently you can create new dynamodb tables via imports from S3 (can't insert into existing tables though) for basically no money (the pricing is incomprehensible but numbers I can find look small). Now I just need to write my statistics to line delimited dynamodb-flavored json in S3 (statements dreamed up by the utterly deranged). You need to put the type you want the value to have as the key to the value you see. A little postgres view with some CTEs to create the dynamodb-json and use the aws_s3.query_export_to_s3 function in RDS Postgres and I had a few hundred GB of nice special-snowflake json in my S3 bucket. Neat!
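For anyone who hasn't seen DynamoDB-flavored JSON, here is a minimal sketch of the "type as the key" encoding in Python rather than in a Postgres view. The row and its field names are made up, and the `{"Item": ...}` wrapper per line is my understanding of what the S3 import expects, so verify it against the import docs before relying on it.

```python
import json


def to_attr(value):
    """Wrap a plain Python value in its DynamoDB type descriptor."""
    if isinstance(value, bool):  # check before int: bool is a subclass of int
        return {"BOOL": value}
    if value is None:
        return {"NULL": True}
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers are sent as strings
    if isinstance(value, list):
        return {"L": [to_attr(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attr(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def to_import_line(row: dict) -> str:
    """One line of line-delimited DynamoDB JSON for a single item."""
    return json.dumps({"Item": {k: to_attr(v) for k, v in row.items()}})


# Hypothetical statistics row, for illustration only.
print(to_import_line({"player_id": "p-17", "games": 42, "active": True}))
```

Sets and binary values (SS, NS, B) are left out to keep the sketch short.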
But the bucket is in the analytics account, and I needed the dynamo tables in the prod and staging accounts. More cross account permissions, more IAM. Now prod and staging can access the analytics bucket, cool! But they aren't allowed to read the actual data because they don't have access to the KMS keys used to encrypt the data in the analytics bucket.
OK, I'll create a user managed KMS key in analytics and more IAM policies to allow prod and staging accounts to use it to decrypt the data. But the data I'm writing from RDS is still using the AWS managed key, even after I set up my aws_s3_bucket_server_side_encryption_configuration in terraform to use my own managed key. Turns out writes from RDS to S3 always use the S3 managed key, no one cares about my aws_s3_bucket_server_side_encryption_configuration. "Currently, you can't export data (from RDS) to a bucket that's encrypted with a customer managed key." Great. So I need to manually (yes I could figure out the aws api call and script it I know) change the encryption settings of the files in S3 after they've been written by RDS to my own custom key. And now, 4 hours of un-abortable dynamodb import jobs later, I finally have my tables in prod and staging in DynamoDB.
Now I just need to figure out the DynamoDB query language to actually read the data in the app. And how to mock that query language and the responses from dynamo.
At least I'm learning a lot...
for round two, try:
- spinup a new subaccount to ensure you have total control of its state.
- data goes in s3 as jsonl/csv/parquet/etc.
- lambdas on cron manage ephemeral ec2 for when heavy lifting is needed for data ingress, egress, or aggregation.
- lambdas on http manage light lifting. grab an object from s3, do some stuff, return some subset or aggregate of the data.
- data granularity (size and layout in s3) depends on use case. think about the latency you want for light and heavy lifting, and test different lambda/ec2 sizes and their performance processing data in s3.
lambda is a supercomputer on demand billed by the millisecond.
ec2 spot is a cheaper supercomputer with better bandwidth on a 30 second delay billed by the second.
bandwidth to s3 is high and free within an AZ, for ec2 and lambda.
bandwidth is so high that you are almost always bottlenecked on [de]serialization, and then on data processing. switch to go, then maybe to c, for cpu work.
dynamodb is great, but unless you need compare-and-swap, it costs too much.
If you want easy (or easier) mode, you'll have to use a Platform-as-a-Service (PaaS).
The major cloud vendors might have problems with quirky designs and poor documentation, but beyond that is necessary complexity.
You want a high-availability website that allows user-uploaded files and does asynchronous task processing? You're probably going to have to get familiar with servers, load balancers, queues, and object storage, at a minimum.
You want it all to be secure? You're going to have to configure network rules/firewalls and set up lots of access policies.
There's no free lunch.
I think people give large organizations credit for being mustache twirlingly evil when the collective consciousness that makes up AWS is simply not smart enough to be this evil. If AWS had the coordination to do this the product would be better.
It's much more likely that the complexity is the result of a huge number of teams working independently and integration complexity being 2^n. Like AWS had one good transformative idea to make coordination easier which is to be API first but that only forces superficial consistency.
Really? I disagree. I could probably build that with Rails and Heroku in an afternoon, after creating a single S3 bucket and an access key for presigned POST. AWS has "necessary complexity" in the same way a giant hole in your head improves your brain's cooling potential. (i.e. maybe, in some very rare cases, but you almost certainly don't need it)
Tangent - Their docs are abysmal. Written like a novel which I'm meant to cross reference to their SDK.
What evidence or convincing arguments are there for your position?
For me it seems clear that this is not the case, the needless friction in eg CDK dev experience seems ridiculous.
The complexity of the cloud exists because it wasn't designed very well and all reactionary.
You look at AWS and it feels like things are getting tacked on because there's "demand" instead of thinking of what the platform should look like and building it out. Every service is done by a different team that doesn't talk to each other as well. There's no consistency anywhere.
> If you want easy (or easier) mode, you'll have to use a Platform-as-a-Service (PaaS).
It's been blurred a long time ago so how do you make this distinction? They all have PaaS features / services.
> There's no free lunch.
You had to pay to begin with so what's free?
> When I started programming, I used Borland C++. It used to take about 100ms to compile and run a program on an IBM PC AT machine (TURBO ON). An average iteration cycle in the cloud takes minutes. Minutes! Sometimes dozens of minutes!
I'm a fast-feedback fan myself, and my weapons of choice in refuge from a dark decade of c++ are the Python notebook and Clojure REPL. With that as it is, the lurching tedium of cloud development (infrastructure especially) makes me want to pull my skin off.
What is so galling about it is that, for dev purposes, almost none of these SaaSes and cloud services are really so 'big' that they couldn't be run on a beefy local workstation for development. The galling reason that I have to wait N minutes for terraform or cdk or whatever to rebuild some junk and deploy it to a bunch of nigh un-remote-debuggerable-without-firewall-shenanigans lambdas and docker containers is commercial moat-keeping for the services.
At least Azure and GCP put some token effort into local emulators of their services. AWS work has to rely on the valiant but incomplete efforts of LocalStack if they want a fast and disposable way to test infra.
patching a lambda zip takes seconds. it’s done before you can alt tab and curl.
You think cloud is too expensive or unnecessary? Fair enough, this tool is not for you.
You think cloud infra is necessarily complex because you need to support <insert use case here>. You're right! This tool is not for you (yet?).
You don't need this because you already know <CDK / Terraform / whatever abstraction is already in your repertoire>? I agree, the juice is probably not worth the squeeze to learn yet another tool.
Are you approaching cloud for the first time or have been managing existing simple infra (buckets, queues, lambdas) via ClickOps and want to explore a feature constrained (hence easy to grok) Infrastructure as Code solution? Maybe give this a look.
While it's still early days, I suspect there will be many who will find this useful, and congratulate the authors for their efforts!
I just don't see this being true. Being "cloud agnostic" likely means it's an incredibly leaky abstraction where at the first sign of trouble, you're going to have to understand winglang + the specific provider API. Any IaC product requires you to intimately understand what it's actually doing if you care about security and performance. Just because it's a managed service doesn't mean you get to ignore all its implementation details, right?
All the cloud providers give you a function as a service, or a nosql database, or a file bucket: ignoring all the nuance as an agnostic is at a minimum leaving optimisation on the table and more likely dangerous and expensive, surely?
The fact they all offer similar but subtly different versions of every type of product and that cross platform tools like Terraform etc have some ability to paper over these only makes it worse. (Your google cloud bucket is just like your S3 bucket right? Until it's not.) When I rant about platform independence people think I have a philosophical objection to lock-in, but it's really much more basic than that. I just don't have time to learn thousands of vendor specific APIs, bugs, constraints etc on top of the perfectly good built up knowledge I have from 25 years of working with software systems already. I am busy using all that time and brainspace trying to keep up with the fundamental knowledge that is actually important.
don’t learn gcp if you know aws. don’t learn android if you know iphone. don’t learn ruby if you know python.
instead use that time to build interesting things. these tools are much more similar than different, and their differences are inconsequential.
"BTW here's the new product I'm selling which requires you to learn a new cloud-oriented programming language and has its own CLI and has diagrams like [this](https://docs.winglang.io/assets/images/arch-f803472c761aa198...) on its introduction page!"
The cognitive dissonance is overwhelming...
If it makes things difficult, you shouldn't be using it.
It is overhyped and it sucks for most use cases.
The easiest way to not get bitten by this is to avoid the abstractions and keep it simple as long as possible. Most apps can probably do fine with a single beefy box and a local sqlite database - this can likely scale vertically indefinitely with moore's law and still probably have less downtime than if you relied on all the fancy cloud technology.
cloud adds capability to any engineer.
if i’m on coffeeshop wifi with my low power laptop, and i need to do something intense like compile linux, i’m sol.
unless i know aws. then i can open a new terminal, spin up a massive spot instance for 19.27 minutes, get that done, then self destruct[1].
being able to test lambda to s3 io, or ec2 to s3 io, with the same ease one uses grep and sed, is for great good. also it’s fun.
1. https://github.com/nathants/mighty-snitch/blob/master/kernel...
This just reminds me why I just run (my personal) web apps on the server in my basement: it's actually simpler.
I really think the worst part of programming is dealing with the development environment.
I've looked into moving it to Google cloud or AWS and it just seems daunting. Honestly, I use ftp, cpanel, and phpmyadmin.
Is there a way to get this product into the 'cloud' in case it grows, easily?
Moving lamp stacks is a piece of cake. If you don't want to do it you could find a freelancer pretty easily.
There's the slippery slope into vendor lock-in via combinatorial explosion of complexity
The cloud (and by "cloud" I mostly mean AWS) in general is indeed insanely complex. Not only is it complex and hard to use for dedicated and trained DevOps/Cloud experts, it's even more overwhelming for developers wanting to just deploy their simple apps.
This statement is in my opinion almost universally accepted - during our market research, we've interviewed ~150 DevOps/Cloud experts and ~250 developers that have been using AWS. Only ~2.5% of them have said that the complexity of AWS is not an issue for them.
That being said, I understand that AWS has to be complex by design. Not only does it offer ~250 different services, but the flexible/configurable way it's designed simply requires a lot of expertise and configuration. For example, the granularity and capabilities of AWS IAM are unparalleled. But it comes at a cost - the configurational and architectural complexity is just beyond what an average AWS user is willing to accept.
An alternative to the cloud complexity are the PaaS platforms (such as Heroku or Render). But they also have their disadvantages - mostly significantly increased costs, lower flexibility and far less supported use-cases.
At https://stacktape.com, we're developing an abstraction over AWS, that is simple enough so that any developer can use it, yet allows to configure/extend anything you might need for complex applications. Stacktape is like a PaaS platform that deploys applications/infrastructure to your own AWS account.
We believe that Stacktape offers the perfect mix of ease-of-use, productivity, cost-efficiency and flexibility. It can be used to deploy anything from side projects to complex enterprise applications.
I'll be very happy to hear your thoughts or to hear any feedback.
those 250 developers are likely entrapped by the cloud experts.
this is fine, and is a rich market that should be served.
regardless, perceived complexity or generated complexity are not the same as actual complexity. all of these complexities are real, some are optional.
Growing up in a datacenter, opening tickets and checking them weekly, hoping for the vendor to finally ship the right backplane; datacenter engineering used to take weeks, months, years. Waiting 30 seconds for terraform plan to check 120 resources which correspond to thousands of pounds of metal and enough wattage to blow up the city I live in... doesn't seem too bad. That said, I understand where you javascript folks are coming from with your iteration loops, but still, you've got to understand: it's so easy now.
Leaky abstraction, sure. But it's always great to see innovation in cloud infra.
The Terraform AWS provider is a very thin abstraction. If your needs are not too specific, there are probably a few higher level abstractions out there that you can use. This is one of the main reasons PaaS are so popular.
We are building a B2B service in Azure using Az Functions & Az SQL Database as the primary components. That's about it. We figured out you can "abuse" Az Functions to serve all manner of MVC-style web app (in addition to API-style apps) by using simple PHP-style templating code. Sprinkle in AAD authentication and B2B collaboration and you have a really powerful, secure/MFA auth solution without much suffering. Things like role enforcement is as simple as taking a dep on ClaimsPrincipal in the various functions.
The compliance offerings are really nice too. Turns out if you use the compliant services without involving a bunch of complicated 3rd party bullshit, you wind up with something that is also approximately compliant. For those of us in finance, this is a really important factor. 10 person startups don't have much bandwidth for auditing in-house stacks every year. If you do everything "the Azure way", it is feasible for you to grant your partners (or their auditors) access to your tenant for inspection and expect that they could find their own way around. If you do it "my way" you better be prepared to get pulled into every goddamn meeting.
I am starting to wonder if not all clouds are made equal anymore. We also have some footprint in AWS (we used to be 100% AWS), but it's really only for domain registration and S3 buckets these days. GCP doesn't even fly on my radar. I've only ever seen one of our partners using it.
I'm working on a relatively complex cloud solution centered round AWS Lambda, SQS/SNS and DynamoDB with many different lambda endpoints and isolated databases. It works, but it's incredibly hard to test. The fortunate thing is there's a system in place to stand up an environment at the PR level, but even that takes almost an hour to test/build/deploy/test/deploy for the PR and every commit after the PR is made. Local runs are sorely lacking. And I can only imagine the number of environments with similar issues.
I've been playing with Cloudflare Pages/Workers and CockroachLabs (CockroachDB Cloud) on a personal/side project and it's quite a bit different. Still in the getting groundwork done and experimenting phase, but wanting to avoid the complexity of the work project while still providing enough scale to not have to worry too much about falling over under load.
Not every application needs to scale to hundreds of millions of users, and it all comes at a cost that may not be worth the price of entry. The platform at work is at well over 1.6 million in terms of the story/bug/feature numbers at this point... it's a big, complex system. But working in/on it feels a few steps more complex than it could/should be. It'll absolutely scale horizontally and is unlikely to fall over under load in any way... but is it really worth it with the relatively slow turn around, in what could have still used dynamo, but the service layers in a more traditional monolith with simply more instances running?
I have to say, I'm somewhat mixed on it all.
I've noticed the frequency of K8s headlines has diminished. (Very) roughly two years ago you never saw a HN front page without one or more K8s headlines. I suspect it has saturated the market that it appeals to.
I'm going to make an impossible request and ask that any readers ignore everything else they know about "crypto", but...this is one of the things that feels right in EVM development compared with normal cloud applications. Especially with frameworks like Foundry, unit tests for distributed applications run very quickly on local copies of the whole production environment. It's a lot more fun than anything which touches AWS.
Obviously, there are some major downsides (such as Ethereum being comparable to a late 70s microcomputer in computing power). But the model of a single well specified execution + storage protocol might be worth porting over to non-financialized cloud application development.
To a first approximation, it exists, and it's called Cloudflare Workers (and KV).
If I had to bet money, mine would be on the bet that Workers represents an early example of what will be in-the-main development in a decade.
“In existing languages, where there is no way to distinguish between multiple execution phases, it is impossible to naturally represent this idea that an object has methods that can only be executed from within a specific execution phase.”
This is not true. Several languages (Haskell, OCaml, F#, Scala, etc) allow you to define and use monads. Granted, monads are not something many developers know about … but it may make sense to learn about them before writing a new language.
Otherwise, this is a great read.
I'm personally a professional Haskell programmer and quite like it, but I think we are circling a core notion: There are many problems with having programming languages where any code can do literally anything at all, and being able to restrict it is extremely powerful.
Constraints liberate, liberties constrain. You can always loosen strictures, but once loosened, they are extremely hard to reintroduce.
They have a notion of phases of execution which are different execution contexts with an ordering. And yes most languages don’t have a decent facility for expressing this or staged computation. Let alone a notion of computation phases that map to distributed systems state.
It can be made even more secure with some relatively incompatible ACLs sprinkled in too.
PaaS solutions and not IaaS is also a solution for many.
No code / low code being a solution for someone else.
It's not too many days since my solution was on HN. Windmill.dev is really something special.
And by mine I do not mean that I have any affiliation with Windmill, just that it solves my problem: quick iterations, including UI building.
But it's not a low code platform either. I would call it a code platform for developers.
Last question: how does the simulator actually work? Is it one of those cases where it tries to emulate some high-level concept, but then your prod code breaks because the simulator was only 40% accurate?
About the simulator, it is a functional simulator to be able to test and interact with the business logic of the application. There are other solutions, like LocalStack to simulate the non functional parts too.
Edit: after reading: "Generated automatically from intent", well it's a red flag for me.
Apps need some kind of persistence state. That requires some thought and annoyances to deal with however you do it. There is no leakproof abstraction. Takes 5s to retrieve the data? Get digging!
Really the best you can hope for with a layer on top of system configuration is to just reduce boilerplate in the lower layer.
Also why is AWS so complicated? That’s where the money is: enterprises!
bring cloud;
let queue = new cloud.Queue(timeout: 2m);
let bucket = new cloud.Bucket();
let counter = new cloud.Counter(initial: 100);
queue.addConsumer(inflight (body: str): str => {
  let next = counter.inc();
  let key = "myfile-${next}.txt";
  bucket.put(key, body);
});
If they can deliver on their promise, it's actually promising. (I haven't evaluated it; just taking things at face value)
Just configure *nix “user” like they’re containers, or queues, or buckets.
How many DSLs do we need for the single domain of managing electron state?
I don’t mean to say other abstractions are unnecessary, I mean for that realm of platform/sre/ops/sysadmin metrics, telemetry, observability. Really though why isn’t it just fork() all the way down and code thread counts reacts to eBPF? Jettison Docker, k8s. Just release “user” profiles that are dotfiles, namespaces, cgroups rules, and a git repo to clone.
Better yet just boot to something better that mimics CLI for the lulz but isn’t so fucking fickle
And honestly, ChatGPT is so well trained on AWS CLIs, Terraform, CloudFormation, the SDKs for Python and Node, I can throw most of my problems at it and it does well.
If we work like this we'd have no frameworks. Everything started from 0.
Yes it’s possible to buy your own building, and your own DS3/OC3. And HVAC. And electrical. And backup generators for the HVAC. And the personnel to design and specify the racks and the hardware in them (all of the different configs you need). And to assemble and connect the equipment. And to maintain it when something breaks. And the network engineers to design your network and deploy and maintain it.
And do it again in a different place for geographic redundancy.
And, if you have any money and personnel left, then you can think about a virtualization infrastructure (because of course who would be stupid enough to buy VMware when you could build your own open source equivalent around HVM or whatever).
And now you’ve got like a tiny fraction of what the cloud can offer. And I guarantee you that the TCO is way higher than you expected and that your uptime is a 9 or two short of what a cloud provider would give you.
If you are running a single cloud-scale workload (Google Search or Dropbox or Outlook.com) then you probably can do better financially with your own optimized data center. But you almost certainly can’t beat cloud for heterogeneous workloads.
And the biggest benefit of all is savings in opportunity cost as your tech people can focus on your own unique business problems and leave the undifferentiated heavy lifting to others.
> And do it again in a different place for geographic redundancy.
With all due respect this is a trivial and misleading point of view.
Most people who do cloud don't need or employ redundancy across machines; much less so in different regions. But they do have devops teams or programmers who are required to learn bespoke cloud dashboards and AWS products. Even though most of the time they could just ssh into a box and run nodejs in screen and be 99% of the way there. Cloud providers convinced everyone that it's really hard to run a computer program. And companies set money on fire because spending signals growth, especially to investors.
Literally everything you said is the opposite of how I've seen people use "cloud" in the real world. I don't know what universe you're living in where things are as wonderful as you proclaim but I wish I was in it because mine is a nightmare.
Of course you’re right a VPC or a CoLo would be better. However that’s not what most people think of “cloud”
This is a simplistic view that doesn't discuss any of the trade-offs inherent in the choice between running your own hardware and using a cloud service.
Yes, it's a higher price, but it allows you to stop paying for it when you stop needing it. You can scale up rapidly. You don't have to deal with buying, maintaining, or replacing hardware.
You can rent servers much the same once you at least have a bit of scale. It's never colocation or cloud. There are lots of in-betweens.
economy of scale says hello
Or that your business has to replace a router on one of the dcs and then you have to do all the work to ensure nothing goes down yourself. You can't blame anyone if it goes bad.
Then you realize how much work that is. The cloud is really convenient.
Source: Always worked “in the cloud”. Current client is on premise(s) for solid reasons. A very unusual case though. Fun. Inconvenient. Makes you respect the big clouds even more.
Have you never heard of a colo? Rent 1-2 racks in one of those. And you probably won't need more than 1-2 racks because that's what Stack Overflow runs on.
You want to run your web app off your basement server? Go ahead, but if you have a blog post that hits the front page of HN, there's no way to scale up. If your home server has hardware failure then you're out of luck until you can get new hardware. Your home has a power outage or ISP outage? Your site is down.
If you can tolerate those things, great! I wish my employers and their customers would tolerate it.
Or in reality, it's weird transient failures you can't debug, and unexpected bills for some asinine reason. And scalability sure seems like something to avoid until you actually need it (as in the demand on the site is large enough, not that your performance is so bad you can't handle traffic).
Obviously I can be more cavalier with my uptime, as it is a personal server.
> but if you have a blog post that hits the front page of HN.
HN traffic is so small that unless you are using Raspberry Pi Pico, it is fine.
Your 5-year old Android phone probably can handle HN traffic just fine. (HN traffic is in a range of single-digit kilo qps).
For-most-purposes-infinite on-demand resources, provided in a format where you don't have to pay for when you aren't using it. And yes, we all know that this does mean you pay a premium for it when you need it, and (good) cloud folks understand that you have to design for that reality.
If you don't need that advantage, that's fine, but it's silly to act like it doesn't exist. Find the point where the delta in variable costs outweighs additional capex and steer your solution to that point, make peace with inefficient spending, or get out. You don't have to be weird about either extreme position.
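As a back-of-the-envelope version of "find the point", here is a tiny break-even sketch. The function name and all dollar figures are hypothetical placeholders, and it ignores things like staff time, hardware refresh, and discounting.

```python
def breakeven_months(capex_usd, cloud_monthly_usd, owned_monthly_usd):
    """Months until up-front spend is repaid by lower monthly running costs.

    Returns None when owning is not cheaper per month, i.e. it never pays back.
    """
    monthly_saving = cloud_monthly_usd - owned_monthly_usd
    if monthly_saving <= 0:
        return None
    return capex_usd / monthly_saving


# Hypothetical numbers: $60k of hardware, $5k/month in cloud vs $2k/month to run it yourself.
print(breakeven_months(60_000, 5_000, 2_000))
```

If the workload is likely to shrink or disappear before the break-even month, the premium for on-demand capacity is the cheaper option.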
This matters when you're an op in an IRC channel and someone joins the channel, starts spamming racial slurs, so you ban them, and they respond by DDoSing you.
If I had a dollar for every time it happened, I'd have $2, which isn't a lot of money but it's pretty annoying that it's happened twice. The first time, they only ran the attack for a few minutes. The second time it happened, they ran it for over an hour. Releasing and Renewing my WAN IP didn't stop the attack, because I still got the same WAN IP. I had to call my ISP support and spend way too long on the phone trying to talk to someone who knew what I was asking for to get a new IP address before I was able to dodge the attack.
Using AWS, I can configure security groups so that my VM never even sees the packets from a DDoS if I've configured them to only allow connections from my home IP.
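A minimal sketch of what that rule might look like, assuming boto3 for the actual call. The helper name, the port, the example address (from a documentation range), and the security group ID in the comment are placeholders, not anything from my real setup.

```python
import ipaddress


def home_only_ingress(home_ip: str, port: int = 22) -> dict:
    """Build one ingress permission that admits a single IPv4 address on one TCP port."""
    addr = ipaddress.IPv4Address(home_ip)  # raises ValueError on a malformed address
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        # A /32 CIDR matches exactly one address.
        "IpRanges": [{"CidrIp": f"{addr}/32", "Description": "home connection only"}],
    }


rule = home_only_ingress("203.0.113.7")
print(rule)

# Applying it would look roughly like this (group ID is a placeholder):
#   import boto3
#   boto3.client("ec2").authorize_security_group_ingress(
#       GroupId="sg-0123456789abcdef0", IpPermissions=[rule])
```

Security groups deny inbound traffic by default, so with this as the only ingress rule everything from other sources is dropped before it reaches the instance.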
many users are single cloud, unless your primary target is multi cloud enterprise.
there is benefit to multi cloud facade, but i’m not sure it’s worth the cost.
regardless, cool stuff!
Cloud is often viewed as if these don't exist:
- Dedicated servers
- Managed hosting
- VPS
- etc.
Most small to medium-sized enterprises could opt for such options instead of the cloud.
That's really what frustrates me the most about modern technology. It's like tech and software developers love to rube-goldberg things, and everything is vastly more complicated than it needs to be. The thing is, tech people seem to like complexity for the sake of complexity. It gives them a fun [read: masochistic] puzzle to work on and makes them feel smart.
Wing will pave the way and come up with a good abstraction or two, then become obsolete once general purpose languages with better ecosystems can do the same.
An additional goal seems to be to create a simplified language that doesn't have the surface area and dependency issues another language might have. While I understand the reasoning, I've been in tech long enough to know that usually doesn't work, and just creates another real world example of XKCD 927 (https://xkcd.com/927/)
They had a complete backup site with redundant servers, modems (for point of sales systems) and a redundant staff because the cost of being down was so high.
We never had to use the backup site the entire three years I was there.
If you're not going to be as big as the entirety of Amazon you don't need to serve 20 million concurrent users. Ever.
scale to zero is very good when zero usage is regularly reached.
I’ve been working in SaaS since 2008 and in cloud providers since 2012. And the use case I see over and over is people who don’t think they need or want geographical redundancy, until they do, and then they want it yesterday. Typically they are running fine for months or years and then there is an outage - maybe an AZ or a network partition - and then all of a sudden they’re scrambling for failover. Cloud often (usually?) has higher availability than the infra they migrated off of, and they grow addicted to it while not wanting to pay the dev cost for true high availability.
I've seen more region wide outages than just an AZ e.g. where AWS us-west-2 goes out and so it doesn't make a difference.
> and then they want it yesterday
Some great SaaS you've experienced. I've been at ones that say they want it yesterday but then realize it's too much work (already in the cloud) and just let it go until the next outage and complain again but yet again nothing gets done.
Then that goes out the window the moment you have any media on that Pi... ISP upload speed in the US is balls.
Oh, did I mention that most ISPs in the US don't allow hosting websites? So are you putting that Pi in a datacenter?
And, I forgot to mention, what happens when you piss off some bored troll and get DDOSed?
Not disagreeing with all the challenges you mentioned. Just "traffic from HN front page" is so easy it is a poor example.
I've become a much bigger fan of well-designed libraries. Do one thing and do it well, and provide a simple API for doing it.
But the two most popular front end frameworks that came out over the past few years didn’t exactly come from small companies.
Great tools can, and often do come from solo developers without large corporate backing.
And isn’t that the ultimate in survivorship bias? How many other languages and frameworks would have left you screwed if you had jumped in whole hog before they had popular uptake?
You don’t need to physically rack servers; lots of systems integration vendors and remote hands will gladly put servers together for you. Most colos will gladly help you figure out connectivity. And there are lots of vendors, like Cisco, who will deliver a rack to your datacenter with virtualization software installed and everything, plug and play.
My point is there isn’t an either/or choice of using the cloud or building the universe from scratch, there are so many available options in-between. And while those options aren’t available conveniently behind an API and might require a few old school phone calls, you can save millions of dollars, get access to better performing hardware, have better control over data sovereignty, and 90% less lock-in if you choose to go down that road. It’s not for everyone but a /lot/ of workloads that people are running in the cloud can be done better elsewhere.
The person you're replying to is definitely overly flippant, but you've taken a sort of Gish gallop approach where you think if you list enough individual things that have to be done, that'll be overwhelming evidence that it's impossibly difficult. But the things you've listed aren't as hard as you want them to be on reasonable small business scales.
We are a company with 4 IT employees including myself, and two of us alone (both full-time programmers) handled our hybrid cloud migration. We rented a rack in a colocation facility. I learned how to design racks in a couple days and did the rack design myself. We bought servers from Dell and network equipment from Meraki. The colo facility found us an inexpensive contractor who racked and stacked everything to my design, and remote hands does any ongoing hardware maintenance. The other guy had an old, outdated CCNA and he designed the network. We got a fiber connection to AWS up and running for a hybrid cloud approach. All of this was very doable for a part-time two-man team with other job responsibilities and we're saving a ton of money for database and workstation hosting--big, expensive, totally static workloads. Perfect for on-prem. The ongoing savings vs. pure AWS exceed my own salary.
It was clear from the outset that we could accomplish this. I wouldn't have signed us up for a boondoggle. Certainly, there are more demanding configurations where the complexity would be too high, but people act like on-prem is literally impossible without a team of dedicated staff in every case. It's not. It can be doable.
On-prem is like that. Yes, you have all the skills to originally stand it up. But you don’t know what you don’t know, and you make a bunch of resource trade-offs, usually by not implementing stuff that you’ll never need (until you do).
That was the point I was trying to make.
As I said though, the unique value of cloud is letting you focus on a business specific problem instead of reinventing wheels that have already been invented many times over.
As others have pointed out, other benefits are scale-on-demand, pay only for what you use, and agility - if you have a great idea you don’t have to do a PO and wait months for a server.
AWS vs. on-prem is always a tradeoff. You have to look at the costs and benefits for your particular situation to decide which is best. We decided to go with both, because AWS has benefits for our dynamic workloads and on-prem has savings for our static workloads.
But what you described sounds like a packaging / software distribution issue.
Like, someone writes a one off Python script or program to do a thing and a year later it doesn't work because the host machine is using a newer version of Python and the dependencies need to be reinstalled to the new site-packages and they didn't document if they used the package manager or a virtualenv and a pip requirements file or setup.py or whatever.
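One way to make that kind of one-off script fail loudly instead of mysteriously is to have it check its own runtime before doing any work. The following is only a minimal sketch: `check_python` and `check_packages` are hypothetical helper names, and the version pins in the usage note are made up for illustration.

```python
import sys
from importlib import metadata


def check_python(required, actual=None):
    """Return an error string if the (major, minor) version differs from `required`."""
    actual = tuple(actual or sys.version_info[:2])
    required = tuple(required)
    if actual != required:
        return (
            f"expected Python {required[0]}.{required[1]}, "
            f"running {actual[0]}.{actual[1]}"
        )
    return None


def check_packages(pins, lookup=metadata.version):
    """Compare installed distribution versions against exact pins.

    `pins` maps distribution name -> expected version string.
    `lookup` is injectable so the check can be exercised without
    touching the real site-packages.
    """
    problems = []
    for name, wanted in pins.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed (wanted {wanted})")
            continue
        if found != wanted:
            problems.append(f"{name} is {found}, wanted {wanted}")
    return problems
```

A script could call `check_python((3, 11))` and `check_packages({"requests": "2.31.0"})` at startup and exit with the collected messages. This does not replace a requirements file or a virtualenv; it only records the assumptions somewhere the next person will trip over them.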
The "it works on my machine" thing isn't really a "cloud" thing? It doesn't really solve the issue of having a weird bespoke service that nobody understands. Even if it's so abstracted from a normal computer that it has some esoteric requirement like an OCI image to run software, if the Dockerfile/Containerfile or whatever that generates the image doesn't exist/work/make sense then you have the same problem.
> As I said though, the unique value of cloud is letting you focus on a business specific problem instead of reinventing wheels that have already been invented many times over.
Reinventing the wheel like with docker, ansible, terraform, kubernetes, nomad, aws?
Recently I was asked to help a company receive out of office replies to their web service that sent mail from Amazon SES. The client was sending mail from app.foo.org (with MX and SPF for Amazon) and wanted to receive them at foo.org (MX and SPF for Outlook). Setting Reply-To or some other headers to foo.org worked in testing but not in practice. I maneuvered the Amazon product menagerie and set up SES to get notifications on out of office replies and that also worked in testing but not in fact. Even then it would not store a list or provide details in the dashboard about replies without further using Lambda or SQS or something. Every deficiency in an Amazon product is "solved" by another Amazon product. You're swallowing a horse to swallow a fly. In the end I just added AWS to the foo.org SPF records along with Outlook's and set the From header accordingly; way simpler, didn't need any more AWS products, and knowledge of DNS is more portable than knowledge of AWS. AWS is in the business of inventing wheels and trying to get you to stay in their wheel ecosystem.
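For anyone trying the same fix: a domain should publish a single SPF TXT record, so "adding AWS along with Outlook's" means merging both senders into one value rather than adding a second record. A rough sketch of that merge follows. `merge_spf` is a hypothetical helper, and the two include hosts are the ones Microsoft and Amazon commonly document; treat them as assumptions and check the providers' current guidance before publishing anything.

```python
def merge_spf(includes, qualifier="~all"):
    """Build one SPF TXT value that authorizes every listed sender.

    `includes` is an iterable of domains to reference via `include:`.
    Duplicates are dropped while keeping first-seen order.
    """
    seen = []
    for domain in includes:
        if domain not in seen:
            seen.append(domain)
    parts = ["v=spf1"] + [f"include:{d}" for d in seen] + [qualifier]
    return " ".join(parts)


# Assumed include hosts for Outlook / Microsoft 365 and Amazon SES.
record = merge_spf(["spf.protection.outlook.com", "amazonses.com"])
print(record)
# → v=spf1 include:spf.protection.outlook.com include:amazonses.com ~all
```

The `~all` soft-fail qualifier is just a default for the sketch; the original comment does not say which qualifier foo.org used.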
Not to contradict everything you're saying like you're wrong or something. I wonder what the circus is like for those of you who run it. Everything you say reads like high-level manager/sales engineer marketing talk from someone who spends all day in meetings. Not to say I'm an authority and that your voice is illegitimate; I'm just a resentful out of touch NEET waiting for the world to change to the point that I have nothing left to offer it.
From what I understand you need both a large or complex computational load and a lot of traffic before clouds should become the weapon of choice.
But I'm not entirely sure.
It’s much less maintenance than my days of maintaining servers myself.
And not to mention half the reason I went to cloud was not that I didn’t want to deal with administering servers, I didn’t want to deal with server administrators.
When I was at the 60 person company where I got my start in “cloud”, I could experiment with different types of databases, scaling, and other technologies just by throwing something together and deleting the entire stack.
I worked for a company that aggregated publicly available health care provider data (ie no PII) for major health care providers. They used our APIs for their own websites and mobile apps.
When we got a new customer (ie large health care provider), our systems automatically scaled.
When a little worldwide pandemic happened in 2020 and our traffic spiked by 100%+, guess how long it took us to provision new servers.
Hint: we didn’t, everything just scaled by itself.
I compare that to the old days when it took us weeks to provision an MySQL server.
Managing infrastructure doesn’t provide a competitive advantage unless you’re something like Backblaze, DropBox or another company where your entire reason for existing is your infrastructure expertise.
And the discussion is how much extra do you pay for it.
> Hint: we didn’t, everything just scaled by itself.
Again it's not free so what's the surprise? Are you surprised that you get water out of your tap? Hint: it just flows!
> I compare that to the old days when it took us weeks to provision an MySQL server.
Sounds like you've been burnt in the past is all. So your on-prem is slow does not equal all on-prem is bad?
> Managing infrastructure doesn’t provide a competitive advantage
How do you know it doesn't? You've only looked at it from your use case and based on it making you happy and saving you time. Nothing to do with the business needs at all.
So you didn’t see the rest of the paragraph that you snipped?
“unless you’re something like Backblaze, DropBox or another company where your entire reason for existing is your infrastructure expertise.”
> So your on-prem is slow does not equal all on-prem is bad?
How fast can you spin up a dozen VMs? A message bus? A scalable database with read replicas? An entire redundant data center in another region? A few terabytes of storage? A redis cluster? An ElasticSearch cluster? A CDN? A few load balancers? The procurement process to get an extra server provisioned in a colo will by definition be slower than my deploying a CloudFormation stack.
I bet it's true for many. I approximate it from what I see in backend/frontend teams - they don't even deal with each other, let alone system administrators.
Luckily [in the current project] devs don't have access to production and have very limited access to the dev environment in terms of ssh/db endpoints.
I didn’t know cloud from a hole in the wall. But the internal IT department treated AWS just like they did their Colo. I thought AWS was just a bunch of VMs and I treated it as such for a green field implementation.
I studied for the AWS Solution Architect certification just so I would know what I didn’t know and to be able to come up with some intelligent ideas for phase 2.
I ended up leaving that job and working for a startup. The CTO knew I had only theoretical knowledge of AWS. But I had good system design instincts and he liked my ideas. I was hired as a senior developer. But that rapidly morphed into a cloud architect role. I took advantage of AWS and all of its locked in goodness including moving everything to either Lambda or Fargate (serverless Docker).
I had admin rights to everything until I voluntarily gave myself the same constraints to production that everyone else had when we hired a couple of operations guys.
We scaled without any issues as the company grew and Covid happened - we worked in the healthcare industry.
Now I work for AWS. But I’ve done my share of managing servers since the mid 90s as part of my job. That’s a life I don’t ever want to go back to.
Your examples here are just examples of situations where you basically need a cloud solution by definition. If these are your requirements, then yes obviously you should use cloud for it. That said, your points are a bit confusing. It's not an either-or. For situations like you're describing, you use cloud. For situations where you don't need to use cloud, you can consider something else like on-prem or colo or ...
You seem to have a (literally) extremist position where it's all cloud or nothing. It's not.
I literally just gave examples where a colo or on prem makes complete sense - anytime that managing infrastructure is a competitive advantage.
If you have a static workload and your company has the competencies to manage infrastructure, go for on prem.
I’m the last person to recommend someone move to any cloud provider just to treat it like a colo.
> Managing infrastructure doesn’t provide a competitive advantage unless you’re something like Backblaze, DropBox or another company where your entire reason for existing is your infrastructure expertise.
You don't need to be a company "where your entire reason for existing is your infrastructure expertise" in order for managing your own infrastructure to be a competitive advantage. Managing (some of) your own infrastructure can be a competitive advantage even if managing infrastructure is not your core competency or even your goal. It is a competitive advantage if the TCO is lower. It sometimes is.
But if you're now saying you agree with my statement, then I guess, well, we're in agreement.
But, really how often do you need to do that and what % of users really need to?
Also, once on the cloud some business management take so long to "approve" new expenses that in reality it may not really be feasible to do things fast enough for it to be a benefit.
I've quite often seen the need for 5-10 meetings or 2-3 written documents to get approval for 10 new VMs for developers or new servers for backups.
When testing something or you want to spin up your own isolated environment for yourself or for your team? Very often.
> Also, once on the cloud some business management take so long to "approve" new expenses that in reality it may not really be feasible to do things fast enough for it to be a benefit.
And that gets back to my other point about doing a “lift and shift”: if you don’t change your processes, both IT and technical, you won’t see any benefit from the cloud and you will end up spending more.
There are so many ways that you can both give developers freedom and still have the necessary guardrails. I’m speaking about AWS because that’s the one I know best (and where I work). But I’m sure there are equivalent services on other providers.
For instance you can have a vending machine type of setup where you allow department heads to set up non prod accounts with organization controlled service control policies. You can use a Service Catalog approach where you surface Terraform or CloudFormation defined products where the users can only provision infrastructure defined by their administrators. But they can do it themselves.
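To make the guardrail idea concrete, here is a rough sketch of what an organization-controlled service control policy could look like, generated from Python so it can be templated per department. The helper name `region_guardrail_scp`, the statement `Sid`, and the region list are all hypothetical. It is deliberately simplified: a real policy would normally exempt global services rather than denying every action outside the approved regions.

```python
import json


def region_guardrail_scp(allowed_regions):
    """Return an SCP document that denies requests outside `allowed_regions`.

    Simplified illustration only: it denies *all* actions made to other
    regions, which is usually too blunt for production use.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideApprovedRegions",  # hypothetical name
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            }
        ],
    }


# Example region list, chosen only for illustration.
policy = region_guardrail_scp(["us-west-2", "us-east-1"])
print(json.dumps(policy, indent=2))
```

The JSON this prints would then be attached to the non-prod organizational unit by whoever administers the organization; accounts inside it can still provision things themselves, just not outside the fence.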
Depending on which level of the organization I’m working with, I try to convince the IT department to give individual departments their own organizational unit to monitor and to embed someone from IT into their team - ie a “DevOps” philosophy.